Ecosyste.ms: Issues

GitHub / mosaicml/streaming issues and pull requests

#528 - Offload exception to mds_write.

Pull Request - State: closed - Opened by XiaohanZhangCMU 12 months ago

#527 - doc: add NDArray format

Pull Request - State: closed - Opened by OrenLeung 12 months ago

#526 - First Class Support for Numpy Arrays

Issue - State: closed - Opened by OrenLeung 12 months ago - 1 comment
Labels: enhancement

#525 - improve exception error messages for downloading

Pull Request - State: closed - Opened by Skylion007 12 months ago - 1 comment

#524 - Organize utils.

Pull Request - State: closed - Opened by knighton 12 months ago

#523 - Dataset kwargs switchover.

Pull Request - State: closed - Opened by knighton 12 months ago - 1 comment

#522 - RAM Out of memory when using Streaming Dataloader

Issue - State: closed - Opened by kietbg0079 12 months ago - 3 comments
Labels: bug

#521 - Allow Stream's `repeat` option to cycle through entire dataset before repeating, when `shuffle=True`

Issue - State: open - Opened by m-harmonic 12 months ago - 5 comments
Labels: enhancement

#518 - Bump databricks-sdk from 0.8.0 to 0.14.0

Pull Request - State: closed - Opened by dependabot[bot] 12 months ago
Labels: dependencies

#517 - Fixed bugs when trying to use very small datasets

Pull Request - State: closed - Opened by snarayan21 12 months ago

#516 - merge_index doesn't show path with subfolders

Issue - State: open - Opened by germanjke 12 months ago - 5 comments

#515 - Fixed comments and update dataframe_to_MDS API signature

Pull Request - State: closed - Opened by karan6181 12 months ago - 1 comment

#514 - Simulator bug fixes (proportion, repeat, yaml ingestion)

Pull Request - State: closed - Opened by snarayan21 12 months ago

#513 - Bump pydantic from 2.4.2 to 2.5.2

Pull Request - State: closed - Opened by dependabot[bot] 12 months ago
Labels: dependencies

#512 - Add support for the Canned ACL environment variable for AWS S3

Pull Request - State: closed - Opened by karan6181 12 months ago - 1 comment

#511 - option to host files via https/remote being a https url

Issue - State: open - Opened by felix-red-panda almost 1 year ago - 3 comments
Labels: enhancement

#510 - different eval on different dataloader streams

Issue - State: closed - Opened by germanjke about 1 year ago - 2 comments

#509 - Bump pydantic from 2.4.2 to 2.5.1

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#508 - Bump databricks-sdk from 0.8.0 to 0.13.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#507 - cache_limit decreases utilization

Issue - State: closed - Opened by germanjke about 1 year ago - 3 comments

#506 - Bump yamllint from 1.32.0 to 1.33.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#505 - choose a subset of shards for dataloading

Issue - State: closed - Opened by gongbudaizhe about 1 year ago - 4 comments
Labels: enhancement

#504 - Fix for CVE-2023-47248

Pull Request - State: closed - Opened by bandish-shah about 1 year ago

#503 - Is `StreamingDataset` compatible with Ray distributed training?

Issue - State: closed - Opened by genesis-jamin about 1 year ago - 2 comments

#502 - Adding warning messages for new defaults

Pull Request - State: closed - Opened by snarayan21 about 1 year ago

#501 - Retrieve batch size correctly from vision yamls for simulator

Pull Request - State: closed - Opened by snarayan21 about 1 year ago

#500 - Migrate pydocstyle to ruff

Pull Request - State: closed - Opened by Skylion007 about 1 year ago

#498 - Update google-cloud-storage requirement from <2.11.0,>=2.9.0 to >=2.9.0,<2.14.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#497 - Bump uvicorn from 0.23.2 to 0.24.0.post1

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#496 - Bump fastapi from 0.104.0 to 0.104.1

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#495 - Bumping version for streaming v0.7.0

Pull Request - State: closed - Opened by snarayan21 about 1 year ago

#494 - Move examples out, merge base/ upward

Pull Request - State: closed - Opened by knighton about 1 year ago - 2 comments

#493 - ERROR: Unexpected bus error encountered in worker.

Issue - State: closed - Opened by yiyihum about 1 year ago - 2 comments
Labels: bug

#492 - Change column name

Issue - State: closed - Opened by kietna1809 about 1 year ago - 1 comment

#491 - Bump pytest from 7.4.2 to 7.4.3

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#490 - Bump pypandoc from 1.11 to 1.12

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#489 - Support for sub-sampling long videos

Issue - State: open - Opened by con-bren about 1 year ago - 4 comments
Labels: enhancement

#488 - "Golden spike" PR

Pull Request - State: open - Opened by knighton about 1 year ago - 1 comment

#487 - Update release yaml to not write anything to GitHub

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#486 - Check for invalid hash algorithm name

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#485 - Bump databricks-sdk from 0.8.0 to 0.12.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#484 - Update __init__.py

Pull Request - State: closed - Opened by XiaohanZhangCMU about 1 year ago

#483 - Hf ingestion

Pull Request - State: open - Opened by XiaohanZhangCMU about 1 year ago

#482 - Bump sphinx-tabs from 3.4.1 to 3.4.4

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#481 - Bump gitpython from 3.1.37 to 3.1.40

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#480 - Bump fastapi from 0.103.2 to 0.104.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#479 - Better default values for StreamingDataset args

Pull Request - State: closed - Opened by snarayan21 about 1 year ago

#478 - Modify dataframe_to_mds to accept streaming DF

Pull Request - State: open - Opened by maddiedawson about 1 year ago

#477 - do not remove local directory when out is local

Pull Request - State: closed - Opened by XiaohanZhangCMU about 1 year ago

#475 - Request for more informative error messages when cache_limit is hit

Issue - State: closed - Opened by thayes427 about 1 year ago - 6 comments
Labels: enhancement

#474 - Bump version to 0.6.1

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#473 - Fixed codeql out of disk space issue

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#472 - Maintain order for merge_index_from_list

Pull Request - State: closed - Opened by XiaohanZhangCMU about 1 year ago - 1 comment

#471 - Update google-cloud-storage requirement from <2.11.0,>=2.9.0 to >=2.9.0,<2.13.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 2 comments
Labels: dependencies

#470 - Bump databricks-sdk from 0.8.0 to 0.11.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#469 - Fix doc strings

Pull Request - State: closed - Opened by XiaohanZhangCMU about 1 year ago

#468 - Merge index util

Pull Request - State: closed - Opened by XiaohanZhangCMU about 1 year ago

#467 - Update example notebook for merging datasets

Pull Request - State: closed - Opened by amitani about 1 year ago - 2 comments

#466 - Update MCLI credential page for Databricks

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#465 - Integration test for dataframe_to_mds and merge_index

Pull Request - State: closed - Opened by XiaohanZhangCMU about 1 year ago

#464 - Best strategy to do joint image/video training

Issue - State: open - Opened by universome about 1 year ago - 1 comment

#463 - Add py1e warning when Shuffle block size is smaller than shard size

Pull Request - State: closed - Opened by snarayan21 about 1 year ago

#462 - Convert folder of aws s3 parquet to s3 mds

Issue - State: closed - Opened by segalinc about 1 year ago - 5 comments
Labels: enhancement

#461 - `ValueError: invalid literal for int() with base 10` when instantiating `StreamingDataset`

Issue - State: closed - Opened by genesis-jamin about 1 year ago - 3 comments
Labels: bug

#460 - `get_partitions_orig` returns negative values when number of samples is low

Issue - State: closed - Opened by antoinebrl about 1 year ago - 5 comments
Labels: bug

#459 - MDSWriter encountered GCS upload error but kept going

Issue - State: closed - Opened by genesis-jamin about 1 year ago - 5 comments

#458 - Bump databricks-sdk from 0.8.0 to 0.10.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#457 - How to clean up shared memory file descriptors?

Issue - State: closed - Opened by genesis-jamin about 1 year ago - 6 comments

#456 - Update integration test to include sample order comparison

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#455 - Bump pydantic from 2.3.0 to 2.4.2

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#454 - Bump fastapi from 0.103.1 to 0.103.2

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#452 - Fix broken bibtext

Pull Request - State: closed - Opened by Skylion007 about 1 year ago

#451 - `{MDS,Joint}Writer` should be more tolerant to transient failures in remote storage

Issue - State: closed - Opened by thempatel about 1 year ago - 8 comments
Labels: bug

#449 - Add merge index file utility

Pull Request - State: closed - Opened by XiaohanZhangCMU about 1 year ago

#448 - Add a retry logic with backoff and jitter

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#447 - How to handle broken samples?

Issue - State: open - Opened by universome about 1 year ago - 6 comments

#446 - Bump gitpython from 3.1.36 to 3.1.37

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#445 - Bump databricks-sdk from 0.8.0 to 0.9.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#444 - Update google-cloud-storage requirement from <2.11.0,>=2.9.0 to >=2.9.0,<2.12.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 2 comments
Labels: dependencies

#443 - Training on PQ shards

Pull Request - State: open - Opened by knighton about 1 year ago

#442 - py1e randomized

Pull Request - State: closed - Opened by snarayan21 about 1 year ago

#441 - A standard Batch type of Transformers is not handled by StreamingDataloader

Issue - State: closed - Opened by Hubert-Bonisseur about 1 year ago - 1 comment
Labels: bug

#440 - Enabled the allgather test

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#439 - Fix stylistic issues (mostly 100col, docstring conventions)

Pull Request - State: closed - Opened by knighton about 1 year ago - 1 comment

#438 - Silence spurious lint warning

Pull Request - State: closed - Opened by knighton about 1 year ago - 2 comments

#437 - Can't get mid epoch resumption to work

Issue - State: closed - Opened by Hubert-Bonisseur about 1 year ago - 2 comments
Labels: bug

#436 - Bump pytest-codeblocks from 0.16.1 to 0.17.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#435 - Bump gitpython from 3.1.34 to 3.1.36

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#434 - Validate writer arguments

Pull Request - State: closed - Opened by karan6181 about 1 year ago - 2 comments

#433 - Bump version to 0.6.0

Pull Request - State: closed - Opened by XiaohanZhangCMU about 1 year ago

#432 - changed choose to epoch_size in stream proportion docstring

Pull Request - State: closed - Opened by snarayan21 about 1 year ago

#431 - Total number of samples when using Streams with proportion?

Issue - State: closed - Opened by genesis-jamin about 1 year ago - 1 comment

#430 - tag shared and temp files with username

Pull Request - State: open - Opened by acutkosky about 1 year ago - 1 comment

#429 - multiple users on same system encounter permissions errors

Issue - State: open - Opened by acutkosky about 1 year ago - 8 comments
Labels: bug