Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / mosaicml/streaming issues and pull requests
#528 - Offload exception to mds_write.
Pull Request -
State: closed - Opened by XiaohanZhangCMU 12 months ago
#527 - doc: add NDArray format
Pull Request -
State: closed - Opened by OrenLeung 12 months ago
#526 - First Class Support for Numpy Arrays
Issue -
State: closed - Opened by OrenLeung 12 months ago
- 1 comment
Labels: enhancement
#525 - improve exception error messages for downloading
Pull Request -
State: closed - Opened by Skylion007 12 months ago
- 1 comment
#524 - Organize utils.
Pull Request -
State: closed - Opened by knighton 12 months ago
#523 - Dataset kwargs switchover.
Pull Request -
State: closed - Opened by knighton 12 months ago
- 1 comment
#522 - RAM Out of memory when using Streaming Dataloader
Issue -
State: closed - Opened by kietbg0079 12 months ago
- 3 comments
Labels: bug
#521 - Allow Stream's `repeat` option to cycle through entire dataset before repeating, when `shuffle=True`
Issue -
State: open - Opened by m-harmonic 12 months ago
- 5 comments
Labels: enhancement
#520 - Moved local directory creation and existence check from CloudUploader to Writer class
Pull Request -
State: open - Opened by karan6181 12 months ago
#519 - Add flag to allow or reject datasets containing unsafe types (i.e., Pickle)
Pull Request -
State: closed - Opened by knighton 12 months ago
#518 - Bump databricks-sdk from 0.8.0 to 0.14.0
Pull Request -
State: closed - Opened by dependabot[bot] 12 months ago
Labels: dependencies
#517 - Fixed bugs when trying to use very small datasets
Pull Request -
State: closed - Opened by snarayan21 12 months ago
#516 - merge_index doesn't show path with subfolders
Issue -
State: open - Opened by germanjke 12 months ago
- 5 comments
#515 - Fixed comments and update dataframe_to_MDS API signature
Pull Request -
State: closed - Opened by karan6181 12 months ago
- 1 comment
#514 - Simulator bug fixes (proportion, repeat, yaml ingestion)
Pull Request -
State: closed - Opened by snarayan21 12 months ago
#513 - Bump pydantic from 2.4.2 to 2.5.2
Pull Request -
State: closed - Opened by dependabot[bot] 12 months ago
Labels: dependencies
#512 - Add support for the Canned ACL environment variable for AWS S3
Pull Request -
State: closed - Opened by karan6181 12 months ago
- 1 comment
#511 - option to host files via https/remote being a https url
Issue -
State: open - Opened by felix-red-panda almost 1 year ago
- 3 comments
Labels: enhancement
#510 - different eval on different dataloader streams
Issue -
State: closed - Opened by germanjke about 1 year ago
- 2 comments
#509 - Bump pydantic from 2.4.2 to 2.5.1
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#508 - Bump databricks-sdk from 0.8.0 to 0.13.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#507 - cache_limit decreases utilization
Issue -
State: closed - Opened by germanjke about 1 year ago
- 3 comments
#506 - Bump yamllint from 1.32.0 to 1.33.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#505 - choose a subset of shards for dataloading
Issue -
State: closed - Opened by gongbudaizhe about 1 year ago
- 4 comments
Labels: enhancement
#504 - Fix for CVE-2023-47248
Pull Request -
State: closed - Opened by bandish-shah about 1 year ago
#503 - Is `StreamingDataset` compatible with Ray distributed training?
Issue -
State: closed - Opened by genesis-jamin about 1 year ago
- 2 comments
#502 - Adding warning messages for new defaults
Pull Request -
State: closed - Opened by snarayan21 about 1 year ago
#501 - Retrieve batch size correctly from vision yamls for simulator
Pull Request -
State: closed - Opened by snarayan21 about 1 year ago
#500 - Migrate pydocstyle to ruff
Pull Request -
State: closed - Opened by Skylion007 about 1 year ago
#499 - Fixing simulator command with simulation directories being included in package
Pull Request -
State: closed - Opened by snarayan21 about 1 year ago
#498 - Update google-cloud-storage requirement from <2.11.0,>=2.9.0 to >=2.9.0,<2.14.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#497 - Bump uvicorn from 0.23.2 to 0.24.0.post1
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#496 - Bump fastapi from 0.104.0 to 0.104.1
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#495 - Bumping version for streaming v0.7.0
Pull Request -
State: closed - Opened by snarayan21 about 1 year ago
#494 - Move examples out, merge base/ upward
Pull Request -
State: closed - Opened by knighton about 1 year ago
- 2 comments
#493 - ERROR: Unexpected bus error encountered in worker.
Issue -
State: closed - Opened by yiyihum about 1 year ago
- 2 comments
Labels: bug
#492 - Change column name
Issue -
State: closed - Opened by kietna1809 about 1 year ago
- 1 comment
#491 - Bump pytest from 7.4.2 to 7.4.3
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#490 - Bump pypandoc from 1.11 to 1.12
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#489 - Support for sub-sampling long videos
Issue -
State: open - Opened by con-bren about 1 year ago
- 4 comments
Labels: enhancement
#488 - "Golden spike" PR
Pull Request -
State: open - Opened by knighton about 1 year ago
- 1 comment
#487 - Update release yaml to not write anything to GitHub
Pull Request -
State: closed - Opened by karan6181 about 1 year ago
#486 - Check for invalid hash algorithm name
Pull Request -
State: closed - Opened by karan6181 about 1 year ago
#485 - Bump databricks-sdk from 0.8.0 to 0.12.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#484 - Update __init__.py
Pull Request -
State: closed - Opened by XiaohanZhangCMU about 1 year ago
#483 - Hf ingestion
Pull Request -
State: open - Opened by XiaohanZhangCMU about 1 year ago
#482 - Bump sphinx-tabs from 3.4.1 to 3.4.4
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#481 - Bump gitpython from 3.1.37 to 3.1.40
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#480 - Bump fastapi from 0.103.2 to 0.104.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#479 - Better default values for StreamingDataset args
Pull Request -
State: closed - Opened by snarayan21 about 1 year ago
#478 - Modify dataframe_to_mds to accept streaming DF
Pull Request -
State: open - Opened by maddiedawson about 1 year ago
#477 - do not remove local directory when out is local
Pull Request -
State: closed - Opened by XiaohanZhangCMU about 1 year ago
#476 - Relaxing divisibility constraints on num_canonical_nodes and num_physical_nodes
Pull Request -
State: closed - Opened by snarayan21 about 1 year ago
#475 - Request for more informative error messages when cache_limit is hit
Issue -
State: closed - Opened by thayes427 about 1 year ago
- 6 comments
Labels: enhancement
#474 - Bump version to 0.6.1
Pull Request -
State: closed - Opened by karan6181 about 1 year ago
#473 - Fixed codeql out of disk space issue
Pull Request -
State: closed - Opened by karan6181 about 1 year ago
#472 - Maintain order for merge_index_from_list
Pull Request -
State: closed - Opened by XiaohanZhangCMU about 1 year ago
- 1 comment
#471 - Update google-cloud-storage requirement from <2.11.0,>=2.9.0 to >=2.9.0,<2.13.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 2 comments
Labels: dependencies
#470 - Bump databricks-sdk from 0.8.0 to 0.11.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#469 - Fix doc strings
Pull Request -
State: closed - Opened by XiaohanZhangCMU about 1 year ago
#468 - Merge index util
Pull Request -
State: closed - Opened by XiaohanZhangCMU about 1 year ago
#467 - Update example notebook for merging datasets
Pull Request -
State: closed - Opened by amitani about 1 year ago
- 2 comments
#466 - Update MCLI credential page for Databricks
Pull Request -
State: closed - Opened by karan6181 about 1 year ago
#465 - Integration test for dataframe_to_mds and merge_index
Pull Request -
State: closed - Opened by XiaohanZhangCMU about 1 year ago
#464 - Best strategy to do joint image/video training
Issue -
State: open - Opened by universome about 1 year ago
- 1 comment
#463 - Add py1e warning when Shuffle block size is smaller than shard size
Pull Request -
State: closed - Opened by snarayan21 about 1 year ago
#462 - Convert folder of aws s3 parquet to s3 mds
Issue -
State: closed - Opened by segalinc about 1 year ago
- 5 comments
Labels: enhancement
#461 - `ValueError: invalid literal for int() with base 10` when instantiating `StreamingDataset`
Issue -
State: closed - Opened by genesis-jamin about 1 year ago
- 3 comments
Labels: bug
#460 - `get_partitions_orig` returns negative values when number of samples is low
Issue -
State: closed - Opened by antoinebrl about 1 year ago
- 5 comments
Labels: bug
#459 - MDSWriter encountered GCS upload error but kept going
Issue -
State: closed - Opened by genesis-jamin about 1 year ago
- 5 comments
#458 - Bump databricks-sdk from 0.8.0 to 0.10.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#457 - How to clean up shared memory file descriptors?
Issue -
State: closed - Opened by genesis-jamin about 1 year ago
- 6 comments
#456 - Update integration test to include sample order comparison
Pull Request -
State: closed - Opened by karan6181 about 1 year ago
#455 - Bump pydantic from 2.3.0 to 2.4.2
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#454 - Bump fastapi from 0.103.1 to 0.103.2
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#453 - All my JsonWritter jobs failed due to WARNING urllib3.connectionpool Connection pool is full, discarding connection. Connection pool size: 10
Issue -
State: closed - Opened by viyjy about 1 year ago
- 7 comments
#452 - Fix broken bibtext
Pull Request -
State: closed - Opened by Skylion007 about 1 year ago
#451 - `{MDS,Joint}Writer` should be more tolerant to transient failures in remote storage
Issue -
State: closed - Opened by thempatel about 1 year ago
- 8 comments
Labels: bug
#450 - Fix BatchFeature of Transformers not handled by StreamingDataloader
Pull Request -
State: closed - Opened by Hubert-Bonisseur about 1 year ago
#449 - Add merge index file utility
Pull Request -
State: closed - Opened by XiaohanZhangCMU about 1 year ago
#448 - Add a retry logic with backoff and jitter
Pull Request -
State: closed - Opened by karan6181 about 1 year ago
#447 - How to handle broken samples?
Issue -
State: open - Opened by universome about 1 year ago
- 6 comments
#446 - Bump gitpython from 3.1.36 to 3.1.37
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#445 - Bump databricks-sdk from 0.8.0 to 0.9.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#444 - Update google-cloud-storage requirement from <2.11.0,>=2.9.0 to >=2.9.0,<2.12.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 2 comments
Labels: dependencies
#443 - Training on PQ shards
Pull Request -
State: open - Opened by knighton about 1 year ago
#442 - py1e randomized
Pull Request -
State: closed - Opened by snarayan21 about 1 year ago
#441 - A standard Batch type of Transformers is not handled by StreamingDataloader
Issue -
State: closed - Opened by Hubert-Bonisseur about 1 year ago
- 1 comment
Labels: bug
#440 - Enabled the allgather test
Pull Request -
State: closed - Opened by karan6181 about 1 year ago
#439 - Fix stylistic issues (mostly 100col, docstring conventions)
Pull Request -
State: closed - Opened by knighton about 1 year ago
- 1 comment
#438 - Silence spurious lint warning
Pull Request -
State: closed - Opened by knighton about 1 year ago
- 2 comments
#437 - Can't get mid epoch resumption to work
Issue -
State: closed - Opened by Hubert-Bonisseur about 1 year ago
- 2 comments
Labels: bug
#436 - Bump pytest-codeblocks from 0.16.1 to 0.17.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#435 - Bump gitpython from 3.1.34 to 3.1.36
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies
#434 - Validate writer arguments
Pull Request -
State: closed - Opened by karan6181 about 1 year ago
- 2 comments
#433 - Bump version to 0.6.0
Pull Request -
State: closed - Opened by XiaohanZhangCMU about 1 year ago
#432 - changed choose to epoch_size in stream proportion docstring
Pull Request -
State: closed - Opened by snarayan21 about 1 year ago
#431 - Total number of samples when using Streams with proportion?
Issue -
State: closed - Opened by genesis-jamin about 1 year ago
- 1 comment
#430 - tag shared and temp files with username
Pull Request -
State: open - Opened by acutkosky about 1 year ago
- 1 comment
#429 - multiple users on same system encounter permissions errors
Issue -
State: open - Opened by acutkosky about 1 year ago
- 8 comments
Labels: bug