Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / mosaicml/streaming issues and pull requests

#428 - Bump pytest from 7.4.1 to 7.4.2

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 2 comments
Labels: dependencies

#427 - Bump gitpython from 3.1.34 to 3.1.35

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#426 - Optimize dataframe writer (small change)

Pull Request - State: closed - Opened by Skylion007 about 1 year ago

#425 - Fix deadlock

Pull Request - State: closed - Opened by acutkosky about 1 year ago - 4 comments

#424 - Fixed python version

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#423 - Remove torchtext

Pull Request - State: closed - Opened by mvpatel2000 about 1 year ago - 1 comment

#422 - Fix nb

Pull Request - State: closed - Opened by XiaohanZhangCMU about 1 year ago - 1 comment

#421 - How to enable mid-epoch resumption?

Issue - State: closed - Opened by genesis-jamin about 1 year ago - 2 comments

#420 - Raise an exception if cache limit is too low

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#419 - Does the `shuffle` arg on the dataloader do anything?

Issue - State: closed - Opened by genesis-jamin about 1 year ago - 2 comments

#417 - StreamingDataset deadlocks when cache_limit is smaller than the size of the shards

Issue - State: closed - Opened by Hubert-Bonisseur about 1 year ago - 1 comment
Labels: bug

#416 - Support online de-compressing of shards on LocalDataset as it is already done for StreamingDataset

Issue - State: closed - Opened by sagnak about 1 year ago - 2 comments
Labels: enhancement

#414 - Bump databricks-sdk from 0.6.0 to 0.8.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#413 - Bump fastapi from 0.103.0 to 0.103.1

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#412 - Bump databricks-sdk from 0.6.0 to 0.7.1

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#411 - Bump pytest from 7.4.0 to 7.4.1

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#410 - Bump gitpython from 3.1.32 to 3.1.34

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#409 - Trying to instantiate streamingdataset from hydra/omegaconf causes error

Issue - State: closed - Opened by mpetri about 1 year ago - 8 comments
Labels: bug

#408 - Stratified Batching

Pull Request - State: closed - Opened by snarayan21 about 1 year ago

#407 - Per Stream Batching

Pull Request - State: closed - Opened by snarayan21 about 1 year ago

#406 - Reusable local directory when remote is None

Pull Request - State: closed - Opened by karan6181 about 1 year ago

#405 - File Not Found Error Training on Amazon Sagemaker

Issue - State: closed - Opened by dheepan-aws about 1 year ago - 4 comments
Labels: bug

#404 - Bump databricks-sdk from 0.3.1 to 0.6.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#403 - Bump pydantic from 2.2.1 to 2.3.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#402 - Bump fastapi from 0.101.1 to 0.103.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#401 - Reusable local directory

Issue - State: closed - Opened by luke-han about 1 year ago - 4 comments
Labels: enhancement

#400 - Multi node training

Issue - State: closed - Opened by palash04 over 1 year ago - 8 comments
Labels: bug

#399 - example requested: debug individual shard for poison pills

Issue - State: closed - Opened by mooreniemi over 1 year ago - 6 comments

#397 - Support tensor parallel/pipeline parallel

Issue - State: closed - Opened by gongel over 1 year ago - 8 comments
Labels: enhancement

#396 - Fix MosaicML platform credential setup links

Pull Request - State: closed - Opened by karan6181 over 1 year ago

#395 - Audio?

Issue - State: closed - Opened by cinjon over 1 year ago - 6 comments
Labels: enhancement

#394 - Expanded range shuffle

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#393 - How to embody this characteristic: True Determinism

Issue - State: closed - Opened by gongel over 1 year ago - 2 comments
Labels: bug

#392 - Per-stream processing

Issue - State: open - Opened by lorabit110 over 1 year ago - 7 comments
Labels: enhancement

#391 - Improve shard efficiency of sampling for fractional stream repeats.

Pull Request - State: closed - Opened by knighton over 1 year ago

#390 - Plug hole in MDS type system: add arbitrary-precision decimal

Pull Request - State: closed - Opened by knighton over 1 year ago - 2 comments

#389 - Bump pydantic from 2.1.1 to 2.2.1

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#388 - Bump furo from 2023.7.26 to 2023.8.19

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 2 comments
Labels: dependencies

#387 - Bump fastapi from 0.101.0 to 0.101.1

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#386 - deadlock when process exits before end of epoch

Issue - State: closed - Opened by acutkosky over 1 year ago - 2 comments
Labels: bug

#385 - The Simulator!

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#384 - Experiment: much more powerful MDS type system ("DBS").

Pull Request - State: open - Opened by knighton over 1 year ago

#383 - Fixed predownload value to zero issue

Pull Request - State: closed - Opened by karan6181 over 1 year ago - 1 comment

#382 - Fixed fake AWS credentials

Pull Request - State: closed - Opened by karan6181 over 1 year ago

#381 - Compatibility with transformers.Trainer

Issue - State: open - Opened by lorabit110 over 1 year ago - 7 comments
Labels: enhancement

#380 - Better documentation for epoch_size

Issue - State: closed - Opened by lorabit110 over 1 year ago - 4 comments
Labels: enhancement

#379 - Benchmarking partitioning

Pull Request - State: closed - Opened by knighton over 1 year ago - 2 comments

#378 - fixed comments

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#377 - Stream unspecified docstring change

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#376 - Add google Application Default Credentials to download

Pull Request - State: closed - Opened by fgerzer over 1 year ago - 2 comments

#375 - Add a regression test for mixing of different dataset streams

Pull Request - State: closed - Opened by b-chu over 1 year ago

#374 - Epoch size default behavior

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#373 - Py1br algorithm implementation

Pull Request - State: closed - Opened by snarayan21 over 1 year ago - 1 comment

#372 - Check if index.json exists locally before downloading

Pull Request - State: closed - Opened by karan6181 over 1 year ago

#371 - How to write MDS files in parallel? Or does it support multiple files?

Issue - State: closed - Opened by Gustfh over 1 year ago - 2 comments
Labels: enhancement

#370 - version bump to 0.5.2

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#369 - Fixed CI test to perform proper directory cleanup

Pull Request - State: closed - Opened by karan6181 over 1 year ago

#368 - Bump uvicorn from 0.23.1 to 0.23.2

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#367 - Bump fastapi from 0.100.0 to 0.101.0

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#366 - Bump pydantic from 1.10.11 to 2.1.1

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#365 - Bench/plot sample access times across data and across formats

Pull Request - State: closed - Opened by knighton over 1 year ago - 2 comments

#364 - Apply ruff pre-commit hook

Pull Request - State: closed - Opened by Skylion007 over 1 year ago

#363 - Add delta to mds converter

Pull Request - State: closed - Opened by XiaohanZhangCMU over 1 year ago - 3 comments

#362 - Add support for Databricks File System backend

Pull Request - State: closed - Opened by maddiedawson over 1 year ago - 4 comments

#361 - Add support for downloading from Unity Catalog volumes

Pull Request - State: closed - Opened by maddiedawson over 1 year ago - 2 comments

#360 - Add download_from_databricks_uc_volume to streaming.base.storage

Pull Request - State: closed - Opened by maddiedawson over 1 year ago - 1 comment

#359 - Add a regression test for shuffling sample order

Pull Request - State: closed - Opened by b-chu over 1 year ago

#358 - Add iteration time test as part of regression testing

Pull Request - State: closed - Opened by karan6181 over 1 year ago

#357 - mds ndarray int conversion

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#356 - Fixed sampling

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#355 - Mds int conversion

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#354 - Bump furo from 2023.5.20 to 2023.7.26

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#353 - Bump pydantic from 1.10.11 to 2.1.1

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#352 - Bump uvicorn from 0.23.1 to 0.23.2

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#351 - Bump fastapi from 0.100.0 to 0.100.1

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#350 - Request for more ways to divide shards besides size_limit

Issue - State: closed - Opened by luke-han over 1 year ago - 2 comments
Labels: enhancement

#349 - Fixed sampling

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#348 - added default behavior if no streams and epoch_size specified

Pull Request - State: closed - Opened by snarayan21 over 1 year ago

#347 - Use the training process to inform the data loading process

Issue - State: closed - Opened by xiamengzhou over 1 year ago - 5 comments

#346 - Download the index.json file as tmp extension until it finishes

Pull Request - State: closed - Opened by karan6181 over 1 year ago

#345 - json.decoder.JSONDecodeError: Index file is empty or corrupted

Issue - State: closed - Opened by shivshandilya over 1 year ago - 2 comments

#344 - mixing dataset, get the dataset name

Issue - State: closed - Opened by gongbudaizhe over 1 year ago - 2 comments

#343 - Update contribution guide and improved unittest logic

Pull Request - State: closed - Opened by karan6181 over 1 year ago

#342 - Mixing Datasets

Issue - State: closed - Opened by Srini-98 over 1 year ago - 5 comments
Labels: enhancement

#341 - Fixed accidental shard delete test

Pull Request - State: closed - Opened by karan6181 over 1 year ago

#340 - Updated pre commit packages

Pull Request - State: closed - Opened by snarayan21 over 1 year ago - 1 comment

#339 - cache_limit bottlenecks the performance

Issue - State: closed - Opened by mayankjobanputra over 1 year ago - 12 comments
Labels: bug

#338 - Bump uvicorn from 0.23.0 to 0.23.1

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#337 - Using Streaming Dataset for Multi Node DDP

Issue - State: closed - Opened by Srini-98 over 1 year ago
Labels: bug

#336 - Could you explain the principle behind streaming?

Issue - State: closed - Opened by deeptimhe over 1 year ago - 10 comments

#335 - Create index.json from set of mds shards

Issue - State: closed - Opened by justinpinkney over 1 year ago - 5 comments
Labels: enhancement

#333 - human-readable suffixes for size_limit and epoch_size

Pull Request - State: closed - Opened by snarayan21 over 1 year ago - 1 comment

#332 - Shared Memory issue with multiple instances of Streaming Dataset in a multi-gpu setup

Issue - State: open - Opened by shivshandilya over 1 year ago - 23 comments
Labels: bug

#331 - There appear to be 2 leaked shared_memory objects to clean up at shutdown

Issue - State: closed - Opened by guoyejun over 1 year ago - 7 comments
Labels: bug

#330 - Fix init local dir zip-only shard handling

Pull Request - State: closed - Opened by knighton over 1 year ago

#329 - Bump gitpython from 3.1.31 to 3.1.32

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies