Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / mosaicml/streaming issues and pull requests
#328 - Bump pydantic from 1.10.9 to 1.10.11
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#327 - Bump uvicorn from 0.22.0 to 0.23.0
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#326 - Bump pydantic from 1.10.9 to 2.0.3
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
- 1 comment
Labels: dependencies
#325 - streaming/base/shared/prefix.py: use a complete command to clean up …
Pull Request -
State: closed - Opened by guoyejun over 1 year ago
#324 - DDP with streaming got duplicate data
Issue -
State: closed - Opened by gongel over 1 year ago
- 4 comments
Labels: bug
#323 - Bump pydantic from 1.10.9 to 2.0.2
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
- 2 comments
Labels: dependencies
#322 - Bump fastapi from 0.98.0 to 0.100.0
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
- 1 comment
Labels: dependencies
#321 - Bump pydantic from 1.10.9 to 2.0
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
- 1 comment
Labels: dependencies
#320 - Bump fastapi from 0.98.0 to 0.99.1
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
- 1 comment
Labels: dependencies
#319 - Add a regression test for StreamingDataset using cloud providers
Pull Request -
State: closed - Opened by b-chu over 1 year ago
#318 - Add a regression test for StreamingDataset instantiation and iteration
Pull Request -
State: closed - Opened by b-chu over 1 year ago
#317 - Transfer json folder to Streaming
Issue -
State: closed - Opened by germanjke over 1 year ago
- 2 comments
#316 - Sync tmp directory
Pull Request -
State: closed - Opened by b-chu over 1 year ago
#315 - Add GCS authentication for service accounts
Pull Request -
State: closed - Opened by b-chu over 1 year ago
#314 - Bump fastapi from 0.97.0 to 0.98.0
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#313 - Bump pytest from 7.3.2 to 7.4.0
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#312 - Add secrets check as part of pre-commit
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#311 - Added files to support azure datalake storage
Pull Request -
State: closed - Opened by shivshandilya over 1 year ago
- 7 comments
#310 - Can't load dataset from S3
Issue -
State: closed - Opened by germanjke over 1 year ago
- 15 comments
#309 - Bump myst-parser from 1.0.0 to 2.0.0
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#308 - Bump version to 0.5.1
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#307 - StreamingDataset with DDP hangs and then crashes with NCCL timeout error
Issue -
State: open - Opened by greeneggsandyaml over 1 year ago
- 17 comments
Labels: bug
#306 - Why can't I run two experiments in parallel which will load from the same dataset location?
Issue -
State: closed - Opened by eldarkurtic over 1 year ago
- 8 comments
Labels: bug
#305 - Fix LocalDataset (property size for fancy __getitem__).
Pull Request -
State: closed - Opened by knighton over 1 year ago
#304 - Propagate exception between threads and processes and improved error message
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#303 - fix: :bug: LocalDataset
Pull Request -
State: closed - Opened by tungdq212 over 1 year ago
- 1 comment
#302 - py1bs shuffle algorithm ("staggered py1b")
Pull Request -
State: closed - Opened by knighton over 1 year ago
- 1 comment
#301 - Round drop_first to be divisible by num_physical_nodes.
Pull Request -
State: closed - Opened by knighton over 1 year ago
#300 - LocalDataset bug
Issue -
State: closed - Opened by tungdq212 over 1 year ago
- 6 comments
Labels: bug
#299 - Added a utility method to clean stale shared memory
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#298 - Improved existing exception and exception messages
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#297 - Terminate the main process if thread died unexpectedly
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#296 - Bump pydantic from 1.10.8 to 1.10.9
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#293 - Timeout Error when local=None in StreamingDataset and training in distributed mode
Issue -
State: closed - Opened by vancoykendall over 1 year ago
- 1 comment
Labels: bug
#292 - Support for azure Data Lake Gen2 type storage
Issue -
State: closed - Opened by shivshandilya over 1 year ago
- 6 comments
Labels: enhancement
#288 - Parallel writing of MDS files
Issue -
State: closed - Opened by mpetri over 1 year ago
- 2 comments
Labels: enhancement
#273 - Update README.md - slack
Pull Request -
State: closed - Opened by ejyuen over 1 year ago
#272 - Fix README slack link
Pull Request -
State: closed - Opened by growlix over 1 year ago
- 2 comments
#271 - Composable datasets
Issue -
State: closed - Opened by jacobwjs over 1 year ago
- 4 comments
Labels: enhancement
#270 - Bump furo from 2022.9.29 to 2023.5.20
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
- 2 comments
Labels: dependencies
#269 - Bump fastapi from 0.95.1 to 0.95.2
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
- 2 comments
Labels: dependencies
#268 - keep_raw=False doesn't actually delete shards.
Issue -
State: open - Opened by tbenthompson over 1 year ago
- 3 comments
Labels: bug
#267 - Update Stream documentation
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#266 - Add `Stream` usage example to README
Pull Request -
State: closed - Opened by hanlint over 1 year ago
- 3 comments
#265 - Support any S3-compatible object store (R2, Coreweave, Backblaze, etc.)
Pull Request -
State: open - Opened by abhi-mosaic over 1 year ago
- 1 comment
#264 - Memory leak when using `StreamingDataset`'s `__iter__` method.
Issue -
State: closed - Opened by wadimiusz over 1 year ago
- 8 comments
Labels: bug
#263 - Bugfix in user_guide.md sample code
Pull Request -
State: closed - Opened by tginart over 1 year ago
#262 - Fix slack link in readme
Pull Request -
State: closed - Opened by growlix over 1 year ago
#261 - Resume support for MDSWriter?
Issue -
State: closed - Opened by tbenthompson over 1 year ago
- 3 comments
Labels: enhancement
#260 - Add support for any S3 compatible object storage
Issue -
State: open - Opened by vancoykendall over 1 year ago
- 5 comments
#259 - Fix typo in documentation's conversion `pile.py` link
Pull Request -
State: closed - Opened by ouhenio over 1 year ago
#258 - Fixed Pile documentation link
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
- 1 comment
#257 - Documentation link to `pile.py` script for dataset conversion to MDS format returns 404 error
Issue -
State: closed - Opened by ouhenio over 1 year ago
- 5 comments
#256 - Add support for Azure cloud storage
Pull Request -
State: closed - Opened by hlky over 1 year ago
- 1 comment
#255 - Add support for Cloudflare R2 cloud storage
Pull Request -
State: closed - Opened by hlky over 1 year ago
- 6 comments
#254 - Is there an ETA for adding azure support?
Issue -
State: closed - Opened by njb-ms over 1 year ago
- 5 comments
Labels: enhancement
#253 - Bump uvicorn from 0.21.1 to 0.22.0
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#252 - Bump sphinx from 4.4.0 to 7.0.0
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
- 2 comments
Labels: dependencies
#251 - Create a new boto3 session per thread
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#250 - Shared lock
Pull Request -
State: open - Opened by knighton over 1 year ago
#249 - Update readthedocs python version to 3.9
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#248 - Write example for RedPajama
Issue -
State: closed - Opened by mitchellnw over 1 year ago
- 8 comments
Labels: enhancement
#247 - Support dataset.filter sample filtering
Issue -
State: closed - Opened by mpetri over 1 year ago
- 3 comments
Labels: enhancement
#246 - Better organize code
Pull Request -
State: closed - Opened by knighton over 1 year ago
#245 - Added py.typed to indicate that the repository has typing annotations
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
- 1 comment
#244 - Add py.typed marker file for type checking
Issue -
State: closed - Opened by micimize over 1 year ago
- 2 comments
Labels: bug
#243 - Rename "samples" to "choose" (distinguish underlying vs resampled)
Pull Request -
State: closed - Opened by knighton over 1 year ago
#242 - Raise descriptive error message when index.json is corrupted
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#241 - Propagate an exception raised by a thread to its caller
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#240 - Skip distributed all_gather test since CI non-deterministically hangs
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#239 - Bump version to 0.4.1
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#238 - Fixed local directory check
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#237 - Update torch dependency pin to <2.1
Pull Request -
State: closed - Opened by bandish-shah over 1 year ago
- 1 comment
#236 - Redesign shard index
Pull Request -
State: closed - Opened by knighton over 1 year ago
- 1 comment
#235 - Removed pushing auto release branch due to GH action permission
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#234 - Support of torch 2.0
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#233 - Bump yamllint from 1.30.0 to 1.31.0
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#232 - Add documentation for MDSWriter, conversion scripts, and supported format
Pull Request -
State: open - Opened by karan6181 over 1 year ago
- 1 comment
#231 - Add a requester pays bucket permission args to boto3 for s3 download file
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#230 - Bump pytest from 7.3.0 to 7.3.1
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#229 - Bump sphinx-copybutton from 0.5.1 to 0.5.2
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#228 - Bump sphinxext-opengraph from 0.8.1 to 0.8.2
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#227 - Bump fastapi from 0.95.0 to 0.95.1
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#226 - Virtually split the repeats of repeated shards
Pull Request -
State: closed - Opened by knighton over 1 year ago
- 2 comments
#225 - StreamingDataset with torch.nn.parallel.DistributedDataParallel
Issue -
State: closed - Opened by amallia over 1 year ago
- 6 comments
Labels: enhancement
#224 - Switch documentation search to use Algolia
Pull Request -
State: closed - Opened by bandish-shah over 1 year ago
#223 - Add two shuffling algos: naive (globally) and py1b (fixed-size blocks).
Pull Request -
State: closed - Opened by knighton over 1 year ago
- 3 comments
#222 - Bump pytest from 7.2.2 to 7.3.0
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#221 - Add installation and environments documentation
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#220 - Add a readme for multimodal convert script modal type
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#219 - Cold shard eviction
Pull Request -
State: open - Opened by knighton over 1 year ago
- 4 comments
#218 - Refactor StreamingDataset shared memory prefix setup
Pull Request -
State: closed - Opened by knighton over 1 year ago
#217 - Shared dir selection method prone to collisions in concurrent scenarios
Issue -
State: open - Opened by mx781 over 1 year ago
- 5 comments
Labels: bug
#216 - Bump version to 0.4.0
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#215 - Register atexit handler for resource cleanup
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
- 1 comment
#214 - Allow for accessing slices of dataset
Issue -
State: closed - Opened by VictorSanh over 1 year ago
- 5 comments
Labels: enhancement
#213 - Questions about `StreamingDataset` in the case of limited (fast) local disk storage
Issue -
State: closed - Opened by VictorSanh over 1 year ago
- 2 comments
Labels: enhancement
#212 - Raise an exception if bucket does not exist during upload
Pull Request -
State: closed - Opened by karan6181 over 1 year ago
#211 - Bump pydantic from 1.10.6 to 1.10.7
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies
#210 - Bump furo from 2022.9.29 to 2023.3.27
Pull Request -
State: closed - Opened by dependabot[bot] over 1 year ago
- 2 comments
Labels: dependencies