Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / lightning-ai/litdata issues and pull requests
#415 - Change S3Client to use user-provided storage_options even in Studio
Pull Request -
State: open - Opened by grez72 1 day ago
- 1 comment
#414 - use storage_options even when IS_IN_STUDIO
Issue -
State: open - Opened by grez72 1 day ago
Labels: enhancement
#413 - Multithreading function for merge_datasets
Pull Request -
State: closed - Opened by yhl48 6 days ago
- 1 comment
#412 - Fix AttributeError in `BinaryReader` Destructor Due to Non-Existent `_prepare_thread` Attribute
Pull Request -
State: closed - Opened by Kidand 8 days ago
#411 - `StreamingDataloader` is not split on each rank when training
Issue -
State: closed - Opened by Aceticia 8 days ago
- 8 comments
Labels: bug, help wanted
#410 - Bump version to 0.2.30
Pull Request -
State: closed - Opened by bhimrazy 10 days ago
- 1 comment
#409 - Clear Examples of use with different dataset types and code changes.
Issue -
State: open - Opened by Woodr7 10 days ago
- 2 comments
Labels: enhancement
#408 - training hangs with lightning ddp and cloud dir?
Issue -
State: open - Opened by rxqy 14 days ago
- 3 comments
Labels: bug, help wanted
#405 - docs: specify custom cache directory
Pull Request -
State: closed - Opened by bhimrazy 17 days ago
- 1 comment
Labels: documentation
#404 - Fix broken link for CONTRIBUTING.md
Pull Request -
State: closed - Opened by bhimrazy 17 days ago
- 1 comment
#403 - `use_checkpoint=True` creates invalid config.json file
Issue -
State: closed - Opened by cyrildiagne 17 days ago
- 4 comments
Labels: bug, help wanted
#402 - incorrect dataloader length when `drop_last=False`
Issue -
State: open - Opened by grez72 17 days ago
- 1 comment
Labels: bug, help wanted
#401 - Feat/add support for numpy datatypes in tokensloader
Pull Request -
State: closed - Opened by bhimrazy 18 days ago
- 1 comment
Labels: enhancement
#400 - Feature: Add support for numpy datatypes in TokensLoader
Issue -
State: closed - Opened by bhimrazy 18 days ago
Labels: enhancement
#399 - Feat: add support for custom cache dir in Streaming Dataset
Pull Request -
State: closed - Opened by bhimrazy 18 days ago
- 1 comment
Labels: enhancement
#398 - Existing Cache files leads to permanent DataLoader hang
Issue -
State: closed - Opened by lilavocado 30 days ago
- 5 comments
Labels: bug, help wanted
#397 - pass storage options to s5cmd
Pull Request -
State: closed - Opened by bhimrazy about 1 month ago
- 2 comments
Labels: enhancement
#396 - Combine Small StreamingDatasets into 1 Large StreamingDataset
Issue -
State: closed - Opened by schopra8 about 1 month ago
- 5 comments
Labels: enhancement
#395 - correct the chunk size by adding header size
Pull Request -
State: closed - Opened by tchaton about 1 month ago
- 1 comment
#394 - correct the chunk size by adding header size
Pull Request -
State: closed - Opened by dangthatsright about 1 month ago
- 2 comments
#393 - Writing / Reading Bug involving writer `chunk_bytes` information
Issue -
State: closed - Opened by dangthatsright about 1 month ago
- 5 comments
Labels: bug, help wanted
#392 - Add Support for Custom S3 Configuration in s5cmd
Issue -
State: closed - Opened by csy1204 about 1 month ago
- 2 comments
Labels: enhancement
#391 - CONTRIBUTING.md for LitData
Pull Request -
State: closed - Opened by deependujha about 1 month ago
- 5 comments
#390 - fix: non-deterministic CI test failure
Pull Request -
State: closed - Opened by deependujha about 1 month ago
- 1 comment
#389 - `One of the worker has failed` error in test
Issue -
State: closed - Opened by deependujha about 1 month ago
- 1 comment
Labels: bug, help wanted
#388 - TreeSpec Error Accessing Data
Issue -
State: closed - Opened by jmoller93 about 1 month ago
- 5 comments
Labels: bug, help wanted
#386 - Improve CombinedStreamingDataset to handle multiple subdatasets efficiently
Issue -
State: open - Opened by bhimrazy about 1 month ago
Labels: enhancement
#385 - Update Docs: Merge multiple optimized datasets into one
Pull Request -
State: closed - Opened by bhimrazy about 1 month ago
- 1 comment
Labels: documentation
#384 - update tags in pkg metadata
Pull Request -
State: closed - Opened by Borda about 2 months ago
- 1 comment
Labels: documentation
#383 - Bump version 0.2.29
Pull Request -
State: closed - Opened by deependujha about 2 months ago
- 3 comments
#382 - Update `PL Data` to `LitData`
Pull Request -
State: closed - Opened by bhimrazy about 2 months ago
- 1 comment
#381 - Fix/large num chunks error
Pull Request -
State: closed - Opened by bhimrazy about 2 months ago
- 3 comments
#380 - Revert "Feat: Using fsspec to download files"
Pull Request -
State: closed - Opened by tchaton about 2 months ago
- 4 comments
#379 - Bump version to 0.2.27
Pull Request -
State: closed - Opened by bhimrazy about 2 months ago
- 2 comments
#378 - Bump version to 0.2.27.dev
Pull Request -
State: closed - Opened by rasbt about 2 months ago
- 2 comments
#377 - fix import & asignement issue
Pull Request -
State: closed - Opened by Borda about 2 months ago
- 2 comments
Labels: bug
#376 - improve hint readability
Pull Request -
State: closed - Opened by Borda about 2 months ago
- 2 comments
#375 - Fix: Chunks deletion issue
Pull Request -
State: closed - Opened by deependujha about 2 months ago
- 11 comments
#374 - fixing docstrings
Pull Request -
State: closed - Opened by Borda about 2 months ago
- 2 comments
#373 - reduce unnecessary `pass`
Pull Request -
State: closed - Opened by Borda about 2 months ago
- 2 comments
#372 - remove not violated bandit rules from ignore
Pull Request -
State: closed - Opened by Borda about 2 months ago
- 1 comment
#371 - fixing typos in errors & docs
Pull Request -
State: closed - Opened by Borda about 2 months ago
- 2 comments
#370 - The config isn't consistent between chunks
Issue -
State: open - Opened by AugustDev about 2 months ago
- 5 comments
Labels: bug, help wanted
#369 - switch `lightning-cloud` to lightning SDK
Pull Request -
State: closed - Opened by Borda about 2 months ago
- 3 comments
Labels: enhancement, dependencies
#368 - How can I shut down automatically distributing data when using StreamingDataset?
Issue -
State: open - Opened by ygtxr1997 2 months ago
- 3 comments
Labels: enhancement, question
#367 - RuntimeError: All the chunks should have been deleted. Found ['chunk-0-0.bin']
Issue -
State: closed - Opened by rasbt 2 months ago
- 11 comments
Labels: bug, help wanted
#366 - Large number of chunks causes `OSError: [Errno 24] Too many open files`
Issue -
State: closed - Opened by fdalvi 2 months ago
- 8 comments
Labels: bug, help wanted
#365 - azure storage options
Pull Request -
State: closed - Opened by mohanreddypmr 2 months ago
- 3 comments
#364 - Bump cryptography from 42.0.8 to 43.0.1 in /requirements
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
- 1 comment
Labels: dependencies
#363 - Failed to Resume Training w/ CombinedStreamingDataset
Issue -
State: open - Opened by schopra8 2 months ago
- 1 comment
Labels: bug, duplicate, help wanted
#362 - [WIP] : Fix resume issues with combined streaming dataset in dataloader
Pull Request -
State: open - Opened by bhimrazy 2 months ago
- 6 comments
#361 - ci: drop dependabot
Pull Request -
State: closed - Opened by Borda 2 months ago
- 1 comment
#360 - LitData release 0.2.26
Pull Request -
State: closed - Opened by tchaton 2 months ago
- 1 comment
#359 - Update README.md
Pull Request -
State: closed - Opened by tchaton 2 months ago
- 1 comment
#358 - Update README.md
Pull Request -
State: closed - Opened by tchaton 2 months ago
- 1 comment
#357 - tchaton patch 1
Pull Request -
State: closed - Opened by tchaton 2 months ago
- 1 comment
#356 - Update README.md
Pull Request -
State: closed - Opened by tchaton 2 months ago
- 1 comment
#355 - bump/ci: update to `0.11.7`
Pull Request -
State: closed - Opened by Borda 2 months ago
- 1 comment
Labels: ci / tests
#354 - A contributing.md for the project
Issue -
State: closed - Opened by deependujha 2 months ago
- 1 comment
Labels: enhancement, good first issue
#353 - Fix: Prevent multiple processes from copying the same file when using…
Pull Request -
State: closed - Opened by dallmann-uniwue 2 months ago
- 5 comments
#352 - Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.0
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
Labels: ci / tests
#351 - When using DDP, processes see truncated cached index.json when data is loaded from a mounted network filesystem
Issue -
State: closed - Opened by dallmann-uniwue 2 months ago
- 3 comments
Labels: bug, help wanted
#350 - Adds check for existence of dataset path before loading index file
Pull Request -
State: closed - Opened by bhimrazy 2 months ago
- 1 comment
#349 - Error Should Indicate Missing Folder Instead of Missing index.json File
Issue -
State: closed - Opened by bhimrazy 2 months ago
- 1 comment
Labels: bug, help wanted
#348 - Feat: Using fsspec to download files
Pull Request -
State: closed - Opened by deependujha 2 months ago
- 6 comments
#347 - Update numpy requirement from <2.0 to <3.0
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
- 1 comment
Labels: ci / tests
#346 - Bump mosaicml-streaming from 0.8.0 to 0.8.1
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
Labels: ci / tests
#345 - Bump coverage from 7.5.3 to 7.6.1
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
- 1 comment
Labels: ci / tests
#344 - Tests related to torchaudio fail
Issue -
State: closed - Opened by deependujha 2 months ago
- 1 comment
Labels: bug, help wanted
#343 - Bump: release version 0.2.25
Pull Request -
State: closed - Opened by bhimrazy 3 months ago
- 1 comment
#342 - Fix: Ensure Compression Algorithm is Installed Before Reading Compressed Data
Pull Request -
State: closed - Opened by bhimrazy 3 months ago
- 3 comments
#341 - Bug: Loading compressed data fails silently (no error message, the application simply hangs up)
Issue -
State: closed - Opened by AugustDev 3 months ago
- 3 comments
Labels: bug, help wanted
#340 - CombinedStreamingDataset causes NCCL timeout when using multiple nodes
Issue -
State: open - Opened by hubenjm 3 months ago
- 15 comments
Labels: bug, help wanted
#339 - Lazyload subsamples if subsample=1.0
Issue -
State: open - Opened by deependujha 3 months ago
Labels: enhancement, question
#338 - boost(ci): run tests in parallel
Pull Request -
State: closed - Opened by Borda 3 months ago
- 2 comments
#337 - StreamingDataset intermittently fails due to lack of index.json
Issue -
State: open - Opened by plra 3 months ago
- 2 comments
Labels: bug, help wanted
#336 - bump: use the latest/fixed version of `RequirementCache`
Pull Request -
State: closed - Opened by Borda 3 months ago
- 1 comment
Labels: enhancement
#335 - ci: enable testing `py3.10` & prune unused workflows
Pull Request -
State: closed - Opened by Borda 3 months ago
- 1 comment
#334 - fix(lint): prune invalid configurations
Pull Request -
State: closed - Opened by Borda 3 months ago
#333 - fix(ci): prune duplicated tests/checks
Pull Request -
State: closed - Opened by Borda 3 months ago
#332 - Bump: release version 0.2.24
Pull Request -
State: closed - Opened by bhimrazy 3 months ago
#331 - Bug: Inconsistent Behavior with StreamingDataloader loading states (specific to CombinedStreamingDataset)
Issue -
State: open - Opened by bhimrazy 3 months ago
Labels: bug, help wanted
#330 - Reset state_dict after resume
Pull Request -
State: closed - Opened by vgurev 3 months ago
- 5 comments
#328 - Bug: Issues with Dataloader Batching Resulting in Uneven number of Batches and Streamed Items
Issue -
State: closed - Opened by bhimrazy 3 months ago
- 2 comments
Labels: bug, help wanted
#327 - Use different batch sizes in CombinedStreamingDataset
Issue -
State: open - Opened by schopra8 3 months ago
- 1 comment
Labels: enhancement, help wanted
#326 - Nitpick: random state best practice
Pull Request -
State: closed - Opened by deependujha 3 months ago
- 1 comment
#323 - Expose max download param
Pull Request -
State: closed - Opened by animan42 3 months ago
- 4 comments
#318 - Bugfix: inconsistent streaming dataloader state (specific to StreamingDataset)
Pull Request -
State: closed - Opened by bhimrazy 3 months ago
- 1 comment
Labels: priority 0
#316 - Bug: Inconsistent Behavior with StreamingDataloader loading states (specific to StreamingDataset)
Issue -
State: closed - Opened by bhimrazy 3 months ago
Labels: bug, help wanted, priority 0
#309 - Fix: Optimize function error on linux
Pull Request -
State: closed - Opened by deependujha 3 months ago
- 2 comments
#272 - Fix: failing tests due to future warning related to torch.loads(weights_only=True)
Pull Request -
State: closed - Opened by deependujha 4 months ago
- 2 comments
#271 - Fix: optimize() with num_workers > 1 leads to deletion issues
Pull Request -
State: closed - Opened by deependujha 4 months ago
- 4 comments
#263 - Resuming Training with New Dataset Fails
Issue -
State: closed - Opened by schopra8 4 months ago
- 6 comments
Labels: bug, help wanted
#245 - `optimize()` with `num_workers > 1` leads to deletion issues
Issue -
State: closed - Opened by awaelchli 4 months ago
- 7 comments
Labels: bug, help wanted, ci / tests
#218 - Is the multinode data processing only available in lightning studio?
Issue -
State: closed - Opened by rishabhm12 4 months ago
- 6 comments
Labels: enhancement, help wanted
#191 - Add support for parquet files for storing the chunks
Issue -
State: open - Opened by tchaton 5 months ago
- 3 comments
Labels: enhancement, help wanted
#181 - Using fsspec to download files
Issue -
State: open - Opened by samsja 5 months ago
- 5 comments
Labels: enhancement, help wanted
#173 - Resolve num_workers when the user provides 0
Pull Request -
State: closed - Opened by tchaton 5 months ago
#172 - Warning Message When Using StreamingDataset with DDP
Issue -
State: closed - Opened by taemincho 5 months ago
- 2 comments
Labels: bug, help wanted
#100 - Fix `map()` failing to create dataset when `input_dir` is None
Pull Request -
State: closed - Opened by awaelchli 7 months ago
Labels: bug