Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / Lightning-AI/lightning issues and pull requests
#19082 - TransformerEngine fallback compute dtype
Pull Request -
State: open - Opened by carmocca 10 months ago
- 2 comments
Labels: docs, fabric, pl, precision: te
#19080 - Fix `item_per_sec` metric in ThroughputMonitor
Pull Request -
State: closed - Opened by awaelchli 10 months ago
- 2 comments
Labels: bug, ready, fabric, performance, pl
#19079 - Fix more flaky tests
Pull Request -
State: open - Opened by awaelchli 10 months ago
Labels: fabric
#19078 - DataProcess Refactor: Global queue 1/n
Pull Request -
State: open - Opened by tchaton 10 months ago
Labels: data
#19077 - Add Weights & Biases (W&B) Fabric Logger
Pull Request -
State: open - Opened by ash0ts 10 months ago
- 1 comment
Labels: docs, fabric, pl
#19076 - Cannot open docs & forum on official site from Russia
Issue -
State: open - Opened by IrinaArmstrong 10 months ago
- 2 comments
Labels: bug, docs, ver: 2.2.x
#19075 - Feature/17638 support deepspeed stage 1 offload
Pull Request -
State: open - Opened by nik777 10 months ago
- 1 comment
Labels: fabric, pl
#19074 - Fix comm initialization in `MPIEnvironment`
Pull Request -
State: closed - Opened by awaelchli 10 months ago
- 2 comments
Labels: bug, ready, ci, fabric, pl, fun, environment: mpi
#19073 - About LR scheduler
Issue -
State: closed - Opened by morestart 10 months ago
- 6 comments
Labels: repro needed
#19072 - ModelCheckpoint does not work properly when monitored metric is only logged on rank 0
Issue -
State: closed - Opened by nicolasch96 10 months ago
- 4 comments
Labels: bug, repro needed, ver: 2.2.x
#19071 - [NEPTUNE] Optimize uploading k-best model checkpoints
Pull Request -
State: open - Opened by AleksanderWWW 10 months ago
Labels: pl
#19070 - MetricTracker that also logs the maximum/minimum values
Issue -
State: closed - Opened by crazyboy9103 10 months ago
- 2 comments
Labels: feature, callback
#19069 - Bump AButler/upload-release-assets from 2.0 to 3.0
Pull Request -
State: closed - Opened by dependabot[bot] 10 months ago
Labels: ready, ci
#19068 - Add `@override` for files in `src/lightning/fabric/accelerators`
Pull Request -
State: closed - Opened by VictorPrins 10 months ago
- 1 comment
Labels: ready, fabric, code quality, community
#19067 - Clarify setup of optimizer when using `empty_init=True`
Pull Request -
State: closed - Opened by awaelchli 10 months ago
- 1 comment
Labels: ready, docs, fabric, strategy: fsdp, fun
#19066 - Clarify requirements for `Trainer.fit(ckpt_path="last")`
Pull Request -
State: closed - Opened by awaelchli 10 months ago
- 2 comments
Labels: ready, docs, trainer, pl, fun
#19065 - Add `@override` for remaining files in `src/lightning/pytorch`
Pull Request -
State: closed - Opened by VictorPrins 10 months ago
- 2 comments
Labels: ready, code quality, community, pl
#19064 - Fix ModelCheckpoint alternating between versioned and unversioned file
Pull Request -
State: closed - Opened by awaelchli 10 months ago
- 2 comments
Labels: bug, ready, callback: model checkpoint, pl, fun
#19063 - Test TPU
Pull Request -
State: closed - Opened by sakul1234 10 months ago
Labels: pl
#19062 - MPIEnvironment fails for MPI multi-node training at comm gather step of worker nodes
Issue -
State: closed - Opened by sohrabi1 10 months ago
- 1 comment
Labels: bug, help wanted, ver: 2.1.x, environment: mpi
#19061 - Delay `Precision.convert_module` until `configure_model` has run [TPU]
Pull Request -
State: open - Opened by carmocca 10 months ago
- 2 comments
Labels: bug, fabric, pl, precision: bnb, precision: te
#19060 - Reorder `configure_model`
Pull Request -
State: open - Opened by carmocca 10 months ago
- 2 comments
Labels: docs, breaking change, pl
#19059 - Expand paths that start with "~" in `Trainer.default_root_dir` and other places
Issue -
State: open - Opened by awaelchli 10 months ago
- 2 comments
Labels: feature, help wanted, good first issue
#19058 - Fix ModelCheckpoint dirpath expanding home prefix
Pull Request -
State: closed - Opened by awaelchli 10 months ago
- 3 comments
Labels: bug, ready, callback: model checkpoint, pl, fun
#19057 - Remove outdated "optimized installation" section from docs
Pull Request -
State: closed - Opened by awaelchli 10 months ago
- 1 comment
Labels: ready, docs, pl, fun
#19056 - Clarify `self.log(..., rank_zero_only=True|False)`
Pull Request -
State: closed - Opened by awaelchli 10 months ago
- 2 comments
Labels: bug, ready, docs, callback: model checkpoint, pl, fun
#19055 - Fix the "our" word duplication in the docs
Pull Request -
State: open - Opened by Jamim 10 months ago
Labels: ready, docs, fabric, community, pl
#19054 - Make `ModelCheckpoint._format_checkpoint_name` an instance method
Pull Request -
State: closed - Opened by awaelchli 10 months ago
Labels: bug, ready, callback: model checkpoint, pl, fun
#19053 - Fix docs for 'nn.Module from checkpoint'
Pull Request -
State: closed - Opened by awaelchli 10 months ago
- 1 comment
Labels: ready, docs, pl, fun
#19052 - Add fault tolerance Streaming Dataset 2/n
Pull Request -
State: closed - Opened by tchaton 10 months ago
- 1 comment
Labels: ready, data
#19051 - Move`ignore[override]` annotation to method signature
Pull Request -
State: closed - Opened by VictorPrins 10 months ago
- 4 comments
Labels: code quality, community, pl
#19050 - Add numpy support for the StreamingDataset 1/2
Pull Request -
State: closed - Opened by tchaton 10 months ago
- 1 comment
Labels: ready, data
#19049 - Add fault tolerance for the StreamingDataset 1/n
Pull Request -
State: closed - Opened by tchaton 10 months ago
- 2 comments
Labels: ready, data
#19048 - TypeError: __init__() got an unexpected keyword argument 'drop_last'
Issue -
State: closed - Opened by liziru 10 months ago
- 3 comments
Labels: bug, waiting on author, data handling, ver: 2.1.x
#19047 - Loading nn.Module from checkpoint doc fix
Issue -
State: closed - Opened by montehoover 10 months ago
- 2 comments
Labels: docs
#19046 - Cast to >=float32 tensor when passing scalar to self.log
Pull Request -
State: closed - Opened by MF-FOOM 10 months ago
- 1 comment
Labels: bug, ready, logging, community, pl
#19045 - Training hangs with DDP + ModelCheckpoint Callback
Issue -
State: closed - Opened by MattMcPartlon 10 months ago
- 8 comments
Labels: bug, distributed, callback: model checkpoint, ver: 2.1.x
#19044 - Add direct s3 support to the streaming dataset
Pull Request -
State: closed - Opened by tchaton 10 months ago
- 1 comment
Labels: ready, data
#19042 - Train diffusion model with fabric
Issue -
State: closed - Opened by caiqi 10 months ago
- 6 comments
Labels: question, strategy: deepspeed, fabric, ver: 2.1.x
#19041 - Add disk usage check before downloading files
Pull Request -
State: closed - Opened by tchaton 10 months ago
- 1 comment
Labels: ready, ci, data
#19040 - PyTest random order for Fabric tests
Pull Request -
State: closed - Opened by awaelchli 10 months ago
- 2 comments
Labels: ready, ci, fabric, tests, fun
#19039 - Filter test names in `run_standalone_tests.sh` when checking for errors
Issue -
State: open - Opened by awaelchli 10 months ago
- 1 comment
Labels: feature, help wanted, tests
#19038 - Re-enable dynamo tests that were fixed in PyTorch 2.1
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 2 comments
Labels: ready, fabric, tests, pl, fun
#19037 - Call `find_free_network_port` only if necessary
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 2 comments
Labels: ready, refactor, fabric, environment: lightning, fun
#19036 - Call `configure_model()` in `LM.load_from_checkpoint()`
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 2 comments
Labels: feature, ready, strategy: fsdp, pl, fun
#19035 - Loggers fails to create metrics.csv file when running on multiple TPU cores
Issue -
State: open - Opened by javiergaitan 11 months ago
- 1 comment
Labels: bug, help wanted, strategy: xla, ver: 2.2.x
#19034 - Remove unnecessary torch.cuda.manual_seed_all()
Pull Request -
State: closed - Opened by JalinWang 11 months ago
- 1 comment
Labels: ready, refactor, fabric, community
#19033 - Remove unnecessary `torch.cuda.manual_seed_all()` in `seed_everything()`
Issue -
State: closed - Opened by JalinWang 11 months ago
- 2 comments
Labels: help wanted, refactor
#19032 - Lightning CLI optimizer from command line for subset of model parameters
Issue -
State: closed - Opened by nilsleh 11 months ago
- 3 comments
Labels: feature, lightningcli
#19031 - Bump Lightning-AI/utilities from 0.9.0 to 0.10.0
Pull Request -
State: closed - Opened by dependabot[bot] 11 months ago
Labels: ready, ci
#19030 - Fix `rank_zero_only` rank not set in ddp-spawn based strategies
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 2 comments
Labels: bug, ready, priority: 0, fabric, strategy: ddp, pl, fun
#19029 - The training process is fast, but it becomes particularly slow during validation.
Issue -
State: closed - Opened by shiyao1999 11 months ago
- 5 comments
Labels: bug, strategy: ddp, performance, ver: 1.8.x, repro needed
#19027 - ThroughputMonitor Trainer callback fixes
Pull Request -
State: closed - Opened by carmocca 11 months ago
- 2 comments
Labels: bug, ready, pl
#19026 - Reduce lightning data's dependencies
Pull Request -
State: closed - Opened by carmocca 11 months ago
- 1 comment
Labels: ready, ci, dependencies, data
#19025 - Add `@override` for files in `src/lightning/pytorch/plugins`
Pull Request -
State: closed - Opened by VictorPrins 11 months ago
- 1 comment
Labels: ready, code quality, community, pl
#19024 - Add support to `saving.py` for loading GPU-trained models on CPU-only machines
Pull Request -
State: open - Opened by amorehead 11 months ago
- 9 comments
Labels: pl
#19023 - Fix fsspec local file protocol checks for new fsspec version
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 2 comments
Labels: bug, ready, priority: 0, fabric, app, pl, dependencies, data
#19022 - Address test flakiness
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 2 comments
Labels: ready, fabric, tests, strategy: ddp, pl, fun
#19021 - docs: update chlog with `2.1.1` & `2.1.2`
Pull Request -
State: closed - Opened by Borda 11 months ago
- 1 comment
Labels: ready, fabric, app, pl, data
#19020 - Add `@override` for files in `src/lightning/pytorch/profilers`
Pull Request -
State: closed - Opened by VictorPrins 11 months ago
- 1 comment
Labels: ready, code quality, community, pl
#19019 - Remove the LightningDataset relying on un-maintained torchdata
Pull Request -
State: closed - Opened by tchaton 11 months ago
- 1 comment
Labels: ready, dependencies, data
#19018 - ci/docs: upload with dispatch
Pull Request -
State: closed - Opened by Borda 11 months ago
- 1 comment
Labels: ready, docs, ci, release
#19017 - Resolve Item Loader bugs
Pull Request -
State: closed - Opened by tchaton 11 months ago
- 1 comment
Labels: ready, data
#19016 - docs: enable retro dispatch build
Pull Request -
State: closed - Opened by Borda 11 months ago
- 1 comment
Labels: ready, docs, ci
#19015 - `Trainer.validate()` after `Trainer.fit()` not working with FSDP and `auto_wrap_policy`
Issue -
State: open - Opened by SpirinEgor 11 months ago
Labels: bug, strategy: fsdp, ver: 2.1.x
#19014 - docs: fix inter-sphinx to `numpy`
Pull Request -
State: closed - Opened by Borda 11 months ago
- 1 comment
Labels: bug, ready, docs, app
#19013 - ci/doc: enable build docs on spec version
Pull Request -
State: closed - Opened by Borda 11 months ago
- 1 comment
Labels: ready, docs, ci
#19012 - docs: pin some linked ipynb to 2.1.0
Pull Request -
State: closed - Opened by Borda 11 months ago
- 2 comments
Labels: ready, docs, pl
#19011 - Docs style renders unreadable yaml code snippets
Issue -
State: open - Opened by mauvilsa 11 months ago
- 1 comment
Labels: docs
#19009 - Fail to import lightning due to missing dependency on setuptools
Issue -
State: open - Opened by cshimmin 11 months ago
Labels: bug, needs triage, ver: 2.1.x
#19008 - ci: minor formatting update
Pull Request -
State: open - Opened by Borda 11 months ago
- 1 comment
Labels: ci
#19008 - ci: minor formatting update
Pull Request -
State: closed - Opened by Borda 11 months ago
- 1 comment
Labels: ready, ci
#19007 - Add `best_model_metrics` to `ModelCheckpoint` callback
Issue -
State: open - Opened by libokj 11 months ago
Labels: feature, needs triage
#19006 - Add `@override` for files in `src/lightning/pytorch/trainer/callbacks`
Pull Request -
State: closed - Opened by VictorPrins 11 months ago
- 4 comments
Labels: ready, code quality, community, pl
#19006 - Add `@override` for files in `src/lightning/pytorch/trainer/callbacks`
Pull Request -
State: open - Opened by VictorPrins 11 months ago
- 2 comments
Labels: pl
#19005 - Add `@override` for files in `src/lightning/pytorch/tuner`
Pull Request -
State: closed - Opened by VictorPrins 11 months ago
- 1 comment
Labels: ready, tuner, code quality, community, pl
#19005 - Add `@override` for files in `src/lightning/pytorch/tuner`
Pull Request -
State: open - Opened by VictorPrins 11 months ago
Labels: tuner, code quality, community, pl
#19003 - Downloading artifacts with wandblogger in DDP case failing on non-zero rank processes
Issue -
State: open - Opened by galbraun 11 months ago
- 2 comments
Labels: bug, needs triage, ver: 2.1.x
#19003 - Downloading artifacts with wandblogger in DDP case failing on non-zero rank processes
Issue -
State: open - Opened by galbraun 11 months ago
- 2 comments
Labels: bug, help wanted, logger: wandb, ver: 2.1.x
#19001 - Weekly minor patch release `2.1.2` [Rebase & merge]
Pull Request -
State: closed - Opened by Borda 11 months ago
- 2 comments
Labels: ready, docs, fabric, app, pl, dependencies, package
#19000 - work completed status
Pull Request -
State: closed - Opened by nohalon 11 months ago
- 1 comment
Labels: app
#18999 - ImportError: cannot import name 'Auth' from partially initialized module 'lightning_cloud.login'
Issue -
State: closed - Opened by weiji14 11 months ago
- 1 comment
Labels: bug, dependencies, ver: 2.1.x
#18997 - Add `@override` for files in `src/lightning/pytorch/trainer/connectors`
Pull Request -
State: closed - Opened by VictorPrins 11 months ago
- 1 comment
Labels: ready, code quality, community, pl
#18996 - Update auto_encoder.py to accomodate torchvision breaking change
Pull Request -
State: closed - Opened by tchaton 11 months ago
- 1 comment
Labels: example, ready, pl
#18995 - Option to save last checkpoint as copy instead of symlinking
Issue -
State: open - Opened by ad12 11 months ago
- 4 comments
Labels: feature, callback: model checkpoint
#18994 - Fix test interactions
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 1 comment
Labels: bug, ready, fabric, tests, pl, fun
#18993 - Fix `ModelCheckpoint.CHECKPOINT_NAME_LAST` test interaction
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 2 comments
Labels: bug, ready, tests, callback: model checkpoint, pl, fun
#18992 - Fix `trainer.save_checkpoint` after `trainer.test` with FSDP
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 2 comments
Labels: bug, ready, trainer: test, strategy: fsdp, pl, fun
#18991 - CUDA out Of Error while Loading From Checkpoint
Issue -
State: closed - Opened by chinge55 11 months ago
Labels: bug, needs triage, ver: 2.0.x
#18990 - Bump axios from 0.26.1 to 1.6.0 in /src/lightning/app/cli/react-ui-template/ui
Pull Request -
State: open - Opened by dependabot[bot] 11 months ago
Labels: app, javascript
#18989 - Add multiple uploaders to the map, optimize
Pull Request -
State: closed - Opened by tchaton 11 months ago
- 1 comment
Labels: ready
#18988 - Avoid importing from `lightning.app` in `lightning.data`
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 1 comment
Labels: bug, ready
#18987 - Improve handling the positional encoding in Transformer example
Pull Request -
State: closed - Opened by awaelchli 11 months ago
- 2 comments
Labels: example, ready, pl
#18986 - Fix typo
Pull Request -
State: closed - Opened by AMHermansen 11 months ago
- 1 comment
Labels: ready, community, pl
#18985 - BatchSizeFinder throws KeyError: 'limit_eval_batches'
Issue -
State: open - Opened by drusmanbashir 11 months ago
- 1 comment
Labels: bug, help wanted, tuner, ver: 2.1.x, repro needed
#18984 - cast to float32 or float64 tensor when passing scalar to self.log
Issue -
State: closed - Opened by MF-FOOM 11 months ago
- 2 comments
Labels: bug, logging, ver: 2.1.x
#18983 - Fix(app): Reduce HTTP Queue Get Request Rates
Pull Request -
State: closed - Opened by rlizzo 11 months ago
- 2 comments
Labels: app
#18982 - registered buffers' dtype is overridden after __init__
Issue -
State: open - Opened by MF-FOOM 11 months ago
- 1 comment
Labels: bug, fabric, pl, ver: 2.1.x, precision: half