Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / Lightning-AI/lightning issues and pull requests

#19082 - TransformerEngine fallback compute dtype

Pull Request - State: open - Opened by carmocca 10 months ago - 2 comments
Labels: docs, fabric, pl, precision: te

#19080 - Fix `item_per_sec` metric in ThroughputMonitor

Pull Request - State: closed - Opened by awaelchli 10 months ago - 2 comments
Labels: bug, ready, fabric, performance, pl

#19079 - Fix more flaky tests

Pull Request - State: open - Opened by awaelchli 10 months ago
Labels: fabric

#19078 - DataProcess Refactor: Global queue 1/n

Pull Request - State: open - Opened by tchaton 10 months ago
Labels: data

#19077 - Add Weights & Biases (W&B) Fabric Logger

Pull Request - State: open - Opened by ash0ts 10 months ago - 1 comment
Labels: docs, fabric, pl

#19076 - Cannot open docs & forum on official site from Russia

Issue - State: open - Opened by IrinaArmstrong 10 months ago - 2 comments
Labels: bug, docs, ver: 2.2.x

#19075 - Feature/17638 support deepspeed stage 1 offload

Pull Request - State: open - Opened by nik777 10 months ago - 1 comment
Labels: fabric, pl

#19074 - Fix comm initialization in `MPIEnvironment`

Pull Request - State: closed - Opened by awaelchli 10 months ago - 2 comments
Labels: bug, ready, ci, fabric, pl, fun, environment: mpi

#19073 - About LR scheduler

Issue - State: closed - Opened by morestart 10 months ago - 6 comments
Labels: repro needed

#19072 - ModelCheckpoint does not work properly when monitored metric is only logged on rank 0

Issue - State: closed - Opened by nicolasch96 10 months ago - 4 comments
Labels: bug, repro needed, ver: 2.2.x

#19071 - [NEPTUNE] Optimize uploading k-best model checkpoints

Pull Request - State: open - Opened by AleksanderWWW 10 months ago
Labels: pl

#19070 - MetricTracker that also logs the maximum/minimum values

Issue - State: closed - Opened by crazyboy9103 10 months ago - 2 comments
Labels: feature, callback

#19069 - Bump AButler/upload-release-assets from 2.0 to 3.0

Pull Request - State: closed - Opened by dependabot[bot] 10 months ago
Labels: ready, ci

#19068 - Add `@override` for files in `src/lightning/fabric/accelerators`

Pull Request - State: closed - Opened by VictorPrins 10 months ago - 1 comment
Labels: ready, fabric, code quality, community

#19067 - Clarify setup of optimizer when using `empty_init=True`

Pull Request - State: closed - Opened by awaelchli 10 months ago - 1 comment
Labels: ready, docs, fabric, strategy: fsdp, fun

#19066 - Clarify requirements for `Trainer.fit(ckpt_path="last")`

Pull Request - State: closed - Opened by awaelchli 10 months ago - 2 comments
Labels: ready, docs, trainer, pl, fun

#19065 - Add `@override` for remaining files in `src/lightning/pytorch`

Pull Request - State: closed - Opened by VictorPrins 10 months ago - 2 comments
Labels: ready, code quality, community, pl

#19064 - Fix ModelCheckpoint alternating between versioned and unversioned file

Pull Request - State: closed - Opened by awaelchli 10 months ago - 2 comments
Labels: bug, ready, callback: model checkpoint, pl, fun

#19063 - Test TPU

Pull Request - State: closed - Opened by sakul1234 10 months ago
Labels: pl

#19062 - MPIEnvironment fails for MPI multi-node training at comm gather step of worker nodes

Issue - State: closed - Opened by sohrabi1 10 months ago - 1 comment
Labels: bug, help wanted, ver: 2.1.x, environment: mpi

#19061 - Delay `Precision.convert_module` until `configure_model` has run [TPU]

Pull Request - State: open - Opened by carmocca 10 months ago - 2 comments
Labels: bug, fabric, pl, precision: bnb, precision: te

#19060 - Reorder `configure_model`

Pull Request - State: open - Opened by carmocca 10 months ago - 2 comments
Labels: docs, breaking change, pl

#19059 - Expand paths that start with "~" in `Trainer.default_root_dir` and other places

Issue - State: open - Opened by awaelchli 10 months ago - 2 comments
Labels: feature, help wanted, good first issue

#19058 - Fix ModelCheckpoint dirpath expanding home prefix

Pull Request - State: closed - Opened by awaelchli 10 months ago - 3 comments
Labels: bug, ready, callback: model checkpoint, pl, fun

#19057 - Remove outdated "optimized installation" section from docs

Pull Request - State: closed - Opened by awaelchli 10 months ago - 1 comment
Labels: ready, docs, pl, fun

#19056 - Clarify `self.log(..., rank_zero_only=True|False)`

Pull Request - State: closed - Opened by awaelchli 10 months ago - 2 comments
Labels: bug, ready, docs, callback: model checkpoint, pl, fun

#19055 - Fix the "our" word duplication in the docs

Pull Request - State: open - Opened by Jamim 10 months ago
Labels: ready, docs, fabric, community, pl

#19054 - Make `ModelCheckpoint._format_checkpoint_name` an instance method

Pull Request - State: closed - Opened by awaelchli 10 months ago
Labels: bug, ready, callback: model checkpoint, pl, fun

#19053 - Fix docs for 'nn.Module from checkpoint'

Pull Request - State: closed - Opened by awaelchli 10 months ago - 1 comment
Labels: ready, docs, pl, fun

#19052 - Add fault tolerance Streaming Dataset 2/n

Pull Request - State: closed - Opened by tchaton 10 months ago - 1 comment
Labels: ready, data

#19051 - Move `ignore[override]` annotation to method signature

Pull Request - State: closed - Opened by VictorPrins 10 months ago - 4 comments
Labels: code quality, community, pl

#19050 - Add numpy support for the StreamingDataset 1/2

Pull Request - State: closed - Opened by tchaton 10 months ago - 1 comment
Labels: ready, data

#19049 - Add fault tolerance for the StreamingDataset 1/n

Pull Request - State: closed - Opened by tchaton 10 months ago - 2 comments
Labels: ready, data

#19048 - TypeError: __init__() got an unexpected keyword argument 'drop_last'

Issue - State: closed - Opened by liziru 10 months ago - 3 comments
Labels: bug, waiting on author, data handling, ver: 2.1.x

#19047 - Loading nn.Module from checkpoint doc fix

Issue - State: closed - Opened by montehoover 10 months ago - 2 comments
Labels: docs

#19046 - Cast to >=float32 tensor when passing scalar to self.log

Pull Request - State: closed - Opened by MF-FOOM 10 months ago - 1 comment
Labels: bug, ready, logging, community, pl

#19045 - Training hangs with DDP + ModelCheckpoint Callback

Issue - State: closed - Opened by MattMcPartlon 10 months ago - 8 comments
Labels: bug, distributed, callback: model checkpoint, ver: 2.1.x

#19044 - Add direct s3 support to the streaming dataset

Pull Request - State: closed - Opened by tchaton 10 months ago - 1 comment
Labels: ready, data

#19042 - Train diffusion model with fabric

Issue - State: closed - Opened by caiqi 10 months ago - 6 comments
Labels: question, strategy: deepspeed, fabric, ver: 2.1.x

#19041 - Add disk usage check before downloading files

Pull Request - State: closed - Opened by tchaton 10 months ago - 1 comment
Labels: ready, ci, data

#19040 - PyTest random order for Fabric tests

Pull Request - State: closed - Opened by awaelchli 10 months ago - 2 comments
Labels: ready, ci, fabric, tests, fun

#19039 - Filter test names in `run_standalone_tests.sh` when checking for errors

Issue - State: open - Opened by awaelchli 10 months ago - 1 comment
Labels: feature, help wanted, tests

#19038 - Re-enable dynamo tests that were fixed in PyTorch 2.1

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: ready, fabric, tests, pl, fun

#19037 - Call `find_free_network_port` only if necessary

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: ready, refactor, fabric, environment: lightning, fun

#19036 - Call `configure_model()` in `LM.load_from_checkpoint()`

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: feature, ready, strategy: fsdp, pl, fun

#19035 - Loggers fails to create metrics.csv file when running on multiple TPU cores

Issue - State: open - Opened by javiergaitan 11 months ago - 1 comment
Labels: bug, help wanted, strategy: xla, ver: 2.2.x

#19034 - Remove unnecessary torch.cuda.manual_seed_all()

Pull Request - State: closed - Opened by JalinWang 11 months ago - 1 comment
Labels: ready, refactor, fabric, community

#19033 - Remove unnecessary `torch.cuda.manual_seed_all()` in `seed_everything()`

Issue - State: closed - Opened by JalinWang 11 months ago - 2 comments
Labels: help wanted, refactor

#19032 - Lightning CLI optimizer from command line for subset of model parameters

Issue - State: closed - Opened by nilsleh 11 months ago - 3 comments
Labels: feature, lightningcli

#19031 - Bump Lightning-AI/utilities from 0.9.0 to 0.10.0

Pull Request - State: closed - Opened by dependabot[bot] 11 months ago
Labels: ready, ci

#19030 - Fix `rank_zero_only` rank not set in ddp-spawn based strategies

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: bug, ready, priority: 0, fabric, strategy: ddp, pl, fun

#19029 - The training process is fast, but it becomes particularly slow during validation.

Issue - State: closed - Opened by shiyao1999 11 months ago - 5 comments
Labels: bug, strategy: ddp, performance, ver: 1.8.x, repro needed

#19027 - ThroughputMonitor Trainer callback fixes

Pull Request - State: closed - Opened by carmocca 11 months ago - 2 comments
Labels: bug, ready, pl

#19026 - Reduce lightning data's dependencies

Pull Request - State: closed - Opened by carmocca 11 months ago - 1 comment
Labels: ready, ci, dependencies, data

#19025 - Add `@override` for files in `src/lightning/pytorch/plugins`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 1 comment
Labels: ready, code quality, community, pl

#19024 - Add support to `saving.py` for loading GPU-trained models on CPU-only machines

Pull Request - State: open - Opened by amorehead 11 months ago - 9 comments
Labels: pl

#19023 - Fix fsspec local file protocol checks for new fsspec version

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: bug, ready, priority: 0, fabric, app, pl, dependencies, data

#19022 - Address test flakiness

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: ready, fabric, tests, strategy: ddp, pl, fun

#19021 - docs: update chlog with `2.1.1` & `2.1.2`

Pull Request - State: closed - Opened by Borda 11 months ago - 1 comment
Labels: ready, fabric, app, pl, data

#19020 - Add `@override` for files in `src/lightning/pytorch/profilers`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 1 comment
Labels: ready, code quality, community, pl

#19019 - Remove the LightningDataset relying on un-maintained torchdata

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready, dependencies, data

#19018 - ci/docs: upload with dispatch

Pull Request - State: closed - Opened by Borda 11 months ago - 1 comment
Labels: ready, docs, ci, release

#19017 - Resolve Item Loader bugs

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready, data

#19016 - docs: enable retro dispatch build

Pull Request - State: closed - Opened by Borda 11 months ago - 1 comment
Labels: ready, docs, ci

#19015 - `Trainer.validate()` after `Trainer.fit()` not working with FSDP and `auto_wrap_policy`

Issue - State: open - Opened by SpirinEgor 11 months ago
Labels: bug, strategy: fsdp, ver: 2.1.x

#19014 - docs: fix inter-sphinx to `numpy`

Pull Request - State: closed - Opened by Borda 11 months ago - 1 comment
Labels: bug, ready, docs, app

#19013 - ci/doc: enable build docs on spec version

Pull Request - State: closed - Opened by Borda 11 months ago - 1 comment
Labels: ready, docs, ci

#19012 - docs: pin some linked ipynb to 2.1.0

Pull Request - State: closed - Opened by Borda 11 months ago - 2 comments
Labels: ready, docs, pl

#19011 - Docs style renders unreadable yaml code snippets

Issue - State: open - Opened by mauvilsa 11 months ago - 1 comment
Labels: docs

#19009 - Fail to import lightning due to missing dependency on setuptools

Issue - State: open - Opened by cshimmin 11 months ago
Labels: bug, needs triage, ver: 2.1.x

#19008 - ci: minor formatting update

Pull Request - State: open - Opened by Borda 11 months ago - 1 comment
Labels: ci

#19008 - ci: minor formatting update

Pull Request - State: closed - Opened by Borda 11 months ago - 1 comment
Labels: ready, ci

#19007 - Add `best_model_metrics` to `ModelCheckpoint` callback

Issue - State: open - Opened by libokj 11 months ago
Labels: feature, needs triage

#19006 - Add `@override` for files in `src/lightning/pytorch/trainer/callbacks`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 4 comments
Labels: ready, code quality, community, pl

#19006 - Add `@override` for files in `src/lightning/pytorch/trainer/callbacks`

Pull Request - State: open - Opened by VictorPrins 11 months ago - 2 comments
Labels: pl

#19005 - Add `@override` for files in `src/lightning/pytorch/tuner`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 1 comment
Labels: ready, tuner, code quality, community, pl

#19005 - Add `@override` for files in `src/lightning/pytorch/tuner`

Pull Request - State: open - Opened by VictorPrins 11 months ago
Labels: tuner, code quality, community, pl

#19003 - Downloading artifacts with wandblogger in DDP case failing on non-zero rank processes

Issue - State: open - Opened by galbraun 11 months ago - 2 comments
Labels: bug, needs triage, ver: 2.1.x

#19003 - Downloading artifacts with wandblogger in DDP case failing on non-zero rank processes

Issue - State: open - Opened by galbraun 11 months ago - 2 comments
Labels: bug, help wanted, logger: wandb, ver: 2.1.x

#19001 - Weekly minor patch release `2.1.2` [Rebase & merge]

Pull Request - State: closed - Opened by Borda 11 months ago - 2 comments
Labels: ready, docs, fabric, app, pl, dependencies, package

#19000 - work completed status

Pull Request - State: closed - Opened by nohalon 11 months ago - 1 comment
Labels: app

#18999 - ImportError: cannot import name 'Auth' from partially initialized module 'lightning_cloud.login'

Issue - State: closed - Opened by weiji14 11 months ago - 1 comment
Labels: bug, dependencies, ver: 2.1.x

#18997 - Add `@override` for files in `src/lightning/pytorch/trainer/connectors`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 1 comment
Labels: ready, code quality, community, pl

#18996 - Update auto_encoder.py to accomodate torchvision breaking change

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: example, ready, pl

#18995 - Option to save last checkpoint as copy instead of symlinking

Issue - State: open - Opened by ad12 11 months ago - 4 comments
Labels: feature, callback: model checkpoint

#18994 - Fix test interactions

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: bug, ready, fabric, tests, pl, fun

#18993 - Fix `ModelCheckpoint.CHECKPOINT_NAME_LAST` test interaction

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: bug, ready, tests, callback: model checkpoint, pl, fun

#18992 - Fix `trainer.save_checkpoint` after `trainer.test` with FSDP

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: bug, ready, trainer: test, strategy: fsdp, pl, fun

#18991 - CUDA out Of Error while Loading From Checkpoint

Issue - State: closed - Opened by chinge55 11 months ago
Labels: bug, needs triage, ver: 2.0.x

#18990 - Bump axios from 0.26.1 to 1.6.0 in /src/lightning/app/cli/react-ui-template/ui

Pull Request - State: open - Opened by dependabot[bot] 11 months ago
Labels: app, javascript

#18989 - Add multiple uploaders to the map, optimize

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18988 - Avoid importing from `lightning.app` in `lightning.data`

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: bug, ready

#18987 - Improve handling the positional encoding in Transformer example

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: example, ready, pl

#18986 - Fix typo

Pull Request - State: closed - Opened by AMHermansen 11 months ago - 1 comment
Labels: ready, community, pl

#18985 - BatchSizeFinder throws KeyError: 'limit_eval_batches'

Issue - State: open - Opened by drusmanbashir 11 months ago - 1 comment
Labels: bug, help wanted, tuner, ver: 2.1.x, repro needed

#18984 - cast to float32 or float64 tensor when passing scalar to self.log

Issue - State: closed - Opened by MF-FOOM 11 months ago - 2 comments
Labels: bug, logging, ver: 2.1.x

#18983 - Fix(app): Reduce HTTP Queue Get Request Rates

Pull Request - State: closed - Opened by rlizzo 11 months ago - 2 comments
Labels: app

#18982 - registered buffers' dtype is overridden after __init__

Issue - State: open - Opened by MF-FOOM 11 months ago - 1 comment
Labels: bug, fabric, pl, ver: 2.1.x, precision: half