Ecosyste.ms: Issues
An open API service that provides issue and pull request metadata for open source projects.
GitHub / Lightning-AI/pytorch-lightning issues and pull requests
#20191 - Fix: Make `WandbLogger` upload models from all `ModelCheckpoint` callbacks, not just one
Pull Request -
State: open - Opened by cgebbe 3 months ago
- 1 comment
Labels: pl
#20190 - shortcuts for logging weights and biases norms
Issue -
State: open - Opened by heth27 3 months ago
Labels: feature, needs triage
#20189 - Support IO Type Checkpoints for trainer.fit() in ckpt_path Parameter
Issue -
State: open - Opened by kimjw0623 3 months ago
Labels: feature, needs triage
#20188 - Seeding and multi-GPU training
Issue -
State: open - Opened by tomsons22 3 months ago
- 1 comment
Labels: docs, needs triage
#20187 - OnExceptionCheckpoint callback suppresses exceptions and results in NCCL timeout
Issue -
State: open - Opened by jackdent 3 months ago
Labels: bug, needs triage, ver: 2.4.x
#20186 - make plugin type check more flexible
Pull Request -
State: open - Opened by jedyang97 3 months ago
- 1 comment
Labels: pl
#20185 - Checkpoint callback run before validation step - stale or none monitor values considered for validation metrics
Issue -
State: open - Opened by PheelaV 3 months ago
- 2 comments
Labels: bug, needs triage, ver: 2.4.x, ver: 2.3.x
#20184 - MLFlowLogger does not save config.yaml for each run
Issue -
State: open - Opened by jeangud 3 months ago
Labels: bug, needs triage, ver: 2.4.x
#20183 - Add device property to lazy load functionality
Pull Request -
State: closed - Opened by t-vi 3 months ago
- 2 comments
Labels: ready, fabric
#20182 - `Error while merging hparams` when using LightningCLI and YAML
Issue -
State: open - Opened by cgebbe 3 months ago
- 5 comments
Labels: bug, needs triage, ver: 2.4.x
#20181 - trainer test and validate have issues with autograd
Issue -
State: open - Opened by bpfrd 3 months ago
Labels: bug, needs triage, ver: 2.4.x
#20179 - trainer.validate() get different result from trainer.fit
Issue -
State: open - Opened by matrix72c 3 months ago
- 1 comment
Labels: bug, needs triage, ver: 2.2.x
#20177 - Trainer does not switch to train mode after validation step
Issue -
State: open - Opened by ClemensSchwarke 3 months ago
- 2 comments
Labels: bug, needs triage, ver: 2.4.x, ver: 2.3.x
#20176 - Add `step` parameter to `TensorBoardLogger.log_hyperparams`
Pull Request -
State: open - Opened by ringohoffman 3 months ago
- 2 comments
Labels: fabric, pl
#20175 - docs: fixed the `init_module` and deepspeed
Pull Request -
State: open - Opened by alyakin314 3 months ago
- 1 comment
Labels: docs, fabric
#20173 - loss spikes in validation step when the model has multiple losses applied
Issue -
State: open - Opened by RainRoboforce 3 months ago
- 1 comment
Labels: question
#20172 - Re-enable passing BytesIO as path in `.to_onnx()`
Pull Request -
State: closed - Opened by GdoongMathew 3 months ago
- 2 comments
Labels: bug, community, pl
#20171 - Inconsistent input io type between `to_onnx` and `torch.onnx.export`.
Issue -
State: closed - Opened by GdoongMathew 3 months ago
Labels: bug, ver: 2.3.x
#20170 - fix(docs): remove dead link from readme
Pull Request -
State: closed - Opened by Borda 3 months ago
- 1 comment
#20169 - fix(ci): resolve input str -> num conversion
Pull Request -
State: closed - Opened by Borda 3 months ago
- 1 comment
Labels: ready, ci
#20168 - ci/docs: disable optional cache pkg
Pull Request -
State: closed - Opened by Borda 3 months ago
- 1 comment
Labels: docs, ci
#20167 - ci: fix cleaning caches
Pull Request -
State: closed - Opened by Borda 3 months ago
- 1 comment
Labels: bug, ci
#20166 - False positive iterable dataset warning for LitData StreamingDataset
Issue -
State: open - Opened by awaelchli 3 months ago
Labels: bug, data handling
#20165 - Remove the `optimizer_to_device` logic if possible
Issue -
State: open - Opened by awaelchli 3 months ago
- 3 comments
Labels: refactor, checkpointing, performance
#20164 - docs: fix typo in `linkcheck_ignore`
Pull Request -
State: closed - Opened by Borda 3 months ago
- 1 comment
Labels: docs, pl
#20163 - Fix parameter count in ModelSummary when parameters are DTensors
Pull Request -
State: closed - Opened by awaelchli 3 months ago
- 2 comments
Labels: bug, fabric, callback: model summary, strategy: fsdp, pl, fun
#20162 - Add email callback on train complete
Pull Request -
State: open - Opened by loucaspapalazarou 3 months ago
- 1 comment
Labels: pl
#20161 - Add diffusion example to README
Pull Request -
State: closed - Opened by awaelchli 3 months ago
- 1 comment
#20160 - Bug: automatic logging doesn't log metric on steps if .update is used
Issue -
State: open - Opened by EtayLivne 4 months ago
Labels: bug, needs triage, ver: 2.2.x
#20159 - Count number of modules in train/eval mode in ModelSummary
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: feature, docs, callback: model summary, pl, fun
#20158 - Remove outdated `process_position` reference in progress bar docs.
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: docs, progress bar: tqdm, pl, fun
#20157 - docs: update ref to latest tutorials
Pull Request -
State: closed - Opened by pl-ghost 4 months ago
- 1 comment
Labels: docs, examples
#20156 - Avoid deprecated distutils for docs build
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: docs, ci, fun
#20155 - Update type check workflow to PyTorch 2.4
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: docs, ci, fabric, code quality, pl, fun, dependencies
#20154 - Prepare Lightning 2.4.0 release
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 1 comment
Labels: docs, ci, release, fabric, pl, fun, package
#20153 - Confusing recommendation to use sync_dist=True even with TorchMetrics
Issue -
State: open - Opened by srprca 4 months ago
- 9 comments
Labels: bug, help wanted, logging, ver: 2.2.x
#20152 - Typing for `_restricted_classmethod` (e.g. for `LightningModule.load_from_checkpoint`) has stopped working for mypy 1.11
Issue -
State: closed - Opened by maciejzj 4 months ago
- 1 comment
Labels: bug, help wanted, code quality, ver: 2.2.x
#20151 - Support computing parameter count in ModelSummary for FSDP models
Issue -
State: closed - Opened by awaelchli 4 months ago
Labels: feature, callback: model summary, strategy: fsdp
#20150 - docs: adding link to obj detect. studio
Pull Request -
State: closed - Opened by Borda 4 months ago
- 1 comment
Labels: ready, docs
#20149 - How to use Webdataset in DDP setting? ValueError: you need to add an explicit nodesplitter to your input pipeline for multi-node training
Issue -
State: open - Opened by cgebbe 4 months ago
Labels: help wanted, docs, ver: 2.2.x
#20148 - Loading `train_dataloader` before estimating `max_batches`
Pull Request -
State: open - Opened by shihchengli 4 months ago
- 1 comment
Labels: pl
#20147 - `link_arguments` does not work in lightning 2.3
Issue -
State: open - Opened by peacekurella 4 months ago
- 7 comments
Labels: bug, lightningcli, ver: 2.2.x
#20146 - Docs: Add note about version counter in `ModelCheckpoint`
Pull Request -
State: closed - Opened by adosar 4 months ago
- 1 comment
Labels: ready, docs, callback: model checkpoint, community, pl
#20145 - mps and manual_seed_all
Issue -
State: closed - Opened by Tonys21 4 months ago
- 2 comments
Labels: question
#20144 - docs: adding link to img classif. studio
Pull Request -
State: closed - Opened by Borda 4 months ago
- 1 comment
Labels: ready, docs
#20143 - Add simple LSTM example to demo folder
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: example, pl
#20142 - Add LLM finetuning Studio example to README.md
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 1 comment
Labels: ready, docs
#20141 - ModelCheckpoint reduce logic seems wrong
Issue -
State: closed - Opened by manbango 4 months ago
- 6 comments
Labels: question, logging, callback: model checkpoint
#20140 - StreamingDataset not working in multi-gpu environement
Issue -
State: open - Opened by davidpicard 4 months ago
- 3 comments
Labels: bug, repro needed
#20138 - FSDP Fails with floating nn.Parameter
Issue -
State: open - Opened by schopra8 4 months ago
- 6 comments
Labels: bug, duplicate, strategy: fsdp, ver: 2.2.x
#20137 - Support restoring callbacks' status when predicting
Issue -
State: closed - Opened by zihaozou 4 months ago
- 1 comment
Labels: feature
#20133 - Email Callback on training done
Issue -
State: open - Opened by loucaspapalazarou 4 months ago
- 6 comments
Labels: feature, discussion
#20130 - Documentation for filename convention of save_top_k in ModelCheckpoint
Issue -
State: closed - Opened by adosar 4 months ago
- 6 comments
Labels: docs
#20128 - training=False when use a pretrained model like BERT
Issue -
State: closed - Opened by huangfu170 4 months ago
- 3 comments
Labels: bug, docs
#20126 - Switch to PyTorch 2.4 stable testing
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: ci, fun, dockers
#20125 - Add `ddp_find_unused_parameters_true` alias in Fabric's DDPStrategy
Pull Request -
State: closed - Opened by 01AbhiSingh 4 months ago
- 4 comments
Labels: bug, fabric, community
#20121 - Fix attribute error on `_NotYetLoadedTensor` after loading checkpoint into quantized model with `_lazy_load()`
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: bug, fabric, precision: bnb
#20111 - docs: update ref to latest tutorials
Pull Request -
State: open - Opened by pl-ghost 4 months ago
- 1 comment
Labels: docs, examples
#20110 - CSV Logger acts weirdly in Callbacks
Issue -
State: open - Opened by oabuhamdan 4 months ago
Labels: bug, needs triage, ver: 2.2.x
#20109 - Remove confusing warning "Missing logger folder"
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: fabric, pl, fun
#20108 - Avoid printing the seed info message multiple times
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: bug, fabric, pl, fun
#20107 - TypeError: on_train_batch_start() takes 3 positional arguments but 4 were given
Issue -
State: closed - Opened by cxhagd 4 months ago
- 3 comments
Labels: question
#20106 - OptimizerLRScheduler typing does not fit examples
Issue -
State: closed - Opened by MalteEbner 4 months ago
- 4 comments
Labels: bug, help wanted, example, ver: 2.2.x
#20105 - What happens during training with HuggingFace models in eval mode?
Issue -
State: closed - Opened by StevenSong 4 months ago
- 2 comments
Labels: bug
#20104 - Get `num_nodes` automatically
Issue -
State: closed - Opened by BakerBunker 4 months ago
- 2 comments
Labels: duplicate, feature, strategy: ddp
#20103 - LightningCLI doesn't save optimizer's configuration if not explicitly given
Issue -
State: closed - Opened by adosar 4 months ago
- 7 comments
Labels: question, lightningcli
#20102 - Remove outdated warnings filter for `reduce_op`
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: code quality, pl
#20101 - pl.TrainResult not found in 2.3.3
Issue -
State: closed - Opened by manavkulshrestha 4 months ago
- 1 comment
Labels: question
#20100 - Pytorch FSDPStrategy saving checkpoint behavior work correctly?
Issue -
State: open - Opened by nbqu 4 months ago
Labels: bug, needs triage
#20099 - Fixed positional encoding not used in Demo Transformer
Pull Request -
State: closed - Opened by K-H-Ismail 4 months ago
- 1 comment
Labels: bug, example, community, pl
#20096 - Adding support for Python 12?
Issue -
State: closed - Opened by mohammedsalah-ai 4 months ago
- 2 comments
Labels: feature
#20095 - Sometimes error when logging model graph with `functional.interpolate` and `deterministic=True`
Issue -
State: open - Opened by pandegaabyan 4 months ago
Labels: bug, needs triage, ver: 2.2.x
#20094 - Please allow automatic optimization for multiple optimizers again.
Issue -
State: open - Opened by profPlum 4 months ago
- 2 comments
Labels: feature, discussion
#20093 - wandblogger : File handles cannot be properly released
Issue -
State: closed - Opened by zhf321 4 months ago
- 1 comment
Labels: repro needed
#20092 - dirpath isn't updated when logger chages dir after first run
Issue -
State: open - Opened by ScarWar 4 months ago
- 2 comments
Labels: bug, ver: 2.2.x
#20091 - Add example to README
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 1 comment
#20090 - Remove numpy from base requirements
Pull Request -
State: closed - Opened by 01AbhiSingh 4 months ago
- 3 comments
Labels: ci, fabric, community, pl, dependencies
#20089 - Checkpoint silently not correctly restored.
Issue -
State: closed - Opened by phuntast1c 4 months ago
- 3 comments
Labels: bug, ver: 2.0.x, repro needed
#20088 - Sometimes I get Dataset Errors when using the lightning module in a distributed manor
Issue -
State: open - Opened by asusdisciple 4 months ago
Labels: bug, needs triage
#20087 - Improve error message when object is passed to Trainer callbacks
Issue -
State: closed - Opened by huangfu170 4 months ago
- 2 comments
Labels: bug, help wanted, good first issue
#20086 - module statistics has no attribute mean
Issue -
State: closed - Opened by FabianKuon 4 months ago
- 3 comments
Labels: question, ver: 2.2.x
#20084 - build(deps): bump Lightning-AI/utilities from 0.11.3 to 0.11.4
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
Labels: ready, ci
#20083 - Make numpy an optional dependency
Pull Request -
State: closed - Opened by 01AbhiSingh 4 months ago
- 2 comments
Labels: has conflicts, fabric
#20082 - Fix: Use `dirpath` to resolve checkpoint path only when passed
Pull Request -
State: closed - Opened by ScarWar 4 months ago
- 1 comment
Labels: pl
#20081 - Remove deprecated `pkg_resources`
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: ci, fabric, pl, fun, dependencies, package
#20080 - Made numpy optional dependency in `apply_func.py` and `logger.py`
Pull Request -
State: closed - Opened by 01AbhiSingh 4 months ago
- 4 comments
Labels: refactor, fabric, community
#20079 - Update PyTorch 2.4 tests
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: fabric, pl, fun
#20078 - Add Python 3.12 to the CPU test matrix
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: ci, fabric, tests, pl, fun, dependencies
#20077 - `pkg_resources` Deprecation Warnings on import
Issue -
State: closed - Opened by LucaBonfiglioli 4 months ago
- 2 comments
Labels: bug, duplicate, help wanted, package, ver: 2.2.x
#20076 - training time increase epoch by epoch
Issue -
State: open - Opened by Eric-Lin-CVTE 4 months ago
- 2 comments
Labels: bug, help wanted, performance, repro needed, ver: 2.2.x
#20075 - ModelCheckpoint save ckpts at the end of every epoch even in step-saving strategy
Issue -
State: open - Opened by leonardodalinky 4 months ago
Labels: bug, needs triage, ver: 2.2.x
#20074 - Cannot pass `schedule` for `PyTorchProfiler` using `LightningCLI`
Issue -
State: open - Opened by tensorcopy 4 months ago
- 6 comments
Labels: bug, lightningcli
#20072 - Drop testing standalone package in GPU CI
Pull Request -
State: closed - Opened by awaelchli 4 months ago
Labels: ci
#20071 - Remove support for Python 3.8
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 2 comments
Labels: ci, fabric, tests, pl, fun, package
#20070 - Using Stochastic Weight Averaging (SWA) and LearningRateFinder simultaneously can cause issues:
Issue -
State: open - Opened by liuzeyu6 4 months ago
Labels: bug, help wanted, callback: swa, ver: 2.2.x
#20069 - Installing lightning 2.3.3 also installs numpy<3
Issue -
State: closed - Opened by wsascha 4 months ago
- 6 comments
Labels: bug, ver: 2.2.x
#20068 - Fix LightningCLI saving hyperparameters breaking change
Pull Request -
State: closed - Opened by mauvilsa 4 months ago
- 4 comments
Labels: bug, lightningcli, pl
#20067 - PowerSGD to FSDP Strategy
Issue -
State: closed - Opened by anandxpeng 4 months ago
Labels: feature, needs triage
#20066 - Add reference to the `torch.compile` manual
Pull Request -
State: closed - Opened by awaelchli 4 months ago
- 1 comment
Labels: ready, docs, fabric, pl
#20065 - enable loading `universal checkpointing` checkpoint in `DeepSpeedStrategy`
Issue -
State: open - Opened by zhoubay 4 months ago
- 1 comment
Labels: feature, help wanted, strategy: deepspeed