Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / Lightning-AI/pytorch-lightning issues and pull requests
#15073 - [App] Support app checkpointing
Pull Request -
State: closed - Opened by manskx about 2 years ago
- 9 comments
Labels: feature, checkpointing, has conflicts, app
#14674 - `auto_lr_find` does not work if there is a BackboneFinetuning callback
Issue -
State: open - Opened by ejm714 about 2 years ago
- 2 comments
Labels: bug, help wanted, tuner, callback: finetuning
#14645 - Unable to run works in the `LightningList` structure on cloud
Issue -
State: closed - Opened by krshrimali about 2 years ago
- 1 comment
Labels: bug, app
#14579 - `TrainingEpochLoop._should_check_val_fx` discrepancy between continued run <> restore from ckpt
Issue -
State: closed - Opened by Anner-deJong about 2 years ago
- 4 comments
Labels: bug, help wanted, checkpointing, loops
#14559 - Version mismatches between package, CITATION file, and Zenodo
Issue -
State: open - Opened by timothygebhard about 2 years ago
- 4 comments
Labels: ci, priority: 2, admin, pl
#14545 - External IP address does not bind to streamlit functions.
Issue -
State: closed - Opened by tmquan about 2 years ago
- 1 comment
Labels: priority: 1, app
#14523 - Can port addresses be changed when launching Lightning App?
Issue -
State: closed - Opened by Felonious-Spellfire about 2 years ago
Labels: docs, app
#14522 - Expose the external IP instead of using 127.0.0.1/view/, when developing Apps locally
Issue -
State: closed - Opened by Felonious-Spellfire about 2 years ago
Labels: docs, app
#14520 - Add instructions for customizing Works using cloud compute in more code examples
Issue -
State: closed - Opened by Felonious-Spellfire about 2 years ago
Labels: docs, app
#14344 - make App pytest independent
Issue -
State: closed - Opened by Borda about 2 years ago
- 4 comments
Labels: priority: 1, tests, app
#14284 - [WIP] uneven input support for DDP
Pull Request -
State: closed - Opened by otaj about 2 years ago
- 6 comments
Labels: pl
#14188 - Introduce `Logger.experiment_dir`
Issue -
State: open - Opened by awaelchli over 2 years ago
- 17 comments
Labels: feature, design, logger
#14167 - Learning Rate finder too strong loss smoothing
Issue -
State: open - Opened by hcgasser over 2 years ago
- 5 comments
Labels: feature, tuner
#14078 - RFC: Remove `num_nodes` Trainer argument and infer world size from cluster environment directly
Issue -
State: open - Opened by awaelchli over 2 years ago
- 8 comments
Labels: deprecation, strategy: ddp, environment, trainer: argument
#14036 - Cannot use `torch.jit.trace` to trace `LightningModule` in Lightning v1.7
Issue -
State: closed - Opened by J-shang over 2 years ago
- 17 comments
Labels: bug, lightningmodule
#13944 - LightningFlow state increases indefinitely
Issue -
State: closed - Opened by belerico over 2 years ago
- 5 comments
Labels: app
#13931 - Switch from click to google Fire
Issue -
State: closed - Opened by nicolai86 over 2 years ago
Labels: discussion, app
#13902 - BackboneFinetuning - even with train_bn batch normalization is still learning.
Issue -
State: closed - Opened by perliczka1 over 2 years ago
- 6 comments
Labels: bug, callback: finetuning
#13891 - 404 page not found error for all "Web UIs * " pages in the Intermediate skills Level 10 docs.
Issue -
State: closed - Opened by Nachimak28 over 2 years ago
- 2 comments
Labels: won't fix, app
#13848 - Write a contributing guide for Lightning App
Issue -
State: closed - Opened by tchaton over 2 years ago
Labels: docs, app
#13757 - Support remote Lightning apps templates
Issue -
State: closed - Opened by manskx over 2 years ago
- 3 comments
Labels: app
#13756 - Update List of Components
Issue -
State: closed - Opened by oojo12 over 2 years ago
- 1 comment
Labels: docs, app
#13745 - Lightning init [app/pl-app] issues
Issue -
State: closed - Opened by luca-medeiros over 2 years ago
- 3 comments
Labels: bug, won't fix
#13709 - lightning app lightning run app docs/source-app/examples/github_repo_runner/app.py fails with train_script.py` wasn't found.
Issue -
State: closed - Opened by robert-s-lee over 2 years ago
- 3 comments
Labels: won't fix, example, app
#13639 - In multinode training with ddp each node duplicates logs and has node_rank=0
Issue -
State: closed - Opened by jessecambon over 2 years ago
- 25 comments
Labels: feature, distributed, environment
#13521 - Lightning applications fail when Path(".") is used
Issue -
State: closed - Opened by krishnakalyan3 over 2 years ago
- 2 comments
Labels: won't fix, app
#13507 - Support dynamic dark theme
Issue -
State: closed - Opened by MarcSkovMadsen over 2 years ago
- 2 comments
Labels: feature, app
#13496 - Add support for embedding a Grid of iframes on the UI
Issue -
State: closed - Opened by tchaton over 2 years ago
- 2 comments
Labels: feature, won't fix, app
#13407 - Enable me to ignore or solve self signed certificate issue
Issue -
State: closed - Opened by MarcSkovMadsen over 2 years ago
- 4 comments
Labels: feature, won't fix, waiting on author, app
#13323 - Make the Streamlit frontend multi tenant by default
Issue -
State: closed - Opened by zippeurfou over 2 years ago
- 1 comment
Labels: feature, won't fix, app
#13124 - Resuming from a mid-epoch checkpoint produces negative time estimates
Issue -
State: closed - Opened by fishbotics over 2 years ago
- 19 comments
Labels: bug, priority: 0, progress bar: tqdm
#12917 - MisconfigurationException: Trying to inject `DistributedSampler` into the `AnnLoader` instance
Issue -
State: closed - Opened by mbuttner over 2 years ago
- 6 comments
Labels: bug, data handling, trainer: predict
#12833 - MLFlowLogger used with server crashes training
Issue -
State: open - Opened by GinkoBalboa over 2 years ago
- 7 comments
Labels: feature, logger: mlflow
#12756 - UserWarning: The flag devices=-1 will be ignored
Issue -
State: closed - Opened by mnslarcher over 2 years ago
- 4 comments
Labels: question
#12624 - Enable Hyperparameter logging from any hook in the LightningModule
Issue -
State: open - Opened by cemde over 2 years ago
- 8 comments
Labels: feature, lightningmodule
#12438 - Whether clarification/documentation/redesign is needed for customizing LightningCLI subcommands
Issue -
State: open - Opened by mauvilsa over 2 years ago
- 3 comments
Labels: docs, design, lightningcli
#12119 - Use :emphasize-lines: in sphinx docs to highlight code.
Issue -
State: open - Opened by tchaton over 2 years ago
- 6 comments
Labels: good first issue, docs, priority: 1
#12095 - Early stopping conditioned on metric `val_loss` which is not available
Issue -
State: closed - Opened by JackRio over 2 years ago
- 5 comments
Labels: bug
#12094 - EarlyStopping Callback relative threshold mode
Issue -
State: open - Opened by tlpss over 2 years ago
- 8 comments
Labels: feature, design, callback: early stopping
#12013 - Cannot pass callable as `model_class` to `LightningCLI`
Issue -
State: closed - Opened by yangky11 over 2 years ago
- 5 comments
Labels: bug, lightningcli
#11979 - `ModelCheckpoint` does NOT save anything if `every_n_train_steps` is greater than the number of training steps in a epoch
Issue -
State: closed - Opened by ShaneTian over 2 years ago
- 8 comments
Labels: bug, callback: model checkpoint
#11923 - DDP GPU memory imbalanced
Issue -
State: closed - Opened by lukasfolle almost 3 years ago
- 8 comments
Labels: bug, strategy: ddp, accelerator: cuda
#11922 - Support user-defined parallelization in the LightningModule
Issue -
State: closed - Opened by ananthsub almost 3 years ago
- 3 comments
Labels: feature, distributed, strategy
#11841 - [Bug] training (sometimes) freezes in a multi-gpu setting without throwing any errors or warnings.
Issue -
State: closed - Opened by ragavsachdeva almost 3 years ago
- 6 comments
Labels: bug, won't fix, strategy: ddp
#11547 - Ability to change the number of epochs after initiating the trainer.
Issue -
State: closed - Opened by BartekKrzepkowski almost 3 years ago
- 5 comments
Labels: feature, won't fix
#11438 - Integrate TorchTensorRt in order to increase speed during inference
Issue -
State: open - Opened by Actis92 almost 3 years ago
- 7 comments
Labels: feature, 3rd party, performance
#11242 - DDP training randomly stopping
Issue -
State: closed - Opened by yoonseok312 almost 3 years ago
- 41 comments
Labels: bug, strategy: ddp
#11224 - Add "interval": "validation" to scheduler configuration
Issue -
State: open - Opened by de-gozaru almost 3 years ago
- 3 comments
Labels: feature, priority: 1, lr scheduler
#11158 - Hang when using Lightning CLI from config file and DDP
Issue -
State: closed - Opened by gau-nernst almost 3 years ago
- 12 comments
Labels: bug, lightningcli
#11126 - LightningModule self.log add_dataloader_idx doesn't reduce properly the metric across dataloaders
Issue -
State: open - Opened by tchaton almost 3 years ago
- 13 comments
Labels: bug, priority: 1
#11029 - Resuming training throws the mid-epoch warning everytime
Issue -
State: closed - Opened by rohitgr7 almost 3 years ago
- 13 comments
Labels: refactor, checkpointing
#10914 - Add feature Exponential Moving Average (EMA)
Issue -
State: open - Opened by hankyul2 almost 3 years ago
- 53 comments
Labels: feature
#10876 - RichProgressBar is not compatible with nohup command
Issue -
State: closed - Opened by quancs almost 3 years ago
- 1 comment
Labels: bug, progress bar: rich
#10759 - Proper support for Pytorch SequentialLR Scheduler
Issue -
State: open - Opened by marcm-ml almost 3 years ago
- 9 comments
Labels: bug, 3rd party, lr scheduler
#10530 - Label tracking meta-issue (edit me to get automatically CC'ed on issues!)
Issue -
State: open - Opened by carmocca about 3 years ago
- 9 comments
#10389 - Lightning is very slow between epochs, compared to PyTorch.
Issue -
State: closed - Opened by TheMrZZ about 3 years ago
- 60 comments
Labels: bug, help wanted, priority: 1, performance
#10308 - when the validation_step function returns a type defaultdict, TypeError: first argument must be callable or None occurs
Issue -
State: closed - Opened by jshin49 about 3 years ago
- 6 comments
Labels: bug, help wanted
#10285 - UserWarning: you defined a validation_step but have no val_dataloader. Skipping val loop
Issue -
State: closed - Opened by 7starsea about 3 years ago
- 6 comments
Labels: bug, help wanted, won't fix
#10260 - Guarantee call order for callbacks
Issue -
State: open - Opened by z-a-f about 3 years ago
- 9 comments
Labels: question, callback
#9947 - Support `str(datamodule)`
Issue -
State: open - Opened by carmocca about 3 years ago
- 11 comments
Labels: feature, good first issue, data handling
#9938 - Support checkpoint save and load with Stochastic Weight Averaging
Pull Request -
State: closed - Opened by adamreeve about 3 years ago
- 31 comments
Labels: feature, ready, callback: swa, community, pl
#9450 - PyTorch profiler not working with the new version 1.4.6
Issue -
State: closed - Opened by aprbw about 3 years ago
- 10 comments
Labels: bug, help wanted, priority: 0, profiler
#9318 - dictionary update sequence element #0 has length 1; 2 is required
Issue -
State: closed - Opened by cristianegea about 3 years ago
- 18 comments
Labels: bug, help wanted
#9254 - Run the test set every epoch on a single GPU
Issue -
State: closed - Opened by jipson7 about 3 years ago
- 8 comments
Labels: feature, help wanted
#9170 - Enums parsing in hparams.yaml generated
Pull Request -
State: closed - Opened by grajat90 about 3 years ago
- 11 comments
Labels: bug, ready
#8720 - FineTuning and ReduceLROnPleateau scheduler fail - optimizer.param_groups
Issue -
State: closed - Opened by FlorianMF over 3 years ago
- 7 comments
Labels: feature, help wanted, won't fix
#8040 - Memory explodes when limit_train_batches argument used
Issue -
State: closed - Opened by ejohb over 3 years ago
- 10 comments
Labels: bug, help wanted, good first issue, priority: 0
#7653 - Allow returning of test results from Trainer.test
Issue -
State: open - Opened by Rizhiy over 3 years ago
- 11 comments
Labels: feature, design, trainer: validate, trainer: test
#7028 - [Grid] You must call wandb.init() before wandb.log()
Issue -
State: closed - Opened by turian over 3 years ago
- 8 comments
Labels: bug, help wanted
#6544 - Random job failures caused by the CheckpointConnector on slurm managed hpc
Issue -
State: closed - Opened by dln22 over 3 years ago
- 4 comments
Labels: bug, help wanted, priority: 0, waiting on author, checkpointing, environment: slurm
#6480 - on_epoch_end callback is called before on_validation_epoch_end
Issue -
State: closed - Opened by dumitrescustefan over 3 years ago
- 7 comments
Labels: bug, help wanted, working as intended
#6446 - Early Stopping Min Epochs
Issue -
State: closed - Opened by thomasj02 over 3 years ago
- 5 comments
Labels: feature, help wanted, won't fix, design, callback
#6389 - Disable automatic SLURM Detection
Issue -
State: closed - Opened by amogkam over 3 years ago
- 36 comments
Labels: feature, help wanted, priority: 0, design, environment: slurm
#6381 - fit hangs on single GPU
Issue -
State: closed - Opened by fonnesbeck over 3 years ago
- 9 comments
Labels: bug, help wanted, priority: 2
#6319 - AttributeError in .fit() method for Stallion notebook
Issue -
State: closed - Opened by NatashaSvc over 3 years ago
- 4 comments
Labels: bug, help wanted, won't fix, priority: 1
#6159 - Model loaded from checkpoint has bad accuracy
Issue -
State: closed - Opened by Inspirateur over 3 years ago
- 9 comments
Labels: question
#5969 - Lightning throws "bypassing sigterm" on Slurm Cluster for unknown reason
Issue -
State: closed - Opened by vitusbenson almost 4 years ago
- 15 comments
Labels: bug, help wanted, won't fix, environment: slurm, priority: 2
#5930 - Metrics API when using DDP and multi-GPU freezes on compute() at end of validation phase
Issue -
State: closed - Opened by angadkalra almost 4 years ago
- 31 comments
Labels: bug, help wanted, priority: 0
#5725 - Training Process hangs. Full RAM and SWAP.
Issue -
State: closed - Opened by Arij-Aladel almost 4 years ago
- 16 comments
Labels: won't fix
#5469 - WandB dropping items when logging LR or val_loss with accumulate_grad_batches > 1
Issue -
State: closed - Opened by tadejsv almost 4 years ago
- 9 comments
Labels: bug, help wanted, won't fix, logger, priority: 1
#5384 - Value interpolation with hydra composition
Issue -
State: closed - Opened by celsofranssa almost 4 years ago
- 14 comments
Labels: bug, help wanted, priority: 1
#5339 - Resuming should allow to differentiate what to resume (steps/opti/weights)
Issue -
State: open - Opened by thoglu almost 4 years ago
- 25 comments
Labels: feature, help wanted, priority: 1
#5180 - How can I stop WandbLogger instance being instantiated when calling load_from_checkpoint?
Issue -
State: closed - Opened by kyoungrok0517 almost 4 years ago
- 11 comments
Labels: bug, question, won't fix, logger, 3rd party
#4998 - `LightningModule.log(..., on_epoch=True)` logs with `global_step` instead of `current_epoch`
Issue -
State: closed - Opened by quinor almost 4 years ago
- 11 comments
Labels: feature, help wanted, logging
#4792 - checkpoint cannot be loaded without source code
Issue -
State: closed - Opened by Sushobhan04 almost 4 years ago
- 9 comments
Labels: help wanted, question, checkpointing
#4450 - Data loading hangs before first validation step
Issue -
State: closed - Opened by jonashaag about 4 years ago
- 30 comments
Labels: help wanted, won't fix, waiting on author
#4045 - continue training from checkpoint seems broken (high loss values), while reasonable with .eval()
Issue -
State: closed - Opened by yairkit about 4 years ago
- 20 comments
Labels: bug, help wanted, priority: 0
#3431 - How to disable printings about GPU/TPU
Issue -
State: closed - Opened by 7rick03ligh7 about 4 years ago
- 11 comments
Labels: question
#3325 - Support uneven DDP inputs with pytorch model.join
Issue -
State: open - Opened by edenlightning about 4 years ago
- 25 comments
Labels: feature, help wanted, distributed, 3rd party
#3228 - Log epoch as step when on_epoch=True and on_step=False
Issue -
State: closed - Opened by ToucheSir about 4 years ago
- 34 comments
Labels: feature, help wanted
#3107 - How automaticly load best model checkpoint on Trainer instance with TestTubeLogger
Issue -
State: closed - Opened by Vichoko about 4 years ago
- 10 comments
Labels: question
#2974 - fix tb hparams logging
Pull Request -
State: closed - Opened by s-rog over 4 years ago
- 32 comments
Labels: bug, feature
#2772 - Model alone makes different predictions compared to trainer + model
Issue -
State: closed - Opened by JanRuettinger over 4 years ago
- 14 comments
Labels: bug, help wanted, priority: 0
#2658 - Pytorch lightning switched to cpu in the middle of training. How can I debug this?
Issue -
State: closed - Opened by samikhenissi over 4 years ago
- 11 comments
Labels: bug, help wanted
#2351 - Model validation code is not called
Issue -
State: closed - Opened by Uroc327 over 4 years ago
- 13 comments
Labels: bug, help wanted
#2295 - Stop at Validation sanity check
Issue -
State: closed - Opened by hminle over 4 years ago
- 8 comments
Labels: question
#2189 - Can you make a new progress bar for each epoch?
Issue -
State: closed - Opened by bjourne over 4 years ago
- 22 comments
Labels: question, progress bar: tqdm
#2145 - How do you save a trained model in standard pytorch format?
Issue -
State: closed - Opened by mm04926412 over 4 years ago
- 13 comments
Labels: question