Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / Lightning-AI/lightning issues and pull requests

#18877 - Update Habana integration to 1.2

Pull Request - State: closed - Opened by carmocca 11 months ago - 2 comments
Labels: ready, docs, tests, accelerator: hpu (external), pl, dependencies

#18876 - Update logic to parse trainer settings from env vars

Pull Request - State: closed - Opened by awaelchli 11 months ago - 5 comments
Labels: bug, has conflicts, lightningcli, strategy: ddp, trainer, pl

#18875 - Feature/15718 add sagemaker experiments logger

Pull Request - State: closed - Opened by tsenst 11 months ago
Labels: docs, ci, fabric, app, pl, dependencies, package

#18874 - Can't override Trainer defaults via env variables for LightningCLI

Issue - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: bug, strategy: ddp, ver: 2.1.x

#18873 - fabric.save errors when checkpoint exists; introduce a override_checkpoint=True argument

Issue - State: closed - Opened by rasbt 11 months ago - 2 comments
Labels: feature, design, checkpointing, fabric

#18872 - DDP + static graph can result in garbage data returned by `all_gather`

Issue - State: open - Opened by mooninrain 11 months ago
Labels: bug, 3rd party, ver: 2.0.x, repro needed

#18871 - self.log issue with torch.compile.

Issue - State: closed - Opened by assassin1991 11 months ago - 2 comments
Labels: bug, needs triage, ver: 2.1.x

#18870 - Support for returning LRSchedulerConfig on LightningModule.configure_optimizers

Issue - State: open - Opened by function2-llx 11 months ago - 1 comment
Labels: feature, help wanted, lightningmodule, lr scheduler

#18869 - Consistent imports in docs for core APIs

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: ready, docs, pl, fun

#18867 - Fix `ModelCheckpoint` callback for no loggers case

Pull Request - State: closed - Opened by ioangatop 11 months ago - 2 comments
Labels: bug, ready, callback: model checkpoint, community, pl

#18866 - Update typing_extensions minimum. Add overrides to ParallelStrategy.

Pull Request - State: open - Opened by seanbethard 11 months ago - 4 comments
Labels: has conflicts, fabric, pl, dependencies

#18865 - Callback `ModelCheckpoint` option `save_last` without logger fails on remote FS

Issue - State: closed - Opened by ioangatop 11 months ago - 1 comment
Labels: bug, callback: model checkpoint, ver: 2.1.x, ver: 2.2.x

#18864 - Fix `CSVLogger` for remote FS

Pull Request - State: open - Opened by ioangatop 11 months ago - 3 comments
Labels: fabric

#18863 - Handle checkpoint dirpath suffix in NeptuneLogger

Pull Request - State: closed - Opened by AleksanderWWW 11 months ago - 6 comments
Labels: bug, ready, logger: neptune, community, pl

#18862 - Instantiation of Runners is very slow on Windows

Issue - State: closed - Opened by newLabAspect 11 months ago - 4 comments
Labels: question, performance, ver: 2.1.x

#18861 - `CSVLogger` fails on remote FS on version `2.1.0`

Issue - State: open - Opened by ioangatop 11 months ago - 2 comments
Labels: bug, logger: csv, ver: 2.1.x

#18860 - Add broadcast to Dataset Optimizer with multiple nodes

Pull Request - State: closed - Opened by tchaton 11 months ago - 2 comments
Labels: ready, ci, app, dependencies

#18859 - Restore support for builds without distributed

Pull Request - State: closed - Opened by carmocca 11 months ago - 1 comment
Labels: bug, ready, fabric, pl

#18858 - Support for torch without distributed broken

Issue - State: closed - Opened by adamjstewart 11 months ago - 3 comments
Labels: bug, distributed, ver: 2.1.x

#18857 - Hanging with NeMo

Issue - State: open - Opened by szhengac 11 months ago
Labels: bug, needs triage, ver: 2.0.x

#18856 - Update debugging_basic.rst

Pull Request - State: closed - Opened by rasbt 11 months ago
Labels: ready, docs, community, pl

#18854 - Bugfix/18394 batch size finder max val batches

Pull Request - State: closed - Opened by BoringDonut 11 months ago - 2 comments
Labels: bug, ready, tuner, community, pl

#18853 - Loading a distributed checkpoint with Fabric fails with a RuntimeError

Issue - State: closed - Opened by rasbt 11 months ago - 2 comments
Labels: bug, fabric, strategy: fsdp, ver: 2.1.x

#18852 - Add `torch.compile` guide to docs

Issue - State: open - Opened by carmocca 11 months ago - 1 comment
Labels: docs, fabric, performance, pl, torch.compile

#18851 - [WIP] Avoid moving XLA model to CPU in teardown [TPU]

Pull Request - State: open - Opened by awaelchli 11 months ago - 2 comments
Labels: accelerator: tpu, strategy: xla

#18850 - Add distributed support for StreamingDataset

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18848 - Add throughput utilities to Fabric and the Trainer

Pull Request - State: closed - Opened by carmocca 11 months ago - 1 comment
Labels: feature, ready, docs, callback, fabric, pl

#18847 - Extend warning about reducing non floating types

Pull Request - State: closed - Opened by carmocca 11 months ago - 2 comments
Labels: feature, ready, logging, pl

#18846 - Change dangerous default random seed selection

Pull Request - State: closed - Opened by awaelchli 11 months ago - 3 comments
Labels: feature, ready, breaking change, fabric, reproducibility, pl

#18845 - Extra GPU usage in ddp and ddp-spawn

Issue - State: closed - Opened by FANGAreNotGnu 11 months ago - 2 comments
Labels: bug, strategy: ddp, ver: 2.0.x, repro needed

#18844 - Tensor wrapper subclass to avoid `fabric.backward`

Pull Request - State: open - Opened by carmocca 11 months ago - 2 comments
Labels: feature, fabric, pl

#18843 - Fixes in evaluation_basic.rst

Pull Request - State: closed - Opened by rasbt 11 months ago
Labels: ready, docs, pl

#18842 - `Fabric.configure_module` breaks `@property.setter`

Issue - State: closed - Opened by busFred 11 months ago - 2 comments
Labels: bug, ver: 2.0.x, repro needed

#18840 - Rename PrecisionPlugin -> Precision

Pull Request - State: closed - Opened by awaelchli 11 months ago - 4 comments
Labels: ready, docs, refactor, fabric, plugin, pl, fun

#18838 - Add example for loading a LightningModule if it has additional init arguments

Pull Request - State: closed - Opened by rasbt 12 months ago - 2 comments
Labels: ready, docs, pl

#18837 - ci: fix typo in SHA ref

Pull Request - State: closed - Opened by Borda 12 months ago - 1 comment
Labels: ready, ci

#18836 - Stuck at loading the trainer module

Issue - State: closed - Opened by alalith3298 12 months ago - 2 comments
Labels: bug, accelerator: cuda, ver: 2.1.x, repro needed

#18835 - Issue with logs when using torch.compile

Issue - State: open - Opened by Forbu 12 months ago - 5 comments
Labels: bug, torch.compile, ver: 2.1.x

#18834 - `BatchSizeFinder` limits number of validation batches for the whole training process

Issue - State: closed - Opened by BoringDonut 12 months ago - 3 comments
Labels: bug, duplicate, tuner, ver: 2.0.x, ver: 1.8.x

#18833 - Add a `prefix` paramater to `self.log_dict()`

Issue - State: closed - Opened by GaetanLepage 12 months ago - 3 comments
Labels: feature, logging

#18832 - Have each DDP worker optimizing a specific layer of a common model

Issue - State: open - Opened by rob-hen 12 months ago
Labels: feature, needs triage

#18831 - `training_step(dataloader_iter)` no longer moves batch to device in 2.1

Issue - State: closed - Opened by YichengDWu 12 months ago - 4 comments
Labels: question, docs, data handling, ver: 2.1.x

#18830 - Missing folder error when using TensorBoardLogger with S3 uri

Issue - State: open - Opened by celpas 12 months ago
Labels: bug, needs triage, ver: 2.0.x

#18829 - Error resuming checkpoint when using `configure_model` method of `LightningModule`

Issue - State: closed - Opened by Kinyugo 12 months ago - 1 comment
Labels: bug, duplicate, ver: 2.0.x

#18828 - Scheduler is still stepped when optimizer stepping is skipped.

Issue - State: closed - Opened by oguz-hanoglu 12 months ago - 3 comments
Labels: bug, duplicate, precision: amp, ver: 2.0.x

#18827 - Improve DatasetOptimizer API

Pull Request - State: closed - Opened by tchaton 12 months ago - 2 comments
Labels: ready, app, dependencies

#18826 - Fix `BatchSizeFinder` leaving model in train state

Pull Request - State: open - Opened by tanaymeh 12 months ago - 10 comments
Labels: bug, tuner, community, pl

#18825 - Provide DDP rank in Module constructor to enable setting requires_grad worker dependent

Issue - State: closed - Opened by rob-hen 12 months ago
Labels: feature, needs triage

#18824 - `LightningModule.to_torchscript()` does not transfer check_inputs to correct device

Issue - State: open - Opened by pfeatherstone 12 months ago - 6 comments
Labels: bug, good first issue, ver: 2.0.x, repro needed

#18823 - LightningCLI logger related tests not being run in pull requests

Issue - State: closed - Opened by mauvilsa 12 months ago - 3 comments
Labels: bug, ci, tests, ver: 2.1.x

#18822 - LinghtningCLI now will not allow setting a class instance as a default

Pull Request - State: closed - Opened by mauvilsa 12 months ago - 1 comment
Labels: ready, lightningcli, community, pl, dependencies

#18821 - Fix failing lightning cli entry point

Pull Request - State: closed - Opened by awaelchli 12 months ago - 2 comments
Labels: bug, ready, fabric

#18820 - transformer engine (FP8) support for FSDP training

Issue - State: closed - Opened by naveenkumarmarri 12 months ago - 1 comment
Labels: feature, needs triage

#18819 - Avoid false-positive warnings about method calls on the Fabric-wrapped module

Pull Request - State: closed - Opened by awaelchli 12 months ago - 2 comments
Labels: feature, ready, fabric, pl, fun

#18818 - Fix reduce type in FSDP mixed precision

Pull Request - State: open - Opened by awaelchli 12 months ago
Labels: fabric, pl

#18817 - Tiny fixes for the Cache & DatasetOptimizer

Pull Request - State: closed - Opened by tchaton 12 months ago - 1 comment
Labels: ready

#18816 - Update bug report template for 2.1

Pull Request - State: closed - Opened by awaelchli 12 months ago - 1 comment
Labels: ready, ci

#18815 - lightning run cli entry point stopped working after dropping app from top level

Issue - State: closed - Opened by awaelchli 12 months ago
Labels: bug, app, dependencies, ver: 2.1.x

#18814 - Bump @babel/traverse from 7.18.6 to 7.23.2 in /src/lightning/app/cli/react-ui-template/ui

Pull Request - State: open - Opened by dependabot[bot] 12 months ago - 1 comment
Labels: app, javascript

#18813 - BatchSizeFinder leaves model in the train state if used with trainer.validate

Issue - State: open - Opened by BoringDonut 12 months ago - 2 comments
Labels: bug, tuner, ver: 2.0.x, ver: 1.7.x, ver: 1.8.x

#18812 - LR Finder fails when using multi-node training

Issue - State: open - Opened by praritagarwal 12 months ago - 1 comment
Labels: question, tuner, ver: 2.1.x

#18809 - Missing Positional Arguments from CLI/Config File

Issue - State: open - Opened by tommycwh 12 months ago - 1 comment
Labels: bug, needs triage, ver: 2.0.x

#18808 - train_dataloader not recognized in Data Module

Issue - State: closed - Opened by jscottcronin 12 months ago - 2 comments
Labels: needs triage, ver: 2.0.x

#18807 - Add support for text

Pull Request - State: closed - Opened by tchaton 12 months ago - 1 comment
Labels: ready, ci

#18806 - Bad doc webpage layout

Issue - State: closed - Opened by yuzhenmao 12 months ago - 3 comments
Labels: docs, ver: 2.1.x

#18805 - Access denied to save model checkpoint on AWS S3.

Issue - State: closed - Opened by celsofranssa 12 months ago - 1 comment
Labels: bug, needs triage, ver: 2.0.x

#18804 - Modification of the current_epoch attribute or other interesting @properties without setters

Issue - State: open - Opened by rucky96 12 months ago
Labels: feature, needs triage

#18803 - [Bug] RuntimeError: No backend type associated with device type cpu

Issue - State: open - Opened by shenoynikhil 12 months ago - 12 comments
Labels: bug, working as intended, ver: 2.1.x

#18802 - FSDP not working well with BatchNorm and 16-mixed precision

Issue - State: closed - Opened by DLlearn 12 months ago - 3 comments
Labels: bug, 3rd party, precision: amp, strategy: fsdp

#18801 - docs: update ref to latest tutorials

Pull Request - State: closed - Opened by pl-ghost 12 months ago - 1 comment
Labels: ready, examples

#18800 - Docs website css is buggy

Issue - State: closed - Opened by busFred 12 months ago - 1 comment
Labels: docs

#18798 - Cannot use compiled model together with the `ddp` strategy

Issue - State: closed - Opened by quancs 12 months ago - 1 comment
Labels: bug, needs triage, ver: 2.0.x

#18796 - Add name and version

Pull Request - State: closed - Opened by tchaton 12 months ago - 2 comments
Labels: ready, app, dependencies

#18795 - ci: simplify/unify make docs targets

Pull Request - State: closed - Opened by Borda 12 months ago - 1 comment
Labels: ready, docs, ci

#18794 - Update 2.2.0dev development version and changelog

Pull Request - State: closed - Opened by awaelchli 12 months ago - 1 comment
Labels: fabric, app, pl, package

#18793 - Fix bug when removing last checkpoint with deepspeed

Pull Request - State: closed - Opened by hiaoxui 12 months ago - 1 comment
Labels: bug, ready, callback: model checkpoint, community, pl

#18792 - docs: fix pages on PyPI

Pull Request - State: closed - Opened by Borda 12 months ago - 3 comments
Labels: ready, ci, priority: 1, release, fabric, pl, package

#18791 - ci/release: create a PR for release bump

Pull Request - State: closed - Opened by Borda 12 months ago - 1 comment
Labels: ready, ci, priority: 1, release

#18790 - ci/docs: create PR only if needed

Pull Request - State: closed - Opened by Borda 12 months ago - 1 comment
Labels: ready, ci

#18789 - Adding test for legacy checkpoint created with 2.1.x

Pull Request - State: closed - Opened by pl-ghost 12 months ago - 2 comments
Labels: ready, checkpointing, tests, pl

#18788 - Introduce Dataset Optimizer

Pull Request - State: closed - Opened by tchaton 12 months ago - 4 comments
Labels: ready, app, dependencies

#18787 - docs: update ref to latest tutorials & fix CI trigger

Pull Request - State: closed - Opened by pl-ghost 12 months ago - 1 comment
Labels: ready, docs, ci, pl, examples

#18786 - Support saving and loading remote paths with FSDP

Issue - State: open - Opened by schmidt-ai 12 months ago - 3 comments
Labels: feature, help wanted, strategy: fsdp, ver: 2.1.x

#18785 - Revert removal of empty-parameters check for `configure_optimizers()` with FSDP

Pull Request - State: closed - Opened by awaelchli 12 months ago - 2 comments
Labels: bug, ready, strategy: fsdp, pl

#18784 - LightningModule.configure_callbacks overrides Trainer callbacks

Issue - State: open - Opened by adamjstewart 12 months ago - 12 comments
Labels: feature, discussion, lightningmodule

#18783 - docs: setting cron for periodical update tutorials

Pull Request - State: closed - Opened by Borda 12 months ago - 2 comments
Labels: ready, ci

#18782 - Update probot-check-group.yml to v5.4

Pull Request - State: closed - Opened by carmocca 12 months ago - 1 comment
Labels: ready, ci

#18781 - The training mode is accidentally enabled in training_step function.

Issue - State: closed - Opened by w2kun 12 months ago - 1 comment
Labels: bug, needs triage, ver: 1.9.x

#18780 - warnings: resuming before epoch end is absolutely normal for long trainings

Issue - State: open - Opened by stas00 12 months ago - 5 comments
Labels: feature, data handling

#18779 - xfail collective tests

Pull Request - State: closed - Opened by carmocca 12 months ago - 1 comment
Labels: ready, fabric, tests

#18778 - Bugfix: Pin `lightning-cloud` version

Pull Request - State: closed - Opened by ethanwharris 12 months ago - 1 comment
Labels: ready, app, dependencies

#18777 - `ImportError`: cannot import name 'V1CloudSpaceAppAction' from 'lightning_cloud.openapi.models'

Issue - State: closed - Opened by ordabayevy 12 months ago - 3 comments
Labels: bug, app, ver: 2.1.x

#18776 - Raise an exception when calling `fit` twice with spawn

Pull Request - State: closed - Opened by carmocca 12 months ago - 2 comments
Labels: ready, breaking change, strategy: ddp, pl, strategy: xla

#18775 - Calling `trainer.fit` twice with spawn strategies won't work as expected

Issue - State: open - Opened by carmocca 12 months ago
Labels: bug, priority: 1, strategy: ddp, strategy: xla, ver: 2.0.x

#18774 - Minor strategy fixes [TPU]

Pull Request - State: open - Opened by carmocca 12 months ago - 2 comments
Labels: bug, ready, fabric, pl

#18773 - Fix spelling errors

Pull Request - State: closed - Opened by awaelchli 12 months ago - 1 comment
Labels: ready, docs, fabric, app, pl
