Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / Lightning-AI/lightning issues and pull requests

#18981 - App: Limit rate of requests to http queue

Pull Request - State: closed - Opened by ethanwharris 11 months ago - 2 comments
Labels: ready, app

#18980 - App: Enable bundling additional files into app source

Pull Request - State: closed - Opened by ethanwharris 11 months ago - 1 comment
Labels: ready, app

#18978 - CSVLogger column ordering is random

Issue - State: open - Opened by taromakino 11 months ago - 7 comments
Labels: feature, good first issue, logger: csv, ver: 2.1.x

#18977 - Add Video/Audio support

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18976 - App: Force plugin server to use localhost

Pull Request - State: closed - Opened by ethanwharris 11 months ago - 1 comment
Labels: ready, app

#18975 - Training a simple XOR network yields incorrect, undeterministic behaviour

Issue - State: closed - Opened by Fohlen 11 months ago - 4 comments
Labels: question, ver: 2.1.x

#18974 - Can we have lr_scheduler as a key input for the trainer() function?

Issue - State: closed - Opened by HelloWorldLTY 11 months ago - 2 comments
Labels: feature

#18973 - Hparams should always be an AttributeDict

Pull Request - State: open - Opened by klieret 11 months ago - 3 comments
Labels: bug, code quality, community, app, pl

#18972 - Loose type annotations of hparams cause Pylint basic type checker false positives

Issue - State: open - Opened by klieret 11 months ago - 2 comments
Labels: bug, code quality, ver: 2.1.x

#18971 - `Trainer.save_checkpoint()` after `Trainer.test()` not working with FSDP

Issue - State: closed - Opened by awaelchli 11 months ago
Labels: bug, strategy: fsdp, ver: 2.1.x

#18969 - Last.ckpt symlink breaking on Windows

Issue - State: closed - Opened by Jason94 11 months ago - 2 comments
Labels: bug, duplicate, ver: 2.1.x

#18968 - Installing lightning changes my pytorch version from cuda118 to cpu

Issue - State: open - Opened by arunraja-hub 11 months ago - 4 comments
Labels: bug, dependencies, ver: 2.2.x

#18967 - Enable parallel logging for private logging servers with high latency

Issue - State: open - Opened by OlfwayAdbayIgbay 11 months ago - 1 comment
Labels: feature, discussion, logger: mlflow

#18966 - Add `@override` for files in `src/lightning/pytorch/loops`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 1 comment
Labels: ready, community, pl

#18965 - Bump Lightning Cloud Version to 0.5.52

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready, app, dependencies

#18964 - Prevent downloading more chunks than needed

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18963 - Provide the default vocab size for the transformer demo model

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: example, ready, pl

#18962 - Bump Lightning Cloud to 0.5.51

Pull Request - State: closed - Opened by tchaton 11 months ago - 3 comments
Labels: ready, app, dependencies

#18961 - ci: add labels for data & store

Pull Request - State: closed - Opened by Borda 11 months ago - 1 comment
Labels: ready, ci

#18960 - Add the input_dir in the cache_dir to avoid overlapping downloads

Pull Request - State: closed - Opened by tchaton 11 months ago - 2 comments
Labels: ready

#18959 - Add support for deleting chunks

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18958 - Add more CUDA card FLOPs

Pull Request - State: closed - Opened by carmocca 11 months ago - 1 comment
Labels: feature, ready, fabric

#18957 - Cache directory per worker to avoid collisions

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: ready

#18956 - ModelCheckpoint filename delimiters not working as expected

Issue - State: closed - Opened by Anjum48 11 months ago - 4 comments
Labels: bug, help wanted, docs, callback: model checkpoint, ver: 2.1.x

#18955 - Create cache dir if it doesn't exist

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: bug, ready

#18954 - Move torchmetrics to device when using FSDP

Pull Request - State: closed - Opened by awaelchli 11 months ago - 3 comments
Labels: bug, ready, fabric, strategy: fsdp, pl, fun

#18953 - Add `@override` for files in `src/lightning/pytorch/core`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 2 comments
Labels: ready, code quality, community, pl

#18952 - Fix parsing v100s in `get_available_flops`

Pull Request - State: closed - Opened by awaelchli 11 months ago - 3 comments
Labels: bug, ready, fabric, fun

#18951 - Remember the eval mode of submodules when switching trainer stages

Pull Request - State: closed - Opened by awaelchli 11 months ago - 3 comments
Labels: feature, ready, docs, trainer: fit, trainer: validate, pl, fun

#18950 - Only one GPU is being used on SLURM cluster

Issue - State: closed - Opened by vandrw 11 months ago - 3 comments
Labels: question, environment: slurm, ver: 2.1.x

#18949 - lightning.data: Fix some bugs with optimize

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18948 - Add `@override` for subclasses of PyTorch `Logger`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 2 comments
Labels: ready, code quality, community, pl

#18947 - Add GPU support for map

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18946 - FLOPs not found for 'Tesla V100S-PCIE-32GB'

Issue - State: closed - Opened by DavidGOrtega 11 months ago - 2 comments
Labels: bug, ver: 2.1.x

#18944 - Root module wrapped when using custom FSDP strategy

Issue - State: open - Opened by asuglia-alana 11 months ago - 6 comments
Labels: question, 3rd party, strategy: fsdp, ver: 2.1.x

#18943 - Add AttributeDict container for Fabric

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: feature, ready, docs, refactor, checkpointing, fabric, pl, fun

#18942 - Fix symlink permission error for "last" checkpoint on Windows

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: bug, ready, callback: model checkpoint, pl, fun

#18941 - SkipResumeTrainingValidationLoop._should_check_val_fx() takes 1 positional argument but 2 were given

Issue - State: closed - Opened by ganymedenet 11 months ago - 2 comments
Labels: question, ver: 2.1.x

#18940 - Add dataset creation

Pull Request - State: closed - Opened by tchaton 11 months ago - 2 comments
Labels: ready, app, dependencies

#18939 - Resolve bug with the uploader

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18938 - Fix oversized items not fitting into a chunk

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: ready

#18937 - docs: ignore tutorial's link "what-is-a-gpu"

Pull Request - State: closed - Opened by Borda 11 months ago - 1 comment
Labels: ready, docs, pl

#18936 - `convert_module` in `BitsandbytesPrecision` is called before `configure_model`

Issue - State: open - Opened by lucadiliello 11 months ago - 2 comments
Labels: bug, ver: 2.1.x, ver: 2.2.x, precision: bnb

#18935 - How to debug in the training process

Issue - State: open - Opened by ljhOfGithub 11 months ago
Labels: docs, needs triage

#18934 - Add `@override` for subclasses of PyTorch `Accelerator`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 3 comments
Labels: ready, code quality, community, pl

#18933 - NCCL error of fabric when using dp or ddp strategy

Issue - State: closed - Opened by Galaxy-Husky 11 months ago - 2 comments
Labels: bug, needs triage, ver: 2.1.x

#18932 - Unable to train on TPU v4-8: Failed to get global TPU topology.

Issue - State: open - Opened by fabienGenhealth 11 months ago - 3 comments
Labels: bug, accelerator: tpu, 3rd party, ver: 2.1.x

#18931 - Cannot save last checkpoint due to breaking change in new release

Issue - State: closed - Opened by francescocarzaniga 11 months ago - 1 comment
Labels: bug, needs triage, ver: 2.1.x

#18929 - Remove redundant Trainer state assignment

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: ready, refactor, pl

#18928 - Fix precision default from environment

Pull Request - State: closed - Opened by carmocca 11 months ago - 3 comments
Labels: bug, ready, fabric

#18927 - Add tfloat32 to throughput

Pull Request - State: closed - Opened by carmocca 11 months ago - 2 comments
Labels: feature, ready, fabric

#18926 - TPU FSDP support for Fabric requires major refactoring of training script(s)

Issue - State: closed - Opened by fabienGenhealth 11 months ago - 1 comment
Labels: question, strategy: xla

#18925 - Add human readable format for chunk_bytes

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18924 - Rename Throughput flops argument

Pull Request - State: closed - Opened by carmocca 11 months ago - 2 comments
Labels: ready, fabric, pl

#18923 - ci: restrict build docs on PR

Pull Request - State: closed - Opened by Borda 11 months ago - 1 comment
Labels: ready, docs, ci, fabric, app, pl

#18922 - Add `@override` for subclasses of PyTorch `_Launcher`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 2 comments
Labels: ready, code quality, community, pl

#18921 - ci/docs: fix copy integration sub-docs

Pull Request - State: closed - Opened by Borda 11 months ago - 2 comments
Labels: ready, docs, ci, priority: 1, app, pl

#18920 - Improve s3 client support

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready, ci

#18919 - Transformer T5Encoder get nan in Trainer with 16-mixed precision

Issue - State: closed - Opened by yanzhenms 11 months ago - 1 comment
Labels: bug, needs triage, ver: 2.1.x

#18918 - The default value of 'precision' in Fabric() should be '32-true'

Issue - State: closed - Opened by shihaoyin 11 months ago - 1 comment
Labels: bug, ver: 2.2.x

#18917 - Weekly minor patch release `2.1.1` [rebase & merge]

Pull Request - State: closed - Opened by Borda 11 months ago - 3 comments
Labels: ready, docs, ci, fabric, app, pl, dependencies, dockers, package

#18916 - Update minimum typing-extensions to 4.4

Pull Request - State: closed - Opened by carmocca 11 months ago - 2 comments
Labels: ready, fabric, pl, dependencies

#18914 - Fix bitsandbytes layer conversion under `init_module` context manager

Pull Request - State: closed - Opened by awaelchli 11 months ago - 5 comments
Labels: bug, ready, fabric, pl, dependencies

#18913 - Bitsandbytes doesn't convert layers under `init_module` context

Issue - State: closed - Opened by awaelchli 11 months ago
Labels: bug, ver: 2.1.x

#18912 - Improve map, optimize and StreamingDataset

Pull Request - State: closed - Opened by tchaton 11 months ago - 2 comments
Labels: ready, docs, fabric, app, pl, dependencies

#18911 - Add `@override` for subclasses of PyTorch `Strategy`

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 7 comments
Labels: ready, code quality, community, pl

#18910 - Make input dir in DataProcessor required

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: ready, data

#18908 - Skip hanging collective test

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: ready, priority: 0, fabric, tests

#18907 - Greedily select files for data processor workers based on size

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: ready

#18906 - Flatten dataclass hyperparameters for logging

Pull Request - State: closed - Opened by jaswon 11 months ago - 1 comment
Labels: ready, logger, fabric, community

#18905 - Add batches argument to throughput

Pull Request - State: closed - Opened by carmocca 11 months ago - 1 comment
Labels: feature, ready, fabric, pl

#18904 - Fix broken links on Lightning App docs

Pull Request - State: closed - Opened by VictorPrins 11 months ago - 2 comments
Labels: ready, docs, community, app

#18903 - Bitsandbytes docs improvements

Pull Request - State: closed - Opened by carmocca 11 months ago - 1 comment
Labels: ready, docs, fabric, pl

#18902 - Bug of SingleDeviceStrategy: incoherent device between accelerator and strategy when accelerator="auto"

Issue - State: open - Opened by ZekunZh 11 months ago - 3 comments
Labels: bug, priority: 2, ver: 2.0.x

#18901 - Improve map and chunkify

Pull Request - State: closed - Opened by tchaton 11 months ago - 2 comments
Labels: ready, app, dependencies

#18900 - Symlink last checkpoint will fail on windows due to permission error

Issue - State: closed - Opened by aweinmann 11 months ago - 1 comment
Labels: bug, help wanted, callback: model checkpoint, ver: 2.1.x

#18899 - Update CLI tests to no longer require 3rd party logger dependencies

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: ready, lightningcli, tests, pl, fun

#18898 - Install both `lightning` and `pytorch-lightning` in docker image

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: ready, ci, fun, dockers

#18897 - Fix parsing of version in TensorBoardLogger and CSVLogger

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: bug, ready, logger: tensorboard, fabric, logger: csv, pl, fun

#18896 - Update evaluation logging test

Pull Request - State: closed - Opened by carmocca 11 months ago - 2 comments
Labels: ready, tests, pl

#18895 - Support `log_every_n_steps` with validate|test

Pull Request - State: open - Opened by carmocca 11 months ago - 3 comments
Labels: feature, logging, breaking change, pl

#18894 - Implement todos in tensorboard docs

Pull Request - State: closed - Opened by rasbt 11 months ago - 4 comments
Labels: ready, docs, pl

#18893 - docs: switch todo to comment

Pull Request - State: closed - Opened by Borda 11 months ago - 1 comment
Labels: ready, docs, fabric, app, pl

#18892 - Add DataRecipe

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18891 - Prevent leaking the thread to the workers

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: ready

#18890 - ModuleNotFoundError: No module named 'lightning' in lightning container image

Issue - State: closed - Opened by GuyPozner 11 months ago - 4 comments
Labels: question, ver: 2.1.x

#18889 - Bump actions/setup-node from 3 to 4

Pull Request - State: open - Opened by dependabot[bot] 11 months ago - 2 comments
Labels: ready, ci

#18888 - When Using FSDP Strategy, Lightning Does not Move TorchMetrics to Device (Torch 2.1.0)

Issue - State: closed - Opened by kamal-rahimi 11 months ago
Labels: bug, strategy: fsdp, ver: 2.1.x

#18886 - Enable RUF018 rule for walrus assignments in asserts

Pull Request - State: closed - Opened by awaelchli 11 months ago - 2 comments
Labels: ready, fabric, tests, app, pl, fun, package

#18885 - No action for key "ckpt_path" -> ckpt_path not available for linking

Issue - State: open - Opened by Toekan 11 months ago - 3 comments
Labels: bug, lightningcli, ver: 2.0.x

#18884 - Refined FSDP saving logic and error messaging when path exists

Pull Request - State: closed - Opened by awaelchli 11 months ago - 1 comment
Labels: bug, ready, fabric, strategy: fsdp, pl, fun

#18883 - lightning apps: add flow fail()

Pull Request - State: closed - Opened by nohalon 11 months ago - 1 comment
Labels: ready, app

#18882 - Improve Streaming Dataset API

Pull Request - State: closed - Opened by tchaton 11 months ago - 1 comment
Labels: feature, ready, data handling

#18881 - Issues regarding model saving and reading when I use lightning.Fabric multi-GPU and FSDP strategies

Issue - State: closed - Opened by Williamwsk 11 months ago - 2 comments
Labels: bug, strategy: fsdp, ver: 2.1.x

#18880 - Misconfiguration error with self.log() called twice in training step

Issue - State: closed - Opened by NumberChiffre 11 months ago - 7 comments
Labels: bug, logging, torch.compile, ver: 2.1.x

#18879 - feature/15718_sagemaker experiment logger

Pull Request - State: open - Opened by tsenst 11 months ago
Labels: docs, pl, dependencies

#18878 - Cannot save checkpoint when using deepspeed

Issue - State: closed - Opened by ydk-tellurion 11 months ago - 4 comments
Labels: bug, needs triage, ver: 2.0.x, ver: 1.9.x, ver: 2.1.x