pytorch/torchtune issues and pull requests

#808 - Is there a plan to support AnswerDotAI/fsdp_qlora that fine-tune 70b LLM on 2x 24G GPUs (like: RTX 3090)?

Issue - State: closed - Opened by yaohwang 10 months ago - 7 comments

#806 - ImportError: cannot import name 'return_and_correct_aliasing' on Kaggle notebook

Issue - State: closed - Opened by shure-dev 10 months ago - 3 comments

#805 - fix llama3 tutorial link

Pull Request - State: closed - Opened by kartikayk 10 months ago - 3 comments
Labels: CLA Signed

#802 - Llama3-70B LoRA multi GPU

Pull Request - State: closed - Opened by rohan-varma 10 months ago - 4 comments
Labels: CLA Signed

#796 - fixed mixed precision in FSDP

Pull Request - State: closed - Opened by denadai2 10 months ago - 4 comments

#791 - Vision/Multimodal

Issue - State: open - Opened by bhack 10 months ago - 22 comments
Labels: enhancement

#790 - MPS support

Pull Request - State: closed - Opened by maximegmd 10 months ago - 22 comments
Labels: CLA Signed

#785 - Add Selective Activation Checkpointing

Pull Request - State: closed - Opened by lessw2020 10 months ago - 1 comment
Labels: CLA Signed

#781 - utils.set_activation_checkpointing is unnecessarily restrictive

Issue - State: closed - Opened by rohan-varma 10 months ago - 1 comment

#771 - DPO supports multi-device training

Pull Request - State: closed - Opened by yechenzhi 10 months ago - 12 comments
Labels: CLA Signed

#730 - Organise steps logic

Pull Request - State: closed - Opened by tcapelle 10 months ago - 9 comments
Labels: CLA Signed

#721 - Disabling LoRA with compiled models

Issue - State: closed - Opened by BenjaminBossan 10 months ago - 3 comments
Labels: bug

#691 - Testing mega-issue

Issue - State: open - Opened by ebsmothers 10 months ago
Labels: enhancement

#676 - Install gcc >= 9 to support torch.compile testing with inductor backend

Issue - State: closed - Opened by rohan-varma 11 months ago - 1 comment
Labels: testing

#666 - Use less memory during lora state dict validation

Pull Request - State: closed - Opened by ebsmothers 11 months ago - 3 comments
Labels: CLA Signed

#665 - Out of CUDA on 15 GB colab . Just trying to train Mistral 7. v1

Issue - State: closed - Opened by raymondbernard 11 months ago - 12 comments

#654 - int4 gptq working.

Pull Request - State: closed - Opened by HDCharles 11 months ago - 2 comments
Labels: CLA Signed

#651 - [In Progress] FSDP2 + NF4Tensor

Pull Request - State: closed - Opened by weifengpy 11 months ago - 2 comments
Labels: CLA Signed

#648 - [RFC] Documenting and validating recipe params

Issue - State: closed - Opened by RdoubleA 11 months ago - 8 comments

#622 - Will there be a docker?

Issue - State: closed - Opened by Playerrrrr 11 months ago - 4 comments

#614 - fp32 Full Training seems to be taking a lot of memory

Issue - State: closed - Opened by kartikayk 11 months ago - 5 comments

#611 - [fix doc] AC is enabled by default

Pull Request - State: closed - Opened by skcoirz 11 months ago - 4 comments
Labels: CLA Signed

#609 - [RFC][fix test] missing .item() in frozen nf4 test

Pull Request - State: closed - Opened by skcoirz 11 months ago - 6 comments
Labels: CLA Signed

#597 - BF16 and MemoryEffient attention not working on AMD MI250

Issue - State: closed - Opened by chauhang 11 months ago - 5 comments

#584 - Verify that we are consistent with public/private imports

Issue - State: closed - Opened by RdoubleA 11 months ago
Labels: wontfix

#556 - Finetuning whisper

Issue - State: closed - Opened by AmgadHasan 11 months ago - 1 comment
Labels: enhancement

#481 - Attach unique string to produced checkpoints

Issue - State: closed - Opened by rohan-varma 12 months ago
Labels: wontfix

#479 - Configuring WandB logger broken / can't easily configure?

Issue - State: closed - Opened by rohan-varma 12 months ago - 2 comments

#454 - Separate LoRA recipe into single and multi GPU, LoRA finetune < 16GB GPU

Pull Request - State: closed - Opened by rohan-varma 12 months ago - 4 comments
Labels: CLA Signed

#443 - [RFC] Integration with DCP - Benchmark Results

Pull Request - State: closed - Opened by LucasLLC 12 months ago - 2 comments
Labels: CLA Signed

#389 - [RFC] Single Device Full Fine-tune for Llama7B in < 16GB

Pull Request - State: open - Opened by kartikayk about 1 year ago - 13 comments
Labels: CLA Signed

#324 - Remove `datasets` requirement and instead rely on download from `huggingface_hub`

Issue - State: closed - Opened by joecummings about 1 year ago - 7 comments

#301 - Move away from using `/tmp` directories

Issue - State: closed - Opened by RdoubleA about 1 year ago - 11 comments

#226 - Memory Profiling

Pull Request - State: closed - Opened by msaroufim about 1 year ago - 3 comments
Labels: CLA Signed
