pytorch/torchtune issues and pull requests

#2306 - Support for Janus-Pro series of model

Issue - State: closed - Opened by Ankur-singh 23 days ago - 2 comments

#2305 - Update LoRA DPO distributed recipe

Issue - State: closed - Opened by SalmanMohammadi 23 days ago

#2304 - Fix stop tokens in PPO

Pull Request - State: closed - Opened by RedTachyon 23 days ago - 8 comments
Labels: CLA Signed

#2303 - Move from PIL to torchvision.io.decode_image

Issue - State: open - Opened by ebsmothers 24 days ago - 12 comments
Labels: best practice, community help wanted

#2302 - Flux Model

Pull Request - State: closed - Opened by calvinpelletier 24 days ago - 1 comment
Labels: CLA Signed

#2301 - Multinode support in torchtune

Pull Request - State: closed - Opened by joecummings 24 days ago - 5 comments
Labels: CLA Signed

#2300 - Missing `<|begin_of_text|>` Token in `Llama3Tokenizer`

Issue - State: open - Opened by seungjun-green 24 days ago - 3 comments

#2299 - Step based checkpointing

Issue - State: closed - Opened by xTRam1 26 days ago - 1 comment
Labels: triage review

#2298 - [WIP] 'tune cat' command for pretty printing configuration files

Pull Request - State: closed - Opened by Ankur-singh 27 days ago - 7 comments
Labels: CLA Signed

#2297 - Training never starts - stuck after Loss is initialized

Issue - State: closed - Opened by datamancerai 27 days ago - 12 comments
Labels: discussion, triaged

#2296 - Tokens per second calculation

Issue - State: open - Opened by EugenHotaj 28 days ago - 8 comments
Labels: best practice, triage review

#2295 - Tune download command not found

Issue - State: closed - Opened by shaunakjoshi12 28 days ago - 3 comments

#2294 - How to checkpoint every N steps?

Issue - State: closed - Opened by tginart 28 days ago - 1 comment

#2293 - Remove deprecated components for 0.6.0

Pull Request - State: closed - Opened by RdoubleA 29 days ago - 1 comment
Labels: CLA Signed

#2292 - Custom DPO losses support

Pull Request - State: open - Opened by krammnic 29 days ago - 8 comments
Labels: CLA Signed

#2291 - Proper prefix handling in EarlyFusion sd hooks

Pull Request - State: closed - Opened by ebsmothers 29 days ago - 3 comments
Labels: CLA Signed

#2290 - Removing `SimPOLoss`

Pull Request - State: closed - Opened by SalmanMohammadi 29 days ago - 1 comment
Labels: CLA Signed

#2288 - Roadmap for distributed recipes using NPU as a backend

Issue - State: open - Opened by Nicorgi 29 days ago

#2287 - deepseek r1 support?

Issue - State: open - Opened by johnnynunez 30 days ago - 14 comments
Labels: enhancement, triage review

#2286 - Documentation for evaluation on a custom dataset for a custom task

Issue - State: open - Opened by karrtikiyer about 1 month ago - 16 comments
Labels: bug, documentation, discussion, triage review

#2285 - Saving multiple checkpoints per epoch

Issue - State: open - Opened by EugenHotaj about 1 month ago - 2 comments
Labels: enhancement, triaged

#2284 - Add masking strategies to message transforms

Pull Request - State: open - Opened by supreethmanyam about 1 month ago - 3 comments
Labels: CLA Signed

#2283 - Inconsistent initialization of RoPE embedding across component builders

Issue - State: open - Opened by Ankur-singh about 1 month ago
Labels: best practice, better engineering

#2282 - Update model builders

Pull Request - State: closed - Opened by Ankur-singh about 1 month ago - 11 comments
Labels: CLA Signed

#2281 - [RFC] Proposal for `tune cat` Command

Issue - State: closed - Opened by Ankur-singh about 1 month ago - 2 comments
Labels: rfc, discussion

#2280 - Roadmap for other parallelisms

Issue - State: open - Opened by rahul-sarvam about 1 month ago - 6 comments
Labels: discussion, triaged

#2279 - _checkpoint_client not installing

Issue - State: closed - Opened by maxwellreynolds about 1 month ago - 4 comments

#2278 - Sample packing for ConcatDataset

Pull Request - State: closed - Opened by ebsmothers about 1 month ago - 2 comments
Labels: CLA Signed

#2277 - Llama3.2 vision does not run with distributed state dict

Issue - State: open - Opened by acisseJZhong about 1 month ago - 1 comment
Labels: bug, triaged

#2276 - Construct EarlyFusion's encoder_token_ids on correct device

Pull Request - State: closed - Opened by ebsmothers about 1 month ago - 1 comment
Labels: CLA Signed

#2275 - Full DPO Distributed

Pull Request - State: closed - Opened by sam-pi about 1 month ago - 20 comments
Labels: CLA Signed

#2274 - Logging resolved config

Pull Request - State: closed - Opened by Ankur-singh about 1 month ago - 6 comments
Labels: CLA Signed

#2273 - The current instantiation does not trigger the initialization of submodules

Issue - State: open - Opened by dz1iang about 1 month ago - 4 comments
Labels: discussion, triaged

#2272 - DPO after / on top of LoRA tuning

Issue - State: open - Opened by albertbn about 1 month ago - 2 comments
Labels: discussion, triaged

#2271 - Fix a bug in set float32 precision

Pull Request - State: closed - Opened by Nicorgi about 1 month ago - 3 comments
Labels: CLA Signed

#2270 - Don't use ``_get_clones``

Issue - State: open - Opened by ebsmothers about 1 month ago - 8 comments
Labels: best practice, community help wanted

#2269 - Fix a bug in set float32 precision

Pull Request - State: closed - Opened by Nicorgi about 1 month ago - 1 comment
Labels: CLA Signed

#2268 - About the CLS token for the llama3_2_vision_encoder

Issue - State: open - Opened by dfloreaa about 1 month ago - 4 comments
Labels: discussion, triaged

#2267 - Expose FSDP2 MixedPrecisionPolicy params

Issue - State: open - Opened by EugenHotaj about 1 month ago - 1 comment
Labels: enhancement, triaged

#2266 - [EZ] Pass seed to data sampler.

Pull Request - State: open - Opened by EugenHotaj about 1 month ago - 13 comments
Labels: CLA Signed

#2265 - Add AlpacaToMessages to message transforms doc page

Pull Request - State: closed - Opened by AndrewMead10 about 1 month ago - 1 comment
Labels: CLA Signed

#2264 - Training with lora_finetune_distributed is slower than single_device, profile shows that nccl is causing this problem

Issue - State: closed - Opened by seekerzz about 1 month ago - 8 comments
Labels: distributed, triaged
