Ecosyste.ms: Issues
An open API service that provides issue and pull request metadata for open source projects.
GitHub / pytorch/PiPPy issues and pull requests
#1146 - A problem when modify the GPT example to fit Llama2-7b-chat
Issue -
State: open - Opened by Noblezhong 25 days ago
#1145 - AssertionError when running example scripts for Llama
Issue -
State: open - Opened by Noblezhong about 1 month ago
- 1 comment
#1144 - [Question] Is the current implementation efficient?
Issue -
State: closed - Opened by jq-wei about 1 month ago
- 2 comments
#1143 - [BUG] num_stages incorrect and some assertions
Issue -
State: open - Opened by jq-wei about 1 month ago
- 1 comment
#1142 - How to train a model with pippy
Issue -
State: open - Opened by sunkun1997 2 months ago
- 2 comments
#1141 - fixed missing argument and refactoring
Pull Request -
State: open - Opened by Ankur-singh 3 months ago
- 2 comments
Labels: cla signed
#1139 - Update all hf examples to have dist.barrier
Pull Request -
State: closed - Opened by muellerzr 3 months ago
Labels: cla signed
#1138 - Code hangs permanently
Issue -
State: open - Opened by Narasimha1997 4 months ago
#1137 - Fix llama example split failed
Pull Request -
State: open - Opened by rednoah91 4 months ago
- 1 comment
Labels: cla signed
#1136 - Support for Autoregressive generation with LLMs
Issue -
State: open - Opened by apresunreve 4 months ago
#1135 - Meta init llama then pipeline then materialize
Pull Request -
State: open - Opened by kwen2501 4 months ago
- 1 comment
Labels: cla signed
#1134 - [Error] pipeline() got an unexpected keyword argument
Issue -
State: open - Opened by HieronZhang 4 months ago
- 1 comment
#1133 - [Bug?] Gradient Synchronization for DDP
Issue -
State: open - Opened by jianweif 4 months ago
- 3 comments
#1132 - [BUG] cannot capture your model as a full graph
Issue -
State: open - Opened by sunkun1997 4 months ago
- 6 comments
#1131 - ModuleNotFoundError: No module named 'torch.distributed.pipelining'.
Issue -
State: closed - Opened by sunkun1997 4 months ago
- 1 comment
#1130 - `pipeline` arguments are not matched
Issue -
State: open - Opened by rednoah91 5 months ago
- 8 comments
#1129 - Implemented flexible PP
Pull Request -
State: open - Opened by haocizhang 5 months ago
- 1 comment
Labels: cla signed
#1128 - Add migration notice
Pull Request -
State: closed - Opened by kwen2501 5 months ago
Labels: cla signed
#1127 - Migrate Llama example to use torch APIs
Pull Request -
State: closed - Opened by kwen2501 5 months ago
Labels: cla signed
#1126 - CPU offloading?
Issue -
State: open - Opened by Xynonners 6 months ago
- 2 comments
#1125 - Move auto split out of GPT2 example into a separate file
Pull Request -
State: closed - Opened by kwen2501 6 months ago
Labels: cla signed
#1124 - Migrate some of the HF examples to use 2.4 PP APIs
Pull Request -
State: closed - Opened by kwen2501 6 months ago
Labels: cla signed
#1123 - ImportError: cannot import name 'pipeline' from 'pippy'
Issue -
State: closed - Opened by bob020416 6 months ago
- 2 comments
#1122 - Can Pippy be combined with PEFT LoRA?
Issue -
State: open - Opened by Songjw133 6 months ago
- 1 comment
#1121 - Add nightly model tests against pytorch
Pull Request -
State: closed - Opened by kwen2501 6 months ago
- 1 comment
Labels: cla signed
#1120 - Add nightly model tests against pytorch
Pull Request -
State: closed - Opened by kwen2501 6 months ago
Labels: cla signed
#1119 - Adding 'labels' input to model with 'include_loss_args' fails hf examples
Issue -
State: open - Opened by alexlan137 6 months ago
#1118 - Inference freezes when running llama example with pp>2
Issue -
State: open - Opened by JamesLYan 6 months ago
- 3 comments
#1117 - [WIP] enable doraPP
Pull Request -
State: open - Opened by tianfengfrank 6 months ago
- 1 comment
Labels: cla signed
#1116 - Refactor HuggingFace examples to use torch.distributed.pipelining
Pull Request -
State: closed - Opened by kwen2501 6 months ago
Labels: cla signed
#1115 - examples/huggingface failed
Issue -
State: open - Opened by yaxan 7 months ago
- 8 comments
#1114 - add ddp test
Pull Request -
State: closed - Opened by H-Huang 7 months ago
Labels: cla signed
#1113 - Privatize step_microbatches
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1112 - Update test_cpu_init
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1111 - Add comments to _PipelineStage
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1110 - Make PipelineStage private
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1109 - refactor manual stage, include docs and example
Pull Request -
State: closed - Opened by H-Huang 7 months ago
Labels: cla signed
#1108 - PP Tracer doesn't work with fused_rmsnorm
Issue -
State: open - Opened by wconstab 7 months ago
- 2 comments
#1107 - Infinite recursion on torch.export for PP tracing
Issue -
State: open - Opened by wconstab 7 months ago
#1106 - FSDP+PP requires changing layer iteration code
Issue -
State: closed - Opened by wconstab 7 months ago
- 1 comment
#1105 - FSDP+PP bug where reshard_after_forward must be true
Issue -
State: open - Opened by wconstab 7 months ago
- 6 comments
#1104 - FSDP+PP tracer issue with cast-to-bf16
Issue -
State: open - Opened by wconstab 7 months ago
- 9 comments
#1103 - Torchtitan Pipeline Parallel Issue Tracker
Issue -
State: open - Opened by wconstab 7 months ago
#1102 - Fix auto-split
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1101 - Use non-strict mode by default
Pull Request -
State: open - Opened by kwen2501 7 months ago
- 1 comment
Labels: cla signed
#1100 - Add tests for input check
Pull Request -
State: closed - Opened by H-Huang 7 months ago
Labels: cla signed
#1099 - Make IR private
Pull Request -
State: closed - Opened by kwen2501 7 months ago
- 1 comment
Labels: cla signed
#1098 - Fix interleaved 1f1b race
Pull Request -
State: closed - Opened by H-Huang 7 months ago
Labels: cla signed
#1097 - Follow FSDP name change
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1096 - Make Reducer class private
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1095 - pipeline() API accepting split_spec
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1094 - Retrieving the Trained Model
Issue -
State: open - Opened by dheerj188 7 months ago
- 6 comments
#1093 - Make some modules private
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1092 - Actually add test for microbatch
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1091 - Clean up microbatch.py and add unit test
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1090 - Add unit test for stage_backward
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1089 - Add comments to backward.py
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1088 - Fix interleaved 1f1b race
Pull Request -
State: closed - Opened by H-Huang 7 months ago
- 1 comment
Labels: cla signed
#1087 - Exception when splitting model with "--autosplit"
Issue -
State: closed - Opened by spupyrev 7 months ago
#1086 - Setting default logging level to WARNING
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1085 - Add comments in debug.py
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1084 - Clean up IR.py
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1083 - Clean up utils.py
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1082 - Tidy up __init__
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1081 - fix interleaved 1f1b edge case
Pull Request -
State: closed - Opened by H-Huang 7 months ago
Labels: cla signed
#1080 - A graph-based pipeline splitting
Pull Request -
State: closed - Opened by spupyrev 7 months ago
- 6 comments
Labels: cla signed
#1079 - re-enable interleaved 1f1b test
Pull Request -
State: open - Opened by H-Huang 7 months ago
Labels: cla signed
#1078 - Test interleaved schedules
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1077 - update loss utilities to take stage
Pull Request -
State: closed - Opened by H-Huang 7 months ago
Labels: cla signed
#1076 - Remove stale visualization files
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1075 - Remove stale visualization files
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1074 - Use relative imports
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1073 - Move LoadModule and SaveModule
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1072 - Use relative import to simplify migration
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1071 - Adjust logging level
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1070 - Zero grad of input buffers
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1069 - Add support for FSDP + interleaved 1f1b
Pull Request -
State: closed - Opened by wconstab 7 months ago
Labels: cla signed
#1068 - Add support for FSDP + looped bfs
Pull Request -
State: closed - Opened by wconstab 7 months ago
Labels: cla signed
#1067 - Add support for 1F1B + FSDP
Pull Request -
State: closed - Opened by wconstab 7 months ago
Labels: cla signed
#1066 - Add gradient equivalence test
Pull Request -
State: closed - Opened by kwen2501 7 months ago
Labels: cla signed
#1065 - WIP fixing the test
Pull Request -
State: closed - Opened by wconstab 7 months ago
- 1 comment
Labels: cla signed
#1064 - GPipe Schedule hangs when run with 1 microbatch
Issue -
State: open - Opened by wconstab 7 months ago
#1063 - Interleaved 1f1b supports loss
Pull Request -
State: closed - Opened by H-Huang 7 months ago
Labels: cla signed
#1062 - [Test] Create a model registry for testing
Issue -
State: closed - Opened by kwen2501 7 months ago
#1061 - Use relative import to simplify migration
Issue -
State: closed - Opened by kwen2501 7 months ago
#1060 - Refactor loss logic in schedules
Pull Request -
State: closed - Opened by H-Huang 7 months ago
Labels: cla signed
#1059 - add loss to looped_bfs
Pull Request -
State: closed - Opened by H-Huang 7 months ago
Labels: cla signed
#1058 - Pipeline stages share fwd/bwd send/recv ops
Pull Request -
State: closed - Opened by H-Huang 7 months ago
Labels: cla signed
#1057 - Move GPipe and 1F1B to PipelineScheduleSingle (#1046)
Pull Request -
State: closed - Opened by kwen2501 8 months ago
Labels: cla signed
#1056 - Unexpected Memory Usage and Latency with PP
Issue -
State: open - Opened by Lucius-THU 8 months ago
- 4 comments
#1055 - Fix #1053: wait for comm ops in Interleaved 1F1B
Pull Request -
State: closed - Opened by kwen2501 8 months ago
Labels: cla signed
#1054 - Manual PP stage gives inconsistent output shape for first stage
Issue -
State: closed - Opened by wconstab 8 months ago
- 2 comments
#1053 - Interleave 1F1B does not wait comm ops
Issue -
State: closed - Opened by kwen2501 8 months ago
#1052 - Fix backward microbatch index in 1F1B
Pull Request -
State: closed - Opened by kwen2501 8 months ago
Labels: cla signed
#1051 - Support loss computation in Interleaved 1F1B
Pull Request -
State: closed - Opened by kwen2501 8 months ago
Labels: cla signed
#1050 - mb_index undefined in Interleaved 1F1B
Issue -
State: closed - Opened by kwen2501 8 months ago
- 3 comments
#1049 - Use stage index in PipelineStage's log prefix
Pull Request -
State: closed - Opened by kwen2501 8 months ago
Labels: cla signed
#1048 - Add base class for multi-stage schedules
Pull Request -
State: closed - Opened by kwen2501 8 months ago
Labels: cla signed
#1047 - Consolidate step() function
Pull Request -
State: closed - Opened by kwen2501 8 months ago
Labels: cla signed
#1046 - Move GPipe and 1F1B to PipelineScheduleSingle
Pull Request -
State: closed - Opened by kwen2501 8 months ago
Labels: cla signed