Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / pytorch/PiPPy issues and pull requests

#1145 - AssertionError when running example scripts for Llama

Issue - State: open - Opened by Noblezhong about 1 month ago - 1 comment

#1144 - [Question] Is the current implementation efficient?

Issue - State: closed - Opened by jq-wei about 1 month ago - 2 comments

#1143 - [BUG] num_stages incorrect and some assertions

Issue - State: open - Opened by jq-wei about 1 month ago - 1 comment

#1142 - How to train a model with pippy

Issue - State: open - Opened by sunkun1997 2 months ago - 2 comments

#1141 - fixed missing argument and refactoring

Pull Request - State: open - Opened by Ankur-singh 3 months ago - 2 comments
Labels: cla signed

#1139 - Update all hf examples to have dist.barrier

Pull Request - State: closed - Opened by muellerzr 3 months ago
Labels: cla signed

#1138 - Code hangs permanently

Issue - State: open - Opened by Narasimha1997 4 months ago

#1137 - Fix llama example split failed

Pull Request - State: open - Opened by rednoah91 4 months ago - 1 comment
Labels: cla signed

#1136 - Support for Autoregressive generation with LLMs

Issue - State: open - Opened by apresunreve 4 months ago

#1135 - Meta init llama then pipeline then materialize

Pull Request - State: open - Opened by kwen2501 4 months ago - 1 comment
Labels: cla signed

#1134 - [Error] pipeline() got an unexpected keyword argument

Issue - State: open - Opened by HieronZhang 4 months ago - 1 comment

#1133 - [Bug?] Gradient Synchronization for DDP

Issue - State: open - Opened by jianweif 4 months ago - 3 comments

#1132 - [BUG] cannot capture your model as a full graph

Issue - State: open - Opened by sunkun1997 4 months ago - 6 comments

#1131 - ModuleNotFoundError: No module named 'torch.distributed.pipelining'.

Issue - State: closed - Opened by sunkun1997 4 months ago - 1 comment

#1130 - `pipeline` arguments are not matched

Issue - State: open - Opened by rednoah91 5 months ago - 8 comments

#1129 - Implemented flexible PP

Pull Request - State: open - Opened by haocizhang 5 months ago - 1 comment
Labels: cla signed

#1128 - Add migration notice

Pull Request - State: closed - Opened by kwen2501 5 months ago
Labels: cla signed

#1127 - Migrate Llama example to use torch APIs

Pull Request - State: closed - Opened by kwen2501 5 months ago
Labels: cla signed

#1126 - CPU offloading?

Issue - State: open - Opened by Xynonners 6 months ago - 2 comments

#1125 - Move auto split out of GPT2 example into a separate file

Pull Request - State: closed - Opened by kwen2501 6 months ago
Labels: cla signed

#1124 - Migrate some of the HF examples to use 2.4 PP APIs

Pull Request - State: closed - Opened by kwen2501 6 months ago
Labels: cla signed

#1123 - ImportError: cannot import name 'pipeline' from 'pippy'

Issue - State: closed - Opened by bob020416 6 months ago - 2 comments

#1122 - Can Pippy be combined with PEFT LoRA?

Issue - State: open - Opened by Songjw133 6 months ago - 1 comment

#1121 - Add nightly model tests against pytorch

Pull Request - State: closed - Opened by kwen2501 6 months ago - 1 comment
Labels: cla signed

#1120 - Add nightly model tests against pytorch

Pull Request - State: closed - Opened by kwen2501 6 months ago
Labels: cla signed

#1118 - Inference freezes when running llama example with pp>2

Issue - State: open - Opened by JamesLYan 6 months ago - 3 comments

#1117 - [WIP] enable doraPP

Pull Request - State: open - Opened by tianfengfrank 6 months ago - 1 comment
Labels: cla signed

#1116 - Refactor HuggingFace examples to use torch.distributed.pipelining

Pull Request - State: closed - Opened by kwen2501 6 months ago
Labels: cla signed

#1115 - examples/huggingface failed

Issue - State: open - Opened by yaxan 7 months ago - 8 comments

#1114 - add ddp test

Pull Request - State: closed - Opened by H-Huang 7 months ago
Labels: cla signed

#1113 - Privatize step_microbatches

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1112 - Update test_cpu_init

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1111 - Add comments to _PipelineStage

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1110 - Make PipelineStage private

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1109 - refactor manual stage, include docs and example

Pull Request - State: closed - Opened by H-Huang 7 months ago
Labels: cla signed

#1108 - PP Tracer doesn't work with fused_rmsnorm

Issue - State: open - Opened by wconstab 7 months ago - 2 comments

#1107 - Infinite recursion on torch.export for PP tracing

Issue - State: open - Opened by wconstab 7 months ago

#1106 - FSDP+PP requires changing layer iteration code

Issue - State: closed - Opened by wconstab 7 months ago - 1 comment

#1105 - FSDP+PP bug where reshard_after_forward must be true

Issue - State: open - Opened by wconstab 7 months ago - 6 comments

#1104 - FSDP+PP tracer issue with cast-to-bf16

Issue - State: open - Opened by wconstab 7 months ago - 9 comments

#1103 - Torchtitan Pipeline Parallel Issue Tracker

Issue - State: open - Opened by wconstab 7 months ago

#1102 - Fix auto-split

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1101 - Use non-strict mode by default

Pull Request - State: open - Opened by kwen2501 7 months ago - 1 comment
Labels: cla signed

#1100 - Add tests for input check

Pull Request - State: closed - Opened by H-Huang 7 months ago
Labels: cla signed

#1099 - Make IR private

Pull Request - State: closed - Opened by kwen2501 7 months ago - 1 comment
Labels: cla signed

#1098 - Fix interleaved 1f1b race

Pull Request - State: closed - Opened by H-Huang 7 months ago
Labels: cla signed

#1097 - Follow FSDP name change

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1096 - Make Reducer class private

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1095 - pipeline() API accepting split_spec

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1094 - Retrieving the Trained Model

Issue - State: open - Opened by dheerj188 7 months ago - 6 comments

#1093 - Make some modules private

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1092 - Actually add test for microbatch

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1091 - Clean up microbatch.py and add unit test

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1090 - Add unit test for stage_backward

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1089 - Add comments to backward.py

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1088 - Fix interleaved 1f1b race

Pull Request - State: closed - Opened by H-Huang 7 months ago - 1 comment
Labels: cla signed

#1087 - Exception when splitting model with "--autosplit"

Issue - State: closed - Opened by spupyrev 7 months ago

#1086 - Setting default logging level to WARNING

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1085 - Add comments in debug.py

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1084 - Clean up IR.py

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1083 - Clean up utils.py

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1082 - Tidy up __init__

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1081 - fix interleaved 1f1b edge case

Pull Request - State: closed - Opened by H-Huang 7 months ago
Labels: cla signed

#1080 - A graph-based pipeline splitting

Pull Request - State: closed - Opened by spupyrev 7 months ago - 6 comments
Labels: cla signed

#1079 - re-enable interleaved 1f1b test

Pull Request - State: open - Opened by H-Huang 7 months ago
Labels: cla signed

#1078 - Test interleaved schedules

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1077 - update loss utilities to take stage

Pull Request - State: closed - Opened by H-Huang 7 months ago
Labels: cla signed

#1076 - Remove stale visualization files

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1075 - Remove stale visualization files

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1074 - Use relative imports

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1073 - Move LoadModule and SaveModule

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1072 - Use relative import to simplify migration

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1071 - Adjust logging level

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1070 - Zero grad of input buffers

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1069 - Add support for FSDP + interleaved 1f1b

Pull Request - State: closed - Opened by wconstab 7 months ago
Labels: cla signed

#1068 - Add support for FSDP + looped bfs

Pull Request - State: closed - Opened by wconstab 7 months ago
Labels: cla signed

#1067 - Add support for 1F1B + FSDP

Pull Request - State: closed - Opened by wconstab 7 months ago
Labels: cla signed

#1066 - Add gradient equivalence test

Pull Request - State: closed - Opened by kwen2501 7 months ago
Labels: cla signed

#1065 - WIP fixing the test

Pull Request - State: closed - Opened by wconstab 7 months ago - 1 comment
Labels: cla signed

#1064 - GPipe Schedule hangs when run with 1 microbatch

Issue - State: open - Opened by wconstab 7 months ago

#1063 - Interleaved 1f1b supports loss

Pull Request - State: closed - Opened by H-Huang 7 months ago
Labels: cla signed

#1062 - [Test] Create a model registry for testing

Issue - State: closed - Opened by kwen2501 7 months ago

#1061 - Use relative import to simplify migration

Issue - State: closed - Opened by kwen2501 7 months ago

#1060 - Refactor loss logic in schedules

Pull Request - State: closed - Opened by H-Huang 7 months ago
Labels: cla signed

#1059 - add loss to looped_bfs

Pull Request - State: closed - Opened by H-Huang 7 months ago
Labels: cla signed

#1058 - Pipeline stages share fwd/bwd send/recv ops

Pull Request - State: closed - Opened by H-Huang 7 months ago
Labels: cla signed

#1057 - Move GPipe and 1F1B to PipelineScheduleSingle (#1046)

Pull Request - State: closed - Opened by kwen2501 8 months ago
Labels: cla signed

#1056 - Unexpected Memory Usage and Latency with PP

Issue - State: open - Opened by Lucius-THU 8 months ago - 4 comments

#1055 - Fix #1053: wait for comm ops in Interleaved 1F1B

Pull Request - State: closed - Opened by kwen2501 8 months ago
Labels: cla signed

#1054 - Manual PP stage gives inconsistent output shape for first stage

Issue - State: closed - Opened by wconstab 8 months ago - 2 comments

#1053 - Interleave 1F1B does not wait comm ops

Issue - State: closed - Opened by kwen2501 8 months ago

#1052 - Fix backward microbatch index in 1F1B

Pull Request - State: closed - Opened by kwen2501 8 months ago
Labels: cla signed

#1051 - Support loss computation in Interleaved 1F1B

Pull Request - State: closed - Opened by kwen2501 8 months ago
Labels: cla signed

#1050 - mb_index undefined in Interleaved 1F1B

Issue - State: closed - Opened by kwen2501 8 months ago - 3 comments

#1049 - Use stage index in PipelineStage's log prefix

Pull Request - State: closed - Opened by kwen2501 8 months ago
Labels: cla signed

#1048 - Add base class for multi-stage schedules

Pull Request - State: closed - Opened by kwen2501 8 months ago
Labels: cla signed

#1047 - Consolidate step() function

Pull Request - State: closed - Opened by kwen2501 8 months ago
Labels: cla signed

#1046 - Move GPipe and 1F1B to PipelineScheduleSingle

Pull Request - State: closed - Opened by kwen2501 8 months ago
Labels: cla signed