Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / facebookresearch/fairscale issues and pull requests
#1190 - support for grad acc
Pull Request -
State: open - Opened by ngoyal2707 30 days ago
Labels: CLA Signed
#1189 - Hi, Groups division may be incorrect in initialize() in fairscale/nn/model_parallel/initialize.py
Issue -
State: open - Opened by Youngluc about 1 month ago
#1188 - Raising `assert param.grad is not None` when finetuning LoRA.
Issue -
State: open - Opened by HashimotoPatrickMu 3 months ago
- 1 comment
#1187 - Bump scikit-learn from 1.1.3 to 1.5.0
Pull Request -
State: open - Opened by dependabot[bot] 3 months ago
Labels: CLA Signed, dependencies
#1186 - [FSDPv1] Optimize memory usage for optimize_backward_concat=True
Pull Request -
State: closed - Opened by chrisxcai 4 months ago
Labels: CLA Signed
#1185 - FP8 AllGather Support in Fairscale
Pull Request -
State: open - Opened by levendlee 4 months ago
Labels: CLA Signed
#1184 - [FSDPv1] Only perform cat() during last microbatch backward() within FlattenParamsWrapper
Pull Request -
State: closed - Opened by chrisxcai 5 months ago
Labels: CLA Signed
#1183 - Llama4 FP8 Training Debug - fairscale
Pull Request -
State: open - Opened by jiecaoyu 5 months ago
Labels: CLA Signed
#1182 - Add timeout in initialize_model_parallel
Pull Request -
State: closed - Opened by vladmihailescu 5 months ago
Labels: CLA Signed
#1181 - Fix minor grammatical corrections in docs
Pull Request -
State: open - Opened by aakashapoorv 5 months ago
Labels: CLA Signed
#1180 - [FSDPv1] Only perform cat() during last microbatch backward() within FlattenParamsWrapper
Pull Request -
State: open - Opened by chrisxcai 5 months ago
Labels: CLA Signed
#1179 - Updated the README file
Pull Request -
State: closed - Opened by KPCOFGS 5 months ago
Labels: CLA Signed
#1178 - [WIP] Make FSDPv1 only perform cat() during last microbatch backward() within FlattenParamsWrapper
Pull Request -
State: open - Opened by chrisxcai 5 months ago
Labels: CLA Signed
#1177 - sync fbcode cp pg initialize
Pull Request -
State: closed - Opened by amylittleyang 6 months ago
Labels: CLA Signed
#1176 - add get_cp_ranks to model_parallel initialize
Pull Request -
State: closed - Opened by amylittleyang 6 months ago
Labels: CLA Signed
#1175 - Add cast input argument
Pull Request -
State: closed - Opened by whbldhwj 6 months ago
Labels: CLA Signed
#1174 - add context parallel group init to mp init
Pull Request -
State: closed - Opened by amylittleyang 6 months ago
Labels: CLA Signed
#1173 - Make sure that tensor is contiguous before gathering across processes
Pull Request -
State: open - Opened by patrickvonplaten 6 months ago
Labels: CLA Signed
#1172 - [question] Different training between DDP & Sharded DDP
Issue -
State: open - Opened by kwohlfahrt 6 months ago
#1171 - Added requires_grad check for params_with_grad method
Pull Request -
State: closed - Opened by whbldhwj 6 months ago
Labels: CLA Signed
#1170 - what are pointwise Optimizers and non-pointwise Optimizers?
Issue -
State: closed - Opened by bugm 6 months ago
- 4 comments
#1169 - Bump black from 22.3.0 to 24.3.0
Pull Request -
State: open - Opened by dependabot[bot] 6 months ago
Labels: CLA Signed, dependencies
#1168 - Fairscale support for only performing allreduce in last microbatch
Pull Request -
State: open - Opened by jiecaoyu 7 months ago
Labels: CLA Signed
#1167 - Fix params_with_grad in FSDP when the model has frozen parameters
Pull Request -
State: open - Opened by whbldhwj 7 months ago
Labels: CLA Signed
#1166 - Changed to only run reshard hook if all gradients computed
Pull Request -
State: closed - Opened by awgu 7 months ago
Labels: CLA Signed
#1165 - Example of MOE
Issue -
State: open - Opened by Juanhui28 7 months ago
- 1 comment
#1164 - Avoid calling _free_fp16_param_shard() too early
Pull Request -
State: open - Opened by jiecaoyu 7 months ago
- 2 comments
Labels: CLA Signed
#1163 - FSDP on the same CNN model requires more memory than DataParallel
Issue -
State: closed - Opened by s-reaungamornrat 7 months ago
#1162 - Should assign norm_type instead of scale_grad_by_freq
Pull Request -
State: closed - Opened by brad-mengchi 8 months ago
- 1 comment
Labels: CLA Signed
#1161 - added option for no PG validation for faster init
Pull Request -
State: closed - Opened by ngoyal2707 8 months ago
Labels: CLA Signed
#1160 - ci: Use GITHUB_OUTPUT envvar instead of set-output command
Pull Request -
State: open - Opened by arunsathiya 8 months ago
- 1 comment
Labels: CLA Signed
#1159 - Added reshard hook for frozen params in backward
Pull Request -
State: open - Opened by awgu 9 months ago
- 5 comments
Labels: CLA Signed
#1158 - Add support for `torch.set_default_device` when initializing model parameters
Pull Request -
State: open - Opened by fshp971 9 months ago
Labels: CLA Signed
#1157 - Assign self.norm_type to input norm_type
Pull Request -
State: closed - Opened by gtamer2 10 months ago
- 1 comment
Labels: CLA Signed
#1156 - Issue in `ParallelEmbedding` constructor - scale_grad_by_freq being assigned to norm_type
Issue -
State: closed - Opened by gtamer2 10 months ago
- 2 comments
#1155 - How can I use torchrun + model parallelism + FSDP
Issue -
State: open - Opened by HackGiter 10 months ago
- 1 comment
#1154 - fixed broken clipping
Pull Request -
State: closed - Opened by ngoyal2707 10 months ago
Labels: CLA Signed
#1153 - fix .grad=None issue when param is not sharded
Pull Request -
State: closed - Opened by jiecaoyu 10 months ago
Labels: CLA Signed
#1152 - changes to keep reduced grad in fp32
Pull Request -
State: closed - Opened by vedanuj 10 months ago
Labels: CLA Signed
#1151 - [not to be merged yet] added temp changes for fp32 main grad, might not work for TE
Pull Request -
State: closed - Opened by ngoyal2707 10 months ago
Labels: CLA Signed
#1150 - fix no shard case
Pull Request -
State: closed - Opened by artkorenev 10 months ago
Labels: CLA Signed
#1149 - Fix _free_full_params()
Pull Request -
State: open - Opened by hadasah 10 months ago
Labels: CLA Signed
#1148 - Extend CheckpointFunction to track all tensor input/output
Pull Request -
State: open - Opened by 000Justin000 11 months ago
Labels: CLA Signed
#1147 - [Not for merge] fp8allgather debug
Pull Request -
State: open - Opened by jiecaoyu 11 months ago
Labels: CLA Signed
#1146 - It is dangerous to using default non_block=True.
Issue -
State: open - Opened by heshenghuan 11 months ago
#1145 - torch.compile with FSDP
Issue -
State: closed - Opened by santha96 12 months ago
- 2 comments
#1144 - Added fns for manual free, reduce-scatter; removed stream sync if event sync
Pull Request -
State: closed - Opened by awgu 12 months ago
- 1 comment
Labels: CLA Signed
#1143 - Cleared backward hooks to avoid accumulating over iterations
Pull Request -
State: closed - Opened by awgu 12 months ago
Labels: CLA Signed
#1142 - Add main grad before fwd pass
Pull Request -
State: open - Opened by vedanuj 12 months ago
- 2 comments
Labels: CLA Signed
#1141 - Removed extra `cat` before reduce-scatter
Pull Request -
State: closed - Opened by awgu 12 months ago
- 1 comment
Labels: CLA Signed
#1140 - Add main_grad
Pull Request -
State: open - Opened by jianyuh 12 months ago
Labels: CLA Signed
#1139 - Fix fsdp+pp+te WPS decreasing issue
Pull Request -
State: closed - Opened by jianyuh 12 months ago
Labels: CLA Signed
#1138 - Fix the parameter in ParallelEmbedding
Pull Request -
State: closed - Opened by taowangcheng about 1 year ago
- 2 comments
Labels: CLA Signed
#1137 - Fix missing params in unconsolidated models
Pull Request -
State: closed - Opened by imjeremyhi about 1 year ago
- 1 comment
Labels: CLA Signed
#1136 - Fp8 all gather hack
Pull Request -
State: open - Opened by jspark1105 about 1 year ago
- 1 comment
Labels: CLA Signed
#1135 - Fix a `ParallelEmbedding` bug
Pull Request -
State: closed - Opened by chhwang about 1 year ago
- 1 comment
Labels: CLA Signed
#1134 - assert self.has_full_params
Issue -
State: open - Opened by pokameng about 1 year ago
- 4 comments
#1133 - Hybrid Sharding in Fairscale's FSDP Implementation
Issue -
State: closed - Opened by stephanpeitz about 1 year ago
- 2 comments
#1132 - Fix typo in ParallelEmbedding argument assignment
Pull Request -
State: open - Opened by hessamb about 1 year ago
- 2 comments
Labels: CLA Signed
#1131 - Why ShardedDDP and OSS are slower than Vanilla DDP
Issue -
State: open - Opened by powermano about 1 year ago
#1130 - pip install failed
Issue -
State: open - Opened by dogxxxxx about 1 year ago
#1129 - Error with nested models "Caffe2 uses a lazy allocation..."
Issue -
State: open - Opened by Emanuele97x about 1 year ago
#1128 - [bug] pip package 0.4.13 fails to build wheel
Issue -
State: open - Opened by project-tuva about 1 year ago
#1127 - Add a context manager for activation sharding.
Pull Request -
State: open - Opened by luyug over 1 year ago
- 1 comment
Labels: CLA Signed
#1126 - Error Freezing Weights
Issue -
State: open - Opened by mostafaelhoushi over 1 year ago
#1125 - added option to do backward AG over smaller set of gpus instead of full DDP world
Pull Request -
State: open - Opened by ngoyal2707 over 1 year ago
- 1 comment
Labels: CLA Signed
#1124 - Compatibility with Pytorch 2.0; failing test `test_gradient_value`
Issue -
State: open - Opened by h-vetinari over 1 year ago
- 4 comments
#1123 - Can exclude some layer parameter not to shard?
Issue -
State: open - Opened by robotcator over 1 year ago
- 5 comments
#1122 - Update oss_sdp_fsdp.rst
Pull Request -
State: open - Opened by wenjun93 over 1 year ago
- 1 comment
Labels: CLA Signed
#1121 - Update integrations.rst
Pull Request -
State: closed - Opened by fc-synth over 1 year ago
- 1 comment
#1120 - Update cross_entropy.py with no_grad
Pull Request -
State: closed - Opened by Geeks-Sid over 1 year ago
- 1 comment
Labels: CLA Signed
#1119 - FSDP on model that has requires_grad = false
Issue -
State: closed - Opened by andrasiani over 1 year ago
- 1 comment
#1118 - Fix docstring typo
Pull Request -
State: closed - Opened by gregor-soniox over 1 year ago
- 2 comments
Labels: CLA Signed
#1117 - All parameters cannot be shared amongst 2 different FSDP modules
Issue -
State: closed - Opened by sarthakgarg over 1 year ago
- 1 comment
#1116 - Update documentation to remove obsolete references
Pull Request -
State: closed - Opened by daleevans over 1 year ago
- 1 comment
Labels: CLA Signed
#1115 - Whether modifying the source code (fully_sharded_data_parallel.py) will bring safety hazard?
Issue -
State: closed - Opened by dropreg over 1 year ago
- 2 comments
#1114 - [AdaScale] self._hook() failure in __init__() of AdaScale() class
Issue -
State: closed - Opened by connieKing511 over 1 year ago
- 1 comment
#1113 - Combine powersgd with fairscale
Issue -
State: closed - Opened by amsword over 1 year ago
- 1 comment
#1112 - memory explodes after self._rebuild_full_params() function
Issue -
State: closed - Opened by haorannlp over 1 year ago
#1111 - Unexpected Large Memory Consumption during Tensor Parallelism Training with OPT-1.3B
Issue -
State: closed - Opened by dangxingyu over 1 year ago
- 5 comments
#1110 - Fix bibtex entry
Pull Request -
State: closed - Opened by mrbaozi over 1 year ago
- 3 comments
Labels: CLA Signed
#1109 - Memory usage different from deepspeed
Issue -
State: closed - Opened by x54-729 over 1 year ago
- 8 comments
#1108 - make a logging warning once
Pull Request -
State: closed - Opened by min-xu-ai over 1 year ago
Labels: CLA Signed
#1107 - Lots of Commandline Output from this line.
Issue -
State: closed - Opened by jstraub over 1 year ago
- 1 comment
#1106 - Remove `torch._six` from `__init__.py`
Pull Request -
State: closed - Opened by malfet over 1 year ago
- 1 comment
Labels: CLA Signed
#1105 - 8 bit all_gather
Pull Request -
State: open - Opened by ngoyal2707 over 1 year ago
- 4 comments
Labels: CLA Signed
#1104 - [fix] typo in wikitext2_data.py
Pull Request -
State: closed - Opened by gajagajago over 1 year ago
- 2 comments
Labels: CLA Signed
#1103 - [fix] typo in flatten_params_wrapper.py
Pull Request -
State: closed - Opened by eltociear over 1 year ago
- 1 comment
Labels: CLA Signed
#1102 - [FSDP] Training gets slower as iterations increase when flatten_parameters=False?
Issue -
State: closed - Opened by woodyx218 over 1 year ago
- 10 comments
#1101 - [FSDP] How to use customized backward hooks?
Issue -
State: closed - Opened by woodyx218 over 1 year ago
- 25 comments
#1100 - FSDP cannot consolidate optimizer state dict with flatten params is False
Issue -
State: open - Opened by ShenglongZ almost 2 years ago
- 3 comments
#1099 - [test] ci py 3.11 tests
Pull Request -
State: closed - Opened by min-xu-ai almost 2 years ago
Labels: CLA Signed
#1098 - [chore] Ci fix
Pull Request -
State: closed - Opened by min-xu-ai almost 2 years ago
Labels: CLA Signed
#1097 - [chore] add fair_dev packages
Pull Request -
State: closed - Opened by min-xu-ai almost 2 years ago
Labels: CLA Signed
#1096 - Skip rather than fail tests in absence of `fair_dev`
Issue -
State: closed - Opened by h-vetinari almost 2 years ago
- 3 comments
#1095 - Implement _compute_intra_grad_corr_mean for gradient computation
Pull Request -
State: closed - Opened by cyugao almost 2 years ago
Labels: CLA Signed
#1094 - Any examples using AdaScale with fairseq?
Issue -
State: closed - Opened by kedarkolluri almost 2 years ago
- 1 comment
#1093 - FSDP - Extra GPU memory consumption when maintaining a EMA weights
Issue -
State: closed - Opened by syorami almost 2 years ago
- 5 comments
#1092 - clip_grad_norm_ from fairscale downcasts to bf16 before all reduce
Issue -
State: open - Opened by glample almost 2 years ago
- 3 comments
#1091 - minor cleanup
Pull Request -
State: closed - Opened by min-xu-ai almost 2 years ago
Labels: CLA Signed