Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / Azure/MS-AMP issues and pull requests
#197 - when LinearReplacer is used?
Issue -
State: closed - Opened by 191220042 3 days ago
- 4 comments
#196 - Installation might be incomplete
Issue -
State: open - Opened by leedrake5 about 1 month ago
- 4 comments
#195 - [Security Fix] Avoid running workflow on self-hosted node.
Pull Request -
State: open - Opened by guoshzhao about 2 months ago
#194 - Stuck at Compilation of msccl_kernel.o
Issue -
State: open - Opened by DefinitlyEvil 2 months ago
#193 - [Security Fix] Running a workflow when a pull request is assigned or labelled
Pull Request -
State: closed - Opened by guoshzhao 2 months ago
#192 - [Security Fix] Running a workflow when a pull request is assigned or labelled
Pull Request -
State: closed - Opened by guoshzhao 2 months ago
Labels: bug
#191 - test: trigger pipeline validation
Pull Request -
State: closed - Opened by hcoona 2 months ago
#190 - [Test PR] Update setup.py
Pull Request -
State: closed - Opened by EricWangCN 2 months ago
#189 - [Test] Update setup.py
Pull Request -
State: closed - Opened by mid-au 2 months ago
#188 - [Test] Update setup.py
Pull Request -
State: closed - Opened by mid-au 2 months ago
#187 - [TEST PR]Update setup.py
Pull Request -
State: closed - Opened by EricWangCN 2 months ago
#186 - Test MSRC
Pull Request -
State: closed - Opened by SecurityResearcher-yoda 3 months ago
#184 - remove algolia
Pull Request -
State: closed - Opened by tocean 3 months ago
#183 - DeepSpeed integration breaks existing DeepSpeed logic
Issue -
State: open - Opened by muellerzr 3 months ago
- 2 comments
#182 - Using ZeRO 3?
Issue -
State: open - Opened by muellerzr 3 months ago
#181 - Make new optimizer more extensible, easier to integrate downstream for FSDP
Pull Request -
State: open - Opened by muellerzr 3 months ago
- 6 comments
#180 - AttributeError: 'ScalingTensor' object has no attribute 'view'
Issue -
State: open - Opened by LSC527 4 months ago
- 3 comments
#179 - Integration with PyTorch Lightning
Issue -
State: open - Opened by schopra8 4 months ago
#178 - Does this actually work?
Issue -
State: closed - Opened by tsengalb99 4 months ago
- 10 comments
#177 - Request for Update to Support Latest Megatron-LM Version
Issue -
State: open - Opened by nogizakar 4 months ago
#176 - Can I use fp8 only when the code runs to the fp8 branch?
Issue -
State: closed - Opened by forevergj 7 months ago
- 8 comments
#175 - Why does using msamp decrease throughput
Issue -
State: closed - Opened by forevergj 7 months ago
- 4 comments
#174 - [compilation error] nvcc fatal : Unsupported gpu architecture 'compute_89'
Issue -
State: closed - Opened by fmo-mt 7 months ago
#173 - Clarification: do we need 20 or 16 bytes per parameter when training with Adam + Mixed precision
Issue -
State: closed - Opened by rodrigo-f-nogueira 7 months ago
- 2 comments
#172 - Optimized model seems slower than original
Issue -
State: closed - Opened by BitCircuit 8 months ago
- 3 comments
#171 - how can i export the model from pytorch to onnx?
Issue -
State: open - Opened by 221588 8 months ago
- 1 comment
#170 - Optimizer datatype
Issue -
State: closed - Opened by brianchmiel 9 months ago
- 4 comments
#169 - [#168 fix] add context manager to fake `ScalingTensor`/`ScalingParameter`'s `__class__` as `torch.Tensor`
Pull Request -
State: open - Opened by 152334H 9 months ago
#168 - MNIST single GPU example: GradScaler AssertionError
Issue -
State: open - Opened by 152334H 9 months ago
- 5 comments
#167 - Is activation checkpointing used for Table 5 from the FP8-LM paper?
Issue -
State: closed - Opened by SolitaryThinker 9 months ago
- 2 comments
#165 - Add release blog of v0.4.0
Pull Request -
State: closed - Opened by tocean 9 months ago
#164 - add topic tag mixed-precision
Issue -
State: closed - Opened by Beliavsky 9 months ago
- 1 comment
#163 - [Question] How to apply MS-AMP to only part of the model?
Issue -
State: closed - Opened by veritas9872 9 months ago
- 3 comments
#162 - Bump ip from 1.1.5 to 1.1.9 in /website
Pull Request -
State: open - Opened by dependabot[bot] 9 months ago
Labels: dependencies
#161 - Bump axios, @docusaurus/core, @docusaurus/preset-classic and @docusaurus/theme-search-algolia in /website
Pull Request -
State: open - Opened by dependabot[bot] 9 months ago
Labels: dependencies
#160 - fix some bugs for latest TE
Pull Request -
State: closed - Opened by tocean 9 months ago
#159 - Add entrypoint in docker file and update document
Pull Request -
State: closed - Opened by tocean 9 months ago
#158 - Optimizer compilation fails with PyTorch 2.2
Issue -
State: open - Opened by rosario-purple 10 months ago
- 2 comments
#157 - Update te to latest stable version
Pull Request -
State: closed - Opened by tocean 10 months ago
#156 - Update deepspeed to latest version
Pull Request -
State: closed - Opened by tocean 10 months ago
#155 - Support AMD MI300 GPU
Pull Request -
State: open - Opened by tocean 10 months ago
#154 - [Question]Is MS-AMP going to support ZeRO-2 + PP ?
Issue -
State: closed - Opened by ohwi 10 months ago
- 1 comment
#150 - Bump follow-redirects from 1.14.8 to 1.15.4 in /website
Pull Request -
State: open - Opened by dependabot[bot] 11 months ago
Labels: dependencies
#149 - Support FSDP
Pull Request -
State: closed - Opened by tocean 11 months ago
- 5 comments
#148 - Support fsdp
Pull Request -
State: closed - Opened by tocean 11 months ago
#147 - Is MS-AMP reproducing the FP8-LM paper's results?
Issue -
State: closed - Opened by xrsrke 11 months ago
- 2 comments
#146 - Question about FP8 matmul coverage in FP8-LM
Issue -
State: closed - Opened by stakahashy 12 months ago
- 2 comments
#145 - Remove model_state.use_fp8_ddp and optimizer.all_reduce_grads
Pull Request -
State: open - Opened by wkcn 12 months ago
- 1 comment
#144 - [Bug Fixed] Support MS-AMP+TE+DDP and MS-AMP+TE+DeepSpeed
Pull Request -
State: closed - Opened by wkcn 12 months ago
#143 - Add cifar10 example using TE+DeepSpeed-Zero+MS-AMP
Pull Request -
State: closed - Opened by tocean 12 months ago
#142 - Support writing optimizer checkpoint only on rank0 and make UT pass on A100
Pull Request -
State: closed - Opened by tocean 12 months ago
#141 - fix bug in deep fp8 zero
Pull Request -
State: closed - Opened by tocean 12 months ago
#140 - [Feature] Auto scaling factor tuning for FP8 collective communication
Pull Request -
State: open - Opened by wkcn 12 months ago
#139 - Support for latest Megatron-LM and transformer-engine 1.0 +
Issue -
State: closed - Opened by sosofun 12 months ago
- 2 comments
#138 - Use ScalingTensor for state in AdamW optimizer
Pull Request -
State: closed - Opened by tocean 12 months ago
- 1 comment
#137 - [Dependencies] update deepspeed
Pull Request -
State: closed - Opened by wkcn 12 months ago
- 3 comments
#136 - [Bugfix] LinearReplacer.replace(linear, Dtypes.kbfloat16) raises error.
Pull Request -
State: closed - Opened by wkcn 12 months ago
- 1 comment
#135 - [Bugfix] when parameters has no grad or ScalingParameter has no is_meta property it will crash
Pull Request -
State: closed - Opened by tocean about 1 year ago
- 1 comment
#134 - Bump axios, @docusaurus/core, @docusaurus/preset-classic and @docusaurus/theme-search-algolia in /website
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#133 - MS-AMP install from source
Issue -
State: closed - Opened by wpf19911118 about 1 year ago
- 2 comments
#132 - Optimize performance by fuse adding high precision tensor to fp8 tensor
Pull Request -
State: closed - Opened by tocean about 1 year ago
- 2 comments
#131 - [Bug Fixed] scale=INF when casting a tensor to scaling FP32/BF16 tensors
Pull Request -
State: closed - Opened by wkcn about 1 year ago
- 4 comments
#130 - MS-AMP crashes with DeepSpeed ZeRO 3
Issue -
State: closed - Opened by rationalism about 1 year ago
- 4 comments
#129 - Please update obsolete dependencies
Issue -
State: closed - Opened by rosario-purple about 1 year ago
- 7 comments
#128 - Huggingface Accelerate Support
Issue -
State: closed - Opened by muellerzr about 1 year ago
- 2 comments
#127 - Questions about error reporting
Issue -
State: closed - Opened by Mrzhang-dada about 1 year ago
- 2 comments
#126 - Improve document
Pull Request -
State: closed - Opened by tocean about 1 year ago
#125 - question about the paper
Issue -
State: closed - Opened by WeiSQ-zju about 1 year ago
- 4 comments
#124 - Add release blog for v0.3.0
Pull Request -
State: closed - Opened by tocean about 1 year ago
#123 - V0.4 Release Plan
Issue -
State: open - Opened by cp5555 about 1 year ago
- 1 comment
Labels: iteration plan
#122 - Support for MS-AMP in FSDP
Issue -
State: closed - Opened by naveenkumarmarri about 1 year ago
- 3 comments
Labels: enhancement
#121 - change release version
Pull Request -
State: closed - Opened by tocean about 1 year ago
- 2 comments
#120 - Override cast_to_fp8 in te.module.linear
Pull Request -
State: closed - Opened by tocean about 1 year ago
#119 - FP8 in tensor parallel region question
Issue -
State: closed - Opened by afcruzs about 1 year ago
- 4 comments
#118 - FP8 in linear layer question
Issue -
State: closed - Opened by afcruzs about 1 year ago
- 2 comments
#117 - Automatic Scaling in the code
Issue -
State: closed - Opened by afcruzs about 1 year ago
- 2 comments
#116 - Question: Is FP8-LM only supported on H100?
Issue -
State: closed - Opened by LSC527 about 1 year ago
- 7 comments
#115 - Training curve datapoints or smoothing
Issue -
State: open - Opened by afcruzs about 1 year ago
- 1 comment
#114 - Fix typo in optimization-level.md
Pull Request -
State: closed - Opened by eltociear about 1 year ago
#113 - add the bibtex of "FP8-LM: Training FP8 Large Language Models" in README
Pull Request -
State: closed - Opened by wkcn about 1 year ago
#112 - [Bug Fixed] The scaling weight is not updated in the optimizer `LBAdamW`
Pull Request -
State: closed - Opened by wkcn about 1 year ago
#111 - Qusetion: FP8 Allreduce
Issue -
State: closed - Opened by MARD1NO about 1 year ago
- 7 comments
#110 - Question : does it work with Apple mps ?
Issue -
State: closed - Opened by edmondja about 1 year ago
- 2 comments
#109 - Improve docs
Pull Request -
State: closed - Opened by tocean about 1 year ago
#108 - Question: Difficulty of FP8 + ZeRO
Issue -
State: closed - Opened by awgu about 1 year ago
- 1 comment
#107 - V0.3.0 Test Plan
Issue -
State: closed - Opened by tocean about 1 year ago
#106 - fix description and format error in docs
Pull Request -
State: closed - Opened by tocean about 1 year ago
#105 - Bump shell-quote, @docusaurus/core and @docusaurus/theme-search-algolia in /website
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#104 - Bump minimatch, @docusaurus/core and @docusaurus/theme-search-algolia in /website
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#103 - Bump @babel/traverse from 7.14.5 to 7.23.2 in /website
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#102 - Bump postcss from 8.3.5 to 8.4.31 in /website
Pull Request -
State: closed - Opened by dependabot[bot] about 1 year ago
- 1 comment
Labels: dependencies
#101 - Add website for MS-AMP
Pull Request -
State: closed - Opened by tocean about 1 year ago
- 1 comment
#100 - Support checkpoint for Megatron-LM
Pull Request -
State: closed - Opened by tocean about 1 year ago
#99 - Questions: Clarifying the use of FP8 for Training
Issue -
State: closed - Opened by jon-chuang about 1 year ago
- 2 comments
#98 - Support latest TransformerEngine
Pull Request -
State: closed - Opened by tocean about 1 year ago
#97 - [Bugfix]Change clip_grad_norm_fp8 to clip_grad_norm_fp32
Pull Request -
State: closed - Opened by tocean over 1 year ago
#96 - [Bugfix]Change clip_grad_norm_fp8 to clip_grad_norm_fp32
Pull Request -
State: closed - Opened by tocean over 1 year ago
#95 - Support Megatron-LM
Pull Request -
State: closed - Opened by tocean over 1 year ago
#94 - Refactor distop module
Pull Request -
State: closed - Opened by tocean over 1 year ago
- 1 comment
#93 - Add fp8 ddp and remove ready_to_all_reduce
Pull Request -
State: closed - Opened by tocean over 1 year ago
- 1 comment