Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / Azure/MS-AMP issues and pull requests

#197 - when LinearReplacer is used?

Issue - State: closed - Opened by 191220042 3 days ago - 4 comments

#196 - Installation might be incomplete

Issue - State: open - Opened by leedrake5 about 1 month ago - 4 comments

#195 - [Security Fix] Avoid running workflow on self-hosted node.

Pull Request - State: open - Opened by guoshzhao about 2 months ago

#194 - Stuck at Compilation of msccl_kernel.o

Issue - State: open - Opened by DefinitlyEvil 2 months ago

#192 - [Security Fix] Running a workflow when a pull request is assigned or labelled

Pull Request - State: closed - Opened by guoshzhao 2 months ago
Labels: bug

#191 - test: trigger pipeline validation

Pull Request - State: closed - Opened by hcoona 2 months ago

#190 - [Test PR] Update setup.py

Pull Request - State: closed - Opened by EricWangCN 2 months ago

#189 - [Test] Update setup.py

Pull Request - State: closed - Opened by mid-au 2 months ago

#188 - [Test] Update setup.py

Pull Request - State: closed - Opened by mid-au 2 months ago

#187 - [TEST PR]Update setup.py

Pull Request - State: closed - Opened by EricWangCN 2 months ago

#186 - Test MSRC

Pull Request - State: closed - Opened by SecurityResearcher-yoda 3 months ago

#184 - remove algolia

Pull Request - State: closed - Opened by tocean 3 months ago

#183 - DeepSpeed integration breaks existing DeepSpeed logic

Issue - State: open - Opened by muellerzr 3 months ago - 2 comments

#182 - Using ZeRO 3?

Issue - State: open - Opened by muellerzr 3 months ago

#181 - Make new optimizer more extensible, easier to integrate downstream for FSDP

Pull Request - State: open - Opened by muellerzr 3 months ago - 6 comments

#180 - AttributeError: 'ScalingTensor' object has no attribute 'view'

Issue - State: open - Opened by LSC527 4 months ago - 3 comments

#179 - Integration with PyTorch Lightning

Issue - State: open - Opened by schopra8 4 months ago

#178 - Does this actually work?

Issue - State: closed - Opened by tsengalb99 4 months ago - 10 comments

#176 - Can I use fp8 only when the code runs to the fp8 branch?

Issue - State: closed - Opened by forevergj 7 months ago - 8 comments

#175 - Why does using msamp decrease throughput

Issue - State: closed - Opened by forevergj 7 months ago - 4 comments

#172 - Optimized model seems slower than original

Issue - State: closed - Opened by BitCircuit 8 months ago - 3 comments

#171 - how can i export the model from pytorch to onnx?

Issue - State: open - Opened by 221588 8 months ago - 1 comment

#170 - Optimizer datatype

Issue - State: closed - Opened by brianchmiel 9 months ago - 4 comments

#168 - MNIST single GPU example: GradScaler AssertionError

Issue - State: open - Opened by 152334H 9 months ago - 5 comments

#167 - Is activation checkpointing used for Table 5 from the FP8-LM paper?

Issue - State: closed - Opened by SolitaryThinker 9 months ago - 2 comments

#165 - Add release blog of v0.4.0

Pull Request - State: closed - Opened by tocean 9 months ago

#164 - add topic tag mixed-precision

Issue - State: closed - Opened by Beliavsky 9 months ago - 1 comment

#163 - [Question] How to apply MS-AMP to only part of the model?

Issue - State: closed - Opened by veritas9872 9 months ago - 3 comments

#162 - Bump ip from 1.1.5 to 1.1.9 in /website

Pull Request - State: open - Opened by dependabot[bot] 9 months ago
Labels: dependencies

#160 - fix some bugs for latest TE

Pull Request - State: closed - Opened by tocean 9 months ago

#159 - Add entrypoint in docker file and update document

Pull Request - State: closed - Opened by tocean 9 months ago

#158 - Optimizer compilation fails with PyTorch 2.2

Issue - State: open - Opened by rosario-purple 10 months ago - 2 comments

#157 - Update te to latest stable version

Pull Request - State: closed - Opened by tocean 10 months ago

#156 - Update deepspeed to latest version

Pull Request - State: closed - Opened by tocean 10 months ago

#155 - Support AMD MI300 GPU

Pull Request - State: open - Opened by tocean 10 months ago

#154 - [Question]Is MS-AMP going to support ZeRO-2 + PP ?

Issue - State: closed - Opened by ohwi 10 months ago - 1 comment

#150 - Bump follow-redirects from 1.14.8 to 1.15.4 in /website

Pull Request - State: open - Opened by dependabot[bot] 11 months ago
Labels: dependencies

#149 - Support FSDP

Pull Request - State: closed - Opened by tocean 11 months ago - 5 comments

#148 - Support fsdp

Pull Request - State: closed - Opened by tocean 11 months ago

#147 - Is MS-AMP reproducing the FP8-LM paper's results?

Issue - State: closed - Opened by xrsrke 11 months ago - 2 comments

#146 - Question about FP8 matmul coverage in FP8-LM

Issue - State: closed - Opened by stakahashy 12 months ago - 2 comments

#145 - Remove model_state.use_fp8_ddp and optimizer.all_reduce_grads

Pull Request - State: open - Opened by wkcn 12 months ago - 1 comment

#144 - [Bug Fixed] Support MS-AMP+TE+DDP and MS-AMP+TE+DeepSpeed

Pull Request - State: closed - Opened by wkcn 12 months ago

#143 - Add cifar10 example using TE+DeepSpeed-Zero+MS-AMP

Pull Request - State: closed - Opened by tocean 12 months ago

#141 - fix bug in deep fp8 zero

Pull Request - State: closed - Opened by tocean 12 months ago

#139 - Support for latest Megatron-LM and transformer-engine 1.0 +

Issue - State: closed - Opened by sosofun 12 months ago - 2 comments

#138 - Use ScalingTensor for state in AdamW optimizer

Pull Request - State: closed - Opened by tocean 12 months ago - 1 comment

#137 - [Dependencies] update deepspeed

Pull Request - State: closed - Opened by wkcn 12 months ago - 3 comments

#136 - [Bugfix] LinearReplacer.replace(linear, Dtypes.kbfloat16) raises error.

Pull Request - State: closed - Opened by wkcn 12 months ago - 1 comment

#134 - Bump axios, @docusaurus/core, @docusaurus/preset-classic and @docusaurus/theme-search-algolia in /website

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#133 - MS-AMP install from source

Issue - State: closed - Opened by wpf19911118 about 1 year ago - 2 comments

#132 - Optimize performance by fuse adding high precision tensor to fp8 tensor

Pull Request - State: closed - Opened by tocean about 1 year ago - 2 comments

#131 - [Bug Fixed] scale=INF when casting a tensor to scaling FP32/BF16 tensors

Pull Request - State: closed - Opened by wkcn about 1 year ago - 4 comments

#130 - MS-AMP crashes with DeepSpeed ZeRO 3

Issue - State: closed - Opened by rationalism about 1 year ago - 4 comments

#129 - Please update obsolete dependencies

Issue - State: closed - Opened by rosario-purple about 1 year ago - 7 comments

#128 - Huggingface Accelerate Support

Issue - State: closed - Opened by muellerzr about 1 year ago - 2 comments

#127 - Questions about error reporting

Issue - State: closed - Opened by Mrzhang-dada about 1 year ago - 2 comments

#126 - Improve document

Pull Request - State: closed - Opened by tocean about 1 year ago

#125 - question about the paper

Issue - State: closed - Opened by WeiSQ-zju about 1 year ago - 4 comments

#124 - Add release blog for v0.3.0

Pull Request - State: closed - Opened by tocean about 1 year ago

#123 - V0.4 Release Plan

Issue - State: open - Opened by cp5555 about 1 year ago - 1 comment
Labels: iteration plan

#122 - Support for MS-AMP in FSDP

Issue - State: closed - Opened by naveenkumarmarri about 1 year ago - 3 comments
Labels: enhancement

#121 - change release version

Pull Request - State: closed - Opened by tocean about 1 year ago - 2 comments

#120 - Override cast_to_fp8 in te.module.linear

Pull Request - State: closed - Opened by tocean about 1 year ago

#119 - FP8 in tensor parallel region question

Issue - State: closed - Opened by afcruzs about 1 year ago - 4 comments

#118 - FP8 in linear layer question

Issue - State: closed - Opened by afcruzs about 1 year ago - 2 comments

#117 - Automatic Scaling in the code

Issue - State: closed - Opened by afcruzs about 1 year ago - 2 comments

#116 - Question: Is FP8-LM only supported on H100?

Issue - State: closed - Opened by LSC527 about 1 year ago - 7 comments

#115 - Training curve datapoints or smoothing

Issue - State: open - Opened by afcruzs about 1 year ago - 1 comment

#114 - Fix typo in optimization-level.md

Pull Request - State: closed - Opened by eltociear about 1 year ago

#113 - add the bibtex of "FP8-LM: Training FP8 Large Language Models" in README

Pull Request - State: closed - Opened by wkcn about 1 year ago

#112 - [Bug Fixed] The scaling weight is not updated in the optimizer `LBAdamW`

Pull Request - State: closed - Opened by wkcn about 1 year ago

#111 - Question: FP8 Allreduce

Issue - State: closed - Opened by MARD1NO about 1 year ago - 7 comments

#110 - Question : does it work with Apple mps ?

Issue - State: closed - Opened by edmondja about 1 year ago - 2 comments

#109 - Improve docs

Pull Request - State: closed - Opened by tocean about 1 year ago

#108 - Question: Difficulty of FP8 + ZeRO

Issue - State: closed - Opened by awgu about 1 year ago - 1 comment

#107 - V0.3.0 Test Plan

Issue - State: closed - Opened by tocean about 1 year ago

#106 - fix description and format error in docs

Pull Request - State: closed - Opened by tocean about 1 year ago

#105 - Bump shell-quote, @docusaurus/core and @docusaurus/theme-search-algolia in /website

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#104 - Bump minimatch, @docusaurus/core and @docusaurus/theme-search-algolia in /website

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#103 - Bump @babel/traverse from 7.14.5 to 7.23.2 in /website

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#102 - Bump postcss from 8.3.5 to 8.4.31 in /website

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#101 - Add website for MS-AMP

Pull Request - State: closed - Opened by tocean about 1 year ago - 1 comment

#100 - Support checkpoint for Megatron-LM

Pull Request - State: closed - Opened by tocean about 1 year ago

#99 - Questions: Clarifying the use of FP8 for Training

Issue - State: closed - Opened by jon-chuang about 1 year ago - 2 comments

#98 - Support latest TransformerEngine

Pull Request - State: closed - Opened by tocean about 1 year ago

#97 - [Bugfix]Change clip_grad_norm_fp8 to clip_grad_norm_fp32

Pull Request - State: closed - Opened by tocean over 1 year ago

#96 - [Bugfix]Change clip_grad_norm_fp8 to clip_grad_norm_fp32

Pull Request - State: closed - Opened by tocean over 1 year ago

#95 - Support Megatron-LM

Pull Request - State: closed - Opened by tocean over 1 year ago

#94 - Refactor distop module

Pull Request - State: closed - Opened by tocean over 1 year ago - 1 comment

#93 - Add fp8 ddp and remove ready_to_all_reduce

Pull Request - State: closed - Opened by tocean over 1 year ago - 1 comment