Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / bigscience-workshop/Megatron-DeepSpeed issues and pull requests

#402 - Bump black from 21.4b0 to 24.3.0

Pull Request - State: open - Opened by dependabot[bot] 8 months ago
Labels: dependencies

#400 - Is this assertion for mask wrong?

Issue - State: open - Opened by yinfangchen 9 months ago - 1 comment

#399 - Feature/tigerbot

Pull Request - State: closed - Opened by i4never about 1 year ago

#398 - Hello, can Megatron-DeepSpeed pre-train llama2?

Issue - State: open - Opened by 13416157913 about 1 year ago

#395 - The difference between zero-3 and megatron with zero-2

Issue - State: open - Opened by nicosouth about 1 year ago

#393 - Feature/tigerbot

Pull Request - State: closed - Opened by i4never over 1 year ago

#392 - questions about inconsistent evaluation result

Issue - State: open - Opened by coorful over 1 year ago

#391 - stage3 error: IndexError: list index out of range

Issue - State: closed - Opened by PhdShi over 1 year ago - 1 comment

#390 - ModuleNotFoundError: No module named 'packaging' when install apex

Issue - State: closed - Opened by SeekPoint over 1 year ago - 3 comments

#388 - Question about ds to universal

Issue - State: open - Opened by saxh over 1 year ago

#386 - hello, I meet a problem

Issue - State: open - Opened by etoilestar over 1 year ago - 8 comments

#384 - Fix/dataloader error

Pull Request - State: closed - Opened by EastInsure over 1 year ago

#383 - pip install -e . failed with ModuleNotFoundError: No module named 'torch'

Issue - State: open - Opened by SeekPoint over 1 year ago - 2 comments

#381 - Megatron-DeepSpeed only applies to specific models?

Issue - State: open - Opened by Bob-cby over 1 year ago

#380 - Universal checkpoints and MP states

Issue - State: closed - Opened by aitorormazabal over 1 year ago - 2 comments

#379 - The given group does not exist pytorch

Issue - State: open - Opened by germanjke over 1 year ago - 2 comments

#378 - upgrade megatron-lm

Issue - State: open - Opened by dz1iang over 1 year ago

#376 - how to do prompt learning with bloom?

Issue - State: open - Opened by moseshu over 1 year ago

#373 - Can I use python only apex for gpt_pretrain?

Issue - State: open - Opened by Luoyang144 over 1 year ago

#372 - how to pretrain t5-lm adapted?

Issue - State: open - Opened by nanyyyyyy over 1 year ago

#371 - How to preprocess data for t5 model?

Issue - State: open - Opened by xiu-ze over 1 year ago

#370 - Add xPos embeddings

Pull Request - State: open - Opened by janEbert over 1 year ago

#369 - Exception: cuda rng state model-parallel-rng is not added

Issue - State: open - Opened by 520jefferson over 1 year ago - 1 comment

#368 - Adapt to DCU

Pull Request - State: closed - Opened by hepj987 over 1 year ago

#367 - Fix various small problems

Pull Request - State: open - Opened by janEbert over 1 year ago

#366 - How to continue pre-training Bloom?

Issue - State: open - Opened by ShinoharaHare over 1 year ago - 2 comments

#365 - Bloom model training with AML

Pull Request - State: open - Opened by savitamittal1 over 1 year ago

#363 - Is there any script for pretraining/funting Bloom?

Issue - State: open - Opened by drxmy almost 2 years ago

#362 - Bsevalharness

Pull Request - State: closed - Opened by Muennighoff almost 2 years ago

#360 - Fatal error: cuda_fp16.h: No such file or directory on ROCm

Issue - State: open - Opened by lvcc2018 almost 2 years ago - 1 comment

#359 - fintuning bloom 176b with bitfit

Issue - State: closed - Opened by drxmy almost 2 years ago - 2 comments

#358 - Add UL2 data sampling and pretraining

Pull Request - State: open - Opened by janEbert almost 2 years ago - 3 comments

#357 - Add FlashAttention

Pull Request - State: open - Opened by NouamaneTazi almost 2 years ago - 3 comments

#355 - deepspeed_to_megatron several issues

Issue - State: open - Opened by MatejUlcar about 2 years ago - 4 comments

#354 - Distill BLOOM - tentative 2

Pull Request - State: open - Opened by younesbelkada about 2 years ago

#353 - Enable rocm-support

Pull Request - State: open - Opened by luukkonenr about 2 years ago

#352 - Distill megatron - test Draft WIP

Pull Request - State: closed - Opened by younesbelkada about 2 years ago

#351 - Distill megatron - WIP draft code

Pull Request - State: closed - Opened by younesbelkada about 2 years ago

#350 - Load Bloom Optimizer State (i.e. Bloom 1B1)

Issue - State: open - Opened by philippmtk about 2 years ago - 2 comments

#349 - Encoding checkpoint reshaping guide

Pull Request - State: open - Opened by tjruwase about 2 years ago - 1 comment

#348 - Slower inference results for BLOOM fp16 on identical hardware

Issue - State: open - Opened by sarthaklangde about 2 years ago - 5 comments

#347 - grad norm increase strangely

Issue - State: open - Opened by misska1 about 2 years ago - 12 comments

#346 - How to inference GPT2 with DeepSpeed?

Issue - State: closed - Opened by cdj0311 about 2 years ago - 1 comment

#345 - [bloom inference scripts] improvements

Pull Request - State: closed - Opened by stas00 about 2 years ago

#344 - [Bloom inference] further improvements

Pull Request - State: closed - Opened by stas00 about 2 years ago - 1 comment

#343 - About reshape deepspeed checkpoint

Issue - State: open - Opened by henan991201 about 2 years ago - 20 comments

#342 - Installing Apex on Windows

Issue - State: open - Opened by gordicaleksa about 2 years ago - 1 comment

#341 - pretrain_gpt_distributed.sh ERROR!

Issue - State: closed - Opened by cdj0311 about 2 years ago

#340 - [ds-inference bloom] tweaks

Pull Request - State: closed - Opened by stas00 about 2 years ago - 4 comments

#339 - Followup PR for adding generation-server

Pull Request - State: closed - Opened by mayank31398 about 2 years ago - 12 comments

#338 - About convert deepspeed to deepspeed checkpoint

Issue - State: open - Opened by henan991201 about 2 years ago - 4 comments

#337 - Finetuning BLOOM

Issue - State: open - Opened by AnaRhisT94 about 2 years ago - 5 comments

#336 - Add multiple evaluation compat

Pull Request - State: open - Opened by Muennighoff about 2 years ago

#335 - Changing a single example affects forward pass for other examples in a batch

Issue - State: closed - Opened by mayank31398 about 2 years ago - 4 comments
Labels: bug

#333 - About convert DS checkpoint to Transformers

Issue - State: closed - Opened by misska1 about 2 years ago - 2 comments

#332 - disable CI

Pull Request - State: closed - Opened by stas00 about 2 years ago - 1 comment

#331 - merge main

Pull Request - State: closed - Opened by Muennighoff about 2 years ago

#330 - DeepSpeed inference support for int8 parameters on BLOOM?

Issue - State: closed - Opened by pai4451 about 2 years ago - 6 comments

#329 - how to convert huggingface model to megatron-deepspeed?

Issue - State: closed - Opened by yayaQAQ over 2 years ago - 8 comments

#328 - Add generation server scripts using HF accelerate and DS-inference

Pull Request - State: closed - Opened by mayank31398 over 2 years ago - 46 comments

#327 - [checkpoints] replace bf16 with fp32 checkpoint weights

Pull Request - State: open - Opened by stas00 over 2 years ago - 3 comments

#326 - Add option to normalize loss per target

Pull Request - State: closed - Opened by Muennighoff over 2 years ago

#325 - Add generation server scripts

Pull Request - State: closed - Opened by mayank31398 over 2 years ago - 1 comment

#324 - Errors in generation (Bloom) when changing options sampling/use_cache

Issue - State: open - Opened by thies1006 over 2 years ago - 29 comments

#323 - Question about downloading checkpoints of 6.3B,2.5B,1.3B

Issue - State: open - Opened by misska1 over 2 years ago - 3 comments

#322 - add args_deepspeed_gpt.sh

Pull Request - State: closed - Opened by xyn1201 over 2 years ago

#321 - Generation server using HF accelerate and DS inference

Pull Request - State: closed - Opened by mayank31398 over 2 years ago - 19 comments

#319 - where can I download the 176B checkpoint in deepspeed format?

Issue - State: open - Opened by xuyifan-0731 over 2 years ago - 17 comments

#314 - How to run generation?

Issue - State: closed - Opened by mayank31398 over 2 years ago - 1 comment

#313 - Prefix LM Eval

Pull Request - State: open - Opened by Muennighoff over 2 years ago - 4 comments

#311 - Add Bitfit

Pull Request - State: open - Opened by Muennighoff over 2 years ago

#309 - Enable loading ckpt for t0 finetuning

Pull Request - State: open - Opened by Muennighoff over 2 years ago

#308 - BLOOM Inference via DeepSpeed-Inference, Accelerate and DeepSpeed-ZeRO

Pull Request - State: closed - Opened by stas00 over 2 years ago - 46 comments

#291 - BigScience Eval Harness

Pull Request - State: open - Opened by Muennighoff over 2 years ago

#284 - MLM adaptation and Multitask Finetuning

Pull Request - State: closed - Opened by lintangsutawika over 2 years ago - 4 comments

#226 - Make sure deepspeed powered models are equivalent with their non deepspeed version

Issue - State: open - Opened by thomasw21 almost 3 years ago - 2 comments
Labels: Good First Issue

#163 - [Tensorboard] Log text prediction in evaluation

Issue - State: open - Opened by thomasw21 about 3 years ago - 14 comments
Labels: Good First Issue

#118 - Corby's numerically more stable self attn version

Pull Request - State: closed - Opened by stas00 about 3 years ago

#114 - Add checks to confirm that the checkpoint conversion script works perfectly correct

Issue - State: closed - Opened by ibeltagy about 3 years ago - 8 comments
Labels: Good First Issue

#99 - Double counts in parameter count

Issue - State: open - Opened by TevenLeScao about 3 years ago - 2 comments