Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / EleutherAI/gpt-neox issues and pull requests
#1221 - Rwkv pipeline parallelism
Pull Request -
State: closed - Opened by jahatef 6 months ago
- 1 comment
#1220 - fix conversion of hf -> neox for pythia in model parallel
Pull Request -
State: closed - Opened by dmahan93 6 months ago
#1219 - Fix changed behavior of pipe_parallel
Pull Request -
State: closed - Opened by yang 6 months ago
#1218 - Conversion script bugfixes
Pull Request -
State: closed - Opened by haileyschoelkopf 6 months ago
- 3 comments
#1217 - Fix markdown formatting error
Pull Request -
State: closed - Opened by StellaAthena 6 months ago
#1216 - Run document update again
Pull Request -
State: closed - Opened by jahatef 7 months ago
#1215 - fixed typos
Pull Request -
State: closed - Opened by jahatef 7 months ago
#1214 - fix pipeline parallelism detection
Pull Request -
State: closed - Opened by dmahan93 7 months ago
- 2 comments
#1213 - Add Transformer Engine
Pull Request -
State: open - Opened by Quentin-Anthony 7 months ago
- 1 comment
#1212 - Add `intermediate_size` to GPT-NeoX models
Pull Request -
State: closed - Opened by dtamayo-nlp 7 months ago
- 5 comments
#1211 - Bump jinja2 from 3.1.3 to 3.1.4 in /requirements
Pull Request -
State: closed - Opened by dependabot[bot] 7 months ago
Labels: dependencies
#1210 - Dmoe integration
Pull Request -
State: open - Opened by DayOfThePenguin 7 months ago
#1209 - Fix bug in tools/ckpts/convert_neox_to_hf.py for setting intermediate_size
Pull Request -
State: closed - Opened by jvendrow 7 months ago
- 2 comments
#1208 - 'intermediate_size' not set in tools/ckpts/convert_neox_to_hf.py for neox model architecture
Issue -
State: closed - Opened by jvendrow 7 months ago
- 3 comments
Labels: bug
#1203 - My servers used for multi-node training do not have ssh. How can I launch multi-node training using the torchrun command?
Issue -
State: open - Opened by dingning97 7 months ago
- 2 comments
Labels: feature request
#1201 - Create cmake-multi-platform.yml
Pull Request -
State: closed - Opened by Romario242003 7 months ago
- 2 comments
#1198 - add rwkv support
Pull Request -
State: closed - Opened by jahatef 8 months ago
- 2 comments
Labels: merge-queue
#1197 - Megablocks-based MoE
Pull Request -
State: closed - Opened by DayOfThePenguin 8 months ago
- 1 comment
#1194 - Added infinite lr schedules
Pull Request -
State: open - Opened by kshitijkg 8 months ago
Labels: merge-queue
#1192 - Add megablocks dropless MoE
Pull Request -
State: closed - Opened by yang 8 months ago
#1191 - [ZeRO-3] Ensured passing neox deepspeed_config when using partitioned init
Pull Request -
State: closed - Opened by R0n12 8 months ago
#1188 - [AMD] Supporting fused kernels build using JIT
Pull Request -
State: closed - Opened by R0n12 8 months ago
- 2 comments
#1185 - Diffs to upstream megatron as a basis for discussion towards TE integration
Pull Request -
State: closed - Opened by tf-nv 9 months ago
#1177 - Remove unused requirements-sparseattention
Pull Request -
State: closed - Opened by segyges 9 months ago
- 2 comments
#1167 - Add Basic RWKV Block to GPT-NeoX
Issue -
State: closed - Opened by Quentin-Anthony 9 months ago
- 1 comment
Labels: feature request
#1156 - Fused kernel support for AMD (using JIT)
Pull Request -
State: closed - Opened by R0n12 9 months ago
- 3 comments
#1139 - Better run_eval_harness import
Pull Request -
State: closed - Opened by R0n12 10 months ago
- 1 comment
#1119 - Create Singularity Container
Issue -
State: open - Opened by Quentin-Anthony 11 months ago
- 3 comments
Labels: feature request, good first issue, help wanted
#1088 - Finetune
Issue -
State: closed - Opened by liuxinxin123 12 months ago
- 4 comments
Labels: feature request
#1087 - [muP] Rework
Pull Request -
State: open - Opened by lintangsutawika 12 months ago
#1084 - Support for DeepSpeed Ulysses (SP)
Pull Request -
State: closed - Opened by Quentin-Anthony about 1 year ago
- 1 comment
#1078 - Port DeepSpeed Ulysses
Issue -
State: closed - Opened by Quentin-Anthony about 1 year ago
- 2 comments
Labels: feature request
#1043 - AssertionError: Not sure how to proceed, we were given deepspeed configs in the deepspeed arguments and deepspeed.initialize() function call
Issue -
State: closed - Opened by shaunstoltz about 1 year ago
- 1 comment
Labels: bug
#979 - Dataload fix
Pull Request -
State: closed - Opened by jahatef over 1 year ago
- 2 comments
#878 - Deepspeed benchmarking
Pull Request -
State: open - Opened by cr458 over 1 year ago
- 1 comment
#851 - block-sparse flash attention support
Issue -
State: open - Opened by jordiclive over 1 year ago
- 4 comments
Labels: feature request, good first issue
#812 - Add support for sequence parallelism
Issue -
State: closed - Opened by Quentin-Anthony over 1 year ago
- 12 comments
Labels: feature request, help wanted
#677 - MoE Support
Pull Request -
State: closed - Opened by Quentin-Anthony about 2 years ago
- 1 comment
#645 - RuntimeError: Error(s) in loading state_dict for EmbeddingPipe: size mismatch for word_embeddings.weight
Issue -
State: open - Opened by mcao516 over 2 years ago
- 9 comments
Labels: bug, good first issue, help wanted
#100 - How to calculate parameters
Issue -
State: closed - Opened by Carolingliang almost 4 years ago
#99 - update
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#98 - Updating
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#97 - Pulling in for testing
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#96 - Reverted back to normal adam from 1-bit-adam
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#95 - How to change parameters
Issue -
State: closed - Opened by Carolingliang almost 4 years ago
- 2 comments
#94 - parameters
Issue -
State: closed - Opened by 1660678083Alice almost 4 years ago
- 1 comment
#93 - Update base_model.json
Pull Request -
State: closed - Opened by srulikbd almost 4 years ago
#92 - Expanded patterns for the kill script
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#91 - Miscellaneous docker QoL improvements
Pull Request -
State: closed - Opened by leogao2 almost 4 years ago
- 1 comment
#90 - Add checkpoint saving / loading
Pull Request -
State: closed - Opened by sdtblck almost 4 years ago
- 4 comments
#89 - Fix train pipeline
Pull Request -
State: closed - Opened by sdtblck almost 4 years ago
#88 - Added MPU from Sid's MegatronPipeline
Pull Request -
State: closed - Opened by glebshevchukk almost 4 years ago
- 6 comments
#87 - Batch size needs to be specified
Pull Request -
State: closed - Opened by joshlk almost 4 years ago
#86 - Stella fixes shit
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#85 - Minor fixes
Pull Request -
State: closed - Opened by sdtblck almost 4 years ago
#84 - Update requirements.txt
Pull Request -
State: closed - Opened by srulikbd almost 4 years ago
- 3 comments
#83 - updated deepspeed_zero2 with recommended settings
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#82 - Create label.yml
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
- 2 comments
#81 - Update issue templates
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#80 - AttributeError: module 'torch.utils' has no attribute 'checkpoint' in gpt-neox/gpt-neox
Issue -
State: closed - Opened by kinoc almost 4 years ago
- 4 comments
Labels: bug
#79 - removed errant comma
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#78 - Removed errant comma
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#77 - Implement distributed training using Kubernetes
Pull Request -
State: closed - Opened by leogao2 almost 4 years ago
- 1 comment
#76 - Create hostfile
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#75 - Implement the MPU from Megatron
Issue -
State: closed - Opened by StellaAthena almost 4 years ago
- 1 comment
Labels: feature request
#74 - Remove -s flag
Pull Request -
State: closed - Opened by leogao2 almost 4 years ago
#73 - Update deepspeed install script to allow being run as root
Pull Request -
State: closed - Opened by leogao2 almost 4 years ago
#72 - Implemented 1-bit adam
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#71 - gpt3small is broken
Issue -
State: closed - Opened by StellaAthena almost 4 years ago
- 6 comments
Labels: bug
#70 - Added 1bit config and tested for #69
Pull Request -
State: closed - Opened by glebshevchukk almost 4 years ago
#69 - Implement 1-Bit Adam
Issue -
State: closed - Opened by StellaAthena almost 4 years ago
- 1 comment
Labels: feature request, good first issue
#68 - Expand to all 8 CoreWeave Machines
Issue -
State: closed - Opened by StellaAthena almost 4 years ago
- 1 comment
Labels: feature request
#67 - Fix DeepSpeed (ZeRO2 + Pipeline Parallel)
Issue -
State: closed - Opened by StellaAthena almost 4 years ago
- 1 comment
Labels: bug, help wanted
#66 - (T5) Relative positional encodings?
Issue -
State: closed - Opened by CRG2K almost 4 years ago
- 6 comments
Labels: feature request
#65 - Hub
Pull Request -
State: closed - Opened by raijinspecial almost 4 years ago
- 1 comment
#64 - Updates configs to allow for the third failure mode
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#63 - Pipeline Parallel QoL Fixes
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#62 - Pipeline parallelism and gradient checkpointing (edit: and ZeRO 2!) don’t work together
Issue -
State: closed - Opened by StellaAthena almost 4 years ago
- 12 comments
Labels: bug
#61 - fix everything that i broke
Pull Request -
State: closed - Opened by sdtblck almost 4 years ago
#60 - Pipeline parallelism for enwik8
Pull Request -
State: closed - Opened by sdtblck almost 4 years ago
#59 - implement gradient checkpointing
Pull Request -
State: closed - Opened by sdtblck almost 4 years ago
#58 - Implement Generation / Eval with deepspeed model engine
Issue -
State: closed - Opened by sdtblck almost 4 years ago
- 6 comments
Labels: feature request
#57 - Revert GPT2Dataset back to old working state
Pull Request -
State: closed - Opened by sdtblck almost 4 years ago
#56 - Add enron_jsonl and enron_tfr datasets (mostly for testing)
Pull Request -
State: closed - Opened by sdtblck almost 4 years ago
#55 - Implement Gradient Checkpointing
Issue -
State: closed - Opened by StellaAthena almost 4 years ago
- 2 comments
Labels: feature request, good first issue
#54 - Updating from main
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#53 - Updating branch with new PR code
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#52 - Adding jsonl chunked dataset
Pull Request -
State: closed - Opened by glebshevchukk almost 4 years ago
- 6 comments
#51 - Update data_utils.py
Pull Request -
State: closed - Opened by ShivanshuPurohit almost 4 years ago
- 2 comments
#50 - Stella parallel
Pull Request -
State: closed - Opened by ShivanshuPurohit almost 4 years ago
- 1 comment
#49 - Stella athena patch 1
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#48 - Added link to an installation walk-through
Pull Request -
State: closed - Opened by StellaAthena almost 4 years ago
#47 - update tensorflow to 2.4.0
Pull Request -
State: closed - Opened by sdtblck almost 4 years ago
#46 - add dynamic dataset for processing/tokenizing examples lazily
Pull Request -
State: closed - Opened by trisongz almost 4 years ago
- 9 comments
#45 - Implement Pipeline Parallelism
Issue -
State: closed - Opened by sdtblck almost 4 years ago
- 6 comments
Labels: feature request
#44 - Ensure learning rate scheduler is functioning correctly
Issue -
State: closed - Opened by sdtblck almost 4 years ago
- 1 comment
Labels: bug, documentation
#43 - Add Deepspeed Transformer Kernel
Issue -
State: closed - Opened by sdtblck almost 4 years ago
- 4 comments
Labels: feature request, good first issue
#42 - Fix deprecation warning
Pull Request -
State: closed - Opened by sdtblck almost 4 years ago
#41 - Fix tfrecord dataset to load less files into memory
Issue -
State: closed - Opened by sdtblck almost 4 years ago
Labels: bug
#40 - Write dataset class that tokenizes on the fly
Issue -
State: closed - Opened by sdtblck almost 4 years ago
- 1 comment
Labels: feature request