Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / EleutherAI/gpt-neox issues and pull requests

#1188 - [AMD] Supporting fused kernels build using JIT

Pull Request - State: closed - Opened by R0n12 8 months ago - 2 comments

#1177 - Remove unused requirements-sparseattention

Pull Request - State: closed - Opened by segyges 8 months ago - 2 comments

#1167 - Add Basic RWKV Block to GPT-NeoX

Issue - State: closed - Opened by Quentin-Anthony 8 months ago - 1 comment
Labels: feature request

#1156 - Fused kernel support for AMD (using JIT)

Pull Request - State: closed - Opened by R0n12 9 months ago - 3 comments

#1139 - Better run_eval_harness import

Pull Request - State: closed - Opened by R0n12 10 months ago - 1 comment

#1119 - Create Singularity Container

Issue - State: open - Opened by Quentin-Anthony 10 months ago - 3 comments
Labels: feature request, good first issue, help wanted

#1088 - Finetune

Issue - State: closed - Opened by liuxinxin123 11 months ago - 4 comments
Labels: feature request

#1087 - [muP] Rework

Pull Request - State: open - Opened by lintangsutawika 12 months ago

#1084 - Support for DeepSpeed Ulysses (SP)

Pull Request - State: closed - Opened by Quentin-Anthony 12 months ago - 1 comment

#1078 - Port DeepSpeed Ulysses

Issue - State: closed - Opened by Quentin-Anthony almost 1 year ago - 2 comments
Labels: feature request

#979 - Dataload fix

Pull Request - State: closed - Opened by jahatef over 1 year ago - 2 comments

#878 - Deepspeed benchmarking

Pull Request - State: open - Opened by cr458 over 1 year ago - 1 comment

#812 - Add support for sequence parallelism

Issue - State: closed - Opened by Quentin-Anthony over 1 year ago - 12 comments
Labels: feature request, help wanted

#677 - MoE Support

Pull Request - State: closed - Opened by Quentin-Anthony about 2 years ago - 1 comment

#645 - RuntimeError: Error(s) in loading state_dict for EmbeddingPipe: size mismatch for word_embeddings.weight

Issue - State: open - Opened by mcao516 over 2 years ago - 9 comments
Labels: bug, good first issue, help wanted

#100 - How to calculate parameters

Issue - State: closed - Opened by Carolingliang almost 4 years ago

#99 - update

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#98 - Updating

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#97 - Pulling in for testing

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#96 - Reverted back to normal adam from 1-bit-adam

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#95 - How to change parameters

Issue - State: closed - Opened by Carolingliang almost 4 years ago - 2 comments

#94 - parameters

Issue - State: closed - Opened by 1660678083Alice almost 4 years ago - 1 comment

#93 - Update base_model.json

Pull Request - State: closed - Opened by srulikbd almost 4 years ago

#92 - Expanded patterns for the kill script

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#91 - Miscellaneous docker QoL improvements

Pull Request - State: closed - Opened by leogao2 almost 4 years ago - 1 comment

#90 - Add checkpoint saving / loading

Pull Request - State: closed - Opened by sdtblck almost 4 years ago - 4 comments

#89 - Fix train pipeline

Pull Request - State: closed - Opened by sdtblck almost 4 years ago

#88 - Added MPU from Sid's MegatronPipeline

Pull Request - State: closed - Opened by glebshevchukk almost 4 years ago - 6 comments

#87 - Batch size needs to be specified

Pull Request - State: closed - Opened by joshlk almost 4 years ago

#86 - Stella fixes shit

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#85 - Minor fixes

Pull Request - State: closed - Opened by sdtblck almost 4 years ago

#84 - Update requirements.txt

Pull Request - State: closed - Opened by srulikbd almost 4 years ago - 3 comments

#83 - updated deepspeed_zero2 with recommended settings

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#82 - Create label.yml

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago - 2 comments

#81 - Update issue templates

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#80 - AttributeError: module 'torch.utils' has no attribute 'checkpoint' in gpt-neox/gpt-neox

Issue - State: closed - Opened by kinoc almost 4 years ago - 4 comments
Labels: bug

#79 - removed errant comma

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#78 - Removed errant comma

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#77 - Implement distributed training using Kubernetes

Pull Request - State: closed - Opened by leogao2 almost 4 years ago - 1 comment

#76 - Create hostfile

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#75 - Implement the MPU from Megatron

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 1 comment
Labels: feature request

#74 - Remove -s flag

Pull Request - State: closed - Opened by leogao2 almost 4 years ago

#73 - Update deepspeed install script to allow being run as root

Pull Request - State: closed - Opened by leogao2 almost 4 years ago

#72 - Implemented 1-bit adam

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#71 - gpt3small is broken

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 6 comments
Labels: bug

#70 - Added 1bit config and tested for #69

Pull Request - State: closed - Opened by glebshevchukk almost 4 years ago

#69 - Implement 1-Bit Adam

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 1 comment
Labels: feature request, good first issue

#68 - Expand to all 8 CoreWeave Machines

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 1 comment
Labels: feature request

#67 - Fix DeepSpeed (ZeRO2 + Pipeline Parallel)

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 1 comment
Labels: bug, help wanted

#66 - (T5) Relative positional encodings?

Issue - State: closed - Opened by CRG2K almost 4 years ago - 6 comments
Labels: feature request

#65 - Hub

Pull Request - State: closed - Opened by raijinspecial almost 4 years ago - 1 comment

#64 - Updates configs to allow for the third failure mode

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#63 - Pipeline Parallel QoL Fixes

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#62 - Pipeline parallelism and gradient checkpointing (edit: and ZeRO 2!) don’t work together

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 12 comments
Labels: bug

#61 - fix everything that i broke

Pull Request - State: closed - Opened by sdtblck almost 4 years ago

#60 - Pipeline parallelism for enwik8

Pull Request - State: closed - Opened by sdtblck almost 4 years ago

#59 - implement gradient checkpointing

Pull Request - State: closed - Opened by sdtblck almost 4 years ago

#58 - Implement Generation / Eval with deepspeed model engine

Issue - State: closed - Opened by sdtblck almost 4 years ago - 6 comments
Labels: feature request

#57 - Revert GPT2Dataset back to old working state

Pull Request - State: closed - Opened by sdtblck almost 4 years ago

#56 - Add enron_jsonl and enron_tfr datasets (mostly for testing)

Pull Request - State: closed - Opened by sdtblck almost 4 years ago

#55 - Implement Gradient Checkpointing

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 2 comments
Labels: feature request, good first issue

#54 - Updating from main

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#53 - Updating branch with new PR code

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#52 - Adding jsonl chunked dataset

Pull Request - State: closed - Opened by glebshevchukk almost 4 years ago - 6 comments

#51 - Update data_utils.py

Pull Request - State: closed - Opened by ShivanshuPurohit almost 4 years ago - 2 comments

#50 - Stella parallel

Pull Request - State: closed - Opened by ShivanshuPurohit almost 4 years ago - 1 comment

#49 - Stella athena patch 1

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#48 - Added link to an installation walk-through

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#47 - update tensorflow to 2.4.0

Pull Request - State: closed - Opened by sdtblck almost 4 years ago

#46 - add dynamic dataset for processing/tokenizing examples lazily

Pull Request - State: closed - Opened by trisongz almost 4 years ago - 9 comments

#45 - Implement Pipeline Parallelism

Issue - State: closed - Opened by sdtblck almost 4 years ago - 6 comments
Labels: feature request

#44 - Ensure learning rate scheduler is functioning correctly

Issue - State: closed - Opened by sdtblck almost 4 years ago - 1 comment
Labels: bug, documentation

#43 - Add Deepspeed Transformer Kernel

Issue - State: closed - Opened by sdtblck almost 4 years ago - 4 comments
Labels: feature request, good first issue

#42 - Fix deprecation warning

Pull Request - State: closed - Opened by sdtblck almost 4 years ago

#41 - Fix tfrecord dataset to load less files into memory

Issue - State: closed - Opened by sdtblck almost 4 years ago
Labels: bug

#40 - Write dataset class that tokenizes on the fly

Issue - State: closed - Opened by sdtblck almost 4 years ago - 1 comment
Labels: feature request

#39 - Add improved data downloading class / pipeline

Pull Request - State: closed - Opened by sdtblck almost 4 years ago - 2 comments

#36 - Update requirements.txt

Pull Request - State: closed - Opened by sdtblck almost 4 years ago

#35 - Fix error in extracting OWT2 dataset

Pull Request - State: closed - Opened by steven-mi almost 4 years ago - 1 comment

#34 - feedforward GLU on by default

Pull Request - State: closed - Opened by lucidrains almost 4 years ago - 1 comment

#33 - Automatically download owt2

Pull Request - State: closed - Opened by steven-mi almost 4 years ago

#32 - Fix depreciated code

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 1 comment
Labels: bug

#31 - disable reduce for loss calculation and calculate mean separately

Pull Request - State: closed - Opened by anthony-dipofi almost 4 years ago - 2 comments

#30 - untie classifier weights by default

Pull Request - State: closed - Opened by lucidrains almost 4 years ago

#29 - add linear warmup over 5000 steps and gradient clipping

Pull Request - State: closed - Opened by lucidrains almost 4 years ago

#28 - ftfy used in create_tfrecords.py but not listed in requirements.txt

Issue - State: closed - Opened by anthony-dipofi almost 4 years ago - 1 comment
Labels: bug

#27 - Update documentation

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 1 comment
Labels: documentation

#26 - Hardcoded paths in gpt3_small.json

Issue - State: closed - Opened by anthony-dipofi almost 4 years ago
Labels: bug

#25 - make mask value smaller by factor of 2

Pull Request - State: closed - Opened by lucidrains almost 4 years ago - 1 comment

#24 - GPT-3 Small Works

Pull Request - State: closed - Opened by StellaAthena almost 4 years ago

#22 - Can't install Triton

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 2 comments
Labels: bug

#21 - fix small bug where sequence length is not passed into attention class

Pull Request - State: closed - Opened by lucidrains almost 4 years ago

#20 - Integrate ZeRO-Powered Data Parallelism

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 1 comment
Labels: feature request

#19 - Integrate the full power of ZeRo into the code

Issue - State: closed - Opened by StellaAthena almost 4 years ago - 1 comment
Labels: feature request