Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / EleutherAI/gpt-neox issues and pull requests
#1323 - Error when converting sequential model to HF
Issue -
State: open - Opened by SilverSulfide 6 days ago
Labels: bug
#1322 - Runtime per step linearly increases with training step number.
Issue -
State: open - Opened by iPRET 13 days ago
- 1 comment
Labels: bug
#1321 - Can `preprocess_data.py` support Huggingface Dataset?
Issue -
State: open - Opened by cafeii 14 days ago
- 1 comment
Labels: feature request
#1320 - _forward_step_fn does not always return two values so eval.py breaks if is_pipe_parallel is false
Issue -
State: open - Opened by markNZed 14 days ago
- 2 comments
Labels: bug
#1319 - LLama mlp project layers missmatch with HF config during conversion
Issue -
State: closed - Opened by Vmjkom 20 days ago
- 2 comments
Labels: bug
#1318 - Fix documentation for converting SFT/DPO weights back to HF Llama
Pull Request -
State: closed - Opened by jacobthebanana 23 days ago
#1317 - KeyError when converting DPO weights from GPTNeoX format to HuggingFace Llama in post-training documentations
Issue -
State: closed - Opened by jacobthebanana 23 days ago
#1316 - Update text_generation_utils.py to work with pipe_parallel_size of 0
Pull Request -
State: open - Opened by markNZed 27 days ago
#1315 - fix a GQA issue (#1314)
Pull Request -
State: closed - Opened by tiandeyu-cs 29 days ago
#1314 - Training crashes when "(hidden_size * num_kv_heads) / (num_attention_heads * num_attention_heads)" is not an integer.
Issue -
State: closed - Opened by tiandeyu-cs 29 days ago
Labels: bug
#1313 - Python 3.10 support
Pull Request -
State: closed - Opened by markNZed 29 days ago
- 1 comment
#1312 - Add support for dropout in sparse attention
Pull Request -
State: closed - Opened by michaelc-yu about 1 month ago
#1311 - Add default bf16 precision setting when bf16 config option is set but precision is unset.
Pull Request -
State: closed - Opened by AI-WAIFU about 1 month ago
#1310 - [Question] Running gpt-neox on AMD-based LUMI HPC centre.
Issue -
State: closed - Opened by iPRET about 1 month ago
- 1 comment
Labels: bug
#1309 - fix 'intermediate_size' in Llama configuration files after the 'mlp_type' option was removed
Pull Request -
State: closed - Opened by tiandeyu-cs about 1 month ago
- 1 comment
#1308 - Add ERROR logging prefix and sort the prefixes alphabetically
Pull Request -
State: closed - Opened by TheBatmanofButler about 1 month ago
- 2 comments
#1307 - DeeperSpeed cannot support BFloat16 and PipelineParallelism
Issue -
State: open - Opened by jahatef about 1 month ago
- 1 comment
Labels: bug
#1306 - Latest DeepSpeed not supported
Issue -
State: open - Opened by jahatef about 1 month ago
Labels: bug
#1305 - Error with rotary embeddings and BFloat16
Issue -
State: closed - Opened by jahatef about 1 month ago
- 1 comment
Labels: bug
#1304 - CUDA/Pytorch multiprocessing workaround and test fixes
Pull Request -
State: open - Opened by AI-WAIFU about 1 month ago
#1303 - pytest-forked alternative to get around CUDA/pytorch multiprocessing limitation
Pull Request -
State: open - Opened by AI-WAIFU about 1 month ago
#1302 - adds pyproject files and tests
Pull Request -
State: closed - Opened by LouisCastricato about 2 months ago
#1301 - Fix failling tests
Pull Request -
State: closed - Opened by AI-WAIFU about 2 months ago
#1300 - Add additional asserts and update post training readme
Pull Request -
State: closed - Opened by AI-WAIFU about 2 months ago
#1299 - Add support for context parallelism
Pull Request -
State: open - Opened by bclyang about 2 months ago
- 1 comment
#1298 - Improve Profiling Docs
Pull Request -
State: closed - Opened by Quentin-Anthony about 2 months ago
#1297 - TE integration via full TransformerLayer
Pull Request -
State: open - Opened by tf-nv about 2 months ago
#1296 - hotfix for tp >= 2 and pp > 2 in autoitercount
Pull Request -
State: closed - Opened by AI-WAIFU about 2 months ago
#1295 - readded RM training removed during merge conflict in KTO
Pull Request -
State: closed - Opened by dmahan93 2 months ago
#1294 - Add KTO Post-training example
Pull Request -
State: closed - Opened by dmahan93 2 months ago
#1293 - update args docs
Pull Request -
State: closed - Opened by Quentin-Anthony 2 months ago
#1292 - update neox arg docs
Pull Request -
State: closed - Opened by Quentin-Anthony 2 months ago
- 1 comment
#1291 - mamba flop calculations
Pull Request -
State: closed - Opened by jahatef 2 months ago
#1290 - Fix dataset bug
Pull Request -
State: closed - Opened by Quentin-Anthony 2 months ago
#1288 - Reinforce PR
Pull Request -
State: open - Opened by dmahan93 2 months ago
- 1 comment
#1287 - Remove the remaining two hanging wandb config fields
Pull Request -
State: closed - Opened by Quentin-Anthony 2 months ago
#1286 - Make monitors consistent
Pull Request -
State: closed - Opened by Quentin-Anthony 2 months ago
#1285 - Fix off by 1 error on masked tokens for RM training
Pull Request -
State: closed - Opened by dmahan93 2 months ago
#1284 - Update Comet integration instructions
Pull Request -
State: closed - Opened by Lothiraldan 2 months ago
#1283 - Automatically compute train_iters when train_epochs is specified.
Pull Request -
State: closed - Opened by AI-WAIFU 2 months ago
- 1 comment
#1282 - TransformerEngine Integration
Pull Request -
State: open - Opened by aurelion-source 2 months ago
- 3 comments
#1281 - Add model parallel group to reduce scatter
Pull Request -
State: closed - Opened by bclyang 2 months ago
#1280 - Do not fail when git is not installed
Pull Request -
State: closed - Opened by gcaillaut 3 months ago
- 1 comment
#1279 - fix the imports needed for comet integration
Pull Request -
State: closed - Opened by Quentin-Anthony 3 months ago
#1278 - fix gpt-j residual bias assumption
Pull Request -
State: closed - Opened by dmahan93 3 months ago
#1277 - Post training examples
Pull Request -
State: closed - Opened by dmahan93 3 months ago
- 3 comments
#1276 - Hotfix llama models
Pull Request -
State: closed - Opened by dmahan93 3 months ago
- 1 comment
#1275 - Add more informative checks for ZeRO incompatibility.
Pull Request -
State: closed - Opened by AI-WAIFU 3 months ago
#1274 - Fix weight decay module check
Pull Request -
State: closed - Opened by aurelion-source 3 months ago
#1273 - Expand Docstring
Pull Request -
State: closed - Opened by AI-WAIFU 3 months ago
#1272 - TE Import Hotfix
Pull Request -
State: closed - Opened by Quentin-Anthony 3 months ago
- 1 comment
#1271 - Hotfix Activation Typo
Pull Request -
State: closed - Opened by Quentin-Anthony 3 months ago
#1270 - Formatting and Fix Mamba Config
Pull Request -
State: closed - Opened by Quentin-Anthony 3 months ago
#1269 - LayerNorm Refactor
Pull Request -
State: closed - Opened by aurelion-source 3 months ago
- 3 comments
#1268 - Allow training without knowing num_iters
Issue -
State: closed - Opened by StellaAthena 3 months ago
- 1 comment
Labels: feature request
#1267 - Add assert to check for missing tokenizer_type in config. [#1231]
Pull Request -
State: closed - Opened by AI-WAIFU 3 months ago
- 1 comment
#1266 - Add initial ring flash attention support
Pull Request -
State: open - Opened by dmahan93 3 months ago
- 1 comment
#1265 - add Apex fused RMS norm
Pull Request -
State: closed - Opened by dmahan93 3 months ago
- 1 comment
#1264 - Frontier
Pull Request -
State: closed - Opened by jahatef 3 months ago
- 1 comment
#1263 - Improve performance of sequence parallel gather, scatter, and reduce
Pull Request -
State: closed - Opened by bclyang 3 months ago
#1262 - mamba fixes and cleaning
Pull Request -
State: closed - Opened by jahatef 3 months ago
- 2 comments
#1261 - Comet integration
Pull Request -
State: closed - Opened by jverre 3 months ago
- 2 comments
#1260 - Fix gather and reduce scatter ops on sequence dimension
Pull Request -
State: closed - Opened by bclyang 3 months ago
#1259 - Fix LayerNorm all reduce gradient hook
Pull Request -
State: closed - Opened by bclyang 4 months ago
- 1 comment
#1258 - bugfix: chat turns instead of repeating the conversation in preprocess_data_with_chat_template.py
Pull Request -
State: closed - Opened by dmahan93 4 months ago
#1257 - Megatron-LM style Sequence Parallel
Pull Request -
State: closed - Opened by haileyschoelkopf 4 months ago
- 3 comments
#1256 - GitHub actions fix
Pull Request -
State: closed - Opened by jahatef 4 months ago
#1255 - Add new cites
Pull Request -
State: closed - Opened by StellaAthena 4 months ago
- 1 comment
#1254 - How to Load Model from pytorch_model.bin into Trained Model for Text Generation?
Issue -
State: open - Opened by lieh1203 4 months ago
Labels: feature request
#1253 - what's the biggest dataset you've tried?
Issue -
State: open - Opened by exnx 4 months ago
Labels: bug
#1252 - too many .bin files for dataloader, crashed
Issue -
State: closed - Opened by exnx 5 months ago
Labels: bug
#1251 - Assertion Error when Setting pipe_parallel_size or model_parallel_size in GPT-NeoX
Issue -
State: open - Opened by lieh1203 5 months ago
- 3 comments
Labels: bug
#1250 - For nucleus sampling, top-p sampling appears to happen on the softmax-normalized top-k logits
Issue -
State: closed - Opened by j-frei 5 months ago
- 3 comments
Labels: bug
#1248 - batch_input and elapsed time per iteration suddenly slow down during model training
Issue -
State: open - Opened by Yuhanleeee 5 months ago
- 4 comments
Labels: bug
#1247 - Add hf llama to neox conversion
Pull Request -
State: closed - Opened by dmahan93 5 months ago
- 1 comment
#1246 - Add Reward Model training
Pull Request -
State: closed - Opened by dmahan93 5 months ago
#1245 - Conversion for CI from self-hosted hardware
Pull Request -
State: closed - Opened by jaimemcc-intel 5 months ago
#1244 - Add KTO training
Pull Request -
State: closed - Opened by dmahan93 5 months ago
#1243 - Replace unsafe `pyyaml` loader with `SafeLoader` (#2)
Pull Request -
State: closed - Opened by pixeeai 5 months ago
- 1 comment
#1242 - Add DPO training
Pull Request -
State: closed - Opened by dmahan93 5 months ago
- 1 comment
#1241 - Fix paper reference in init_functions.py
Pull Request -
State: closed - Opened by rasbt 5 months ago
- 2 comments
#1240 - SFT improvements (labeling fixes, different packing implementations)
Pull Request -
State: closed - Opened by dmahan93 5 months ago
#1239 - Add a chat data preprocessing script
Pull Request -
State: closed - Opened by dmahan93 5 months ago
#1238 - Pr1212
Pull Request -
State: closed - Opened by jahatef 5 months ago
#1237 - Add tensor parallelism for RWKV
Pull Request -
State: open - Opened by jahatef 5 months ago
#1236 - Ville dev
Pull Request -
State: closed - Opened by Vmjkom 5 months ago
- 1 comment
#1235 - Add Transformer Engine's version of RMSNorm and LayerNorm
Pull Request -
State: closed - Opened by lintangsutawika 6 months ago
- 2 comments
#1234 - fix python version and pytest install
Pull Request -
State: closed - Opened by jahatef 6 months ago
- 5 comments
#1233 - add workflow_dispatch to gh actions pr so we can run on command
Pull Request -
State: closed - Opened by jahatef 6 months ago
#1232 - init changes to README
Pull Request -
State: closed - Opened by jaimemcc-intel 6 months ago
#1231 - Cannot convert neox model to HF
Issue -
State: open - Opened by srivassid 6 months ago
- 2 comments
Labels: bug
#1230 - How to set the ffn hidden size parameter in gpt neox
Issue -
State: closed - Opened by IronMan-WangJinxi 6 months ago
- 2 comments
Labels: feature request
#1228 - Cannot perform inference, be it unconditional. input-file or interactive
Issue -
State: closed - Opened by srivassid 6 months ago
- 2 comments
Labels: bug
#1227 - The results of running eval show only 1 digit after decimal point for acc on all tested tasks
Issue -
State: closed - Opened by lernerjenny 6 months ago
- 2 comments
Labels: bug
#1226 - Add Torch Profiler Support
Pull Request -
State: closed - Opened by DayOfThePenguin 6 months ago
#1225 - Add lora support
Pull Request -
State: open - Opened by mkerin 6 months ago
#1224 - fixed fused_rope naming in JIT + Readme
Pull Request -
State: closed - Opened by R0n12 6 months ago
#1223 - Change python invocation syntax
Pull Request -
State: closed - Opened by jaimemcc-intel 6 months ago
#1222 - Small tidying
Pull Request -
State: closed - Opened by yang 6 months ago