Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / epfllm/megatron-llm issues and pull requests

#48 - Update megatron2hf.py

Pull Request - State: closed - Opened by AleHD over 1 year ago

#47 - Set max_position_embeddings to args.seq_length in LlamaConfig

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago

#46 - added support to override special tokens when converting to huggingface

Pull Request - State: closed - Opened by AleHD over 1 year ago

#45 - Fix GQA handling in convert_wqkv

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago

#44 - Fix merge order in merge_meta_llama()

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago

#43 - Convert LLama-30B to Megatron Error

Issue - State: closed - Opened by dumpmemory over 1 year ago - 1 comment

#42 - Update weights2megatron.py

Pull Request - State: closed - Opened by dumpmemory over 1 year ago - 3 comments

#41 - llama2 70B weights2megatron OOM fix

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago - 1 comment

#40 - Instruction tuning

Pull Request - State: closed - Opened by AleHD over 1 year ago

#39 - Add update_to_hub docs

Issue - State: closed - Opened by AleHD over 1 year ago

#38 - Fix minor typos in push_to_hub.py

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago

#37 - Add model export utility push_to_hub.py

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago

#36 - Add Megatron to Huggingface conversion for Falcon models

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago

#35 - Fixing invalid name in Falcon's megatron weights

Pull Request - State: closed - Opened by malteos over 1 year ago - 4 comments

#34 - Improve NaN detection by checking `grad_norm`

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago

#33 - NaN detection possibly ineffective

Issue - State: closed - Opened by andreaskoepf over 1 year ago

#32 - how to convert baichuan-13b into megatron weights?

Issue - State: closed - Opened by wwngh1233 over 1 year ago - 3 comments

#31 - OpenAssistant training changes [not intended for merging]

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago - 2 comments

#30 - Nice-to-have training features

Issue - State: open - Opened by andreaskoepf over 1 year ago
Labels: enhancement

#29 - more appropriate --chunk_size for tools/preprocess_data.py

Pull Request - State: closed - Opened by panx27 over 1 year ago - 1 comment

#28 - Add falcon support in megatron2hf.py

Issue - State: closed - Opened by AleHD over 1 year ago - 4 comments
Labels: enhancement

#27 - Weight conversion testing and other features

Pull Request - State: closed - Opened by AleHD over 1 year ago - 1 comment
Labels: enhancement

#26 - Add linear RoPE scaling & arbitrary position_ids

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago - 1 comment

#25 - Update convert_llama2hf.py with latest version from HF transformers

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago

#24 - add GQA(MQA) support in megatron2hf conversion

Issue - State: closed - Opened by Olivia-fsm over 1 year ago
Labels: enhancement

#23 - Passed position_ids are ignored for `PositionEmbeddingType.rotary`

Issue - State: closed - Opened by andreaskoepf over 1 year ago - 1 comment

#22 - iteration-time increases linearly (for TP=2, PP=1 & TP=1, PP=2)

Issue - State: closed - Opened by andreaskoepf over 1 year ago - 8 comments

#21 - Add LIMA dropout

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago

#20 - Add llama2 to usage help string of weights2megatron.sh

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago

#18 - cuda misaligned address in pretrain llama2 7B

Issue - State: closed - Opened by pwq1989 over 1 year ago - 2 comments

#17 - convert_llama2hf.py should be replaced with newer version

Issue - State: closed - Opened by andreaskoepf over 1 year ago - 3 comments

#16 - Fix wandb logging of validation metrics

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago - 1 comment

#15 - Add llama2 to usage help string

Pull Request - State: closed - Opened by andreaskoepf over 1 year ago - 2 comments

#13 - Documentation

Pull Request - State: closed - Opened by AleHD over 1 year ago

#12 - Documentation

Pull Request - State: closed - Opened by AleHD over 1 year ago

#10 - HF LLaMA -> megatron weight

Issue - State: closed - Opened by dumpmemory over 1 year ago - 5 comments

#9 - Validation metrics are not logged to wandb

Issue - State: closed - Opened by andreaskoepf over 1 year ago - 1 comment

#7 - Positional embeddings optimization

Pull Request - State: closed - Opened by AleHD over 1 year ago

#6 - Minimum number of 80GB GPUs to train Falcon-40B or Llama2-70B?

Issue - State: closed - Opened by andreaskoepf over 1 year ago - 2 comments

#5 - finetune.py: error: unrecognized arguments: --use_multiquery_attn

Issue - State: closed - Opened by andreaskoepf over 1 year ago - 2 comments

#4 - add RoPE position interpolation

Issue - State: closed - Opened by martinjaggi over 1 year ago - 2 comments
Labels: enhancement

#3 - How to convert Megatron -> Huggingface weights?

Issue - State: closed - Opened by andreaskoepf over 1 year ago - 2 comments

#2 - Falcon unusually high loss

Issue - State: closed - Opened by andreaskoepf over 1 year ago - 3 comments

#1 - Will you be adding a script to pretrain LLAMA2?

Issue - State: closed - Opened by erastogi over 1 year ago - 4 comments