Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / epfllm/megatron-llm issues and pull requests
#50 - llama2 & vocabulary padding (making embedding layer sizes divisible by 128)
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 1 comment
#49 - convert huggingface model to megatron. "Only llama v2 available using huggingface"
Issue -
State: closed - Opened by uygnef over 1 year ago
- 1 comment
#48 - Update megatron2hf.py
Pull Request -
State: closed - Opened by AleHD over 1 year ago
#47 - Set max_position_embeddings to args.seq_length in LlamaConfig
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
#46 - added support to override special tokens when converting to huggingface
Pull Request -
State: closed - Opened by AleHD over 1 year ago
#45 - Fix GQA handling in convert_wqkv
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
#44 - Fix merge order in merge_meta_llama()
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
#43 - Convert LLama-30B to Megatron Error
Issue -
State: closed - Opened by dumpmemory over 1 year ago
- 1 comment
#42 - Update weights2megatron.py
Pull Request -
State: closed - Opened by dumpmemory over 1 year ago
- 3 comments
#41 - llama2 70B weights2megatron OOM fix
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
- 1 comment
#40 - Instruction tuning
Pull Request -
State: closed - Opened by AleHD over 1 year ago
#39 - Add update_to_hub docs
Issue -
State: closed - Opened by AleHD over 1 year ago
#38 - Fix minor typos in push_to_hub.py
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
#37 - Add model export utility push_to_hub.py
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
#36 - Add Megatron to Huggingface conversion for Falcon models
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
#35 - Fixing invalid name in Falcon's megatron weights
Pull Request -
State: closed - Opened by malteos over 1 year ago
- 4 comments
#34 - Improve NaN detection by checking `grad_norm`
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
#33 - NaN detection possibly ineffective
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
#32 - how to convert baichuan-13b into megatron weights?
Issue -
State: closed - Opened by wwngh1233 over 1 year ago
- 3 comments
#31 - OpenAssistant training changes [not intended for merging]
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
- 2 comments
#30 - Nice-to-have training features
Issue -
State: open - Opened by andreaskoepf over 1 year ago
Labels: enhancement
#29 - more appropriate --chunk_size for tools/preprocess_data.py
Pull Request -
State: closed - Opened by panx27 over 1 year ago
- 1 comment
#28 - Add falcon support in megatron2hf.py
Issue -
State: closed - Opened by AleHD over 1 year ago
- 4 comments
Labels: enhancement
#27 - Weight conversion testing and other features
Pull Request -
State: closed - Opened by AleHD over 1 year ago
- 1 comment
Labels: enhancement
#26 - Add linear RoPE scaling & arbitary position_ids
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
- 1 comment
#25 - Update convert_llama2hf.py with latest version from HF transformers
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
#24 - add GQA(MQA) support in megatron2hf conversion
Issue -
State: closed - Opened by Olivia-fsm over 1 year ago
Labels: enhancement
#23 - Passed position_ids are ignored for `PositionEmbeddingType.rotary`
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 1 comment
#22 - iteration-time increases linearly (for TP=2, PP=1 & TP=1, PP=2)
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 8 comments
#21 - Add LIMA dropout
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
#20 - Add llama2 to usage help string of weights2megatron.sh
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
#19 - Generate HuggingFace tokenizer configuration as part of megatron2hf.py (weight conversion)
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 2 comments
#18 - cuda misaligned address in pretrain llama2 7B
Issue -
State: closed - Opened by pwq1989 over 1 year ago
- 2 comments
#17 - convert_llama2hf.py should be replaced with newer version
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 3 comments
#16 - Fix wandb logging of validation metrics
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
- 1 comment
#15 - Add llama2 to usage help string
Pull Request -
State: closed - Opened by andreaskoepf over 1 year ago
- 2 comments
#14 - Error during merge of sharded checkpoint: 'TransformerLanguageModel' object has no attribute 'lm_head'
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 1 comment
#13 - Documentation
Pull Request -
State: closed - Opened by AleHD over 1 year ago
#12 - Documentation
Pull Request -
State: closed - Opened by AleHD over 1 year ago
#11 - The training speed is two times slower than the Megatron-LM and Megatron-Deepspeed
Issue -
State: closed - Opened by zhao1iang over 1 year ago
- 5 comments
#10 - HF LLaMA -> megatron weight
Issue -
State: closed - Opened by dumpmemory over 1 year ago
- 5 comments
#9 - Validation metrics are not logged to wandb
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 1 comment
#8 - Can you send me the complete parameters related to training llama2 using finetune. py?
Issue -
State: closed - Opened by brewswang over 1 year ago
- 1 comment
#7 - Positional embeddings optimization
Pull Request -
State: closed - Opened by AleHD over 1 year ago
#6 - Minimum number of 80GB GPUs to train Falcon-40B or Llama2-70B?
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 2 comments
#5 - finetune.py: error: unrecognized arguments: --use_multiquery_attn
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 2 comments
#4 - add RoPE position interpolation
Issue -
State: closed - Opened by martinjaggi over 1 year ago
- 2 comments
Labels: enhancement
#3 - How to convert Megatron -> Huggingface weights?
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 2 comments
#2 - Falcon unusually high loss
Issue -
State: closed - Opened by andreaskoepf over 1 year ago
- 3 comments
#1 - Will you be adding a script to pretrain LLAMA2?
Issue -
State: closed - Opened by erastogi over 1 year ago
- 4 comments