Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / epfllm/megatron-llm issues and pull requests
#100 - Any plans to rebase the codebase to most recent Megatron-LM for MoE?
Issue - State: open - Opened by xingyaoww 9 months ago
#99 - Correctness when enabling FlashAttention + Sequence Parallel at the same time?
Issue - State: closed - Opened by xingyaoww 9 months ago - 2 comments
#98 - Multi nodes
Issue - State: closed - Opened by wodeqiansuihan 9 months ago - 1 comment
#97 - update conversion script to support codellama-70b
Pull Request - State: open - Opened by panx27 10 months ago
#96 - Support QWen?
Issue - State: open - Opened by Vincent131499 10 months ago - 1 comment
#95 - How to load from a saved intermediate checkpoint?
Issue - State: closed - Opened by jjzha 10 months ago - 3 comments
#94 - error: preprocess.py file error while working on custom data
Issue - State: open - Opened by toqeer618 10 months ago
#93 - Replace 1F1B with ZB-H1
Pull Request - State: open - Opened by QPHutu 10 months ago - 4 comments
#92 - LLaMA2-70B Inference Optimization
Issue - State: closed - Opened by RaymondHQR 11 months ago - 1 comment
#91 - LLaMa and Mistral 7B pretraining support
Issue - State: closed - Opened by StephennFernandes 11 months ago - 2 comments
#90 - added mistral docs
Pull Request - State: closed - Opened by AleHD 12 months ago
#89 - One question about the permute function code in permute_qkv.py
Issue - State: open - Opened by drxmy about 1 year ago - 2 comments
#88 - Add Mistral Model
Pull Request - State: closed - Opened by xingyaoww about 1 year ago
#87 - Evalonly and wbresume
Pull Request - State: closed - Opened by AleHD about 1 year ago
#86 - Fix missing position_ids argument when recompute_granularity == full
Pull Request - State: open - Opened by xingyaoww about 1 year ago
#85 - Typo Fixes in docs/
Pull Request - State: closed - Opened by tmsagarofficial about 1 year ago
#84 - Support specifying load_iters for checkpoint
Pull Request - State: closed - Opened by xingyaoww about 1 year ago - 2 comments
#83 - Use --no_new_tokens to stop adding built-in special tokens
Pull Request - State: closed - Opened by xingyaoww about 1 year ago - 4 comments
#82 - args.make_vocab_size_divisible_by set failed
Issue - State: closed - Opened by 13416157913 about 1 year ago - 1 comment
#81 - llama2-7B AssertionError: padded_vocab_size value from checkpoint (32000) is not equal to the input argument value (32256)
Issue - State: closed - Opened by 13416157913 about 1 year ago - 1 comment
#80 - RuntimeError: seq_len <= 2048 INTERNAL ASSERT FAILED
Issue - State: closed - Opened by 13416157913 about 1 year ago - 4 comments
#79 - Error when finetuning llama2-7B with --seq_length 4096
Issue - State: closed - Opened by 13416157913 about 1 year ago - 1 comment
#78 - Error when running llama2-7B finetuning
Issue - State: closed - Opened by 13416157913 about 1 year ago - 1 comment
#77 - Error when running llama2-7B finetuning
Issue - State: closed - Opened by 13416157913 about 1 year ago - 2 comments
#76 - Support for Mistral
Issue - State: closed - Opened by philschmid about 1 year ago - 7 comments
#75 - Add eval-only arguments and W&B resume options
Pull Request - State: closed - Opened by eric11eca about 1 year ago - 4 comments
Labels: enhancement
#74 - Update getting_started.md
Pull Request - State: closed - Opened by AleHD about 1 year ago
#73 - RuntimeError: mat1 and mat2 shapes cannot be multiplied (29056x22016 and 11008x4096)
Issue - State: closed - Opened by liuxm117 about 1 year ago - 2 comments
#72 - Add pointer to the shm-size docker arg to the docs
Pull Request - State: closed - Opened by kylematoba about 1 year ago
#71 - support falcon 180B
Issue - State: open - Opened by martinjaggi about 1 year ago
#70 - Getting started "shard" model not working
Issue - State: closed - Opened by philschmid about 1 year ago - 9 comments
#69 - [Saving a checkpoint takes a long time]
Issue - State: closed - Opened by mynewstart about 1 year ago - 2 comments
#68 - add support to finetune with use_distributed_optimizer
Pull Request - State: closed - Opened by dumpmemory about 1 year ago - 11 comments
#67 - [Megatron Base Version] Would you mind sharing the base version of Megatron?
Issue - State: closed - Opened by dumpmemory about 1 year ago - 7 comments
#66 - Tokens per second metric
Pull Request - State: closed - Opened by AleHD about 1 year ago
#65 - Feature Request: Can we directly use the huggingface dataset for training
Issue - State: closed - Opened by dumpmemory about 1 year ago - 4 comments
Labels: enhancement
#64 - [Swiglu] question about swiglu
Issue - State: closed - Opened by mynewstart about 1 year ago - 6 comments
Labels: question
#63 - Loading weights from hf conversion with different TP, PP settings
Issue - State: closed - Opened by binwang777 about 1 year ago - 14 comments
#62 - Fixed linear time increase observed when micro=1
Pull Request - State: closed - Opened by AleHD about 1 year ago - 2 comments
#61 - From custom hf source
Pull Request - State: closed - Opened by AleHD about 1 year ago
#60 - iteration-time increases linearly when micro_batch_size=1
Issue - State: closed - Opened by LlinWing about 1 year ago - 1 comment
#59 - Update hf_to_megatron.py
Pull Request - State: closed - Opened by AleHD about 1 year ago
#58 - Instruct loss scalar
Pull Request - State: closed - Opened by AleHD about 1 year ago - 1 comment
#57 - Better documentation
Pull Request - State: closed - Opened by AleHD about 1 year ago - 1 comment
#56 - Llama v1 import from HF support
Pull Request - State: closed - Opened by AleHD about 1 year ago - 3 comments
#55 - Metrics support
Pull Request - State: closed - Opened by AleHD about 1 year ago - 1 comment
#54 - Prepend bos token
Issue - State: closed - Opened by panx27 about 1 year ago - 1 comment
#53 - Make llama2 vocab size divisible by 128 by default
Pull Request - State: closed - Opened by AleHD about 1 year ago - 1 comment
#52 - Does 8x A100 80G suffice to finetune 70B llama2?
Issue - State: closed - Opened by james2v about 1 year ago - 5 comments
#51 - Add CodeLlama support
Pull Request - State: closed - Opened by andreaskoepf over 1 year ago - 6 comments