Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / alibaba/Pai-Megatron-Patch issues and pull requests
#132 - Enhance MoE Upcycled and Fix qwen finetune issues
Pull Request -
State: closed - Opened by jerryli1981 9 months ago
- 1 comment
#131 - RuntimeError: The size of tensor a (120) must match the size of tensor b (119) at non-singleton dimension 2
Issue -
State: open - Opened by wangxiang2713 10 months ago
- 1 comment
#130 - Fix traning resume issue
Pull Request -
State: closed - Opened by jerryli1981 10 months ago
- 1 comment
#129 - Enhance MoE Upcycled and Fix Qwen hf & megatron alignment issues
Pull Request -
State: closed - Opened by jerryli1981 10 months ago
- 1 comment
#128 - [fixed] fix unmatched shape when PP_size > 1
Pull Request -
State: closed - Opened by Dylancer1998 10 months ago
- 2 comments
#127 - [fixed] unexpected eos_token concatenation
Pull Request -
State: closed - Opened by Dylancer1998 10 months ago
#126 - [feat]: support safetensors format in llama converter
Pull Request -
State: closed - Opened by Dylancer1998 10 months ago
- 1 comment
#125 - [Fix] support mixtral_8x7b grouped_gemm load state_dict
Pull Request -
State: closed - Opened by lxg2015 10 months ago
- 1 comment
#124 - No such file or directory: '/mtn/workplace/qwen-ckpts/qwen-14b-hf-to-megatron-tp2-pp1/release/mp_rank_00/model_optim_rng.pt
Issue -
State: open - Opened by jamestch 10 months ago
- 1 comment
#123 - Update ReadMe
Pull Request -
State: closed - Opened by jerryli1981 10 months ago
- 1 comment
#122 - Update MoE with Megatron Core
Pull Request -
State: closed - Opened by jerryli1981 10 months ago
- 1 comment
#121 - Qwen 7B continued pretraining: after the model finishes loading, it hangs at the dataloader stage, seq_length=0
Issue -
State: open - Opened by songyingxin 10 months ago
#120 - Replace break to continue when process layer
Pull Request -
State: closed - Opened by jinzhuer 11 months ago
- 1 comment
#119 - Got a bug during the pretrain of chatglm.
Issue -
State: closed - Opened by FeixLiu 11 months ago
- 1 comment
#118 - Inconsistency between Megatron and Hugging Face in Qwen 72B
Issue -
State: open - Opened by chaochen99 11 months ago
#117 - Add mixtral mcore implementation
Pull Request -
State: closed - Opened by jerryli1981 11 months ago
- 1 comment
#116 - Mixtral model convert no "convert_checkpoint_from_megatron_to_transformers" function
Issue -
State: closed - Opened by cdj0311 11 months ago
- 1 comment
#115 - Update git submodule for megatron version control
Pull Request -
State: closed - Opened by jerryli1981 11 months ago
- 1 comment
#114 - update llama2 ds train
Pull Request -
State: closed - Opened by MengLeebin 11 months ago
- 1 comment
#113 - Fix save moe checkpint
Pull Request -
State: closed - Opened by jerryli1981 11 months ago
- 1 comment
#112 - How do I resume training?
Issue -
State: open - Opened by LittleWhite0208 11 months ago
#111 - Fix save moe checkpint
Pull Request -
State: closed - Opened by jerryli1981 11 months ago
- 1 comment
#110 - Add hf2mcore convertor of mixtral model
Pull Request -
State: closed - Opened by jerryli1981 11 months ago
- 1 comment
#109 - Support expert tensor parallelism
Pull Request -
State: closed - Opened by jerryli1981 11 months ago
- 1 comment
#108 - FP8-precision weight conversion
Issue -
State: open - Opened by liuxm117 11 months ago
#107 - update readme
Pull Request -
State: closed - Opened by jerryli1981 11 months ago
- 1 comment
#106 - What is the corpus format for fine-tuning on multi-turn dialogue?
Issue -
State: open - Opened by zeq263 11 months ago
#105 - Add Mixtral MoE and Qwen-vl
Pull Request -
State: closed - Opened by jerryli1981 11 months ago
- 1 comment
#104 - TransformerLayer.__init__() got an unexpected keyword argument 'apply_query_key_layer_scaling'
Issue -
State: closed - Opened by AGI-player 11 months ago
- 2 comments
#103 - Pretrain megatron qwen-7b-tp4-pp1 fails with error: 151851 is not divisible by 4
Issue -
State: closed - Opened by KannbaraQRS 11 months ago
- 2 comments
#102 - Training baichuan2 13b raises KeyError: 'instruction'
Issue -
State: closed - Opened by joymcg 12 months ago
- 4 comments
#101 - either train-iters or train-samples should be provided
Issue -
State: closed - Opened by liuxm117 12 months ago
- 1 comment
#100 - ModuleNotFoundError: No module named 'megatron.data.gpt_dataset'
Issue -
State: closed - Opened by liuxm117 12 months ago
- 6 comments
#99 - add cvcuda_image_processing
Pull Request -
State: closed - Opened by jerryli1981 12 months ago
- 1 comment
#98 - Fix zero shot evaluate megatron issue
Pull Request -
State: closed - Opened by jerryli1981 12 months ago
- 1 comment
#97 - Fix zero shot evaluate megatron issue
Pull Request -
State: closed - Opened by jerryli1981 12 months ago
- 1 comment
#96 - Fix zero shot evaluate megatron issue
Pull Request -
State: closed - Opened by jerryli1981 12 months ago
- 1 comment
#95 - Finetune Qwen-72B
Issue -
State: closed - Opened by LittleWhite0208 12 months ago
- 1 comment
#94 - Support pipeline evaluation for deepseek
Pull Request -
State: closed - Opened by jerryli1981 12 months ago
- 1 comment
#93 - Support pipeline evaluation for baichuan2, llama2, mistral and qwen
Pull Request -
State: closed - Opened by jerryli1981 12 months ago
- 1 comment
#92 - Support pipeline evaluation for baichuan2, llama2, mistral and qwen
Pull Request -
State: closed - Opened by jerryli1981 12 months ago
- 1 comment
#91 - Fix Yi evaluation issue
Pull Request -
State: closed - Opened by jerryli1981 12 months ago
- 1 comment
#90 - Fix llama2 finetune issue
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#89 - RLHF is not supported by the Patch in Megatron-LLM but is adapted to deepspeed lib
Issue -
State: open - Opened by zhangzhenyu13 about 1 year ago
#88 - fix qwen-finetune-withqa when tensor parallel
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#87 - fix qwen-finetuen-withga bugs
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#86 - fix llava and qwen finetune with ga bugs
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#85 - fix llava, qwen-finetuen-withga bugs
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#84 - fix finetune with GA
Pull Request -
State: closed - Opened by lwmlyy about 1 year ago
#83 - Add Freeze for LLava
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#82 - fix finetunewGA
Pull Request -
State: closed - Opened by lwmlyy about 1 year ago
#81 - fix bugs for qwen and llama2
Pull Request -
State: closed - Opened by lwmlyy about 1 year ago
#80 - Qwen rope
Pull Request -
State: closed - Opened by lwmlyy about 1 year ago
#79 - fix loading model & optimizer
Pull Request -
State: closed - Opened by Renaissance25 about 1 year ago
- 1 comment
#78 - add GA finetune and deepseek&codellama rope
Pull Request -
State: closed - Opened by lwmlyy about 1 year ago
#77 - fix the name of Qwen model in readme
Pull Request -
State: closed - Opened by jhuang1207 about 1 year ago
- 1 comment
#76 - Add Qwen 72b finetune demo
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#75 - remove idx=0 in data module
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#74 - Fix data module remove idx=0
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#73 - Add Baichuan2 for 2304
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#72 - Add Yi Model
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#71 - bugfix_llava_import
Pull Request -
State: closed - Opened by tuofeilunhifi about 1 year ago
#70 - Fix dataset type with Pretrain-IdxMap
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#69 - Fix dataset type with Pretrain-IdxMap
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#68 - Fix dataset type with Pretrain-IdxMap
Pull Request -
State: closed - Opened by jerryli1981 about 1 year ago
- 1 comment
#67 - update codellama
Pull Request -
State: closed - Opened by lwmlyy about 1 year ago