Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / InternLM/InternEvo issues and pull requests
#295 - fix hf internlm nan bug
Pull Request -
State: closed - Opened by sallyjunjun about 2 months ago
#294 - feat(modeling): support qwen2
Pull Request -
State: open - Opened by SolenoidWGT about 2 months ago
#293 - feat(trainer_builder): refactor trainer_builder and preserve optional callable for model dispatch function
Pull Request -
State: closed - Opened by zigzagcai 2 months ago
- 1 comment
#292 - [QA] Replace string comparisons in the code with enum-type comparisons
Issue -
State: open - Opened by sallyjunjun 2 months ago
Labels: question
#291 - [QA] Review the code logic related to load_hf_llama_pretrained_weights and remove unused code
Issue -
State: open - Opened by sallyjunjun 2 months ago
Labels: question
#290 - fix(data): fix the unpack data
Pull Request -
State: closed - Opened by yingtongxiong 2 months ago
#289 - fix(moe): change moe norm reduced group
Pull Request -
State: closed - Opened by blankde 2 months ago
- 1 comment
#288 - Feat(*):loong train
Pull Request -
State: open - Opened by huangting4201 2 months ago
#287 - add isp support of huggingface model
Pull Request -
State: closed - Opened by sallyjunjun 2 months ago
#286 - [Feature] how to finetuning lora
Issue -
State: open - Opened by wen020 2 months ago
- 1 comment
Labels: enhancement
#285 - [Bug] RuntimeError: [3] is setting up NCCL communicator and retrieving ncclUniqueId from [0] via c10d key-value store by key '0', but store->get('0') got error: Socket Timeout
Issue -
State: open - Opened by kkscilife 2 months ago
Labels: bug
#284 - Hf isp support
Pull Request -
State: closed - Opened by sallyjunjun 2 months ago
#283 - feat(varlen): support varlen training for huggingface models
Pull Request -
State: closed - Opened by zigzagcai 2 months ago
- 5 comments
#282 - feat(pipeline parallel): add zero bubble pipeline parallelism (ZB-H1)
Pull Request -
State: open - Opened by li126com 2 months ago
#281 - feat(setup and docs): add one-click setup and refine docs
Pull Request -
State: closed - Opened by zigzagcai 3 months ago
#280 - fix(dipu): support newest internevo with deeplink
Pull Request -
State: open - Opened by POI-WX 3 months ago
#279 - [Bug] Only GShard-mode MoE models are supported for conversion to huggingface
Issue -
State: open - Opened by Cerberous 3 months ago
Labels: bug
#278 - [Bug] NaN appears when training in bf16 and inferring in fp16
Issue -
State: open - Opened by Cerberous 3 months ago
Labels: bug
#277 - fix(huggingface): fix huggingface dataloader when using some huggingface third-party tokenizers
Pull Request -
State: closed - Opened by zigzagcai 3 months ago
- 1 comment
#276 - Fix(ckpt): fix llama2 loading function
Pull Request -
State: closed - Opened by zigzagcai 3 months ago
#275 - feat(checkpoint): TP recomputation communication optimization
Pull Request -
State: open - Opened by li126com 3 months ago
#274 - Patch 1
Pull Request -
State: closed - Opened by xinhao-luo 3 months ago
#273 - Patch 2
Pull Request -
State: closed - Opened by xinhao-luo 3 months ago
#272 - Create modeling_moe_mistral.py
Pull Request -
State: closed - Opened by xinhao-luo 3 months ago
#271 - feat(moe): impl huggingface internlm-moe
Pull Request -
State: closed - Opened by blankde 3 months ago
#270 - Feat/refactor 2d sequence parallel code
Pull Request -
State: closed - Opened by huangting4201 3 months ago
#269 - [Bug] TFLOPS calculation is inaccurate
Issue -
State: open - Opened by Cerberous 3 months ago
- 2 comments
Labels: bug
#268 - [QA] Can InternEvo load pretrained llama2 parameters?
Issue -
State: open - Opened by JunZhan2000 3 months ago
- 4 comments
Labels: question
#267 - [QA] Does Internevo support tied_embedding?
Issue -
State: open - Opened by Cerberous 3 months ago
- 3 comments
Labels: question
#266 - [Bug] After training with internevo and converting to an hf model, testing with opencompass sometimes produces NaN
Issue -
State: open - Opened by Cerberous 3 months ago
- 2 comments
Labels: bug
#265 - feat(doc): add MOE installation
Pull Request -
State: closed - Opened by li126com 3 months ago
#264 - [Bug] tflop is extremely low when training with MoE
Issue -
State: open - Opened by Cerberous 3 months ago
- 4 comments
Labels: bug
#263 - feat(refactor): add simple and unified trainer.fit() interface
Pull Request -
State: closed - Opened by zigzagcai 3 months ago
#262 - [Bug] There seems to be no script for converting internevo MoE weights to the huggingface version?
Issue -
State: open - Opened by Cerberous 3 months ago
- 8 comments
Labels: bug
#261 - feat(*): re-impl embedding/head of isp version
Pull Request -
State: closed - Opened by mwiacx 3 months ago
#260 - feat(tools): update InternEvo style ckpt inference tool.
Pull Request -
State: closed - Opened by MCplayerFromPRC 3 months ago
Labels: enhancement
#259 - [QA] After pretraining llama2-7b with InternLM, how do I run inference with the trained model?
Issue -
State: closed - Opened by wen020 4 months ago
- 1 comment
Labels: question
#258 - feat(parallel): support hybrid cluster training
Pull Request -
State: open - Opened by SolenoidWGT 4 months ago
#257 - fix(config): fix llama2 config
Pull Request -
State: closed - Opened by SolenoidWGT 4 months ago
Labels: bug
#256 - Feat/add zeropp
Pull Request -
State: open - Opened by chrysantd 4 months ago
#255 - feat(data): load meta files with shared memory
Pull Request -
State: closed - Opened by li126com 4 months ago
Labels: enhancement
#254 - [QA] The llama2-7b model config appears inconsistent with the original
Issue -
State: closed - Opened by CokeDong 4 months ago
Labels: question
#253 - [QA] Does MoE in the Internevo framework support expert parallel?
Issue -
State: open - Opened by Cerberous 4 months ago
- 1 comment
Labels: question
#252 - test(*): add backward timing
Pull Request -
State: closed - Opened by yingtongxiong 4 months ago
#251 - [QA] I have already trained a 7B model with Internevo; how do I run MoE with these internevo weights?
Issue -
State: open - Opened by Cerberous 4 months ago
- 1 comment
Labels: question
#250 - [Bug] Loading meta data uses too much memory, and memory keeps growing during training
Issue -
State: closed - Opened by Cerberous 4 months ago
- 7 comments
Labels: bug
#249 - fix: RewardModelLinear bcast process_group
Pull Request -
State: closed - Opened by KimmiShi 4 months ago
- 1 comment
Labels: bug
#248 - feat(all2all): add single & tutel all2all
Pull Request -
State: closed - Opened by SolenoidWGT 4 months ago
#247 - test(*):add seq and e2e modifications
Pull Request -
State: closed - Opened by yingtongxiong 4 months ago
#246 - Feat/support sliding window attn selective checkpoint
Pull Request -
State: closed - Opened by huangting4201 4 months ago
#245 - [Bug] AssertionError: Only flash cross entropy support parallel_output
Issue -
State: open - Opened by wen020 4 months ago
- 1 comment
Labels: bug
#244 - feat(huggingface): native support for huggingface model and dataset
Pull Request -
State: closed - Opened by sallyjunjun 4 months ago
#243 - feat(simulator): support parallel cost simulator for internevo
Pull Request -
State: open - Opened by SolenoidWGT 4 months ago
#242 - feat(model): support sliding window in 2D parallelism
Pull Request -
State: closed - Opened by yingtongxiong 4 months ago
#241 - Add tool for data cleaning
Issue -
State: open - Opened by www516717402 4 months ago
Labels: question
#240 - feat(moe): set epsize by config
Pull Request -
State: open - Opened by blankde 4 months ago
- 1 comment
#239 - feat(config): add 1.8B config for 16 experts
Pull Request -
State: closed - Opened by blankde 4 months ago
#238 - feat(all2all): support all2all inner overlap
Pull Request -
State: closed - Opened by yingtongxiong 4 months ago
#237 - Feat (optimizer): add new optimizer for splitting zero tensor
Pull Request -
State: closed - Opened by li126com 4 months ago
- 1 comment
#236 - fix(full_kv): fix full_kv bugs
Pull Request -
State: closed - Opened by yingtongxiong 4 months ago
#235 - [Bug] Error when converting a model trained on npu to hf format
Issue -
State: closed - Opened by forest-sys 5 months ago
- 3 comments
Labels: bug
#234 - Fix(mha,linear): fix norm_head and mha inference
Pull Request -
State: closed - Opened by KimmiShi 5 months ago
- 1 comment
#233 - fix(moe): fix interface for megablock
Pull Request -
State: closed - Opened by blankde 5 months ago
- 1 comment
#232 - feat(zigzag_full_kv): support head overlap
Pull Request -
State: closed - Opened by yingtongxiong 5 months ago
#231 - fix(dummy_dataset.py): fix random dataset token value exceed vocab size
Pull Request -
State: closed - Opened by huangting4201 5 months ago
#230 - feat(inference): support generation using trainer
Pull Request -
State: closed - Opened by KimmiShi 5 months ago
- 1 comment
Labels: enhancement
#229 - feat(attention): add full_kv zigzag
Pull Request -
State: closed - Opened by mwiacx 5 months ago
#228 - feat(ring_flash_attn): support fa selective checkpoint ++
Pull Request -
State: closed - Opened by huangting4201 5 months ago
#227 - fix(QA): add real tgs to train_CI
Pull Request -
State: closed - Opened by li126com 5 months ago
#226 - Add ring zigzag 2d
Pull Request -
State: closed - Opened by QiaolingChen00 5 months ago
#225 - add ring zigzag 2d
Pull Request -
State: closed - Opened by QiaolingChen00 5 months ago
#224 - feat(GQA): support the sequence parallel when sp size > kv head
Pull Request -
State: closed - Opened by yingtongxiong 5 months ago
#223 - [Bug] TypeError: record_current_batch_training_metrics() missing 1 required positional argument: 'very_begining_time'
Issue -
State: closed - Opened by kkscilife 5 months ago
- 1 comment
Labels: bug
#222 - [Bug] No module named 'internlm.model.ops.fusion_ops_import_helper'
Issue -
State: closed - Opened by kkscilife 5 months ago
- 1 comment
Labels: bug
#221 - Feat (optimizer): split zero tensor
Pull Request -
State: closed - Opened by li126com 5 months ago
#220 - feat(model): import qkvpacked rotary_emb performance
Pull Request -
State: closed - Opened by mwiacx 5 months ago
#219 - fix(model): fix model forward when checkpoint=true
Pull Request -
State: closed - Opened by mwiacx 5 months ago
- 1 comment
#218 - fix(pipeline_scheduler): fix recv_obj_meta args
Pull Request -
State: closed - Opened by mwiacx 5 months ago
#217 - fix(pipeline_scheduler): fix interleaved load_micro_batch
Pull Request -
State: closed - Opened by mwiacx 5 months ago
#216 - fix(ci): fix command and branch
Pull Request -
State: closed - Opened by kkscilife 5 months ago
#215 - [Bug] IndexError: too many indices for tensor of dimension 2
Issue -
State: closed - Opened by kkscilife 5 months ago
- 1 comment
Labels: bug
#214 - remove flash_attn related operator dependency
Pull Request -
State: closed - Opened by sallyjunjun 5 months ago
#213 - Feat (doc): add torch_npu installing
Pull Request -
State: closed - Opened by li126com 5 months ago
#212 - [Bug] Error when fine-tuning internLM on Ascend 910
Issue -
State: open - Opened by rourouZ 6 months ago
- 3 comments
Labels: bug
#211 - [Bug] The task keeps hanging in launch stage
Issue -
State: closed - Opened by kkscilife 6 months ago
- 1 comment
Labels: bug
#210 - fix test_pipeline
Pull Request -
State: closed - Opened by sallyjunjun 6 months ago
#209 - feat(op): support varlen npu flash attention
Pull Request -
State: closed - Opened by SolenoidWGT 6 months ago
Labels: enhancement
#208 - [Bug] Training error: indexSelectLargeIndex: block: [604,0,0], thread: [47,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Issue -
State: closed - Opened by canghaiyunfan 6 months ago
- 1 comment
Labels: bug
#207 - fix(tgs all): set very_beginning_time
Pull Request -
State: closed - Opened by li126com 6 months ago
#206 - Fix(QA): fix loading ckpt and add launcher setting for test loss
Pull Request -
State: closed - Opened by li126com 6 months ago
#205 - feat(singleton): ensure singleton thread safety and no performance degradation
Pull Request -
State: closed - Opened by zigzagcai 6 months ago
- 1 comment
#204 - fix(fix): fix small bug
Pull Request -
State: closed - Opened by SolenoidWGT 6 months ago
#203 - Feat(RMSNorm NPU): Add RMSNormNPU and CI
Pull Request -
State: closed - Opened by li126com 6 months ago
#200 - Fix(docker): update docker image and dockerfile for new version
Pull Request -
State: closed - Opened by li126com 6 months ago
#199 - Fix(load ckpt and QA): fix tran_CI and loading ckpt for new format
Pull Request -
State: closed - Opened by li126com 6 months ago
#188 - feat(npu): support npu fused adamw
Pull Request -
State: closed - Opened by SolenoidWGT 6 months ago
#187 - feat(npu): support npu fusion rotary mul
Pull Request -
State: closed - Opened by SolenoidWGT 6 months ago
#185 - fix test model error
Pull Request -
State: closed - Opened by sallyjunjun 6 months ago
#182 - [Bug] internlm docker image issue
Issue -
State: closed - Opened by marks221b 6 months ago
- 2 comments
Labels: bug
#179 - fix get_accelerator error
Pull Request -
State: closed - Opened by sallyjunjun 6 months ago