Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / pjlab-sys4nlp/llama-moe issues and pull requests
#72 - Removing Deprecated Features & Translation
Pull Request -
State: open - Opened by DaizeDong about 2 months ago
#71 - Code on Expert Specialization Experiments
Issue -
State: closed - Opened by Tangkexian 3 months ago
- 2 comments
#70 - Can this be used as a means to speed up LLM inferencing?
Issue -
State: closed - Opened by bulaikexiansheng 4 months ago
- 2 comments
#69 - Any experiments about the load balancing loss?
Issue -
State: closed - Opened by exhyy 4 months ago
- 3 comments
#68 - Some questions on scripts and runtime
Issue -
State: open - Opened by kevin3567 5 months ago
- 1 comment
#67 - per_device_train_batch_size=1, but almost all of my GPU memory is still being used up?
Issue -
State: closed - Opened by rzr002 11 months ago
- 6 comments
#66 - Some weights of LlamaMoEForCausalLM were not initialized
Issue -
State: closed - Opened by Minami-su 11 months ago
- 5 comments
#65 - please update modeling_llama_moe_hf.py
Issue -
State: closed - Opened by Minami-su 11 months ago
- 5 comments
#64 - If I can't configure Slurm on a cluster, does that mean I can't use multi-node multi-GPU setups?
Issue -
State: closed - Opened by rzr002 11 months ago
- 1 comment
#63 - SFT: add sft contents
Pull Request -
State: closed - Opened by Spico197 11 months ago
#62 - Partition FFNs without downsizing them?
Issue -
State: closed - Opened by abhinand5 12 months ago
- 1 comment
#61 - [Major] HF Code Cleaning
Pull Request -
State: closed - Opened by DaizeDong 12 months ago
#60 - How can we start training MoE from llama13b?
Issue -
State: closed - Opened by xyjsjruiliu 12 months ago
- 1 comment
#59 - Update README.md
Pull Request -
State: closed - Opened by DaizeDong 12 months ago
#58 - About dataset prepare
Issue -
State: closed - Opened by bestfleer about 1 year ago
- 1 comment
#57 - Can you report the running time on hardware?
Issue -
State: open - Opened by qiuzh20 about 1 year ago
#56 - How to split "down" by "up" when using clustering to construct experts?
Issue -
State: closed - Opened by Attention-is-All-I-Need about 1 year ago
- 4 comments
#55 - How many llama models are used for constructing llama-moe? Is the MoE built from multiple llama models or from a single llama model?
Issue -
State: open - Opened by ZeyuTeng96 about 1 year ago
- 7 comments
#54 - ./scripts/expert_construction/split/run_split_random.sh: line 18: srun: command not found
Issue -
State: closed - Opened by 18600709862 about 1 year ago
- 4 comments
#53 - about cosine lr scheduler
Issue -
State: closed - Opened by ftgreat about 1 year ago
- 2 comments
#52 - Questions about capacity_factor, score_scale_factor
Issue -
State: closed - Opened by theblackcat102 about 1 year ago
- 2 comments
#51 - #Feature Request# Accelerated Deployment.
Issue -
State: open - Opened by Xingxiangrui about 1 year ago
- 2 comments
#50 - PUBLISH: update citation info
Pull Request -
State: closed - Opened by Spico197 about 1 year ago
#49 - Performance comparison between LLama-MOE and the original dense model.
Issue -
State: closed - Opened by DoubleVII about 1 year ago
- 2 comments
#48 - About Chinese performances. An inquiry about Chinese-language capability
Issue -
State: closed - Opened by WangRongsheng about 1 year ago
- 2 comments
#47 - Why a new trainer instead of the original one? Why was a new llama_lr_scheduling_trainer written, what does it do, and why not use the original trainer?
Issue -
State: closed - Opened by linyubupa about 1 year ago
- 1 comment
#46 - fix typo
Pull Request -
State: closed - Opened by Spico197 about 1 year ago
#45 - PUBLISH: upload technical report
Pull Request -
State: closed - Opened by Spico197 about 1 year ago
#44 - Moefication: Format Standardization (v8)
Pull Request -
State: closed - Opened by DaizeDong about 1 year ago
#43 - PUBLISH: filename refactors and readme preparation
Pull Request -
State: closed - Opened by Spico197 about 1 year ago
#42 - Update gate load vis, update readme
Pull Request -
State: closed - Opened by Spico197 about 1 year ago
- 1 comment
#41 - Moefication: README Update
Pull Request -
State: closed - Opened by DaizeDong about 1 year ago
#40 - Moefication: Aggregation Before Release [pre-commit]
Pull Request -
State: closed - Opened by DaizeDong about 1 year ago
#39 - CPT: add more args and exec scripts
Pull Request -
State: closed - Opened by Spico197 about 1 year ago
- 1 comment
#38 - CPT: add dynamic batch loading in sheared llama
Pull Request -
State: closed - Opened by Spico197 about 1 year ago
- 1 comment
#37 - CPT: add meta info when tokenization
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
- 1 comment
#36 - Moefication: Residual Gate Update [pre-commit]
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#35 - CPT: add eval support
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#34 - Add Residual CPT Pipeline
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#33 - Moefication: Residual MoE Config Update [pre-commit]
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#32 - Merge from Main
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#31 - CPT: fix tb logging, fix grad ckpting, faster data loading
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#30 - Moefication: Format Standardization (v4 v5) & Major Method Update
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
- 1 comment
#29 - Moefication: Format Standardization (v4) & Residual MoE Update
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#28 - Merge from main
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#27 - CPT: update `save_optim_limit`, update 13B scripts
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#26 - Moefication: Gradient split analysis
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#25 - add max_tokens and lr_scheduler resuming
Pull Request -
State: closed - Opened by tongjingqi over 1 year ago
#24 - Data clustering: add tokenization for clustered data, fix training & eval bugs in `moe_gates.py`
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#23 - Moefication: Switch Transformers Implementation
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#22 - Moefication: Gradient split (2/2) & MoE gate re-initialization update
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
- 1 comment
#21 - Moefication: Expert Load visualization & MoE gate evaluation bug fix
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#20 - Data clustering
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#19 - Moefication: Gradient split (1/2) & Get hidden features bug fix [pre-commit]
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#18 - Data Clustering: functional update for clustering and data loading
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
- 1 comment
#17 - CPT: add tensorboard support, update deepspeed config, add fpt resume exec file, update MoE model config for backward compatibility
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#16 - Moefication: Gradient split framework & eval model preparation... (pre-commit)
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#15 - Moefication: Expert select data bug fix (pre-commit)
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#14 - Moefication: Format Standardization (v3)
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#13 - Moefication: eval, split graph
Pull Request -
State: closed - Opened by JCruan519 over 1 year ago
#12 - Moefication: Format Standardization (v2)
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#11 - Moefication: SiwGLU Visualization
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#10 - CPT: update model dumping, update faster training impl, add job wechat notification
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#9 - How well does LLama perform after being converted directly to MoE?
Issue -
State: closed - Opened by YixinSong-e over 1 year ago
#8 - Moefication: Code merge & Visualization update
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#7 - Moefication: Visualization update
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#6 - Moefication: Random split update
Pull Request -
State: closed - Opened by DaizeDong over 1 year ago
#5 - Fix moefication integration
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#4 - Moefication integration
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#3 - fix tokenization problem in complex data files
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#2 - filtering instances without content column in `smoe.utils.tokenize`
Pull Request -
State: closed - Opened by Spico197 over 1 year ago
#1 - Update tokenization features
Pull Request -
State: closed - Opened by Spico197 over 1 year ago