Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / TencentARC/LLaMA-Pro issues and pull requests
#33 - Can this method be extended to ViT-style vision encoders?
Issue - State: open - Opened by lucasjinreal 22 days ago
#32 - Can Qwen2-7B be trained with this method?
Issue - State: open - Opened by jqtian123 28 days ago
#31 - On the paper's general-ability benchmarks showing almost no degradation, with some even improving
Issue - State: closed - Opened by bestpredicts 3 months ago
#30 - Question about the running procedure
Issue - State: open - Opened by GOOD-N-LCM 3 months ago - 4 comments
#29 - Loss converges after 10B training tokens and stops decreasing
Issue - State: closed - Opened by bestpredicts 3 months ago - 1 comment
#28 - On zero initialization and the placement of the expanded layers
Issue - State: open - Opened by ouyanxi1125 3 months ago - 4 comments
#27 - How to train the 8B model with finetune_cosmopedia.sh
Issue - State: open - Opened by RuipingWang1986 4 months ago
#26 - How to build the dataset for continued pretraining with the finetune_cosmopedia.sh script
Issue - State: open - Opened by RuipingWang1986 4 months ago - 2 comments
#25 - Thanks for the wonderful project! Why do I always get results showing an apparent loss of the original ability?
Issue - State: open - Opened by hzgdeerHo 4 months ago - 8 comments
#24 - Question about the experiments in the paper
Issue - State: closed - Opened by ChrisXULC 5 months ago - 1 comment
#23 - Training on arbitrary data
Issue - State: open - Opened by HelloWorldLTY 5 months ago - 2 comments
#22 - Pretrain code of Mistral-Pro-8B-v0.1
Issue - State: open - Opened by shawnricecake 6 months ago - 1 comment
#21 - Do we need to freeze the embedding layer and the lm_head as well during LLaMA-Pro-style training?
Issue - State: closed - Opened by shamanez 6 months ago - 2 comments
#20 - Question about GPU memory requirements for training
Issue - State: open - Opened by denghj3 6 months ago - 4 comments
#19 - Comparison with PEFT
Issue - State: open - Opened by LaVieEnRose365 6 months ago - 1 comment
#18 - Do larger models need more blocks?
Issue - State: open - Opened by PoseidomWong 6 months ago - 1 comment
#17 - add `pip install fire` to requirements.txt
Issue - State: open - Opened by r4dm 6 months ago - 1 comment
#16 - Do the newly added transformer layers share parameters with the preceding layer?
Issue - State: closed - Opened by CharlinChen 7 months ago - 3 comments
#15 - Is the llama-pro implementation in LLaMA Factory incorrect?
Issue - State: closed - Opened by HuXinjing 7 months ago - 2 comments
#14 - What are the advantages compared to LoRA?
Issue - State: open - Opened by xiaozhu1106 7 months ago - 1 comment
#13 - Questions about incremental pretraining
Issue - State: closed - Opened by zhuxiaobin 7 months ago - 6 comments
#12 - Issue with Model Saving After Layer Expansion: Removed Shared Tensors
Issue - State: closed - Opened by yumingfan-0219 7 months ago - 2 comments
#11 - Guide to run the code
Issue - State: open - Opened by Abolfazl-kr 7 months ago - 2 comments
#10 - Hello, a question about post-pretraining
Issue - State: closed - Opened by ray075hl 8 months ago - 8 comments
#9 - Question regarding the difference between llama-pro and the regular llama
Issue - State: open - Opened by WUHU-G 8 months ago - 8 comments
#8 - How to load the new model weights
Issue - State: open - Opened by khalil-Hennara 8 months ago - 1 comment
#7 - Should I freeze norm.weight?
Issue - State: open - Opened by metterian 8 months ago - 1 comment
#6 - Full code to continue pre-training
Issue - State: open - Opened by Abolfazl-kr 8 months ago - 1 comment
#5 - Question about Llama-7B and Llama-7B-Pro comparison
Issue - State: open - Opened by ryusaeba 8 months ago - 2 comments
#4 - Arxiv Data
Issue - State: open - Opened by ZhengTang1120 8 months ago - 2 comments
#3 - How do we fine-tune the expanded blocks?
Issue - State: open - Opened by win10ogod 8 months ago - 5 comments
#2 - Code for training llama pro?
Issue - State: open - Opened by yhyu13 9 months ago - 8 comments
#1 - Question about Table 7 in the paper
Issue - State: closed - Opened by XiaoYee 9 months ago - 5 comments