Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / openmoss/collie issues and pull requests
#190 - fix(dataset): concat input&output then tokenize
Pull Request -
State: closed - Opened by Anti-Entrophic 3 months ago
- 2 comments
#189 - feat(trainer):added the dummy_train_loop function to check if the current configuration can run properly
Pull Request -
State: open - Opened by Zhyw0 5 months ago
#188 - [example] an init train example for colo
Pull Request -
State: closed - Opened by KaiLv69 6 months ago
#187 - [example] an init train example for colo
Pull Request -
State: closed - Opened by KaiLv69 6 months ago
#186 - perf(trainer):optimize the save logic of the trainer
Pull Request -
State: closed - Opened by Zhyw0 6 months ago
- 1 comment
#185 - modify decodemetric class
Pull Request -
State: closed - Opened by LinqiY 6 months ago
#184 - fix: fix dataset w/o labels
Pull Request -
State: closed - Opened by KaiLv69 6 months ago
#183 - refactor:change eval_steps when training
Pull Request -
State: closed - Opened by yyyx-w 6 months ago
#182 - fix(model.py): align_preciion_in_norm_layer
Pull Request -
State: open - Opened by gyt1145028706 6 months ago
#181 - How should the code be modified to support more models, such as InstructBLIP?
Issue -
State: open - Opened by Listever 6 months ago
- 1 comment
#180 - feat: Add MOSS2 tp&pp model and sparse attention kernel(in triton)
Pull Request -
State: open - Opened by Li-dongyang 7 months ago
- 1 comment
#179 - perf(trainer): add save config and tokenizer when save model
Pull Request -
State: closed - Opened by zsj555 7 months ago
- 2 comments
#178 - fix(dist_utils): fix port conflict in setup_distribution
Pull Request -
State: closed - Opened by gyt1145028706 7 months ago
- 3 comments
#177 - fix(model): add load for safetensor
Pull Request -
State: closed - Opened by ti-mm 7 months ago
- 1 comment
#176 - refactor(example): refactor examples
Pull Request -
State: closed - Opened by KaiLv69 7 months ago
#175 - perf(config): set ds bs for hf models
Pull Request -
State: closed - Opened by KaiLv69 7 months ago
#174 - Add Mistral tp&pp model
Pull Request -
State: open - Opened by LinqiY 7 months ago
- 2 comments
#173 - fix:CollieDatasetForTraining class not support list when loading data
Pull Request -
State: closed - Opened by Zhyw0 7 months ago
#172 - fix:CollieDatasetForTraining class not support list when loading data
Pull Request -
State: closed - Opened by Zhyw0 7 months ago
#171 - fix:CollieDatasetForTraining class not supporting list when loading data
Pull Request -
State: closed - Opened by Zhyw0 7 months ago
#170 - perf(monitor): add init mode for monitor
Pull Request -
State: closed - Opened by KaiLv69 7 months ago
#169 - Add an example for chat fine-tuning
Pull Request -
State: closed - Opened by KYLN24 7 months ago
#168 - fix: ColliePipelineEngine has no attribute using_bf16_optimizer
Pull Request -
State: closed - Opened by KYLN24 7 months ago
- 1 comment
#167 - fix(dist_utils): fix port conflict in setup_distribution
Pull Request -
State: closed - Opened by gyt1145028706 7 months ago
#166 - fix(trainer): fix peft config save
Pull Request -
State: closed - Opened by KaiLv69 7 months ago
#165 - Update README.md
Pull Request -
State: closed - Opened by xpqiu 7 months ago
- 1 comment
#164 - fix(adalomo): fix adalomo without zero3
Pull Request -
State: closed - Opened by KaiLv69 7 months ago
#163 - No log information
Issue -
State: closed - Opened by BeastyZ 8 months ago
- 2 comments
#162 - feat(dataset): add multi-turn dataset with template
Pull Request -
State: closed - Opened by KaiLv69 8 months ago
- 1 comment
#161 - Add Qwen2 tp&pp model
Pull Request -
State: open - Opened by Anti-Entrophic 8 months ago
- 1 comment
#160 - add load Internlm2 with safetensors
Pull Request -
State: closed - Opened by KaiLv69 8 months ago
#159 - Error when saving peft_config in trainer.py
Issue -
State: closed - Opened by Mr-nnng 8 months ago
- 2 comments
#158 - Add qwen2 tp&pp model
Pull Request -
State: closed - Opened by Anti-Entrophic 8 months ago
- 2 comments
#157 - Request: support weights in safetensors format
Issue -
State: closed - Opened by WillQvQ 8 months ago
- 1 comment
#156 - Add Internlm2
Pull Request -
State: closed - Opened by KaiLv69 9 months ago
#155 - add: Makefile support for 'build' and 'clean'
Pull Request -
State: closed - Opened by WillQvQ 9 months ago
#154 - The interpretation of the transposition operation when splitting weights across the tensor parallel group
Issue -
State: closed - Opened by SparkJiao 9 months ago
- 2 comments
#153 - About adding support for Qwen models
Issue -
State: closed - Opened by Jieni05 10 months ago
- 11 comments
#152 - Training 65B llama with lomo in practice: Lomo is incompatible with pipeline parallelism
Issue -
State: open - Opened by zlh1992 10 months ago
- 1 comment
#151 - Could an example of pre-training from scratch be added?
Issue -
State: open - Opened by liujuncn 10 months ago
- 1 comment
Labels: help wanted
#150 - Does using gradient clipping with the LOMO optimizer double the training time?
Issue -
State: closed - Opened by Jieni05 11 months ago
- 2 comments
#149 - Zero 2 gets stuck when initializing optimizer states
Issue -
State: closed - Opened by tengxiaoliu 11 months ago
- 1 comment
#148 - version 1.0.5, fix resume checkpoint, model save and generation
Pull Request -
State: closed - Opened by KaiLv69 11 months ago
#147 - update kv cache for generation and fix chatglm2
Pull Request -
State: closed - Opened by KaiLv69 11 months ago
#146 - Kv cache resume checkpoint
Pull Request -
State: closed - Opened by MorningForest 11 months ago
#145 - Support RLHF
Issue -
State: open - Opened by KYLN24 11 months ago
#144 - Error when loading the model because self.folder does not exist
Pull Request -
State: closed - Opened by 459737087 12 months ago
- 2 comments
#143 - How can the saved model be sharded instead of being saved as one large model of tens of GB?
Issue -
State: open - Opened by 459737087 12 months ago
- 2 comments
Labels: enhancement
#142 - After training is interrupted and restarted, how can the model continue training?
Issue -
State: closed - Opened by 459737087 12 months ago
- 6 comments
Labels: bug
#141 - RendezvousConnectionError appears partway through a run
Issue -
State: closed - Opened by 459737087 12 months ago
- 5 comments
#140 - AttributeError: 'PeftModelForCausalLM' object has no attribute 'set_cache'
Issue -
State: closed - Opened by JiafeiSun 12 months ago
- 2 comments
#139 - adalomo has no loss_scaler, only loss_scale
Issue -
State: open - Opened by HappyLynn 12 months ago
- 1 comment
#138 - OOM error when fine-tuning llama2 with lora on a single A100
Issue -
State: closed - Opened by JiafeiSun about 1 year ago
- 2 comments
#137 - No module named 'collie.callbacks.pefts'
Issue -
State: closed - Opened by JiafeiSun about 1 year ago
- 2 comments
#136 - chatGLM2 does not seem to support ptuning training yet; is there a plan for when it will be supported?
Issue -
State: open - Opened by BlueSkyyyyyy about 1 year ago
- 2 comments
Labels: bug
#135 - Error when using tensor parallelism with chatGLM2
Issue -
State: closed - Opened by BlueSkyyyyyy about 1 year ago
- 6 comments
#134 - Could support for chatglm3 be added?
Issue -
State: closed - Opened by hijeffwu about 1 year ago
- 3 comments
#133 - __init__() missing 'init_method' and 'config'
Issue -
State: closed - Opened by yueg-security about 1 year ago
- 1 comment
#132 - AdaLomo optimizer step method
Issue -
State: open - Opened by winglian about 1 year ago
- 3 comments
#131 - Can tensor parallelism and pipeline parallelism be used together with lora? Raises ValueError: Target module ColumnParallelLinearWithoutBias() is not supported. Currently, only `torch.nn.Linear` and `Conv1D` are supported.
Issue -
State: closed - Opened by BlueSkyyyyyy about 1 year ago
- 3 comments
#130 - Could fine-tuning support for baichuan-2 be added? Or could a guide on how to add a new model for fine-tuning be provided? Thanks
Issue -
State: closed - Opened by wuchangping about 1 year ago
- 3 comments
#129 - Refactor ChatGLM2 and add corresponding test cases
Pull Request -
State: closed - Opened by MorningForest about 1 year ago
#128 - add adalomo
Pull Request -
State: closed - Opened by KaiLv69 about 1 year ago
#127 - Is it possible to train again from the start?
Issue -
State: closed - Opened by 459737087 about 1 year ago
- 3 comments
#126 - Which version of Megatron-LM is used
Issue -
State: closed - Opened by liaosnow about 1 year ago
- 11 comments
#125 - move load_peft to peft_utils and some peft examples
Pull Request -
State: closed - Opened by KYLN24 about 1 year ago
- 2 comments
#124 - [Feature] Could an internLM example be added to examples?
Issue -
State: closed - Opened by wuchangping about 1 year ago
- 1 comment
#123 - Error encountered when using the data class _ShardContainer
Issue -
State: open - Opened by xuguohai about 1 year ago
- 1 comment
#122 - How to convert parallel state_dict to normal state_dict?
Issue -
State: open - Opened by JinchaoLove about 1 year ago
- 3 comments
#121 - Evaluating is too slow
Issue -
State: closed - Opened by JinchaoLove about 1 year ago
- 2 comments
#120 - Can this project be used for continued pre-training of a model?
Issue -
State: closed - Opened by Zheng-Jay about 1 year ago
- 3 comments
#119 - [BUG] Using parallelism during Evaluation may not iterate over the full dataset
Issue -
State: open - Opened by KYLN24 about 1 year ago
- 1 comment
#118 - [BUG] Problem with max new token truncation when running helm-style classification evaluation with CollieDatasetForClassification
Issue -
State: closed - Opened by KYLN24 about 1 year ago
#117 - Could Lomo class support `param_groups`?
Issue -
State: open - Opened by JinchaoLove about 1 year ago
#116 - ColumnParallelLinearWithoutBias is not supported by peft
Issue -
State: closed - Opened by JinchaoLove about 1 year ago
- 5 comments
#115 - ImportError: cannot import name 'PeftConfig' from 'peft.utils'
Issue -
State: closed - Opened by lisherlock about 1 year ago
- 7 comments
#114 - Can a 7B Moss model trained with CoLLie not be loaded with Huggingface's AutoModelForCausalLM?
Issue -
State: closed - Opened by xuguohai about 1 year ago
- 3 comments
#113 - [Question] About training visualization
Issue -
State: closed - Opened by RickMeow about 1 year ago
- 2 comments
#112 - Fix and refactor collie model loading/initialization bug
Pull Request -
State: closed - Opened by Kausal-Lei about 1 year ago
#111 - Fix model initialization bug and refactor model initialization
Pull Request -
State: closed - Opened by Kausal-Lei about 1 year ago
#110 - [QUESTION]Multi-node multi-gpu training
Issue -
State: closed - Opened by RickMeow about 1 year ago
- 2 comments
#109 - [BUG] ImportError: cannot import name 'PeftConfig' from 'peft.utils'
Issue -
State: closed - Opened by RickMeow about 1 year ago
- 2 comments
#108 - Fix model initialization bug
Pull Request -
State: closed - Opened by Kausal-Lei about 1 year ago
- 2 comments
#107 - Training loss is NaN
Issue -
State: closed - Opened by fuqianya about 1 year ago
- 6 comments
#106 - Question about lr_scheduler settings
Issue -
State: closed - Opened by YuxiangZhang0114 over 1 year ago
- 1 comment
#105 - Error on loading after replacing the tokenizer
Issue -
State: closed - Opened by 2793145003 over 1 year ago
- 12 comments
#104 - Error when training Llama2 70B
Issue -
State: open - Opened by xiaopqr over 1 year ago
- 3 comments
#103 - Update lomo
Pull Request -
State: closed - Opened by KaiLv69 over 1 year ago
#102 - Training fails but no error message is shown
Issue -
State: closed - Opened by 2793145003 over 1 year ago
- 2 comments
#100 - collie and lomo are incompatible
Issue -
State: closed - Opened by LZY-the-boys over 1 year ago
- 3 comments
Labels: bug
#99 - tensor parallel + zero3 error
Issue -
State: open - Opened by LZY-the-boys over 1 year ago
- 1 comment
Labels: help wanted
#98 - Error: llama2 70B LlamaForCausalLM.from_pretrained with Zero3 enabled consumes a large amount of memory and causes OOM
Issue -
State: open - Opened by xiaopqr over 1 year ago
- 5 comments
Labels: bug
#96 - Add more shields
Pull Request -
State: closed - Opened by Carol-gutianle over 1 year ago
#93 - Whether lr_scheduler for Lomo is implemented now?
Issue -
State: open - Opened by DesperateExplorer over 1 year ago
- 4 comments
Labels: bug
#91 - Support for LLaMA-2 70B with Grouped-Query Attention
Issue -
State: open - Opened by kaiwang13 over 1 year ago
- 18 comments
Labels: bug
#87 - Error when extending the vocabulary of llama-2-7b
Issue -
State: open - Opened by skepsun over 1 year ago
- 4 comments
Labels: bug
#85 - deep_speed initialization for models in the transformers library
Issue -
State: open - Opened by DesperateExplorer over 1 year ago
- 6 comments
Labels: help wanted
#84 - save_16bit_model does not save the proper state_dict
Issue -
State: closed - Opened by DesperateExplorer over 1 year ago
- 11 comments
Labels: help wanted
#71 - save_checkpoint
Issue -
State: closed - Opened by lw3259111 over 1 year ago
- 4 comments
Labels: question
#70 - Is bf16 supported?
Issue -
State: closed - Opened by lw3259111 over 1 year ago
- 5 comments