ssbuild/chatglm_finetuning issues and pull requests

#284 - ptv2

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#283 - num_layers_freeze

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#282 - Simplify

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#281 - "gradient_checkpointing": False

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#280 - support accelerator trainer

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#279 - support accelerator trainer

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#278 - v0.2.5

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#277 - v0.2.5

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#276 - support ia3

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#275 - 0.2.4

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#274 - fix slidding

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#273 - update

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#272 - update

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#271 - deepspeed precision

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#270 - fix ptv2

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#269 - fix ptv2

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#268 - ptv2 remove device_map

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#267 - build_template

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#266 - Can adalora be trained with deepspeed?

Issue - State: open - Opened by Yu-Yuqing about 1 year ago

#265 - update

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#264 - OOM occurs with both LoRA and ptv2 fine-tuning

Issue - State: open - Opened by shenzhyzzz about 1 year ago - 4 comments

#263 - 0.2.0

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#262 - 0.1.21

Pull Request - State: closed - Opened by ssbuild about 1 year ago

#261 - INFO:lightning_fabric.utilities.rank_zero:`Trainer.fit` stopped: No training batches.

Issue - State: open - Opened by hasakikiki about 1 year ago - 2 comments

#260 - Has anyone fine-tuned on a Mac Studio?

Issue - State: open - Opened by xsailor511 over 1 year ago

#259 - How do I save the model once every n training epochs?

Issue - State: closed - Opened by tjulh over 1 year ago - 1 comment

#258 - AttributeError: module 'torch.optim' has no attribute 'adam'

Issue - State: open - Opened by evanweiguohua over 1 year ago - 5 comments
Labels: bug

#257 - How do I specify which GPUs to use for inference?

Issue - State: closed - Opened by tjulh over 1 year ago - 2 comments

#256 - Changing max_seq_length does not seem to take effect?

Issue - State: closed - Opened by tjulh over 1 year ago - 4 comments

#255 - AttributeError: module 'inspect' has no attribute 'ArgSpec'

Issue - State: closed - Opened by SeekPoint over 1 year ago - 1 comment
Labels: bug

#254 - Problem with the displayed number of trainable parameters

Issue - State: open - Opened by xxll88 over 1 year ago

#253 - Default Lora training consumes 60G of GPU memory

Issue - State: open - Opened by is over 1 year ago

#252 - Hello, and thank you very much for your work. After full-parameter fine-tuning, calling infer_finetuning.py fails with Missing key(s) in state_dict: "_TransformerLightningModule__backbone.transformer.lm_head.weight". Have you run into this problem?

Issue - State: closed - Opened by Xuan-ZW over 1 year ago - 2 comments

#251 - fix potential expand vocab_size

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#250 - requirements.txt

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#249 - load float16 weight

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#248 - support resize embs

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#247 - Model training uses only a single GPU

Issue - State: closed - Opened by GZJAS over 1 year ago - 1 comment

#246 - 0.1.10

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#245 - How to enable quantization_bit 4 with ptuning v2

Issue - State: open - Opened by xxll88 over 1 year ago - 1 comment

#244 - v0.1.10

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#243 - Problems after p-tuning fine-tuning of chatGLM on a single-turn dataset

Issue - State: open - Opened by SMR-S over 1 year ago - 1 comment

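#242 - Single-turn QA answers are fairly good, but in multi-turn dialogue the answers become nonsense. I suspect the multi-turn history is interfering with the answers? I am not yet sure that is the cause. Has anyone run into something similar? Let's discuss!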

Issue - State: closed - Opened by SMR-S over 1 year ago

#241 - should be load_sft_weight?

Issue - State: closed - Opened by HenryYuxuanWang over 1 year ago - 1 comment
Labels: bug

#239 - Running the lora training code with int8=True, inference fails with RuntimeError: expected scalar type Half but found Float. What is the cause?

Issue - State: closed - Opened by MathamPollard over 1 year ago - 14 comments

#238 - Running infer_lora_finetuning.py raises an error: 'NoneType' object has no attribute 'learning_rate'

Issue - State: closed - Opened by paizhongxing over 1 year ago - 8 comments

#237 - Does the input_ids format require <CLS>?

Issue - State: open - Opened by Jong-Won over 1 year ago

#236 - How to use evaluate.py to evaluate on the test set

Issue - State: open - Opened by lawrencelxy over 1 year ago - 4 comments
Labels: new feature, good issue

#235 - Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.

Issue - State: open - Opened by sanwei111 over 1 year ago

#234 - About how much GPU resource is required

Issue - State: open - Opened by sanwei111 over 1 year ago - 1 comment

#233 - With deepspeed and with regular training (lora, ptuning), batch_size must be set below 4 or OOM occurs

Issue - State: closed - Opened by markWJJ over 1 year ago - 21 comments

#232 - Not enough GPU memory for ptv2?

Issue - State: open - Opened by sanwei111 over 1 year ago - 11 comments

#231 - What is the command for a single machine with two GPUs?

Issue - State: open - Opened by sanwei111 over 1 year ago - 2 comments

#230 - About the instruction, input, and output fields of the data

Issue - State: open - Opened by sanwei111 over 1 year ago - 3 comments

#229 - v2

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#228 - About the data format

Issue - State: open - Opened by sanwei111 over 1 year ago - 6 comments

#227 - V2 merge

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#226 - RuntimeError: expected scalar type Half but found Float. Has anyone encountered this during training?

Issue - State: closed - Opened by SMR-S over 1 year ago - 3 comments
Labels: invalid

#225 - v2

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#224 - v2

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#223 - Error loading the lora model

Issue - State: closed - Opened by zlht812 over 1 year ago

#222 - merge v2

Pull Request - State: closed - Opened by ssbuild over 1 year ago

#221 - How can I use a general news corpus to continue fine-tuning ChatGLM?

Issue - State: open - Opened by yang9112 over 1 year ago - 1 comment

#220 - How to run multi-GPU inference with web/api_lora_demo.py

Issue - State: open - Opened by lxw0109 over 1 year ago

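#219 - First run, f16 lora training on two GPUs, succeeded; second run, int8 lora training on one GPU, succeeded; third run, switching back to f16 lora on two GPUs, failed. Details inside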

Issue - State: closed - Opened by zlht812 over 1 year ago - 3 comments

#218 - Fine-tuning with ptv2 always hits OOM (regular fine-tuning and lora both work fine)

Issue - State: closed - Opened by lxw0109 over 1 year ago - 7 comments

#217 - Can deep_training not be installed on macOS?

Issue - State: closed - Opened by WHJTC over 1 year ago - 1 comment

#216 - Is 2 min 30 s normal for Lora inference?

Issue - State: closed - Opened by jikhunb over 1 year ago - 2 comments

#215 - Inference problem after Lora training

Issue - State: closed - Opened by jikhunb over 1 year ago - 2 comments

#214 - Running python train.py raises an error during training; help wanted.

Issue - State: closed - Opened by pan365wang over 1 year ago - 9 comments

#213 - After setting 'target_modules' for LoRa fine-tuning, running raises "AssertionError"

Issue - State: closed - Opened by ngbruce over 1 year ago - 4 comments
Labels: wontfix

#212 - Deepspeed stage3 saves model weights with dimension 0

Issue - State: closed - Opened by Jong-Won over 1 year ago - 2 comments

#211 - Hi, which settings need to be changed to fine-tune with lora and with ptv2, respectively?

Issue - State: open - Opened by mircop1t over 1 year ago - 19 comments

#210 - Hi, a question about the scheduler

Issue - State: closed - Opened by IamRoBota over 1 year ago - 4 comments

#209 - How to configure deepspeed to avoid OOM

Issue - State: open - Opened by lianrzh over 1 year ago - 2 comments

#208 - Hi, a question about the special tokens used in data construction

Issue - State: open - Opened by IamRoBota over 1 year ago - 2 comments

#207 - Dataset

Issue - State: open - Opened by renmengjie7 over 1 year ago

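#206 - After full fine-tuning, the model remembers the domain knowledge, but for ordinary questions such as "hello" or "what is your name" it also answers with domain knowledge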

Issue - State: open - Opened by heiheiwangergou over 1 year ago - 5 comments

#205 - Is there a length limit on q and a in the training dataset, and how does it relate to max_seq_length?

Issue - State: open - Opened by lancexiao over 1 year ago

#203 - Could you explain how to merge lora weights into the original model?

Issue - State: closed - Opened by cywjava over 1 year ago - 5 comments

#202 - Lora int8 fine-tuning: error during inference

Issue - State: closed - Opened by crellian over 1 year ago - 4 comments

#201 - Full fine-tuning: the loss becomes nan around epoch 6 or 7; can anyone help find the cause?

Issue - State: closed - Opened by heiheiwangergou over 1 year ago - 6 comments

#197 - Training on an English corpus converges, but a \n line break appears after every letter. What is the cause?

Issue - State: closed - Opened by leoluopy over 1 year ago - 3 comments

#170 - Could the readme provide an example of the data input format for fine-tuning with a multi-turn dialogue template?

Issue - State: closed - Opened by cristianohello over 1 year ago - 2 comments

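#160 - After fine-tuning on 100,000 domain multi-turn dialogues, the model remembers those 100,000 records but answers other questions poorly. How can this be solved?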

Issue - State: closed - Opened by cristianohello over 1 year ago - 15 comments

#158 - Has anyone tested which works better? ptv2 vs. lora parameters

Issue - State: closed - Opened by cristianohello over 1 year ago - 1 comment

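#156 - Fine-tuning in a vertical domain with 40,000 multi-turn dialogues: how should the parameters be set so that the model remembers the 40,000 fine-tuning records without affecting chatglm's output in other conversations?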

Issue - State: open - Opened by cristianohello over 1 year ago - 4 comments

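#152 - Hello, how should I change the configuration parameters or fine-tune the model so that it remembers the questions and replies in a multi-turn dialogue dataset?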

Issue - State: closed - Opened by cristianohello over 1 year ago - 5 comments

#146 - Help needed: lora load_in_8bit parameter setting

Issue - State: closed - Opened by Zarc98 over 1 year ago - 34 comments
Labels: bug, good issue

#141 - When is lora expected to support training with deepspeed?

Issue - State: closed - Opened by penguindadyy over 1 year ago - 2 comments
Labels: new feature

#137 - Hello, in multi-turn dialogue, what is the role of history in response, history = model.chat()? How can the q and a from the first turn be used in the second turn?

Issue - State: closed - Opened by cristianohello over 1 year ago - 4 comments

#115 - Model performance drops sharply when int4 is used for LoRA inference

Issue - State: open - Opened by JamesQFreeman over 1 year ago - 1 comment

#112 - 131 K Trainable params 1.4 B Non-trainable params 1.4 B Total params. Why is the number of trainable parameters so low? 131k?

Issue - State: closed - Opened by cristianohello over 1 year ago

#80 - Loss does not converge

Issue - State: closed - Opened by weizhenzhao over 1 year ago - 31 comments
