hiyouga/LLaMA-Factory issues and pull requests

#5495 - qwen2-vl fails to start

Issue - State: closed - Opened by JohnZhuYX about 2 months ago - 1 comment
Labels: solved

#5494 - `report_to: none`

Issue - State: closed - Opened by kanqgg about 2 months ago
Labels: invalid

#5493 - What are the compute requirements for fine-tuning qwen2-vl-7b on video data?

Issue - State: open - Opened by J0eky about 2 months ago
Labels: pending

#5492 - wandb

Issue - State: closed - Opened by kanqgg about 2 months ago - 2 comments
Labels: solved

#5491 - Out of memory when training on 8 Ascend 910B NPUs

Issue - State: open - Opened by LtroiNGU about 2 months ago - 2 comments
Labels: pending, npu

#5490 - [help] GPU memory suddenly spikes mid-training, causing OOM

Issue - State: closed - Opened by RRRRRayyyyy about 2 months ago - 1 comment
Labels: solved

#5489 - Is LLAVA chat template correct?

Issue - State: open - Opened by mibejjh about 2 months ago
Labels: pending

#5488 - Running on machines with limited number of online programs

Issue - State: open - Opened by moshushi007ow about 2 months ago
Labels: pending

#5487 - Does fine-tuning qwen2.5 series models require a different template? The template in the official tokenizer_config.json has changed

Issue - State: closed - Opened by nieallen about 2 months ago - 6 comments
Labels: solved

#5486 - [Draft] Add AutoRound support

Pull Request - State: open - Opened by wenhuach21 about 2 months ago - 2 comments

#5485 - Failed to start webui

Issue - State: open - Opened by ClementeGao about 2 months ago
Labels: pending

#5484 - Are there any caveats for DPO training? My trained model performs very poorly.

Issue - State: open - Opened by zlh-source about 2 months ago - 9 comments
Labels: pending

#5483 - fix: for function call datasets, when the function_call value is invalid JSON, show an error message and abort training.

Pull Request - State: closed - Opened by whybeyoung about 2 months ago
Labels: solved

#5482 - Qlora error

Issue - State: closed - Opened by dayuyang1999 about 2 months ago - 1 comment
Labels: solved

#5481 - How can I speed up video inference for the merged model after fine-tuning qwen2-vl-7b?

Issue - State: closed - Opened by J0eky about 2 months ago - 1 comment
Labels: solved

#5480 - fix ppo_freeze mat1 mat2 should have the same dtype

Pull Request - State: open - Opened by ex-yanminmin001 about 2 months ago - 4 comments

#5479 - After qwen2vl SFT, how do I merge adapter_model.safetensors with the original model weights for use?

Issue - State: closed - Opened by YajieW99 about 2 months ago - 1 comment
Labels: solved

#5478 - Can we set default_system in yaml file when training?

Issue - State: closed - Opened by Huarong about 2 months ago - 1 comment
Labels: solved

#5477 - Does qwen2vl training require modifying position_ids?

Issue - State: closed - Opened by sunzjz about 2 months ago - 2 comments
Labels: solved

#5476 - The chat_template value in tokenizer_config.json is changed after fine-tuning qwen2-1.5

Issue - State: closed - Opened by czhcc about 2 months ago - 2 comments
Labels: solved

#5475 - Fix phi-3-small issues

Pull Request - State: open - Opened by menibrief about 2 months ago - 1 comment

#5474 - When template is set to empty during training, <|EOT|> is prepended to the labels; earlier versions did not seem to do this

Issue - State: open - Opened by haoranjun about 2 months ago
Labels: pending

#5473 - Support Mistral format tools

Pull Request - State: open - Opened by AlongWY about 2 months ago - 1 comment

#5472 - Error during training when full-parameter fine-tuning only the visual.merger part of Qwen2-VL-7B-Instruct with all other model parameters frozen

Issue - State: open - Opened by wjx-sudo about 2 months ago - 6 comments
Labels: pending

#5471 - Error during multi-GPU fine-tuning

Issue - State: closed - Opened by Maydaytyh about 2 months ago - 2 comments
Labels: solved

#5469 - How do I write my own code to load the merged model and run inference on video?

Issue - State: closed - Opened by J0eky about 2 months ago - 1 comment
Labels: solved

#5468 - Is multi-image fine-tuning of Qwen2-VL-7B-Instruct supported? Is there an example of the data format?

Issue - State: closed - Opened by WorldHellooo about 2 months ago - 6 comments
Labels: solved

#5467 - How do I save checkpoints automatically?

Issue - State: closed - Opened by dayuyang1999 about 2 months ago - 2 comments
Labels: solved

#5466 - Elements inside the frame fail to render after webui starts

Issue - State: closed - Opened by DSW2001 about 2 months ago - 1 comment
Labels: solved

#5465 - sft do_predict: labels in the generated JSON file are all empty

Issue - State: closed - Opened by dayuyang1999 about 2 months ago - 1 comment
Labels: wontfix

#5464 - Can a model return multiple responses at inference time after SFT?

Issue - State: closed - Opened by zlh-source about 2 months ago - 3 comments
Labels: solved

#5463 - Dependencies cannot be installed even though CUDA is installed

Issue - State: closed - Opened by DSW2001 about 2 months ago - 2 comments
Labels: solved

#5462 - qwen2_vl model training anomaly

Issue - State: open - Opened by will-wiki about 2 months ago - 3 comments
Labels: pending

#5461 - AttributeError: 'Qwen2Attention' object has no attribute 'max_position_embeddings'

Issue - State: open - Opened by chengchengpei about 2 months ago - 1 comment
Labels: pending

#5460 - Tips for implementing LLaMA-Factory on new hardware

Issue - State: open - Opened by EtashGuha about 2 months ago
Labels: pending

#5459 - Do you support full-parameter pre-training?

Issue - State: closed - Opened by lingchensanwen about 2 months ago - 1 comment
Labels: solved

#5458 - Flatting Packing / maybe fix #5443 and #5426

Pull Request - State: open - Opened by AlongWY about 2 months ago - 11 comments
Labels: pending

#5457 - No such file or directory for data

Issue - State: closed - Opened by Esmail-ibraheem about 2 months ago - 2 comments
Labels: solved

#5456 - max pixels argument

Issue - State: open - Opened by sharonsalabiglossai about 2 months ago - 1 comment
Labels: pending

#5455 - "Cannot find valid samples" when running DPO on llama3-8b

Issue - State: closed - Opened by zky-kf about 2 months ago - 3 comments
Labels: solved

#5454 - Specifying HF_DATASETS_CACHE with multiple GPUs raises an error

Issue - State: closed - Opened by Fu-Dayuan about 2 months ago - 2 comments
Labels: solved

#5453 - ValueError: Template qwen2 does not exist.

Issue - State: closed - Opened by Oyounger about 2 months ago - 1 comment
Labels: solved

#5451 - Correctly pass gen_kwarg to eval during model runs

Pull Request - State: open - Opened by aliencaocao about 2 months ago - 1 comment
Labels: pending

#5450 - Error when running multi-node multi-GPU

Issue - State: open - Opened by hecheng64 about 2 months ago
Labels: pending

#5449 - OOM during full fine-tuning of qwen2-vl on two GPUs

Issue - State: closed - Opened by hitsz-zxw about 2 months ago - 4 comments
Labels: solved

#5447 - Error running examples/train_lora/llama3_lora_predict.yaml on fine-tuned GLM-4-9B-Chat

Issue - State: open - Opened by Twilightsh about 2 months ago - 1 comment
Labels: pending

#5446 - After setting the random seed, training loss still differs between runs with the same dataset and configuration

Issue - State: closed - Opened by andy7002 about 2 months ago - 2 comments
Labels: solved

#5445 - qizhen

Pull Request - State: closed - Opened by A-magic about 2 months ago

#5444 - model.generate parameters set in yaml have no effect: I set do_sample: false, but the profiler shows it is still true. This only happens during mid-training eval; the final eval after training is fine

Issue - State: open - Opened by aliencaocao about 2 months ago
Labels: pending

#5443 - Running tokenizer on dataset gradually slows down

Issue - State: open - Opened by xuyue1112 about 2 months ago - 3 comments
Labels: pending

#5442 - Inference with a bitsandbytes qlora fine-tuned model

Issue - State: open - Opened by oulin1031esti about 2 months ago - 2 comments
Labels: pending

#5441 - help on understanding the implementation of FSDP.

Issue - State: open - Opened by jq-wei about 2 months ago
Labels: pending

#5440 - How to use beam search with openai-style deployment

Issue - State: open - Opened by cat-knight about 2 months ago
Labels: pending

#5439 - Llama-factory usage error

Issue - State: closed - Opened by lifelsl about 2 months ago
Labels: invalid

#5438 - Add qwen_vl to liger kernel supported list

Pull Request - State: closed - Opened by aliencaocao about 2 months ago
Labels: solved

#5437 - Where are the quantization config parameters reflected in the model produced by qlora fine-tuning?

Issue - State: closed - Opened by yangxue-1 about 2 months ago - 1 comment
Labels: solved

#5436 - What to do when the vocabulary size is inconsistent after fine-tuning

Issue - State: open - Opened by topology1 about 2 months ago
Labels: pending

#5435 - Gemma 2 + unsloth + fa2 full SFT RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Issue - State: open - Opened by hengdos about 2 months ago
Labels: pending

#5434 - Does llamafactory currently support model evaluation on Ascend 910?

Issue - State: open - Opened by yiyayieryo about 2 months ago - 3 comments
Labels: pending, npu

#5433 - How can I use my own reward function instead of a reward model?

Issue - State: closed - Opened by destroy-lonely about 2 months ago - 2 comments
Labels: solved

#5432 - webui Chat with hugging face always produces garbled text

Issue - State: closed - Opened by Rane2021 about 2 months ago - 2 comments
Labels: wontfix

#5431 - Latest LLaMA-Factory repo forces Torch 2.4 and hence clashes with Unsloth/XFormers

Issue - State: open - Opened by thusinh1969 about 2 months ago - 3 comments
Labels: pending

#5430 - Model eval outputs only loss and similar info but no acc? Help needed (is it an environment problem?)

Issue - State: closed - Opened by phbst about 2 months ago - 3 comments
Labels: solved

#5429 - gptq model loads very slowly

Issue - State: open - Opened by cat-knight about 2 months ago
Labels: pending

#5428 - [QUESTION] Is there a problem in the data preprocess step?

Issue - State: closed - Opened by LaniakeaS about 2 months ago - 3 comments
Labels: wontfix

#5427 - Update the ROCm version to 6.2

Pull Request - State: closed - Opened by HardAndHeavy about 2 months ago - 1 comment
Labels: solved

#5426 - Model performance metrics drop noticeably when using neat_packing for sft training

Issue - State: open - Opened by muziyongshixin about 2 months ago - 6 comments
Labels: pending

#5425 - 310P fine-tuning error: RuntimeError: call aclnnCast failed, detail:EZ9999: Inner Error

Issue - State: open - Opened by Tao-begd about 2 months ago - 2 comments
Labels: pending, npu

#5424 - [WIP] add florence2

Pull Request - State: open - Opened by Sanster about 2 months ago
Labels: pending

#5423 - [Question]support for quantization algorithms that are not performed on-the-fly

Issue - State: open - Opened by wenhuach21 about 2 months ago - 3 comments
Labels: pending

#5422 - Error at the end of full fine-tuning of qwen2 on npu: 'Qwen2ForCausalLM' object has no attribute 'create_or_update_model_card'

Issue - State: closed - Opened by XYZliang about 2 months ago - 2 comments
Labels: solved, npu

#5421 - Help: when predicting with the script llamafactory-cli train examples/train_lora/llama3_lora_predict.yaml from the "Batch Predicting and Computing BLEU and ROUGE Scores" section of the llama-factory tutorial, what are the default parameters, such as topk, temperature, and max_len? Is there a command to view them?

Issue - State: closed - Opened by mfj9999 about 2 months ago - 1 comment
Labels: solved

#5420 - glm-4-9b-chat answers are extremely long after llamafactory-cli lora_pretrain

Issue - State: closed - Opened by vipcong816 about 2 months ago - 1 comment
Labels: solved

#5419 - QWEN1.5 series models hit OOM as soon as training enters the eval phase

Issue - State: closed - Opened by Cucunnber about 2 months ago - 1 comment
Labels: solved

#5418 - Ascend link cannot be opened

Issue - State: closed - Opened by shijinkui about 2 months ago - 1 comment
Labels: solved, npu

#5417 - Meet c10::DistBackendError when finetuning Qwen2-VL with video dataset

Issue - State: closed - Opened by htlou about 2 months ago - 3 comments
Labels: solved

#5416 - qwen2vl 7b , how to call api

Issue - State: closed - Opened by xddun about 2 months ago - 3 comments
Labels: solved

#5415 - Dogfy Diet discount code

Issue - State: closed - Opened by fernichum about 2 months ago
Labels: invalid

#5414 - Error loading the fine-tuned lora model with vllm

Issue - State: closed - Opened by zhangyuqi-1 about 2 months ago - 2 comments
Labels: solved

#5413 - predict fails after training completes normally: "RuntimeError: The expanded size of the tensor (496) must match the existing size (495) at non-singleton dimension 3. Target sizes: [16, 64, 1, 496]. Tensor sizes: [16, 1, 1, 495]"

Issue - State: closed - Opened by mfj9999 about 2 months ago
Labels: wontfix

#5412 - How is the deepspeed config passed to the model?

Issue - State: closed - Opened by pagepal666 about 2 months ago - 1 comment
Labels: wontfix

#5411 - Reward model deployment issue

Issue - State: closed - Opened by sangzy23 about 2 months ago
Labels: solved

#5410 - glm-4-9b-chat.tokenization_chatglm

Issue - State: closed - Opened by zhaohongpu about 2 months ago - 1 comment
Labels: solved

#5409 - llamafactory-cli api with vllm gives results inconsistent with using vllm serve directly

Issue - State: open - Opened by jackiezzq about 2 months ago
Labels: pending

#5408 - Multimodal data processing logic is a bit odd

Issue - State: closed - Opened by will-wiki about 2 months ago - 1 comment
Labels: solved

#5407 - PPO training issue

Issue - State: open - Opened by yang-chenyu104 about 2 months ago - 1 comment
Labels: pending

#5406 - Got unknown args, potentially deprecated arguments: ['--pref_loss:', 'simpo']

Issue - State: closed - Opened by JohnZhuYX about 2 months ago
Labels: invalid

#5405 - support llava-next(video)/video-llava/idefics2

Pull Request - State: closed - Opened by BUAADreamer about 2 months ago

#5404 - qwen2-vl fine-tuned for object detection: webchat output is empty, but the model outputs normally when loaded with transformers

Issue - State: closed - Opened by thunder95 about 2 months ago
Labels: wontfix

#5403 - Hello, does this project support Iluvatar CoreX Zhikai and Tiangai series GPUs?

Issue - State: closed - Opened by yang-yu-don about 2 months ago - 1 comment
Labels: solved

#5402 - AttributeError: 'AdamW' object has no attribute 'train'

Issue - State: closed - Opened by Oseemaker about 2 months ago - 1 comment
Labels: solved

#5401 - use LLaMA Pro: how to configure the expansion parameters

Issue - State: closed - Opened by whywhy258 about 2 months ago - 2 comments
Labels: solved

#5400 - Qwen2-vl sft training issue: AttributeError: 'DeepSpeedZeroOptimizer_Stage3' object has no attribute 'train'

Issue - State: closed - Opened by kbl-dong about 2 months ago - 14 comments
Labels: solved

#5399 - Page is completely blank after running LLaMA-Factory webui

Issue - State: closed - Opened by cxw-11 2 months ago - 3 comments
Labels: wontfix

#5398 - Why is LoRA much slower than Freeze?

Issue - State: closed - Opened by gugugu-469 2 months ago - 1 comment
Labels: solved

#5397 - qwen2-7b: error using the vllm api endpoint

Issue - State: closed - Opened by EASTERNTIGER 2 months ago - 1 comment
Labels: duplicate

#5396 - llama factory save strategy

Issue - State: closed - Opened by 0awei0 2 months ago - 4 comments
Labels: solved

#5395 - Qwen-VL sft training hangs; what is the cause?

Issue - State: closed - Opened by GzmCR 2 months ago - 5 comments
Labels: solved

#5394 - Reason for the transformer version upgrade

Issue - State: closed - Opened by liuzhengzheng123 2 months ago - 1 comment
Labels: solved

#5393 - Loading a model with LLamafactory and loading it with huggingface give different results

Issue - State: closed - Opened by zhaoweiguo 2 months ago - 2 comments
Labels: solved
