hiyouga/LLaMA-Factory issues and pull requests

#5599 - Base model pretrain doesn't have eos token?

Issue - State: closed - Opened by sts07142 about 1 month ago - 1 comment
Labels: solved

#5598 - Llama3.2 3B is extremely slow

Issue - State: closed - Opened by dayuyang1999 about 1 month ago - 1 comment
Labels: solved

#5597 - Order of samples when doing batch inference

Issue - State: open - Opened by dayuyang1999 about 1 month ago
Labels: pending

#5596 - ValueError: Trying to set a tensor of shape torch.Size([197002752]) in "weight" (which has shape torch.Size([128256, 3072])), this look incorrect.

Issue - State: open - Opened by amankumarhal about 1 month ago - 3 comments
Labels: pending

#5595 - Question on deepspeed zero3 + qlora

Issue - State: closed - Opened by mces89 about 1 month ago - 1 comment
Labels: solved

#5593 - After using llama-3-8B directly as the base model and completing sft training, the model output keeps repeating

Issue - State: closed - Opened by ZWH-ASTAR about 1 month ago - 1 comment
Labels: solved

#5592 - [BUG] Unable to run model post training with Unsloth+DoRa+RsLora

Issue - State: closed - Opened by Tejaswgupta about 1 month ago - 2 comments
Labels: solved

#5591 - Fine-tuning fails with datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset

Issue - State: closed - Opened by xiao-liya about 1 month ago - 1 comment
Labels: solved

#5590 - 2Nodes * 8 A100 80G sft full Qwen2VL OOM

Issue - State: open - Opened by VincentVanNF about 1 month ago - 1 comment
Labels: pending

#5589 - qwen2vl-72b inference script

Issue - State: closed - Opened by yhy-2000 about 1 month ago - 1 comment
Labels: solved

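#5588 - How to ensure fully reproducible results? llama-factory gives different results on every run with the same dataset, the same hyperparameters, and the same random seed (metrics fluctuate by 1-2%)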

Issue - State: closed - Opened by mfj9999 about 1 month ago - 1 comment
Labels: solved

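#5587 - Multi-node multi-GPU fine-tuning hangs; the two machines can ping each other, and my own DDP multi-node multi-GPU code from other projects runs normally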

Issue - State: open - Opened by WangJennie about 1 month ago - 9 comments
Labels: pending

#5586 - ds_z3_config.json stage3_prefetch_bucket_size should be an integer

Issue - State: open - Opened by ZhuJD-China about 1 month ago - 3 comments
Labels: pending

#5585 - Support EXAONE3.0 Model

Pull Request - State: closed - Opened by shing100 about 1 month ago - 1 comment
Labels: solved

#5584 - Resource question

Issue - State: closed - Opened by weilx2267 about 1 month ago - 1 comment
Labels: solved

#5583 - RuntimeError: mat1 and mat2 must have the same dtype, but got BFloat16 and Float in fused_linear_cross_entropy_forward

Issue - State: open - Opened by kostum123 about 1 month ago
Labels: pending

#5582 - Does Qwen-2.5+unsloth support 64k training?

Issue - State: closed - Opened by XiangTodayEatsWhat about 1 month ago - 1 comment
Labels: solved

#5581 - [WIP] Support Pixtral-12B

Pull Request - State: open - Opened by Kuangdd01 about 1 month ago - 4 comments
Labels: pending

#5580 - made a small change to a warning about fa2 for gemma2 models.

Pull Request - State: closed - Opened by amrear about 1 month ago
Labels: solved

#5579 - How to save model weights in fp16 format

Issue - State: closed - Opened by lufred8341 about 1 month ago - 2 comments
Labels: solved

#5578 - Is there a UI or implementation for displaying evaluation results that can be used as a reference?

Issue - State: closed - Opened by czhcc about 1 month ago
Labels: wontfix

#5577 - Qlora training and merging (after Qlora training on a quantized model, it reports that merging is not possible)

Issue - State: closed - Opened by xuwang0117 about 1 month ago - 3 comments
Labels: solved

#5576 - Hello, are there plans for llama-factory to integrate with ray for training and inference?

Issue - State: open - Opened by cicijohn1983 about 1 month ago
Labels: pending

#5574 - support llava-next(video)/video-llava

Pull Request - State: closed - Opened by BUAADreamer about 1 month ago
Labels: solved

#5573 - Issue when saving a checkpoint using unsloth

Issue - State: open - Opened by amrear about 1 month ago
Labels: pending

#5572 - Is multi-node multi-card training on Ascend 910B supported?

Issue - State: closed - Opened by LtroiNGU about 1 month ago - 1 comment
Labels: wontfix, npu

#5570 - On Ascend, model inference only runs with do_sample=False, so generation parameters cannot be tuned; is there a solution?

Issue - State: closed - Opened by warmbreeze92 about 1 month ago - 1 comment
Labels: wontfix, npu

#5569 - Cannot run on multiple GPUs under WSL

Issue - State: closed - Opened by gotothehill about 1 month ago - 1 comment
Labels: wontfix

#5568 - Could the latest LoRA-GA fine-tuning method be supported soon?

Issue - State: closed - Opened by xyangyan about 1 month ago
Labels: duplicate

#5567 - When will GOT-OCR2 be supported?

Issue - State: open - Opened by tbwang-clound about 1 month ago
Labels: pending

#5566 - TDPO

Issue - State: open - Opened by lycheeyolo about 1 month ago
Labels: pending

#5565 - Problem encountered with vllm multi-GPU inference - qwen2.5

Issue - State: open - Opened by YChengxin about 1 month ago
Labels: pending

#5564 - vllm multi-GPU inference - problem encountered with qwen2.5

Issue - State: closed - Opened by YChengxin about 1 month ago
Labels: duplicate

#5563 - Pei eng patch 1

Pull Request - State: closed - Opened by Pei-eng about 1 month ago
Labels: invalid

#5562 - How to align qwen2-vl fine-tuning training and vllm inference formats

Issue - State: open - Opened by xuyifan-0731 about 1 month ago
Labels: pending

#5561 - After switching the training model from qwen2-7b to qwen2.5-32b, inference output does not stop once training is complete

Issue - State: open - Opened by wenocy about 1 month ago - 7 comments
Labels: pending

#5560 - Multi-node multi-GPU training stays stuck at this point; the two machines are on a LAN, what else should I watch out for?

Issue - State: closed - Opened by ZhuJD-China about 1 month ago - 1 comment
Labels: solved

#5559 - H20 100G*8 qwen14B full sft terminates during the do_predict stage: failed (exitcode: -8)

Issue - State: closed - Opened by amoyplane about 1 month ago - 4 comments
Labels: solved

#5558 - Is accelerate launch currently not supported for multi-node multi-GPU training?

Issue - State: closed - Opened by Hansen06 about 1 month ago - 1 comment
Labels: solved

#5557 - HQQ quantization fails to serialize model

Issue - State: closed - Opened by TweedBeetle about 1 month ago - 1 comment
Labels: solved

#5556 - How to load a model for beam_search

Issue - State: open - Opened by Maydaytyh about 1 month ago
Labels: pending

#5555 - Support llama3.2vl(WIP).

Pull Request - State: open - Opened by marko1616 about 1 month ago
Labels: pending

#5554 - Will it support SFT multi-modal large models (for example, qwen2-vl) with plain text?

Issue - State: closed - Opened by zhshj0110 about 1 month ago - 2 comments
Labels: solved

#5553 - A100 80G *4 sft full Qwen2-VL-72B-Instruct OOM

Issue - State: closed - Opened by VincentVanNF about 1 month ago - 3 comments
Labels: duplicate

#5552 - Batch inference speed difference between the original model and the lora model

Issue - State: closed - Opened by yysj-zq about 1 month ago - 2 comments
Labels: duplicate

#5551 - Tokenizer times out in the npu environment

Issue - State: closed - Opened by lambda-lee about 1 month ago - 2 comments
Labels: solved, npu

#5550 - Full sft training of qwen2-1.5B-instruct with the latest LLamaFactory (updated on the 26th) produces the following deepspeed error

Issue - State: closed - Opened by xiehust about 1 month ago - 2 comments
Labels: solved

#5549 - Runtime error when deploying Llama-3.2-11B-Vision-Instruct with LLaMa-factory

Issue - State: closed - Opened by caijx168 about 1 month ago - 20 comments
Labels: wontfix

#5548 - [Help] Problem when selecting warmup_stable_decay as the learning rate scheduler in the webui

Issue - State: open - Opened by ishkong about 1 month ago
Labels: enhancement, pending

#5547 - Chore: Support llama3.2.

Pull Request - State: closed - Opened by marko1616 about 1 month ago
Labels: solved

#5546 - 1. log exceptions in detail; 2. check whether processor is None before calling it

Pull Request - State: closed - Opened by chengchengpei about 1 month ago

#5545 - OOM when fine tuning 8b with ~64k cutoff_len

Issue - State: closed - Opened by TweedBeetle about 1 month ago - 2 comments
Labels: solved

#5544 - [Tip] With transformers>=4.43.0, training on low-VRAM GPUs without the following parameters tends to accumulate GPU memory until it runs out (OOM)

Issue - State: open - Opened by xd2333 about 1 month ago - 5 comments
Labels: pending

#5543 - Problem with Qlora training and model merging

Issue - State: closed - Opened by bitallin about 1 month ago - 4 comments
Labels: solved

#5542 - Liger kernel breaks fine-tuning

Issue - State: closed - Opened by arit2 about 1 month ago - 4 comments
Labels: solved

#5541 - qwen2.5-7B-instruct lora fine-tuning: loss stays at 0.0

Issue - State: closed - Opened by Liufeiran123 about 1 month ago
Labels: invalid

#5540 - Why are video inference results very poor when loading the qwen2-vl-72b model with llamafactory chat?

Issue - State: open - Opened by J0eky about 1 month ago
Labels: pending

#5539 - Problem with lora fine-tuning of qwen2.5-math-7b

Issue - State: open - Opened by lin-dy about 1 month ago - 2 comments
Labels: pending

#5538 - Qwen2VL model image recognition is inaccurate

Issue - State: open - Opened by JohnZhuYX about 1 month ago - 1 comment
Labels: pending

#5537 - throughput is much slower than expected for pre-training

Issue - State: open - Opened by lingchensanwen about 1 month ago
Labels: pending

#5536 - Update identity.json

Pull Request - State: closed - Opened by Cherry39-lab about 1 month ago
Labels: invalid

#5535 - Specifying an evaluation_set (validation_set) during training

Issue - State: closed - Opened by mzc2113391 about 1 month ago - 1 comment
Labels: solved

#5534 - Tool calling fails for the trained agent model after deployment with vllm

Issue - State: closed - Opened by bingoohe about 1 month ago - 1 comment
Labels: solved

#5533 - Add additional install options to Dockerfiles

Pull Request - State: closed - Opened by StrangeBytesDev about 1 month ago
Labels: solved

#5532 - feat: Long Text Fine-Tuning Support

Pull Request - State: open - Opened by glide-the about 2 months ago - 1 comment

#5531 - Cannot install llamafactory 0.9.1.dev0 (from /code/LLaMA-Factory) because these package versions have conflicting dependencies.

Issue - State: closed - Opened by leoozy about 2 months ago - 1 comment
Labels: solved

#5530 - How to save only the model during single-node multi-GPU sft with deepspeed

Issue - State: closed - Opened by lufred8341 about 2 months ago - 1 comment
Labels: solved

#5529 - qwen models raise an error when using unsloth

Issue - State: closed - Opened by cat-knight about 2 months ago - 1 comment
Labels: wontfix

#5528 - No liger kernels will be applied. Qwen2-vl

Issue - State: closed - Opened by arit2 about 2 months ago - 1 comment
Labels: solved

#5527 - list index out of range when fine-tuning qwen2 video

Issue - State: closed - Opened by wudidaxuexue about 2 months ago - 4 comments
Labels: solved

#5526 - After LORA fine-tuning LLaMa2-7b-chat, inference fails with: Some keys are not used by the HfArgumentParser: ['eval_dataset', 'quantization_method']

Issue - State: closed - Opened by Math312 about 2 months ago - 2 comments
Labels: solved

#5525 - Flash 2 attention warning, flash attention not working properly for qwen2vl

Issue - State: closed - Opened by sharonsalabiglossai about 2 months ago - 3 comments
Labels: solved

#5524 - qwen2 pre-training loss oscillates

Issue - State: closed - Opened by allen20200111 about 2 months ago - 1 comment
Labels: solved

#5523 - Model evaluation: all results on the ceval dataset are 0; initial analysis suggests the ceval->test dataset used for testing has no answer. What is going on?

Issue - State: closed - Opened by yiyayieryo about 2 months ago - 2 comments
Labels: duplicate

#5522 - [Update] loader.py , evaluate will run separate evaluations on each eval_dataset

Pull Request - State: open - Opened by SrWYG about 2 months ago - 2 comments
Labels: pending

#5521 - Support for loading local HuggingFace-formatted Datasets

Issue - State: closed - Opened by nathan-az about 2 months ago - 2 comments
Labels: solved

#5520 - The group QR code has expired and cannot be used to join; it needs to be updated

Issue - State: closed - Opened by eddey666 about 2 months ago
Labels: solved

#5519 - Video processing logic during training

Issue - State: open - Opened by liuao743 about 2 months ago - 1 comment
Labels: pending

#5518 - Deployment problem after function call fine-tuning

Issue - State: open - Opened by pugnazhaotianqi about 2 months ago
Labels: pending

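#5517 - After merging the model with LlamaFactory, inference is very slow and gives repetitive, nonsensical answers, while dynamic inference behaves normally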

Issue - State: closed - Opened by Scorponok31 about 2 months ago - 2 comments
Labels: invalid

#5516 - Error when using Qlora with Qwen-7B on Alibaba's PAI-DSW

Issue - State: closed - Opened by lzy728 about 2 months ago - 1 comment
Labels: solved

#5515 - The README says Qwen2.5 is already supported, but there are still no Qwen2 or Qwen2.5 templates when selecting a template

Issue - State: closed - Opened by lishiyucn about 2 months ago - 2 comments
Labels: solved

#5514 - What is the default optimizer? How can I add my own optimizer, such as SGD?

Issue - State: closed - Opened by DSW2001 about 2 months ago - 1 comment
Labels: solved

#5513 - Is it possible to add differential privacy to the train function? If I want to use opacus to add differential privacy during sft fine-tuning, how should I do it?

Issue - State: open - Opened by DSW2001 about 2 months ago - 1 comment
Labels: pending

#5512 - How to train the mm_proj and the LLM part with lora of Qwen2-VL

Issue - State: open - Opened by leoozy about 2 months ago - 3 comments
Labels: pending

#5511 - Does the author plan to support sequence parallelism, similar to xtuner? It seems xtuner's sequence parallel interface could be integrated

Issue - State: closed - Opened by ldh127 about 2 months ago - 1 comment
Labels: solved

#5510 - [Feature Request] Could support for Liger-Kernel be added?

Issue - State: closed - Opened by Orion-zhen about 2 months ago - 1 comment
Labels: solved

#5509 - How can the pixels of each image be specified during multi-image training? Internvl has a related feature during training

Issue - State: open - Opened by leoozy about 2 months ago
Labels: pending

#5508 - Error loading the model when fine-tuning glm4

Issue - State: closed - Opened by WWeellkkiinn about 2 months ago - 1 comment
Labels: solved

#5507 - Add deepseek-v2.5 template

Pull Request - State: open - Opened by piamo about 2 months ago
Labels: pending

#5506 - The Deepseek v2.5 template has changed and differs from v2

Issue - State: open - Opened by piamo about 2 months ago
Labels: pending

#5505 - WSD Learning rate scheduling problem

Issue - State: closed - Opened by runningto about 2 months ago - 1 comment
Labels: solved

#5504 - pretrain from scratch: outputs are all numbers

Issue - State: open - Opened by UbeCc about 2 months ago
Labels: pending

#5503 - KeyError: 'messages'

Issue - State: closed - Opened by laozhai507 about 2 months ago
Labels: invalid

#5502 - The latest code does not support the visual_inputs parameter

Issue - State: closed - Opened by laozhai507 about 2 months ago - 1 comment
Labels: solved

#5501 - ValueError: Unrecognized configuration class when merging Qwen2-VL after fine-tuning

Issue - State: closed - Opened by AmuzeLu about 2 months ago - 2 comments
Labels: solved

#5499 - When resuming training from a checkpoint, the post-training checkpoint is not saved

Issue - State: closed - Opened by cuisws about 2 months ago - 2 comments
Labels: solved

#5498 - Data is too long: even with zero3 enabled, a single sample does not fit on one GPU, so training is impossible

Issue - State: closed - Opened by zhangyuqi-1 about 2 months ago - 2 comments
Labels: duplicate

#5497 - Cannot resume training after save_only_model

Issue - State: open - Opened by yuepengs about 2 months ago
Labels: pending

#5496 - Can you support Jamba 1.5 model and Mamba family models, mamba2-hybrid, ssm model, etc pls?

Issue - State: open - Opened by badrabbitt about 2 months ago
Labels: pending
