Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / hiyouga/LLaMA-Factory issues and pull requests
#5599 - Base model pretrain doesn't have eos token?
Issue -
State: closed - Opened by sts07142 about 1 month ago
- 1 comment
Labels: solved
#5598 - Llama3.2 3B is extremely slow
Issue -
State: closed - Opened by dayuyang1999 about 1 month ago
- 1 comment
Labels: solved
#5597 - Order of samples when doing batch inference
Issue -
State: open - Opened by dayuyang1999 about 1 month ago
Labels: pending
#5596 - ValueError: Trying to set a tensor of shape torch.Size([197002752]) in "weight" (which has shape torch.Size([128256, 3072])), this look incorrect.
Issue -
State: open - Opened by amankumarhal about 1 month ago
- 3 comments
Labels: pending
#5595 - Question on deepspeed zero3 + qlora
Issue -
State: closed - Opened by mces89 about 1 month ago
- 1 comment
Labels: solved
#5593 - After using llama-3-8B directly as the base model and completing sft training, the model output keeps repeating
Issue -
State: closed - Opened by ZWH-ASTAR about 1 month ago
- 1 comment
Labels: solved
#5592 - [BUG] Unable to run model post training with Unsloth+DoRa+RsLora
Issue -
State: closed - Opened by Tejaswgupta about 1 month ago
- 2 comments
Labels: solved
#5591 - Fine-tuning fails with datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
Issue -
State: closed - Opened by xiao-liya about 1 month ago
- 1 comment
Labels: solved
#5590 - 2Nodes * 8 A100 80G sft full Qwen2VL OOM
Issue -
State: open - Opened by VincentVanNF about 1 month ago
- 1 comment
Labels: pending
#5589 - qwen2vl-72b inference script
Issue -
State: closed - Opened by yhy-2000 about 1 month ago
- 1 comment
Labels: solved
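#5588 - How to ensure fully reproducible results? llama-factory produces different results on every run with the same dataset, same hyperparameters, and same random seed (metrics fluctuate by 1-2%)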
Issue -
State: closed - Opened by mfj9999 about 1 month ago
- 1 comment
Labels: solved
#5587 - Multi-node multi-GPU fine-tuning hangs; the two machines can ping each other, and multi-node multi-GPU DDP in my own other projects runs normally
Issue -
State: open - Opened by WangJennie about 1 month ago
- 9 comments
Labels: pending
#5586 - ds_z3_config.json stage3_prefetch_bucket_size should be an integer
Issue -
State: open - Opened by ZhuJD-China about 1 month ago
- 3 comments
Labels: pending
#5585 - Support EXAONE3.0 Model
Pull Request -
State: closed - Opened by shing100 about 1 month ago
- 1 comment
Labels: solved
#5584 - Resource question
Issue -
State: closed - Opened by weilx2267 about 1 month ago
- 1 comment
Labels: solved
#5583 - RuntimeError: mat1 and mat2 must have the same dtype, but got BFloat16 and Float in fused_linear_cross_entropy_forward
Issue -
State: open - Opened by kostum123 about 1 month ago
Labels: pending
#5582 - Does Qwen-2.5+unsloth support 64k training?
Issue -
State: closed - Opened by XiangTodayEatsWhat about 1 month ago
- 1 comment
Labels: solved
#5581 - [WIP] Support Pixtral-12B
Pull Request -
State: open - Opened by Kuangdd01 about 1 month ago
- 4 comments
Labels: pending
#5580 - made a small change to a warning about fa2 for gemma2 models.
Pull Request -
State: closed - Opened by amrear about 1 month ago
Labels: solved
#5579 - How to save model weights in fp16 format
Issue -
State: closed - Opened by lufred8341 about 1 month ago
- 2 comments
Labels: solved
#5578 - Is there a UI or implementation for displaying evaluation results that can be used as a reference?
Issue -
State: closed - Opened by czhcc about 1 month ago
Labels: wontfix
#5577 - Qlora training and merging (after Qlora training on a quantized model, merging is reported as not possible)
Issue -
State: closed - Opened by xuwang0117 about 1 month ago
- 3 comments
Labels: solved
#5576 - Hello, a question: are there plans for llama-factor to integrate with ray for training and inference?
Issue -
State: open - Opened by cicijohn1983 about 1 month ago
Labels: pending
#5574 - support llava-next(video)/video-llava
Pull Request -
State: closed - Opened by BUAADreamer about 1 month ago
Labels: solved
#5573 - Issue when saving a checkpoint using unsloth
Issue -
State: open - Opened by amrear about 1 month ago
Labels: pending
#5572 - Is multi-node multi-card training on Ascend 910B supported?
Issue -
State: closed - Opened by LtroiNGU about 1 month ago
- 1 comment
Labels: wontfix, npu
#5570 - Model inference on Ascend only runs with do_sample=False, so the parameters cannot be tuned; is there a solution?
Issue -
State: closed - Opened by warmbreeze92 about 1 month ago
- 1 comment
Labels: wontfix, npu
#5569 - Cannot run with multiple GPUs under WSL
Issue -
State: closed - Opened by gotothehill about 1 month ago
- 1 comment
Labels: wontfix
#5568 - Could the latest LoRA-GA fine-tuning method be supported soon?
Issue -
State: closed - Opened by xyangyan about 1 month ago
Labels: duplicate
#5567 - when support GOT-OCR2 ?
Issue -
State: open - Opened by tbwang-clound about 1 month ago
Labels: pending
#5566 - TDPO
Issue -
State: open - Opened by lycheeyolo about 1 month ago
Labels: pending
#5565 - Problem encountered with vllm multi-GPU inference - qwen2.5
Issue -
State: open - Opened by YChengxin about 1 month ago
Labels: pending
#5564 - vllm multi-GPU inference - problem encountered with qwen2.5
Issue -
State: closed - Opened by YChengxin about 1 month ago
Labels: duplicate
#5563 - Pei eng patch 1
Pull Request -
State: closed - Opened by Pei-eng about 1 month ago
Labels: invalid
#5562 - How to align qwen2-vl fine-tuning training and vllm inference formats
Issue -
State: open - Opened by xuyifan-0731 about 1 month ago
Labels: pending
#5561 - After switching the training model from qwen2-7b to qwen2.5-32b, inference output does not stop once training is complete
Issue -
State: open - Opened by wenocy about 1 month ago
- 7 comments
Labels: pending
#5560 - Multi-node multi-GPU training stays stuck at this point; the two machines are on a LAN; what else needs attention?
Issue -
State: closed - Opened by ZhuJD-China about 1 month ago
- 1 comment
Labels: solved
#5559 - H20 100G*8 qwen14B full sft terminates during the do_predict stage: failed (exitcode: -8)
Issue -
State: closed - Opened by amoyplane about 1 month ago
- 4 comments
Labels: solved
#5558 - Multi-node multi-GPU training: is training with accelerate launch currently unsupported?
Issue -
State: closed - Opened by Hansen06 about 1 month ago
- 1 comment
Labels: solved
#5557 - HQQ quantization fails to serialize model
Issue -
State: closed - Opened by TweedBeetle about 1 month ago
- 1 comment
Labels: solved
#5556 - How to load a model for beam_search
Issue -
State: open - Opened by Maydaytyh about 1 month ago
Labels: pending
#5555 - Support llama3.2vl(WIP).
Pull Request -
State: open - Opened by marko1616 about 1 month ago
Labels: pending
#5554 - Will it support SFT multi-modal large models (for example, qwen2-vl) with plain text?
Issue -
State: closed - Opened by zhshj0110 about 1 month ago
- 2 comments
Labels: solved
#5553 - A100 80G *4 sft full Qwen2-VL-72B-Instruct OOM
Issue -
State: closed - Opened by VincentVanNF about 1 month ago
- 3 comments
Labels: duplicate
#5552 - Batch inference speed difference between the original model and the lora model
Issue -
State: closed - Opened by yysj-zq about 1 month ago
- 2 comments
Labels: duplicate
#5551 - tokenizer times out in the npu environment
Issue -
State: closed - Opened by lambda-lee about 1 month ago
- 2 comments
Labels: solved, npu
#5550 - Full sft training of qwen2-1.5B-instruct with the latest LLamaFactory (updated on the 26th) produces the following deepspeed error
Issue -
State: closed - Opened by xiehust about 1 month ago
- 2 comments
Labels: solved
#5549 - LLaMa-factory: runtime error when deploying Llama-3.2-11B-Vision-Instruct
Issue -
State: closed - Opened by caijx168 about 1 month ago
- 20 comments
Labels: wontfix
#5548 - [Help] Problem when selecting warmup_stable_decay as the learning rate scheduler in the webui
Issue -
State: open - Opened by ishkong about 1 month ago
Labels: enhancement, pending
#5547 - Chore: Support llama3.2.
Pull Request -
State: closed - Opened by marko1616 about 1 month ago
Labels: solved
#5546 - 1, log exceptions in details; 2, check processor is None before calling it
Pull Request -
State: closed - Opened by chengchengpei about 1 month ago
#5545 - OOM when fine tuning 8b with ~64k cutoff_len
Issue -
State: closed - Opened by TweedBeetle about 1 month ago
- 2 comments
Labels: solved
#5544 - [Tip] With transformers>=4.43.0, training on low GPU memory without the following parameters tends to cause memory to accumulate until OOM
Issue -
State: open - Opened by xd2333 about 1 month ago
- 5 comments
Labels: pending
#5543 - Qlora training and model merging problem
Issue -
State: closed - Opened by bitallin about 1 month ago
- 4 comments
Labels: solved
#5542 - Liger kernel brake fine tuning
Issue -
State: closed - Opened by arit2 about 1 month ago
- 4 comments
Labels: solved
#5541 - qwen2.5-7B-instruct lora fine-tuning: loss stays at 0.0
Issue -
State: closed - Opened by Liufeiran123 about 1 month ago
Labels: invalid
#5540 - Why is video inference quality very poor when loading the qwen2-vl-72b model with llamafactory chat?
Issue -
State: open - Opened by J0eky about 1 month ago
Labels: pending
#5539 - Problem with lora fine-tuning of qwen2.5-math-7b
Issue -
State: open - Opened by lin-dy about 1 month ago
- 2 comments
Labels: pending
#5538 - Qwen2VL model image recognition is inaccurate
Issue -
State: open - Opened by JohnZhuYX about 1 month ago
- 1 comment
Labels: pending
#5537 - throughput is much slower than expected for pre-training
Issue -
State: open - Opened by lingchensanwen about 1 month ago
Labels: pending
#5536 - Update identity.json
Pull Request -
State: closed - Opened by Cherry39-lab about 1 month ago
Labels: invalid
#5535 - Specifying an evaluation_set (validation_set) during training
Issue -
State: closed - Opened by mzc2113391 about 1 month ago
- 1 comment
Labels: solved
#5534 - Tool calls fail after deploying the trained agent model with vllm
Issue -
State: closed - Opened by bingoohe about 1 month ago
- 1 comment
Labels: solved
#5533 - Add additional install options to Dockerfiles
Pull Request -
State: closed - Opened by StrangeBytesDev about 1 month ago
Labels: solved
#5532 - feat: Long Text Fine-Tuning Support
Pull Request -
State: open - Opened by glide-the about 2 months ago
- 1 comment
#5531 - Cannot install llamafactory 0.9.1.dev0 (from /code/LLaMA-Factory) because these package versions have conflicting dependencies.
Issue -
State: closed - Opened by leoozy about 2 months ago
- 1 comment
Labels: solved
#5530 - How to save only the model during single-node multi-GPU sft with deepspeed
Issue -
State: closed - Opened by lufred8341 about 2 months ago
- 1 comment
Labels: solved
#5529 - qwen model raises an error when using unsloth
Issue -
State: closed - Opened by cat-knight about 2 months ago
- 1 comment
Labels: wontfix
#5528 - No liger kernels will be applied. Qwen2-vl
Issue -
State: closed - Opened by arit2 about 2 months ago
- 1 comment
Labels: solved
#5527 - list index out of range when fine-tuning qwen2 video
Issue -
State: closed - Opened by wudidaxuexue about 2 months ago
- 4 comments
Labels: solved
#5526 - LORA fine-tuning of LLaMa2-7b-chat: inference fails with Some keys are not used by the HfArgumentParser: ['eval_dataset', 'quantization_method']
Issue -
State: closed - Opened by Math312 about 2 months ago
- 2 comments
Labels: solved
#5525 - Flash 2 attention warning, flash attention not working properly for qwen2vl
Issue -
State: closed - Opened by sharonsalabiglossai about 2 months ago
- 3 comments
Labels: solved
#5524 - qwen2 pre-training loss oscillates
Issue -
State: closed - Opened by allen20200111 about 2 months ago
- 1 comment
Labels: solved
#5523 - Model evaluation: results on the ceval dataset are all 0; preliminary analysis: the ceval->test dataset used for testing has no answer. What is going on?
Issue -
State: closed - Opened by yiyayieryo about 2 months ago
- 2 comments
Labels: duplicate
#5522 - [Update] loader.py , evaluate will run separate evaluations on each eval_dataset
Pull Request -
State: open - Opened by SrWYG about 2 months ago
- 2 comments
Labels: pending
#5521 - Support for loading local HuggingFace-formatted Datasets
Issue -
State: closed - Opened by nathan-az about 2 months ago
- 2 comments
Labels: solved
#5520 - The group QR code has expired and cannot be used to join; it needs to be updated
Issue -
State: closed - Opened by eddey666 about 2 months ago
Labels: solved
#5519 - Video processing logic during training
Issue -
State: open - Opened by liuao743 about 2 months ago
- 1 comment
Labels: pending
#5518 - Deployment problem after function call fine-tuning
Issue -
State: open - Opened by pugnazhaotianqi about 2 months ago
Labels: pending
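#5517 - After merging the model with LlamaFactory, inference is very slow with repetitive and nonsensical answers, while dynamic inference behaves normally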
Issue -
State: closed - Opened by Scorponok31 about 2 months ago
- 2 comments
Labels: invalid
#5516 - Error when using Qlora with Qwen-7B on Alibaba's PAI-DSW
Issue -
State: closed - Opened by lzy728 about 2 months ago
- 1 comment
Labels: solved
#5515 - READ.ME says Qwen2.5 is already supported, but there are still no Qwen2 or Qwen2.5 templates when selecting a template
Issue -
State: closed - Opened by lishiyucn about 2 months ago
- 2 comments
Labels: solved
#5514 - What is the default optimizer? How can I add my own optimizer, such as SGD?
Issue -
State: closed - Opened by DSW2001 about 2 months ago
- 1 comment
Labels: solved
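#5513 - Is it possible to add differentially private training to the train function? If I want to use opacus to add differential privacy during sft fine-tuning, how should I do it?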
Issue -
State: open - Opened by DSW2001 about 2 months ago
- 1 comment
Labels: pending
#5512 - How to train the mm_proj and the LLM part with lora of Qwen2-VL
Issue -
State: open - Opened by leoozy about 2 months ago
- 3 comments
Labels: pending
#5511 - Does the author plan to support sequence parallelism, similar to xtuner? It seems xtuner's sequence parallel interface could be integrated
Issue -
State: closed - Opened by ldh127 about 2 months ago
- 1 comment
Labels: solved
#5510 - [Feature Request] Could support for Liger-Kernel be added?
Issue -
State: closed - Opened by Orion-zhen about 2 months ago
- 1 comment
Labels: solved
#5509 - How can the pixels of each image be specified in multi-image training? Internvl has a related feature during training
Issue -
State: open - Opened by leoozy about 2 months ago
Labels: pending
#5508 - Error loading the model for glm4 fine-tuning
Issue -
State: closed - Opened by WWeellkkiinn about 2 months ago
- 1 comment
Labels: solved
#5507 - Add deepseek-v2.5 template
Pull Request -
State: open - Opened by piamo about 2 months ago
Labels: pending
#5506 - The Deepseek v2.5 template has changed and differs from v2
Issue -
State: open - Opened by piamo about 2 months ago
Labels: pending
#5505 - WSD Learning rate scheduling problem
Issue -
State: closed - Opened by runningto about 2 months ago
- 1 comment
Labels: solved
#5504 - pretrain from scratch: outputs are all numbers
Issue -
State: open - Opened by UbeCc about 2 months ago
Labels: pending
#5503 - KeyError: 'messages'
Issue -
State: closed - Opened by laozhai507 about 2 months ago
Labels: invalid
#5502 - The latest code does not support the visual_inputs parameter
Issue -
State: closed - Opened by laozhai507 about 2 months ago
- 1 comment
Labels: solved
#5501 - ValueError: Unrecognized configuration class when merging Qwen2-VL after fine-tuning
Issue -
State: closed - Opened by AmuzeLu about 2 months ago
- 2 comments
Labels: solved
#5499 - When continuing training from a checkpoint, the checkpoint after training is not saved
Issue -
State: closed - Opened by cuisws about 2 months ago
- 2 comments
Labels: solved
#5498 - Samples are too long; even with zero3 enabled, a single sample does not fit on one GPU, so training is impossible
Issue -
State: closed - Opened by zhangyuqi-1 about 2 months ago
- 2 comments
Labels: duplicate
#5497 - Cannot resume training after save_only_model
Issue -
State: open - Opened by yuepengs about 2 months ago
Labels: pending
#5496 - Can you support Jamba 1.5 model and Mamba family models, mamba2-hybrid, ssm model, etc pls?
Issue -
State: open - Opened by badrabbitt about 2 months ago
Labels: pending