modelscope/ms-swift issues and pull requests

#2053 - Pixtral 12B

Issue - State: open - Opened by SinanAkkoyun 2 days ago

#2052 - support qwen2-vl-base

Pull Request - State: open - Opened by Jintao-Huang 2 days ago

#2051 - fix qwen2vl position_ids

Pull Request - State: open - Opened by Jintao-Huang 2 days ago

#2050 - update docs

Pull Request - State: closed - Opened by Jintao-Huang 3 days ago

#2049 - How can I freeze part of a pretrained model, attach a custom PyTorch model, and train it?

Issue - State: open - Opened by CDWJ 3 days ago - 2 comments

#2048 - llama3 tool calling

Pull Request - State: closed - Opened by tastelikefeet 4 days ago

#2047 - Fix multi coordinate grounding

Pull Request - State: closed - Opened by tastelikefeet 4 days ago

#2046 - internlm-xcomposer2-7b-chat raises an error when using use_flash_attn

Issue - State: closed - Opened by fly-dragon211 5 days ago - 2 comments

#2045 - support multi bbox grounding

Pull Request - State: closed - Opened by tastelikefeet 5 days ago

#2044 - DPO training error `RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!`

Issue - State: open - Opened by Lopa07 5 days ago - 1 comment

#2043 - chatglm4v-9b full-parameter fine-tuning (full) error: File "/usr/local/Python3.10.0/lib/python3.10/site-packages/deepspeed/runtime/zero/stage3.py", line 149, in __init__ [rank0]: self.dtype = self.optimizer.param_groups[0]['params'][0].dtype [rank0]: IndexError: list index out of range

Issue - State: open - Opened by Sunxiaohu0406 5 days ago

#2042 - fix mplug-owl3

Pull Request - State: closed - Opened by Jintao-Huang 5 days ago - 1 comment

#2041 - Why is sft_type set to full after merging LoRA? Shouldn't it be lora?

Issue - State: closed - Opened by Sunxiaohu0406 6 days ago - 1 comment

#2040 - Why is sft set to full after merging LoRA?

Issue - State: closed - Opened by Sunxiaohu0406 6 days ago - 1 comment

#2039 - AttributeError: 'Qwen2VLForConditionalGeneration' object has no attribute 'quantize'. Did you mean: 'dequantize'?

Issue - State: open - Opened by CachCheng 6 days ago

#2038 - Error when launching deployment via the OpenAI API: ImportError: cannot import name 'prepare_vllm_engine_template' from 'swift.llm.utils'

Issue - State: open - Opened by lgy0404 6 days ago - 2 comments

#2037 - Add longwriter filtered dataset

Pull Request - State: closed - Opened by wangxingjun778 6 days ago

#2036 - Training via the web UI on Windows fails with: invalid float value: "'1e-4'"

Issue - State: open - Opened by chuanSir123 6 days ago - 1 comment

#2035 - LISA training either runs out of memory (OOM) or errors out when using DeepSpeed

Issue - State: open - Opened by LIUKAI0815 6 days ago

#2034 - fix rlhf & zero3

Pull Request - State: closed - Opened by Jintao-Huang 6 days ago

#2033 - Seeking advice on fine-tuning combined with quantization

Issue - State: open - Opened by lzcchl 7 days ago

#2032 - Fix olora and pissa file saving, which caused the second save to fail

Pull Request - State: closed - Opened by tastelikefeet 7 days ago

#2031 - DPO support resume_from_checkpoint

Issue - State: closed - Opened by Jintao-Huang 7 days ago - 1 comment

#2030 - [WIP]Feat/refactor3

Pull Request - State: open - Opened by tastelikefeet 7 days ago

#2029 - fix deploy eval kill

Pull Request - State: closed - Opened by Jintao-Huang 7 days ago

#2028 - update code

Pull Request - State: closed - Opened by Jintao-Huang 7 days ago

#2027 - internvl2-8b runs out of memory (OOM) when training in Docker

Issue - State: open - Opened by mc-lan 7 days ago

#2026 - Save path for model inference

Issue - State: open - Opened by Ranking666 8 days ago - 1 comment

#2025 - Error during DPO training with cogvlm2 on the rlaif-v dataset

Issue - State: closed - Opened by kaka-Cao 8 days ago - 2 comments

#2024 - How to track pooling stride and frame count of llava-next-video in Swift

Issue - State: open - Opened by YoungjaeDev 8 days ago

#2023 - Swift DPO template format issue

Issue - State: open - Opened by jameslian87v5 8 days ago - 6 comments

#2022 - Can't Swift do continued pre-training of a model? How exactly is it done?

Issue - State: open - Opened by kandada 8 days ago

#2021 - cannot import name 'ftp_head' from 'datasets.utils.file_utils'

Issue - State: closed - Opened by 77h2l 8 days ago - 8 comments

#2020 - Internlm-Xcomposer2.5 raises an error when given multiple images at inference

Issue - State: closed - Opened by diodes-zhang 8 days ago - 1 comment
Labels: bug, solved

#2019 - Florence use _post_encode & template support encoder-decoder

Pull Request - State: closed - Opened by Jintao-Huang 9 days ago

#2018 - transformers>=4.45.0.dev0

Issue - State: closed - Opened by Ranking666 9 days ago - 3 comments

#2017 - Dataset format compatibility between LLaVA-NexT and Qwen-VL2 for custom JSON datasets

Issue - State: open - Opened by YoungjaeDev 9 days ago - 1 comment

#2016 - internvl2-40b infer

Issue - State: open - Opened by wangli68 9 days ago

#2015 - Does DPO/RLHF tuning support internVL2 video models?

Issue - State: closed - Opened by BillChan226 9 days ago - 2 comments

#2014 - Deployment or Export

Issue - State: open - Opened by ztianlin 9 days ago

#2013 - Add FAQ Document

Pull Request - State: closed - Opened by slin000111 9 days ago

#2012 - [Re-appeared] DPO training error UnboundLocalError: local variable 'num_patches' referenced before assignment

Issue - State: closed - Opened by Lopa07 9 days ago - 1 comment

#2011 - Add FAQ Document

Pull Request - State: closed - Opened by slin000111 9 days ago

#2010 - Qwen2-VL-7B-Instruct runs out of GPU memory during training

Issue - State: open - Opened by warm345 9 days ago - 4 comments

#2009 - fix lmdeploy qwen_vl

Pull Request - State: closed - Opened by Jintao-Huang 9 days ago

#2008 - Error during DPO training with the built-in dataset

Issue - State: closed - Opened by JiaXinLI98 9 days ago

#2007 - Regular LoRA target module cannot use lmdeploy

Issue - State: open - Opened by orzgugu 10 days ago - 2 comments

#2006 - How is eval_acc calculated?

Issue - State: closed - Opened by XiaoMaGe-hero 10 days ago - 2 comments

#2005 - Support llava1.6-llama3.1-8b-instruct

Pull Request - State: closed - Opened by DaozeZhang 10 days ago

#2004 - train_on_input

Issue - State: open - Opened by UNO-TTS 10 days ago - 2 comments

#2003 - Fix rlhf ref model

Pull Request - State: closed - Opened by Jintao-Huang 10 days ago

#2002 - The inference interface does not support returning multiple results

Issue - State: open - Opened by thbupt 10 days ago

#2001 - compat lmdeploy==0.6

Pull Request - State: closed - Opened by Jintao-Huang 10 days ago

#2000 - Unable to evaluate the llava1.5 model after LoRA fine-tuning

Issue - State: open - Opened by Harry-zzh 10 days ago - 1 comment

#1999 - MiniCPM-V-2 inference error after LoRA fine-tuning: AssertionError: Current sentence length exceeds the model max_length: 4096

Issue - State: open - Opened by lgy0404 10 days ago - 1 comment

#1998 - The quantized internlm2_5-20b-chat model does not support vLLM inference

Issue - State: open - Opened by tanshoudong 10 days ago
Labels: bug

#1997 - fix patch

Pull Request - State: closed - Opened by Jintao-Huang 10 days ago

#1996 - RuntimeError: 'weight' must be 2-D when using DeepSpeed

Issue - State: closed - Opened by shuye-cheung 10 days ago - 1 comment

#1995 - fix

Pull Request - State: closed - Opened by tastelikefeet 10 days ago

#1994 - Tried several models; all report the same error: sh:1:syntax

Issue - State: closed - Opened by unikok 10 days ago - 1 comment

#1993 - internvl 40B checkpoint errors when saving and when loading for inference

Issue - State: open - Opened by minaa-lab 10 days ago - 1 comment

#1992 - Support Deepseek 2.5

Pull Request - State: closed - Opened by DaozeZhang 10 days ago

#1991 - Inference Not Working on cuda:1

Issue - State: open - Opened by YoungjaeDev 10 days ago - 2 comments

#1990 - fix EngineGenerationConfig importError of lmdeploy

Pull Request - State: closed - Opened by irexyc 11 days ago - 1 comment

#1989 - O-lora does not work

Issue - State: closed - Opened by CoderXiaopang 11 days ago - 2 comments

#1988 - qwen2-vl-7b-instruct raises an error on video inference

Issue - State: closed - Opened by 1765643327 11 days ago

#1987 - Can't do RLHF with InternVL2-4B

Issue - State: closed - Opened by yuanf8 11 days ago - 2 comments

#1986 - LLaVA-NeXT-Video model configuration initialize error

Issue - State: open - Opened by VenusHui 11 days ago - 3 comments

#1985 - lmdeploy's main branch has removed EngineGenerationConfig; calling main-branch lmdeploy from Swift currently raises an error

Issue - State: closed - Opened by RandomCoins 11 days ago - 1 comment

#1984 - Issue: RuntimeError with Multiple GPUs in MS Swift 2.5.0-dev

Issue - State: closed - Opened by YoungjaeDev 11 days ago - 5 comments

#1983 - lr_scheduler_type

Issue - State: closed - Opened by NancyGu 11 days ago - 2 comments

#1982 - fix model_mapping

Pull Request - State: closed - Opened by Jintao-Huang 11 days ago

#1981 - Uploading models to Hugging Face from MS Swift

Issue - State: closed - Opened by YoungjaeDev 11 days ago - 1 comment

#1980 - fix typo

Pull Request - State: closed - Opened by Jintao-Huang 11 days ago

#1979 - Error when fine-tuning InternVL2-2B with DPO

Issue - State: closed - Opened by guihonghao 11 days ago - 5 comments

#1978 - llava1_5-7b-instruct produces normal output at the start of evaluation but not later on

Issue - State: open - Opened by Harry-zzh 11 days ago

#1977 - Dataset preparation for Object Detection with Florence2

Issue - State: open - Opened by Mistsink 11 days ago - 2 comments

#1976 - Huggingface link is broken

Issue - State: closed - Opened by Tsardoz 12 days ago

#1975 - refactor rlhf

Pull Request - State: closed - Opened by Jintao-Huang 12 days ago

#1974 - Timeout error during evaluation

Issue - State: closed - Opened by Harry-zzh 12 days ago - 2 comments

#1973 - Add reflection model

Pull Request - State: closed - Opened by tastelikefeet 12 days ago

#1972 - How to manually download evaluation datasets and specify the dataset path for evaluation

Issue - State: closed - Opened by Harry-zzh 13 days ago - 1 comment

#1971 - OOM when tokenizing datasets

Issue - State: open - Opened by SparrowZheyuan18 13 days ago - 4 comments

#1970 - update docs

Pull Request - State: closed - Opened by Jintao-Huang 13 days ago

#1969 - mplug-owl3-7b-chat fine-tuning document

Issue - State: open - Opened by Jintao-Huang 13 days ago - 4 comments
Labels: good first issue

#1968 - Dataset gets stuck when using --dataloader_num_workers 1 and --streaming true

Issue - State: open - Opened by tastelikefeet 13 days ago

#1967 - Full-parameter fine-tuning of minicpm-V2.6, but the generated sft_args.json still contains lora

Issue - State: open - Opened by zhaoyangwei123 13 days ago - 1 comment

#1966 - Environment image

Issue - State: closed - Opened by Qiny-dl 13 days ago - 2 comments

#1965 - How should I configure things if I also want to train the head while training LoRA?

Issue - State: open - Opened by daidaiershidi 13 days ago - 2 comments

#1964 - Fix data info print in rlhf

Pull Request - State: closed - Opened by tastelikefeet 14 days ago

#1963 - Fix the lora hook

Pull Request - State: closed - Opened by tastelikefeet 14 days ago

#1962 - support minicpm3-4b

Issue - State: closed - Opened by uRENu 14 days ago - 1 comment

#1961 - Deploying qwen2vl with deploy: errors under multiple concurrent requests

Issue - State: open - Opened by zhengzehong 14 days ago - 1 comment

#1960 - TypeError: BaseAWQForCausalLM.quantize() got an unexpected keyword argument 'n_parallel_calib_samples'

Issue - State: closed - Opened by yichuxue 14 days ago - 1 comment

#1959 - fix bugs

Pull Request - State: closed - Opened by Jintao-Huang 14 days ago

#1958 - Request: RLHF support for sequence parallelism (sequence_parallel)

Issue - State: open - Opened by kangyishuai 14 days ago - 1 comment
Labels: enhancement

#1957 - support mplug_owl3

Pull Request - State: closed - Opened by Jintao-Huang 14 days ago

#1956 - Add lazy_tokenize to RLHF

Pull Request - State: closed - Opened by tastelikefeet 14 days ago

#1955 - GPU memory usage keeps rising during InternVL2 full-parameter fine-tuning

Issue - State: closed - Opened by bonre 14 days ago

#1954 - qwen2-vl-7b-instruct fails to start the inference engine with vLLM: `assert "factor" in rope_scaling`

Issue - State: open - Opened by wyclike 14 days ago - 2 comments
