modelscope/ms-swift issues and pull requests

#1953 - How to set internal model parameters in swift, e.g. max_dynamic_patch in internvl

Issue - State: open - Opened by zws-2019 14 days ago - 5 comments

#1952 - Support minicpm 3

Pull Request - State: closed - Opened by Jintao-Huang 14 days ago

#1951 - Error during qwen2 model inference on NPU

Issue - State: open - Opened by JiayuQiao 14 days ago - 1 comment

#1950 - fix qwen2-vl & video

Pull Request - State: closed - Opened by Jintao-Huang 14 days ago

#1949 - fix rlhf

Pull Request - State: closed - Opened by hjh0119 14 days ago

#1948 - fix file rename error in megatron when there are multi process

Pull Request - State: closed - Opened by Zhikaiiii 14 days ago

#1947 - support dynamic_eos

Pull Request - State: closed - Opened by Jintao-Huang 15 days ago

#1946 - fix do_sample

Pull Request - State: closed - Opened by Jintao-Huang 15 days ago - 7 comments

#1945 - fix lmdeploy seed

Pull Request - State: closed - Opened by Jintao-Huang 15 days ago

#1944 - BUG: init_lora lead to the wrong distribute?

Issue - State: closed - Opened by bonre 15 days ago - 2 comments

#1943 - Passing the do_sample parameter has no effect during swift infer

Issue - State: closed - Opened by baibaiw5 15 days ago - 3 comments

#1942 - update yi-coder

Pull Request - State: closed - Opened by Jintao-Huang 15 days ago

#1941 - Is training multimodal large models on NPU supported?

Issue - State: open - Opened by ChingKwanCheung 15 days ago - 2 comments

#1940 - Fine-tuning is very slow

Issue - State: open - Opened by gengpeip 15 days ago - 11 comments

#1939 - GPU memory utilization is very low when reading data in streaming mode

Issue - State: open - Opened by guozhiyao 15 days ago - 7 comments
Labels: bug

#1938 - Training stops for `KTO` after model loads into memory.

Issue - State: open - Opened by Aunali321 15 days ago - 5 comments

#1937 - Error when running qwen2_audio_7b_instruct inference with VLLM

Issue - State: open - Opened by huangzj421 15 days ago - 2 comments

#1936 - fix swift deploy

Pull Request - State: closed - Opened by Jintao-Huang 15 days ago

#1935 - Insufficient GPU memory when fine-tuning Qwen2_VL-2B

Issue - State: open - Opened by Betty-J 15 days ago - 4 comments
Labels: question

#1934 - Fine-tuning glm4v: added checkpointing to every part of the glm4v vision module, but GPU memory still overflows (lora_target_modules set to 'ALL')

Issue - State: open - Opened by samaritan1998 15 days ago - 4 comments

#1933 - fix typing

Pull Request - State: closed - Opened by Jintao-Huang 15 days ago

#1932 - Error when clicking “Use via API” from the “Training and Inference UI”.

Issue - State: open - Opened by KnightLancelot 15 days ago - 2 comments

#1931 - # After inspecting the data, found that the code below filters out some data that has no problems, e.g.: sure, here are some tools and …

Pull Request - State: open - Opened by KnightLancelot 15 days ago

#1930 - Mismatch in DPO with internvl2

Issue - State: closed - Opened by Ranking666 16 days ago - 5 comments
Labels: bug

#1929 - TypeError: Qwen2ForCausalLM.forward() got an unexpected keyword argument '_data'

Issue - State: closed - Opened by xiamaozi11 16 days ago - 5 comments

#1928 - Support gradient_checkpointing for the vision module

Issue - State: open - Opened by samaritan1998 16 days ago
Labels: enhancement

#1927 - Qwen2-VL-7B-instruct fine-tuning error: RuntimeError: CUDA error: too many resources requested for launch

Issue - State: closed - Opened by xiajinxiong 16 days ago - 1 comment

#1926 - update docs & fix bug

Pull Request - State: closed - Opened by Jintao-Huang 16 days ago

#1925 - update wechat

Pull Request - State: closed - Opened by tastelikefeet 16 days ago

#1924 - minicpm-v-v2.6 evaluation returns a result of 0

Issue - State: open - Opened by zhudongmei123 16 days ago

#1923 - Support for Fine-Tuning Best Practices with LLaVA-OV

Issue - State: open - Opened by YoungjaeDev 16 days ago

#1922 - KeyError: 'prompt' when running DPO on an MLLM with a custom dataset

Issue - State: closed - Opened by SparrowZheyuan18 16 days ago - 2 comments

#1921 - [Feature request] Video inference via asynchronous client requests for the Internvl2 model with the VLLM backend

Issue - State: open - Opened by PancakeAwesome 16 days ago - 1 comment
Labels: enhancement

#1920 - Qwen2-VL-7B-Instruct Video inference

Issue - State: closed - Opened by wangli68 16 days ago - 1 comment

#1919 - MooER audio support request

Issue - State: open - Opened by seetimee 16 days ago
Labels: enhancement

#1918 - DatasetGenerationError when fine-tuning qwen2-vl-2b-instruct with a custom dataset

Issue - State: closed - Opened by lgy0404 16 days ago

#1917 - qwen2-vl-2b-instruct fine-tuning error: ValueError: push_best is not a valid HubStrategy, please select one of ['end', 'every_save', 'checkpoint', 'all_checkpoints']

Issue - State: closed - Opened by learn01one 16 days ago - 4 comments

#1916 - [TorchAcc] perf: use xm.save instead of torch.save

Pull Request - State: closed - Opened by baoleai 16 days ago

#1915 - refactor docs

Pull Request - State: closed - Opened by tastelikefeet 17 days ago

#1914 - fix bugs when megatron_patch_path

Pull Request - State: closed - Opened by Zhikaiiii 17 days ago

#1913 - Refactor docs

Pull Request - State: closed - Opened by tastelikefeet 17 days ago

#1912 - Refactor docs

Pull Request - State: closed - Opened by tastelikefeet 17 days ago

#1911 - llava-llama-3-8b-v1_1 AttributeError: 'NoneType' object has no attribute 'get_output_embeddings'

Issue - State: open - Opened by thisiskofi 17 days ago - 2 comments
Labels: bug

#1910 - internvl2-26b multi-GPU training error: “Expected all tensors to be on the same device...”

Issue - State: closed - Opened by tzw451721677 17 days ago - 3 comments

#1909 - fix web-ui push to hub strategy

Pull Request - State: closed - Opened by tastelikefeet 17 days ago

#1908 - update docs

Pull Request - State: closed - Opened by Jintao-Huang 17 days ago

#1907 - deepspeed use cosine lr_scheduler

Pull Request - State: closed - Opened by Jintao-Huang 17 days ago

#1906 - AssertionError: DeepSpeed does not recognize LR scheduler WarmupCosineLR

Issue - State: closed - Opened by Jintao-Huang 17 days ago

#1905 - qwen2-vl-2b-instruct fine-tuning error: importlib.metadata.PackageNotFoundError: No package metadata was found for The 'qwen_vl_utils' distribution was not found and is required by this application.

Issue - State: closed - Opened by lgy0404 17 days ago - 2 comments

#1904 - Cannot get model_type from the deploy service

Issue - State: closed - Opened by Harry-zzh 17 days ago - 2 comments

#1903 - [TorchAcc] fix: fix the judgement of fsdp_num

Pull Request - State: closed - Opened by baoleai 17 days ago

#1902 - After fine-tuning GLM4V-9B, calling the fine-tuned model directly fails: ValueError: The following `model_kwargs` are not used by the model: ['images'] (note: typos in the generate arguments will also show up in this list)

Issue - State: closed - Opened by tw-repository 17 days ago - 1 comment

#1901 - fix push_to_ms

Pull Request - State: closed - Opened by tastelikefeet 17 days ago

#1900 - support logprobs

Pull Request - State: closed - Opened by Jintao-Huang 17 days ago

#1899 - DPO fine-tuning is incompatible with zero3

Issue - State: closed - Opened by zhangfan-algo 17 days ago - 5 comments
Labels: enhancement

#1898 - Failed to import swift.llm.sft because of the following error

Issue - State: closed - Opened by orzgugu 17 days ago - 15 comments
Labels: bug

#1897 - Fix push_to_hub when last-checkpoint

Pull Request - State: closed - Opened by tastelikefeet 17 days ago

#1896 - Example data format for qwen2-vl-chat-instruct

Issue - State: open - Opened by Guangming92 17 days ago
Labels: enhancement

#1895 - Does the Merge LoRA & quantization part support bnb quantization?

Issue - State: closed - Opened by kelenlv 17 days ago - 2 comments

#1894 - lr_scheduler_type

Issue - State: closed - Opened by mc-lan 17 days ago - 1 comment

#1893 - support custom quantized dataset

Pull Request - State: closed - Opened by tastelikefeet 17 days ago

#1892 - internvl2-llama3-76b fine-tuning error

Issue - State: open - Opened by zhangfan-algo 17 days ago - 6 comments
Labels: bug

#1891 - How to change the type of training loss function? (Custom loss function)

Issue - State: open - Opened by XiaoMaGe-hero 17 days ago - 3 comments
Labels: enhancement

#1890 - Add some warnings and fix RLHF

Pull Request - State: closed - Opened by tastelikefeet 18 days ago

#1889 - add vllm lmdeploy benchmark

Pull Request - State: closed - Opened by Jintao-Huang 18 days ago

#1888 - Fix push to hub logic

Pull Request - State: closed - Opened by tastelikefeet 18 days ago

#1887 - Error when fine-tuning qwen2-vl with flash_attn

Issue - State: open - Opened by zhangfan-algo 18 days ago - 6 comments

#1886 - DPO fine-tuning of internvl2

Issue - State: closed - Opened by Ranking666 18 days ago - 1 comment

#1885 - refactor rlhf

Pull Request - State: closed - Opened by hjh0119 18 days ago

#1884 - support qwen2-vl gptq awq

Pull Request - State: closed - Opened by Jintao-Huang 18 days ago

#1883 - Refactor push_to_hub

Pull Request - State: closed - Opened by tastelikefeet 19 days ago

#1882 - Fine-tuning qwen2-7b on two V100s: single-GPU fine-tuning works, but dual-GPU fine-tuning raises RuntimeError: CUDA error: device-side assert triggered CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.

Issue - State: closed - Opened by kksmi 19 days ago - 3 comments

#1881 - Inference error after fine-tuning the internvl-40b model

Issue - State: closed - Opened by ymlab 19 days ago - 2 comments

#1880 - Support freeze vit

Pull Request - State: closed - Opened by Jintao-Huang 19 days ago

#1879 - How to freeze the ViT part during full parameter fine-tuning of Qwen2-vl?

Issue - State: closed - Opened by Jintao-Huang 19 days ago - 1 comment

#1878 - ImportError: cannot import name 'LlavaOnevisionForConditionalGeneration' from 'transformers'

Issue - State: closed - Opened by Lopa07 19 days ago - 3 comments

#1877 - add duet

Pull Request - State: closed - Opened by tastelikefeet 19 days ago

#1876 - Question: how should over-length multimodal data such as images and videos be truncated?

Issue - State: closed - Opened by HuiResearch 19 days ago - 4 comments

#1875 - Fix neftune doc

Pull Request - State: closed - Opened by tastelikefeet 20 days ago

#1874 - Fix num_proc

Pull Request - State: closed - Opened by Jintao-Huang 20 days ago

#1873 - Add train record

Pull Request - State: closed - Opened by tastelikefeet 20 days ago

#1872 - [TorchAcc] fix several bugs for torchacc FSDP.

Pull Request - State: closed - Opened by baoleai 20 days ago

#1871 - Support faster data map

Pull Request - State: closed - Opened by tastelikefeet 20 days ago

#1870 - qwen2-vl fine-tuning error: module 'torch.nn' has no attribute 'RMSNorm'

Issue - State: closed - Opened by Jintao-Huang 20 days ago

#1869 - update docs qwen2-vl

Pull Request - State: closed - Opened by Jintao-Huang 20 days ago

#1868 - support qwen2-vl zero3

Pull Request - State: closed - Opened by Jintao-Huang 21 days ago

#1867 - CUDA error: too many resources requested for launch (V100, qwen2-vl)

Issue - State: open - Opened by Jiax323 21 days ago - 15 comments

#1866 - Memory(not GPU RAM) exceeds when using 'swift deploy'

Issue - State: open - Opened by VenusHui 21 days ago
Labels: bug

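#1865 - Processing some datasets triggers a warning about exceeding max length, but the actual data does not appear to exceed it; branches from the previous few versions did not have this problem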

Issue - State: closed - Opened by changqingla 21 days ago - 1 comment

#1864 - fix requirements

Pull Request - State: closed - Opened by Jintao-Huang 21 days ago

#1863 - minicpm-V-2 best practice: the model outputs nothing during inference

Issue - State: open - Opened by hxzl-98 21 days ago - 4 comments

#1862 - deepspeed-zero3: Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=11560550.4, input_type=float]

Issue - State: closed - Opened by Jintao-Huang 21 days ago

#1861 - fix qwen2-vl docs

Pull Request - State: closed - Opened by Jintao-Huang 21 days ago

#1860 - Out of memory when fine-tuning Qwen2-VL-7B

Issue - State: closed - Opened by KirbytroNic0528 21 days ago - 4 comments

#1859 - Out of memory when fine-tuning Qwen2-VL-7B

Issue - State: closed - Opened by KirbytroNic0528 21 days ago - 3 comments

#1858 - update qwen2-vl docs

Pull Request - State: closed - Opened by Jintao-Huang 21 days ago
