Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / InternLM/lmdeploy issues and pull requests
#2288 - When will Python 3.12 be supported?
Issue -
State: closed - Opened by wuhongsheng about 1 month ago
- 1 comment
#2287 - [Feature] Support for a decoding method that reduces hallucinations
Issue -
State: open - Opened by zhly0 about 1 month ago
- 1 comment
#2286 - [Bug] Not able to install lmdeploy
Issue -
State: closed - Opened by Pavun-KumarCH about 1 month ago
- 1 comment
#2285 - remove eviction param
Pull Request -
State: closed - Opened by grimoire about 1 month ago
Labels: improvement
#2283 - Question about splitting input for computation
Issue -
State: closed - Opened by SeibertronSS about 1 month ago
#2282 - [Bug] LoRA Finetuned model is not generating any text.
Issue -
State: open - Opened by shiva-vardhineedi about 1 month ago
- 4 comments
#2281 - [Bug] Abnormally repetitive output after quantizing glm4
Issue -
State: open - Opened by maxin9966 about 1 month ago
- 6 comments
#2280 - [Bug] OOM errors while loading AWQ-quantized Mixtral 7X8b instruct model
Issue -
State: closed - Opened by ab6995 about 1 month ago
- 2 comments
#2279 - How to install on domestic accelerators such as Ascend 910 and Hygon DCU Z100L?
Issue -
State: closed - Opened by cgq0816 about 1 month ago
- 1 comment
#2278 - build(ascend): add Dockerfile for ascend aarch64 910B
Pull Request -
State: closed - Opened by CyCle1024 about 1 month ago
- 13 comments
Labels: enhancement
#2277 - [WIP] Support cogvlm model
Pull Request -
State: closed - Opened by pdx1989 about 1 month ago
#2275 - fix side-effect: failed to update tm model config with tm engine config
Pull Request -
State: closed - Opened by lvhan028 about 1 month ago
- 2 comments
Labels: Bug:P1
#2274 - [Feature] support qqq(w4a8) for lmdeploy
Pull Request -
State: open - Opened by HandH1998 about 1 month ago
- 14 comments
#2273 - [Bug] internvl2-4B inference is abnormal on V100; output text is always empty
Issue -
State: closed - Opened by qism about 1 month ago
- 7 comments
Labels: awaiting response, Stale
#2272 - [Bug] batch_infer appears to deadlock and hang
Issue -
State: open - Opened by hitzhu about 1 month ago
- 8 comments
#2271 - [Bug] batch_infer error when deploying internvl2-40B with lmdeploy pipe
Issue -
State: open - Opened by hitzhu about 1 month ago
- 4 comments
#2270 - [Bug]
Issue -
State: closed - Opened by hitzhu about 1 month ago
- 3 comments
#2269 - [Bug] Qwen/Qwen2-1.5B error: floating point exception
Issue -
State: open - Opened by lpf6 about 1 month ago
- 13 comments
Labels: awaiting response
#2268 - Qwen-VL inference returns empty text
Issue -
State: closed - Opened by Liufeiran123 about 1 month ago
- 4 comments
#2267 - Error when deploying minicpmv2.5
Issue -
State: closed - Opened by lyc728 about 1 month ago
- 5 comments
#2266 - OOM error when loading swift-finetuned glm-4v on a single A10
Issue -
State: closed - Opened by demoninpiano about 1 month ago
- 2 comments
#2265 - [Bug] Error running internvl2-40B inference with lmdeploy
Issue -
State: open - Opened by hitzhu about 1 month ago
- 9 comments
#2264 - [Bug] lmdeploy reports missing output.weight after loading an auto_awq-quantized llava model
Issue -
State: closed - Opened by Skyseaee about 1 month ago
- 4 comments
Labels: awaiting response
#2263 - [Bug] InternVL2 errors with more than 2 images
Issue -
State: closed - Opened by JixiangGao about 1 month ago
- 4 comments
#2262 - [Bug] Severe accuracy drop after AWQ w4a16 quantization of internvl2-2B; how to troubleshoot?
Issue -
State: closed - Opened by Howe-Young about 1 month ago
- 4 comments
Labels: awaiting response, Stale
#2260 - Deployed an internvl2.0 service with lmdeploy's OpenAI server for video inference, but the results are very unstable
Issue -
State: closed - Opened by sunnymoon155 about 1 month ago
- 14 comments
#2259 - internvl2-8b inference results are missing special tokens
Issue -
State: closed - Opened by lyc728 about 1 month ago
- 14 comments
Labels: awaiting response, Stale
#2258 - [Bug] Intel MKL FATAL ERROR /torch/lib/libtorch_cpu.so
Issue -
State: closed - Opened by maxin9966 about 1 month ago
#2257 - [Bug] Error running InternVL2-26B-AWQ with version 0.5.3
Issue -
State: closed - Opened by bltcn about 1 month ago
- 4 comments
#2256 - enable running VLMs with the pytorch engine in gradio
Pull Request -
State: closed - Opened by RunningLeon about 1 month ago
- 3 comments
Labels: Bug:P1
#2255 - [Bug] qwen2-72b: lmdeploy hung after 900 requests
Issue -
State: open - Opened by ChunyiY about 1 month ago
- 2 comments
#2254 - [Bug] After converting llava-1.5 to turbomind format, the same input triggers an error: run out of tokens
Issue -
State: closed - Opened by Skyseaee about 1 month ago
- 1 comment
#2253 - [Bug] pip on macOS reports no matching distribution found
Issue -
State: closed - Opened by fuckqqcom about 1 month ago
- 1 comment
#2252 - Split token_embs and lm_head weights
Pull Request -
State: closed - Opened by irexyc about 1 month ago
- 5 comments
Labels: improvement
#2251 - debug
Pull Request -
State: closed - Opened by RunningLeon about 1 month ago
#2250 - [Bug] Error using pipeline with internv2-series models
Issue -
State: open - Opened by wssywh about 1 month ago
- 10 comments
#2249 - [Bug] AssertionError: tp should be 2^n
Issue -
State: closed - Opened by colorfulandcjy0806 about 1 month ago
- 3 comments
#2247 - [Bug] ImportError: cannot import name 'VLAsyncEngine'
Issue -
State: closed - Opened by LSC527 about 1 month ago
- 2 comments
#2246 - drop support for baichuan2 7b awq in pytorch engine
Pull Request -
State: closed - Opened by grimoire about 1 month ago
Labels: documentation
#2245 - support custom VLM image processing parameters in the openai input format
Pull Request -
State: closed - Opened by irexyc about 1 month ago
- 1 comment
Labels: enhancement
#2244 - [Bug] internvl2 auto_awq quantization failed
Issue -
State: closed - Opened by janelu9 about 1 month ago
- 2 comments
#2243 - [Bug] lmdeploy lite auto_awq quantization error
Issue -
State: open - Opened by ZanePoe about 1 month ago
- 9 comments
#2242 - bump version to v0.5.3
Pull Request -
State: closed - Opened by lvhan028 about 1 month ago
#2241 - [Feature] Minicpm-2.6 support
Issue -
State: closed - Opened by AnyangAngus about 1 month ago
- 3 comments
#2240 - fix the issue of missing dependencies in the Dockerfile and pip
Pull Request -
State: closed - Opened by ColorfulDick about 1 month ago
- 4 comments
Labels: Bug:P1
#2239 - [Bug] Trouble running inference on the InternVL2-8B model with Lmdeploy
Issue -
State: closed - Opened by PancakeAwesome about 1 month ago
- 3 comments
Labels: awaiting response, Stale
#2238 - Question about lmdeploy support on Windows
Issue -
State: closed - Opened by humphreyde about 1 month ago
- 8 comments
#2237 - docs: add Japanese README
Pull Request -
State: closed - Opened by eltociear about 2 months ago
- 1 comment
Labels: documentation
#2236 - [Bug] RuntimeError: [TM][ERROR] CUDA runtime error: out of memory /lmdeploy/src/turbomind/utils/memory_utils.cu:32
Issue -
State: closed - Opened by dengruoqing about 2 months ago
- 9 comments
Labels: awaiting response
#2235 - Does the Lmdeploy turbomind backend support rope_scaling?
Issue -
State: closed - Opened by Lzhang-hub about 2 months ago
- 7 comments
Labels: awaiting response, Stale
#2233 - Fix typos in profile_generation.py
Pull Request -
State: closed - Opened by jiajie-yang about 2 months ago
- 1 comment
#2232 - [Bug] Returned logprobs are all 0.0, inconsistent with vllm; config: topp=1, topk=3, temperature=0, logprobs=1
Issue -
State: closed - Opened by Greatpanc about 2 months ago
- 4 comments
#2231 - Sudden hang after 200+ consecutive requests
Issue -
State: open - Opened by lai-serena about 2 months ago
- 11 comments
#2230 - terminate called after throwing an instance of 'std::runtime_error'
Issue -
State: closed - Opened by lyc728 about 2 months ago
- 21 comments
Labels: awaiting response
#2226 - Multi-GPU inference fails to generate results
Issue -
State: closed - Opened by deepslee about 2 months ago
- 12 comments
#2223 - [Bug] illegal memory access was encountered /opt/lmdeploy/src/turbomind/utils/allocator.h:233
Issue -
State: closed - Opened by pseudotensor about 2 months ago
- 5 comments
#2221 - [Bug] Abnormal inference after AWQ quantization of InternVL2-1B
Issue -
State: closed - Opened by Jeremy-J-J about 2 months ago
- 5 comments
Labels: awaiting response, Stale
#2219 - [Bug] RuntimeError: [TM][ERROR] Assertion fail: D:\a\lmdeploy\lmdeploy\src\turbomind\python\bind.cpp:294
Issue -
State: closed - Opened by NB-Group about 2 months ago
- 9 comments
#2218 - Add peer-access-enabled allocator
Pull Request -
State: closed - Opened by lzhangzz about 2 months ago
Labels: Bug:P1
#2217 - After 4-bit quantization of internlm2-chat-20b, the model files shrink from 38.7G to 11.5G, but deployed GPU memory usage only drops from 78.3G to 68.3G, a reduction of just 10G; is this normal?
Issue -
State: closed - Opened by sxk000 about 2 months ago
- 4 comments
Labels: awaiting response, Stale
#2216 - [Bug] flash_attention error when using InternVL2-4B
Issue -
State: open - Opened by wonderingtom about 2 months ago
- 3 comments
Labels: mllm
#2215 - Fix hidden size and support mistral nemo
Pull Request -
State: closed - Opened by AllentDan about 2 months ago
- 1 comment
Labels: enhancement
#2214 - Inference of the qwen2-72b-instruct-awq model on four 2080 Ti GPUs is very slow
Issue -
State: closed - Opened by bltcn about 2 months ago
- 4 comments
#2213 - Calling LMdeploy errors with [TM][ERROR] Null stream callback for (0)
Issue -
State: open - Opened by wxxcn about 2 months ago
- 1 comment
#2212 - fix runtime error when using dynamic scale rotary embed for InternLM2…
Pull Request -
State: closed - Opened by CyCle1024 about 2 months ago
Labels: Bug:P1
#2210 - [Bug] Error quantizing the glm4-9b-chat model
Issue -
State: closed - Opened by MdcGIt about 2 months ago
- 17 comments
#2205 - TypeError: MiniCPMV.forward() missing 1 required positional argument: 'data'
Issue -
State: closed - Opened by dengruoqing about 2 months ago
- 8 comments
Labels: awaiting response, Stale
#2204 - [Bug] Deploying InternVL2-26B on 4 GPUs uses excessive memory; CUDA runtime error: an illegal memory access was encountered
Issue -
State: closed - Opened by GRD-Chang about 2 months ago
- 18 comments
#2201 - Fix chunked prefill
Pull Request -
State: closed - Opened by lzhangzz about 2 months ago
Labels: enhancement
#2195 - [Bug] CUDA runtime error: an illegal memory access was encountered
Issue -
State: closed - Opened by thiner about 2 months ago
- 16 comments
#2192 - test prtest image update
Pull Request -
State: closed - Opened by zhulinJulia24 about 2 months ago
#2189 - lmdeploy's support for accelerated multimodal inference is fascinating 😊; ms-swift has integrated lmdeploy as its infer and deploy backend
Issue -
State: closed - Opened by Jintao-Huang about 2 months ago
- 3 comments
#2185 - [Feature] Can lmdeploy quant InternVL2 to AWQ?
Issue -
State: closed - Opened by janelu9 about 2 months ago
- 3 comments
#2184 - [Feature] Fine tuning of quantized 4bit internlm/internlm-xcomposer2-4khd-7b?
Issue -
State: closed - Opened by zhuraromdev about 2 months ago
- 3 comments
#2183 - [ci] benchmark react
Pull Request -
State: closed - Opened by zhulinJulia24 about 2 months ago
- 1 comment
#2181 - Question regarding the prefix caching
Issue -
State: closed - Opened by laserwave about 2 months ago
- 3 comments
Labels: awaiting response, Stale
#2180 - [Bug] unmatched weights key in xcomposer2d5
Issue -
State: closed - Opened by Stan-lei about 2 months ago
- 3 comments
Labels: awaiting response, Stale
#2179 - [Bug] pipeline.stream_infer supports neither OpenAI-format history (text+image) nor sess usage
Issue -
State: open - Opened by owl-10 about 2 months ago
- 2 comments
#2177 - [Docs] Source of internlm2.5's function call template
Issue -
State: closed - Opened by Lam1360 about 2 months ago
- 1 comment
#2174 - Questions about adding custom data when AWQ-quantizing a VLM model
Issue -
State: closed - Opened by lzcchl about 2 months ago
- 5 comments
#2170 - [Bug] Model service crashes outright when the model API is called
Issue -
State: closed - Opened by WCwalker about 2 months ago
- 8 comments
#2169 - 4-bit model quantization errors with ConnectionError: Couldn't reach 'ptb_text_only' on the Hub (ConnectionError)
Issue -
State: closed - Opened by sxk000 about 2 months ago
- 7 comments
#2166 - [Bug] Llama3.1 AWQ at TP>1 giving different responses
Issue -
State: closed - Opened by Tushar-ml about 2 months ago
- 11 comments
#2165 - wrong expression
Pull Request -
State: closed - Opened by ArtificialZeng about 2 months ago
Labels: documentation
#2164 - [Bug Regression] segfault in turbomind for OpenGVLab/InternVL2-Llama3-76B and OpenGVLab/InternVL-Chat-V1-5
Issue -
State: closed - Opened by pseudotensor about 2 months ago
- 12 comments
#2162 - [Feature] Loading InternVL2-40B-AWQ within 24GB of VRAM
Issue -
State: closed - Opened by josephrocca about 2 months ago
- 4 comments
#2160 - Question about where multiple images are inserted
Issue -
State: closed - Opened by liangofthechen about 2 months ago
- 1 comment
Labels: mllm
#2152 - [Feature] Some suggestions for api_server
Issue -
State: closed - Opened by ly19970621 about 2 months ago
- 6 comments
Labels: awaiting response, Stale
#2150 - [Feature] Why can't the model folder name be arbitrary?
Issue -
State: closed - Opened by janelu9 about 2 months ago
- 8 comments
#2148 - [Bug] Mini-InternVL-Chat-2B-V1-5 inference is slower after AWQ quantization than before
Issue -
State: open - Opened by hezeli123 about 2 months ago
- 5 comments
#2147 - Support send tool_calls back to internlm2
Pull Request -
State: closed - Opened by AllentDan about 2 months ago
- 4 comments
Labels: improvement
#2146 - [Docs] How to install lmdeploy 0.5.1 on Jetson
Issue -
State: closed - Opened by quanfeifan about 2 months ago
- 4 comments
#2144 - [Bug] Tensor parallel hangs with LlaVA 34B
Issue -
State: closed - Opened by apresunreve about 2 months ago
- 3 comments
Labels: awaiting response, Stale, mllm
#2140 - [Bug] gemma2-27b deployed with lmdeploy shows much weaker reasoning ability
Issue -
State: closed - Opened by zhouyuustc about 2 months ago
- 8 comments
Labels: awaiting response
#2138 - [Bug] After starting the service, testing internlm2.5's function call returns None and no tool is invoked
Issue -
State: closed - Opened by HughesZhang2021 about 2 months ago
- 4 comments
Labels: awaiting response, Stale
#2137 - [Bug] Inference on 4x Tesla V100 errors with CUDA error: an illegal memory access was encountered
Issue -
State: closed - Opened by stiyet about 2 months ago
- 17 comments
Labels: awaiting response, Stale
#2134 - Fix duplicated session_id when pipeline is used by multithreads
Pull Request -
State: closed - Opened by irexyc about 2 months ago
- 9 comments
Labels: BC-breaking, improvement
#2133 - Remove duplicate code
Pull Request -
State: closed - Opened by cmpute about 2 months ago
- 4 comments
Labels: improvement