Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
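The listing below can be retrieved programmatically from the service. As a minimal sketch, the snippet builds the issues-listing URL for a repository; the exact endpoint shape (`hosts/{host}/repositories/{repo}/issues`) and base URL are assumptions based on the service's naming conventions, so check the official API documentation before relying on them.

```python
# Minimal sketch of querying the Ecosyste.ms Issues API for a repository's
# issue metadata. Endpoint shape and base URL are ASSUMPTIONS, not confirmed
# against the official API docs.
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://issues.ecosyste.ms/api/v1"  # assumed base URL

def issues_url(host: str, repo: str, page: int = 1) -> str:
    """Build the (assumed) issues listing URL for a host/repository pair."""
    # The repository name contains a slash ("InternLM/lmdeploy"), so it is
    # percent-encoded into a single path segment.
    return (f"{BASE}/hosts/{quote(host)}/repositories/"
            f"{quote(repo, safe='')}/issues?page={page}")

def fetch_issues(host: str, repo: str, page: int = 1) -> list:
    """Fetch one page of issue metadata as parsed JSON (requires network)."""
    with urlopen(issues_url(host, repo, page)) as resp:
        return json.load(resp)

# Building the URL needs no network:
# issues_url("GitHub", "InternLM/lmdeploy")
```

Each entry below (number, title, state, author, comment count, labels) corresponds to fields such a metadata API would typically expose.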
GitHub / InternLM/lmdeploy issues and pull requests
#2130 - [Feature] Add 'tools' and 'tool_choice' to InternVL2-40B models.
Issue -
State: open - Opened by coelho-k about 2 months ago
#2129 - [Bug] `slice_image` attribute error with MiniCPM-Llama3-V-2_5
Issue -
State: open - Opened by thisiskofi about 2 months ago
- 12 comments
Labels: awaiting response, Stale, mllm
#2125 - [Bug] topk (43930832) is larger
Issue -
State: closed - Opened by WCwalker about 2 months ago
- 3 comments
#2120 - [Bug] Gradio serve does not correctly recognize the chat template
Issue -
State: closed - Opened by cmpute about 2 months ago
- 4 comments
#2117 - [Bug] Llama 3.1 Support
Issue -
State: open - Opened by vladrad about 2 months ago
- 18 comments
Labels: awaiting response
#2110 - [Bug] How to maximize the throughput/speed of pipeline batch inference
Issue -
State: closed - Opened by hitzhu about 2 months ago
- 3 comments
Labels: awaiting response, Stale
#2104 - Refactor pytorch engine
Pull Request -
State: closed - Opened by grimoire about 2 months ago
- 21 comments
Labels: enhancement
#2103 - Renew a session for reset button
Pull Request -
State: closed - Opened by AllentDan about 2 months ago
#2101 - [Bug] event loop error when serving
Issue -
State: open - Opened by cmpute about 2 months ago
- 27 comments
#2095 - [Bug] glm-4v-9b is extremely slow
Issue -
State: closed - Opened by bltcn 2 months ago
- 2 comments
#2090 - New GEMM kernels for weight-only quantization
Pull Request -
State: closed - Opened by lzhangzz 2 months ago
- 14 comments
Labels: enhancement
#2084 - Add user guide about slora serving
Pull Request -
State: closed - Opened by AllentDan 2 months ago
- 1 comment
Labels: documentation
#2061 - [Bug] internlm2_5-7b-chat,TypeError: TextEncodeInput must be Union[TextInputSequence, Tuple[InputSequence, InputSequence]]
Issue -
State: closed - Opened by rongruosong 2 months ago
- 9 comments
#2060 - [Bug] Deploying GLM4v with the official v0.5.1 image raises an error
Issue -
State: closed - Opened by ZhiyuYUE 2 months ago
- 8 comments
Labels: awaiting response, Stale
#2057 - Does Lmdeploy support deploying a base model with multiple LoRA adapters?
Issue -
State: closed - Opened by will-wiki 2 months ago
- 11 comments
Labels: awaiting response, Stale
#2055 - [Bug] Deploying cogvlm2 with the latest lmdeploy 0.5.1 on V100 and RTX 2080 Ti fails during inference: ERROR - Engine loop failed with error: map::at
Issue -
State: open - Opened by kklots 2 months ago
- 10 comments
Labels: v100
#2052 - Florence 2 support :)
Issue -
State: open - Opened by SinanAkkoyun 2 months ago
- 2 comments
#2050 - Error when deploying internvl2-8b on multiple GPUs
Issue -
State: closed - Opened by haoduoyu1203 2 months ago
- 5 comments
#2046 - Support custom attention backend
Pull Request -
State: closed - Opened by grimoire 2 months ago
#2043 - [Bug] gradio reset button stuck after I cancel a response.
Issue -
State: closed - Opened by zhulinJulia24 2 months ago
- 2 comments
#2038 - Reorganize the user guide and update the get_started section
Pull Request -
State: closed - Opened by lvhan028 2 months ago
Labels: documentation
#2036 - [Bug] lmdeploy has other questions about server for lora_merge_model
Issue -
State: closed - Opened by Volta-lemon 2 months ago
- 20 comments
#2018 - Add prefix cache stats to usage
Pull Request -
State: open - Opened by ispobock 2 months ago
- 10 comments
#2001 - [Bug] Service hangs / becomes unresponsive after running for a while
Issue -
State: open - Opened by hezeli123 2 months ago
- 13 comments
#1990 - Could not use my local internVL mini model for inference
Issue -
State: open - Opened by shiva-vardhineedi 2 months ago
- 4 comments
#1989 - [Feature] Do we support inference of GPTQ-quantized models?
Issue -
State: closed - Opened by eigen2017 2 months ago
- 15 comments
#1984 - Phi3 awq
Pull Request -
State: closed - Opened by grimoire 2 months ago
- 4 comments
Labels: enhancement
#1981 - [Bug] MiniCPMV inference is broken
Issue -
State: closed - Opened by LDLINGLINGLING 2 months ago
- 17 comments
Labels: awaiting response, Stale
#1970 - [Feature] Does turbomind plan to support cogvlm2?
Issue -
State: closed - Opened by jidechao 2 months ago
- 1 comment
Labels: backlog
#1966 - support min_p sampling & do_sample setting
Pull Request -
State: closed - Opened by irexyc 2 months ago
- 4 comments
Labels: WIP
#1962 - torch engine optimize prefill for long context
Pull Request -
State: closed - Opened by grimoire 2 months ago
- 8 comments
Labels: improvement
#1931 - Remove deprecated arguments from API and clarify model_name and chat_template_name
Pull Request -
State: closed - Opened by lvhan028 3 months ago
- 2 comments
Labels: BC-breaking, improvement
#1906 - minicpm-v with W4A16 quantization shows little change in inference speed
Issue -
State: closed - Opened by DankoZhang 3 months ago
- 14 comments
#1862 - [Bug] How single-turn interleaved image-text dialogue is implemented
Issue -
State: open - Opened by stay-leave 3 months ago
- 1 comment
#1856 - Support guided decoding for pytorch backend
Pull Request -
State: closed - Opened by AllentDan 3 months ago
- 8 comments
Labels: enhancement
#1846 - How to set the model data type to f16
Issue -
State: open - Opened by Yang-bug-star 3 months ago
- 6 comments
#1844 - Maybe a workaround for qwen2 quantization Nan error
Pull Request -
State: closed - Opened by AllentDan 3 months ago
- 5 comments
#1836 - [Bug] Error with a fine-tuned qwen2 model after AWQ quantization
Issue -
State: open - Opened by qiuxuezhe123 3 months ago
- 12 comments
#1833 - [Feature] How to support a do_sample config like AutoModel's, enabling deterministic generation instead of random sampling
Issue -
State: open - Opened by Leo-yang-1020 3 months ago
- 10 comments
#1831 - [Bug] smoothquant fails to quantize the Baichuan2-7B-Chat model
Issue -
State: closed - Opened by CodexDive 3 months ago
- 11 comments
#1826 - [Bug] awq for Qwen2-72B-instruct
Issue -
State: open - Opened by Vincent131499 3 months ago
- 25 comments
#1815 - [Bug] lmdeploy deployment of internlm2-chat-20b does not stop at <|im_end|>
Issue -
State: open - Opened by jeinlee1991 3 months ago
- 11 comments
#1745 - [Feature] `min_p` sampling parameter
Issue -
State: closed - Opened by josephrocca 3 months ago
- 4 comments
#1738 - [Feature] Speculative Decoding
Issue -
State: open - Opened by josephrocca 3 months ago
- 15 comments
#1711 - [Feature] Quantized inference on V100
Issue -
State: closed - Opened by QwertyJack 4 months ago
- 16 comments
Labels: v100
#1615 - Check base64 image validation
Pull Request -
State: closed - Opened by AllentDan 4 months ago
- 2 comments
Labels: Bug:P2
#1587 - [Feature] Support W4A8KV4 Quantization(QServe/QoQ)
Issue -
State: closed - Opened by wanzhenchn 4 months ago
- 3 comments
#1565 - add more model into benchmark and evaluate workflow
Pull Request -
State: closed - Opened by zhulinJulia24 4 months ago
- 1 comment
#1548 - [Bug] set logprobs = true and top_logprobs = 5 in restful server. The number of top logrobs is 4 which is unexpected.
Issue -
State: closed - Opened by zhulinJulia24 5 months ago
- 5 comments
#1526 - [Bug] ImportError: DLL load failed while importing _turbomind: 找不到指定的模块。
Issue -
State: closed - Opened by StarCycle 5 months ago
- 32 comments
#1332 - Add docs of support new vl model
Pull Request -
State: closed - Opened by irexyc 6 months ago
Labels: documentation
#1315 - [Feature] Suggest training a GPTQ-4bit quantized internlm2-chat-7b model and supporting lmdeploy deployment
Issue -
State: closed - Opened by wwewwt 6 months ago
- 4 comments
Labels: backlog
#1177 - [Bug] internlm2-chat-20b-4bits on a 3090 hangs on prompts
Issue -
State: closed - Opened by makefree3 7 months ago
- 9 comments
Labels: awaiting response, Stale
#1035 - Compatible with Gradio 4.x
Pull Request -
State: closed - Opened by irexyc 8 months ago
- 1 comment
Labels: WIP, improvement
#1033 - [Bug] KV Cache INT8 calibration warning: Token indices sequence length is longer than the specified maximum sequence length for this model (2874305 > 4096)
Issue -
State: closed - Opened by deepslee 8 months ago
- 5 comments
#759 - [Bug] tried to compile and run on aarch64 with iGPU
Issue -
State: closed - Opened by cj401 10 months ago
- 6 comments
#671 - [Feature] Request for Support for Ascend Series Graphics Cards
Issue -
State: closed - Opened by junior-zsy 11 months ago
- 15 comments
#301 - [Bug] Debug version run failed
Issue -
State: open - Opened by sleepwalker2017 about 1 year ago
- 11 comments
#100 - test issue bot
Issue -
State: closed - Opened by tpoisonooo about 1 year ago
#99 - Comparison with vllm
Issue -
State: closed - Opened by lucasjinreal about 1 year ago
- 15 comments
#98 - Can we have support for GGML as triton backend
Issue -
State: closed - Opened by tikikun about 1 year ago
- 12 comments
#97 - add docstring for turbomind
Pull Request -
State: closed - Opened by lvhan028 about 1 year ago
#96 - Error on startup
Issue -
State: closed - Opened by CocaColaKing about 1 year ago
- 1 comment
#95 - I am confused about KV Cache Manager, How does it work?
Issue -
State: closed - Opened by randomseed713 about 1 year ago
- 3 comments
#94 - set chunk_size=1 and export 'tp' to config.ini
Pull Request -
State: closed - Opened by lvhan028 about 1 year ago
#93 - [Improve] Add docstrings to pytorch submodule
Pull Request -
State: closed - Opened by wangruohui about 1 year ago
#92 - docs(serving.md): typo
Pull Request -
State: closed - Opened by tpoisonooo about 1 year ago
#91 - Support tritonserver
Issue -
State: closed - Opened by coderchem about 1 year ago
- 6 comments
#90 - Will beam search be supported in the future?
Issue -
State: closed - Opened by B-201 about 1 year ago
- 2 comments
#89 - an error about llama-65b
Issue -
State: closed - Opened by leizhao1234 about 1 year ago
- 5 comments
#88 - Does this deployment support multiple GPUs?
Issue -
State: closed - Opened by alexw994 about 1 year ago
- 2 comments
#87 - Does this toolkit support Chatglm models? It has a different structure from llama.
Issue -
State: closed - Opened by Kevinddddddd about 1 year ago
- 2 comments
#86 - update contribution.md
Pull Request -
State: closed - Opened by grimoire about 1 year ago
#85 - I would like to know the difference between this project and Nvidia FastTransformer.
Issue -
State: closed - Opened by happened about 1 year ago
- 4 comments
#84 - feat(quantization): kv cache use asymmetric
Pull Request -
State: closed - Opened by tpoisonooo about 1 year ago
- 1 comment
Labels: enhancement
#83 - feat(deploy.py): support w pack qkv
Pull Request -
State: closed - Opened by tpoisonooo about 1 year ago
- 1 comment
#82 - Tensor Parallel python api
Pull Request -
State: closed - Opened by grimoire about 1 year ago
- 3 comments