InternLM/lmdeploy issues and pull requests

#2729 - [Feature] Allow downloading the quantization calibration corpus from a mirror site

Issue - State: open - Opened by HelloCard 1 day ago

#2728 - [Bug] How to improve the first frame response speed TTFT

Issue - State: open - Opened by zhouyuustc 2 days ago - 1 comment
Labels: awaiting response

#2727 - Support Mono-InternVL with PyTorch backend

Pull Request - State: open - Opened by wzk1015 3 days ago - 4 comments
Labels: enhancement

#2726 - [Bug] Version 0.6.2 cannot run inference on the W8A8-quantized Orca-2-13b model under WSL2

Issue - State: closed - Opened by HelloCard 3 days ago - 2 comments

#2725 - Support mixtral moe AWQ quantization.

Pull Request - State: open - Opened by AllentDan 3 days ago - 1 comment
Labels: enhancement

#2724 - [Bug] Deployment of Llama3.1-70b getting stuck

Issue - State: open - Opened by pulkitmehtaworkmetacube 3 days ago - 5 comments

#2723 - Support Qwen2-MoE models

Pull Request - State: open - Opened by lzhangzz 3 days ago - 1 comment
Labels: enhancement

#2722 - lmdeploy - ERROR - init.py:17 - ModuleNotFoundError: No module named 'dlinfer'

Issue - State: open - Opened by jiabao-wang 3 days ago - 1 comment
Labels: awaiting response

#2721 - [ci] add more testcase into evaluation and daily test

Pull Request - State: open - Opened by zhulinJulia24 4 days ago

#2720 - support qwen2-vl with turbomind backend

Pull Request - State: open - Opened by irexyc 4 days ago - 1 comment
Labels: enhancement

#2719 - Check server input

Pull Request - State: open - Opened by irexyc 4 days ago
Labels: improvement

#2718 - fix tp exit code for pytorch engine

Pull Request - State: closed - Opened by RunningLeon 4 days ago
Labels: Bug:P1

#2717 - bump version to 0.6.2.post1

Pull Request - State: closed - Opened by lvhan028 5 days ago

#2716 - Support molmo in turbomind

Pull Request - State: open - Opened by lvhan028 5 days ago - 1 comment
Labels: enhancement

#2715 - support turbomind head_dim 64

Pull Request - State: closed - Opened by irexyc 5 days ago - 2 comments
Labels: enhancement

#2714 - [Bug] Do AWQ models not support LoRA?

Issue - State: closed - Opened by LIUKAI0815 5 days ago - 4 comments

#2713 - Fix turbomind TP for v0.6.2

Pull Request - State: closed - Opened by lzhangzz 5 days ago

#2712 - [Bug] Program hangs after GPU memory overflow instead of raising an error

Issue - State: open - Opened by Weiyun1025 5 days ago - 1 comment

#2711 - [Bug] GenerationConfig.max_new_tokens seem to be not working

Issue - State: closed - Opened by YYue000 5 days ago - 1 comment

#2710 - [Feature]: support LlavaForConditionalGeneration with turbomind inference

Pull Request - State: closed - Opened by deepindeed2022 5 days ago - 3 comments
Labels: enhancement

#2709 - [Bug] Under concurrency, requests with a large number of input tokens cause problems in streaming responses

Issue - State: open - Opened by zhouyuustc 5 days ago - 1 comment
Labels: awaiting response

#2708 - Remove one of the duplicate bos tokens

Pull Request - State: open - Opened by AllentDan 6 days ago

#2707 - Add ensure_ascii = False for json.dumps

Pull Request - State: closed - Opened by AllentDan 6 days ago
Labels: improvement

#2706 - Fix turbomind TP

Pull Request - State: closed - Opened by lzhangzz 6 days ago
Labels: Bug:P1

#2705 - [Bug] InternVL2-1B performance of lmdeploy is much worse compared to the original Hugging Face PyTorch model.

Issue - State: open - Opened by henry16lin 6 days ago - 4 comments

#2703 - Fix llama3.2 VL vision in "Supported Modals" documents

Pull Request - State: closed - Opened by blankanswer 6 days ago

#2702 - [Bug] Compiling test_utils.cu from source with CUDA 12.5 fails

Issue - State: open - Opened by DefTruth 6 days ago - 1 comment

#2701 - Run loop.run_until_complete in another thread

Pull Request - State: open - Opened by AllentDan 6 days ago

#2699 - [Bug] Does it support Internvl2-26B quantization by awq on nvidia V100

Issue - State: closed - Opened by diandianliu 6 days ago - 1 comment

#2698 - miss to read moe_ffn weights from converted tm model

Pull Request - State: closed - Opened by lvhan028 9 days ago
Labels: Bug:P1

#2697 - fix index error when computing ppl on long-text prompt

Pull Request - State: closed - Opened by lvhan028 9 days ago - 1 comment
Labels: Bug:P1

#2696 - fix ascend get_started.md link

Pull Request - State: closed - Opened by CyCle1024 9 days ago

#2695 - [Feature] Mono-Internvl

Issue - State: closed - Opened by sxlyiyiyi 10 days ago - 1 comment

#2694 - [Feature] unsupported quant config

Issue - State: open - Opened by maxin9966 10 days ago - 4 comments
Labels: awaiting response

#2693 - [Bug] cannot get gpqa's score on Qwen2.5-7b model by using lmdeploy backend and opencompass

Issue - State: closed - Opened by zhulinJulia24 10 days ago - 1 comment
Labels: awaiting response

#2692 - [Bug] cannot get winogrande dataset's score on Qwen2.5-7b model by using lmdeploy backend and opencompass

Issue - State: open - Opened by zhulinJulia24 10 days ago - 1 comment
Labels: awaiting response

#2691 - Deployment via api_server occasionally hangs

Issue - State: open - Opened by LiYtao 10 days ago - 1 comment
Labels: awaiting response

#2690 - Support ep, column major moe kernel.

Pull Request - State: open - Opened by grimoire 10 days ago - 1 comment
Labels: improvement

#2689 - [Bug] chat with converted mixtral-8x7b model, raise RuntimeError

Issue - State: closed - Opened by zhulinJulia24 10 days ago - 1 comment
Labels: awaiting response

#2688 - fix decoding kernel for deepseekv2

Pull Request - State: closed - Opened by grimoire 10 days ago - 2 comments
Labels: Bug:P1

#2687 - [Bug] Container auto-start problem with Ascend910 + Internvl2

Issue - State: closed - Opened by linuxmi 10 days ago

#2686 - [Docs] LoRA inference service

Issue - State: open - Opened by LIUKAI0815 10 days ago - 1 comment

#2684 - [Feature] Whether pytorch backend is supported on Windows?

Issue - State: closed - Opened by eeyrw 11 days ago - 2 comments

#2683 - update pre-commit config

Pull Request - State: open - Opened by lvhan028 11 days ago

#2682 - [Bug] min_p from request is not used

Issue - State: open - Opened by ErykCh 12 days ago - 1 comment
Labels: awaiting response

#2681 - Support min_tokens, min_p parameters for api_server

Pull Request - State: closed - Opened by AllentDan 12 days ago - 6 comments
Labels: Bug:P1

#2680 - mllama3.2-V-11b support text-only mode?

Issue - State: open - Opened by AnyangAngus 12 days ago - 9 comments
Labels: awaiting response

#2679 - [Bug] pytorch backend loses 1.0-2.5 points of precision between main code and v0.6.1 on some models.

Issue - State: open - Opened by zhulinJulia24 12 days ago

#2678 - [Bug] request parameter `min_new_tokens` is not used

Issue - State: closed - Opened by Huarong 12 days ago - 3 comments

#2677 - Better tp exit log.

Pull Request - State: closed - Opened by grimoire 12 days ago - 1 comment
Labels: Bug:P2

#2676 - Flatten cache and add flashattention

Pull Request - State: closed - Opened by grimoire 12 days ago
Labels: improvement

#2675 - [Feature] Support QwenVL on Ascend

Issue - State: open - Opened by Yang1032 12 days ago

#2674 - [Feature] support multi-lora in turbomind backend

Issue - State: open - Opened by zzf2grx 13 days ago

#2673 - [Feature] Response Metrics

Issue - State: open - Opened by nathan-az 13 days ago - 2 comments

#2672 - remove dlinfer version

Pull Request - State: closed - Opened by CyCle1024 13 days ago
Labels: improvement

#2671 - Call cuda empty_cache to prevent OOM when quantizing model

Pull Request - State: closed - Opened by AllentDan 13 days ago
Labels: improvement

#2670 - feat: support dynamic/llama3 rotary embedding in ascend graph mode

Pull Request - State: closed - Opened by tangzhiyi11 13 days ago
Labels: improvement

#2669 - fix supported model list in ascend graph mode

Pull Request - State: closed - Opened by jinminxi104 13 days ago
Labels: improvement

#2668 - fix inference mode error for qwen2-vl

Pull Request - State: closed - Opened by irexyc 13 days ago
Labels: Bug:P1

#2667 - fix build error in ascend dockerfile

Pull Request - State: closed - Opened by CyCle1024 13 days ago - 1 comment
Labels: Bug:P1

#2666 - Set history_cross_kv_seqlens to 0 by default

Pull Request - State: closed - Opened by AllentDan 13 days ago
Labels: Bug:P1

#2665 - [ci] support v100 dailytest

Pull Request - State: closed - Opened by zhulinJulia24 16 days ago

#2664 - fix syntax in Dockerfile_aarch64_ascend

Pull Request - State: closed - Opened by CyCle1024 16 days ago
Labels: Bug:P1

#2663 - miss device_type when checking is_bf16_supported on ascend platform

Pull Request - State: closed - Opened by lvhan028 16 days ago
Labels: Bug:P1

#2662 - Update ascend get_started tutorial about installing nnal

Pull Request - State: closed - Opened by jinminxi104 16 days ago
Labels: documentation

#2661 - update ascend dockerfile

Pull Request - State: closed - Opened by CyCle1024 16 days ago
Labels: improvement

#2660 - [Bug] Core dumped! Requests fail when deploying the Internvl2-2B model on a single P100 GPU with lmdeploy==0.6.1

Issue - State: closed - Opened by xuexidi 16 days ago - 2 comments

#2659 - Bump version to v0.6.2

Pull Request - State: closed - Opened by lvhan028 16 days ago - 1 comment

#2658 - [Bug] no user input makes api server throw exception with MLLM

Issue - State: open - Opened by gaord 16 days ago - 1 comment

#2657 - bugfix: llava-hf/llava-interleave-qwen-7b-hf (#2497)

Pull Request - State: closed - Opened by deepindeed2022 16 days ago - 2 comments
Labels: Bug:P1

#2656 - adding the package install prerequisites section to installation doc

Pull Request - State: closed - Opened by jianliao 16 days ago - 3 comments

#2655 - Update get_started tutorial about deploying on ascend platform

Pull Request - State: closed - Opened by jinminxi104 17 days ago - 1 comment
Labels: documentation

#2654 - Add warning message about `do_sample` to alert BC

Pull Request - State: closed - Opened by lvhan028 17 days ago
Labels: improvement

#2653 - Check whether device support bfloat16

Pull Request - State: closed - Opened by lvhan028 17 days ago
Labels: improvement

#2652 - [Feature] how to get the token score (logprob) of greedy decoder?

Issue - State: closed - Opened by Wondersui 17 days ago - 1 comment

#2651 - [Bug] use new 4-bit quantized models of internlm2, decoded word starts with a blank.

Issue - State: open - Opened by zhulinJulia24 17 days ago

#2650 - [Bug] AWQ-quantized InternVL2 26B outputs meaningless text

Issue - State: closed - Opened by diandianliu 17 days ago - 12 comments

#2649 - match torch and torch_vision version

Pull Request - State: closed - Opened by grimoire 17 days ago

#2648 - [ascend] make compatibility for Ascend310P

Pull Request - State: closed - Opened by yao-fengchen 17 days ago

#2647 - [ascend] add ascend graph mode

Pull Request - State: closed - Opened by CyCle1024 17 days ago
Labels: enhancement

#2646 - Fix error in python3.8.

Pull Request - State: closed - Opened by Reinerzhou 17 days ago
Labels: Bug:P1

#2645 - add --eager-mode to cli

Pull Request - State: closed - Opened by RunningLeon 17 days ago
Labels: enhancement

#2644 - Align UT with triton fill_kv_cache_quant kernel

Pull Request - State: closed - Opened by AllentDan 17 days ago
Labels: Bug:P1

#2643 - [Bug] Quantizing minicpm-v-2.6 raises an error

Issue - State: closed - Opened by sph116 17 days ago - 6 comments
Labels: awaiting response

#2642 - [Bug] Use triton to deploy minicpm-v-2_6 GPU memory keeps increasing until it overflows

Issue - State: open - Opened by LinJianping 17 days ago - 1 comment

#2641 - update check for triton

Pull Request - State: closed - Opened by grimoire 17 days ago
Labels: improvement

#2640 - Ccy/add ascend graph mode

Pull Request - State: closed - Opened by jinminxi104 17 days ago

#2639 - [Question] About the meaning of `permute_v2` function at weight loading

Issue - State: closed - Opened by vicety 18 days ago - 2 comments

#2638 - [Feature] Metrics Endpoint

Issue - State: open - Opened by eldhosemjoy 18 days ago - 1 comment

#2637 - [Bug] cpu 100%

Issue - State: open - Opened by whk6688 18 days ago - 3 comments

#2636 - [maca] add maca backend support.

Pull Request - State: closed - Opened by Reinerzhou 18 days ago - 3 comments
Labels: enhancement

#2635 - [ci] fix restful script

Pull Request - State: closed - Opened by zhulinJulia24 18 days ago

#2634 - [ci] add v100 testworkflow

Pull Request - State: closed - Opened by zhulinJulia24 18 days ago

#2633 - [Bug] Running Phi-3-vision-128k-instruct on 8 GPUs fails with “Expected all tensors to be on the same device, but found at least two devices”

Issue - State: open - Opened by dreamerlin 19 days ago - 4 comments
Labels: mllm

#2632 - refine pre-post-process

Pull Request - State: closed - Opened by jinminxi104 19 days ago
Labels: improvement

#2631 - [ci] add internlm2_5_7b_batch_1 into evaluation testcase

Pull Request - State: closed - Opened by zhulinJulia24 19 days ago

#2630 - [Feature] Does lmdeploy AWQ quantization support running on multiple GPUs?

Issue - State: closed - Opened by maxin9966 19 days ago - 1 comment

#2629 - [Bug] qwen2-vl-7b docker deploy bugs

Issue - State: closed - Opened by jnzbfgjd 19 days ago - 3 comments
Labels: awaiting response, Stale

#2628 - [Feature] Combine Batched Inference and Chat Conversation in VLMs Deployment

Issue - State: open - Opened by Yusepp 20 days ago

#2627 - add linear op on dlinfer platform

Pull Request - State: closed - Opened by yao-fengchen 20 days ago
Labels: enhancement
