GitHub / NVIDIA/TensorRT-LLM issues and pull requests
#4111 - Draft: chore: Make GEMM config enums human readable for better logging
Pull Request -
State: open - Opened by djns99 3 months ago
#4110 - fix: instruct torch to use nvtx3
Pull Request -
State: open - Opened by tongyuantongyu 3 months ago
- 6 comments
#4109 - [Infra] - Update code ownership rules
Pull Request -
State: closed - Opened by chzblych 3 months ago
- 3 comments
#4108 - doc: update release notes
Pull Request -
State: closed - Opened by kaiyux 3 months ago
- 3 comments
#4107 - [TRTLLM-5057][fix] Adding option to specify a set of token ids for multimodal tokens
Pull Request -
State: closed - Opened by rakib-hasan 3 months ago
- 9 comments
#4106 - Fix Pipeline Parallelism in Llama4
Pull Request -
State: open - Opened by v-shobhit 3 months ago
- 4 comments
#4105 - Add initial list of CODEOWNERS
Pull Request -
State: closed - Opened by kevinch-nv 3 months ago
- 5 comments
#4104 - [nvbug/5262268][fix] Fix trtllm-bench for llama 4
Pull Request -
State: open - Opened by mikeiovine 3 months ago
- 12 comments
#4102 - refactor: Copy sequence lengths once in decoder setup
Pull Request -
State: open - Opened by Funatiq 3 months ago
- 18 comments
#4101 - [https://nvbugspro.nvidia.com/bug/5238626] illegal memory address when running llama 4 with cuda graph enabled
Pull Request -
State: open - Opened by PerkzZheng 3 months ago
- 33 comments
#4100 - doc: TRTLLM-4797 Update perf-analysis.md
Pull Request -
State: closed - Opened by kaiyux 3 months ago
- 5 comments
#4099 - Install from docs not working
Issue -
State: closed - Opened by darraghdog 3 months ago
- 7 comments
Labels: bug, triaged, Installation
#4097 - enh: Update docker Makefile to use only the visible GPUs of machine
Pull Request -
State: closed - Opened by venkywonka 3 months ago
- 3 comments
Labels: Ease of Use
#4096 - refactor: Unify request order in TRT and PyTorch workflow
Pull Request -
State: open - Opened by Funatiq 3 months ago
- 18 comments
#4095 - [https://nvbugspro.nvidia.com/bug/5260676]test: skip fp8 quantization case for pre-ada
Pull Request -
State: open - Opened by crazydemo 3 months ago
- 3 comments
#4094 - fix: llmapi-launch add trtllm-bench test with engine building
Pull Request -
State: closed - Opened by Superjomn 3 months ago
- 4 comments
#4093 - [Qwen3] chore: fix bug of fused_moe on tp > 1
Pull Request -
State: closed - Opened by byshiue 3 months ago
- 3 comments
#4092 - [TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow
Pull Request -
State: open - Opened by Funatiq 3 months ago
- 13 comments
#4091 - fix: llmapi-launch add trtllm-bench test with engine building
Pull Request -
State: open - Opened by Superjomn 3 months ago
- 15 comments
#4090 - [TRTLLM-5081] [test] Align parametrize_with_ids to the pytest behavior
Pull Request -
State: open - Opened by syuoni 3 months ago
- 12 comments
#4089 - [#4085][fix] Fix `apply_per_channel_scale` for extremely large input sequence length.
Pull Request -
State: open - Opened by StudyingShao 3 months ago
- 20 comments
Labels: bug
#4087 - chore: misc static analysis fixes and generating xqa source header at config time
Pull Request -
State: open - Opened by hypdeb 3 months ago
- 4 comments
#4086 - Cherry-pick trtllm-gen from feat/llama4 to main
Pull Request -
State: open - Opened by chenfeiz0326 3 months ago
- 26 comments
#4084 - [fix] Fix add_dummy_requests for spec decoding cases
Pull Request -
State: closed - Opened by lfr-0531 3 months ago
- 44 comments
#4083 - test: add qwen3 and disaggregated serving accuracy tests to qa test list
Pull Request -
State: open - Opened by StanleySun639 3 months ago
- 21 comments
#4082 - Integrate trtllm-gen kernel for the QKV gemm in llama4
Pull Request -
State: open - Opened by eopXD 3 months ago
- 1 comment
#4081 - chore: Clean up the legacy DeepseekAllreudceFusionOp.
Pull Request -
State: open - Opened by hyukn 3 months ago
- 20 comments
#4080 - feat: Fallback to NCCL for various patterns when input size is large.
Pull Request -
State: open - Opened by hyukn 3 months ago
- 18 comments
#4079 - fix: Enable test case disabled by nvbug 5245262
Pull Request -
State: closed - Opened by HuiGao-NV 3 months ago
- 12 comments
#4078 - refactor: Allow models to override apply_qk_norm.
Pull Request -
State: open - Opened by yuxianq 3 months ago
- 28 comments
#4077 - [https://nvbugspro.nvidia.com/bug/5244006, https://nvbugspro.nvidia.com/bug/5240350][test] Unwaive guided decoding tests
Pull Request -
State: open - Opened by syuoni 3 months ago
- 31 comments
#4070 - fix: Update log query regex in perf integration test to match trtllm-bench reporting
Pull Request -
State: open - Opened by venkywonka 3 months ago
- 9 comments
#4069 - [fix][nvbug/5244009] Fix llama 4 test lists/scout accuracy issue
Pull Request -
State: closed - Opened by mikeiovine 3 months ago
- 23 comments
#4068 - fix: Set `trust_remote_code=True` when verifying config.json load
Pull Request -
State: closed - Opened by venkywonka 3 months ago
- 1 comment
#4067 - feat: Reduce branch overhead in groupRMSNorm kernels
Pull Request -
State: closed - Opened by SimengLiu-nv 3 months ago
- 10 comments
#4066 - feat: Support the Structural Tag in guided decoding
Pull Request -
State: open - Opened by Ubospica 3 months ago
- 15 comments
Labels: Community want to contribute, Community Engagement
#4065 - [feat/] enable attention DP in Llama4 maverick model - part 1
Pull Request -
State: open - Opened by zihaok 3 months ago
- 13 comments
#4064 - feat:[AutoDeploy] utilize torch._inductor.pattern_matcher to write pattern matcher
Pull Request -
State: open - Opened by Fridah-nv 3 months ago
- 3 comments
Labels: AutoDeploy
#4063 - [feat] trtllmGen MoE routing: added support for top groups and top K bounds
Pull Request -
State: open - Opened by MatthiasKohl 3 months ago
- 5 comments
#4061 - Refactor: Lookahead TRT workflow
Pull Request -
State: open - Opened by wili-65535 3 months ago
- 1 comment
#4057 - feat: adopt new logprob definition in PyTorch flow
Pull Request -
State: closed - Opened by tongyuantongyu 3 months ago
- 16 comments
#4053 - [TRTQA-2861][test]: add nemotron and llama4 cases into qa test
Pull Request -
State: closed - Opened by crazydemo 3 months ago
- 14 comments
#4047 - feat: Add heuristic for GroupRMSNorm kernel selection.
Pull Request -
State: open - Opened by SimengLiu-nv 3 months ago
- 4 comments
#4046 - test: [CI] remove closed bugs
Pull Request -
State: closed - Opened by xinhe-nv 3 months ago
- 15 comments
#4037 - Deepseek R1 and V3, FP4 quant, output quality issues at batch size > 2
Issue -
State: open - Opened by pankajroark 3 months ago
- 7 comments
Labels: bug, triaged
#4034 - feat: [nvbug/5261055][nvbug/5170160] non-invasive pipeline parallelism
Pull Request -
State: open - Opened by yuxianq 3 months ago
- 30 comments
#4030 - [DRAFT] Introducing multi-vocab token sampling for audio generation
Pull Request -
State: open - Opened by vklimkov-nvidia 3 months ago
- 3 comments
#4028 - feat:enable kvcache to be reused during request generation
Pull Request -
State: open - Opened by narutolhy 3 months ago
- 105 comments
Labels: triaged, Community want to contribute, Community Engagement
#4027 - Refactor: Restructure C++ tests for better modularisation of non-shared code
Pull Request -
State: closed - Opened by DomBrown 3 months ago
- 38 comments
#4020 - feat: Enable AutoDeploy to llm-eval example
Pull Request -
State: open - Opened by meenchen 3 months ago
- 4 comments
Labels: AutoDeploy
#4019 - feat: Add Slurm support and enable RTX Pro 6000 testing pipeline in CI
Pull Request -
State: closed - Opened by yuanjingx87 3 months ago
- 61 comments
#4016 - [Deepseek] Refactor Deepseek Decoder layer
Pull Request -
State: closed - Opened by hlu1 3 months ago
- 23 comments
#4011 - bench: TRTLLM-4936 Port benchmark_serving.py
Pull Request -
State: closed - Opened by kaiyux 3 months ago
- 6 comments
#3998 - [fix] Fix llama4 + eagle3
Pull Request -
State: open - Opened by mikeiovine 3 months ago
- 21 comments
#3993 - chore:update .gitignore for doc building task.
Pull Request -
State: closed - Opened by nv-guomingz 3 months ago
- 6 comments
#3992 - chore: enhance the cmake experience by ignoring the additional semicolon
Pull Request -
State: closed - Opened by nv-guomingz 3 months ago
- 17 comments
#3990 - chore: reduce size of the docker images
Pull Request -
State: closed - Opened by MartinMarciniszyn 3 months ago
- 14 comments
#3989 - fix:https://nvbugs/5246733
Pull Request -
State: open - Opened by nv-guomingz 3 months ago
- 2 comments
#3988 - fix: [nvbug/5241627] Fix AllReduce kernel hang issue when both tp and pp are enabled.
Pull Request -
State: open - Opened by hyukn 3 months ago
- 2 comments
#3986 - docs:update 0.19 docs
Pull Request -
State: closed - Opened by nv-guomingz 3 months ago
- 3 comments
#3985 - [TRTLLM-3925, https://nvbugs/5245262] [fix] Normalize LLM.generate API
Pull Request -
State: closed - Opened by syuoni 3 months ago
- 14 comments
#3984 - fix: Correctly sizes seqslotmanager considering pp.
Pull Request -
State: open - Opened by dcampora 3 months ago
- 6 comments
#3983 - feat: support to trace executor loop.
Pull Request -
State: open - Opened by yuxianq 3 months ago
- 6 comments
#3981 - infra: Add NIXL into the Dockerfile
Pull Request -
State: closed - Opened by Shixiaowei02 3 months ago
- 11 comments
#3980 - refactor: Move ModelSpec to core library
Pull Request -
State: open - Opened by Funatiq 3 months ago
- 14 comments
#3979 - Feat: Variable-Beam-Width-Search (VBWS) part4
Pull Request -
State: open - Opened by wili-65535 3 months ago
- 9 comments
#3978 - [fix] Enable pp tests
Pull Request -
State: open - Opened by yizhang-nv 3 months ago
#3977 - Qserve-w4a8 Shows Lower Computational Efficiency on H20
Issue -
State: open - Opened by StaryDing 3 months ago
- 4 comments
Labels: not a bug
#3976 - doc: Update 0.19.0 release notes
Pull Request -
State: open - Opened by kaiyux 3 months ago
#3975 - [https://nvbugspro.nvidia.com/bug/5247148][fix] Attention DP with overlap scheduler
Pull Request -
State: open - Opened by syuoni 3 months ago
- 14 comments
#3974 - feat: conditional disaggregation in disagg server
Pull Request -
State: closed - Opened by zhengd-nv 3 months ago
- 53 comments
#3973 - chore: update internal_cutlass_kernels.
Pull Request -
State: closed - Opened by nv-guomingz 3 months ago
- 9 comments
#3972 - fix[nvbug-5228840]: Add debug log memory information for memory allocation error
Pull Request -
State: open - Opened by HuiGao-NV 3 months ago
- 3 comments
#3971 - chore: update multi-gpu trigger file list
Pull Request -
State: closed - Opened by QiJune 3 months ago
- 6 comments
#3970 - fix: Add attention workspace memory check
Pull Request -
State: open - Opened by hlu1 3 months ago
- 3 comments
#3969 - Chore: 2025-04-29 CI allowlist update
Pull Request -
State: open - Opened by tburt-nv 3 months ago
#3968 - [TRTLLM-4623][fix] sync internal cutlass kernel changes
Pull Request -
State: closed - Opened by pamelap-nvidia 3 months ago
- 3 comments
#3967 - [fix] Eagle-2 LLMAPI pybind argument fix.
Pull Request -
State: open - Opened by jhaotingc 3 months ago
- 19 comments
#3966 - SW Architecture Enhancements
Issue -
State: open - Opened by mk-nvidia 3 months ago
Labels: roadmap, SW Architecture
#3964 - 1.0 Architecture
Issue -
State: open - Opened by mk-nvidia 3 months ago
Labels: roadmap, SW Architecture
#3963 - Disaggregated Prefill & Decode serving optimizations
Issue -
State: open - Opened by mk-nvidia 3 months ago
Labels: triaged, Performance, Investigating, roadmap
#3962 - MoE optimizations
Issue -
State: open - Opened by mk-nvidia 3 months ago
Labels: triaged, Performance, Investigating, roadmap
#3961 - Support versioned github.io doc to make it easy to map code with the corresponding doc version
Issue -
State: open - Opened by mk-nvidia 3 months ago
Labels: Documentation, triaged, Investigating, roadmap
#3960 - Re-organize the example directory into Feature level examples and Model level examples
Issue -
State: open - Opened by mk-nvidia 3 months ago
Labels: Documentation, triaged, Investigating, roadmap
#3958 - Intra-1.x-version backward compatibility for selected APIs.
Issue -
State: open - Opened by mk-nvidia 3 months ago
Labels: triaged, Investigating, roadmap
#3957 - [fix] Pad requests to maximum draft length in spec decode
Pull Request -
State: closed - Opened by mikeiovine 3 months ago
- 3 comments
#3955 - Plenty of regressions in trt-llm v0.20.0
Issue -
State: open - Opened by michaelfeil 3 months ago
- 1 comment
Labels: bug
#3954 - chore: remove release branch codeowners from main
Pull Request -
State: closed - Opened by tburt-nv 3 months ago
- 3 comments
#3953 - align decoder state with trtllm decoder
Pull Request -
State: closed - Opened by netanel-haber 3 months ago
#3952 - [https://nvbugs/5123103][fix] Fix torch compile for DeepSeekV3
Pull Request -
State: open - Opened by liji-nv 3 months ago
- 43 comments
#3951 - [https://nvbugs/5238105] fix: ModelRunnerCpp num_return_sequences
Pull Request -
State: open - Opened by Funatiq 3 months ago
- 44 comments
#3950 - test: Add fp8kv to DS-v3-lite integration tests.
Pull Request -
State: open - Opened by bobboli 3 months ago
- 23 comments
#3949 - chore: bump version to 0.20.0rc2
Pull Request -
State: closed - Opened by ZhanruiSunCh 3 months ago
- 6 comments
#3948 - infra: Fix pipeline step error in post merge
Pull Request -
State: open - Opened by ZhanruiSunCh 3 months ago
- 6 comments
#3946 - [TRTLLM-4480][doc] Documentation for new accuracy test suite and trtllm-eval
Pull Request -
State: closed - Opened by syuoni 3 months ago
- 15 comments
#3945 - fix: Move all casters to customCasters.
Pull Request -
State: open - Opened by dcampora 3 months ago
- 13 comments
#3943 - test: [CI] Add failed cases into waives.txt
Pull Request -
State: closed - Opened by xinhe-nv 3 months ago
- 9 comments
#3942 - fix cache transfer buffer
Pull Request -
State: closed - Opened by chuangz0 3 months ago
- 12 comments
#3936 - [TRTLLM-5000][feat] Pytorch implementation of ngram drafter
Pull Request -
State: open - Opened by thorjohnsen 3 months ago
- 5 comments
#3935 - chore: Remove duplicated get_sm_version.
Pull Request -
State: closed - Opened by yuxianq 3 months ago
- 3 comments