An open API service providing issue and pull request metadata for open source projects.

GitHub / triton-inference-server/tensorrtllm_backend issues and pull requests

Labelled with: bug
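
A listing like the one below could be retrieved programmatically from the API service described above. The following is a minimal sketch only: the base URL, path layout, and the response fields (`number`, `title`, `state`, `labels`) are assumptions modeled on an ecosyste.ms-style metadata API, not a documented contract of this page.

```python
# Minimal sketch: fetch issues for a repository from an ecosyste.ms-style
# metadata API and keep only those carrying the "bug" label, mirroring the
# listing below. Endpoint shape and response fields are assumptions.
import requests

BASE = "https://issues.ecosyste.ms/api/v1"  # assumed host and API prefix
REPO = "triton-inference-server/tensorrtllm_backend"

url = f"{BASE}/hosts/GitHub/repositories/{REPO}/issues"
resp = requests.get(url, params={"per_page": 100}, timeout=30)
resp.raise_for_status()

# Assumes the body is a JSON array of issue objects whose "labels" field
# is a list of label-name strings.
bugs = [i for i in resp.json() if "bug" in (i.get("labels") or [])]
for issue in bugs:
    print(f"#{issue['number']} - {issue['title']} ({issue['state']})")
```

Filtering client-side keeps the sketch independent of whether the service exposes a server-side label query parameter.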

#750 - lora_config shape mismatch when using converted LoRA at runtime

Issue - State: open - Opened by paulhendricks 7 months ago
Labels: bug

#743 - whisper tensorrt-llm backend drops accuracy for the small.en model

Issue - State: closed - Opened by hualin-wu-2000 7 months ago
Labels: bug

#738 - Run error after docker export then import of the image

Issue - State: closed - Opened by XiaoBin1992 7 months ago
Labels: bug

#727 - tensorrtllm_backend doesn't work with remote repos

Issue - State: open - Opened by ShuaiShao93 8 months ago
Labels: bug

#707 - Failed to build TensorRT-LLM whisper Decoder

Issue - State: open - Opened by muhammad-faizan-122 10 months ago
Labels: bug

#705 - Inconsistent Batch Index Order in Decoupled Mode with trt-llm and triton trtllm backend

Issue - State: closed - Opened by Oldpan 10 months ago - 2 comments
Labels: bug

#692 - Mllama ignores input image when deployed in triton

Issue - State: open - Opened by mutkach 10 months ago
Labels: bug

#686 - Unable to build from source for tag `v0.16.0`.

Issue - State: open - Opened by jingzhaoou 10 months ago
Labels: bug

#682 - Beam search diversity lost with in-flight batching

Issue - State: open - Opened by Grace-YingHuang 10 months ago
Labels: bug

#679 - Assertion failed: sizeof(T) <= remaining_buffer_size

Issue - State: open - Opened by gawain000000 11 months ago
Labels: bug

#678 - Inference error when using draft-target model

Issue - State: open - Opened by pimang62 11 months ago
Labels: bug

#667 - Inflight Batching not working

Issue - State: open - Opened by frosk1 11 months ago
Labels: bug

#661 - triton server multi-request dynamic_batching not working

Issue - State: open - Opened by kazyun 12 months ago
Labels: bug

#656 - Qwen2___5-0___5B-Instruct convert_checkpoint error

Issue - State: open - Opened by giftyang 12 months ago
Labels: bug

#651 - triton streaming is not working as expected

Issue - State: open - Opened by robosina about 1 year ago
Labels: bug

#646 - Stub process 'whisper_bls_0_0' is not healthy.

Issue - State: open - Opened by MrD005 about 1 year ago
Labels: bug

#642 - With the same engine, the trtllm backend is 40x slower than TensorRT-LLM/examples/run.py

Issue - State: closed - Opened by ShuaiShao93 about 1 year ago - 1 comment
Labels: bug

#640 - problem with streaming

Issue - State: closed - Opened by Alireza3242 about 1 year ago - 1 comment
Labels: bug

#639 - Support non-detached mode for python trtllm backend

Issue - State: open - Opened by ShuaiShao93 about 1 year ago
Labels: bug

#630 - The output of BLS is unstable

Issue - State: open - Opened by dwq370 about 1 year ago
Labels: bug

#626 - Streaming Inference Failure

Issue - State: open - Opened by imilli about 1 year ago
Labels: bug

#625 - The GPU memory usage is too high.

Issue - State: open - Opened by imilli about 1 year ago
Labels: bug

#623 - Failed install in nvcr.io/nvidia/tritonserver:24.08-trtllm-python-py3

Issue - State: open - Opened by wwx007121 about 1 year ago
Labels: bug

#619 - ZeroDivisionError thrown when benchmarking

Issue - State: closed - Opened by moyerlee about 1 year ago
Labels: bug

#616 - fill_template.py and gpu_device_ids

Issue - State: open - Opened by Alireza3242 about 1 year ago
Labels: bug

#610 - Is ReDrafter supported by the TensorRT-LLM backend?

Issue - State: open - Opened by vkc1vk about 1 year ago - 2 comments
Labels: bug

#609 - Dynamic batching not working

Issue - State: open - Opened by ShuaiShao93 about 1 year ago
Labels: bug

#601 - Qwen2-14B inference output is garbled

Issue - State: open - Opened by kazyun about 1 year ago
Labels: bug

#598 - generation logits dtype bug

Issue - State: open - Opened by binhtranmcs about 1 year ago - 3 comments
Labels: bug

#595 - Can't build GPT-J 6B

Issue - State: open - Opened by coppock about 1 year ago
Labels: bug

#593 - Is `no_repeat_ngram_size` generation option supported?

Issue - State: open - Opened by vnkc1 about 1 year ago
Labels: bug

#583 - Triton server + LoRA: the same input gives different results across runs

Issue - State: closed - Opened by PAOPAO6 over 1 year ago
Labels: bug

#579 - unable to launch model with tensorrt_llm

Issue - State: closed - Opened by janpetrov over 1 year ago - 8 comments
Labels: bug

#577 - Unable to launch triton server with TP

Issue - State: open - Opened by dhruvmullick over 1 year ago
Labels: bug

#576 - Unable to launch triton server with TP

Pull Request - State: closed - Opened by dhruvmullick over 1 year ago - 1 comment
Labels: bug

#573 - Inference server stalling

Issue - State: open - Opened by siddhatiwari over 1 year ago - 5 comments
Labels: bug

#572 - Failed to launch triton server: the tensorrt_llm protobuf file failed to load

Issue - State: open - Opened by KuntaiDu over 1 year ago - 2 comments
Labels: bug

#569 - LLAMA3: Unable to launch with tp 2

Issue - State: open - Opened by mindhash over 1 year ago
Labels: bug

#566 - Building the Qwen2-72B model into TensorRT engines fails

Issue - State: open - Opened by wangpeilin over 1 year ago
Labels: bug

#564 - v0.11.0 release fails when TP>1

Issue - State: open - Opened by daulet over 1 year ago
Labels: bug

#563 - Triton crashes on boot

Issue - State: open - Opened by daulet over 1 year ago
Labels: bug

#562 - Unable to initialize shared memory key 'triton_python_backend_shm_region_2'

Pull Request - State: closed - Opened by zhangyu68 over 1 year ago - 1 comment
Labels: bug

#560 - How to calculate the number of LoRAs that can be cached in the host cache?

Issue - State: open - Opened by limertang over 1 year ago
Labels: bug

#559 - How to calculate the number of cached LoRAs

Pull Request - State: open - Opened by limertang over 1 year ago
Labels: bug

#558 - inflight_batcher_llm example batching

Issue - State: open - Opened by PKaralupov over 1 year ago
Labels: bug

#557 - `min_length` parameter doesn't work

Issue - State: open - Opened by vnkc1 over 1 year ago
Labels: bug

#555 - Llama 3.1 Tool-Calling Support

Issue - State: open - Opened by LanceB57 over 1 year ago
Labels: bug

#553 - Server stuck after `Starting Python backend stub`

Issue - State: open - Opened by DZADSL72-00558 over 1 year ago
Labels: bug

#550 - Bugs in v0.10.0 of tensorrtllm_backend

Issue - State: closed - Opened by x-transformers over 1 year ago - 2 comments
Labels: bug

#542 - Unable to build tensorrt_llm backend; problems with CXX11 ABI

Issue - State: closed - Opened by jlewi over 1 year ago - 3 comments
Labels: bug

#532 - Achieving Benchmark Performance on Triton Inference Server

Issue - State: open - Opened by LanceB57 over 1 year ago
Labels: bug

#531 - Deserializing Engine Version Mismatch

Issue - State: closed - Opened by LanceB57 over 1 year ago - 1 comment
Labels: bug

#525 - Mixtral 8x7B fails to load the preprocessing model

Issue - State: closed - Opened by christian-ci over 1 year ago - 1 comment
Labels: bug

#524 - Error when launching a multi-GPU triton server

Issue - State: open - Opened by dwq370 over 1 year ago
Labels: bug

#513 - Accumulation of tokens while beam_width > 1

Issue - State: open - Opened by wxsms over 1 year ago
Labels: bug

#511 - Exception when disabling "inflight_fused_batching"

Issue - State: open - Opened by TheCodeWrangler over 1 year ago
Labels: bug

#509 - 3rd Tritonserver fails to respond

Pull Request - State: open - Opened by njaramish over 1 year ago
Labels: bug

#508 - Assertion failed: Invalid tensor name: decoder_input_lengths

Issue - State: open - Opened by HowardChenRV over 1 year ago
Labels: bug

#506 - Key 'lora_config' not found

Issue - State: open - Opened by LanceB57 over 1 year ago
Labels: bug

#505 - How to set `ignore_eos` when benchmarking TensorRT-LLM

Issue - State: closed - Opened by zhyncs over 1 year ago - 2 comments
Labels: bug

#503 - No Text Output

Issue - State: open - Opened by Adevils over 1 year ago
Labels: bug

#493 - Deepseek model streaming mode with Chinese character �?

Issue - State: open - Opened by activezhao over 1 year ago
Labels: bug

#488 - Error in streaming mode noting that execute function should return None

Issue - State: closed - Opened by kisseternity over 1 year ago - 2 comments
Labels: bug, triaged, need more info

#487 - Repeated answers when deploying the LLaMA3-Instruct-8B model in triton server

Issue - State: closed - Opened by AndyZZt over 1 year ago - 2 comments
Labels: bug

#486 - [Bug] Output generation does not stop at stop token </s>

Issue - State: closed - Opened by Hao-YunDeng over 1 year ago - 2 comments
Labels: bug