Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / triton-inference-server/pytriton issues and pull requests
#95 - Network latency is very high
Issue -
State: open - Opened by WangGewu 5 days ago
- 4 comments
#94 - A pretty interesting performance bug.
Issue -
State: open - Opened by jiiiong 21 days ago
- 1 comment
#93 - Feature request: Update numpy dependency to allow >=2.0
Issue -
State: open - Opened by dmatlak 25 days ago
#92 - Error in production inference with GCP Cloud Run - Server returns before model is ready
Issue -
State: closed - Opened by sricke 3 months ago
- 2 comments
#91 - Model repo
Issue -
State: closed - Opened by tylerweitzman 3 months ago
- 8 comments
Labels: enhancement, question, Stale
#90 - do not recreate task receiving messages from ZMQ every second
Pull Request -
State: closed - Opened by catwell 3 months ago
#89 - How to use VLMs with pytriton and vllm
Issue -
State: closed - Opened by sourabh-patil 3 months ago
- 5 comments
Labels: Stale
#88 - Server randomly gets stuck since update
Issue -
State: closed - Opened by catwell 4 months ago
- 14 comments
#87 - Failed to successfully start the service
Issue -
State: closed - Opened by DracoMori 4 months ago
- 3 comments
Labels: Stale
#86 - Can not host "meta-llama/Llama-3.2-1B-Instruct" using vllm backend on RTX 2080
Issue -
State: closed - Opened by sourabh-patil 4 months ago
- 3 comments
#85 - Add Jenkins pipeline to build Python Backend Stubs for Triton
Pull Request -
State: closed - Opened by this 4 months ago
- 1 comment
#84 - A minimalistic example of using Cuda Shared memory
Issue -
State: closed - Opened by sayanmutd 5 months ago
- 5 comments
Labels: Stale
#83 - expose requested_output_names in Request
Pull Request -
State: closed - Opened by catwell 5 months ago
- 3 comments
#82 - [Question] Working with custom request schemas
Issue -
State: closed - Opened by kheyer 5 months ago
- 3 comments
Labels: Stale
#81 - [Question] Can I bind additional models after triton has started serving?
Issue -
State: closed - Opened by caffeinism 6 months ago
- 2 comments
#80 - PA Migration: Doc Updates
Pull Request -
State: closed - Opened by fpetrini15 6 months ago
#79 - [Question] Tensor parallelism for tensorrt_llm
Issue -
State: open - Opened by JoeLiu996 8 months ago
- 1 comment
Labels: non-stale
#78 - disallow use of numpy 2 for now
Pull Request -
State: closed - Opened by catwell 8 months ago
- 5 comments
Labels: bug
#77 - Model is not initialized to GPU.
Issue -
State: closed - Opened by jaehyeong-bespin 8 months ago
- 7 comments
Labels: Stale
#76 - [Error following Quick Start] - tritonclient.utils.InferenceServerException: [400] client received an empty response from the server.
Issue -
State: closed - Opened by sophot 8 months ago
- 1 comment
#75 - multi-gpu inference with pytriton got worse TPS
Issue -
State: closed - Opened by lionsheep24 8 months ago
- 7 comments
Labels: Stale
#74 - [Question] About the subprocess for multi-instance
Issue -
State: closed - Opened by leafjungle 9 months ago
- 4 comments
Labels: Stale
#73 - [Question] will the server fork several subprocess when infer_func is a list?
Issue -
State: closed - Opened by leafjungle 9 months ago
- 1 comment
#72 - [HELP] what is the problem for this demo code?
Issue -
State: closed - Opened by leafjungle 9 months ago
- 1 comment
#71 - PyTriton produces DEBUG output by default?
Issue -
State: closed - Opened by JanFSchulte 9 months ago
- 1 comment
#70 - [Question] What is the relationship between "model_repository" and "infer_func"?
Issue -
State: closed - Opened by leafjungle 10 months ago
#69 - Model instances question
Issue -
State: closed - Opened by tinsss 10 months ago
- 3 comments
Labels: Stale
#68 - Is there a way to run pytriton on glibc2.32?
Issue -
State: closed - Opened by DZ9 10 months ago
- 9 comments
Labels: Stale
#67 - double container in kubernetes
Issue -
State: closed - Opened by leafjungle 10 months ago
- 1 comment
#66 - [Bug] Fail to deploy serving model on the Azure Machine Learning Platform. Exited with failure (confusing error information and exit code)
Issue -
State: closed - Opened by keli-wen 11 months ago
- 3 comments
Labels: bug, Stale
#65 - Example of TensorRT-LLM Whisper backend for PyTriton
Issue -
State: open - Opened by aleksandr-smechov 11 months ago
- 5 comments
Labels: enhancement, non-stale
#64 - fix: Remove duplicated paragraph
Pull Request -
State: closed - Opened by getty708 almost 1 year ago
- 3 comments
Labels: documentation
#63 - Python InferenceServerClient issue when call close() from __del__
Issue -
State: closed - Opened by lionsheep0724 about 1 year ago
- 7 comments
Labels: bug, Stale
#62 - Put `pytriton.client` in the separate package/wheel.
Issue -
State: open - Opened by flyingleafe about 1 year ago
- 3 comments
Labels: enhancement, non-stale
#61 - pytriton use onnx is slower than onnx runtime for tiny bert model
Issue -
State: open - Opened by yan123456jie about 1 year ago
- 1 comment
Labels: bug, non-stale
#60 - how to define a new api and input like flask
Issue -
State: closed - Opened by Pobby321 about 1 year ago
- 3 comments
Labels: Stale
#59 - nav.optimize() bug
Issue -
State: closed - Opened by Pobby321 about 1 year ago
- 7 comments
Labels: Stale
#58 - Questions about new feature at 0.5.0 : decoupled model
Issue -
State: closed - Opened by lionsheep0724 about 1 year ago
- 4 comments
Labels: Stale
#57 - onnx and tensorrt model supported?
Issue -
State: closed - Opened by oreo-lp about 1 year ago
- 3 comments
Labels: Stale
#56 - ModuleNotFoundError: No module named '_ctypes' error when run pytriton server with 0.5.0
Issue -
State: closed - Opened by lionsheep0724 about 1 year ago
- 5 comments
Labels: bug, Stale
#55 - The content of this document is wrong
Issue -
State: closed - Opened by HJH0924 about 1 year ago
- 2 comments
Labels: documentation, Stale
#54 - The content of this document is incorrect
Issue -
State: closed - Opened by HJH0924 about 1 year ago
- 2 comments
Labels: documentation, Stale
#53 - What is the proxy backend in pytriton?
Issue -
State: closed - Opened by HJH0924 about 1 year ago
- 4 comments
Labels: Stale
#52 - pytriton is slower than triton
Issue -
State: closed - Opened by yan123456jie about 1 year ago
- 6 comments
#51 - AttributeError: '_thread.RLock' object has no attribute '_recursion_count'
Issue -
State: closed - Opened by dogky123 about 1 year ago
- 4 comments
Labels: bug, Stale
#50 - How to infer with sequence ?
Issue -
State: open - Opened by monsterlyg about 1 year ago
- 3 comments
Labels: enhancement, question, non-stale
#49 - fix boot when allow_http=False
Pull Request -
State: closed - Opened by catwell about 1 year ago
- 3 comments
#48 - [problem]How to allowed multiple models running on same GPU at same time?
Issue -
State: closed - Opened by Firefly-Dance about 1 year ago
- 5 comments
Labels: Stale
#47 - while inference by running server.py and client.py why client is taking gpu memory.
Issue -
State: closed - Opened by Justsubh01 about 1 year ago
#46 - Enabling Redis cache throws: Unable to find shared library libtritonserver.so
Issue -
State: closed - Opened by zbloss about 1 year ago
- 5 comments
Labels: bug, Stale
#45 - Error deploying model on Vertex AI
Issue -
State: closed - Opened by sricke about 1 year ago
- 16 comments
Labels: bug, non-stale
#44 - Support Mac installation
Issue -
State: open - Opened by zbloss about 1 year ago
- 16 comments
Labels: enhancement, non-stale
#43 - Streaming and batching
Issue -
State: closed - Opened by giuseppe915 about 1 year ago
- 6 comments
#42 - How to pass priority level during inference?
Issue -
State: open - Opened by jackielam918 over 1 year ago
- 3 comments
Labels: enhancement, non-stale
#41 - TensorRT-LLM support?
Issue -
State: closed - Opened by LouisCastricato over 1 year ago
- 4 comments
Labels: non-stale
#40 - Pytriton don't nativly support pytorch or tensorflow dtype
Issue -
State: closed - Opened by dahai331 over 1 year ago
- 3 comments
Labels: question
#39 - tritonclient.grpc doesn't support timeout for other commands than infer.
Issue -
State: closed - Opened by dogky123 over 1 year ago
- 4 comments
Labels: enhancement, non-stale
#38 - Client network and/or connection timeout is smaller than requested timeout_s. This may cause unexpected behavior.
Issue -
State: closed - Opened by lfxx over 1 year ago
- 2 comments
Labels: enhancement
#37 - Stub process 'REGIS_0' is not healthy
Issue -
State: closed - Opened by lfxx over 1 year ago
- 2 comments
#36 - Support for ubuntu20.04
Issue -
State: closed - Opened by lfxx over 1 year ago
- 3 comments
Labels: enhancement
#35 - OUTPUT triton: list or tuple or any kind of Iterables
Issue -
State: closed - Opened by dogky123 over 1 year ago
- 4 comments
Labels: Stale
#34 - Binary output truncated...
Issue -
State: closed - Opened by rilango over 1 year ago
- 4 comments
Labels: Stale
#33 - Update megatron example so that it would support latest changes in NeMo
Pull Request -
State: closed - Opened by PeganovAnton over 1 year ago
- 2 comments
Labels: Stale
#32 - Best practices with ModelClient
Issue -
State: closed - Opened by markbarna over 1 year ago
- 4 comments
Labels: question
#31 - Example of (or support for) Inference Callable of Triton ensemble definition
Issue -
State: closed - Opened by michaelhagel over 1 year ago
- 11 comments
Labels: enhancement, Stale
#30 - How to check if the server-side service is still online through the client-side?
Issue -
State: closed - Opened by lfxx over 1 year ago
- 2 comments
#29 - Has this repo been abandoned?
Issue -
State: closed - Opened by lfxx over 1 year ago
#28 - Are there any examples of using python multiprocessing to run multiple copies of model on same GPU
Issue -
State: closed - Opened by wilson97 over 1 year ago
- 12 comments
Labels: Stale
#27 - Does pytriton support hot loading models?
Issue -
State: closed - Opened by Fjallraven-hc over 1 year ago
- 4 comments
Labels: enhancement, question
#26 - Support for aarch64
Issue -
State: closed - Opened by j-delvaux over 1 year ago
- 5 comments
Labels: enhancement, question, non-stale
#25 - Sagemaker example
Issue -
State: closed - Opened by enricorotundo over 1 year ago
- 3 comments
#24 - some questions
Issue -
State: closed - Opened by lfxx over 1 year ago
- 4 comments
#23 - why python3.7 is not supported?
Issue -
State: closed - Opened by lfxx over 1 year ago
- 2 comments
#22 - Issues inferencing HTTP with Bart model
Issue -
State: closed - Opened by jridevapp over 1 year ago
- 2 comments
Labels: question
#21 - How to unload models?
Issue -
State: closed - Opened by lfxx over 1 year ago
- 7 comments
Labels: Stale
#20 - Fix error in running example inference for add_sub_notebook
Pull Request -
State: closed - Opened by yeahdongcn over 1 year ago
- 2 comments
Labels: documentation
#19 - unable to install
Issue -
State: closed - Opened by riyaj8888 over 1 year ago
- 4 comments
Labels: Stale
#18 - Updated documentation links
Pull Request -
State: closed - Opened by mahimairaja over 1 year ago
- 5 comments
Labels: documentation
#17 - How to implement machine learning model, receive request as json file from client and process them
Issue -
State: closed - Opened by thHust191 over 1 year ago
- 3 comments
Labels: Stale
#16 - is there any performance compare table between pytriton depoly with original python depoly example on sd, gpt, tts, detection etc?
Issue -
State: closed - Opened by foocker over 1 year ago
- 3 comments
Labels: Stale
#15 - Model Analyzer compatibility
Issue -
State: closed - Opened by matty-rose over 1 year ago
- 5 comments
Labels: Stale
#14 - TypeError: a bytes-like object is required, not 'tuple'
Issue -
State: closed - Opened by rjac-ml over 1 year ago
- 3 comments
#13 - How to use postman or use json file request to request server?
Issue -
State: closed - Opened by seawater668 over 1 year ago
- 4 comments
Labels: Stale
#12 - Can I use pytriton to deploy models to a remote server?
Issue -
State: closed - Opened by gerasim13 over 1 year ago
- 7 comments
#11 - Load python backend models when pytriton starts
Issue -
State: closed - Opened by oeway over 1 year ago
- 4 comments
#10 - Is batch decorator deprecated?
Issue -
State: closed - Opened by jinwonkim93 over 1 year ago
- 2 comments
#9 - May input tensor of the Callable be GPU addressed Tensor?
Issue -
State: closed - Opened by yangxianpku over 1 year ago
- 3 comments
Labels: Stale
#8 - Dynamic Batching Seems Invalid?
Issue -
State: closed - Opened by yangxianpku over 1 year ago
- 9 comments
Labels: Stale
#7 - Support for DALI and TensorRT
Issue -
State: closed - Opened by yangxianpku over 1 year ago
- 7 comments
Labels: Stale
#6 - [question] Do you support the 'GRPC stream' feature?
Issue -
State: closed - Opened by taehwakkwon over 1 year ago
- 6 comments
Labels: Stale
#5 - how to use instance group
Issue -
State: closed - Opened by eshuka over 1 year ago
- 7 comments
Labels: Stale
#4 - [questions] make function call to triton inference server rather than using grpc/http
Issue -
State: closed - Opened by delonleonard almost 2 years ago
- 4 comments
#3 - Document pros and cons
Issue -
State: closed - Opened by denis-angilella almost 2 years ago
- 4 comments
Labels: Stale
#2 - Update PA docs links
Pull Request -
State: closed - Opened by matthewkotila almost 2 years ago
#1 - Failed to start the server
Issue -
State: closed - Opened by oeway about 2 years ago
- 11 comments
Labels: Stale