GitHub / triton-inference-server/onnxruntime_backend issues and pull requests
#302 - TPRD-1325: adding policy argument
Pull Request -
State: closed - Opened by mc-nv 4 months ago
#301 - GPU VRAM Leak with Python Backend BLS Requests to ORT Backend
Issue -
State: closed - Opened by WoodieDudy 4 months ago
#300 - OpenVINO CPU Execution Accelerator Fails with "Device with 'NPU' name is not registered" Error
Issue -
State: open - Opened by alib1513 4 months ago
#299 - fix: Set correct device ID with TRT EP
Pull Request -
State: closed - Opened by milesial 5 months ago
#298 - Advance upstreams to 25.02
Pull Request -
State: closed - Opened by nv-kmcgill53 5 months ago
- 6 comments
#297 - Enable remote cache if applicable
Pull Request -
State: closed - Opened by mc-nv 5 months ago
#296 - Update default branch post 25.01
Pull Request -
State: closed - Opened by mc-nv 6 months ago
#295 - Downgrade patchelf version from 0.18.0 to 0.17.2
Pull Request -
State: closed - Opened by nv-kmcgill53 6 months ago
#294 - Add support for `session.use_device_allocator_for_initializers` in onnxruntime_backend
Pull Request -
State: closed - Opened by pskiran1 6 months ago
- 2 comments
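PR #294 and issue #166 concern exposing the ONNX Runtime session config entry `session.use_device_allocator_for_initializers` through the backend. A minimal `config.pbtxt` fragment enabling it might look like the following sketch; the key name comes from the PR title, while the enabling value `"1"` follows ONNX Runtime's convention for boolean session config entries and should be checked against the backend README:

```protobuf
# Hypothetical config.pbtxt fragment for the onnxruntime backend.
# Passes the ORT session config entry through backend parameters.
parameters {
  key: "session.use_device_allocator_for_initializers"
  value: { string_value: "1" }
}
```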
#293 - Limited GPU Memory Utilization in ColBERT Deployment
Issue -
State: open - Opened by nauyan 6 months ago
#292 - Parallel warmup when using multiple GPUs
Issue -
State: open - Opened by asaff1 7 months ago
#291 - Update CUDA archs in ORT
Pull Request -
State: closed - Opened by pvijayakrish 7 months ago
- 2 comments
#290 - Update default branch post 24.12
Pull Request -
State: closed - Opened by mc-nv 7 months ago
#289 - Extract archive in different location
Pull Request -
State: closed - Opened by mc-nv 7 months ago
#284 - fix: Fix L0_onnx_execution_provider
Pull Request -
State: closed - Opened by yinggeh 9 months ago
#258 - build: Add WAR for CUDA 12.5 build issue (#257)
Pull Request -
State: closed - Opened by rmccorm4 about 1 year ago
#203 - Onnxruntime backend error when workload is high since Triton uses CUDA 12
Issue -
State: open - Opened by zeruniverse about 2 years ago
- 7 comments
Labels: bug
#166 - Expose `session.use_device_allocator_for_initializers` in onnxruntime_backend to completely shrink arena
Issue -
State: closed - Opened by zeruniverse over 2 years ago
#102 - Memory Leaks Cause Server OOMs (CPU, TF2/ONNX)
Issue -
State: closed - Opened by narolski over 3 years ago
- 18 comments
#99 - Cannot build `r22.01` onnxruntime_backend with OpenVino
Issue -
State: closed - Opened by narolski over 3 years ago
- 3 comments
#98 - enable io binding for outputs
Pull Request -
State: closed - Opened by askhade over 3 years ago
- 2 comments
#97 - Model hangs when warming up for tensorrt optimization
Issue -
State: closed - Opened by chajath over 3 years ago
- 2 comments
#96 - Half of CPU threads not utilized when running GPU model
Issue -
State: open - Opened by wilsoncai1992 over 3 years ago
#95 - tritonclient.utils.InferenceServerException: [StatusCode.INTERNAL] onnx runtime error 2: not enough space: expected 270080, got 261760
Issue -
State: open - Opened by piekey1994 over 3 years ago
- 7 comments
#94 - Not able to load simple iris model: Getting error: `Unsupported ONNX Type 'ONNX_TYPE_SEQUENCE'`
Issue -
State: open - Opened by KshitizLohia over 3 years ago
- 13 comments
#93 - Fix string state support (#92)
Pull Request -
State: closed - Opened by Tabrizian over 3 years ago
#92 - Fix string state support
Pull Request -
State: closed - Opened by Tabrizian over 3 years ago
#91 - Fix error handling to include failed counts in infer stat
Pull Request -
State: closed - Opened by tanmayv25 over 3 years ago
#90 - Responder should explicitly use 'first dimension batching'
Pull Request -
State: closed - Opened by deadeyegoodwin over 3 years ago
#89 - Error in onnxruntime-openvino backend when run with Triton
Issue -
State: open - Opened by mayani-nv over 3 years ago
- 7 comments
#88 - how to make the onnx to support batch
Issue -
State: closed - Opened by zijin520 over 3 years ago
- 2 comments
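Issue #88 asks how to serve an ONNX model with batching. Assuming the exported model already has a dynamic first (batch) dimension on its inputs and outputs, a sketch of the corresponding `config.pbtxt` might look like this; the model, tensor names, and shapes below are illustrative, not taken from the issue:

```protobuf
# Illustrative config.pbtxt: max_batch_size > 0 tells Triton the
# model's first dimension is the batch dimension, so dims below
# omit it. Names/shapes are assumptions for the example.
name: "my_onnx_model"
backend: "onnxruntime"
max_batch_size: 8
input [
  { name: "input", data_type: TYPE_FP32, dims: [ 3, 224, 224 ] }
]
output [
  { name: "output", data_type: TYPE_FP32, dims: [ 1000 ] }
]
```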
#87 - Segmentation fault on initialization
Issue -
State: closed - Opened by vigneshwaran-nv-10329 over 3 years ago
- 7 comments
Labels: more-info-needed
#86 - Model loading failure: densenet_onnx fails to load due to "pthread_setaffinity_np" failure
Issue -
State: open - Opened by shrek over 3 years ago
- 4 comments
#85 - enable global threadpool
Pull Request -
State: closed - Opened by askhade over 3 years ago
#84 - Why is it slower to use openvino than not ?
Issue -
State: closed - Opened by aixuedegege over 3 years ago
- 2 comments
#83 - remove version dependency on libonnxruntime.so
Pull Request -
State: closed - Opened by askhade over 3 years ago
- 1 comment
#82 - add more config options for ort
Pull Request -
State: closed - Opened by askhade over 3 years ago
- 3 comments
#81 - Add execution_mode and inter_op_num_threads config options
Issue -
State: closed - Opened by askhade over 3 years ago
#80 - enable ort debug builds on linux
Pull Request -
State: closed - Opened by askhade over 3 years ago
- 1 comment
#79 - update tensorrt parser to support TRT 8.2
Pull Request -
State: closed - Opened by askhade almost 4 years ago
- 4 comments
#78 - Add support for implicit state
Pull Request -
State: closed - Opened by Tabrizian almost 4 years ago
- 5 comments
#77 - Refactor docker scripts generation for building ORT on Windows and Linux
Pull Request -
State: closed - Opened by jcwchen almost 4 years ago
#76 - Model Loading failure: Invalid argument: model output cannot have empty reshape for non-batching model for test_model
Issue -
State: closed - Opened by supercharleszhu almost 4 years ago
- 11 comments
Labels: more-info-needed
#75 - Fix model config examples using string_value
Pull Request -
State: closed - Opened by yoldemir almost 4 years ago
#74 - Document ORT OpenVino EP
Pull Request -
State: closed - Opened by deadeyegoodwin almost 4 years ago
#73 - ORT backend always returns tensor on CPU
Issue -
State: closed - Opened by aklife97 almost 4 years ago
- 7 comments
#72 - enable ort configs
Pull Request -
State: closed - Opened by askhade almost 4 years ago
#71 - Return early if ortrun bumps into a inference error
Pull Request -
State: closed - Opened by jcwchen almost 4 years ago
- 13 comments
#70 - Use repo from local dir instead of git
Pull Request -
State: closed - Opened by CoderHam almost 4 years ago
#69 - ORT backend causes Triton to crash for a failed inference run
Issue -
State: closed - Opened by tanmayv25 almost 4 years ago
#68 - ARM64 build support
Pull Request -
State: closed - Opened by deadeyegoodwin almost 4 years ago
#67 - use default thread count instead of hard coding to 1
Pull Request -
State: closed - Opened by askhade almost 4 years ago
#66 - Re-use generated TensorRT plan when instance groups or multi-gpu
Issue -
State: closed - Opened by damonmaria almost 4 years ago
- 6 comments
#65 - Guidance on building Onnx backend without docker
Issue -
State: closed - Opened by smijolovic almost 4 years ago
- 2 comments
#64 - enable more trt options
Pull Request -
State: closed - Opened by askhade almost 4 years ago
#63 - explicitly add cmake cuda architectures for windows and remove use_openmp build option
Pull Request -
State: closed - Opened by askhade almost 4 years ago
- 3 comments
#62 - ort trt update
Pull Request -
State: closed - Opened by askhade almost 4 years ago
- 1 comment
#61 - Use ORT 1.8.1 with changes required to enable ONNX-TRT for TRT8
Pull Request -
State: closed - Opened by deadeyegoodwin almost 4 years ago
- 1 comment
#60 - Update ortbackend
Pull Request -
State: closed - Opened by askhade almost 4 years ago
#59 - update ort version
Pull Request -
State: closed - Opened by deadeyegoodwin almost 4 years ago
#58 - Segfault during L0_lifecycle testing of 21.08 onnxruntime_backend
Issue -
State: closed - Opened by deadeyegoodwin almost 4 years ago
- 6 comments
#57 - Extend START, END, READY controls to allow BOOL type
Pull Request -
State: closed - Opened by krishung5 almost 4 years ago
#56 - How can I control the cuda memory for onnx models?
Issue -
State: closed - Opened by LLsmile almost 4 years ago
- 3 comments
#55 - update ort version
Pull Request -
State: closed - Opened by askhade almost 4 years ago
- 1 comment
#54 - CPU only build for onnx runtime
Pull Request -
State: closed - Opened by jbkyang-nvi about 4 years ago
#53 - Update ORT to 1.8.1
Issue -
State: closed - Opened by deadeyegoodwin about 4 years ago
- 2 comments
#52 - In Dockerfile gen script, CUDNN_VERSION should be obtained from docker image
Issue -
State: open - Opened by GuanLuo about 4 years ago
#51 - Expose CUDNN home as CMake option. Fix PATH CMake variable checking
Pull Request -
State: closed - Opened by GuanLuo about 4 years ago
#50 - Dockerfile gen script for building ORT libraries should condition on TRITON_ENABLE_GPU
Issue -
State: closed - Opened by deadeyegoodwin about 4 years ago
#49 - Modify backend to be host policy aware
Pull Request -
State: closed - Opened by GuanLuo about 4 years ago
- 2 comments
#48 - always bind output to cpu
Pull Request -
State: closed - Opened by askhade about 4 years ago
#47 - enable configuring trt options
Pull Request -
State: closed - Opened by askhade about 4 years ago
#46 - Fixing mem leak reported in #45
Pull Request -
State: closed - Opened by askhade about 4 years ago
- 2 comments
#45 - Memory leak in ONNX runtime backend
Issue -
State: closed - Opened by Tabrizian about 4 years ago
- 3 comments
#44 - enable debug builds for ORT
Pull Request -
State: closed - Opened by askhade about 4 years ago
- 5 comments
#43 - [E:onnxruntime:, sequential_executor.cc:333 Execute]
Issue -
State: closed - Opened by htran170642 about 4 years ago
- 2 comments
#42 - enable iobinding
Pull Request -
State: closed - Opened by askhade about 4 years ago
- 1 comment
#41 - Expand ONNXRuntime backend for Jetson/zero-container build
Pull Request -
State: closed - Opened by CoderHam over 4 years ago
#40 - Build without docker
Issue -
State: closed - Opened by vigneshwaran-nv-10329 over 4 years ago
- 3 comments
#39 - add ragged batch support for ort backend
Pull Request -
State: closed - Opened by askhade over 4 years ago
- 1 comment
#38 - Include onnxruntime_providers_shared with TensorRT EP
Pull Request -
State: closed - Opened by deadeyegoodwin over 4 years ago
#37 - Improve error messages to be more clear
Pull Request -
State: closed - Opened by CoderHam over 4 years ago
#36 - Add log warning to notify user of max_batch_size autofill policy
Pull Request -
State: closed - Opened by CoderHam over 4 years ago
#35 - Use advanced version of ProcessTensor()
Pull Request -
State: closed - Opened by GuanLuo over 4 years ago
#34 - CPU inference is much slower than with ONNX Runtime directly
Issue -
State: open - Opened by artmatsak over 4 years ago
- 10 comments
Labels: more-info-needed
#33 - cudnn_home not valid during build
Issue -
State: closed - Opened by mfruhner over 4 years ago
- 20 comments
#32 - Remove onnx-trt patch (upstream is fixed), Add patch for cuda headers
Pull Request -
State: closed - Opened by deadeyegoodwin over 4 years ago
#31 - ONNX Runtime backend build support for windows
Pull Request -
State: closed - Opened by deadeyegoodwin over 4 years ago
#30 - Triton-OnnxRt- TRT performance i
Issue -
State: open - Opened by mayani-nv over 4 years ago
- 4 comments
#29 - Add option to build ONNX Runtime library as part of build
Pull Request -
State: closed - Opened by deadeyegoodwin over 4 years ago
#28 - ORT library build should happen in the onnxruntime_backend repo
Issue -
State: closed - Opened by deadeyegoodwin over 4 years ago
#27 - WIP: fix issue with custom op library unload
Pull Request -
State: closed - Opened by CoderHam over 4 years ago
#26 - failed to load onnx model with tensorrt optimization
Issue -
State: closed - Opened by zirui over 4 years ago
- 2 comments
#25 - Support for multiple streams in ORT
Issue -
State: closed - Opened by DavidLangworthy over 4 years ago
- 3 comments
#24 - Support configuration of thread counts in Triton (and other ORT config)
Issue -
State: closed - Opened by DavidLangworthy over 4 years ago
- 2 comments
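Issues #24, #81, and PR #82 cover exposing ORT threading and execution-mode settings through the model configuration. A hedged sketch of what such a fragment might look like is below; the parameter keys follow the backend's `parameters`-based convention and ORT's semantics (`execution_mode` 0 = sequential, 1 = parallel), but exact key names and supported values should be confirmed against the backend README:

```protobuf
# Assumed config.pbtxt fragment for ORT threading options;
# key names and values are illustrative, not confirmed here.
parameters { key: "intra_op_thread_count" value: { string_value: "4" } }
parameters { key: "inter_op_thread_count" value: { string_value: "1" } }
parameters { key: "execution_mode" value: { string_value: "0" } }
```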
#23 - Support for Global Thread Pool (Sharing thread pool across ORT session)
Issue -
State: closed - Opened by DavidLangworthy over 4 years ago
#22 - Don't always calculate all outputs.
Issue -
State: open - Opened by DavidLangworthy over 4 years ago
#21 - Add support for ragged batching (especially useful for BERT-type models).
Issue -
State: closed - Opened by DavidLangworthy over 4 years ago
- 6 comments
#20 - More options to control TensorRT execution provider.
Issue -
State: closed - Opened by DavidLangworthy over 4 years ago
- 2 comments
#19 - Use IOBinding to avoid unnecessary data copy and unnecessary CPU<->GPU transfers.
Issue -
State: closed - Opened by DavidLangworthy over 4 years ago
- 5 comments