triton-inference-server

Here are 73 public repositories matching this topic...

torchpipe / torchpipe

An Alternative for Triton Inference Server. Boosting DL Service Throughput 1.5-4x by Ensemble Pipeline Serving with Concurrent CUDA Streams for PyTorch/LibTorch Frontend and TensorRT/CVCUDA, etc., Backends

deployment inference pytorch ray serve tensorrt serving pipeline-parallelism torch2trt triton-inference-server ray-serve cvcuda

Updated Jun 5, 2024
C++

olibartfast / computer-vision-triton-cpp-client

C++ application to perform computer vision tasks using NVIDIA Triton Server for model inference

computer-vision object-detection triton-inference-server

Updated Jun 5, 2024
C++

NVIDIA / GenerativeAIExamples

Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.

microservice gpu-acceleration nemo tensorrt rag triton-inference-server large-language-models llm llm-inference retrieval-augmented-generation

Updated Jun 4, 2024
Python

fversaci / cassandra-dali-plugin

Cassandra plugin for NVIDIA DALI

cassandra deep-learning tensorflow torch data-loading nvidia-dali triton-inference-server

Updated May 31, 2024
C++

NVIDIA-ISAAC-ROS / isaac_ros_dnn_inference

NVIDIA-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU

ai deep-learning gpu dnn ros nvidia triton deeplearning tao jetson ros2 tensorrt triton-inference-server tensorrt-inference ros2-humble

Updated May 31, 2024
C++

triton-inference-server / onnxruntime_backend

The Triton backend for the ONNX Runtime.

backend inference triton-inference-server onnx-runtime

Updated Jun 6, 2024
C++

SteliosGian / triton-server-transformers

Triton Inference Server with Python backend and transformers

python transformers inference model-serving onnxruntime triton-inference-server nvidia-triton

Updated May 25, 2024
Python

rtzr / tritony

Tiny configuration for Triton Inference Server

inference mlops triton-inference-server tritonclient

Updated May 24, 2024
Python

hoang-quoc-trung / sumen-triton

The Sumen model integrates with Triton Inference Server

triton-inference-server

Updated May 22, 2024
Python

YeonwooSung / MLOps

Miscellaneous code and writings for MLOps

Updated May 18, 2024
Jupyter Notebook

AntonioConsiglio / triton_server

Streamlit Dockerized Computer Vision App with Triton Inference Server and PostgreSQL database

docker-compose postgresql streamlit triton-inference-server

Updated May 16, 2024
Python

allegroai / clearml-serving

ClearML - Model-Serving Orchestration and Repository Solution

kubernetes devops machine-learning ai deep-learning triton tensorflow-serving model-serving serving mlops serving-pytorch-models triton-inference-server clearml serving-ml

Updated May 29, 2024
Python

ConnorSouthEngineering / MVision

This repository contains the content for a proof of concept implementation of computer vision systems in industry. The project explores scalability and performance using the NVIDIA ecosystem, aiming to create an example scaffold for implementing a system accessible to non-technical users.

nodejs docker angular gstreamer docker-compose tensorflow postgresql python3 nvidia cv2 jetson-xavier jetson-nano triton-inference-server

Updated May 2, 2024
TypeScript

npuichigo / openai_trtllm

OpenAI-compatible API for the TensorRT-LLM Triton backend

triton-inference-server openai-api llm langchain tensorrt-llm

Updated Apr 26, 2024
Rust

notAI-tech / fastDeploy

Deploy DL/ML inference pipelines with minimal extra code.

Updated Apr 23, 2024
Python

rungrodkspeed / resnet50_optimization

python go pytorch convolutional-neural-networks resnet-50 tensorrt concurrent-futures onnx triton-inference-server

Updated Apr 21, 2024
Python

levipereira / deepstream-yolo-triton-server-rtsp-out

The purpose of this repository is to create a DeepStream/Triton-Server sample application that uses the yolov7, yolov7-qat, and yolov9 models to perform inference on video files or RTSP streams.

deepstream triton-inference-server deepstreamsdk triton-server yolov7 deepstream-python deepstream-python-apps yolov9