Build your way out. 🙌🏻
Starred repositories: 11 stars, filtered to those written in C++
The new Windows Terminal and the original Windows console host, all in the same place!
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor… (a minimal usage sketch follows this list)
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it.
The Triton backend for the ONNX Runtime.
The Triton backend for TensorRT.
The Triton backend for TensorFlow.
OpenVINO backend for Triton.
VisionaryArchitects / llama.cpp
Forked from ggml-org/llama.cpp. LLM inference in C/C++.
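
The TensorRT-LLM entry above mentions a Python API for defining LLMs and running optimized inference on NVIDIA GPUs. As a minimal, non-authoritative sketch of what that high-level API can look like (assuming the tensorrt_llm.LLM interface; the model name, prompts, and sampling values are illustrative placeholders, not taken from the repository):

    # Sketch of the TensorRT-LLM high-level Python API (assumed usage);
    # the model name, prompts, and sampling values are placeholders.
    from tensorrt_llm import LLM, SamplingParams

    def main():
        # Build or load a TensorRT engine for a Hugging Face model.
        llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

        prompts = ["Hello, my name is", "The capital of France is"]
        sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=32)

        # Batched generation on the GPU; each result carries the prompt
        # and its generated continuation.
        for output in llm.generate(prompts, sampling):
            print(output.prompt, "->", output.outputs[0].text)

    if __name__ == "__main__":
        main()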