Build your way out. 🙌🏻
Starred repositories: 11 stars, filtered to those written in C++
The new Windows Terminal and the original Windows console host, all in the same place!
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor… (a minimal usage sketch follows this list)
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it.
The Triton backend for the ONNX Runtime.
The Triton backend for TensorRT.
The Triton backend for TensorFlow.
OpenVINO backend for Triton.
VisionaryArchitects / llama.cpp
Forked from ggml-org/llama.cpp. LLM inference in C/C++.
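
The TensorRT-LLM entry above mentions a Python API for defining LLMs and running optimized inference on NVIDIA GPUs. As a minimal, non-authoritative sketch of what that high-level API can look like (assuming the tensorrt_llm.LLM interface; the model name, prompts, and sampling values are illustrative placeholders, not taken from the repository):

    # Sketch of the TensorRT-LLM high-level Python API (assumed usage);
    # the model name, prompts, and sampling values are placeholders.
    from tensorrt_llm import LLM, SamplingParams

    def main():
        # Build or load a TensorRT engine for a Hugging Face model.
        llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

        prompts = ["Hello, my name is", "The capital of France is"]
        sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=32)

        # Batched generation on the GPU; each result carries the prompt
        # and its generated continuation.
        for output in llm.generate(prompts, sampling):
            print(output.prompt, "->", output.outputs[0].text)

    if __name__ == "__main__":
        main()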