Accelerating LLM inference frameworks: making LLMs fly
Updated May 10, 2024 - Python
Bench360 is a modular benchmarking suite for local LLM deployments. It offers a full-stack, extensible pipeline to evaluate the latency, throughput, quality, and cost of LLM inference on consumer and enterprise GPUs. Bench360 supports flexible backends, tasks, and scenarios, enabling fair and reproducible comparisons for researchers and practitioners.
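Latency and throughput measurements like those Bench360 reports can be approximated with a small timing loop. The sketch below is illustrative only: the `generate` callable is a hypothetical stand-in for a real inference client, and the metric names are our own, not Bench360's API.

```python
import time
import statistics

def benchmark(generate, prompts):
    """Measure per-request latency and aggregate throughput.

    `generate` is a hypothetical placeholder for any LLM inference
    call; a real suite would wrap an actual backend client here.
    """
    latencies = []
    start = time.perf_counter()
    for prompt in prompts:
        t0 = time.perf_counter()
        generate(prompt)  # the call being measured
        latencies.append(time.perf_counter() - t0)
    total = time.perf_counter() - start
    return {
        "mean_latency_s": statistics.mean(latencies),
        "p95_latency_s": sorted(latencies)[int(0.95 * (len(latencies) - 1))],
        "throughput_rps": len(prompts) / total,
    }

# Usage with a stub "model" standing in for a real inference backend:
stats = benchmark(lambda p: p.upper(), ["hello"] * 20)
```

Wall-clock timing of the full request loop captures queueing and batching effects that per-token timers miss, which is why throughput is derived from total elapsed time rather than summed latencies.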
An LLM inference performance harness.
A lightweight HTML form with a Python Flask app and accompanying scripts for quickly testing interactions with the SEA-LION family of LLMs.