Inference engines
FlexGen: Running large language models on a single GPU for throughput-oriented scenarios.
CTranslate2: Fast inference engine for Transformer models.
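A minimal sketch of CTranslate2's Python generation API, assuming this entry refers to OpenNMT's CTranslate2; the model choice and output directory are illustrative, and the checkpoint must first be converted with the `ct2-transformers-converter` tool.

```python
# Convert a Hugging Face checkpoint first (illustrative model choice):
#   ct2-transformers-converter --model facebook/opt-125m --output_dir opt-125m-ct2
import ctranslate2
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
generator = ctranslate2.Generator("opt-125m-ct2", device="cpu")

# CTranslate2 operates on token strings, not raw text, so tokenize externally.
prompt_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello, my name is"))
results = generator.generate_batch([prompt_tokens], max_length=32)
print(tokenizer.decode(results[0].sequences_ids[0]))
```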
Hugging Face Optimum: 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools.
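As a sketch of what those optimization tools look like in practice, assuming this refers to the `optimum` package: the snippet below exports a Transformers model to ONNX through Optimum's ONNX Runtime integration. The model name is an illustrative choice.

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative model

# export=True converts the PyTorch checkpoint to ONNX on the fly,
# so inference runs through ONNX Runtime instead of PyTorch.
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("Exporting to ONNX was painless."))
```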
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
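A minimal sketch of that Python API, based on TensorRT-LLM's high-level LLM interface; the model name is illustrative and an NVIDIA GPU is assumed.

```python
from tensorrt_llm import LLM, SamplingParams

# The LLM class builds/loads a TensorRT engine for the model under the hood.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")  # illustrative model
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

for output in llm.generate(["Hello, my name is"], params):
    print(output.outputs[0].text)
```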
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs.
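A minimal offline-inference sketch with vLLM's LLM class; the model name is an illustrative small checkpoint.

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # illustrative small model
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# generate() batches the prompts and serves them from vLLM's paged KV cache,
# which is where the throughput and memory efficiency come from.
for output in llm.generate(["Hello, my name is"], params):
    print(output.outputs[0].text)
```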
SGLang is a fast serving framework for large language models and vision language models.
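A sketch of serving a model with SGLang and querying its OpenAI-compatible endpoint; the model name and port are illustrative.

```python
# Start the server first (per SGLang's docs):
#   python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B-Instruct --port 30000
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[{"role": "user", "content": "Summarize what a serving framework does."}],
)
print(resp.choices[0].message.content)
```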
Ollama: Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.
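Getting up and running typically means pulling a model with the Ollama CLI and then calling the local server; a sketch using the official `ollama` Python client, with an illustrative model tag.

```python
# Prerequisite (shell): ollama pull gemma3
import ollama

resp = ollama.chat(
    model="gemma3",  # illustrative tag; any pulled model works
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(resp["message"]["content"])
```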
Checkpoint-engine is simple middleware to update model weights in LLM inference engines.