AnnaYue

5 followers · 24 following

Ant Group
Shanghai

Achievements

Stars

15 stars written in Python

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 62,215 11,055 Updated Nov 6, 2025

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python 19,781 3,275 Updated Nov 6, 2025

eosphoros-ai / DB-GPT

AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents

Python 17,578 2,455 Updated Nov 6, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,150 2,430 Updated Nov 6, 2025

Wan-Video / Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,629 2,110 Updated Jul 17, 2025

RUCAIBox / LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Python 11,943 932 Updated Mar 11, 2025

bentoml / BentoML

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 8,179 885 Updated Nov 4, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 4,073 248 Updated Oct 6, 2025

inclusionAI / AReaL

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 2,955 222 Updated Nov 6, 2025

SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

Python 1,973 220 Updated Nov 5, 2025

vllm-project / production-stack

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 1,911 313 Updated Nov 6, 2025

AlibabaPAI / llumnix

Efficient and easy multi-instance LLM serving

Python 505 41 Updated Sep 3, 2025

TuGraph-family / chat2graph

Chat2Graph: Graph Native Agentic System.

Python 363 43 Updated Oct 30, 2025

codefuse-ai / D2LLM

Python 34 2 Updated Jul 23, 2024

nickaggarwal / nvidia-triton-llm-streaming

Integrating SSE with NVIDIA Triton Inference Server using a Python backend and Zephyr model. There is very little documentation on how to use NVIDIA Triton in streaming use cases (hard to find in their…

Python 10 Updated May 29, 2024