Stars
A collection of modern C++ libraries, including coro_http, coro_rpc, compile-time reflection, struct_pack, struct_json, struct_xml, struct_pb, easylog, async_simple, etc.
FlexFlow Serve: Low-Latency, High-Performance LLM Serving
[ICLR 2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation
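A minimal sketch of the general idea behind LSH-sampled attention, not MagicPIG's actual implementation: keys are hashed with random hyperplanes, and attention is estimated over only the keys whose signature collides with the query's. All names and parameters here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def lsh_signatures(x, planes):
    # Sign of the projection onto each random hyperplane -> bit signature.
    return (x @ planes.T > 0).astype(np.int8)

def sampled_attention(q, K, V, planes):
    """Approximate attention over only the keys whose LSH signature
    matches the query's (a crude stand-in for sampling-based attention)."""
    q_sig = lsh_signatures(q[None, :], planes)[0]
    k_sig = lsh_signatures(K, planes)
    hits = np.where((k_sig == q_sig).all(axis=1))[0]
    if hits.size == 0:                      # no collisions: fall back to full attention
        hits = np.arange(K.shape[0])
    scores = K[hits] @ q / np.sqrt(q.shape[0])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V[hits]

d, n, bits = 64, 4096, 8
planes = rng.standard_normal((bits, d))
q = rng.standard_normal(d)
K = rng.standard_normal((n, d))
V = rng.standard_normal((n, d))
print(sampled_attention(q, K, V, planes).shape)  # (64,)
```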
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
A scalable and robust tree-based speculative decoding algorithm
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
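Both this entry and the previous one build on speculative decoding. Below is a toy sketch of the core draft-then-verify loop in its simple chain form (tree-based methods such as Sequoia and EAGLE verify a whole tree of candidate continuations instead); the "models" are random stand-ins and every name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
VOCAB = 16

def toy_dist(ctx, temperature):
    # Stand-in for a model's next-token distribution (hash of context -> probs).
    r = np.random.default_rng(abs(hash(tuple(ctx))) % (2**32))
    logits = r.standard_normal(VOCAB) / temperature
    e = np.exp(logits - logits.max())
    return e / e.sum()

def speculative_step(ctx, k=4):
    """One round of draft-then-verify speculative decoding (chain version)."""
    draft_tokens, draft_probs = [], []
    c = list(ctx)
    for _ in range(k):                      # cheap draft model proposes k tokens
        p = toy_dist(c, temperature=2.0)
        t = rng.choice(VOCAB, p=p)
        draft_tokens.append(t); draft_probs.append(p)
        c.append(t)
    accepted = []
    c = list(ctx)
    for t, p_d in zip(draft_tokens, draft_probs):
        p_t = toy_dist(c, temperature=1.0)  # expensive target model verifies
        if rng.random() < min(1.0, p_t[t] / p_d[t]):
            accepted.append(t); c.append(t)
        else:                               # reject: resample from the residual
            resid = np.maximum(p_t - p_d, 0)
            resid = resid / resid.sum() if resid.sum() > 0 else p_t
            accepted.append(rng.choice(VOCAB, p=resid))
            break
    return accepted

print(speculative_step([0, 1, 2]))
```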
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Efficient and easy multi-instance LLM serving
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
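As a rough illustration of score-based KV cache management (not InfiniGen's actual algorithm, which speculatively prefetches the KV entries the next layer will attend to from CPU memory), the sketch below evicts the entries with the least accumulated attention mass:

```python
import numpy as np

def prune_kv(K, V, attn_history, budget):
    """Keep only the `budget` KV entries with the largest accumulated
    attention mass -- a generic score-based eviction policy."""
    keep = np.argsort(attn_history)[-budget:]
    keep.sort()                          # preserve positional order
    return K[keep], V[keep], keep

rng = np.random.default_rng(2)
n, d = 1024, 64
K, V = rng.standard_normal((n, d)), rng.standard_normal((n, d))
attn_history = rng.random(n)             # e.g. attention weights summed over steps
K2, V2, kept = prune_kv(K, V, attn_history, budget=256)
print(K2.shape, kept[:5])
```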
A high-throughput and memory-efficient inference and serving engine for LLMs
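Much of vLLM's memory efficiency comes from PagedAttention, which stores the KV cache in fixed-size blocks addressed through per-sequence block tables, much like virtual-memory paging. A toy allocator sketching that idea (not vLLM's actual code):

```python
class PagedKVCache:
    """Toy paged KV-cache allocator: each sequence maps logical token
    positions to fixed-size physical blocks through a block table, so
    memory is allocated on demand and reclaimed blocks are reused."""

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}   # seq_id -> list of physical block ids
        self.seq_lens = {}       # seq_id -> number of tokens stored

    def append_token(self, seq_id):
        """Reserve a KV slot for one new token; allocate a block if needed."""
        n = self.seq_lens.get(seq_id, 0)
        if n % self.block_size == 0:            # current block full (or none yet)
            if not self.free_blocks:
                raise MemoryError("KV cache exhausted; preempt a sequence")
            self.block_tables.setdefault(seq_id, []).append(self.free_blocks.pop())
        self.seq_lens[seq_id] = n + 1
        block = self.block_tables[seq_id][n // self.block_size]
        return block, n % self.block_size       # physical (block, offset)

    def free_sequence(self, seq_id):
        """Return all of a finished sequence's blocks to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)

cache = PagedKVCache(num_blocks=8, block_size=16)
for _ in range(20):
    slot = cache.append_token("req-0")
print(slot)                  # (physical_block, offset) for the 20th token
cache.free_sequence("req-0")
```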
Cost-efficient and pluggable Infrastructure components for GenAI inference
Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
Supercharge Your LLM with the Fastest KV Cache Layer
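The core pattern behind a KV cache layer is reusing previously computed KV tensors across requests that share a token prefix. A toy prefix-keyed store illustrating that pattern (hypothetical API, not LMCache's; the real system adds chunking, eviction, and cross-node sharing):

```python
import hashlib

class PrefixKVStore:
    """Toy prefix-keyed KV store: computed KV tensors are cached under a
    hash of the token prefix so later requests sharing that prefix can
    skip recomputation."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(token_ids):
        return hashlib.sha256(str(token_ids).encode("utf-8")).hexdigest()

    def lookup(self, token_ids):
        """Return cached KV for the longest cached prefix of token_ids."""
        for end in range(len(token_ids), 0, -1):
            kv = self._store.get(self._key(token_ids[:end]))
            if kv is not None:
                return end, kv          # tokens covered, cached KV payload
        return 0, None

    def insert(self, token_ids, kv):
        self._store[self._key(token_ids)] = kv

store = PrefixKVStore()
store.insert([1, 2, 3], kv="kv-for-[1,2,3]")      # payload is a placeholder
print(store.lookup([1, 2, 3, 4]))                  # (3, 'kv-for-[1,2,3]')
```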
A generative world for general-purpose robotics & embodied AI learning.
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
🔥 SpatialVLA: a spatial-enhanced vision-language-action model trained on 1.1 million real robot episodes. Accepted at RSS 2025.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
This repository collects papers on VLLM applications; new papers will be added irregularly.
This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥
Official Algorithm Implementation of ICML'23 Paper "VIMA: General Robot Manipulation with Multimodal Prompts"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The official repo of Qwen-VL (通义千问-VL), the chat & pretrained large vision-language model proposed by Alibaba Cloud.
OpenVLA: An open-source vision-language-action model for robotic manipulation (forked from TRI-ML/prismatic-vlms).
Collection of AWESOME vision-language models for vision tasks
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
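The paper's cache policy can be paraphrased in a few lines: always retain a handful of initial "attention sink" tokens plus a sliding window of the most recent tokens. A sketch with illustrative parameter values:

```python
def streaming_keep_indices(seq_len, n_sink=4, window=1020):
    """Indices of KV entries retained under an attention-sink policy:
    always keep the first `n_sink` tokens plus the most recent `window`
    tokens (a paraphrase of StreamingLLM's cache policy; the parameter
    values here are illustrative)."""
    if seq_len <= n_sink + window:
        return list(range(seq_len))
    sinks = list(range(n_sink))
    recent = list(range(seq_len - window, seq_len))
    return sinks + recent

print(len(streaming_keep_indices(10_000)))   # 1024 entries kept
print(streaming_keep_indices(10_000)[:6])    # [0, 1, 2, 3, 8980, 8981]
```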
[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
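A rough sketch of the chunk-sparse access pattern in ShadowKV-style systems, hedged as a simplification (the paper itself additionally keeps low-rank pre-RoPE keys on GPU): score each chunk of offloaded keys by a "landmark" and fetch only the top-scoring chunks for attention.

```python
import numpy as np

def select_chunks(q, K_cpu, chunk=32, top_k=8):
    """Generic chunk-sparse KV selection: score each chunk of offloaded
    keys by its mean ("landmark") key and gather only the top-k chunks.
    A sketch of the high-level pattern, not the paper's exact method."""
    n, d = K_cpu.shape
    n_chunks = n // chunk
    landmarks = K_cpu[: n_chunks * chunk].reshape(n_chunks, chunk, d).mean(axis=1)
    scores = landmarks @ q                      # one score per chunk
    best = np.argsort(scores)[-top_k:]
    idx = np.concatenate([np.arange(c * chunk, (c + 1) * chunk) for c in best])
    return np.sort(idx)                         # token indices to fetch to GPU

rng = np.random.default_rng(3)
q, K_cpu = rng.standard_normal(64), rng.standard_normal((4096, 64))
print(select_chunks(q, K_cpu).shape)            # (256,) = 8 chunks x 32 tokens
```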
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
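One common family of techniques in this literature is KV cache quantization; the sketch below shows generic per-row symmetric int8 quantization, not any specific paper's method (others prune, merge, or low-rank-project the cache).

```python
import numpy as np

def quantize_kv(x):
    """Per-row symmetric int8 quantization of a KV tensor: each row is
    scaled so its max magnitude maps to 127, cutting memory to a quarter
    of float32."""
    scale = np.abs(x).max(axis=-1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)   # avoid divide-by-zero rows
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_kv(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(4)
K = rng.standard_normal((1024, 64)).astype(np.float32)
q8, s = quantize_kv(K)
err = np.abs(dequantize_kv(q8, s) - K).max()
print(q8.nbytes / K.nbytes, f"max abs error {err:.4f}")   # 0.25x memory
```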