Stars
Fully autonomous, self-evolving research from idea to paper: chat an idea, get a paper. 🦞
Samples for CUDA developers that demonstrate features in the CUDA Toolkit.
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
📚LeetCUDA: Modern CUDA Learning Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
An NVIDIA-curated collection of educational resources on general-purpose GPU programming.
GPU programming related news and material links
AIInfra (AI infrastructure) covers the full AI system stack, from low-level hardware such as chips up through the software layers that support training and inference of large AI models.
Summary of some awesome work for optimizing LLM inference
Artifact for "Marconi: Prefix Caching for the Era of Hybrid LLMs" [MLSys '25 Outstanding Paper Award, Honorable Mention]
Shares AI Infra knowledge and coding exercises: introductions to the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, LLM fundamentals 🧠, AI hardware and software 🔧, and more.
A Throughput-Optimized Pipeline Parallel Inference System for Large Language Models
The official GitHub page for the survey paper "A Survey of Large Language Models".
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. ACM Computing Surveys, 2026.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
InstAttention: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference
Persist and reuse KV Cache to speedup your LLM.
Source code for the paper "KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing".
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
The simplest, fastest repository for training/finetuning medium-sized GPTs.
A minimal PyTorch re-implementation of OpenAI GPT (Generative Pretrained Transformer) training.
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
Supercharge Your LLM with the Fastest KV Cache Layer
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
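Several entries above (Marconi, LMCache, the KV Cache Compression paper list) revolve around reusing KV cache across requests that share a prompt prefix. As a rough illustration of that idea only, here is a toy prefix KV cache in plain Python; `PrefixKVCache`, `prefill`, and `_compute_kv` are hypothetical names for this sketch, not any of these projects' real APIs.

```python
# Toy illustration of prefix KV-cache reuse: sequences that share a prompt
# prefix (e.g. the same system prompt) reuse that prefix's cached KV entries
# instead of recomputing them. Purely a sketch; real systems store KV in
# paged/block GPU memory, not Python dicts.

class PrefixKVCache:
    """Caches per-token 'KV' entries keyed by the token prefix that produced them."""

    def __init__(self):
        self._store = {}  # tuple of prefix tokens -> list of per-token KV entries
        self.hits = 0     # tokens served from cache
        self.misses = 0   # tokens that had to be recomputed

    def _compute_kv(self, token):
        # Stand-in for the real attention K/V projection of one token.
        return ("kv", token)

    def prefill(self, tokens):
        """Return KV entries for `tokens`, reusing the longest cached prefix."""
        tokens = tuple(tokens)
        # Find the longest previously seen prefix of this sequence.
        best = 0
        for n in range(len(tokens), 0, -1):
            if tokens[:n] in self._store:
                best = n
                break
        kv = list(self._store.get(tokens[:best], []))
        self.hits += best
        # Compute KV only for the uncached suffix, caching each new prefix.
        for i in range(best, len(tokens)):
            kv.append(self._compute_kv(tokens[i]))
            self._store[tokens[: i + 1]] = list(kv)
            self.misses += 1
        return kv


cache = PrefixKVCache()
cache.prefill(["sys", "You", "are", "helpful"])             # cold start: 4 misses
kv = cache.prefill(["sys", "You", "are", "helpful", "Hi"])  # reuses the 4-token prefix
print(cache.hits, cache.misses, len(kv))  # -> 4 5 5
```

Storing a copied list per prefix is O(n²) memory and is only for clarity here; production serving systems keep one block-structured KV pool and share blocks between sequences.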