View james0zan's full-sized avatar

Organizations

@kvcache-ai

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 27,909 5,952 Updated May 17, 2026

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python 17,170 1,292 Updated May 14, 2026

A PyTorch native library for training speculative decoding models

Python 111 22 Updated May 17, 2026

Engine-agnostic LLM gateway in Rust. Full OpenAI & Anthropic API compatibility across SGLang, vLLM, TRT-LLM, OpenAI, Gemini & more. Industry-first gRPC pipeline, KV cache-aware routing, chat histor…

Rust 262 76 Updated May 17, 2026

Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.

Python 1,340 210 Updated May 17, 2026

A visualized theorem prover based on Lean 4

TypeScript 8 Updated Nov 12, 2025

Prove the completeness of Russell's axiomatic system in Lean 4, and use C++ to automatically convert Lean 4 files to Markdown files

Lean 4 Updated Jan 6, 2026

Go 81 6 Updated Sep 15, 2025

Checkpoint-engine is a simple middleware to update model weights in LLM inference engines

Python 953 83 Updated Feb 28, 2026

A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training

Python 810 52 Updated May 17, 2026

Supercharge Your LLM with the Fastest KV Cache Layer

Python 8,282 1,178 Updated May 17, 2026

I created a Claude Code deep researcher that seems to work better than the current deep research models

132 23 Updated Apr 21, 2026

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 2,117 147 Updated Apr 3, 2025

High-speed Large Language Model Serving for Local Deployment

C++ 9,461 576 Updated May 11, 2026

Merico Build is a web app empowering open source developers, maintainers, and communities with metrics from Git, GitHub, and more.

494 25 Updated Jun 29, 2021

CSI driver to bring SPDK to Kubernetes storage through NVMe-oF or iSCSI. Supports dynamic volume provisioning and enables Pods to use SPDK storage transparently.

Go 87 45 Updated Jan 27, 2026

A RocksDB compatible KV storage engine with better performance

C++ 2,149 211 Updated Aug 25, 2025

Concurrent data structures in C++

C++ 1,450 158 Updated May 16, 2026

Bot Framework provides the most comprehensive experience for building conversation applications.

JavaScript 7,803 2,434 Updated Dec 29, 2025

Multithreaded HTTP Download Accelerator

C 23 12 Updated Jul 27, 2014

A Python library for using the duoshuo API

Python 3 Updated Jul 22, 2012

A Python library for using the duoshuo API

Python 88 31 Updated Nov 23, 2021

Source repo for the Chinese translation of PyCoder's Weekly

HTML 392 92 Updated Dec 3, 2017