- Shanghai, China
Starred repositories
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
sparkrun - launch, manage, and stop LLM inference workloads on NVIDIA DGX Spark systems
Vim-fork focused on extensibility and usability
Build a ChatGPT-like LLM from scratch in PyTorch, explained step by step.
Learn LLM internals step by step - from tokenization to attention to inference optimization.
An agent harness that compiles a model into one provably-correct, self-retargeting CUDA megakernel and self-tunes it past cuBLAS at batch-1 LLM decode.
AI 圆桌 - Multi-AI Roundtable Chrome Extension
SGLang NVFP4 (fp4_e2m1) KV cache for Blackwell SM120 (RTX PRO 6000): FlashInfer FA2 kernel patches + native FP4 pool + hybrid-SWA wiring + per-layer global-scale auto-calibration. 1.778x KV capacit…
Modern RL Post-training Infrastructure: Optimized for NVIDIA/AMD GPUs with a focus on vLLM and DeepSpeed integration, CUDA/ROCm/Triton kernels, and transparent hardware-aware scaling.
A fast, scalable, multi-language and extensible build system
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Random program generation based on semantic reification (PLDI'26)
Sub2API is an open-source relay platform that unifies Claude, OpenAI, Gemini, and Antigravity subscriptions into a single endpoint. It supports account sharing and cost-sharing, with seamless nativ…
Kimi Code CLI — The Starting Point for Next-Gen Agents
A collection of DESIGN.md files derived from analysis of popular brand design systems. Drop one into your project and let coding agents generate a matching UI.
The official Lark/Feishu CLI tool, maintained by the larksuite team — built for humans and AI Agents. Covers core business domains including Messenger, Docs, Base, Sheets, Calendar, Mail, Tasks, Me…
Conveniently export torch.compile compiled products into self-contained Python files
Interactive World Model papers organized by core research challenges.
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838