- Peking University
- http://qipengwang.github.io/
Stars
[SIGMOD 2025] PQCache: Product Quantization-based KVCache for Long Context LLM Inference
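This project aims to share the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications in practice)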
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
Zero Bubble Pipeline Parallelism
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
cluster data collected from production clusters in Alibaba for cluster management research
Ring attention implementation with flash attention
Large Language Model (LLM) Systems Paper List
Official implementation of MASS: Multi-Agent Simulation Scaling for Portfolio Construction
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Curated list of project-based tutorials
Interactive roadmaps, guides and other educational content to help developers grow in their careers.
#1 PDF Application on GitHub that lets you edit PDFs on any device anywhere
Quantization of Convolutional Neural networks.
[ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
A high-throughput and memory-efficient inference and serving engine for LLMs
🌟 Wiki of OI / ICPC for everyone. (An online strategy guide for a certain massive game, featuring cool arithmetic magic)
ShortcutsBench: A Large-Scale Real-World Benchmark for API-Based Agents