Stars
[NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems
My learning notes for ML systems.
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Artifact for "Marconi: Prefix Caching for the Era of Hybrid LLMs" [MLSys '25 Outstanding Paper Award, Honorable Mention]
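Prefix caching of the kind Marconi targets reuses KV-cache entries across requests that share a token prefix (e.g. a common system prompt). A minimal standalone sketch of the idea, with all names hypothetical and not Marconi's actual implementation:

```python
# Minimal prefix cache: map cached token prefixes to (mock) KV-cache handles
# and reuse the longest cached prefix of a new request.
# All names here are hypothetical illustrations, not Marconi's API.

class PrefixCache:
    def __init__(self):
        self._entries = {}  # tuple of token ids -> KV-cache handle

    def insert(self, tokens, kv_handle):
        self._entries[tuple(tokens)] = kv_handle

    def longest_prefix(self, tokens):
        """Return (matched_len, kv_handle) for the longest cached prefix."""
        best_len, best_handle = 0, None
        for prefix, handle in self._entries.items():
            n = len(prefix)
            if n > best_len and tuple(tokens[:n]) == prefix:
                best_len, best_handle = n, handle
        return best_len, best_handle

cache = PrefixCache()
cache.insert([1, 2, 3], "kv-A")        # e.g. a shared system prompt
cache.insert([1, 2, 3, 4, 5], "kv-B")  # system prompt + few-shot examples
hit_len, handle = cache.longest_prefix([1, 2, 3, 4, 9])
print(hit_len, handle)  # → 3 kv-A
```

Only the first three tokens match a cached entry, so prefill can skip recomputing the KV entries for those tokens and start from token 4.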
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Many container images (e.g. on gcr.io) are hosted abroad and download slowly from within China, so acceleration is needed. Aims to provide a stable, reliable, and secure container-image service connecting the whole world.
Updated December 2025: a roundup of Docker registry mirrors currently usable in China, a DockerHub mirror-accelerator list. 🚀 DockerHub image accelerator
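Pointing Docker at a registry mirror is a `daemon.json` change; `registry-mirrors` is the standard dockerd key for this. A generic sketch, where the mirror URL is a placeholder rather than a recommendation of any specific accelerator:

```shell
# Configure dockerd to try a registry mirror before Docker Hub.
# The mirror URL below is a placeholder.
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "registry-mirrors": ["https://mirror.example.com"]
}
EOF
sudo systemctl restart docker          # reload the daemon config
docker info | grep -A1 "Registry Mirrors"   # verify the mirror is active
```

If the mirror is unreachable or lacks an image, dockerd falls back to the upstream registry.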
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
Supercharge Your LLM with the Fastest KV Cache Layer
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
A high-throughput and memory-efficient inference and serving engine for LLMs
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
NVIDIA Linux open GPU kernel module source
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
High performance self-hosted photo and video management solution.
LiteIO is a cloud-native block-device service that uses multiple storage engines, including SPDK and LVM, to achieve high performance. It is designed specifically for Kubernetes in hyper-converged environments.
dhschall / gem5-fdp
Forked from gem5/gem5. Development repository for Fetch Directed Instruction Prefetching (FDP) in gem5
Ocolos is the first online code layout optimization system for unmodified applications written in unmanaged languages.
An artifact for Berti: an Accurate and Timely Local-Delta Data Prefetcher
This repository is meant to be a guide for building your own prefetcher for CPU caches and evaluating it, using ChampSim simulator
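Simulators like ChampSim drive a prefetcher through a per-access hook (address, PC, hit/miss) and let it return candidate prefetch addresses. A minimal standalone sketch of that interface and a next-line prefetcher; the names are hypothetical, and ChampSim's real hooks are C++ methods, not this API:

```python
# Standalone sketch of a cache-prefetcher interface in the spirit of
# ChampSim's per-access operate hook. Names are hypothetical.

BLOCK = 64  # cache block size in bytes

class NextLinePrefetcher:
    """On every demand access, prefetch the next cache block."""
    def operate(self, addr, ip, cache_hit):
        block = addr // BLOCK
        return [(block + 1) * BLOCK]  # addresses to prefetch

def evaluate(prefetcher, trace):
    """Count demand accesses whose block was prefetched earlier (coverage)."""
    prefetched, covered = set(), 0
    for addr, ip, hit in trace:
        if addr // BLOCK in prefetched:
            covered += 1
        for pf in prefetcher.operate(addr, ip, hit):
            prefetched.add(pf // BLOCK)
    return covered

# A purely sequential (streaming) access pattern: every access after the
# first one hits a block the next-line prefetcher already requested.
trace = [(i * BLOCK, 0x400000, False) for i in range(8)]
print(evaluate(NextLinePrefetcher(), trace))  # → 7
```

Swapping in a smarter policy (stride, delta, Berti-style local deltas) only means replacing `operate`; the evaluation loop stays the same, which mirrors how ChampSim lets you compare prefetchers under one simulator.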
JHipster is a development platform to quickly generate, develop, & deploy modern web applications & microservice architectures.