Stars
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
[COLM'25] CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure
Make every token count — an experimental LLM inference layer that optimizes cost through caching, adaptive routing, and ML-assisted decision-making.
Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system, such as the Linux kernel, CPU, disks, Intel PT, and GPUs. Dynolog also …
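Enterprise-grade AI assistant rule system, optimized and extended from agent-rules and built for Chinese developers, with one-click installation and configuration for mainstream AI tools such as Augment, Cursor, Claude Code, and Trae AI
AppPlatform is a cutting-edge large-model application engineering project that aims to simplify and optimize the development of large-model training and inference applications through integrated declarative programming and low-code configuration tools. It gives software engineers and product managers a powerful, extensible environment that supports end-to-end AI application development, from concept to deployment.
An educational agent platform built on Spring AI, LangGraph4j workflows, a RAG knowledge base, Redis high-concurrency optimization, a Dubbo microservice architecture (7 independent services) or a monolithic architecture, and the Higress cloud-native gateway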
A full-system, cycle-level simulator based on gem5 that provides complete support for all three CXL sub-protocols and all three types of CXL devices.
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Community maintained hardware plugin for vLLM on Ascend
FlagGems is an operator library for large language models implemented in the Triton Language.
LLMs interview notes and answers: this repository mainly records interview questions and reference answers for large language model (LLM) algorithm engineers
Use the TPC-DS benchmark to test Spark SQL performance
The official GitHub page for the survey paper "A Survey of Large Language Models".
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)
An industrial deep learning framework for high-dimension sparse data