
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,964 620 Updated Mar 24, 2026

[COLM'25] CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing

Python 19 3 Updated Jun 25, 2025

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 332,446 64,769 Updated Mar 24, 2026

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,675 362 Updated Mar 23, 2026

LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure

Python 218 46 Updated Mar 13, 2026

Make every token count — an experimental LLM inference layer that optimizes cost through caching, adaptive routing, and ML-assisted decision-making.

Python 1 Updated Jan 1, 2026
Python 19 4 Updated Mar 22, 2026

Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the linux kernel, CPU, disks, Intel PT, GPUs etc. Dynolog also …

C++ 367 81 Updated Mar 19, 2026

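Enterprise-grade AI assistant rule system, optimized and extended from agent-rules and built for Chinese developers, with one-click installation and configuration for mainstream AI tools such as Augment, Cursor, Claude Code, and Trae AI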

Batchfile 158 29 Updated Nov 7, 2025

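AppPlatform is a cutting-edge large-model application engineering project that aims to simplify and optimize the development of large-model training and inference applications through integrated declarative programming and low-code configuration tools. It gives software engineers and product managers a powerful, extensible environment that supports end-to-end AI application development, from concept to deployment.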

Java 1,424 229 Updated Mar 13, 2026

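An educational agent platform built on Spring AI + LangGraph4j workflows + RAG knowledge base + Redis high-concurrency optimization + Dubbo microservice architecture (7 independent services) / monolithic architecture + Higress cloud-native gateway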

Java 12 4 Updated Nov 16, 2025

A full-system, cycle-level simulator based on gem5 that provides complete support for all three CXL sub-protocols and all three types of CXL devices.

C++ 136 39 Updated Mar 4, 2026

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 56,976 4,715 Updated Mar 23, 2026

Community maintained hardware plugin for vLLM on Ascend

Python 1,816 969 Updated Mar 24, 2026

Personal homepage of the Advanced Compiler Laboratory

C++ 213 23 Updated Oct 15, 2025

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 926 293 Updated Mar 23, 2026

LLMs interview notes and answers: this repository mainly records interview questions and reference answers for large language model (LLM) algorithm engineers

631 137 Updated Oct 16, 2023

Use the TPC-DS benchmark to test Spark SQL performance

TSQL 184 98 Updated Apr 27, 2020

Source code for the virtualization book

C 97 23 Updated Jan 13, 2026

The official GitHub page for the survey paper "A Survey of Large Language Models".

Python 12,135 939 Updated Mar 11, 2025

Case studies for the Geek Time (极客时间) course "Linux Performance Optimization in Practice"

C 1,119 546 Updated Jan 6, 2023

Stack trace visualizer

Perl 19,376 2,091 Updated Oct 20, 2024

📖 [Translation] Learn C the Hard Way

CSS 846 184 Updated Oct 31, 2023

MSR Cloud Tools

Shell 185 34 Updated Jul 29, 2020

Apache ORC - the smallest, fastest columnar storage for Hadoop workloads

Java 766 507 Updated Mar 23, 2026
C 1 Updated Apr 15, 2019

Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)

Go 519 234 Updated Mar 23, 2026
Python 5 Updated Mar 3, 2014

Torque Repository

C 263 144 Updated May 12, 2023

An industrial deep learning framework for high-dimension sparse data

PureBasic 4,307 1,028 Updated Sep 25, 2024