Stars
A Claude Code skill that removes signs of AI-generated writing from text.
Community-contributed instructions, agents, skills, and configurations to help you make the most of GitHub Copilot.
ResearchClaw is a personal AI assistant built for research: fast to set up, easy to run locally or in the cloud, and ready to integrate with the chat apps you already use. With extensible skills, i…
Make Any Website & Tool Your CLI. A universal CLI Hub and AI-native runtime. Transform any website, Electron app, or local binary into a standardized command-line interface. Built for AI Agents to …
Production-grade engineering skills for AI coding agents.
An agentic skills framework & software development methodology that works.
Lets coding agents automatically analyze CUDA programs using NCU (Nsight Compute) skills!
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your Claude Code/Codex/Gemini agent will be an AI research agent with full horsepowe…
A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-performance systems.
Alibaba Cloud's high-performance KVCache system for LLM inference, with components for global cache management, inference simulation (HiSim), and more.
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Sharing AI Infra knowledge & code exercises: intros to the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, LLM fundamentals 🧠, AI hardware and software 🔧, and more.
This project aims to replicate mainstream open-source model architectures with limited computational resources, implementing mini models with 100-200M parameters.
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Persist and reuse KV Cache to speedup your LLM.
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
A Datacenter Scale Distributed Inference Serving Framework
SGLang is a high-performance serving framework for large language models and multimodal models.
A high-throughput and memory-efficient inference and serving engine for LLMs
Serverless LLM Serving for Everyone.
Apache Fluss is a streaming storage system built for real-time analytics.
Supercharge Your LLM with the Fastest KV Cache Layer
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.