Stars
An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs
SGLang is a high-performance serving framework for large language models and multimodal models.
LLM quantization (compression) toolkit with hardware acceleration support for NVIDIA CUDA, AMD ROCm, Intel XPU, and Intel/AMD/Apple CPUs via HF, vLLM, and SGLang.
A high-throughput and memory-efficient inference and serving engine for LLMs
SOTA rounding quantization for high-accuracy low-bit LLM inference. Seamlessly optimized for vLLM, SGLang, and CPU/GPU/CUDA with multi-datatype support.
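To illustrate what the quantization toolkits above operate on, here is a minimal pure-Python sketch of symmetric round-to-nearest (RTN) weight quantization to a low-bit integer grid. This is a conceptual baseline only, assuming per-tensor scaling; it is not AutoRound's learned-rounding algorithm or GPTQModel's implementation.

```python
def quantize_rtn(weights, bits=4):
    """Symmetric per-tensor round-to-nearest quantization (conceptual sketch).

    Maps floats onto a signed integer grid [-2**(bits-1), 2**(bits-1)-1]
    using a single scale; real toolkits add per-group scales, calibration
    data, and learned rounding on top of this baseline.
    """
    qmax = 2 ** (bits - 1) - 1          # e.g. 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate floats from the integer codes."""
    return [v * scale for v in q]
```

Each reconstructed weight differs from the original by at most half a quantization step (scale / 2), which is the error bound the more sophisticated rounding schemes try to beat on average.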
NVIDIA Linux open GPU kernel modules with P2P support
Roo Code gives you a whole dev team of AI agents in your code editor.
ChromaDB-powered local indexing support for Cursor, exposed as an MCP server
AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.
Provides name-based registrations to Microsoft.Extensions.DependencyInjection
Extremely fast MessagePack serializer for C# (.NET, .NET Core, Unity, Xamarin) / msgpack.org [C#]
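The entry above is the C# implementation; as a quick illustration of the MessagePack format itself, here is a minimal Python sketch of its single-byte encodings (positive/negative fixint, fixstr), following the published msgpack specification. It covers only a tiny slice of the format; real libraries handle all types.

```python
def pack(obj):
    """Encode a small int, short str, or bool as MessagePack bytes.

    Conceptual sketch of the wire format's compact single-byte headers;
    not a full encoder.
    """
    if isinstance(obj, bool):            # check bool before int (bool is an int subclass)
        return b"\xc3" if obj else b"\xc2"
    if isinstance(obj, int) and 0 <= obj <= 0x7F:
        return bytes([obj])              # positive fixint: the value is its own byte
    if isinstance(obj, int) and -32 <= obj < 0:
        return bytes([obj & 0xFF])       # negative fixint: 0xe0..0xff
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) < 32:
            return bytes([0xA0 | len(data)]) + data  # fixstr: length in the header byte
    raise ValueError("type/size not supported in this sketch")
```

The appeal of the format (and why fast implementations like the one above exist) is visible here: small values need no framing overhead at all, so `5` serializes to one byte and `"hi"` to three.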