Stars
An optimized quantization and inference library for running LLMs locally on modern consumer-grade GPUs
SGLang is a fast serving framework for large language models and vision language models.
LLM quantization (compression) toolkit with hardware acceleration support for NVIDIA CUDA, AMD ROCm, Intel XPU, and Intel/AMD/Apple CPUs via HF, vLLM, and SGLang.
A high-throughput and memory-efficient inference and serving engine for LLMs
Advanced quantization toolkit for LLMs and VLMs. Supports WOQ, MXFP4, NVFP4, GGUF, and adaptive schemes, with seamless integration into Transformers, vLLM, SGLang, and llm-compressor
NVIDIA Linux open GPU kernel modules with P2P support
Roo Code gives you a whole dev team of AI agents in your code editor.
ChromaDB-powered local indexing support for Cursor, exposed as an MCP server
AI agent that handles engineering tasks end-to-end: it integrates with developers’ tools, plans, executes, and iterates until it reaches a successful result.
Provides name-based registrations to Microsoft.Extensions.DependencyInjection
Extremely fast MessagePack serializer for C# (.NET, .NET Core, Unity, Xamarin). / msgpack.org [C#]