Stars
A powerful meta-prompting, context-engineering, and spec-driven development system that enables agents to work autonomously for long periods without losing track of the big picture
(One of) the SOTA quantization algorithms for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support and full compatibility with vLLM, SGLang, and Tr…
SGLang is a high-performance serving framework for large language models and multimodal models.
A high-throughput and memory-efficient inference and serving engine for LLMs
An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs
LLM quantization (compression) toolkit with hardware-acceleration support for Nvidia, AMD, and Intel GPUs and Intel/AMD/Apple CPUs via HF, vLLM, and SGLang.
Roo Code gives you a whole dev team of AI agents in your code editor.
Enables true multi-GPU capability in ComfyUI using xDiT xFuser and FSDP, managed by Ray
ComfyUI MCP server + Claude Code plugin — workflow execution, visualization, composition, model management, and skill generation
AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.
Extremely fast MessagePack serializer for C# (.NET, .NET Core, Unity, Xamarin). / msgpack.org [C#]
Qwen-Image-Lightning: speed up the Qwen-Image model with distillation
NVIDIA Linux open GPU kernel modules with P2P support
ChromaDB-powered local indexing support for Cursor, exposed as an MCP server
Provides name-based registrations to Microsoft.Extensions.DependencyInjection