Stars
An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of…
llama.cpp fork with additional SOTA quants and improved performance
Your Personal AI Assistant; easy to install, deploy on your own machine or on the cloud; supports multiple chat apps with easily extensible capabilities.
Fully automatic censorship removal for language models
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…
An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs
VPTQ, A Flexible and Extreme low-bit quantization algorithm
aider is AI pair programming in your terminal
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
sgsdxzy / YuE-exllamav2-fork
Forked from AlpinDale/Better-YuE
YuE: Open Full-song Generation Foundation Model, something similar to Suno.ai but open
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and <think> tag filtering. Perfect for using advanced models wi…
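🚀 Reverse-engineered API for the Doubao large model (specialty: very strong web search), zero-configuration deployment, multi-token support; for testing only. For commercial use, please go to the official open platform.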
A Python implementation of global optimization with Gaussian processes.
🏝️ OASIS: Open Agent Social Interaction Simulations with One Million Agents.
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
HunyuanVideo: A Systematic Framework For Large Video Generation Model
LLM-powered multiagent persona simulation for imagination enhancement and business insights.
BS::thread_pool: a fast, lightweight, modern, and easy-to-use C++17 / C++20 / C++23 thread pool library
Implements harmful/harmless refusal removal using pure HF Transformers
Enforce the output format (JSON Schema, regex, etc.) of a language model
A fast inference library for running LLMs locally on modern consumer-class GPUs
chu-tianxiang / vllm-gptq
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
A high-throughput and memory-efficient inference and serving engine for LLMs