Stars
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs
VPTQ: a flexible, extreme low-bit quantization algorithm
aider is AI pair programming in your terminal
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
sgsdxzy / YuE-exllamav2-fork
Forked from AlpinDale/Better-YuE
YuE: Open Full-song Generation Foundation Model, something similar to Suno.ai but open
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and <think> tag filtering. Perfect for using advanced models wi…
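🚀 Reverse-engineered API for the Doubao large model (specialty: very strong web-connected search), zero-configuration deployment, multi-token support. For testing only; for commercial use, please go to the official open platform.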
A Python implementation of global optimization with Gaussian processes.
🏝️ OASIS: Open Agent Social Interaction Simulations with One Million Agents.
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
HunyuanVideo: A Systematic Framework For Large Video Generation Model
LLM-powered multiagent persona simulation for imagination enhancement and business insights.
BS::thread_pool: a fast, lightweight, modern, and easy-to-use C++17 / C++20 / C++23 thread pool library
Implements harmful/harmless refusal removal using pure HF Transformers
Enforce the output format (JSON Schema, Regex etc) of a language model
A fast inference library for running LLMs locally on modern consumer-class GPUs
chu-tianxiang / vllm-gptq
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
A high-throughput and memory-efficient inference and serving engine for LLMs
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transf…