Starred repositories
Sharp Monocular View Synthesis in Less Than a Second
Blendshape and kinematics calculator for Mediapipe/Tensorflow.js Face, Eyes, Pose, and Finger tracking models.
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
LLM quantization (compression) toolkit with hardware acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.
Introduction to Machine Learning Systems
State-of-the-art paired encoder and decoder models (17M-1B params)
SkyRL: A Modular Full-stack RL Library for LLMs
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
🤗 A PyTorch-native Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs: Z-Image, FLUX2, Qwen-Image, etc.
Evaluation software used in the Text Retrieval Conference
Lightweight coding agent that runs in your terminal
⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.
A collection of sample agents built with Agent Development Kit (ADK)
Context7 MCP Server -- Up-to-date code documentation for LLMs and AI code editors
Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search"
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Cross-platform Rust rewrite of the GNU coreutils
LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using the RAG paradigm.