Stars
AlpaSim is an open-source autonomous vehicle simulation platform designed for development and testing of end-to-end AV policies
Realtime speech to presentation. Let the whiteboard whiteboard itself.
[ICLR 2026] Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Hundreds of models & providers. One command to find what runs on your hardware.
Official implementation of the ICLR'26 paper "Efficient Test-Time Scaling for Small Vision-Language Models": test-time scaling via test-time augmentation
👓 A web interface for gpustat: monitor GPU clusters at a glance
📊 A simple command-line utility for querying and monitoring GPU status
[CVPR 2024] HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention
[ICCV 2023 Oral] Game-theoretic modeling and learning of Transformer-based interactive prediction and planning
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
Official implementation for BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025)
Scalable toolkit for efficient model reinforcement
Unsloth Studio is a web UI for training and running open models such as Gemma 4, Qwen3.6, DeepSeek, and gpt-oss locally.
Magenta RealTime 2: An Open-Weights Live Music Model
🎒 Token-Oriented Object Notation (TOON) – Compact, human-readable, schema-aware JSON for LLM prompts. Spec, benchmarks, TypeScript SDK.
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Train transformer language models with reinforcement learning.
Reference PyTorch implementation and models for DINOv3
Official Code for "Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction (CVPR 2024)" and "Social Reasoning-Aware Trajectory Prediction via Multimodal Language Mod…
Official inference framework for 1-bit LLMs