Stars
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
A generative world for general-purpose robotics & embodied AI learning.
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
A TTS model capable of generating ultra-realistic dialogue in one pass.
Wan: Open and Advanced Large-Scale Video Generative Models
Efficient Triton Kernels for LLM Training
PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....
[ECCV 2024] Code for DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
The best OSS video generation models, created by Genmo
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
[NeurIPS 2024] Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
Modularized Implementation of Deep RL Algorithms in PyTorch
SAPIEN Manipulation Skill Framework, an open-source, GPU-parallelized robotics simulator and benchmark, led by Hillbot, Inc.
🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications
This SDK is now deprecated; use the new unified Google GenAI SDK.
[TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation.
Official repository for "AM-RADIO: Reduce All Domains Into One"
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Cosmos-Reason1 models understand physical common sense and generate appropriate embodied decisions in natural language through long chain-of-thought reasoning processes.
Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.
This code corresponds to simulation environments used as part of the MimicGen project.
Official repo of VLABench, a large-scale benchmark designed for fairly evaluating VLAs, embodied agents, and VLMs.
The Open-Source Repository of Taipei City Dashboard.
Reference workflow for generating large amounts of synthetic motion trajectories for robot manipulation from a few human demonstrations.
Revisiting Image Deblurring with an Efficient ConvNet - An efficient CNN performs better than a Transformer
Learning Real-World Action-Video Dynamics with Heterogeneous Masked Autoregression