Stars
Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?
Enjoy the magic of Diffusion models!
A Curated List of Awesome Video World Models with AR Diffusion: Covering Algorithms, Applications, and Infrastructure, Aimed at Serving as a Comprehensive Resource for Researchers, Practitioners, a…
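With only basic Python, build your own embodied-AI robot from scratch; build VLA/OpenVLA/SmolVLA/Pi0 step by step from scratch and gain a deep understanding of embodied AI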
Vero: An Open RL Recipe for General Visual Reasoning
Light Image Video Generation Inference Framework
ZGI is an open-source platform for building AI applications. Its intuitive interface combines workflow design, agent orchestration, dataset management, and model integration—allowing you to quickly…
Distill your ex into an AI Skill that talks to you the way they did. Inspired by colleague-skill.
Make Any Website & Tool Your CLI. A universal CLI Hub and AI-native runtime. Transform any website, Electron app, or local binary into a standardized command-line interface. Built for AI Agents to …
Official implementation of "OmniForcing: Unleashing Real-time Joint Audio-Visual Generation" [arXiv:2603.11647]. OmniForcing is the first framework to distill bidirectional audio-visual diffusion mo…
Lightweight coding agent that runs in your terminal
The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.
[ICLR 2026] LongLive: Real-time Interactive Long Video Generation
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
An agentic skills framework & software development methodology that works.
A diffusion-based framework for document OCR that replaces autoregressive decoding with block-level parallel diffusion decoding.
Try X-Dub to sync any character in a video with any audio you like | Official repository for "From Inpainting to Editing: Unlocking Robust Mask-Free Visual Dubbing via Generative Bootstrapping"
Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory
Automated system for LLM evaluation via agents.
Agentic IM chatbot infrastructure that integrates many IM platforms, LLMs, plugins, and AI features, and can serve as your OpenClaw alternative. ✨
mjlab-native port of InstinctLab for humanoid RL and Project-Instinct workflows.
Unified Operator on Interactive World Model is a unified frontend for interactive world models. It lets users select a model, choose a dataset (e.g., CSGO) or directly upload an image, and immediat…
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
OpenClaw-RL: Train any agent simply by talking
The official code repository for LeVo: High-Quality Song Generation with Multi-Preference Alignment
The First Unified Agent Data Synthesis Framework for Custom Agentic Tasks with an All-in-One Environment
A curated collection of research papers, models, and resources tracing the evolution from specialized models to unified world models.