Starred repositories
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling
[ICLR 26 Oral] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
Official Implementation of "MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives"
[ICLR 2026] LongLive: Real-time Interactive Long Video Generation
Official implementation of "Repurposing Geometric Foundation Models for Multi-view Diffusion"
(TPAMI 2026) Learning Continuous Wasserstein Barycenter Space for Generalized All-in-One Image Restoration
[CVPR 2025] VideoWorld is a simple generative model that learns purely from unlabeled videos—much like how babies learn by observing their environment.
To pioneer training long-context multi-modal transformer models
ViPE: Video Pose Engine for Geometric 3D Perception
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition
Code and website for Self-Flow: Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis
DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related webs…
A list of works on video generation towards world model
Code for the project "MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos"
HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency
Consistent Autoregressive Video Generation with Long Context
Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation"
Cosmos-Predict2.5, the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.
Code to pretrain, fine-tune, and evaluate DreamZero and run sim & real-world evals
QVerisAI / QVerisBot
Forked from openclaw/openclaw
Your own professional personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
[ICLR 2026] TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…
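LLM-powered stock analysis system for A-share, H-share, and US markets: multi-source market data + real-time news + LLM decision dashboard + multi-channel push notifications, with scheduled runs at zero cost, entirely free.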
Code for "Diffusion Model Alignment Using Direct Preference Optimization"