Stars
Enjoy the magic of Diffusion models!
📹 A more flexible framework that can generate videos at any resolution and create videos from images.
A Curated List of Awesome Video World Models with AR Diffusion: Covering Algorithms, Applications, and Infrastructure, Aimed at Serving as a Comprehensive Resource for Researchers, Practitioners, a…
[ICLR 2026 oral] Official code for VIST3A: Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator
Official code for "LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis" (CVPR 2026)
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image (CVPR 2026)
Official Code Release of SAGE: Scalable Agentic 3D Scene Generation for Embodied AI
GO-Renderer: Generative Object Rendering with 3D-aware Controllable Video Diffusion Models
Wan: Open and Advanced Large-Scale Video Generative Models
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
[ICCV 2025] DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Official Repo of TexVerse: A Universe of 3D Objects with High-Resolution Textures
[ICLR 2026] Part-X-MLLM: Part-aware 3D Multimodal Large Language Model
[NeurIPS 2025] SyncHuman: Synchronizing 2D and 3D Generative Models for Single-view Human Reconstruction.
Native and Compact Structured Latents for 3D Generation
Native Multimodal Models are World Learners
[ICLR 2026] Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure?
A part-based 3D generation framework & the largest and most comprehensively annotated 3D part dataset.