Stars
Official Repo of TexVerse: A Universe of 3D Objects with High-Resolution Textures
Native and Compact Structured Latents for 3D Generation
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
Unified Multimodal Model for image generation/editing/understanding
Official implementation of the paper "Transfer between Modalities with MetaQueries"
Native Multimodal Models are World Learners
[NeurIPS 2025 Oral] Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think
Official PyTorch Implementation of "Latent Diffusion Model Without Variational Autoencoder".
[NeurIPS'25 Spotlight] Boosting Generative Image Modeling via Joint Image-Feature Synthesis
A part-based 3D generation framework & the largest and most comprehensively annotated 3D part dataset.
Part-X-MLLM: Part-aware 3D Multimodal Large Language Model
Official code for VIST3A: Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
[NeurIPS 2025] Improving Video Generation with Human Feedback
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
Wan: Open and Advanced Large-Scale Video Generative Models
Enjoy the magic of Diffusion models!
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
📹 A more flexible framework that can generate videos at any resolution and create videos from images.
[NeurIPS 2025] SyncHuman: Synchronizing 2D and 3D Generative Models for Single-view Human Reconstruction.
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.