Stars
Seen2Scene takes an incomplete real-world 3D scan and generates a complete, coherent 3D scene using visibility-guided flow matching — trained directly on real-world data.
[RSS 2026] Causal video-action world model for generalist robot control
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels
A Curated List of Awesome Video World Models with AR Diffusion: Covering Algorithms, Applications, and Infrastructure, Aimed at Serving as a Comprehensive Resource for Researchers, Practitioners, a…
The first multiplayer video world model in Minecraft
[ICLR 2026] FantasyWorld: Geometry-Consistent World Modeling via Unified Video and 3D Prediction
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.
Edit-R1: Reinforce Image Editing with Diffusion Negative-Aware Finetuning and MLLM Implicit Feedback
The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…
[ICML 2026 Spotlight] Latent Collaboration in Multi-Agent Systems
[NeurIPS 2025 (Spotlight)] The implementation for the paper "4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos"
Official implementation of Continuous 3D Perception Model with Persistent State
[ICLR 2026] ReconViaGen: Towards Accurate Multi-view 3D Object Reconstruction via Generation
🟣 Computer Vision interview questions and answers to help you prepare for your next machine learning and data science interview in 2026.
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
[NeurIPS 2025] Direct3D‑S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention
Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
Official inference repo for FLUX.1 models
An open-source guide that demystifies how U.S. universities evaluate and admit students into Computer Science PhD programs.
Interview guide for machine learning engineers, algorithm engineers, software engineers, and data scientists (MLE, SDE, DS)
Efficient face emotion recognition in photos and videos
Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV 2023)
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"