- Hong Kong
- https://xiangyueliu.github.io/
- @star_chenxi
Stars
A 3B-active-parameter native unified multimodal model for image and video understanding, generation, and editing.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
[CVPR 2026] Official implementation of "UniTEX: Universal High Fidelity Generative Texturing for 3D Shapes"
Official implementation of "NoiseAR: AutoRegressing Initial Noise Prior for Diffusion Models"
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets
Let's make video diffusion practical!
PyTorch implementation of FractalGen https://arxiv.org/abs/2502.17437
A simulation platform for versatile Embodied AI research and development.
[CVPR2025] EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild
[CoRL 2024] DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes
[ICLR'25] 🍀 DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References
DressRecon: Freeform 4D Human Reconstruction from Monocular Video (3DV'25 Oral)
[3DV'25] GaussianAvatar-Editor: Photorealistic Animatable Gaussian Head Avatar Editor
NVIDIA Cosmos is an open platform of world models, datasets, and tools that enables developers to build Physical AI for robots, autonomous vehicles, smart infrastructure, and more.
Code for "DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT"
Code for the project "MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos"
Depth Any Video with Scalable Synthetic Data (ICLR 2025)
RaDe-GS: Rasterizing Depth in Gaussian Splatting
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
[ECCV 2024] HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting.
[SIGGRAPH'24] Official code of HeadArtist: Text-conditioned 3D Head Generation with Self Score Distillation
[SIGGRAPH 2024] Official PyTorch Implementation of "BrepGen: A B-rep Generative Diffusion Model with Structured Latent Geometry".