CV && CG
Official code of "EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model"
[ECCV 2024] LN3Diff creates high-quality 3D object meshes from text within 8 V100-seconds.
[ICLR 2024 spotlight] Official implementation of "InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior".
[CVPR 2024] DiffuScene: Denoising Diffusion Models for Generative Indoor Scene Synthesis
[CVPR 2024] Code to access and generate the ProciGen dataset.
[ECCV 2024] Official implementation of "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
Localizing Regions on 3D Shapes via Text Descriptions
A generative world for general-purpose robotics & embodied AI learning.
Reference PyTorch implementation and models for DINOv3