insait-institute/USDNet

Articulate3D: Holistic Understanding of 3D Scenes as Universal Scene Description

This repository contains the official code release for the Articulate3D paper, accepted at ICCV 2025. It provides the USDNet baseline implementation as well as the SceneDataLoader for the Articulate3D dataset.

📄 Paper: Articulate3D (ICCV 2025)
🏁 Challenge: Track 3 at OpenSUN3D Workshop, ICCV 2025
🤗 Dataset: Articulate3D is available on HuggingFace


📦 What's in this repo?

Currently released:

  • USDNet: Implementation of the baseline for Articulate3D challenge tasks.
  • SceneDataLoader: Python class for loading and parsing Articulate3D annotations.

🚀 Challenge Participation

Join the Articulate3D Challenge at the OpenSUN3D Workshop (ICCV 2025)!
We're hosting Track 3, which focuses on articulated scene understanding.

Challenge details and submission portal: OpenSUN3D Challenge


USDNet

1. Code structure

We adapt the codebase of Mask3D, which provides a highly modular framework for 3D semantic instance segmentation based on the MinkowskiEngine.

├── USDNet
│   ├── main_instance_segmentation_articulation.py <- the main file
│   ├── conf                          <- hydra configuration files
│   ├── datasets
│   │   ├── preprocessing             <- folder with preprocessing scripts
│   │   │   └── articulate3d_preprocessing_challenge.py   <- file of preprocessing for the challenge
│   │   ├── semseg.py                 <- indoor dataset
│   │   └── utils.py
│   ├── models                        <- USDNet model based on Mask3D
│   ├── trainer
│   │   ├── __init__.py
│   │   └── trainer.py                <- train loop
│   └── utils
├── data
│   ├── processed                     <- folder for preprocessed datasets
│   └── raw                           <- folder for raw datasets
├── scripts                           <- train scripts
├── docs
├── README.md
├── saved                             <- folder that stores models and logs
└── Dockerfile                        <- Dockerfile for env setup for cuda 12

2. Dependencies 📝

The main dependencies of the project are the following:

python: 3.10.9
cuda: 11.3

You can set up a conda environment following instructions in Mask3D.

We also provide a Dockerfile (./Dockerfile) that sets up the environment for cuda: 12.1:

docker build -t usdnet:latest .

3. Data preprocessing 🔨

After installing the dependencies, we preprocess the datasets. For convenience, we also provide the preprocessed data here. You can download it, put it in ./data/processed, and skip the following preprocessing steps.

First, put the dataset in the directory "./data/raw/articulate3d". Then run the preprocessing bash script (preprocessing_articulate3d.sh); the preprocessed files will be saved in "./data/processed/". For efficiency, the preprocessing code downsamples the point cloud of the ScanNet++ mesh with a voxel size of 0.01 cm. Note that the evaluation in the Articulate3D challenge is based on this voxelized point cloud with the ground-truth annotations.
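To make the downsampling step concrete, here is a minimal NumPy sketch of voxel-grid downsampling. It is not the repository's preprocessing code: the function name `voxel_downsample` and the choice of keeping the centroid of each occupied voxel are assumptions made for illustration, and `voxel_size` is expressed in the same unit as the point coordinates.

```python
import numpy as np


def voxel_downsample(points, voxel_size):
    """Return one point (the centroid) per occupied voxel.

    Hypothetical helper for illustration; the actual preprocessing
    script may select or aggregate points differently.
    """
    points = np.asarray(points, dtype=np.float64)
    # Integer voxel coordinates of every point.
    voxel_idx = np.floor(points / voxel_size).astype(np.int64)
    # Group points that fall into the same voxel.
    _, inverse, counts = np.unique(
        voxel_idx, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # shape differs across NumPy versions
    # Sum the coordinates per voxel, then divide by the point count.
    sums = np.zeros((counts.shape[0], points.shape[1]))
    np.add.at(sums, inverse, points)
    return sums / counts[:, None]


if __name__ == "__main__":
    cloud = np.array([
        [0.001, 0.001, 0.001],
        [0.002, 0.002, 0.002],  # same voxel as the first point
        [0.505, 0.505, 0.505],  # a different voxel
    ])
    print(voxel_downsample(cloud, voxel_size=0.01))
```

Because the challenge is evaluated on the voxelized cloud, any per-point labels would have to be reduced per voxel in the same way; that part is omitted here.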

Note that the split files are in "./datasets/articulate3d" and should be copied to "./data/raw/articulate3d/".

The structure should look like this:

├── USDNet
│   ├── data
│   │   ├── raw                       <- raw data
│   │   │   ├── articulate3d
│   │   │   │   ├── splits             <- splits of training, validation and test set
│   │   │   │   │   ├── train.txt
│   │   │   │   │   ├── val.txt
│   │   │   │   │   ├── test.txt
│   │   │   │   ├── scans
│   │   │   │   │   ├── 0a5c013435
│   │   │   │   │   │   ├── mesh_aligned_0.05.ply      <- mesh file
│   │   │   │   │   │   ├── 0a5c013435_parts.json      <- annotation for movable and interactable part segmentation
│   │   │   │   │   │   ├── 0a5c013435_artic.json      <- annotation for articulation parameters of movable part
│   │   │   │   │   ├── ...
│   │   ├── processed                 <- folder with processed data by preprocessing_articulate3d.sh
│   │   │   ├── articulate3d_challenge_mov             <- processed data for movable part seg and articulation prediction
│   │   │   │   ├── train                              <- dataset with pointcloud, color and normal + annotation for training set
│   │   │   │   ├── validation                         <- dataset with pointcloud, color and normal + annotation for validation set
│   │   │   │   ├── test                               <- dataset with pointcloud, color and normal + annotation for test set
│   │   │   │   ├── train_database.yaml                <- database for train set, used for dataloader to locate file paths
│   │   │   │   ├── validation_database.yaml           <- database for validation set
│   │   │   │   ├── train_validation_database.yaml     <- database for train+validation set
│   │   │   │   ├── test_database.yaml                 <- database for test set
│   │   │   │   ├── expand_dict                        <- neighbored point annotation of movable part, for coarse to fine segmentation training
│   │   │   │   ├── instance_gt                        <- gt segmentation annotation in .txt
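Before running the preprocessing, it can help to check that the raw folder matches the layout above. The sketch below is not part of the repository: the function name `find_missing_files` is hypothetical, and it assumes that each split file lists one scene ID per line, which the README does not state explicitly.

```python
from pathlib import Path


def find_missing_files(root, split="train"):
    """List expected raw files that are absent for the scenes in a split.

    Assumes <root>/splits/<split>.txt holds one scene ID per line
    (an assumption; check the actual split files).
    """
    root = Path(root)
    split_file = root / "splits" / f"{split}.txt"
    scene_ids = [
        line.strip()
        for line in split_file.read_text().splitlines()
        if line.strip()
    ]
    missing = []
    for scene_id in scene_ids:
        scene_dir = root / "scans" / scene_id
        expected = [
            scene_dir / "mesh_aligned_0.05.ply",      # mesh file
            scene_dir / f"{scene_id}_parts.json",     # part segmentation
            scene_dir / f"{scene_id}_artic.json",     # articulation parameters
        ]
        missing.extend(path for path in expected if not path.is_file())
    return missing


if __name__ == "__main__":
    raw_root = Path("./data/raw/articulate3d")
    if raw_root.is_dir():
        for path in find_missing_files(raw_root, split="train"):
            print(f"missing: {path}")
    else:
        print(f"{raw_root} not found; place the dataset there first")
```

An empty result for every split you intend to use suggests the raw data is laid out as the preprocessing expects.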

4. Training 🚆

Movable part segmentation and articulation prediction

Step 1

Download the pretrained model of Mask3D.

Step 2

Check the notes and TODOs in "./scripts/train_mov.sh" and set the correct key and paths.

Step 3

Start training for movable part segmentation and articulation parameter prediction:

bash ./scripts/train_mov.sh

Interactable part segmentation

Step 1

Take the trained model from "Movable part segmentation and articulation prediction" and use it to initialize training for interactable part segmentation, which speeds up convergence.

Step 2

Check the notes and TODOs in "./scripts/train_inter.sh" and set the correct key and paths.

Step 3

Start training for interactable part segmentation:

bash ./scripts/train_inter.sh

Trained checkpoints 💾

We provide the trained checkpoints for the 2 tasks here.

5. Inference 📈

Run the inference scripts to evaluate the trained model and to produce the challenge submission.

Movable part segmentation and articulation prediction

bash ./scripts/infer_mov.sh

Interactable part segmentation

bash ./scripts/infer_inter.sh

📂 SceneDataLoader Documentation: SceneDataLoader.py

Overview

SceneDataLoader is a Python iterator that loads Articulate3D annotations from its dataset directory.
Each scene is composed of:

  • <scene_id>_parts.json: part annotations and mesh face indices.
  • <scene_id>_artic.json: articulation parameters (axis, origin, range, type).
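To illustrate what an axis and an origin describe geometrically, the sketch below moves a set of points either by rotating them about an axis that passes through an origin (a hinge-like part) or by sliding them along an axis (a drawer-like part). It is only an illustration: the function names, the use of radians, and the two motion kinds are assumptions, and the code does not read the actual JSON schema of `<scene_id>_artic.json`.

```python
import numpy as np


def rotate_about_axis(points, origin, axis, angle_rad):
    """Rotate points about the line through `origin` with direction `axis`.

    Uses Rodrigues' rotation formula. Hypothetical helper; units and
    conventions of the dataset annotations may differ.
    """
    points = np.asarray(points, dtype=np.float64)
    origin = np.asarray(origin, dtype=np.float64)
    k = np.asarray(axis, dtype=np.float64)
    k = k / np.linalg.norm(k)          # unit rotation axis
    p = points - origin                # move the axis to pass through zero
    cos_a, sin_a = np.cos(angle_rad), np.sin(angle_rad)
    rotated = (
        p * cos_a
        + np.cross(k, p) * sin_a
        + np.outer(p @ k, k) * (1.0 - cos_a)
    )
    return rotated + origin


def translate_along_axis(points, axis, distance):
    """Slide points by `distance` along the direction `axis`."""
    points = np.asarray(points, dtype=np.float64)
    k = np.asarray(axis, dtype=np.float64)
    k = k / np.linalg.norm(k)
    return points + distance * k


if __name__ == "__main__":
    door_corner = np.array([[2.0, 0.0, 0.0]])
    # Hinge along z through (1, 0, 0), opened by 90 degrees.
    print(rotate_about_axis(door_corner, [1.0, 0.0, 0.0], [0.0, 0.0, 1.0], np.pi / 2))
    # Drawer pulled 0.5 units along x.
    print(translate_along_axis(door_corner, [1.0, 0.0, 0.0], 0.5))
```

Sweeping the angle or distance over an annotated range would trace the motion of the part between its closed and open states.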

Usage

from loader import SceneDataLoader

loader = SceneDataLoader("path/to/Articulate3D/")
for scene_id, scene_dict, face_mask in loader:
    print(f"Scene: {scene_id}")
    print(f"Articulated parts: {list(scene_dict.keys())}")
    print(f"Face mask shape: {face_mask.shape}")

TODO List

  • Release Code
  • Set up Challenge Server
  • Training Code and instructions
  • Checkpoints (mov yes, inter no)
  • provide preprocessed data for user's convenience
  • Add docker file for env setup in cuda 12
  • Merge the JSON-format data loader into the data preprocessing

BibTeX 🙏

@InProceedings{halacheva2024articulate3d,
    author    = {Halacheva, Anna-Maria and Miao, Yang and Zaech, Jan-Nico and Wang, Xi and Van Gool, Luc and Paudel, Danda Pani},
    title     = {Articulate3D: Holistic Understanding of 3D Scenes as Universal Scene Description},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year      = {2025},
  }
