Tracking and Understanding Object Transformations

Project Page | Paper | Video

Official PyTorch implementation for the NeurIPS 2025 paper: "Tracking and Understanding Object Transformations".

TODOs (By 12/2)

Expand and polish VOST-TAS documentation and visualizations - Done (10/31)
Expand and polish main code documentation
Add quick demo from input to all predictions

⚙️ Installation

The code is tested with python=3.10, torch==2.7.0+cu126, and torchvision==0.22.0+cu126 on an RTX A6000 GPU.

git clone --recurse-submodules https://github.com/YihongSun/TubeletGraph/
cd TubeletGraph/
conda create -n tubeletgraph python=3.10
conda activate tubeletgraph
# TODO: add more packages
pip install torch==2.7.0 torchvision==0.22.0 --index-url https://download.pytorch.org/whl/cu126
pip install matplotlib opencv-python tqdm scikit-image pycocotools omegaconf
pip install imageio
pip install "imageio[ffmpeg]"
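
To confirm that an environment matches the tested versions listed above, a small check along these lines may help. It is a sketch and not part of the repository: the helper names `base_version`, `check_environment`, and `installed_versions` are hypothetical, and only the release part of each version is compared, so local build tags such as `+cu126` are ignored.

```python
from importlib import metadata

# Versions this README reports as tested (local build tags such as +cu126 omitted).
TESTED = {"torch": "2.7.0", "torchvision": "0.22.0"}


def base_version(version: str) -> str:
    """Drop a local build tag, e.g. '2.7.0+cu126' -> '2.7.0'."""
    return version.split("+", 1)[0]


def check_environment(installed: dict) -> dict:
    """Map each tested package to True if the installed release matches, else False."""
    return {
        name: base_version(installed.get(name, "")) == wanted
        for name, wanted in TESTED.items()
    }


def installed_versions() -> dict:
    """Look up installed versions; packages that are not installed are skipped."""
    found = {}
    for name in TESTED:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    for name, ok in check_environment(installed_versions()).items():
        print(f"{name}: {'matches tested version' if ok else 'differs or missing'}")
```

A mismatch does not necessarily mean the code will fail; it only indicates that the environment differs from the one the authors tested.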

In addition, please install the following:

Install SAM2 with multi-mask predictions in thirdparty, following its documentation.
Install CropFormer in a separate conda environment, following its documentation.
Install FC-CLIP in a separate conda environment, following its documentation.

Then update the corresponding paths for CropFormer and FC-CLIP in configs/default.yaml.

🔮 Predictions

Computing entities (region proposals)

python3 TubeletGraph/entity_segmentation/cropformer.py -c <CONFIG> -d <DATASET> -s <SPLIT> --num_workers <N> --wid <I>
## example
conda activate cropformer      ## requires separate installation
python3 TubeletGraph/entity_segmentation/cropformer.py -c configs/default.yaml -d vost -s val

Computing tubelets

python3 TubeletGraph/tubelet/compute_tubelets_sam.py -c <CONFIG> -d <DATASET> -s <SPLIT> --num_workers <N> --wid <I>
## example
python3 TubeletGraph/tubelet/compute_tubelets_sam.py -c configs/default.yaml -d vost -s val

Computing semantic similarity

python3 TubeletGraph/semantic_sim/compute_sim_fcclip.py -c <CONFIG> -d <DATASET> -s <SPLIT> -t <TUBELET_NAME> --num_workers <N> --wid <I>
## example
conda activate fcclip           ## requires separate installation
python3 TubeletGraph/semantic_sim/compute_sim_fcclip.py -c configs/default.yaml -d vost -s val -t tubelets_vost_cropformer
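
The three preprocessing scripts above accept `--num_workers <N>` and `--wid <I>`, which suggests the work can be split across several processes. The sketch below builds one command per worker and can launch them in parallel. It assumes that `--wid` is a zero-based worker index in the range 0 to N-1, which this README does not state; `worker_commands` and `run_workers` are hypothetical helper names, not part of the repository.

```python
import subprocess


def worker_commands(script, config, dataset, split, num_workers, extra=()):
    """Build one argv list per worker.

    Assumption: --wid is a zero-based worker index (0 .. num_workers - 1).
    """
    base = ["python3", script, "-c", config, "-d", dataset, "-s", split, *extra]
    return [
        base + ["--num_workers", str(num_workers), "--wid", str(wid)]
        for wid in range(num_workers)
    ]


def run_workers(commands):
    """Start all workers at once, wait for them, and return their exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    cmds = worker_commands(
        "TubeletGraph/entity_segmentation/cropformer.py",
        "configs/default.yaml", "vost", "val", num_workers=4,
    )
    for cmd in cmds:
        print(" ".join(cmd))  # inspect first; call run_workers(cmds) to launch
```

For the semantic similarity script, the tubelet name can be passed through `extra`, for example `extra=("-t", "tubelets_vost_cropformer")`. Several workers sharing one GPU may run out of memory, so it can be worth starting with a small worker count.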

Compute predictions

python3 TubeletGraph/get_prediction.py -c <CONFIG> -d <DATASET> -s <SPLIT> -m <METHOD>
## example
python3 TubeletGraph/get_prediction.py -c configs/default.yaml -d vost -s val -m Ours

Obtain state graph description

python3 TubeletGraph/vlm/prompt_vlm.py -c <CONFIG> -p <PRED>
## example
python3 TubeletGraph/vlm/prompt_vlm.py -c configs/default.yaml -p vost-val-Ours
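
The five prediction steps can also be chained in one script. The sketch below lists them in the order given above and runs each one through `conda run -n <ENV>`. The environment names `cropformer` and `fcclip` come from the examples and `tubeletgraph` from the installation section; running the tubelet, prediction, and VLM steps in `tubeletgraph` is an assumption, and `pipeline_steps` and `run_pipeline` are hypothetical helper names.

```python
import subprocess

CONFIG = "configs/default.yaml"


def pipeline_steps(dataset="vost", split="val", method="Ours", entity_model="cropformer"):
    """Return (conda_env, argv) pairs in the order the README runs them."""
    common = ["-c", CONFIG, "-d", dataset, "-s", split]
    # Both name patterns below are inferred from the README examples.
    tubelets = f"tubelets_{dataset}_{entity_model}"
    pred = f"{dataset}-{split}-{method}"
    return [
        ("cropformer", ["python3", "TubeletGraph/entity_segmentation/cropformer.py", *common]),
        ("tubeletgraph", ["python3", "TubeletGraph/tubelet/compute_tubelets_sam.py", *common]),
        ("fcclip", ["python3", "TubeletGraph/semantic_sim/compute_sim_fcclip.py", *common, "-t", tubelets]),
        ("tubeletgraph", ["python3", "TubeletGraph/get_prediction.py", *common, "-m", method]),
        ("tubeletgraph", ["python3", "TubeletGraph/vlm/prompt_vlm.py", "-c", CONFIG, "-p", pred]),
    ]


def run_pipeline(steps):
    """Run each step in its conda environment; stop at the first failure."""
    for env, argv in steps:
        subprocess.run(["conda", "run", "-n", env, *argv], check=True)


if __name__ == "__main__":
    for env, argv in pipeline_steps():
        print(f"[{env}] {' '.join(argv)}")  # dry run; call run_pipeline(...) to execute
```

Printing the commands first makes it easy to compare them against the examples before anything is executed.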

📊 Evaluations

Compute tracking performance

python3 eval/eval.py -c <CONFIG> -p <PRED>
## example
python3 eval/eval.py -c configs/default.yaml -p vost-val-Ours

Compute state-graph performance

python3 eval/compute_temploc_pr.py -c <CONFIG> -p <PRED>
python3 eval/compute_sem_acc.py -c <CONFIG> -p <PRED>
## example
python3 eval/compute_temploc_pr.py -c configs/default.yaml -p vost-val-Ours_gpt-4.1
python3 eval/compute_sem_acc.py -c configs/default.yaml -p vost-val-Ours_gpt-4.1
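
Note that the tracking evaluation takes `vost-val-Ours` while the state-graph evaluation takes `vost-val-Ours_gpt-4.1`. Judging from these examples, `<PRED>` appears to follow the pattern `<DATASET>-<SPLIT>-<METHOD>`, with `_<VLM>` appended once a state graph description has been generated. The helpers below sketch that pattern. It is inferred from the examples rather than documented, `build_pred` and `parse_pred` are hypothetical names, and the parser assumes the method name contains no underscore.

```python
def build_pred(dataset, split, method, vlm=None):
    """Compose a <PRED> name, optionally with a VLM suffix."""
    name = f"{dataset}-{split}-{method}"
    return f"{name}_{vlm}" if vlm else name


def parse_pred(pred):
    """Split a <PRED> name into its parts.

    Assumption: the first underscore separates the tracking name from the
    VLM name, so the method name itself must not contain an underscore.
    """
    tracking, _, vlm = pred.partition("_")
    dataset, split, method = tracking.split("-", 2)
    return {"dataset": dataset, "split": split, "method": method, "vlm": vlm or None}


if __name__ == "__main__":
    print(build_pred("vost", "val", "Ours"))             # name used by eval/eval.py
    print(build_pred("vost", "val", "Ours", "gpt-4.1"))  # name used by the state-graph scripts
```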

🖼️ Visualizations

Visualizing entity segmentations

python3 eval/vis_entities.py -c <CONFIG> -d <DATASET> -m <MODEL> -i <INSTANCE>
## example
python3 eval/vis_entities.py -c configs/default.yaml -d vost -m cropformer -i 3161_peel_banana

Visualizing tubelets

python3 eval/vis_tubelets.py -c <CONFIG> -d <DATASET> -m <MODEL> -i <INSTANCE>_<OBJ_ID>
## example
python3 eval/vis_tubelets.py -c configs/default.yaml -d vost -m cropformer -i 3161_peel_banana_1
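
Tubelet visualization takes one `<INSTANCE>_<OBJ_ID>` argument per call, so visualizing several objects of the same video means repeating the command. The sketch below generates those commands for a list of object IDs. `tubelet_vis_commands` is a hypothetical helper, and which object IDs exist for a given instance depends on the dataset annotations; only ID 1 of `3161_peel_banana` is shown in this README.

```python
import shlex


def tubelet_vis_commands(instance, obj_ids, config="configs/default.yaml",
                         dataset="vost", model="cropformer"):
    """Build one vis_tubelets.py command line per object, using <INSTANCE>_<OBJ_ID>."""
    commands = []
    for obj_id in obj_ids:
        argv = ["python3", "eval/vis_tubelets.py", "-c", config,
                "-d", dataset, "-m", model, "-i", f"{instance}_{obj_id}"]
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    for line in tubelet_vis_commands("3161_peel_banana", [1]):
        print(line)
```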

Visualizing state graphs

python3 eval/vis_states.py -c <CONFIG> -p <PRED>
## example
python3 eval/vis_states.py -c configs/default.yaml -p vost-val-Ours_gpt-4.1

Citation

If you find our work useful in your research, please consider citing our paper:

@article{sun2025tracking,
  title={Tracking and Understanding Object Transformations},
  author={Sun, Yihong and Yang, Xinyu and Sun, Jennifer J and Hariharan, Bharath},
  journal={Advances in Neural Information Processing Systems},
  year={2025}
}
