[TPAMI 2026] Breaking Barriers, Localizing Saliency: A Large-scale Benchmark and Baseline for Condition-Constrained Salient Object Detection
Runmin Cong, Zhiyang Chen, Hao Fang*, Sam Kwong and Wei Zhang
- CSOD: We introduce the new task of Condition-Constrained Salient Object Detection (CSOD), with solutions from both the data and model dimensions, enabling intelligent systems to reliably address complex visual challenges in real, open environments. We also construct the large-scale benchmark CSOD10K, the first SOD dataset covering diverse constrained conditions, with 10,000 images, 3 constraint types, 8 real-world scenes, 101 object categories, and pixel-level annotations.
- SOTA Performance: We propose CSSAM, a unified end-to-end framework for the CSOD task. We design a Scene Prior-Guided Adapter (SPGA) that helps the foundation model adapt to downstream constrained scenes, and a Hybrid Prompt Decoding Strategy (HPDS) that generates and integrates multiple types of prompts to adapt the model to the SOD task.
- Python 3.9+
- PyTorch 2.0+ (we use PyTorch 2.4.1)
- CUDA 12.1 or another compatible version
Step 1: Create a conda environment and activate it.
```
conda create -n cssam python=3.9 -y
conda activate cssam
```

Step 2: Install PyTorch. If you have experience with PyTorch and have already installed it, you can skip to the next step.
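For example, a CUDA 12.1 build of PyTorch 2.4.1 can typically be installed with the command below (an illustrative example; adjust the versions and index URL to your setup, or follow the official PyTorch installation instructions):

```
pip install torch==2.4.1 torchvision==0.19.1 --index-url https://download.pytorch.org/whl/cu121
```

You can then check that PyTorch sees your GPU:

```
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```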
Step 3: Install other dependencies from requirements.txt
```
pip install -r requirements.txt
```

Please create a `data` folder in your working directory and put the CSOD10K dataset in it for training or testing. CSOD10K is divided into two parts, with 7,503 images for training and 2,497 images for testing, organized as follows (a minimal loading sketch is given after the tree):
```
data
└── CSOD10K
    ├── class_list.txt
    ├── train
    │   ├── image
    │   │   ├── 00001.jpg
    │   │   └── ...
    │   └── mask
    │       ├── 00001.png
    │       └── ...
    └── test
        ├── image
        │   ├── 00003.jpg
        │   └── ...
        └── mask
            ├── 00003.png
            └── ...
```
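If you want to sanity-check the layout before training, a minimal PyTorch dataset along the following lines should work. This is an illustrative sketch, not the repository's own data pipeline; the class name `CSOD10KDataset` and the joint `transform` callable are assumptions.

```python
import os

from PIL import Image
from torch.utils.data import Dataset


class CSOD10KDataset(Dataset):
    """Pairs each image with the mask that shares its stem (00001.jpg / 00001.png)."""

    def __init__(self, root, split="train", transform=None):
        self.image_dir = os.path.join(root, "CSOD10K", split, "image")
        self.mask_dir = os.path.join(root, "CSOD10K", split, "mask")
        self.names = sorted(os.path.splitext(f)[0] for f in os.listdir(self.image_dir))
        self.transform = transform

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        image = Image.open(os.path.join(self.image_dir, name + ".jpg")).convert("RGB")
        mask = Image.open(os.path.join(self.mask_dir, name + ".png")).convert("L")
        if self.transform is not None:
            image, mask = self.transform(image, mask)
        return image, mask


# e.g. test_set = CSOD10KDataset("data", split="test")  # 2,497 image/mask pairs
```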
You can get our CSOD10K dataset from Baidu Disk (pwd: 447k) or Google Drive.
Download the pretrained model(s) of the scale you need and save them in `./checkpoints`.
To train the model(s) in the paper, run this command:
```
bash ./scripts/train.sh
```

We also provide simple instructions if you want to train the base or tiny version of the model:

```
bash ./scripts/train_base.sh
```

or

```
bash ./scripts/train_tiny.sh
```

To test a model, run this command:

```
bash ./scripts/eval.sh
```

Pre-trained weights for CSSAM variants are available for download:
| Model | Params (M) | MAE ↓ | Sm ↑ | Fβ ↑ | Em ↑ | Download Link |
|---|---|---|---|---|---|---|
| CSSAM-T | 42.88 | 0.040 | 0.870 | 0.871 | 0.903 | Google Drive |
| CSSAM-B | 85.26 | 0.035 | 0.887 | 0.886 | 0.916 | Google Drive |
| CSSAM-L | 230.08 | 0.028 | 0.907 | 0.902 | 0.931 | Google Drive |
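The MAE column can be reproduced with a few lines of NumPy. This is a minimal sketch, not the repository's evaluation code; it assumes the prediction and the ground-truth mask are loaded as float arrays of the same shape, normalized to [0, 1].

```python
import numpy as np


def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground-truth mask.

    Both arrays must share a shape and hold values in [0, 1].
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.abs(pred - gt).mean())
```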
If you use CSOD in your research, please cite our paper:
```
@ARTICLE{11297835,
  author={Cong, Runmin and Chen, Zhiyang and Fang, Hao and Kwong, Sam and Zhang, Wei},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Breaking Barriers, Localizing Saliency: A Large-scale Benchmark and Baseline for Condition-Constrained Salient Object Detection},
  year={2025},
  volume={},
  number={},
  pages={1-18},
  keywords={Salient Object Detection;Constrained Conditions;Benchmark Dataset;Scene Prior;Hybrid Prompt},
  doi={10.1109/TPAMI.2025.3642893}}
```

This repository is implemented based on the Segment Anything Model. Thanks to the authors for their excellent work.