OCDL

Official implementation of the ICASSP 2025 paper "Object-Centric Discriminative Learning for Text-Based Person Retrieval".

Highlights

We propose a novel framework for text-based person retrieval (TBPR), Object-Centric Discriminative Learning (OCDL), which incorporates person masks to indicate attentive regions, thereby enhancing the model’s focus on the pedestrians in images while suppressing background noise. Additionally, a novel cross-modal matching loss, Soft Angular Distribution Matching (SADM), is introduced to learn discriminative visual and textual representations. Experiments on three widely used TBPR benchmarks demonstrate the effectiveness of our approach.

Usage

Requirements

We use a single NVIDIA A100 GPU for training and evaluation.

conda create -n ocdl_reid python=3.8
conda activate ocdl_reid
pip install torch==1.10.0+cu113 torchvision==0.11.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt

cd AlphaCLIP
pip install -e .

Prepare Datasets

We have uploaded the datasets to Google Drive. The upload includes the original CUHK-PEDES, ICFG-PEDES, and RSTPReid data, as well as the additional person mask data (saved under the path ...alphas/). Note that we have not open-sourced the code for generating person masks, as it is quite simple and many current open-source projects can produce even better masks. If you are interested, you can use Grounded-Segment-Anything as a starting point.

Unzip the files and organize them under your dataset root directory as follows:

|-- YOUR_DATA_ROOT
   |-- CUHK-PEDES
       |-- imgs
            |-- cam_a
            |-- cam_b
            |-- ...
       |-- alphas
            |-- cam_a
            |-- cam_b
            |-- ...
       |-- reid_raw.json
       
   |-- ICFG-PEDES
       |-- imgs
            |-- test
            |-- train
       |-- alphas
            |-- test
            |-- train
       |-- ICFG_PEDES.json

   |-- RSTPReid
       |-- imgs
       |-- alphas
       |-- data_captions.json
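
Before training, it can help to confirm that the unzipped data matches the tree above. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` and the `EXPECTED_LAYOUT` table are hypothetical, and the expected entries are simply copied from the layout shown above.

```python
from pathlib import Path

# Expected entries per dataset, copied from the directory tree above.
# This table is an illustration and is not read from the repository code.
EXPECTED_LAYOUT = {
    "CUHK-PEDES": ["imgs", "alphas", "reid_raw.json"],
    "ICFG-PEDES": [
        "imgs/test",
        "imgs/train",
        "alphas/test",
        "alphas/train",
        "ICFG_PEDES.json",
    ],
    "RSTPReid": ["imgs", "alphas", "data_captions.json"],
}


def missing_entries(data_root, dataset_name):
    """Return the expected paths that do not exist under data_root/dataset_name."""
    base = Path(data_root) / dataset_name
    return [
        str(base / rel)
        for rel in EXPECTED_LAYOUT[dataset_name]
        if not (base / rel).exists()
    ]


def report(data_root):
    """Print one status line per dataset."""
    for name in EXPECTED_LAYOUT:
        missing = missing_entries(data_root, name)
        status = "OK" if not missing else "missing " + ", ".join(missing)
        print(f"{name}: {status}")
```

Calling `report("data")` prints one line per dataset and lists any directory or annotation file that is absent. The check only looks at top-level entries; it does not verify that every image has a corresponding mask.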

Pretrained Weights

Download the model weights from the links below and place the downloaded files in the pretrain/ directory (e.g. pretrain/clip_b16_grit+mim_fultune_4xe.pth), or specify the weights directory using the --alpha_ckpt parameter. Note that you can select the architecture by specifying --pretrain_choice (e.g. --pretrain_choice ViT-B/16 for AlphaCLIP-B/16).

Model	Google Drive link	OpenXLab link
AlphaCLIP-B/16	clip_b16_grit1m+mim_fultune_4xe	clip_b16_grit1m+mim_fultune_4xe
AlphaCLIP-L/14	clip_l14_grit1m+mim_fultune_6xe	clip_l14_grit1m+mim_fultune_6xe
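
A missing checkpoint is easier to diagnose before a training run starts than after. The sketch below is an illustration only: `resolve_checkpoint` is a hypothetical helper, not a function from this repository, and it assumes the weights were saved under pretrain/ as described above.

```python
from pathlib import Path


def resolve_checkpoint(filename, pretrain_dir="pretrain"):
    """Return the checkpoint path, or raise if the file is not there yet."""
    path = Path(pretrain_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found: download the weights first, "
            "or pass a different location via --alpha_ckpt"
        )
    return path
```

For example, `resolve_checkpoint("clip_b16_grit+mim_fultune_4xe.pth")` returns the path if the file is in pretrain/ and raises `FileNotFoundError` otherwise.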

Training

Set YOUR_DATA_ROOT to your own path, choose a dataset, and start training your TBPR model.

# Training on text-based person retrieval benchmarks
YOUR_DATA_ROOT="data"
DATASET_NAME="CUHK-PEDES"  # or "ICFG-PEDES" or "RSTPReid"

CUDA_VISIBLE_DEVICES=0 \
python train_ocdl.py \
--root_dir $YOUR_DATA_ROOT \
--name OCDL \
--batch_size 128 \
--dataset_name $DATASET_NAME \
--loss_names 'sadm+id' \
--img_aug \
--lr 1e-5 \
--num_epoch 60 \
--pretrain_choice 'ViT-B/16' \
--sampler 'identity' \
--num_cls 4
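
To train on all three benchmarks in turn, the command above can be assembled programmatically. This is a rough sketch under the assumption that the same flags are used for every dataset; `build_train_command` and `print_commands` are hypothetical helpers, not part of the repository.

```python
import shlex

DATASETS = ["CUHK-PEDES", "ICFG-PEDES", "RSTPReid"]


def build_train_command(data_root, dataset_name):
    """Assemble the train_ocdl.py invocation using the flags listed above."""
    if dataset_name not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset_name!r}")
    return [
        "python", "train_ocdl.py",
        "--root_dir", data_root,
        "--name", "OCDL",
        "--batch_size", "128",
        "--dataset_name", dataset_name,
        "--loss_names", "sadm+id",
        "--img_aug",
        "--lr", "1e-5",
        "--num_epoch", "60",
        "--pretrain_choice", "ViT-B/16",
        "--sampler", "identity",
        "--num_cls", "4",
    ]


def print_commands(data_root="data"):
    """Print one shell-quoted training command per dataset."""
    for name in DATASETS:
        print(shlex.join(build_train_command(data_root, name)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`, with `CUDA_VISIBLE_DEVICES` set in the environment, to launch the runs one after another.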

Acknowledgments

Some components of this implementation are adapted from CLIP, IRRA, and AlphaCLIP. We sincerely appreciate their contributions.

Citation

If you find our work useful for your research, please cite our paper.

@inproceedings{li2025object,
  title={Object-Centric Discriminative Learning for Text-Based Person Retrieval},
  author={Li, Haiwen and Liu, Delong and Su, Fei and Zhao, Zhicheng},
  booktitle={ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={1--5},
  year={2025},
  organization={IEEE}
}

Contact

If you have any questions, please contact us by e-mail: lihaiwen@bupt.edu.cn, liudelong@bupt.edu.cn.
