[Project] Person and Car Re-Identification (ReID)

🎯 Project Introduction

This project implements person and vehicle Re-Identification (ReID), covering training, testing, and online inference. It provides:

  • Dataset processing scripts, model code, pre-trained models, and evaluation tools.
  • A complete pipeline for environment setup, data preparation (for both person and vehicle datasets), training, and testing.
  • Support for multiple person ReID benchmarks (CUHK, Market, ICFG-PEDES, RSTP-Reid).
  • Vehicle ReID datasets from real-road videos, including object extraction, data cleaning, and dataset splitting.

📣 Updates

  • [2025, 08]: We have released the complete ReID training and testing code.

🚀 Quick Start

Setup

Download files

git clone https://github.com/OPA067/ReID
cd ReID

Setup code environment

conda create -n ReID python=3.12
conda activate ReID
pip install -r requirements.txt
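# Note: torch 1.8.1 wheels are published for Python 3.6 to 3.9 only, so this pinned
# install may fail under Python 3.12; adjust the Python or torch version if it does.
pip install torch==1.8.1+cu102 torchvision==0.9.1+cu102 -f https://download.pytorch.org/whl/torch_stable.html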

Prepare person and car datasets

  • 📂 reid_datasets
    • person_images
      • cam_a
      • cam_b
      • CUHK01
      • CUHK03
      • Market
      • ICFG-PEDES-train
      • ICFG-PEDES-test
      • train_query
      • test_query
      • RSTP-Reid
    • person_jsons
      • test_query.json
        • {"id": 0, "tar_path": "test_query/p3810/p3810_s4782.jpg", "can_path": "test_query/p3810/p3810_s4781.jpg"},
        • {"id": 1, "tar_path": "test_query/p7463/p7463_s9933.jpg", "can_path": "test_query/p7463/p7463_s9932.jpg"},
        • ...
      • train_cam_a_b.json
      • train_CUHK01.json
      • ...

Here cam_a, cam_b, CUHK01, CUHK03, Market, train_query, and test_query come from CUHK-PEDES; RSTP-Reid comes from RSTP-Reid; and ICFG-PEDES-train and ICFG-PEDES-test come from ICFG-PEDES. Each .json file stores image paths relative to the image root, e.g., "cam_a/010_0.bmp". Generate the .json files with:

python ./reid_datasets/create_person_reid_json.py
python ./reid_datasets/create_car_reid_json.py
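The record format shown for test_query.json can be produced along the following lines. This is a minimal sketch and not the repository's create_person_reid_json.py: the function name build_pair_entries, the "<split>/<identity>/<file>" layout, and the rule of pairing consecutive images of one identity are all assumptions made for illustration.

```python
import json
from collections import defaultdict
from pathlib import PurePosixPath


def build_pair_entries(relative_paths):
    """Pair images of one identity into {"id", "tar_path", "can_path"} records.

    Assumes paths look like "<split>/<identity>/<file>", as in the
    test_query.json example; other subsets may be laid out differently.
    """
    by_identity = defaultdict(list)
    for path in sorted(relative_paths):
        identity = PurePosixPath(path).parent.name  # e.g. "p3810"
        by_identity[identity].append(path)

    entries = []
    for identity in sorted(by_identity):
        images = by_identity[identity]
        # Hypothetical pairing rule: each image is the target of the image before it.
        for candidate, target in zip(images, images[1:]):
            entries.append(
                {"id": len(entries), "tar_path": target, "can_path": candidate}
            )
    return entries


paths = [
    "test_query/p3810/p3810_s4781.jpg",
    "test_query/p3810/p3810_s4782.jpg",
    "test_query/p7463/p7463_s9932.jpg",
    "test_query/p7463/p7463_s9933.jpg",
]
entries = build_pair_entries(paths)
print(json.dumps(entries, indent=2))
```

With these four paths the sketch yields the two records listed under test_query.json above.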

For vehicle ReID, we do not use public datasets. Instead, we collect real-road videos and perform object extraction, data cleaning, and train-test splitting ourselves. The directory structure is as follows:

  • 📂 reid_datasets
    • car_images
      • train_video_1
      • train_video_2
      • train_video_3
      • ...
      • test_video_1
      • test_video_2
      • test_video_3
      • ...
    • car_jsons
      • train_video_1.json
      • ...
      • test_video_1.json
      • ...

The data organization is identical to that of the person dataset. Note that train_video_1 and test_video_1 are different videos.

Download Pretrain Models

| Models | Download Link | Functions | Features |
|---|---|---|---|
| coco.txt | Download | For coco class_list | - |
| yolov8l | Download | For target extraction | - |
| yolov11 | Download | For target tracking | - |
| RN50 | Download | For lightweight feature extraction | [CLS], $I \in \mathcal{R}^{1 \times D}$ |
| RN101 | Download | For lightweight feature extraction | [CLS], $I \in \mathcal{R}^{1 \times D}$ |
| RN50x4 | Download | For intensive feature extraction | [CLS], $I \in \mathcal{R}^{1 \times D}$ |
| RN50x64 | Download | For intensive feature extraction | [CLS], $I \in \mathcal{R}^{1 \times D}$ |
| ViT-B/32 | Download | For intensive feature extraction | [CLS] and [Patch], $I \in \mathcal{R}^{1 \times D}$ and $P \in \mathcal{R}^{N \times D}$ |
| ViT-B/16 | Download | For intensive feature extraction | [CLS] and [Patch], $I \in \mathcal{R}^{1 \times D}$ and $P \in \mathcal{R}^{N \times D}$ |
| ViT-L/14 | Download | For intensive feature extraction | [CLS] and [Patch], $I \in \mathcal{R}^{1 \times D}$ and $P \in \mathcal{R}^{N \times D}$ |

Place the models you need in the ./pretrain_models directory.

Train ReID model

#!/bin/bash
root_dir={YOUR WORKING ROOT DIR}
DATASET_NAME=RSTP-Reid
CUDA_VISIBLE_DEVICES=0 \
    python retrieval_train.py \
    --name RDE \
    --img_aug \
    --txt_aug \
    --batch_size 32 \
    --root_dir $root_dir \
    --output_dir experiments \
    --dataset_name $DATASET_NAME \
    --loss_names ReID  \
    --pretrain_choice {PRETRAIN MODEL} \
    --log_period 100 \
    --num_epoch 20

or you can run:

bash reid_train.sh

Test ReID model

python reid_test.py

✨ Vision

To visualize the retrieval process in real time without pre-storing all candidate images, we integrate detection and retrieval into one pipeline that directly shows how the target image matches each candidate image. Run it with:

python reid_online.py
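One way to rank candidates as detections arrive, without keeping every candidate image, is a bounded top-k buffer. The sketch below is an assumption about how such a step could look and is not taken from reid_online.py; the class name TopKMatches, the candidate ids, and the scores are made-up illustrative values.

```python
import heapq


class TopKMatches:
    """Keep only the k best-scoring candidates seen so far."""

    def __init__(self, k):
        self.k = k
        self._heap = []  # min-heap of (score, candidate_id); weakest match on top

    def update(self, candidate_id, score):
        item = (score, candidate_id)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, item)
        elif item > self._heap[0]:
            # New candidate beats the weakest stored match, so swap it in.
            heapq.heapreplace(self._heap, item)

    def ranked(self):
        """Return (candidate_id, score) pairs, best match first."""
        return [(cid, score) for score, cid in sorted(self._heap, reverse=True)]


# Illustrative stream of (detection id, similarity to the target image).
matches = TopKMatches(k=2)
for candidate_id, score in [("det_0", 0.31), ("det_1", 0.87), ("det_2", 0.12), ("det_3", 0.64)]:
    matches.update(candidate_id, score)
print(matches.ranked())
```

Memory stays proportional to k rather than to the number of detections, which matches the goal of not pre-storing all candidates.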

💪 Feature Enhancement

Feature enhancement includes single-feature re-representation and multi-feature aggregation. Single-feature re-representation involves remapping the [CLS] token using methods such as MLP or Transformer, while multi-feature aggregation aggregates [Patch] tokens through learnable modules like MHA or Cluster. For these two types of feature enhancement schemes, the following feature alignment methods are proposed:

1. Zero-shot [CLS]

$S=sim({I}_1, {I}_2) = \frac{{I}_1 \cdot {I}_2}{||{I}_1||_2 \cdot ||{I}_2||_2}$

# update model/clip_model.py
def forward(self, tar_images, can_images):
    with torch.no_grad():
        tar_feats = self.encode_image(tar_images)
        can_feats = self.encode_image(can_images)

where $I$=[CLS]$\in \mathcal{R}^{1 \times D}$.
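The similarity $S$ can be computed from the extracted features as sketched below. This is a dependency-free illustration under stated assumptions: the helper names cosine_similarity and similarity_matrix are hypothetical, features are plain lists of floats rather than tensors, and no feature is the zero vector.

```python
import math


def cosine_similarity(a, b):
    """S = (a . b) / (||a||_2 * ||b||_2) for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(tar_feats, can_feats):
    """Score every target [CLS] feature against every candidate [CLS] feature."""
    return [[cosine_similarity(t, c) for c in can_feats] for t in tar_feats]


# Toy 2-dimensional features standing in for D-dimensional [CLS] tokens.
tar_feats = [[1.0, 0.0], [0.0, 2.0]]
can_feats = [[3.0, 0.0], [1.0, 1.0]]
scores = similarity_matrix(tar_feats, can_feats)
for row in scores:
    print([round(s, 4) for s in row])
```

Because the score depends only on direction, scaling a feature (for example [1, 0] versus [3, 0]) does not change its similarity.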

2. Fine-tuning [CLS]

$S=sim({I}_1, {I}_2) = \frac{{I}_1 \cdot {I}_2}{||{I}_1||_2 \cdot ||{I}_2||_2}$

# update model/clip_model.py
def forward(self, tar_images, can_images):
    tar_feats = self.encode_image(tar_images)
    can_feats = self.encode_image(can_images)

3. Zero-shot [CLS] + [Patch]

$S = \frac{1}{2} \left( sim({I}_1, {I}_2) + sim({P}_1, {P}_2) \right ) = \frac{1}{2} \left( \frac{{I}_1 \cdot {I}_2}{||{I}_1||_2 \cdot ||{I}_2||_2} + \frac{{P}_1 \cdot {P}_2}{||{P}_1||_2 \cdot ||{P}_2||_2} \right)$,
where $I$=[CLS]$\in \mathcal{R}^{1 \times D}$ and $P$=[Patch]$=\frac{1}{N}\sum_{i=1}^{N} P_i\in \mathcal{R}^{1 \times D}$ is the mean of the $N$ patch tokens.
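A minimal sketch of this combined score, assuming features are plain lists of floats and using hypothetical helper names (cosine_similarity, mean_pool, combined_score) that are not taken from the repository:

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mean_pool(patch_tokens):
    """Average N patch tokens of dimension D into one D-dimensional vector."""
    n = len(patch_tokens)
    return [sum(column) / n for column in zip(*patch_tokens)]


def combined_score(cls_1, patches_1, cls_2, patches_2):
    """S = 1/2 * (sim(I_1, I_2) + sim(P_1, P_2)) with mean-pooled patch tokens."""
    s_cls = cosine_similarity(cls_1, cls_2)
    s_patch = cosine_similarity(mean_pool(patches_1), mean_pool(patches_2))
    return 0.5 * (s_cls + s_patch)


# Toy example: N = 2 patch tokens, D = 2.
patches_a = [[1.0, 0.0], [0.0, 1.0]]
patches_b = [[2.0, 0.0], [0.0, 2.0]]
same_cls = combined_score([1.0, 0.0], patches_a, [1.0, 0.0], patches_b)
diff_cls = combined_score([1.0, 0.0], patches_a, [0.0, 1.0], patches_b)
print(round(same_cls, 4), round(diff_cls, 4))
```

In the second call the [CLS] tokens are orthogonal while the pooled patches still agree, so the patch term alone contributes to the score.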

4. Fine-tuning [CLS] + [Patch]

$S = \frac{1}{2} \left( sim({I}_1, {I}_2) + sim({P}_1, {P}_2) \right ) = \frac{1}{2} \left( \frac{{I}_1 \cdot {I}_2}{||{I}_1||_2 \cdot ||{I}_2||_2} + \frac{{P}_1 \cdot {P}_2}{||{P}_1||_2 \cdot ||{P}_2||_2} \right)$,
where $I$=[CLS]$\in \mathcal{R}^{1 \times D}$ and $P$=$Model$([Patch])$\in \mathcal{R}^{1 \times D}$ aggregates the $N$ patch tokens into a single vector. $Model$ can be an MLP, MHA, or PTM (Patch Token Merge).
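To make the role of $Model$ concrete, the sketch below aggregates patch tokens with a single attention query. It is a simplified stand-in and not the repository's MLP, MHA, or PTM module: it has no learnable projections and one head, and the function name attention_pool and the choice of query are assumptions for illustration.

```python
import math


def attention_pool(query, patch_tokens):
    """Aggregate N patch tokens into one vector using softmax attention weights.

    The weights come from scaled dot products between `query` and each token;
    a trained module would add learned projections around this step.
    """
    d = len(query)
    logits = [
        sum(q * p for q, p in zip(query, token)) / math.sqrt(d)
        for token in patch_tokens
    ]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [
        sum(w * token[j] for w, token in zip(weights, patch_tokens))
        for j in range(d)
    ]
    return pooled, weights


patches = [[1.0, 0.0], [0.0, 1.0]]
uniform_pooled, uniform_weights = attention_pool([0.0, 0.0], patches)
focused_pooled, focused_weights = attention_pool([10.0, 0.0], patches)
print(uniform_weights, [round(w, 4) for w in focused_weights])
```

A zero query gives uniform weights and reduces to the mean pooling of the zero-shot variant, while a query aligned with one token shifts most of the weight onto it.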

📄 Experiment Reports

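Person Re-Identification P2P Technical Report 1 (Initial Version) update 2025, 04.
Person Re-Identification P2P Technical Report 2 (Optimization) update 2025, 04.
Person Re-Identification P2P Technical Report 3 (Supplement) update 2025, 05.
PTM of ReID: Patch Token Merge update 2025, 05.
Complete Person Retrieval Solution update 2025, 06.
Vehicle Re-Retrieval Research Report update 2025, 07.
Any-Object Re-Retrieval Research Report update 2025, 07.
Lightweight and Intensive Feature Extractor Experiment Report update 2025, 08.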

📌 Hint

If you encounter any issues, feel free to contact 223081200014@smail.swufe.edu.cn

🎗️ Acknowledgments

Our code is based on CVPR2024RDE and CVPR2024HBI. We sincerely appreciate their contributions.
