tongzhang111/PDTNet
Implementation

Dataset Preparation

The datasets are placed in the data folder with the following structure.

data
|_ vidstg
|  |_ videos
|  |  |_ [video name 0].mp4
|  |  |_ [video name 1].mp4
|  |  |_ ...
|  |_ vstg_annos
|  |  |_ train.json
|  |  |_ ...
|  |_ sent_annos
|  |  |_ train_annotations.json
|  |  |_ ...
|  |_ data_cache
|  |  |_ ...
|_ hc-stvg2
|  |_ v2_video
|  |  |_ [video name 0].mp4
|  |  |_ [video name 1].mp4
|  |  |_ ...
|  |_ annos
|  |  |_ hcstvg_v2
|  |  |  |_ train.json
|  |  |  |_ test.json
|  |_ data_cache
|  |  |_ ...
|_ hc-stvg
|  |_ v1_video
|  |  |_ [video name 0].mp4
|  |  |_ [video name 1].mp4
|  |  |_ ...
|  |_ annos
|  |  |_ hcstvg_v1
|  |  |  |_ train.json
|  |  |  |_ test.json
|  |_ data_cache
|  |  |_ ...
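
As a quick sanity check after downloading, the shell loop below (a minimal sketch; the paths are taken directly from the layout above) confirms that the expected folders exist:

# check that the dataset folders described above are in place
for d in data/vidstg/videos data/vidstg/vstg_annos data/vidstg/sent_annos data/vidstg/data_cache \
         data/hc-stvg2/v2_video data/hc-stvg2/annos/hcstvg_v2 data/hc-stvg2/data_cache \
         data/hc-stvg/v1_video data/hc-stvg/annos/hcstvg_v1 data/hc-stvg/data_cache; do
    [ -d "$d" ] && echo "ok:      $d" || echo "missing: $d"
done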

The download links for the files above are as follows:

hc-stvg: v1_video, annos, data_cache

hc-stvg2: v2_video, annos, data_cache

vidstg: videos, vstg_annos, sent_annos, data_cache

Model Preparation

The pretrained models are placed in the model_zoo folder:

ResNet-101, VidSwin-T, roberta-base
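
The training code is assumed to load these from local paths under model_zoo. As a quick check that the roberta-base files are complete, the directory should be loadable with Hugging Face transformers (the model_zoo/roberta-base path is an assumption based on the layout above):

# assumes roberta-base was downloaded into model_zoo/roberta-base
python3 -c "from transformers import RobertaTokenizerFast; \
tok = RobertaTokenizerFast.from_pretrained('model_zoo/roberta-base'); \
print(tok('a person in a red shirt')['input_ids'])"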

Requirements

The code has been tested with PyTorch 2.0.1 and CUDA 11.7; other recent versions are likely to work as well. To install the requirements, use the commands below:

pip3 install -r requirements.txt
apt install ffmpeg -y
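
To verify that PyTorch and CUDA are installed as expected, you can run:

# print the installed PyTorch/CUDA versions and whether a GPU is visible
python3 -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"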

Training

Please use the scripts below:

# run for HC-STVG
python3 -m torch.distributed.launch \
    --nproc_per_node=8 \
    scripts/train_net.py \
    --config-file "experiments/hcstvg.yaml" \
    INPUT.RESOLUTION 420 \
    OUTPUT_DIR output/hcstvg \
    TENSORBOARD_DIR output/hcstvg

# run for HC-STVG2
python3 -m torch.distributed.launch \
    --nproc_per_node=8 \
    scripts/train_net.py \
    --config-file "experiments/hcstvg2.yaml" \
    INPUT.RESOLUTION 420 \
    OUTPUT_DIR output/hcstvg2 \
    TENSORBOARD_DIR output/hcstvg2

# run for VidSTG
python3 -m torch.distributed.launch \
    --nproc_per_node=8 \
    scripts/train_net.py \
    --config-file "experiments/vidstg.yaml" \
    INPUT.RESOLUTION 420 \
    OUTPUT_DIR output/vidstg \
    TENSORBOARD_DIR output/vidstg

For additional training options, such as different hyper-parameters, adjust the configurations in experiments/hcstvg.yaml, experiments/hcstvg2.yaml, and experiments/vidstg.yaml as needed.
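
Note that torch.distributed.launch is deprecated in recent PyTorch releases in favor of torchrun. An equivalent invocation for HC-STVG is sketched below; it assumes scripts/train_net.py reads the local rank from the LOCAL_RANK environment variable rather than a --local_rank argument, which may not hold for this repo:

# torchrun sets LOCAL_RANK as an environment variable instead of passing --local_rank
torchrun --nproc_per_node=8 \
    scripts/train_net.py \
    --config-file "experiments/hcstvg.yaml" \
    INPUT.RESOLUTION 420 \
    OUTPUT_DIR output/hcstvg \
    TENSORBOARD_DIR output/hcstvg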

Evaluation

Please use the scripts below:

# run for HC-STVG
python3 -m torch.distributed.launch \
 --nproc_per_node=8 \
 scripts/test_net.py \
 --config-file "experiments/hcstvg.yaml" \
 INPUT.RESOLUTION 420 \
 MODEL.WEIGHT [Pretrained Model Weights] \
 OUTPUT_DIR output/hcstvg
 
# run for HC-STVG2
python3 -m torch.distributed.launch \
 --nproc_per_node=8 \
 scripts/test_net.py \
 --config-file "experiments/hcstvg2.yaml" \
 INPUT.RESOLUTION 420 \
 MODEL.WEIGHT [Pretrained Model Weights] \
 OUTPUT_DIR output/hcstvg2

# run for VidSTG
python3 -m torch.distributed.launch \
 --nproc_per_node=8 \
 scripts/test_net.py \
 --config-file "experiments/vidstg.yaml" \
 INPUT.RESOLUTION 420 \
 MODEL.WEIGHT [Pretrained Model Weights] \
 OUTPUT_DIR output/vidstg
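
For example, a complete HC-STVG evaluation command might look like the following, where model_final.pth is a hypothetical checkpoint filename; substitute the actual path of the downloaded weights:

# MODEL.WEIGHT points to a hypothetical checkpoint path; use your real one
python3 -m torch.distributed.launch \
 --nproc_per_node=8 \
 scripts/test_net.py \
 --config-file "experiments/hcstvg.yaml" \
 INPUT.RESOLUTION 420 \
 MODEL.WEIGHT output/hcstvg/model_final.pth \
 OUTPUT_DIR output/hcstvg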

Pretrained Model Weights

We provide our trained checkpoints so that the reported results can be reproduced.

Experiments

🎏 CG-STVG achieves state-of-the-art performance on three challenging benchmarks: HCSTVG-v1, HCSTVG-v2, and VidSTG. Note that the baseline is our CG-STVG without context generation and refinement.

Acknowledgement

This repo is partly based on the open-source release of STCAT, and the evaluation metric implementation is borrowed from TubeDETR for fair comparison.
