GitHub - idiap/geomgaze: ChildPlay: A New Benchmark for Understanding Children's Gaze Behaviour; code and checkpoints

Overview

ChildPlay: A New Benchmark for Understanding Children's Gaze Behaviour
Samy Tafasca *, Anshul Gupta *, Jean-Marc Odobez (* equal contribution)
ICCV 2023
[Paper] [Video] [Dataset]

This repository provides the official code and checkpoints for the GeomGaze model, as introduced in our ICCV paper, ChildPlay: A New Benchmark for Understanding Children's Gaze Behaviour. It also includes annotations and scripts for our new semantic metric, LAH, which evaluates gaze prediction performance when the gaze target is a person's head.

The GeomGaze model constructs a geometrically consistent point cloud of the scene. This point cloud is matched with a predicted 3D gaze vector to compute the 3D Field-of-View (3DFoV), highlighting visible regions in 3D. The 3DFoV is then combined with the scene image to predict the final gaze target.
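
As a rough illustration of the geometry described above, the following is a minimal NumPy sketch, not the GeomGaze implementation. It assumes a pinhole camera with the principal point at the image centre, back-projects a depth map into a point cloud, and scores each pixel by the cosine similarity between a 3D gaze vector and the direction from the eye to that pixel's 3D point. The function names, the cosine scoring, and the clamping of negative values to zero are all assumptions made for illustration.

```python
import numpy as np


def backproject(depth, focal_length):
    """Lift an (H, W) depth map to an (H, W, 3) point cloud (pinhole model).

    Assumes the principal point is at the image centre (an assumption).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - (w - 1) / 2.0) * depth / focal_length
    y = (v - (h - 1) / 2.0) * depth / focal_length
    return np.stack([x, y, depth], axis=-1)


def fov_map(points, eye_xy, gaze_vector):
    """Cosine similarity between the gaze vector and eye-to-point directions.

    eye_xy is the (column, row) pixel of the eye; negatives are clamped to 0.
    """
    eye = points[eye_xy[1], eye_xy[0]]
    directions = points - eye
    norms = np.linalg.norm(directions, axis=-1, keepdims=True)
    directions = directions / np.maximum(norms, 1e-8)
    gaze = np.asarray(gaze_vector, dtype=float)
    gaze = gaze / np.linalg.norm(gaze)
    return np.clip(directions @ gaze, 0.0, 1.0)


# Toy example: a flat wall 2 m from the camera, person looking to the right.
depth = np.full((5, 5), 2.0)
points = backproject(depth, focal_length=5.0)
fov = fov_map(points, eye_xy=(0, 2), gaze_vector=(1.0, 0.0, 0.0))
```

In this toy scene the pixels to the right of the eye on the same row receive the highest score, while pixels directly above or below the eye score zero. In the actual model, the resulting map is combined with the scene image by a learned network.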

Setup

Download Data

Download the required datasets:

GazeFollow extended: [Download]
VideoAttentionTarget: [Download]
ChildPlay: [Download]

Update the dataset paths (*_data) in config.py accordingly.

Additionally, download our processed data: [Download]

Validation labels: found in labels/ after extraction from the download.
- Update *_train_label, *_val_label, and *_test_label in the config.
Fixed image cropping parameters: found in val_crop_params/ after extraction from the download.
- Update *_val_crop_params in the config.

Extract Depth and Focal Length

Extract Depth Maps:
- Use the SamsungLabs depth estimation model with domain=depth.
- We use the b5_lrn4 model.
Extract Focal Length:
- Use the AdelaiDepth model with ResNeXt101 backbone.
- Save focal lengths as separate .txt files per image.
- We provide a modified inference script at utils/test_shape.py.
- Alternatively, approximate the focal length as the length of the image's longest side in pixels (this may reduce performance).
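
The longest-side approximation in the last bullet can be sketched as follows. This is a minimal example, assuming one plain-text file per image that contains a single number. The helper name write_approx_focal_length, the example file names, and the exact file layout are hypothetical, so compare against the output of utils/test_shape.py before relying on it.

```python
import tempfile
from pathlib import Path


def write_approx_focal_length(width, height, out_path):
    """Write max(width, height) as the focal length (in pixels) to a .txt file."""
    focal_length = float(max(width, height))
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(f"{focal_length}\n")
    return focal_length


# Demo in a temporary directory; the file name is a made-up example.
with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "focal_length" / "img_0001.txt"
    value = write_approx_focal_length(640, 480, out_file)
    saved = out_file.read_text().strip()
```

The image size can be read with any image library, for example Pillow's Image.open(path).size, which returns (width, height).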

Ensure the extracted depth maps and focal lengths mirror the directory structure of the corresponding dataset, then update *_depth and *_focal_length in the config.
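
Because several config entries must point at local paths, a quick sanity check before training can save a failed run. The sketch below assumes config.py exposes its paths as module-level string variables whose names end in the suffixes mentioned above. That layout, the helper find_missing_paths, and the example variable names are assumptions, not part of the repository.

```python
import os
from types import SimpleNamespace

PATH_SUFFIXES = ("_data", "_label", "_val_crop_params", "_depth", "_focal_length")


def find_missing_paths(config, suffixes=PATH_SUFFIXES):
    """Return names of string settings with a matching suffix whose path does not exist."""
    missing = []
    for name, value in vars(config).items():
        is_path_setting = isinstance(value, str) and name.endswith(suffixes)
        if is_path_setting and not os.path.exists(value):
            missing.append(name)
    return sorted(missing)


# Stand-in for `import config`; the variable names here are hypothetical.
example_config = SimpleNamespace(
    gazefollow_data=os.getcwd(),                     # exists
    childplay_depth="/nonexistent/childplay/depth",  # does not exist
    batch_size=48,                                   # ignored: not a path setting
)
missing = find_missing_paths(example_config)
```

With the real module, the same function could be called as find_missing_paths(config) after import config, since vars() also works on modules.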

Setup Conda Environment

We use PyTorch for our experiments. Install dependencies using:

conda env create -f environment.yml

Training

Train on GazeFollow

python train.py --dataset GazeFollow

Train on VideoAttentionTarget

python train.py --dataset VideoAtt --init_weights <path>

Provide initial weights from training on GazeFollow using --init_weights.

Train on ChildPlay

python train.py --dataset ChildPlay --init_weights <path>

Provide initial weights from training on GazeFollow using --init_weights.
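
The two-stage schedule above (train on GazeFollow, then fine-tune on VideoAttentionTarget or ChildPlay from those weights) can be scripted. The sketch below only assembles the command lines shown in this section. The helper name and the checkpoint path are placeholders.

```python
import sys


def build_train_command(dataset, init_weights=None):
    """Assemble the train.py invocation for one dataset as an argv list."""
    if dataset not in ("GazeFollow", "VideoAtt", "ChildPlay"):
        raise ValueError(f"unknown dataset: {dataset}")
    command = [sys.executable, "train.py", "--dataset", dataset]
    if init_weights is not None:
        command += ["--init_weights", str(init_weights)]
    return command


# Placeholder path: replace with the checkpoint produced by GazeFollow training.
gazefollow_ckpt = "path/to/gazefollow_checkpoint.pt"
commands = [
    build_train_command("GazeFollow"),
    build_train_command("VideoAtt", gazefollow_ckpt),
    build_train_command("ChildPlay", gazefollow_ckpt),
]
```

Each list can be passed to subprocess.run(command, check=True) from the repository root, so that a failed stage stops the sequence.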

Testing

AUC, Distance, In-Out AP Metrics

Test on GazeFollow

python test_on_gazefollow.py --orig_ar --model_weights <path> --csv_path <csv_path>

Provide the model weights using --model_weights and the output path for predictions using --csv_path.

Test on VideoAttentionTarget/ChildPlay

python eval_on_vat_childplay.py --orig_ar --model_weights <path> --dataset <dataset> --csv_path <csv_path>

Specify the dataset (ChildPlay or VideoAtt) using --dataset.
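
For readers unfamiliar with the metrics named in this section, the sketch below shows one common way to compute two of them: the L2 distance between predicted and ground-truth gaze points in normalized image coordinates, and average precision for the in-frame vs. out-of-frame classification. It is an illustration under those assumptions, not a copy of the evaluation scripts, which may differ in details (for example, in how multiple annotations per image are aggregated).

```python
import math


def l2_distance(pred_xy, gt_xy):
    """Euclidean distance between two points given in normalized [0, 1] coordinates."""
    return math.hypot(pred_xy[0] - gt_xy[0], pred_xy[1] - gt_xy[1])


def average_precision(scores, labels):
    """Non-interpolated AP: mean of the precision at the rank of each positive sample."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precisions = 0, []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Toy values: one gaze prediction, and four in/out scores with binary labels.
dist = l2_distance((0.5, 0.5), (0.8, 0.9))
ap = average_precision([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Lower distance is better, while higher AP is better.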

LAH Metric

Download our annotations: they are in LAH_annotations/ after extracting the download.
Update bbox_path and gt_path in compute_lah.py. The data_path stays as set in config.py.
Also set dataset, subset (ChildPlay only), and pred_path (the path to the predictions CSV).
Compute the LAH scores:

python compute_lah.py

Pre-trained Models

Our checkpoints are available under the same download link as our processed data: [Download]

Model	Filename
Human-centric module (update `human_centric_weights` in config)	human_centric.pt
GazeFollow pre-trained	geomgaze_gazefollow.pt
VideoAttentionTarget pre-trained	geomgaze_vat.pt
ChildPlay pre-trained	geomgaze_childplay.pt

Citation

If you use our code, please cite:

@InProceedings{Tafasca_2023_ICCV,
    author    = {Tafasca*, Samy and Gupta*, Anshul and Odobez, Jean-Marc},
    title     = {ChildPlay: A New Benchmark for Understanding Children's Gaze Behaviour},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {20935-20946},
    note      = {* Equal contribution}
}

References

This code is adapted from our previous work:

idiap/multimodal_gaze_target_prediction
- This work, in turn, leverages code from ejcgt/attention-target-detection.

We thank the authors for their contributions.

