This is the PyTorch Implementation of EARN

Introduction

This repository is the PyTorch implementation of Entity-enhanced Adaptive Reconstruction Network (EARN) for Weakly Supervised Referring Expression Grounding.

Prerequisites

  • Python 3.5
  • PyTorch 0.4.1
  • CUDA 8.0

Installation

  1. Please refer to MattNet to install mask-faster-rcnn, REFER and refer-parser2. Follow Steps 1 & 2 of its Training instructions to prepare the data and features.

  2. Calculate semantic similarity as supervision information.

  • Download the GloVe word embeddings to cache/word_embedding.

  • Generate the semantic similarity and word embedding files.

python tools/prepro_sub_obj_wds.py --dataset ${DATASET} --splitBy ${SPLITBY}
python tools/prepro_sim.py --dataset ${DATASET} --splitBy ${SPLITBY}
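
For example, for RefCOCO with the UNC split (the dataset and split names follow the REFER conventions; refcoco+/unc and refcocog/umd work the same way):

python tools/prepro_sub_obj_wds.py --dataset refcoco --splitBy unc
python tools/prepro_sim.py --dataset refcoco --splitBy unc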

Training

Train EARN with ground-truth annotation:

sh train.sh

Evaluation

Evaluate EARN with ground-truth annotation:

sh eval.sh

Evaluation for Complex Relational Reasoning of Referring Expression Grounding

We gather referring expressions with higher-order or multi-entity relationships (selected mainly by the length of the referring expression and the number of entities) from the original RefCOCO, RefCOCO+ and RefCOCOg validation and test sets to evaluate the ability of models to reason about complex relationships. You can download the validation set into cache/prepro/.
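
To make the selection criterion concrete, here is a minimal sketch; the thresholds (MIN_TOKENS, MIN_ENTITIES) and the vocabulary-based entity counter are illustrative assumptions, not the repository's actual rule.

```python
# Hypothetical filter for "complex" referring expressions: keep expressions
# that are long and/or mention several entities. Thresholds are assumptions.
MIN_TOKENS = 7
MIN_ENTITIES = 2

def count_entities(tokens, noun_vocab):
    # Crude entity count: number of tokens that appear in a noun vocabulary.
    return sum(1 for t in tokens if t in noun_vocab)

def is_complex(expression, noun_vocab):
    tokens = expression.lower().split()
    return len(tokens) >= MIN_TOKENS or count_entities(tokens, noun_vocab) >= MIN_ENTITIES

# Example: a multi-entity expression with a higher-order relationship.
print(is_complex("the man to the left of the woman in red", {"man", "woman"}))  # True
```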

  1. Examples.

    Examples of referring expressions with higher-order or multi-entity relationships can be seen in visualization.ipynb.

  2. Our performance.

    Here we show the number (num) and percentage (ratio) of expressions with complex relationships in the original validation and test sets, together with the accuracy (IoU > 0.5) comparison between max-context pooling (mcxtp) and soft-context pooling (scxtp). RefCOCOg has longer queries, so its proportion of complex-relationship cases is much higher. The results show that soft-context pooling performs better on complex relational reasoning.

|       | RefCOCO | RefCOCO+ | RefCOCOg |
|-------|---------|----------|----------|
| num   | 653     | 637      | 4233     |
| ratio | ~3%     | ~3%      | ~44%     |
| mcxtp | 17.46%  | 20.88%   | 43.11%   |
| scxtp | 21.75%  | 21.66%   | 46.47%   |
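
For reference, a prediction counts as correct when the IoU between the predicted and ground-truth boxes exceeds 0.5. A minimal sketch of that check (box format [x1, y1, x2, y2]; the helper names are ours):

```python
def iou(box_a, box_b):
    # Intersection-over-union of two boxes in [x1, y1, x2, y2] format.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def is_correct(pred_box, gt_box, thresh=0.5):
    # Standard grounding accuracy criterion: IoU above the threshold.
    return iou(pred_box, gt_box) > thresh
```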
