ETOG

Introduction

This is the official code base for "A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping", accepted at ICRA 2025. The project is split across three separate GitHub repositories, one each for the ETOG, ETRG-A, and ETRG-B models. Stay tuned for code release. Here is our Project Page.

This repository contains the ETOG model, which is designed for parameter-efficient tuning on the Referring Expression Segmentation (RES) task.

Implementations for RGS and RGA Robotics Tasks

The ETRG-A model, designed for the Referring Grasp Synthesis (RGS) task, can be found here.
The ETRG-B model, designed for the Referring Grasp Affordance (RGA) task, can be found here.

Preparation

Conda env: We used PyTorch (2.1.0+cu118); the other packages are listed in requirement.txt
Refcoco-related datasets
- Detailed instructions are in prepare_datasets.md
- After preparation, the folder layout should look like this:

$ETOG
├── config
├── model
├── engine
├── pretrain (manually download from CLIP -> R50, R101, ViT-B-16)
├── tools
│     ├── data_process.py
│     └── ...
├── ...
└── datasets
    ├── anns
    ├── lmdb
    │   ├── refcoco  
    │   ├── refcoco+
    │   ├── refcocog
    │   └── ...
    ├── masks
    │   ├── refcoco  
    │   ├── refcoco+
    │   ├── refcocog
    │   └── ...
    └── images
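
Before launching training, it can help to confirm that the data folders from the tree above are in place. The following is a minimal sketch, not part of the repository: `EXPECTED_DIRS` and `missing_dirs` are hypothetical names, and the list only covers the `pretrain` and `datasets` entries that the tree spells out (the `...` entries are left alone).

```python
from pathlib import Path

# Sub-paths copied from the folder tree above (hypothetical helper list).
# Entries shown as "..." in the tree are intentionally not guessed.
EXPECTED_DIRS = [
    "pretrain",
    "datasets/anns",
    "datasets/lmdb/refcoco",
    "datasets/lmdb/refcoco+",
    "datasets/lmdb/refcocog",
    "datasets/masks/refcoco",
    "datasets/masks/refcoco+",
    "datasets/masks/refcocog",
    "datasets/images",
]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dirs(".")` from the repository root returns an empty list when every listed folder exists, and otherwise the relative paths that still need to be created or downloaded.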

Pretrained model weights and training/testing logs

Performance (mIoU) on Refcoco dataset:

Backbone	val	test A	test B	Weights	Train log	Test log
CLIP-R50	72.31	75.49	66.62	models	log	log
CLIP-R101	73.37	76.16	68.54	models	log	log
CLIP-ViT-B	73.37	76.90	69.34	models	log	log

We release all Refcoco-related pretrained weights reported in our paper.

More training/testing logs and model weights for the Refcoco+ and Refcocog benchmarks are available here on our Google Drive.
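
For readers unfamiliar with the metric in the table above, here is a small illustration of how a mean IoU over a split can be computed. It assumes mIoU means the per-sample mask IoU averaged over all samples and scaled to a percentage, which is a common convention for RES; the repository's own evaluation code remains the authoritative definition. The function names `iou` and `mean_iou` are hypothetical, and masks are plain nested lists of 0/1 to keep the sketch dependency-free.

```python
def iou(pred, target):
    """IoU between two binary masks given as equally sized nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumption: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(preds, targets):
    """Average the per-sample IoU over a list of masks, as a percentage."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return 100.0 * sum(scores) / len(scores)


# Two toy samples: the first overlaps on 1 of 2 pixels, the second matches exactly.
preds = [[[1, 1], [0, 0]], [[1, 0], [0, 1]]]
targets = [[[1, 0], [0, 0]], [[1, 0], [0, 1]]]
print(mean_iou(preds, targets))  # (0.5 + 1.0) / 2 * 100 = 75.0
```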

Train ETOG:

Quick run

bash run_scripts/train.sh

Please modify the config files (e.g. config/refcoco/bridge_r50.yaml) to change values such as batch_size, directories, and the test split.

Our default setup: bs=16 on 1 NVIDIA RTX 2080 Ti GPU.
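
If you switch between GPUs often, editing the config by hand can get tedious. The sketch below shows one way to rewrite a single scalar value such as batch_size in the config text using only the standard library. It is an illustration under assumptions: `set_yaml_scalar` is a hypothetical helper, the sample config content is invented for the demo, and the regex only handles simple `key: value` lines (a YAML library would be more robust for anything nested or quoted).

```python
import re


def set_yaml_scalar(text, key, value):
    """Replace the value on the first `key: value` line.

    Indentation and any trailing `# comment` are preserved.
    Raises KeyError if no such line exists.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*)([^#\n]*?)([ \t]*(?:#.*)?)$",
        re.MULTILINE,
    )
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(3)}", text, count=1
    )
    if count == 0:
        raise KeyError(key)
    return new_text


# Invented sample content; the real config files may use different sections.
sample = "TRAIN:\n  batch_size: 64  # per GPU\n  epochs: 50\n"
print(set_yaml_scalar(sample, "batch_size", 16))
```

To apply this to a real config, read the file into a string, call the helper, and write the result to a new file so the original stays untouched.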

Test ETOG:

Quick run

bash run_scripts/test.sh

Alternatively, run test.py directly and point --config at the desired config file.

Predictions can also be saved as visualizations by setting

TEST: 
  visualizate: True

in the .yaml config files. Attention visualizations in heatmap style are currently supported for the R50 and R101 backbones, but not for the ViT backbone.

Acknowledgment

The code is heavily adapted from ETRIS. We thank the authors for their wonderful codebase.

Citation

If ETOG-ETRG is useful for your research, please consider citing:

@article{yu2024parameter,
  title={A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping},
  author={Yu, Houjian and Li, Mingen and Rezazadeh, Alireza and Yang, Yang and Choi, Changhyun},
  journal={arXiv preprint arXiv:2409.19457},
  year={2024}
}
