CIRKDV2: Cross-Image Relational Knowledge Distillation with Contextual Modeling for Efficient Semantic Segmentation
This repository contains the source code of CIRKDV2 and implementations of semantic segmentation distillation methods on popular datasets.
Ubuntu 22.04 LTS
Python 3.9 (Anaconda is recommended)
CUDA 12.4
Install Python packages:
pip install -r requirements.txt
Backbones pretrained on ImageNet-1K:
| Type | Backbone | Pretrained |
|---|---|---|
| CNN | ResNet-101 | Download |
| CNN | ResNet-18 | Download |
| CNN | MobileNetV3-Small | Download |
| CNN | MobileNetV3-Large | Download |
| Transformer | MobileViT-XXS | Download |
| Transformer | MiT-B0 | Download |
| Transformer | MiT-B4 | Download |
Supported datasets:
| Dataset | Train Size | Val Size | Test Size | Classes | Link |
|---|---|---|---|---|---|
| Cityscapes | 2975 | 500 | 1525 | 19 | Download |
| Pascal VOC Aug | 10582 | 1449 | -- | 21 | Download |
| CamVid | 367 | 101 | 233 | 11 | Download |
| ADE20K | 20210 | 2000 | -- | 150 | Download |
| COCO-Stuff-164K | 118287 | 5000 | -- | 182 | Download |
| Role | Network | Method | Test mIoU | Pretrained | Script |
|---|---|---|---|---|---|
| Teacher | DeepLabV3-ResNet101 | - | 78.30 | Download | - |
| Student | DeepLabV3-ResNet18 | Baseline | 73.56 | - | Train|Eval |
| Student | DeepLabV3-ResNet18 | CIRKDV2 | 75.60 | Download | Train|Eval |
| Student | UperNet-ResNet18 | Baseline | 68.90 | - | Train|Eval |
| Student | UperNet-ResNet18 | CIRKDV2 | 72.11 | Download | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Small | Baseline | 65.05 | - | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Small | CIRKDV2 | 67.62 | Download | Train|Eval |
| Student | PSPNet-MobileNetV3-Small | Baseline | 62.78 | - | Train|Eval |
| Student | PSPNet-MobileNetV3-Small | CIRKDV2 | 65.42 | Download | Train|Eval |
| Student | DeepLabV3-MobileViT-XXS | Baseline | 66.24 | - | Train|Eval |
| Student | DeepLabV3-MobileViT-XXS | CIRKDV2 | 68.91 | Download | Train|Eval |
| Student | PSPNet-MobileViT-XXS | Baseline | 65.48 | - | Train|Eval |
| Student | PSPNet-MobileViT-XXS | CIRKDV2 | 68.45 | Download | Train|Eval |
| Role | Network | Method | Test mIoU | Pretrained | Script |
|---|---|---|---|---|---|
| Teacher | SegFormer-MiT-B4 | - | 80.38 | Download | - |
| Student | SegFormer-MiT-B0 | Baseline | 74.12 | - | Train|Eval |
| Student | SegFormer-MiT-B0 | CIRKDV2 | 75.52 | Download | Train|Eval |
You can zip the resulting images and submit them to the Cityscapes test server to obtain the test mIoU.
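The zipping step can be sketched with the Python standard library. This is a minimal sketch, not a script shipped with this repository: the function name `zip_predictions` and both paths are hypothetical, and it assumes the evaluation run has already written the predicted label maps as PNG files under one directory. Check the test server's own format requirements before uploading.

```python
import zipfile
from pathlib import Path


def zip_predictions(pred_dir, zip_path):
    """Pack every PNG under pred_dir into zip_path; return the number of files added.

    Both arguments are hypothetical paths chosen by the caller.
    """
    pred_dir = Path(pred_dir)
    png_files = sorted(pred_dir.rglob("*.png"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for png in png_files:
            # Store paths relative to pred_dir so the archive carries
            # no machine-specific directory prefix.
            archive.write(png, arcname=png.relative_to(pred_dir).as_posix())
    return len(png_files)
```

For example, `zip_predictions("results/cityscapes_test", "submission.zip")` would collect the predictions from a hypothetical `results/cityscapes_test` directory into `submission.zip`.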
| Role | Network | Method | Val mIoU | Pretrained | Script |
|---|---|---|---|---|---|
| Teacher | DeepLabV3-ResNet101 | - | 43.83 | Download | - |
| Student | DeepLabV3-ResNet18 | Baseline | 36.92 | - | Train|Eval |
| Student | DeepLabV3-ResNet18 | CIRKDV2 | 39.82 | Download | Train|Eval |
| Student | UperNet-ResNet18 | Baseline | 34.37 | - | Train|Eval |
| Student | UperNet-ResNet18 | CIRKDV2 | 36.87 | Download | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Large | Baseline | 32.83 | - | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Large | CIRKDV2 | 36.14 | Download | Train|Eval |
| Student | PSPNet-MobileNetV3-Large | Baseline | 33.63 | - | Train|Eval |
| Student | PSPNet-MobileNetV3-Large | CIRKDV2 | 36.01 | Download | Train|Eval |
| Role | Network | Method | Val mIoU | Pretrained | Script |
|---|---|---|---|---|---|
| Teacher | DeepLabV3-ResNet101 | - | 77.80 | Download | - |
| Student | DeepLabV3-MobileNetV3-Small | Baseline | 62.45 | - | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Small | CIRKDV2 | 64.67 | Download | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Large | Baseline | 69.33 | - | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Large | CIRKDV2 | 71.90 | Download | Train|Eval |
| Student | PSPNet-MobileNetV3-Small | Baseline | 61.92 | - | Train|Eval |
| Student | PSPNet-MobileNetV3-Small | CIRKDV2 | 63.84 | Download | Train|Eval |
| Student | PSPNet-MobileNetV3-Large | Baseline | 68.77 | - | Train|Eval |
| Student | PSPNet-MobileNetV3-Large | CIRKDV2 | 71.55 | Download | Train|Eval |
| Role | Network | Method | Val mIoU | Pretrained | Script |
|---|---|---|---|---|---|
| Teacher | DeepLabV3-ResNet101 | - | 38.48 | Download | - |
| Student | DeepLabV3-ResNet18 | Baseline | 32.65 | - | Train|Eval |
| Student | DeepLabV3-ResNet18 | CIRKDV2 | 34.42 | Download | Train|Eval |
| Student | PSPNet-MobileNetV3-Small | Baseline | 26.48 | - | Train|Eval |
| Student | PSPNet-MobileNetV3-Small | CIRKDV2 | 28.28 | Download | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Small | Baseline | 26.04 | - | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Small | CIRKDV2 | 27.66 | Download | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Large | Baseline | 30.31 | - | Train|Eval |
| Student | DeepLabV3-MobileNetV3-Large | CIRKDV2 | 32.14 | Download | Train|Eval |
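The mIoU columns in the result tables report mean intersection-over-union as a percentage. For reference, the metric can be sketched as below. This is an illustrative pure-Python version, not the evaluation code of this repository: the function name `mean_iou` is hypothetical, the inputs are assumed to be flattened sequences of class ids, and the `ignore_index=255` default is an assumption based on the common convention for unlabeled pixels.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class ids in
    [0, num_classes); target pixels equal to ignore_index are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[t] += 1
            union[p] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

The function returns a fraction in [0, 1]; multiply by 100 to compare with the percentages in the tables. To score a whole dataset, concatenate the pixels of all images before calling it rather than averaging per-image values.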
| Dataset | Color Palette | Blend | Scripts |
|---|---|---|---|
| Pascal VOC | | | sh |
| Cityscapes | | | sh |
| ADE20K | | | sh |
| COCO-Stuff-164K | | | sh |
We would appreciate it if you could give this repo a star or cite our paper!
@inproceedings{yang2022cross,
title={Cross-image relational knowledge distillation for semantic segmentation},
author={Yang, Chuanguang and Zhou, Helong and An, Zhulin and Jiang, Xue and Xu, Yongjun and Zhang, Qian},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={12319--12328},
year={2022}
}
@article{yang2023online,
title={CIRKDV2: Cross-Image Relational Knowledge Distillation with Contextual Modeling for Efficient Semantic Segmentation},
author={Yang, Chuanguang and Wang, Yu and Yu, Chengqing and Yu, Xinqiang and Feng, Weilun and Li, Yuqi and An, Zhulin and Huang, Libo and Diao, Boyu and Wang, Fei and Zhuang, Fuzhen and Xu, Yongjun and Tian, Yingli and Huang, Tingwen and Song, Yongduan},
journal={Technical Report},
pages={1--17},
year={2025}
}