Joint Multi-View Embedding with Progressive Multi-Scale Alignment for Unaligned Infrared-Visible Image Fusion
| Name | Affiliation |
|---|---|
| Yida Chen | a |
| Yafei Zhang | a |
| Huafeng Li (✉ corresponding author) | a |
| Zhengtao Yu | a |
| Yu Liu | b |
🏫 Affiliations
a Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan, 650500, China
b Department of Biomedical Engineering, Hefei University of Technology, Hefei, Anhui, 230009, China
We propose an end-to-end infrared-visible image fusion network (ME-PMA) with feature-level registration, featuring:
- End-to-end registration and fusion for unaligned scenarios
- Progressive multi-scale feature alignment with multi-view embedding
- Superior performance across datasets with a single set of model weights
Key Components (a structural sketch follows this list):
- Feature Encoder: SFE, UIB_Block, and Restormer
- MSPA: Multi-Scale Progressive Alignment module
- Feature Decoder: FFCM fusion and FRRB reconstruction
- Restormer_Corr: Global feature extraction with local correlation
- UIB_CA: Channel attention for local features
- Reg_flow: Multi-view registration flow prediction
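To make the data flow concrete, here is a minimal structural sketch of how these components fit together. The module names mirror the list above, but every signature, channel width, and layer below is a stand-in assumption for illustration, not the repository's actual implementation:

```python
# Structural sketch only — all layers are stand-ins for the real SFE/UIB/Restormer/MSPA/FFCM/FRRB modules.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MEPMASketch(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        # Feature encoder: SFE, then a local (UIB_Block) and a global (Restormer) branch.
        self.sfe = nn.Conv2d(1, ch, 3, padding=1)
        self.local = nn.Conv2d(ch, ch, 3, padding=1)
        self.glob = nn.Conv2d(ch, ch, 3, padding=1)
        # MSPA / Reg_flow stand-in: predict a dense 2-channel flow from both views' features.
        self.reg_flow = nn.Conv2d(4 * ch, 2, 3, padding=1)
        # Feature decoder stand-ins: FFCM-style fusion and FRRB-style reconstruction.
        self.fuse = nn.Conv2d(4 * ch, ch, 3, padding=1)
        self.recon = nn.Conv2d(ch, 1, 3, padding=1)

    def encode(self, x):
        f = self.sfe(x)
        return self.local(f), self.glob(f)

    def warp(self, feat, flow):
        # Bilinearly warp features by the predicted flow (grid normalized to [-1, 1]).
        b, _, h, w = feat.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
        grid = torch.stack((xs, ys), dim=-1).to(feat).unsqueeze(0).expand(b, -1, -1, -1)
        offset = flow.permute(0, 2, 3, 1) / torch.tensor([w / 2, h / 2]).to(feat)
        return F.grid_sample(feat, grid + offset, align_corners=True)

    def forward(self, ir, vis):
        ir_l, ir_g = self.encode(ir)
        vis_l, vis_g = self.encode(vis)
        flow = self.reg_flow(torch.cat([ir_l, ir_g, vis_l, vis_g], dim=1))
        ir_l, ir_g = self.warp(ir_l, flow), self.warp(ir_g, flow)  # align IR features to VIS
        fused = self.fuse(torch.cat([ir_l, ir_g, vis_l, vis_g], dim=1))
        return torch.sigmoid(self.recon(fused))

if __name__ == "__main__":
    ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
    print(MEPMASketch()(ir, vis).shape)  # torch.Size([1, 1, 64, 64])
```

In the actual network the alignment is progressive across multiple scales; this sketch collapses it to a single scale to show the overall encode → align → fuse → reconstruct flow.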
git clone https://github.com/yidamyth/ME-PMA.git
cd ME-PMA
# Create conda environment
conda create -n ME-PMA python=3.9.18
conda activate ME-PMA
# Install PyTorch
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
# Install dependencies
pip install -r requirements.txt
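To verify the environment, a quick sanity check (a minimal snippet, assuming a CUDA 11.3-capable machine):

```python
import torch
import torchvision

print(torch.__version__, torchvision.__version__)  # expect 1.12.1+cu113 and 0.13.1+cu113
print(torch.cuda.is_available())                   # should print True if the CUDA build is active
```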
./DataSet/IVIF/
├── M3FD
│   └── test
│       ├── ir
│       ├── ir_move
│       └── vis
├── MSRS
│   └── test
│       ├── ir
│       ├── ir_move
│       └── vis
└── RoadScene
    ├── RoadS_test
    │   ├── ir
    │   ├── ir_move
    │   └── vis
    └── RoadS_train
        ├── ir
        └── vis
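Before running the tests, you can verify that the dataset folders are in place. The snippet below is a hypothetical helper, not part of the repository; it simply mirrors the tree above:

```python
from pathlib import Path

ROOT = Path("./DataSet/IVIF")
EXPECTED = {
    "M3FD/test": ["ir", "ir_move", "vis"],
    "MSRS/test": ["ir", "ir_move", "vis"],
    "RoadScene/RoadS_test": ["ir", "ir_move", "vis"],
    "RoadScene/RoadS_train": ["ir", "vis"],
}
for split, subdirs in EXPECTED.items():
    for sub in subdirs:
        path = ROOT / split / sub
        print(("ok      " if path.is_dir() else "MISSING ") + str(path))
```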
End-to-end feature-level registration and fusion results (input images from unaligned scenarios):
python test_phase2.py
# Save to: ./DataSet/IVIF/RoadScene/RoadS_test/Results/UnAligned/

Direct fusion results, without using the registration module (input images from aligned scenarios):
python test.py
# Save to: ./DataSet/IVIF/RoadScene/RoadS_test/Results/Aligned/

You can switch datasets to obtain results on different test sets: the default is test_path['RoadScene'], and you can change it to test_path['M3FD'] or test_path['MSRS']. The same model weights are used for all dataset tests.
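For illustration, the dataset switch described above might look like the following dictionary. The actual variable lives in the test scripts; the exact paths here are assumptions based on the directory layout:

```python
# Hypothetical illustration of the test_path switch; check test.py / test_phase2.py for the real code.
test_path = {
    "RoadScene": "./DataSet/IVIF/RoadScene/RoadS_test/",
    "M3FD": "./DataSet/IVIF/M3FD/test/",
    "MSRS": "./DataSet/IVIF/MSRS/test/",
}
dataset = "RoadScene"  # switch to "M3FD" or "MSRS" to evaluate other datasets
print(test_path[dataset])
```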
# 1. Find python location
which python
# output: /home/yida/anaconda3/envs/ME-PMA/bin/python
# 2. Edit conda path
vim run.sh
# 3. Switch to your anaconda3 conda path
eval "$(/home/your_user_name_xxx/anaconda3/bin/conda shell.bash hook)"
# 4. Save and exit vim
# 5. Run
sh ./run.sh
# 6. Check logs
tail -f ./Logs/nohup/2024-1119-1001_time.log
# 7. Run program in background, can exit terminal
# Model save path: ./Model/Parameters/24-1119-1001/
# 8. Stop following the log (Ctrl + Z would only suspend tail; the nohup job keeps running)
Ctrl + C

Phase 2 training:

# 1. Edit conda path
vim run_phase2.sh
eval "$(/home/your_user_name_xxx/anaconda3/bin/conda shell.bash hook)"
# Load first stage model path
phase2_model_id='24-1119-1001'
phase2_ModelPath='./Model/Parameters/24-1119-1001/RegImageFusModel-best.pth'
# Save and exit vim
# 2. Run
sh ./run_phase2.sh
# 3. Check logs
tail -f ./Logs/nohup/2024-1119-1355_time.log
# 4. Stop following the log (the nohup job keeps running)
Ctrl + C

Fusion metrics: $Q_{CE}\downarrow$, $Q_{MI}\uparrow$, $Q_{VIF}\uparrow$, $Q_{AB/F}\uparrow$, $Q_{CB}\uparrow$, $Q_{CV}\downarrow$
You can obtain our detailed fusion metrics with the following example:

python ./Util/metrics_fus.py

Registration metrics: $Q_{MI}\uparrow$, $Q_{MS\text{-}SSIM}\uparrow$, $Q_{NCC}\uparrow$

You can obtain our detailed registration metrics with the following example:

python ./Util/metrics_reg.py

For convenience, the provided metric scripts allow you to directly reproduce the results reported in the paper.
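As an illustration of what these scripts measure, here is a minimal sketch of the normalized cross-correlation ($Q_{NCC}$) between a registered image and its reference. This is the textbook formula, not the repository's implementation:

```python
import numpy as np

def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized cross-correlation between two same-sized grayscale images."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

# Identical images give the maximum score of 1.0.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(256, 256))
print(ncc(img, img))  # 1.0
```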
We provide the complete fusion results for direct and unbiased metric evaluation. For convenience, resized input images are also included to help reproduce the results. While the overall outputs remain consistent, minor pixel-level differences may occur due to the resizing operation.
Note that the registration evaluation metrics are averaged over the three datasets.
📌 Note: In each column, values in red bold are the best and values in orange bold are the second-best.
The related models will be updated and uploaded soon.
cd ./ME-PMA
python -m Model.Architecture.RegImageFusModel

The overall architecture of this project was independently designed by the author, @Yida Chen. However, parts of the implementation reference the following excellent open-source works:
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion
  - (CVPR 2023) https://github.com/haozixiang1228/MMIF-CDDFuse
- Correlation-aware Coarse-to-Fine MLPs for Deformable Medical Image Registration
- MobileNetV4 implementations
  - MobileNetV4-1: https://github.com/jiaowoguanren0615/MobileNetV4
  - MobileNetV4-2: https://github.com/jaiwei98/MobileNetV4-pytorch
- Analysis of Quality Objective Assessment Metrics for Visible and Infrared Image Fusion
  - (Journal of Image and Graphics 2023) https://github.com/sunbinuestc/VIF-metrics-analysis
- MulimgViewer (for local detail visualization)
We sincerely appreciate the open-source community for providing valuable tools, resources, and inspiration that greatly supported the development of this project.
If this work benefits your research, a citation to our paper would be greatly appreciated:
@article{2026_ME-PMA,
  title   = {Joint multi-view embedding with progressive multi-scale alignment for unaligned infrared-visible image fusion},
  author  = {Chen, Yida and Zhang, Yafei and Li, Huafeng and Yu, Zhengtao and Liu, Yu},
  journal = {Information Fusion},
  volume  = {128},
  pages   = {103960},
  year    = {2026},
  doi     = {10.1016/j.inffus.2025.103960}
}
This project is released under the MIT License. See the LICENSE file.
Thank you for your attention. If you have any questions, please contact us by email at yida_myth@163.com and we will get back to you as soon as possible; you may also raise questions through the Issues page.