Dsfer-Net: A deep supervision and feature retrieval network for bitemporal change detection using modern Hopfield networks
IEEE Transactions on Geoscience and Remote Sensing, 2024
Change detection, an essential application for high-resolution remote sensing (RS) images, aims to monitor and analyze changes in the land surface over time. Owing to the rapid growth in the quantity of high-resolution RS data and the complexity of its texture features, numerous quantitative deep learning-based methods have been proposed. These methods outperform traditional change detection (CD) methods by extracting deep features and combining spatial–temporal information. However, reasonable explanations of how deep features improve detection performance are still lacking. In our investigations, we found that modern Hopfield network (MHN) layers significantly enhance semantic understanding. In this article, we propose a deep supervision and feature retrieval network (Dsfer-Net) for bitemporal CD. Specifically, highly representative deep features of the bitemporal images are jointly extracted through a fully convolutional Siamese network. Exploiting the sequential geographical information of the bitemporal images, we design a feature retrieval module that extracts difference features and leverages discriminative information in a deeply supervised manner. We also observe that the deeply supervised feature retrieval (DSFR) module provides explainable evidence of the network's semantic understanding in its deep layers. Finally, our end-to-end network establishes a novel framework by aggregating the retrieved features and feature pairs from different layers. Experiments conducted on three public datasets (LEVIR-CD, WHU-CD, and CDD) confirm the superiority of the proposed Dsfer-Net over other state-of-the-art methods. Compared with DSAMNet, the best-performing baseline, Dsfer-Net improves scores by 4.7%, 5.9%, and 2.3% on the three datasets; compared with our earlier FrNet, it achieves further gains of 2.0%, 1.4%, and 4.5%. The code will be available online ( https://github.com/ShizhenChang/Dsfer-Net ).
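The sketch below is not the authors' implementation; it is a minimal illustration, assuming a PyTorch setting, of the two ideas the abstract names: a weight-shared (Siamese) encoder applied to both image dates, and a Hopfield-style retrieval step in which features of one date are associated against stored patterns of the other before a difference map is predicted. All names here (hopfield_retrieve, SiameseEncoder, ToyRetrievalCD, beta) are illustrative assumptions, not identifiers from the paper or its repository.

```python
# Minimal sketch (illustrative, not the Dsfer-Net code) of Siamese feature
# extraction plus a modern-Hopfield-style retrieval step for bitemporal CD.
import torch
import torch.nn as nn


def hopfield_retrieve(queries, keys, values, beta=1.0):
    """One update step of a modern Hopfield network (Ramsauer et al.):
    retrieved = softmax(beta * Q K^T) V, i.e. attention over stored patterns."""
    attn = torch.softmax(beta * queries @ keys.transpose(-2, -1), dim=-1)
    return attn @ values


class SiameseEncoder(nn.Module):
    """Weight-shared convolutional encoder applied to both image dates."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class ToyRetrievalCD(nn.Module):
    """Toy bitemporal change head: encode both dates with shared weights,
    retrieve date-1 features against date-2 patterns, then predict change
    from the (retrieved - original) difference."""
    def __init__(self, feat_ch=64):
        super().__init__()
        self.encoder = SiameseEncoder(feat_ch=feat_ch)
        self.head = nn.Conv2d(feat_ch, 1, 1)  # per-pixel change logit

    def forward(self, img_t1, img_t2, beta=4.0):
        f1 = self.encoder(img_t1)            # (B, C, H, W), date 1
        f2 = self.encoder(img_t2)            # (B, C, H, W), date 2
        b, c, h, w = f1.shape
        q = f1.flatten(2).transpose(1, 2)    # (B, HW, C) queries from date 1
        kv = f2.flatten(2).transpose(1, 2)   # (B, HW, C) stored patterns from date 2
        retrieved = hopfield_retrieve(q, kv, kv, beta)
        diff = (retrieved - q).transpose(1, 2).reshape(b, c, h, w)
        return self.head(diff)               # change-map logits


if __name__ == "__main__":
    model = ToyRetrievalCD()
    t1 = torch.randn(1, 3, 32, 32)
    t2 = torch.randn(1, 3, 32, 32)
    print(model(t1, t2).shape)  # torch.Size([1, 1, 32, 32])
```

In this toy form the retrieval step is exactly the attention-like update rule of modern Hopfield networks; the actual Dsfer-Net additionally applies deep supervision to retrieval modules at multiple encoder depths and aggregates the retrieved features with the corresponding feature pairs, which the sketch does not attempt to reproduce.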