Self-supervised Polarization Image Dehazing Method via Frequency Domain Generative Adversarial Networks

Rui Sun a, b, c, *, Long Chen a, b, Tanbin Liao a, b, Zhiguo Fan a, b

a School of Computer Science and Information Engineering, Hefei University of Technology, No. 485 Danxia Road, Hefei 230009, China
b Key Laboratory of Industry Safety and Emergency Technology, Hefei University of Technology, Hefei 230009, China
c Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education of the People's Republic of China, Hefei 230009, China

Abstract—Haze interferes with applications such as autonomous driving, traffic surveillance, and remote sensing. Image dehazing is the key technology for improving the sharpness of images acquired in haze. However, the lack of paired annotated training data severely restricts the real-world performance of deep learning-based dehazing methods. In this work, we propose a self-supervised polarization image dehazing method that utilizes frequency domain generative adversarial networks. By introducing a polarization calculation module into the generator, the Stokes parameters of the airlight are accurately calculated and used, together with the dehazed image generated by a densely connected encoder-decoder, to reconstruct a synthesized hazy image. Moreover, we optimize the discriminator with frequency domain characteristics obtained by a frequency decomposition module and design a pseudo airlight coefficient supervision loss to enhance the self-supervised training. By discriminating between synthetic hazy images and real hazy images, we achieve adversarial training without paired data. At the same time, supervised by the atmospheric scattering model, our network can iteratively generate more realistic dehazed images. Extensive experiments on the constructed multi-view polarization datasets demonstrate that our method achieves state-of-the-art performance without the need for real-world ground truth.

* Corresponding author.
E-mail address: sunrui@hfut.edu.cn (R. Sun).

This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=5059199

Keywords—Image dehazing, polarization, frequency domain, self-supervised,

generative adversarial network.

1. Introduction

As pollution grows increasingly serious in modern society, severe weather conditions occur with greater frequency. Haze particles in the atmosphere weaken the sharpness and contrast of images. Haze undoubtedly reduces the ability of computer vision algorithms to perceive scene information: the recognition, detection, and segmentation of targets are seriously affected or even rendered ineffective. Therefore, research on improving visibility in haze, known as image dehazing [1], has very important application significance.

Existing dehazing methods are mainly based on priors or deep learning. The theoretical support of prior-based methods is the atmospheric scattering model, which is widely used in computer vision to explain the physical causes of hazy imaging. By using statistical laws to estimate the parameters of the atmospheric scattering model from the hazy image, such as the transmission and the infinite airlight intensity, the dehazed image can be recovered. However, these methods [2-4] accumulate cascaded errors introduced by the employed priors, limiting their performance in improving visibility.

In recent years, many researchers have tried to use deep learning methods to estimate the parameters of the atmospheric scattering model [5-8], which reduces the cascaded error to a certain extent. Besides, many deep learning-based dehazing
Fig. 1. Dehazing models with different architectures. (a) GAN-based dehazing methods, which rely on the ground truth of the hazy image; the ground truth of real scenes is usually difficult to define and obtain, and synthetic data-driven methods generalize poorly in the real world. (b) Our proposed self-supervised PGAN, which requires no additional ground truth of the hazy image and effectively improves the authenticity of the dehazed image by introducing the atmospheric scattering model. The dehazed image is generated inside the generator; the specific structure is shown in Fig. 4.
methods do not use the atmospheric scattering model [9-16, 32-35]; these are called end-to-end dehazing methods. As shown in Fig. 1(a), some GAN-based methods recover the dehazed image directly by learning the mapping from hazy images to clear ones, but they are strictly limited by their training datasets. Because it is impossible to acquire a real-world hazy image and its ground truth at the same time, synthetic datasets are used to train most networks. Their performance on real-world datasets is far worse than on synthetic datasets.

Unlike methods that rely on a single hazy image, dehazing methods based on polarization [17-20] have a distinct advantage because they exploit polarization properties. The intensity of the airlight varies regularly with the angle of polarization. Using this law, the intensity of the airlight can be calculated from the degree of polarization, recovering the dehazed image. Although they have some deficiencies, polarization dehazing methods effectively solve the problem of insufficient input information.

End-to-end dehazing networks have gradually become the most popular with the rapid development of deep learning, but few researchers pay attention to polarization, which may help networks learn the physical properties of hazy images. Admittedly, the shortage of polarization datasets, which cannot be synthesized, is the most important limitation. Therefore, leveraging the large volume of polarization data [20] we have previously collected, we introduce polarization properties into deep learning-based dehazing, hoping to enhance the performance and robustness of dehazing methods on real-world datasets. In this work, we propose a self-supervised polarization image dehazing method based on frequency domain generative adversarial networks (PGAN).

First, polarization images are used to calculate the Stokes parameters of the airlight while the dehazed image is generated by a densely connected encoder-decoder. The generated dehazed image and the Stokes parameters are used to synthesize a hazy image through the physical model, which serves as one of the input samples of the discriminator. By adding the polarization calculation module, our method only needs to acquire three hazy polarization images at different angles at one time, without any real-world ground truth.

Second, we add a frequency decomposition module to optimize the discriminator based on the frequency domain distribution properties of hazy images. The input synthetic and original hazy images are decomposed in the frequency domain and combined with their respective high- and low-frequency sub-bands as samples for the discriminator, which effectively enhances its supervision.

In addition, we design a pseudo airlight coefficient supervision loss to strengthen the supervision available during the self-supervised training stage. This loss function is based on the atmospheric scattering model; it effectively avoids producing excessive image noise and further enhances the robustness and generalization ability of the network. By transforming the problem into discriminating between generated hazy images and real hazy images, our generator can produce realistic dehazed images that conform to the atmospheric scattering model as closely as possible. Our main contributions can be summarized as follows:


·A novel self-supervised polarization image dehazing method is proposed, which integrates polarization properties and frequency domain information within the framework of generative adversarial learning.

·A generator based on polarization is proposed, and the discriminator is optimized using the frequency distribution of hazy images. This removes the limitation of paired training datasets and significantly improves the dehazing performance of the network on real-world datasets.

·A pseudo airlight coefficient supervision loss is designed for our self-supervised dehazing method. Extensive qualitative and quantitative experiments demonstrate the superiority of the proposed method.


Fig. 2. Our fixed atmospheric polarization information observation platform; the UAV atmospheric polarization information observation platform is shown in Sec. 4.1. We conducted extensive field observation experiments under foggy, hazy, and other weather conditions, and obtained multi-target polarization data under various meteorological conditions.

2. Theoretical Background

As shown in Fig. 2, in order to promote the innovation and development of polarization dehazing algorithms, we constructed an atmospheric polarization information observation platform and a UAV atmospheric polarization information observation platform, covering the horizontal and vertical observation perspectives, respectively. Through these platforms, we can better obtain polarized hazy data and explore the transmission characteristics of atmospheric polarized light in strongly scattering media.

2.1 The dehazing methods based on polarization

The atmospheric scattering model explains the contrast decay of hazy images from the perspective of the physical mechanism [1], as shown in Fig. 3. The mathematical representation of this model is

I(x) = D(x) + A(x) = J(x)t(x) + A_\infty (1 - t(x))    (1)


where I(x) is the captured hazy image and x is the position of an image pixel. I(x) consists of the direct transmission D(x) and the airlight A(x), respectively.

Fig. 3. Illustration of the atmospheric scattering model.

J(x) is the object radiance, which is the result to be recovered. A_\infty is the airlight radiance from an infinite distance, i.e., the illumination.
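The forward model in Eq. (1) is straightforward to simulate. The following is a minimal NumPy sketch (our illustration, not code from the paper); the function name and the toy values of J, t, and A_inf are hypothetical:

```python
import numpy as np

def synthesize_hazy(J, t, A_inf):
    """Atmospheric scattering model, Eq. (1): I = J*t + A_inf*(1 - t)."""
    return J * t + A_inf * (1.0 - t)

# Toy example: a uniform gray scene whose transmission decays with depth.
J = np.full((2, 2), 0.4)                 # object radiance
t = np.array([[0.9, 0.7],
              [0.5, 0.3]])               # transmission map
I = synthesize_hazy(J, t, A_inf=1.0)     # hazy observation
```

Pixels with small t (distant objects) are pulled toward A_inf, which is exactly the contrast decay the model describes.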

Without considering the polarization of objects, the early polarization dehazing method controlled the received intensity of the airlight by rotating a polarizer, using the degree of polarization to describe the proportion of polarized light in natural light [17]. However, this method has poor real-time performance. Nowadays, polarization cameras generally have division-of-focal-plane sensors that can acquire polarization images at multiple angles simultaneously. When polarization images at three or more angles (I_0, I_45, I_90, I_135) are available, the Stokes vector can be expressed as


S = \begin{bmatrix} S_I \\ S_Q \\ S_U \\ S_V \end{bmatrix} = \begin{bmatrix} I_0 + I_{90} \\ I_0 - I_{90} \\ I_{45} - I_{135} \\ 0 \end{bmatrix}    (2)

where S_I denotes the total light intensity, S_Q and S_U denote the intensities of the linear polarization states, and S_V denotes the intensity of the circular polarization state, which is sufficiently small to be neglected (S_V = 0) for most natural light.

The degree and angle of polarization, denoted p and \theta respectively, are given by

p = \frac{\sqrt{S_Q^2 + S_U^2}}{S_I}    (3)

\theta = \frac{1}{2} \arctan\frac{S_U}{S_Q}    (4)
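Eqs. (2)-(4) map directly onto a few lines of array code. A hedged NumPy sketch follows (the function names are ours; np.arctan2 is used instead of a plain arctan so the quadrant is resolved):

```python
import numpy as np

def stokes_from_four(I0, I45, I90, I135):
    """Linear Stokes parameters from four polarizer angles, Eq. (2)."""
    return I0 + I90, I0 - I90, I45 - I135          # S_I, S_Q, S_U

def dop_aop(S_I, S_Q, S_U):
    """Degree (Eq. 3) and angle (Eq. 4) of linear polarization."""
    p = np.sqrt(S_Q**2 + S_U**2) / S_I
    theta = 0.5 * np.arctan2(S_U, S_Q)
    return p, theta

# Fully polarized light aligned with the 0-degree axis: I(a) ~ cos^2(a).
p, theta = dop_aop(*stokes_from_four(1.0, 0.5, 0.0, 0.5))
# p -> 1.0 (fully polarized), theta -> 0.0 rad
```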

The airlight radiance from an infinite distance is estimated using the Stokes vector as shown in

A_\infty = \frac{I_0 - A_\infty (1 - p_A)/2}{p_A \cos 2\theta_A}    (5)

where p_A and \theta_A denote the degree and angle of polarization of the airlight.

Finally, the mathematical expression for obtaining the dehazed image J is given as

J = \frac{S_I - S_Q/p}{1 - S_Q/(p_A A_\infty)}    (6)
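Eq. (6) can be read as the classical recovery J = (I − A)/t, with the airlight taken as A = S_Q/p and the transmission as t = 1 − S_Q/(p_A·A_∞). A direct, hedged NumPy transcription (the function is ours; the eps clamp guarding against division by zero is our addition, not part of the paper's formula):

```python
import numpy as np

def dehaze_stokes(S_I, S_Q, p, p_A, A_inf, eps=1e-6):
    """Polarization dehazing, Eq. (6)."""
    airlight = S_Q / np.maximum(p, eps)            # A = S_Q / p
    t = 1.0 - S_Q / np.maximum(p_A * A_inf, eps)   # transmission estimate
    return (S_I - airlight) / np.maximum(t, eps)   # J = (I - A) / t

J_hat = dehaze_stokes(S_I=1.0, S_Q=0.2, p=0.5, p_A=0.5, A_inf=1.0)
# J_hat -> 1.0
```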

2.2 The dehazing methods based on GAN

Generative adversarial networks (GAN) have achieved impressive results in computer vision tasks, and there are many GAN-based methods for image dehazing [9,21-24], which learn the mapping from hazy images to clear images without using the atmospheric scattering model.

As shown in Fig. 1(a), the generator generates a dehazed image from the input hazy image. The discriminator then scores how closely the dehazed image approximates the ground truth. Generally, this score ranges from 0 to 1, and the higher it is, the more similar the dehazed image is to the ground truth. Fed back this score, the generator learns to generate higher quality dehazed images. The training process requires a large amount of paired data. Nevertheless, paired real-world datasets are very
Fig. 4. The architecture of our proposed method PGAN. The generator input consists of three polarized images and the original hazy image computed from f, where f refers to Eq. (6). PGAN-J is a densely connected encoder-decoder used to generate the dehazed image, and PGAN-A is a polarization calculation module designed to estimate the atmospheric parameters of the hazy image and provide physical constraints. The discriminator input consists of the original hazy image and the synthesized hazy image output by the generator. The discriminative power of the discriminator is enhanced by utilizing frequency distribution properties. For the trained model, we only use PGAN-J to generate dehazed images.

difficult to acquire. Moreover, this mapping from hazy images to clear ones lacks constraints from the atmospheric scattering model. In contrast, our self-supervised PGAN does not require paired data and conforms to the constraints of the physical model, which gives it high real-world applicability and good interpretability, as shown in Fig. 1(b).
3. Proposed Method

As shown in Fig. 4, the proposed method PGAN is mainly composed of a generator based on polarization (PGAN-G) and a discriminator based on frequency distribution (PGAN-D). Compared to previous methods, the input of our generator consists of polarization images, and its output is a synthetic hazy image rather than an intermediate dehazed image. Similarly, the input of our discriminator changes to the synthetic hazy image and the original one. The original hazy image, which is regarded as the generating-domain data, can be calculated from the polarization images.

The loss function is built between the original hazy image and the synthetic hazy image, so the whole network follows a self-supervised training strategy.

3.1 The generator based on polarization dehazing

Our generator uses polarization images to calculate the Stokes vector of the airlight and uses the atmospheric scattering model to re-synthesize the hazy image after obtaining the dehazed image. As shown in Fig. 4, PGAN-J, constructed as a densely connected encoder-decoder, generates the dehazed image J from the original hazy image. Since each neuron in a dense connection layer receives input from all neurons in the previous layer, dense connection layers can effectively facilitate feature extraction and scene reconstruction in low-level computer vision tasks [9], and excel at generating clearer dehazed images. The original hazy image I can be calculated from the polarization images using the Stokes vector as shown in

I = \frac{2}{3}(I_0 + I_{60} + I_{120})    (6)

where I_0, I_60, and I_120 represent the polarization images at angles 0°, 60°, and 120°, respectively.

PGAN-A is a polarization calculation module employed to calculate the Stokes vector of the airlight from three polarization images. As illustrated in Fig. 5, this module utilizes the frequency distribution to separate the airlight and avoid the halo effect, which
Fig. 5. The illustration of PGAN-A. PGAN-A is a polarization calculation module employed to
calculate the Stokes vector of airlight from three polarization images.
avoids the need to model the complex polarization properties of objects. Specifically, the polarization images are decomposed via the non-subsampled pyramid (NSP) [27]. Because of the airlight constraints, the decomposed low-pass sub-bands, which are used as the airlight at different polarization angles, should be refined; we perform trilateral filtering in our experiments.

Simply put, by separating the low-frequency sub-bands from the three polarization images and constraining them, the Stokes vector can be calculated by

2 
  A0  A60  A120  
 3
rin

 S AI 
  2 
S A   S AQ     2 A0  A60  A120   (7)
S  3 
 AU   2 
  0 120 
A  A 
 3 
ep

where A0, A60, A120 represent the airlight acquired from polarization images at angle 0°,

60°, 120°, respectively. On this basis, equation (4) is used to calculate A∞.
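The three-angle Stokes reconstruction of Eq. (7) follows from I(α) = (S_I + S_Q cos 2α + S_U sin 2α)/2 evaluated at 0°, 60°, and 120°. A hedged NumPy sketch (function name and toy input are ours):

```python
import numpy as np

def stokes_from_three(A0, A60, A120):
    """Stokes parameters from three polarizer angles (0, 60, 120 deg), Eq. (7)."""
    S_I = (2.0 / 3.0) * (A0 + A60 + A120)
    S_Q = (2.0 / 3.0) * (2.0 * A0 - A60 - A120)
    S_U = (2.0 / np.sqrt(3.0)) * (A60 - A120)
    return S_I, S_Q, S_U

# Unpolarized light of unit intensity: each polarizer passes half of it.
S_I, S_Q, S_U = stokes_from_three(0.5, 0.5, 0.5)
# S_I -> 1.0, S_Q -> 0.0, S_U -> 0.0
```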
Pr

Lastly, the synthetic hazy image I' can be calculated using

I' = \frac{J A_\infty + S_{AI} A_\infty - J S_{AI}}{A_\infty}    (8)
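Eq. (8) is algebraically the scattering model of Eq. (1) with airlight S_AI and transmission t = 1 − S_AI/A_∞. A small hedged sketch (our names and toy values):

```python
import numpy as np

def resynthesize_hazy(J, S_AI, A_inf):
    """Synthetic hazy image, Eq. (8):
    I' = (J*A_inf + S_AI*A_inf - J*S_AI) / A_inf = J*(1 - S_AI/A_inf) + S_AI."""
    return (J * A_inf + S_AI * A_inf - J * S_AI) / A_inf

J = np.array([0.8, 0.6])            # dehazed radiances
S_AI = np.array([0.2, 0.5])         # airlight intensity, grows with haze depth
I_syn = resynthesize_hazy(J, S_AI, A_inf=1.0)
# I_syn -> [0.84, 0.80]
```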

Discussion. The polarization dehazing approach has unique advantages over other dehazing methods, as it alleviates the fundamental flaw of dehazing tasks: insufficient acquisition of scene information. Specifically, the original hazy image and the degree of polarization can be calculated from the polarized images and then used to estimate the physical model, which is very effective for real-world dehazing. Although current deep learning methods can achieve good performance through style transfer [31] or layer disentanglement [32], they ignore polarization characteristics. We introduce polarization characteristics into the GAN-based dehazing framework; by incorporating polarization as a physical mechanism, we improve the performance and robustness of the network on real data, remove the limitations of training data, and enhance interpretability, thereby promoting the application of polarization characteristics in other deep learning methods.


3.2 The discriminator based on frequency distribution

As illustrated in Fig. 4, our discriminator separates the input original hazy image I and the synthetic one I' in the frequency domain, acquiring their high-frequency and low-frequency sub-bands, which are combined with their corresponding hazy images as samples for the discriminator. To train the network, we use the dehazed image J as an intermediate result of the generator, expecting the discriminator to guide the generator to synthesize hazy images more similar to the original one, and thereby obtain a clearer dehazed image under the physical constraint of the
atmospheric scattering model. Furthermore, we enrich the input of the discriminator and add more training constraints, including frequency information. Direct transmission and airlight are concentrated in the high-frequency I_H and low-frequency I_L sub-bands of the image, respectively [20]. Therefore, the low-frequency components of hazy images are richer than those of clear ones because of the greater airlight proportion. This feature is used to optimize the design of our discriminator by adding a frequency decomposition module to it.

It should be noted that the frequency decomposition method used in this module is consistent with PGAN-A, as shown in Fig. 4. Because the low-frequency sub-bands I_L can be used as the airlight after constraints, their supervised learning can actually be regarded as supervised learning of the airlight coefficient, which inspired us to design the pseudo airlight coefficient supervision loss. In previous work, researchers simply used generated dehazed images and ground truths as paired learning samples for the discriminator, with little exploration of the model's supervisory information. The discriminator we designed can more effectively distinguish between real and fake data, so more realistic and satisfactory dehazed images can be generated. We conduct a detailed experimental analysis of the effectiveness of the frequency domain decomposition module in Sec. 4.4.
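The paper uses a non-subsampled pyramid (NSP) for the frequency split; as a rough, hedged illustration of the idea only, the sketch below uses a separable box blur as the low-pass filter (function name and kernel choice are ours, not the paper's):

```python
import numpy as np

def freq_split(img, ksize=5):
    """Split an image into low- and high-frequency parts so that
    I = I_L + I_H by construction (a box blur stands in for the NSP)."""
    kernel = np.ones(ksize) / ksize
    low = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, img.astype(float))
    low = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, low)
    return low, img - low                      # (I_L, I_H)

img = np.random.default_rng(0).random((16, 16))
I_L, I_H = freq_split(img)                     # I_L + I_H reconstructs img
```

In the discriminator, each hazy image would then be stacked with its I_L and I_H sub-bands to form one input sample.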

3.3 The design of loss functions

This subsection discusses the loss function design of our network. Apart from adapting several common losses, including the pixel-level loss, SSIM loss, and adversarial loss, to our framework, we also design a pseudo airlight coefficient supervision loss, which exploits the low-frequency components of the discriminator input and the physical model. Since the low-frequency components carry the main information of the image, a pseudo airlight coefficient supervision loss based on them increases the constraints on network training and makes the trained model more robust.

3.3.1 Pixel-level loss

The pixel-level loss measures the fidelity between the real and fake examples. In our scheme, given an original hazy image I, the synthetic hazy image output by the generator is I'. The pixel-level loss L_P, in the L1 form over N samples, is

L_P = \sum_{i=1}^{N} \| I_i - I'_i \|_1    (9)

3.3.2 SSIM loss
The Structural Similarity Index Measure (SSIM) is an important reference-based image quality evaluation index [25] that comprehensively considers differences in brightness, contrast, and structure. It accurately reflects image quality as perceived by humans, so it is widely used in performance evaluation for computer vision tasks. In our method, SSIM can be written as

SSIM(I, I') = \frac{(2\mu_I \mu_{I'} + C_1)(2\sigma_{II'} + C_2)}{(\mu_I^2 + \mu_{I'}^2 + C_1)(\sigma_I^2 + \sigma_{I'}^2 + C_2)}    (10)

where \mu and \sigma^2 are the average value and the variance, respectively, and \sigma_{II'} is the covariance. C_1 and C_2 are constants used to maintain stability. SSIM ranges from 0 to 1. The SSIM loss is defined as

L_{SSIM} = 1 - SSIM(I, I')    (11)
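A hedged, single-window sketch of Eqs. (10)-(11) follows; practical SSIM implementations average over local Gaussian windows, and the constants here follow the common choice C1 = 0.01², C2 = 0.03² for images in [0, 1]:

```python
import numpy as np

def ssim_loss(I, Ip, C1=0.01**2, C2=0.03**2):
    """Whole-image SSIM (Eq. 10) and the SSIM loss of Eq. (11)."""
    mu1, mu2 = I.mean(), Ip.mean()
    cov = ((I - mu1) * (Ip - mu2)).mean()
    ssim = ((2 * mu1 * mu2 + C1) * (2 * cov + C2)) / \
           ((mu1**2 + mu2**2 + C1) * (I.var() + Ip.var() + C2))
    return 1.0 - ssim

img = np.random.default_rng(1).random((8, 8))
# Identical images give SSIM = 1, hence zero loss.
```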


3.3.3 Adversarial loss

The adversarial loss forces the generator to continuously improve sample quality by minimizing the differences between generated and real samples, while maximizing the ability of the discriminator. It introduces the adversarial mechanism into the training process and helps the model learn more effective representations and features, achieving better performance and robustness. In our method, the hazy image combined with its high- and low-frequency sub-band samples is expressed as I \oplus I_H \oplus I_L (\oplus denoting concatenation), and the adversarial loss can be expressed as

L_{Adv} = \log(1 - D(G(I \oplus I_H \oplus I_L)))    (12)

3.3.4 Pseudo airlight coefficient supervision loss

The airlight can be estimated from the low-frequency sub-band, so training supervision of the airlight coefficient is particularly important. We penalize the differences in airlight between the original and synthetic hazy images by designing the pseudo airlight coefficient supervision loss. The airlight estimated from the low-frequency sub-bands of the synthetic and original hazy images is denoted A' and A, respectively. The pseudo airlight coefficient supervision loss over N samples is defined in the L1 form as

L_A = \sum_{i=1}^{N} \| A_i - A'_i \|_1    (13)
Fig. 6. Example images of our constructed dataset. We collect polarized images with polarization angles of 0°, 60°, and 120°, as well as the hazy image I. The obtained hazy images are divided into three concentration levels: dense, heavy, and thin. In addition, the dataset covers both horizontal and vertical directions, effectively solving the problem of the single perspective in existing datasets.

Finally, the loss function of the whole network consists of the pixel-level loss, SSIM loss, adversarial loss, and pseudo airlight coefficient supervision loss, and can be expressed as

L = \lambda_P L_P + \lambda_{SSIM} L_{SSIM} + \lambda_{Adv} L_{Adv} + \lambda_A L_A    (14)

where each \lambda represents the weight coefficient of the corresponding loss.
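Putting the pieces together, a hedged single-sample sketch of Eq. (14) with the weights reported in Sec. 4.1; the adv_term argument stands in for log(1 − D(G(·))), which requires the trained discriminator, and all names are ours:

```python
import numpy as np

def total_loss(I, I_syn, A, A_syn, adv_term, w=(1.0, 0.5, 0.1, 1.0)):
    """Eq. (14): L = w_P*L_P + w_SSIM*L_SSIM + w_Adv*L_Adv + w_A*L_A."""
    C1, C2 = 0.01**2, 0.03**2
    L_P = np.abs(I - I_syn).sum()                        # Eq. (9), L1
    mu1, mu2 = I.mean(), I_syn.mean()
    cov = ((I - mu1) * (I_syn - mu2)).mean()
    ssim = ((2 * mu1 * mu2 + C1) * (2 * cov + C2)) / \
           ((mu1**2 + mu2**2 + C1) * (I.var() + I_syn.var() + C2))
    L_SSIM = 1.0 - ssim                                  # Eq. (11)
    L_A = np.abs(A - A_syn).sum()                        # Eq. (13), L1
    return w[0] * L_P + w[1] * L_SSIM + w[2] * adv_term + w[3] * L_A

img = np.random.default_rng(2).random((8, 8))
airlight = np.full((8, 8), 0.3)
loss = total_loss(img, img, airlight, airlight, adv_term=0.0)
# Perfect reconstruction and matching airlight -> zero loss.
```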


4. Experimental Results

4.1 Dataset construction and parameter settings

Dataset. Most existing hazy image datasets are synthetic, but they cannot reflect the real-world haze distribution, especially when the wind is strong. The small number of real-world datasets that create artificial haze using a smoke generator also have many problems. For a long time, the release of public datasets in the field of polarization dehazing has not attracted enough attention from the research
Fig. 7. The UAV for vertical direction data acquisition.

community, which has seriously restricted the popularization and application of polarization methods.

In order to efficiently build polarization datasets, we set up multiple polarization data collection platforms with polarization cameras and carried out long-term data collection. We have collected more than 30,000 hazy polarization images, some of which have been released publicly as datasets, as shown in Fig. 6.

We selected 600 polarization images from each of the three haze concentration levels in the horizontal direction, with the haze concentration graded by the no-reference image quality index FADE [26]. In addition to the horizontal data, vertical-direction data are equally important as an experimental reference, because they allow evaluating dehazing methods for aerial work; we therefore also used 600 polarization images in the vertical direction, collected by our unmanned aerial vehicle (UAV) platform, as shown in Fig. 7.

Parameters. We set the four weight parameters \lambda_P, \lambda_{SSIM}, \lambda_{Adv}, and \lambda_A to 1, 0.5, 0.1, and 1, respectively. The non-subsampled pyramid (NSP) was used in all frequency domain decomposition modules, with the number of decomposition layers set to 4. Regarding input size, the original images captured by the polarization camera are 2480×1860×3; the polarization images and the original hazy image are downsampled to 320×240×3. The entire network is trained on one Nvidia RTX A6000 GPU.

4.2 Qualitative experimental results in the horizontal and vertical directions

We chose dehazing algorithms based on priors and polarization, including DCP [2] and PBD [17]. We also chose five dehazing algorithms based on deep learning: YOLY [32], D4 [24], Dehamer [11], DehazeFormer [33], and DEA-Net [35]. Our method is denoted PGAN. The experimental results of the two groups are shown in Fig. 8 and Fig. 9, respectively.


In the horizontal experimental design, the haze concentration in the selected scenes (1) to (6) increases monotonically, in order to facilitate testing the performance of the algorithms under different haze concentrations. In selecting scenes, we deliberately included many areas that are difficult for dehazing algorithms, such as water surfaces, bright white buildings, and white trucks, and some scenes were specifically shot under low-light conditions. This test data organization differs significantly from the overly uniform data organization in existing work; we hope it allows the performance of the algorithms to be reflected more comprehensively.

Horizontal direction comparison. The results of DCP and PBD differ greatly in thin haze scenes or scenes containing white objects, such as scenes 1 and 2. This can be attributed to DCP being prone to estimating a higher airlight radiance from an infinite distance. In dense haze conditions, such as scenes 5 and 6, their results
Fig. 8. Qualitative experimental results of the data in horizontal direction. It can be observed that our
proposed PGAN demonstrates a better dehazing effect than other polarization-based and
deep-learning-based state-of-the-art methods. Especially in the dense haze scenes, PGAN presents a strong
dehazing ability.

are similar due to the strong haze coverage.


It is obvious that PGAN achieves excellent results on all horizontal direction images. Even in dark light and dense haze conditions, such as scenes 4 to 6, the improvement in visibility is very effective. This is made possible by the introduction of the polarization calculation module and frequency information, which constrain the model to recover the hazy image as clearly and reasonably as possible under the atmospheric scattering model. Some image noise is inevitably generated in the experimental results of each method. We repeatedly adjusted the degree-of-polarization correction factor of PBD, but the local color shift could not be avoided.
Fig. 9. Qualitative experimental results of the data in the vertical direction. It can be observed that our proposed PGAN demonstrates a better dehazing effect than other polarization-based and deep-learning-based state-of-the-art methods. Data-driven deep learning-based methods lack physical model constraints and demonstrate poorer dehazing results in the absence of vertical-view training data. However, PGAN exhibits a robust dehazing capability across different views.

Deep learning-based methods can better restore image details in thin haze scenes, but
YOLY and D4 are worse than the more advanced DEA-Net and DehazeFormer in
brightness recovery. In heavy and dense haze scenes, almost all deep learning-based
methods demonstrate a poor dehazing effect. Dehamer exhibits a heavier color shift in
scene 5 and even a seriously mistaken result in scene 6. In addition to restoring
clearer dehazed images under dense haze conditions, PGAN can also enhance the
visibility of targets under low light conditions. Although some haze residue and
saturation reduction remain, they do not affect the first visual impression.
Vertical direction comparison. The visibility of hazy images taken in the vertical
direction is related to the height of the UAV. Most of the methods, such as PBD,
DehazeFormer and PGAN, achieve good results in scenes 1 to 3, because these hazy
images were collected on sunny days. But in scenes 4 to 6, only PBD and PGAN can
maintain high brightness, because these scenes contain many areas with very weak
illumination or objects with weak reflectivity. For example, the trees under the tall
buildings are shielded from most of the light in scene 5, and the asphalt road reflects
little light in scenes 4 and 6. Compared with all other methods, the experimental
results of PGAN are clearer, and the target recovery in the low light scenes is
excellent; in particular, the details of the blue dome in scene 5, as well as the trees
and paths, are clearly restored. These results demonstrate that our method maintains
excellent dehazing capability in a variety of scenarios.

4.3 Quantitative experimental results


We adopt widely recognized no-reference image quality assessments to
quantitatively evaluate the dehazing results of our method compared with other
dehazing methods. It should be noted that, unlike dehazing algorithms that can use
synthetic data, polarization-based dehazing algorithms cannot directly use
reference-based image quality assessments when quantitatively evaluating dehazed
images. Although some works have suggested collecting haze-free data at different
times in the same scene or using the experimental results of a certain algorithm as
ground truth, the ground truth obtained in this way is not strict, so these practices
have not yet been widely recognized. Our selected no-reference image quality
assessments include the Fog Aware Density Evaluator (FADE) [26], Blind Image
Quality Index (BIQI) [28], Blind/Referenceless Image Spatial Quality Evaluator
(BRISQUE) [29] and Natural Image
Table 1
Quantitative experimental results of the data in horizontal direction. No-reference image quality
assessments are calculated.

Type            Methods            Publication  FADE↓  BIQI↓  BRISQUE↓  NIQE↓  US↑
Original image  Hazy               /            6.73   85.16  45.81     9.78   /
Prior           DCP [2]            TPAMI’10     0.67   58.86  18.74     5.40   0.15
                PBD [17]           AO’03        0.65   59.90  16.74     4.81   0.27
                CAP [4]            TIP’15       3.48   78.63  20.01     7.71   0.12
Deep-Learning   AOD-Net [6]        ICCV’17      3.06   66.94  24.49     9.16   0.18
                FD-GAN [9]         AAAI’20      1.66   71.41  26.06     6.07   0.11
                YOLY [32]          IJCV’21      2.08   71.55  33.10     7.47   0.25
                D4 [24]            CVPR’22      1.54   75.18  33.68     7.58   0.09
                Dehamer [11]       CVPR’22      1.50   53.94  30.62     6.54   0.20
                DehazeFormer [33]  TIP’23       1.89   82.98  40.40     9.15   0.17
                SGDRL [34]         NN’24        2.76   79.77  34.58     8.36   0.12
                DEA-Net [35]       TIP’24       2.52   83.31  37.46     9.28   0.08
                PGAN               Ours         0.43   50.01  16.19     4.73   0.43

Table 2
Quantitative experimental results of the data in vertical direction. No-reference image quality
assessments are calculated.

Methods            FADE↓  BIQI↓  BRISQUE↓  NIQE↓  US↑
Hazy               1.98   75.63  34.88     7.00   /
DCP [2]            0.58   53.44  26.62     5.84   0.12
PBD [17]           1.18   52.14  29.37     5.71   0.10
YOLY [32]          0.47   42.02  38.16     5.26   0.23
D4 [24]            0.92   61.20  29.56     6.04   0.19
Dehamer [11]       0.79   42.27  42.17     5.93   0.07
DehazeFormer [33]  1.04   73.60  31.59     6.54   0.08
SGDRL [34]         1.12   72.46  30.32     6.17   0.10
DEA-Net [35]       1.18   73.23  31.07     6.56   0.07
PGAN               0.37   44.24  22.01     5.22   0.52

Quality Evaluator (NIQE) [30]. These indices, based on natural scene statistics, have
different priorities, so together they reflect the dehazing performance
comprehensively rather than focusing only on visibility improvement. US represents
the percentage of votes in the user study. The mean score of each dehazing algorithm
in different groups is listed in Table 1 and Table 2. The best scores are marked in
bold font.

In the horizontal direction, PGAN has achieved significant advantages in all


evaluation metrics. Physical model-based methods such as PBD and DCP achieve
better FADE than deep learning-based methods, indicating higher visibility in the
dehazed images. This confirms that physical constraints are a key factor in dehazing,
which is why our PGAN achieves the best FADE after introducing the polarization
information constraint. Deep learning-based methods likewise perform poorly on
BRISQUE, whereas our score of 16.19 indicates that our method introduces minimal
image distortion and achieves the best human visual effect. The best NIQE shows that
our method preserves the natural attributes of the image (e.g., contrast and clarity)
during the dehazing process.

In the vertical direction, PGAN attains the best FADE, BRISQUE and NIQE.
PGAN achieves only the second best BIQI, indicating that some noise is introduced
when generating dehazed images. This is an inevitable problem for generation-based
methods, as reflected by the high BIQI value of D4. YOLY achieves the best BIQI
through a layer disentanglement network that does not require training. The US
scores reflect that PGAN is the most recognized method among the user groups.
These results demonstrate that our method achieves robust dehazing performance in
natural scenes with multiple perspectives and uneven haze.



4.4 Ablation Study


The ablation experiments are performed to verify the effectiveness of our
polarization-based generator and frequency distribution-based discriminator. For the
generator, we remove the airlight Stokes calculation module in PGAN-A and replace
the input of PGAN-A with the original hazy image to estimate the airlight. This
method is denoted as w/o SP. For the discriminator, we remove the frequency
decomposition module and use only the original and synthetic hazy images as input
Fig. 10. Qualitative results of the ablation experiment. “w/o SP” means removing the Stokes calculation
module of PGAN-A and replacing the input with the original hazy image. “w/o FS” means removing the
frequency decomposition module of the discriminator and the pseudo airlight coefficient supervision loss.

Table 3
Quantitative results of the ablation experiment. No-reference image quality assessments are calculated.

         FADE↓  BIQI↓  BRISQUE↓  NIQE↓  US↑
Hazy     6.68   90.49  35.87     10.77  /
w/o SP   4.59   64.13  23.30     8.62   0.12
w/o FS   1.78   55.43  14.54     4.77   0.41
PGAN     0.59   44.65  12.45     3.95   0.47

samples. Meanwhile, the pseudo airlight coefficient supervision loss is removed. This
pe
method is denoted as w/o FS.
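The internals of the frequency decomposition module are not reproduced in this section. A common way such frequency-separated samples can be obtained is a Fourier-domain low-pass/high-pass split; the sketch below is an illustration under that assumption (an ideal low-pass mask with an arbitrarily chosen `cutoff`), not the paper's actual module.

```python
import numpy as np

def frequency_split(img, cutoff=0.1):
    """Split a 2-D image into low- and high-frequency components using an
    ideal low-pass mask in the Fourier domain (illustrative only; the
    paper's frequency decomposition module may differ).

    img    : 2-D float array.
    cutoff : radius of the pass band as a fraction of the smaller image side.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(img))       # center the DC term
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.hypot(yy - h / 2, xx - w / 2)            # distance from DC term
    mask = dist <= cutoff * min(h, w)                  # ideal low-pass mask
    low = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))
    high = img - low                                   # residual = high frequencies
    return low, high

img = np.random.default_rng(0).random((64, 64))
low, high = frequency_split(img)
```

By construction the two components sum exactly back to the original image, so the split discards no information; a discriminator can then inspect the structural (low-frequency) and detail (high-frequency) content separately.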

As shown in Fig. 10, in the absence of the Stokes calculation module, the
estimation accuracy of the airlight decreases once the polarization information is lost.
As a result, the dehazing performance of w/o SP is very limited; in fact, it is similar
to that of most dehazing methods based on generative adversarial networks. If only
the frequency decomposition module of our discriminator is removed, although the
visibility of the scene is improved overall, the experimental results of w/o FS recover
noticeably less detail than PGAN. It can be observed that the details of trees and
buildings marked by red boxes are blurred, whereas these details are restored in the
experimental results of PGAN.

Quantitative analysis of the experimental results for the three protocols is shown
in Table 3. It is easy to see that the introduction of polarization characteristics brings
the greatest improvement in the fog removal

performance of the method, while the frequency domain decomposition module can

further optimize the overall performance of the network.

5. Conclusion
In this work, to address the lack of real-world labels and improve the
effectiveness and robustness of deep learning-based dehazing methods, we propose a
self-supervised polarization image dehazing method using frequency domain
generative adversarial networks, denoted as PGAN in this paper. The
polarization-based generator reconstructs the synthetic hazy image using the Stokes
parameters and the dehazed image produced by the densely connected
encoder-decoder, so as to realize self-supervised learning. The frequency
distribution-based discriminator is more powerful because of the frequency-separated
samples and the pseudo airlight coefficient supervision loss. Benefiting from the
introduction of polarization and frequency information, PGAN can estimate the
airlight more accurately and obtain more detailed dehazed images even without
real-world clear images. In the experiments, PGAN achieves advanced performance
on real-world datasets, and the ablation experiments verify the effectiveness of each
module.



CRediT authorship contribution statement

Rui Sun: Writing – review & editing, Methodology, Investigation, Funding
acquisition, Conceptualization. Long Chen: Writing – original draft, Validation,
Methodology, Investigation, Conceptualization. Tanbin Liao: Writing – review &
editing, Investigation, Conceptualization. Zhiguo Fan: Writing – review & editing,
Supervision.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal
relationships that could have appeared to influence the work reported in this paper.

Data availability
Data will be made available on request.

Acknowledgments
This work was supported in part by the National Natural Science Foundation of China
under Grant 62302142 and Grant 61876057; in part by the China Postdoctoral
Science Foundation under Grant 2022M720981; in part by the Anhui Province
Natural Science Foundation under Grant 2208085MF158; and in part by the Key
Research Plan of Anhui Province - Strengthening Police with Science and
Technology under Grant 202004d07020012.

References

[1] X. Guo, Y. Yang, C. Wang and J. Ma, “Image dehazing via enhancement, restoration, and fusion: A survey,” Information Fusion, vol. 86, pp. 146–170, 2022.
[2] K. He, J. Sun, and X. Tang, “Single image haze removal using dark channel prior,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, pp. 2341–2353, 2011.
[3] G. Meng, Y. Wang, J. Duan, S. Xiang, and C. Pan, “Efficient image dehazing with boundary constraint and contextual regularization,” in IEEE International Conference on Computer Vision (ICCV), 2013, pp. 617–624.
[4] Q. Zhu, J. Mai, and L. Shao, “A fast single image haze removal algorithm using color attenuation prior,” IEEE Trans. Image Process., vol. 24, pp. 3522–3533, 2015.
[5] W. Ren et al., “Single image dehazing via multi-scale convolutional neural networks,” in European Conference on Computer Vision (ECCV), 2016.
[6] B. Li, X. Peng, Z. Wang, J. Xu, and D. Feng, “AOD-Net: All-in-one dehazing network,” in IEEE International Conference on Computer Vision (ICCV), 2017, pp. 4770–4778.
[7] B. Li, Y. Gou, J. Z. Liu, H. Zhu, J. T. Zhou and X. Peng, “Zero-shot image dehazing,” IEEE Trans. Image Process., vol. 29, pp. 8457–8466, 2020, doi: 10.1109/TIP.2020.3016134.
[8] S. Zhang, X. Zhang, S. Wan, W. Ren, L. Zhao and L. Shen, “Generative adversarial and self-supervised dehazing network,” IEEE Trans. Ind. Informat., vol. 20, no. 3, pp. 4187–4197, March 2024.
[9] Y. Dong, Y. Liu, H. Zhang, S. Chen, and Y. Qiao, “FD-GAN: Generative adversarial networks with fusion-discriminator for single image dehazing,” in AAAI Conference on Artificial Intelligence (AAAI), 2020, pp. 10729–10736.
[10] H. Dong et al., “Multi-scale boosted dehazing network with dense feature fusion,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 2154–2164.
[11] C. Guo, Q. Yan, S. Anwar, R. Cong, W. Ren and C. Li, “Image dehazing transformer with transmission-aware 3D position embedding,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 5802–5810.
[12] R.-Q. Wu, Z.-P. Duan, C.-L. Guo, Z. Chai and C. Li, “RIDCP: Revitalizing real image dehazing via high-quality codebook priors,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 22282–22291.

[13] Liang, S. Li, D. Cheng, W. Wang, D. Li and J. Liang, “Image dehazing via self-supervised depth guidance,” Pattern Recognition, vol. 158, pp. 111051, 2025.
[14] T. Jia, J. Li, L. Zhuo and G. Li, “Effective meta-attention dehazing networks for vision-based outdoor industrial systems,” IEEE Trans. Ind. Informat., vol. 18, no. 3, pp. 1511–1520, March 2022.
[15] Y. Cui, Q. Wang, C. Li, W. Ren and A. Knoll, “EENet: An effective and efficient network for single image dehazing,” Pattern Recognition, vol. 158, pp. 111074, 2025.
[16] N. Jiang, K. Hu, T. Zhang, W. Chen, Y. Xu and T. Zhao, “Deep hybrid model for single image dehazing and detail refinement,” Pattern Recognition, vol. 136, pp. 109227, 2023.
[17] Y. Y. Schechner, S. G. Narasimhan, and S. K. Nayar, “Polarization-based vision through haze,” Appl. Opt., vol. 42, pp. 511–525, 2003.


[18] Y. Qu and Z. Zou, “Non-sky polarization-based dehazing algorithm for non-specular objects using polarization difference and global scene feature,” Opt. Express, vol. 25, pp. 25004–25022, 2017.
[19] F. Huang, C. Ke, X. Wu, S. Wang, J. Wu, and X. Wang, “Polarization dehazing method based on spatial frequency division and fusion for a far-field and dense hazy image,” Appl. Opt., vol. 60, pp. 9319–9332, 2021.
[20] R. Sun, T. Liao, Z. Fan, X. Zhang, and C. Wang, “Polarization dehazing method based on separating and iterative optimizing airlight from the frequency domain for different concentrations of haze,” Appl. Opt., vol. 61, pp. 10362–10373, 2022.

[21] A. Dudhane, H. S. Aulakh and S. Murala, “RI-GAN: An end-to-end network for single image haze removal,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019, pp. 2014–2023.
[22] A. Dudhane and S. Murala, “CDNet: Single image de-hazing using unpaired adversarial training,” in IEEE Winter Conference on Applications of Computer Vision (WACV), 2019, pp. 1147–1155.
[23] W. Liu, X. Hou, J. Duan and G. Qiu, “End-to-end single image fog removal using enhanced cycle consistent adversarial networks,” IEEE Trans. Image Process., vol. 29, pp. 7819–7833, 2020.
[24] Y. Yang, C. Wang, R. Liu, L. Zhang, X. Guo and D. Tao, “Self-augmented unpaired image dehazing via density and depth decomposition,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 2027–2036.
[25] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, pp. 600–612, 2004.
[26] L. K. Choi, J. You, and A. C. Bovik, “Referenceless prediction of perceptual fog density and perceptual image defogging,” IEEE Trans. Image Process., vol. 24, pp. 3888–3901, 2015.
[27] A. L. Da Cunha, J. Zhou, and M. N. Do, “The nonsubsampled contourlet transform: theory, design, and applications,” IEEE Trans. Image Process., vol. 15, pp. 3089–3101, 2006.
[28] A. K. Moorthy and A. C. Bovik, “A two-step framework for constructing blind image quality indices,” IEEE Signal Process. Lett., vol. 17, pp. 513–516, 2010.
[29] A. Mittal, A. K. Moorthy, and A. C. Bovik, “No-reference image quality assessment in the spatial domain,” IEEE Trans. Image Process., vol. 21, pp. 4695–4708, 2012.
[30] A. Mittal, R. Soundararajan, and A. C. Bovik, “Making a ‘completely blind’ image quality analyzer,” IEEE Signal Process. Lett., vol. 20, pp. 209–212, 2013.
[31] Y. Wang et al., “UCL-Dehaze: Toward real-world image dehazing via unsupervised contrastive learning,” IEEE Trans. Image Process., vol. 33, pp. 1361–1374, 2024, doi: 10.1109/TIP.2024.3362153.
[32] B. Li, Y. Gou, S. Gu, J. Z. Liu, J. T. Zhou and X. Peng, “You only look yourself: Unsupervised and untrained single image dehazing neural network,” International Journal of Computer Vision, vol. 129, pp. 1754–1767, 2021.
[33] Y. Song, Z. He, H. Qian and X. Du, “Vision transformers for single image dehazing,” IEEE Trans. Image Process., vol. 32, pp. 1927–1941, 2023, doi: 10.1109/TIP.2023.3256763.
[34] T. Jia, J. Li, L. Zhuo, and J. Zhang, “Self-guided disentangled representation learning for single image dehazing,” Neural Networks, vol. 172, pp. 106107, 2024.
[35] Z. Chen, Z. He and Z.-M. Lu, “DEA-Net: Single image dehazing based on detail-enhanced convolution and content-guided attention,” IEEE Trans. Image Process., vol. 33, pp. 1002–1015, 2024, doi: 10.1109/TIP.2024.3354108.


Rui Sun received the B.S. degree from Central South University, China, in 1998, the
M.S. degree from Harbin Engineering University, China, in 2000, and the Ph.D.
degree from Huazhong University of Science and Technology, China, in 2003. He
was a visiting scholar in the Computer Science Department, University of
Missouri-Columbia, USA, from 2010 to 2011. He is currently a professor in the
School of Computer and Information of Hefei University of Technology, China. His
research interests include object recognition and tracking, computer vision, and
machine learning.

Long Chen received the B.S. degree from Hefei University of Technology, China,
in 2022. He is currently pursuing the M.S. degree at Hefei University of Technology.
His research interests include machine learning and computer vision, especially
cross-modal person re-identification and corruption robustness.

Tanbin Liao received the B.S. degree from Hefei University of Technology, China,
in 2021. He is currently pursuing the M.S. degree at Hefei University of Technology.
His research interests include machine learning and computer vision, especially
polarization image dehazing.


Zhiguo Fan was born in Anhui, China, in 1978. He received the M.S. and Ph.D.
degrees from the Hefei University of Technology in 2007 and 2011, respectively. He
is a professor with the School of Computer and Information, Hefei University of
Technology. He is involved in bionic polarized light navigation, polarization optical
detection, research on intelligent information processing theory, and the development
of related application technology. In recent years, he has been granted 17 invention
patents and has published more than 40 academic articles and one academic
monograph [Bionic Polarized Light Navigation Method (Science Press), 2014].
