Self-supervised Polarization Image Dehazing
Method via Frequency Domain Generative
Adversarial Networks
Rui Sun a, b, c, *, Long Chen a, b, Tanbin Liao a, b, Zhiguo Fan a, b
a School of Computer Science and Information Engineering, Hefei University of Technology, No. 485
Danxia Road, Hefei 230009, China
b Key Laboratory of Industry Safety and Emergency Technology, Hefei University of Technology, Hefei
230009, China
c Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education of the People's Republic of China
surveillance and remote sensing. Image dehazing is the key technology for improving the sharpness of images acquired in haze. However, the lack of paired annotation of
of airlight are accurately calculated, which are used to reconstruct the synthesized hazy image with the dehazed image generated via a densely connected encoder-decoder. By discriminating between synthetic hazy images and real hazy images, we achieve adversarial training without paired data. At the same time, supervised by the atmospheric scattering model, our network can iteratively generate more realistic
* Corresponding author.
E-mail address: sunrui@hfut.edu.cn (R. Sun).
This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=5059199
datasets demonstrate that our method achieves state-of-the-art performance in
experiments without the need for real-world ground truth.
1. Introduction
In modern society, with increasingly serious pollution, severe weather conditions occur with greater frequency. The presence of haze particles in the atmosphere weakens the sharpness and contrast of images. There is no doubt that haze reduces the ability of computer vision algorithms to perceive scene content.
Existing dehazing methods are mainly based on priors or deep learning. Prior-based methods rely on the atmospheric scattering model, which is widely used in computer vision to explain the physical causes of the imaging process. Once its parameters, such as the transmission and the infinite airlight intensity, are estimated from the hazy image, the dehazed image can be recovered. However, these methods [2-4] accumulate cascaded errors due to the employed priors, limiting their performance in improving visibility.
In recent years, many researchers have tried to use deep learning methods to
estimate parameters in the atmospheric scattering model [5-8], which reduces the
Fig. 1. Dehazing models with different architectures. (a) represents GAN-based dehazing methods; these methods rely on the ground truth of the hazy image, while the ground truth of real scenes is usually difficult to define and obtain, and synthetic data-driven methods generalize poorly in the real world. (b) represents our proposed self-supervised PGAN; it does not require additional ground truth of the hazy image and effectively improves the authenticity of the dehazed image by introducing the atmospheric scattering model. Our dehazed image is generated inside the generator; the specific structure can be seen in Fig. 4.
methods do not use the atmospheric scattering model [9-16, 32-35] and are called end-to-end methods. They recover the dehazed image directly by learning the mapping from hazy images to clear ones, but they are strictly limited by training datasets. Because it is impossible to acquire a real-world hazy image and its ground truth at the same time, synthetic datasets are used to train most networks. Their performances on real-world data are therefore limited.
Unlike methods that rely on a single hazy image, dehazing methods based on polarization [17-20] have distinct advantages because they utilize polarization properties. The intensity of airlight varies regularly with the angle of
polarization. With this law, the intensity of airlight can be calculated using the degree of polarization, recovering the dehazed image. Although there are some deficiencies, the
information.
End-to-end dehazing networks have gradually become the most popular with the rapid development of deep learning, but few researchers pay attention to the application of polarization, which may help networks learn the physical properties of hazy
important limitation. Therefore, with the advantage of the large amount of polarization data [20] we have previously collected, we try to introduce polarization properties into deep-learning-based dehazing methods, hoping to enhance their performance and robustness on real-world datasets. In this work, we propose a
First, polarization images are used to calculate the Stokes parameters of airlight while generating the dehazed image with a densely connected encoder-decoder. The generated dehazed image and Stokes parameters are used to synthesize the hazy image through the physical model, which serves as one of the input samples of the discriminator. By adding the polarization calculation module, our method only needs to acquire three hazy polarization images at different angles at one time, without any real-world ground truth.
The input synthetic and original hazy images are separated in the frequency
domain, and they are combined with their respective high- and low-frequency sub-bands as samples for the discriminator, which effectively enhances the supervision of the discriminator.
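As an illustration of this frequency separation, the following numpy sketch splits an image into low- and high-frequency parts with an ideal FFT low-pass filter; this is only a stand-in for the non-subsampled pyramid used in the paper, and the `radius` parameter is an illustrative assumption.

```python
import numpy as np

def split_frequency(img, radius=16):
    """Split a grayscale image into low- and high-frequency parts with an
    ideal low-pass filter in the FFT domain (a simple stand-in for the
    non-subsampled pyramid decomposition used by PGAN)."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))
    high = img - low  # the two parts sum back to the input exactly
    return low, high

# The discriminator would then see the image stacked with its sub-bands.
img = np.random.rand(64, 64)
low, high = split_frequency(img)
sample = np.stack([img, low, high])  # shape (3, 64, 64)
```

Stacking the sub-bands with the image gives the discriminator explicit access to the frequency statistics that distinguish hazy from clear content.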
the supervised ability of the self-supervised training stage. This loss function is based on the atmospheric scattering model, which effectively avoids producing excessive image noise and further enhances the robustness and generalization ability
hazy image and real hazy image, our generator can generate realistic dehazed images that conform to the atmospheric scattering model as much as possible. Our main
by the frequency distribution of hazy images. It solves the limitation of paired training datasets, and significantly improves the dehazing performance of the network on real-world datasets.
·The pseudo airlight coefficient supervision loss is designed for our self-supervised
Fig. 2. Our fixed atmospheric polarization information observation platform; the UAV atmospheric polarization information observation platform can be seen in Sec. 4.1. We conducted extensive field observation experiments under foggy, hazy and other weather conditions, and obtained multi-target polarization data under various meteorological conditions.
respectively. Through this platform, we can better obtain polarization hazy data and
scattering media.
The atmospheric scattering model explains the contrast decay of hazy images from the perspective of the physical mechanism [1], as shown in Fig. 3. The mathematical expression is

$$I(x) = D(x) + A(x) = J(x)\,t(x) + A_\infty\,(1 - t(x)) \tag{1}$$

where I(x) is the captured hazy image, x is the position of an image pixel, and t(x) is the transmission of the medium. I(x) consists of two parts, the direct transmission D(x) and the airlight A(x),
Fig. 3. Illustration of the atmospheric scattering model.
respectively. J(x) is the object radiance, and it is the result to be recovered. A∞ is the airlight radiance from the infinite distance, i.e., the illumination.
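The model above can be sketched numerically as follows; the depth map, the scattering coefficient `beta` and the `A_inf` value below are illustrative assumptions, with the transmission taken as t(x) = exp(-beta * d(x)).

```python
import numpy as np

# Atmospheric scattering model:
# I(x) = D(x) + A(x) = J(x) * t(x) + A_inf * (1 - t(x))
def hazy_from_clear(J, depth, beta=1.0, A_inf=0.9):
    t = np.exp(-beta * depth)        # transmission of the medium
    return J * t + A_inf * (1.0 - t) # direct transmission + airlight

J = np.full((4, 4), 0.5)        # toy clear scene radiance
depth = np.ones((4, 4))         # toy depth map
I = hazy_from_clear(J, depth)   # brighter, lower-contrast hazy image
```

As depth grows, t(x) decays and the airlight term dominates, which is exactly the contrast decay the model describes.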
degree of polarization to describe the proportion of polarized light in natural light [17]. But this method has poor real-time performance. Nowadays, polarization cameras generally have division-of-focal-plane sensors that can acquire polarization images at multiple angles simultaneously. When there are polarization images at three or more angles (I0, I45, I90, I135), the Stokes vector can be calculated as
$$\mathbf{S} = \begin{bmatrix} S_I \\ S_Q \\ S_U \\ S_V \end{bmatrix} = \begin{bmatrix} I_0 + I_{90} \\ I_0 - I_{90} \\ I_{45} - I_{135} \\ 0 \end{bmatrix} \tag{2}$$

where SI denotes the total light intensity, and SQ and SU denote the intensities of the linearly polarized components.
The degree and angle of polarization, denoted as p and θ respectively, can be given by
$$p = \frac{\sqrt{S_Q^2 + S_U^2}}{S_I} \tag{3}$$
$$\theta = \frac{1}{2}\arctan\frac{S_U}{S_Q} \tag{4}$$
The airlight radiance from the infinite distance is estimated using the Stokes vector
as shown in
$$A_\infty = \frac{2\,I_0^A}{1 + p_A \cos 2\theta_A} \tag{5}$$
r
where pA and θA denote the degree and angle of polarization of the airlight.
Finally, the mathematical expression for obtaining the dehazed image J is given as follows:
$$J = \frac{S_I - S_Q / p_A}{1 - S_Q / (p_A A_\infty)} \tag{6}$$
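The chain from Eqs. (2)-(4) and (6) can be sketched in a few lines of numpy; here the airlight parameters `p_A` and `A_inf` are assumed given (the paper estimates them via Eq. (5)), and the pixel values are toy numbers.

```python
import numpy as np

def stokes(I0, I45, I90, I135):
    # Eq. (2): linear Stokes components from four polarization images
    return I0 + I90, I0 - I90, I45 - I135

def dop_aop(S_I, S_Q, S_U):
    p = np.sqrt(S_Q ** 2 + S_U ** 2) / S_I  # Eq. (3): degree of polarization
    theta = 0.5 * np.arctan2(S_U, S_Q)      # Eq. (4); arctan2 avoids S_Q = 0 issues
    return p, theta

def dehaze(S_I, S_Q, p_A, A_inf):
    # Eq. (6): recovered object radiance J
    return (S_I - S_Q / p_A) / (1.0 - S_Q / (p_A * A_inf))

S_I, S_Q, S_U = stokes(0.6, 0.55, 0.4, 0.45)  # toy pixel values
p, theta = dop_aop(S_I, S_Q, S_U)
J = dehaze(S_I, S_Q, p_A=0.5, A_inf=1.0)
```

The same functions apply element-wise to full image arrays, which is how a per-pixel dehazed image would be produced.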
results in computer vision tasks. There are many GAN-based methods in image dehazing [9,21-24], which learn the mapping from hazy images to clear images.
As shown in Fig. 1(a), the generator generates a dehazed image from the input hazy image. Then, the discriminator scores the similarity between the dehazed image and the ground truth. Generally, this score ranges from 0 to 1, and the higher it is, the more similar the dehazed image is to the ground truth. Fed back this score, the generator generates higher-quality dehazed images. The training process requires a large amount of paired data. Nevertheless, paired real-world datasets are very
Fig. 4. The architecture of our proposed method PGAN. The generator input consists of three polarized images and the original hazy image computed from f, where f refers to (6). PGAN-J is a densely connected encoder-decoder used to generate the dehazed image, and PGAN-A is a polarization calculation module designed to estimate the atmospheric parameters in the hazy image and provide physical constraints. The discriminator input consists of the original hazy image and the synthesized hazy image output by the generator. The discriminative power of the discriminator is enhanced by utilizing frequency distribution properties. For the trained model, we only use PGAN-J to generate dehazed images.
difficult to acquire. Moreover, this mapping from hazy images to clear ones lacks
PGAN does not require paired data and conforms to the constraints of the physical model, which gives it high real-world applicability and good interpretability, as shown in Fig. 1(b).
3. Proposed Method
As shown in Fig. 4, the proposed method PGAN is mainly composed of the
utilizes polarization images, and the output is a synthetic hazy image rather than an
changes to the synthetic hazy image and the original one. The original hazy image that
is regarded as the generating domain data can be calculated from the polarization
images.
The loss function is built between the original hazy image and the synthetic hazy
Our generator uses polarization images to calculate the Stokes vector of airlight and uses the atmospheric scattering model to re-synthesize the hazy image after acquiring the dehazed image. A densely connected encoder-decoder generates the dehazed image J from the original hazy
image. Since each neuron in the dense connection layer receives input from all neurons in the previous layer, the dense connection layer can effectively facilitate feature extraction and scene reconstruction in low-level computer vision tasks [9], and excels at generating clearer dehazed images. The original hazy image I can be calculated from the polarization images using the Stokes vector, as shown in
$$I = \frac{2}{3}(I_0 + I_{60} + I_{120}) \tag{6}$$
where I0, I60 and I120 represent the polarization images at angles 0°, 60° and 120°, respectively.
PGAN-A calculates the Stokes vector of airlight from three polarization images. As illustrated in Fig. 5, this module utilizes the frequency distribution to separate the airlight and avoid the halo effect, which
Fig. 5. The illustration of PGAN-A. PGAN-A is a polarization calculation module employed to
calculate the Stokes vector of airlight from three polarization images.
does not need to model the complex polarization properties of objects. Specifically, the polarization images are decomposed via the non-subsampled pyramid (NSP) [27]. Because of the airlight constraints, the decomposed low-pass sub-bands are regarded as the airlight components, from which the Stokes vector of airlight is calculated as
$$\mathbf{S}_A = \begin{bmatrix} S_{AI} \\ S_{AQ} \\ S_{AU} \end{bmatrix} = \begin{bmatrix} \frac{2}{3}(A_0 + A_{60} + A_{120}) \\ \frac{2}{3}(2A_0 - A_{60} - A_{120}) \\ \frac{2}{\sqrt{3}}(A_{60} - A_{120}) \end{bmatrix} \tag{7}$$
where A0, A60 and A120 represent the airlight acquired from the polarization images at angles 0°, 60° and 120°, respectively. On this basis, equation (5) is used to calculate A∞.
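The coefficients in Eq. (7) invert the standard polarizer measurement model A(θ) = (SI + SQ cos 2θ + SU sin 2θ)/2; the sketch below assumes that model and checks the round trip with toy Stokes values.

```python
import numpy as np

def stokes_three_angle(A0, A60, A120):
    # Eq. (7): Stokes components of airlight from three low-pass
    # sub-bands at polarizer angles 0°, 60°, 120°
    S_AI = 2.0 / 3.0 * (A0 + A60 + A120)
    S_AQ = 2.0 / 3.0 * (2.0 * A0 - A60 - A120)
    S_AU = 2.0 / np.sqrt(3.0) * (A60 - A120)
    return S_AI, S_AQ, S_AU

# Round trip: synthesize measurements from known Stokes values via
# A(theta) = (S_I + S_Q*cos(2*theta) + S_U*sin(2*theta)) / 2, then recover.
S_I, S_Q, S_U = 1.0, 0.2, 0.1
A = [0.5 * (S_I + S_Q * np.cos(2 * t) + S_U * np.sin(2 * t))
     for t in np.deg2rad([0.0, 60.0, 120.0])]
S_AI, S_AQ, S_AU = stokes_three_angle(*A)
```

The recovered components match the known values, confirming that three equally spaced angles suffice to determine the linear Stokes vector.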
Lastly, the synthetic hazy image Î can be calculated using

$$\hat{I} = \frac{J A_\infty + S_{AI} A_\infty - J S_{AI}}{A_\infty} \tag{8}$$
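Eq. (8) is algebraically the atmospheric scattering model with the transmission taken as t = 1 - S_AI / A∞; a small numerical check (with illustrative toy values):

```python
def synthesize_hazy(J, S_AI, A_inf):
    # Eq. (8): re-synthesized hazy image from dehazed J and airlight S_AI
    return (J * A_inf + S_AI * A_inf - J * S_AI) / A_inf

# Equivalent scattering-model form: I = J*t + A_inf*(1-t), t = 1 - S_AI/A_inf
J, S_AI, A_inf = 0.5, 0.3, 0.9
t = 1.0 - S_AI / A_inf
I_eq8 = synthesize_hazy(J, S_AI, A_inf)
I_model = J * t + A_inf * (1.0 - t)
```

The two expressions agree term by term, which is why the synthetic hazy image stays consistent with the physical model supervising the network.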
Discussion. The polarization dehazing method has unique advantages in addressing a core difficulty of dehazing tasks: insufficient acquisition of scene information. This advantage is specifically reflected in the ability to calculate the original hazy image and the degree of polarization from the polarized images, and then estimate the physical model, which is very
effective in real-world dehazing. At present, although deep learning methods can
achieve good performance through style transfer [31] or layer disentanglement [32],
performance and robustness of the network on real data through the introduction of
samples of the discriminator. In order to train the network, we use the dehazed image J as an intermediate result of the generator, hoping that the discriminator will guide the generator to generate synthetic hazy images more similar to the original ones, and thus obtain a clearer dehazed image under the physical constraint of the
atmospheric scattering model. Furthermore, we try to enrich the input of the discriminator and add more training constraints, including frequency information.
low-frequency components of hazy images are richer than those of clear ones because of greater airlight proportions. This feature is used to optimize the design of our
It should be noted that the frequency decomposition method used in this module is consistent with PGAN-A, as shown in Fig. 4. Because the low-frequency sub-bands IL can be used as the airlight after constraints, their supervised learning can actually be
researchers simply use generated dehazed images and ground truths as pairs of learning samples input to the discriminator, with less exploration of the model's
distinguish between real and fake data, thereby more real and satisfactory dehazed
This subsection mainly discusses the loss function design of our network. Apart from adapting some common losses, including the pixel-level loss, SSIM loss and adversarial loss, to our framework, we also design a pseudo airlight coefficient
supervision loss, benefiting from the frequency-domain input and the application of the physical model. Since low-frequency components carry the main information of the image, designing a pseudo atmospheric scattering coefficient supervised loss based on low-frequency components can increase the constraints of network training and make
example. In our scheme, given an original hazy image I, the synthetic hazy image output by the generator is Î. Then the pixel-level loss function LP, in the form of L1 over N samples, can be written as

$$L_P = \sum_{i=1}^{N} \lVert I_i - \hat{I}_i \rVert_1 \tag{9}$$
brightness, contrast and structure. It accurately reflects the image quality perceived by the human visual system:

$$\mathrm{SSIM}(I, \hat{I}) = \frac{(2\mu_I \mu_{\hat{I}} + C_1)(2\sigma_{I\hat{I}} + C_2)}{(\mu_I^2 + \mu_{\hat{I}}^2 + C_1)(\sigma_I^2 + \sigma_{\hat{I}}^2 + C_2)} \tag{10}$$

where μ and σ² are the average value and the variance, respectively, and σ_IÎ is the covariance between I and Î.
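A single-window numpy sketch of Eq. (10); practical SSIM is computed over local Gaussian windows, and the constants C1 and C2 here are illustrative defaults.

```python
import numpy as np

def ssim_global(x, y, C1=1e-4, C2=9e-4):
    # Eq. (10) evaluated on the whole image as one window
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()  # covariance term sigma_xy
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))

x = np.random.rand(16, 16)
```

By construction the score equals 1 when the two images are identical and decreases as structure, luminance or contrast diverge.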
samples by minimizing the differences between generated and real samples, while maximizing the ability of the discriminator. It introduces the adversarial mechanism into the training process and helps the model learn more effective features. In our method, the hazy images combined with their high- and low-frequency sub-band samples are expressed as Ĩ = [I, IH, IL]; then the adversarial loss can be expressed as
3.3.4 Pseudo airlight coefficient supervision loss
The airlight can be estimated from the low-frequency sub-band, so the training supervision of the airlight coefficient is particularly important. We expect to penalize differences in airlight between the original and synthetic hazy images by designing the pseudo airlight coefficient supervision loss. The airlight estimated from the low-frequency sub-bands of the synthetic and original hazy images is recorded as Â and A, respectively. Then the pseudo airlight coefficient supervision loss on the
$$L_A = \sum_{i=1}^{N} \lVert \hat{A}_i - A_i \rVert \tag{13}$$
Fig. 6. Example images of our constructed dataset. We collect polarized images with polarization angles of 0°, 60°, and 120°, as well as the hazy image I. The obtained hazy images are divided into three concentration levels: dense, heavy, and thin. In addition, the dataset covers both horizontal and vertical directions, effectively solving the problem of the single perspective in existing datasets.
Finally, the loss function of the whole network consists of the pixel-level loss, SSIM loss, adversarial loss, and pseudo airlight coefficient supervision loss, and it can be expressed as

$$L = \lambda_P L_P + \lambda_{SSIM} L_{SSIM} + \lambda_{Adv} L_{Adv} + \lambda_A L_A$$
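Assuming a simple weighted sum (the weights below are those reported in Sec. 4.1), the total objective can be sketched as:

```python
import numpy as np

def total_loss(L_P, L_SSIM, L_Adv, L_A,
               w_P=1.0, w_SSIM=0.5, w_Adv=0.1, w_A=1.0):
    # Weighted sum of the four losses; weight values from Sec. 4.1
    return w_P * L_P + w_SSIM * L_SSIM + w_Adv * L_Adv + w_A * L_A

# e.g. the pixel-level term as a mean absolute error between the
# original and synthetic hazy images (toy arrays for illustration)
I = np.random.rand(8, 8)
I_hat = np.random.rand(8, 8)
L_P = np.abs(I - I_hat).mean()
```

In an actual training loop each term would be a differentiable tensor and the weighted sum would be backpropagated through the generator.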
4. Experimental Results
4.1 Dataset construction and parameter settings
Dataset. Most of the existing hazy image datasets are synthetic, and they cannot reflect the real-world haze distribution, especially when the wind is strong. There are also datasets that generate haze using a smoke generator. For a long time, the disclosure of datasets in the field of polarization dehazing has not attracted enough attention from the research
Fig. 7. The UAV for vertical direction data acquisition.
community, which has seriously restricted the popularization and application of
polarization methods.
In order to efficiently build polarization datasets, we constructed different polarization data collection platforms with polarization cameras and carried out long-term data collection. We have collected more than 30,000 hazy polarization images, and some of them have been released as open datasets, as shown in Fig. 6.
concentrations in the horizontal direction, and the haze concentration was graded based on the value of the no-reference image quality index FADE [26]. In addition to the horizontal
important, because they can evaluate the dehazing method for aerial work, so we also used 600 polarization images in the vertical direction, which were collected by our
Parameters. We set the four weight parameters λP, λSSIM, λAdv and λA to 1, 0.5, 0.1 and 1, respectively. The non-subsampled pyramid (NSP) was used in all frequency-domain decomposition modules, and the number of decomposition layers was set to 4. In terms
of the size of the input image, the original size captured by the polarization camera is 2480×1860×3. The polarization images and the original hazy image are downsampled to 320×240×3. The training of the
We chose dehazing algorithms based on priors and polarization, including DCP [2] and PBD [17]. We also chose five dehazing algorithms based on deep learning: YOLY [32], D4 [24], Dehamer [11], DehazeFormer [33] and DEA-Net [35]. Our method is denoted as PGAN. The experimental results of the two groups are
In the selection of scenes, we deliberately arranged many areas that are difficult for dehazing algorithms to handle, such as water surfaces, bright white buildings and white trucks, and some of these scenes were specifically shot under low-light conditions. This test data organization is significantly different from the homogeneous data organization in existing work. We hope that through this experimental data
comprehensively reflected.
Horizontal direction comparison. The results of DCP and PBD are very different in thin haze scenes or scenes containing white objects, such as scenes 1 and 2. This can be attributed to the fact that DCP is prone to estimating a higher airlight radiance from the infinite distance. In the dense haze condition, such as scenes 5 and 6, their results
Fig. 8. Qualitative experimental results of the data in horizontal direction. It can be observed that our
proposed PGAN demonstrates a better dehazing effect than other polarization-based and
deep-learning-based state-of-the-art methods. Especially in the dense haze scenes, PGAN presents a strong
dehazing ability.
It is obvious that PGAN has achieved excellent results in all horizontal direction images. Even in dark-light and dense haze conditions, such as scenes 4 to 6, the
model to recover the hazy image as clearly and reasonably as possible under the atmospheric scattering model. Other image noises are inevitably generated in the
correction factor of PBD, but the appearance of local color shift cannot be avoided.
Fig. 9. Qualitative experimental results of the data in vertical direction. It can be observed that our proposed PGAN demonstrates a better dehazing effect than other polarization-based and deep-learning-based state-of-the-art methods. Data-driven deep-learning-based methods lack physical model constraints and demonstrate poorer dehazing results in the absence of vertical-view training data. However, PGAN exhibits a robust dehazing capability in different views.
Deep-learning-based methods can better restore image details in thin haze scenes, but YOLY and D4 are worse than the more advanced DEA-Net and DehazeFormer in brightness recovery. In heavy and dense haze scenes, almost all deep-learning-based methods demonstrate a poor dehazing effect. Dehamer shows a heavier color shift in scene
images under the dense haze condition, PGAN can also enhance the visibility of targets under low-light conditions. Although there seems to be some haze residue and reduced saturation, it does not affect the overall visual impression.
Vertical direction comparison. The visibility of hazy images taken in the vertical direction is related to the height of the UAV. Most of the methods, such as PBD, DehazeFormer and PGAN, achieve good results in scenes 1 to 3, because these hazy images were collected on sunny days. But in scenes 4 to 6, only PBD and PGAN can keep high brightness, because there are many areas with very weak illumination or objects with weak reflections in these scenes. For example, the trees under the tall buildings are shielded from most of the light in scene 5, and the asphalt road reflects little light in scenes 4 and 6. Compared with all other methods, the experimental results of PGAN are clearer, and the target recovery in the low-light scenes is excellent; in particular, the details of the blue dome in scene 5 are well restored, and the trees and paths are also clearly visible. These results demonstrate that our method maintains excellent dehazing capability in a variety of scenarios.
dehazing methods. It should be noted that, compared with dehazing algorithms that can use synthetic data, polarization-based dehazing algorithms cannot directly use
Although some works have suggested collecting haze-free data at different times in
the ground truth obtained through this method is not strict, so these practices have not yet been widely recognized. Our selected no-reference image quality assessments
include the Fog Aware Density Evaluator (FADE) [26], the Blind Image Quality Index (BIQI) [28], the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) [29] and the Natural Image
Table 1
Quantitative experimental results of the data in horizontal direction. No-reference image quality assessments are calculated.

Methods Venue FADE↓ BIQI↓ BRISQUE↓ NIQE↓ US↑
AOD-Net [6] ICCV'17 3.06 66.94 24.49 9.16 0.18
FD-GAN [9] AAAI'20 1.66 71.41 26.06 6.07 0.11
SGDRL [34] NN'24 2.76 79.77 34.58 8.36 0.12
DEA-Net [35] TIP'24 2.52 83.31 37.46 9.28 0.08
PGAN Ours 0.43 50.01 16.19 4.73 0.43
Table 2
Quantitative experimental results of the data in vertical direction. No-reference image quality
assessments are calculated.
Methods FADE↓ BIQI↓ BRISQUE↓ NIQE↓ US↑
Hazy 1.98 75.63 34.88 7.00 /
DCP [2] 0.58 53.44 26.62 5.84 0.12
PBD [17] 1.18 52.14 29.37 5.71 0.10
YOLY [32] 0.47 42.02 38.16 5.26 0.23
D4 [24] 0.92 61.20 29.56 6.04 0.19
Quality Evaluator (NIQE) [30]. These indexes are based on natural scene statistics, and US is the percentage of votes in the user study. The mean score of each dehazing algorithm in the different groups is listed in Table 1 and Table 2. The best scores are marked in bold font.
evaluation metrics. Physical-model-based methods such as PBD and DCP achieved better FADE than deep-learning-based methods, indicating higher visibility in the dehazed images. This proves that physical constraints are a key factor in dehazing; therefore our PGAN can achieve the optimal FADE after introducing polarization
BRISQUE, with a score of 16.19, indicating that our method achieves minimal image distortion and the best human visual effect. The optimal NIQE shows that our method can preserve the natural attributes of the image (e.g. contrast, clarity) during the dehazing process.
While in the vertical direction, PGAN attained the best FADE, BRISQUE and NIQE. PGAN only achieves the second-best BIQI, demonstrating that it introduces some noise when generating dehazed images. This is an inevitable problem for generative methods, as reflected by the high BIQI value of D4. YOLY achieves the best BIQI through a layer disentanglement network that does not require training. The US scores reflect that PGAN is the most recognized among user groups. These results demonstrate that our method can achieve robust dehazing performance in natural
For the generator, we remove the airlight Stokes calculation module in PGAN-A and replace the input of PGAN-A with the original hazy image to estimate the airlight. This method is denoted as w/o SP. For the discriminator, we remove the frequency decomposition module and only use the original and synthetic hazy images as input
Fig. 10. Qualitative results of the ablation experiment. “w/o SP” means removing the Stokes calculation module of PGAN-A and replacing the input with the original hazy image. “w/o FS” represents removing the frequency decomposition module of the discriminator and the pseudo airlight coefficient supervision loss.
Table 3
Quantitative results of the ablation experiment. No-reference image quality assessments are calculated.

Methods FADE↓ BIQI↓ BRISQUE↓ NIQE↓ US↑
Hazy 6.68 90.49 35.87 10.77 /
w/o SP 4.59 64.13 23.30 8.62 0.12
w/o FS 1.78 55.43 14.54 4.77 0.41
PGAN 0.59 44.65
samples. Meanwhile, the pseudo airlight coefficient supervision loss is removed. This method is denoted as w/o FS.
As shown in Fig. 10, in the absence of the Stokes calculation module, the calculation accuracy of the airlight decreases after losing the polarization information. As a result, the dehazing performance of w/o SP is very limited. In fact, this is similar
visibility of the scene can be improved overall, the experimental results of w/o FS recover obviously less detail than PGAN. It can be observed that the details of trees and buildings marked by red boxes are blurred, and these details are restored in the
performance of the method, while the frequency domain decomposition module can
further optimize the overall performance of the network.
5. Conclusion
In this work, in order to solve the limitation of real-world labels and improve the
the self-supervised polarization image dehazing method via frequency domain generative adversarial networks, which is denoted as PGAN in this paper. The generator based on polarization reconstructs the synthetic hazy image using the Stokes parameters and self-supervised learning. The discriminator based on the frequency distribution is more powerful because of the frequency-separated samples and the pseudo airlight coefficient supervision loss. Benefiting from the introduction of polarization and frequency information, PGAN can more accurately estimate the airlight to obtain more detailed dehazed images even without real-world clear images. In the experiments, PGAN has achieved advanced performance on real-world datasets, and the ablation
Supervision.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal
relationships that could have appeared to influence the work reported in this paper.
Data availability
Data will be made available on request.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China
under Grant 62302142 and Grant 61876057; in part by the China Postdoctoral
Science Foundation under Grant 2022M720981; in part by the Anhui Province
Natural Science Foundation under Grant No. 2208085MF158; and in part by the Key
References
[1] X. Guo, Y. Yang, C. Wang and J. Ma, “Image dehazing via enhancement,
[2] K. He, J. Sun, and X. Tang, “Single image haze removal using dark channel prior,”
[3] G. Meng, Y. Wang, J. Duan, S. Xiang, and C. Pan, “Efficient image dehazing
[4] Q. Zhu, J. Mai, and L. Shao, “A fast single image haze removal algorithm using
color attenuation prior,” IEEE Trans. Image Process. 24, 3522–3533 (2015).
[5] Ren, Wenqi et al. “Single Image Dehazing via Multi-scale Convolutional Neural
Networks,” European Conference on Computer Vision (2016).
[6] B. Li, X. Peng, Z. Wang, J. Xu, and D. Feng, “AOD-Net: all-in-one dehazing
4770–4778.
[7] B. Li, Y. Gou, J. Z. Liu, H. Zhu, J. T. Zhou and X. Peng, "Zero-Shot Image
Dehazing," in IEEE Transactions on Image Processing, vol. 29, pp. 8457-8466, 2020,
doi: 10.1109/TIP.2020.3016134.
r
[8] S. Zhang, X. Zhang, S. Wan, W. Ren, L. Zhao and L. Shen, "Generative
er
Adversarial and Self-Supervised Dehazing Network," IEEE Trans. Ind. Informat., vol.
[10] H. Dong et al., “Multi-Scale Boosted Dehazing Network With Dense Feature
ot
[11] C. Guo, Q. Yan, S. Anwar, R. Cong, W. Ren and C. Li, “Image Dehazing
Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA,
[12] R. -Q. Wu, Z. -P. Duan, C. -L. Guo, Z. Chai and C. Li, "RIDCP: Revitalizing
ep
This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=5059199
28
ed
Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC,
iew
Canada, 2023, pp. 22282-22291.
[13] Liang, S. Li, D. Cheng, W. Wang, D. Li and J. Liang, “Image dehazing via
self-supervised depth guidance,” Pattern Recognition, vol. 158, pp. 111051, 2025.
[14] T. Jia, J. Li, L. Zhuo and G. Li, "Effective Meta-Attention Dehazing Networks
ev
for Vision-Based Outdoor Industrial Systems," IEEE Trans. Ind. Informat., vol. 18,
[15] Y. Cui, Q. Wang, C. Li, W. Ren and A. Knoll, “EENet: An effective and
r
efficient network for single image dehazing,” Pattern Recognition, vol. 158, pp.
111074, 2025.
er
[16] N. Jiang, K. Hu, T. Zhang, W. Chen, Y. Xu and T. Zhao, “Deep hybrid model for
pe
single image dehazing and detail refinement,” Pattern Recognition, vol. 136, pp.
109227, 2023.
non-specular objects using polarization difference and global scene feature,” Opt.
tn
[19] F. Huang, C. Ke, X. Wu, S. Wang, J. Wu, and X. Wang, “Polarization dehazing
rin
method based on spatial frequency division and fusion for a far-field and dense hazy
[20] Rui Sun, Tanbin Liao, Zhiguo Fan, Xudong Zhang, and Changxiang Wang,
ep
This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=5059199
29
ed
from the frequency domain for different concentrations of haze," Appl. Opt. 61,
iew
10362-10373 (2022).
for Single Image Haze Removal,” 2019 IEEE/CVF Conference on Computer Vision
and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 2019, pp.
ev
2014-2023.
[22] A. Dudhane and S. Murala, “CDNet: Single Image De-Hazing Using Unpaired
r
Vision (WACV), Waikoloa, HI, USA, 2019, pp. 1147-1155.
er
[23] W. Liu, X. Hou, J. Duan and G. Qiu, “End-to-End Single Image Fog Removal
Unpaired Image Dehazing via Density and Depth Decomposition," 2022 IEEE/CVF
Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA,
ot
assessment: from error visibility to structural similarity,” IEEE Trans. Image Process.
density and perceptual image defogging,” IEEE Trans. Image Process. 24, 3888–3901
(2015).
ep
Pr
This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=5059199
30
ed
[27] A. L. Da Cunha, J. Zhou, and M. N. Do, “The nonsubsampled contourlet
iew
transform: theory, design, and applications,” IEEE Trans. Image Process. 15, 3089–
3101 (2006).
image quality indices,” IEEE Signal Process. Lett. 17, 513–516 (2010).
ev
[29] A. Mittal, A. K. Moorthy, and A. C. Bovik, “No-reference image quality
assessment in the spatial domain,” IEEE Trans. Image Process. 21, 4695–4708
(2012).
r
[30] A. Mittal, R. Soundararajan, and A. C. Bovik, “Making a “completely blind”
er
image quality analyzer,” IEEE Signal Process. Lett. 20, 209–212 (2012).
[32] B. Li, Y. Gou, S. Gu, J. Z. Liu, J. T. Zhou and X. Peng, “You only look yourself:
[33] Y. Song, Z. He, H. Qian and X. Du, "Vision Transformers for Single Image
tn
Dehazing," in IEEE Transactions on Image Processing, vol. 32, pp. 1927-1941, 2023,
doi: 10.1109/TIP.2023.3256763.
rin
learning for single image dehazing,” in Neural Networks, vol. 172, pp. 106107, 2024.
[35] Z. Chen, Z. He and Z. -M. Lu, "DEA-Net: Single Image Dehazing Based on
ep
This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=5059199
31
ed
Rui Sun received the B.S. degree from Central South University, China, in 1998, the M.S. degree from Harbin Engineering University, China, in 2000, and the Ph.D. degree
USA from 2010 to 2011. He is currently a professor in the School of Computer Science and Information Engineering, Hefei University of Technology, China. His research interests include
Long Chen received the B.S. degree from Hefei University of Technology, China, in 2022. He is currently pursuing the M.S. degree at Hefei University of Technology. His research interests include machine learning and computer vision, especially cross-modal person re-identification and corruption robustness.
Tanbin Liao received the B.S. degree from Hefei University of Technology, China
Zhiguo Fan was born in Anhui, China, in 1978. He received the M.S. and Ph.D. degrees from the Hefei University of Technology in 2007 and 2011, respectively. He holds invention patents, has published more than 40 academic articles, and has authored one academic monograph, Bionic Polarized Light Navigation Method (Science Press, 2014).