Fake Image Detection
Abstract—In this paper, we investigate whether robust hashing has a possibility to robustly detect fake-images even when multiple manipulation techniques such as JPEG compression are applied to images.

CycleGAN [10] and StarGAN [11] are typical image synthesis techniques with GANs. CycleGAN is a GAN that performs one-to-one transformations, e.g., changing apples to oranges.
… a specific manipulation technique to detect unique features caused by the manipulation technique.

There are several detection methods with deep learning for detecting fake images generated with an image editing tool such as Photoshop. Some of them focus on detecting the boundary between tampered regions and an original image [19], [20], [21]. Besides, a detection method [22] enables us to train a model without tampered images.

Most detection methods with deep learning have been proposed to detect fake images generated by using GANs. An image classifier trained only with ProGAN was shown to be effective in detecting images generated by other GAN models [23]. Various studies have focused on detecting checkerboard artifacts caused in two processes: forward propagation of upsampling layers and backpropagation of convolutional layers [24]. In that work, the spectrum of images is used as an input in order to capture the checkerboard artifacts.
To detect fake videos, called DeepFakes, a number of detection methods have been investigated so far. Some methods attempt to detect failures in the generation of fake videos, in terms of poorly generated eyes and teeth [25], the frequency of blinking [26], and the correctness of facial landmarks [27] or head poses [28]. However, all of these methods have been pointed out to have problems in robustness against differences between training datasets and test data [1]. In addition, the conventional methods have not considered robustness against combinations of manipulations, such as the combination of resizing and DeepFake.

III. PROPOSED METHOD WITH ROBUST HASHING

A. Overview

Figure 2 shows an overview of the proposed method. In the framework, a robust hash value is computed from each reference image by using a robust hash method and stored in a database. Similarly to reference images, a robust hash value is computed from a query image by using the same hash method. The hash value of the query is then compared with those stored in the database. Finally, the query image is judged to be real or fake in accordance with the distance between the two hash values.

Fig. 2. Overview of proposed method
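To make the workflow in Fig. 2 concrete, the following is a minimal sketch (not the authors' implementation) of the enrollment and query steps, where robust_hash() is a hypothetical placeholder for the hashing method of Li et al. [29] described in Section III-B.

```python
import numpy as np

def robust_hash(image: np.ndarray) -> np.ndarray:
    """Hypothetical placeholder for the 120-bit robust hash of Li et al. [29]."""
    raise NotImplementedError("replace with the actual robust hashing method")

def build_database(reference_images: dict) -> dict:
    """Enrollment: hash every reference image and store the hashes in a database."""
    return {name: robust_hash(img) for name, img in reference_images.items()}

def hash_query(query_image: np.ndarray) -> np.ndarray:
    """Query side: the same hash method is applied to the query image."""
    return robust_hash(query_image)
```

The comparison of the query hash against the stored hashes is given by Eqs. (1)–(3) below.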
B. Fake detection with Robust Hashing

Various robust hashing methods have been proposed to retrieve images similar to a query one [29], [30]. In this paper, we apply the robust hashing method proposed by Li et al. [29] to fake-image detection. This robust hashing enables us to robustly retrieve images, and it has the following properties (a short preprocessing sketch follows the list):
• Resizing images to 128×128 pixels prior to feature extraction.
• Performing 5×5 Gaussian low-pass filtering with a standard deviation of 1.
• Using rich features extracted from spatial and chromatic characteristics.
• Outputting a bit string with a length of 120 bits as a hash value.
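As an illustration of the first two properties in the list above, here is a minimal preprocessing sketch using OpenCV. The 128×128 size, the 5×5 kernel, and the standard deviation of 1 come from the list; the interpolation mode is an assumption, and the actual feature extraction of Li et al. [29] is not reproduced here.

```python
import cv2
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Resize and low-pass filter an image before feature extraction.

    Follows the first two listed properties: 128x128 resizing and 5x5 Gaussian
    filtering with a standard deviation of 1. Bilinear interpolation is an
    assumption, not a detail taken from the paper.
    """
    resized = cv2.resize(image, (128, 128), interpolation=cv2.INTER_LINEAR)
    blurred = cv2.GaussianBlur(resized, ksize=(5, 5), sigmaX=1.0, sigmaY=1.0)
    return blurred
```

The remaining two properties (the spatial/chromatic features and the 120-bit output) depend on the details of [29] and are not sketched here.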
In the method, the similarity is evaluated in accordance with the Hamming distance between the hash string of a query image and that of each image in a database. Let vectors $u = \{u_1, u_2, \ldots, u_n\}$ and $q = \{q_1, q_2, \ldots, q_n\}$, $u_i, q_i \in \{0, 1\}$, be the hash strings of reference image U and query image Q, respectively. The Hamming distance $d_H(u, q)$ between U and Q is given by

  $d_H(u, q) \triangleq \sum_{i=1}^{n} \delta(u_i, q_i)$    (1)

where

  $\delta(u_i, q_i) = \begin{cases} 0, & u_i = q_i \\ 1, & u_i \neq q_i \end{cases}$    (2)
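A minimal sketch of Eqs. (1) and (2) for hashes stored as NumPy 0/1 arrays follows; the array representation is an assumption about how the bit strings are held in memory (n = 120 for the hash used here).

```python
import numpy as np

def hamming_distance(u: np.ndarray, q: np.ndarray) -> int:
    """Eq. (1): number of positions at which the two bit strings differ.

    u and q are assumed to be 1-D arrays of 0/1 values of equal length
    (n = 120 for the hash used in this paper).
    """
    if u.shape != q.shape:
        raise ValueError("hash strings must have the same length")
    # Eq. (2): delta(u_i, q_i) is 1 where the bits differ, 0 otherwise.
    return int(np.sum(u != q))
```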
To apply this similarity to fake-image detection, we introduce a threshold d as follows:

  $Q \in U_0, \ \text{if } \min_{u \neq q,\, u \in U} d_H(u, q) < d$
  $Q \notin U_0, \ \text{if } \min_{u \neq q,\, u \in U} d_H(u, q) \geq d$    (3)

where $U$ is a set of reference images and $U_0$ is the set of images generated with image manipulations from $U$, which does not include fake images. According to Eq. (3), Q is judged as to whether it is a fake image or not.
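Eq. (3) can be sketched as follows, assuming the reference hashes are kept in a Python dict as in the overview sketch; the dict-based database is an illustrative choice, and d = 3 is the threshold used later in the experiments.

```python
import numpy as np

def is_fake(query_hash: np.ndarray, database: dict, d: int = 3) -> bool:
    """Decision rule of Eq. (3): the query is judged real (Q in U_0) when its
    nearest reference hash lies within Hamming distance d, and fake otherwise.

    The dict-based database and the default d = 3 (the value selected in the
    experiments) are illustrative assumptions.
    """
    min_dist = min(int(np.sum(u != query_hash)) for u in database.values())
    return min_dist >= d  # True -> Q is judged to be a fake image
```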
IV. EXPERIMENT RESULTS

The proposed fake-image detection with robust hashing was experimentally evaluated in terms of accuracy and robustness against image manipulations.

A. Experiment setup

In the experiment, four fake-image datasets were used: Image Manipulation Dataset [31], UADFV [26], CycleGAN [10], and StarGAN [11]. The details of the datasets are shown in Table I (see Figs. 1 and 3). The datasets consist of pairs of a fake image and the original one. JPEG compression with a quantization parameter of QJ = 80 was applied to all query images. d = 3 was selected as the threshold d in accordance with the EER (equal error rate) performance.
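The query preparation and threshold selection described above could look like the sketch below: Pillow is used to re-encode query images with JPEG quality 80 (mapped here to the QJ = 80 of the paper), and d is chosen where the false rejection and false acceptance rates are closest (the EER). The function names, the use of Pillow, and the exhaustive sweep over 0–120 are assumptions for illustration, not details taken from the paper.

```python
import io
import numpy as np
from PIL import Image

def jpeg_compress(image: Image.Image, quality: int = 80) -> Image.Image:
    """Re-encode an image with JPEG (quality 80, corresponding to QJ = 80)."""
    buf = io.BytesIO()
    image.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf)

def select_threshold_by_eer(genuine_dists: np.ndarray, impostor_dists: np.ndarray) -> int:
    """Pick the Hamming-distance threshold where FRR and FAR are closest (EER).

    genuine_dists: min distances of manipulated-but-real queries (expected small).
    impostor_dists: min distances of fake queries (expected large).
    """
    best_d, best_gap = 0, float("inf")
    for d in range(0, 121):  # 120-bit hash -> distances in 0..120
        frr = np.mean(genuine_dists >= d)   # real queries wrongly judged fake
        far = np.mean(impostor_dists < d)   # fake queries wrongly judged real
        if abs(frr - far) < best_gap:
            best_d, best_gap = d, abs(frr - far)
    return best_d
```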
As one of the state-of-the-art fake detection methods, Wang's method [23] was compared with the proposed one. Wang's method was proposed for detecting images generated by using CNNs, including various GAN models, where a classifier is trained by using ProGAN.
TABLE I
DATASETS

Dataset                          | Fake-image generation | No. of real images | No. of fake images
Image Manipulation Dataset [31]  | copy-move             | 48                 | 48
UADFV [26]                       | face swap             | 49                 | 49
CycleGAN [10]                    | GAN                   | 1320               | 1320
StarGAN [11]                     | GAN                   | 1999               | 1999

Fig. 3. Example of datasets

TABLE II
COMPARISON WITH WANG'S METHOD

Dataset                       | Wang's method [23]   | Proposed
                              | AP      Acc (fake)   | AP      Acc (fake)
Image Manipulation Dataset    | 0.5185  0.0000       | 0.9760  0.8750
UADFV                         | 0.5707  0.0000       | 0.8801  0.7083
CycleGAN                      | 0.9768  0.5939       | 1.0000  1.0000
StarGAN                       | 0.9594  0.5918       | 1.0000  1.0000

… consist of images generated with GANs. In addition, although UADFV consists of images generated by using DeepFake, they have the influence of video compression.
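For completeness, the following is a hedged sketch of how the AP and Acc (fake) values in Table II could be computed from the minimum Hamming distances, using the minimum distance as the fake score (a larger distance meaning "more likely fake"). Treating the minimum distance as the detection score is an assumption made for illustration; the paper only reports the resulting numbers.

```python
import numpy as np
from sklearn.metrics import average_precision_score

def evaluate(real_min_dists: np.ndarray, fake_min_dists: np.ndarray, d: int = 3):
    """Compute AP and accuracy on fake images from minimum Hamming distances.

    real_min_dists: min distances of manipulated-but-real queries (label 0).
    fake_min_dists: min distances of fake queries (label 1).
    """
    scores = np.concatenate([real_min_dists, fake_min_dists])
    labels = np.concatenate([np.zeros(len(real_min_dists)), np.ones(len(fake_min_dists))])
    ap = average_precision_score(labels, scores)
    acc_fake = float(np.mean(fake_min_dists >= d))  # fakes correctly judged fake
    return ap, acc_fake
```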
TABLE III
COMPARISON WITH WANG'S METHOD UNDER ADDITIONAL MANIPULATION (DATASET: CYCLEGAN)
REFERENCES

[1] L. Verdoliva, "Media forensics and deepfakes: An overview," IEEE Journal of Selected Topics in Signal Processing, vol. 14, no. 5, pp. 910–932, 2020.
[2] Y. Sugawara, S. Shiota, and H. Kiya, "Super-resolution using convolutional neural networks without any checkerboard artifacts," in Proc. of IEEE International Conference on Image Processing, 2018, pp. 66–70.
[3] Y. Sugawara, S. Shiota, and H. Kiya, "Checkerboard artifacts free convolutional neural networks," APSIPA Transactions on Signal and Information Processing, vol. 8, p. e9, 2019.
[4] Y. Kinoshita and H. Kiya, "Fixed smooth convolutional layer for avoiding checkerboard artifacts in cnns," in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, 2020, pp. 3712–3716.
[5] T. Osakabe, M. Tanaka, Y. Kinoshita, and H. Kiya, "Cyclegan without checkerboard artifacts for counter-forensics of fake-image detection," arXiv preprint arXiv:2012.00287, 2020. [Online]. Available: https://arxiv.org/abs/2012.00287
[6] T. Chuman, K. Iida, W. Sirichotedumrong, and H. Kiya, "Image manipulation specifications on social networking services for encryption-then-compression systems," IEICE Transactions on Information and Systems, vol. E102.D, no. 1, pp. 11–18, 2019.
[7] T. Chuman, K. Kurihara, and H. Kiya, "Security evaluation for block scrambling-based etc systems against extended jigsaw puzzle solver attacks," in Proc. of IEEE International Conference on Multimedia and Expo (ICME), 2017, pp. 229–234.
[8] W. Sirichotedumrong and H. Kiya, "Grayscale-based block scrambling image encryption using ycbcr color space for encryption-then-compression systems," APSIPA Transactions on Signal and Information Processing, vol. 8, p. e7, 2019.
[9] T. Chuman, W. Sirichotedumrong, and H. Kiya, "Encryption-then-compression systems using grayscale-based image encryption for jpeg images," IEEE Transactions on Information Forensics and Security, vol. 14, no. 6, pp. 1515–1525, 2019.
[10] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proc. of IEEE International Conference on Computer Vision, Oct 2017.
[11] Y. Choi, M. Choi, M. Kim, J.-W. Ha, S. Kim, and J. Choo, "Stargan: Unified generative adversarial networks for multi-domain image-to-image translation," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, June 2018.
[12] J. Thies, M. Zollhofer, M. Stamminger, C. Theobalt, and M. Niessner, "Face2face: Real-time face capture and reenactment of rgb videos," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, June 2016.
[13] Y. Nirkin, I. Masi, A. Tran Tuan, T. Hassner, and G. Medioni, "On face segmentation, face swapping, and face perception," in Proc. of IEEE International Conference on Automatic Face Gesture Recognition, 2018, pp. 98–105.
[14] A. T. S. Ho, X. Zhu, J. Shen, and P. Marziliano, "Fragile watermarking based on encoding of the zeroes of the z-transform," IEEE Transactions on Information Forensics and Security, vol. 3, no. 3, pp. 567–569, 2008.
[15] G. Zhenzhen, N. Shaozhang, and H. Hongli, "Tamper detection method for clipped double jpeg compression image," in Proc. of International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015, pp. 185–188.
[16] T. Bianchi and A. Piva, "Detection of nonaligned double jpeg compression based on integer periodicity maps," IEEE Transactions on Information Forensics and Security, vol. 7, no. 2, pp. 842–848, 2012.
[17] M. Chen, J. Fridrich, M. Goljan, and J. Lukas, "Determining image origin and integrity using sensor noise," IEEE Transactions on Information Forensics and Security, vol. 3, no. 1, pp. 74–90, 2008.
[18] G. Chierchia, G. Poggi, C. Sansone, and L. Verdoliva, "A bayesian-mrf approach for prnu-based image forgery detection," IEEE Transactions on Information Forensics and Security, vol. 9, no. 4, pp. 554–567, 2014.
[19] Y. Rao and J. Ni, "A deep learning approach to detection of splicing and copy-move forgeries in images," in Proc. of IEEE International Workshop on Information Forensics and Security, 2016, pp. 1–6.
[20] J. H. Bappy, A. K. Roy-Chowdhury, J. Bunk, L. Nataraj, and B. S. Manjunath, "Exploiting spatial structure for localizing manipulated image regions," in Proc. of IEEE International Conference on Computer Vision, Oct 2017.
[21] P. Zhou, X. Han, V. I. Morariu, and L. S. Davis, "Learning rich features for image manipulation detection," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, June 2018.
[22] M. Huh, A. Liu, A. Owens, and A. A. Efros, "Fighting fake news: Image splice detection via learned self-consistency," in Proc. of European Conference on Computer Vision, September 2018.
[23] S.-Y. Wang, O. Wang, R. Zhang, A. Owens, and A. A. Efros, "Cnn-generated images are surprisingly easy to spot... for now," in Proc. of IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2020.
[24] X. Zhang, S. Karaman, and S. Chang, "Detecting and simulating artifacts in gan fake images," in Proc. of IEEE International Workshop on Information Forensics and Security, 2019, pp. 1–6.
[25] F. Matern, C. Riess, and M. Stamminger, "Exploiting visual artifacts to expose deepfakes and face manipulations," in Proc. of IEEE Winter Applications of Computer Vision Workshops, 2019, pp. 83–92.
[26] Y. Li, M. Chang, and S. Lyu, "In ictu oculi: Exposing ai created fake videos by detecting eye blinking," in Proc. of IEEE International Workshop on Information Forensics and Security, 2018, pp. 1–7.
[27] X. Yang, Y. Li, H. Qi, and S. Lyu, "Exposing gan-synthesized faces using landmark locations," in Proc. of ACM Workshop on Information Hiding and Multimedia Security, 2019, pp. 113–118.
[28] X. Yang, Y. Li, and S. Lyu, "Exposing deep fakes using inconsistent head poses," in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, 2019, pp. 8261–8265.
[29] Y. N. Li, P. Wang, and Y. T. Su, "Robust image hashing based on selective quaternion invariance," IEEE Signal Processing Letters, vol. 22, no. 12, pp. 2396–2400, 2015.
[30] K. Iida and H. Kiya, "Robust image identification with dc coefficients for double-compressed jpeg images," IEICE Transactions on Information and Systems, vol. E102.D, no. 1, pp. 2–10, 2019.
[31] "Image manipulation dataset," https://www5.cs.fau.de/research/data/image-manipulation/.