A Lightweight and Secure-Enhanced Strong PUF Design On FPGA: Letter
A Lightweight and Secure-Enhanced Strong PUF Design On FPGA: Letter
24, 1–6
LETTER
A lightweight and secure-enhanced Strong PUF design on FPGA
Shen Hou1,2a), Yang Guo1, Shaoqing Li1, Ding Deng1, and Yan Lei3
Abstract Physical unclonable function (PUF) is a reliable physical se-        RO (ring oscillator) PUF is also a Weak PUF. It can
curity primitive. The Weak PUF and Strong PUF are two well-known PUF          generate responses by comparing the oscillation frequen-
topologies. Strong PUF can be used to authenticate and protect intellectual   cies of two identical RO circuits [7, 8]. The Strong PUFs
property on FPGA chips. Classic PUF designs, like arbiter PUF, are hard
                                                                              have exponential number of CRPs and are suitable for
to implement on FPGA and severely threatened by the machine learning
based modeling attacks. In this work, we propose a new Strong PUF             authentication [9, 10]. A typical kind of Strong PUF is
on FPGA by combining Weak PUF with obfuscation logic. Experiment              the arbiter PUF [11, 12]. It contains a multi-level multi-
results on a 28 nm FPGA show that the resistance to modeling attack is        plexer chain which has two identical paths. It can output
good and the hardware overhead is small.                                      responses by an arbiter behind the chain to announce which
Keywords: physical unclonable function, lightweight, FPGA, feedback
                                                                              path is faster. Due to these features, PUF can provide
shift register, modeling attack
Classification: Integrated circuits                                            anti-tamper solutions to protect IP and sensitive data in
                                                                              FPGAs [13, 14].
                                                                                   However, PUF designs on FPGA are few because
1. Introduction                                                               of the following reasons. Highly regular architecture of
                                                                              FPGA and the automated design method make it very
In recent years, big data, artificial intelligence, and cloud                  difficult to ensure identical circuit implementation on dif-
technologies have developed rapidly. As a hardware re-                        ferent chips. [15]. For delay-based PUFs, automatic routing
configurable architecture, FPGA has strong computing                           and optimizing causes a wire length difference between
power and sufficient flexibility. As a kind of accelerator                       original design and actual circuit. Furthermore, the security
that has been paid increased attention in the field of deep                    of PUFs has become a serious issue. The machine learning
learning, FPGA has become a new research and application                      based modeling attacks, which are non-invasive attack
hotspot. Widespread use of FPGAs brings new security                          method, can successfully break various Strong PUFs [16].
challenges, such as overbuilding, tampering, cloning, and                          To address some of the issues outlined above, a new
reverse engineering [1].                                                      lightweight FPGA-based Strong PUF is proposed. The
     Physical unclonable function (PUF) can extract some                      research contributions of this paper are as follows: 1) A
random deviations in the manufacturing process to make                        new compact FPGA-based time-delay Weak PUF which
a unique “fingerprint” of the circuit, so as to accurately                     can generate 2-bit response in 1 slice is implemented on
identify each circuit and prevent the circuit and chip from                   28 nm FPGA. 2) A lightweight Strong PUF is construct by
being over-manufactured or tampered. PUF generates a                          using some structural features of FPGA. 3) The proposed
corresponding response only when a special challenge                          Strong PUF has small hardware overhead and good resist-
is given. It is called “Challenge-Response” mechanism.                        ance to machine learning based attacks.
According to the relationship between the number of CRPs                           The rest of the paper is organized as follows. Section 2
and the size of physical entities, PUF can be defined as                       reviews related research work on FPGA PUF and machine
Weak PUF and Strong PUF [2]. PUFs with limited number                         learning attack. Section 3 details the proposed Strong PUF
of CRPs, known as Weak PUFs, are commonly used for                            design. Section 4 gives the experiments results and per-
key generation in cryptographic functions [3, 4]. Memory-                     formance analysis. Final is the conclusion section.
based PUF, like SRAM PUF [5], butterfly PUF [6] is one
kind of Weak PUF. It uses asymmetry caused by manu-                           2. Related work
facturing process deviation between the cross-coupling
gate devices in most memory cells to produce responses.                       2.1 PUF designs on FPGA
                                                                              Anderson claimed to implement the PUF structure on the
    1
     School of Computer, National University of Defense                       FPGA for the first time [17]. His design refers to the basic
    Technology, Changsha 410073, China                                        idea of the delay-based PUF, and takes advantage of
    2
     Department of Basic Courses, Information Engineering
                                                                              SLICEM. It is implemented on Xilinx Virtex-5 FPGA
    University, Luoyang 471003, China
    3                                                                         board and cost 1 SLICEM for each response bit. The FPGA
     Department of Software Engineering, Chongqing University,
    Chongqing 404100, China                                                   identification generator proposed by Gu et al. in [18], [19]
   a) houshen@outlook.my                                                      uses the flip-flop element in slice as the delay path and
DOI: 10.1587/elex.16.20190695
                                                                              the cross-coupled NAND gates as an arbiter. Due to the
Received November 17, 2019                                                    deviation of the manufacturing process, random responses
Accepted November 20, 2019
Publicized December 2, 2019
                                                                              can be generated. They implemented and analyzed the
Copyedited December 25, 2019                                                  performance on the Spartan-6 series FPGA evaluation
                                                                                                                                                  1
                     Copyright © 2019 The Institute of Electronics, Information and Communication Engineers
                                                                                              IEICE Electronics Express, Vol.16, No.24, 1–6
board. The results show that the design has better device         c½0c½n  1 and 1-bit response r is shown in Fig. 2. The
uniqueness and reliability, and only one slice is needed to       same input is connected to the upper and lower paths, and as
generate a single ID on the 6 series FPGA. The hardware           the control signals of each stage, challenge determines the
overhead is smaller than Anderson’s design, and according         two multiplexers status. The process variations result in a
to new architecture of the 7 series FPGA, the overhead can        slight transition time difference of two paths. A latch behind
be further reduced.                                               the chain acts as an arbiter to judge which path is faster.
                                                                  If upper signal arrives earlier, then r is 1, or else r is 0.
                                                                                                                                         2
                                                                                           IEICE Electronics Express, Vol.16, No.24, 1–6
paths is slightly different, and the time of Q0 and Q1           SLICEM can also be configured as an SRL with a max-
arriving at two NAND is different too. Thus, the output          imum shifting depth of 16 or 32 bits. When configured as
values of the cross-coupled NAND gate Z0 and Z1 are set         a 32-bit SRL as shown in Fig. 5, the shift depth can be
to “01” or “10”. In such a manner, the two states output        dynamically adjusted from 1 to 32 bits by five input
logic 0 and logic 1 respectively, controlled by a multiplexer   A4∼A0 of the LUT.
to generate 1-bit response. According to the internal struc-
ture of slice, each NAND can be generated by one LUT.
The multiplexer and flip-flops can directly use the internal
logic elements of slice. Therefore, the 1-bit response can be
implemented by one single slice.
                                                                                                                                      3
                                                                                             IEICE Electronics Express, Vol.16, No.24, 1–6
    The chip’s response information is tested and enrolled      4.3 Performance analysis
in manufacturer’s security database before the chip is          1. Uniformity
shipped from factory. Then the customers can configure a              Uniformity characterizes the distribution of 0 and 1 in
WPUF in their design as an ID or key generator, or con-         the PUF response. As the main reference for PUF perform-
figure a WPUF-based “pseudo” Strong PUF (abbreviated             ance, the value of uniformity reflects the randomness of
as p-SPUF) as a CRP provider. Due to the relatively high        the PUF response. The ideal value for uniformity is 50%,
cost of the FPGA, although the CRP space is not big             meaning that the probability of 0 and 1 in the PUF response
enough as classic Strong PUF solutions like arbiter PUF,        is supposed to be identical.
the circuit is simple and the hardware overhead is very              We separately calculate the uniformity of the front-end
small. It is a good and low-cost solution for the security of   WPUF and the overall p-SPUF output response. A total of
FPGA.                                                           1536 bits responses are generated by 48 32-bit PUFs. The
                                                                number of 1s is 738, and the uniformity is 48%, which is
4. Experimental result                                          close to the ideal value. After loading 4 32-bit challenges
                                                                into all the 48 p-SPUF instances, 192 32-bit responses are
4.1 FPGA implementation                                         generated, and we get 6144 bits response. The number of
The 2-bit response generating circuit is implemented on an      1s is 3047, and the uniformity of Strong PUF is 49.6%.
Alinx development board (Zynq-7000 XC7Z020 FPGA).               2. Uniqueness
The final implementation on FPGA is shown in Fig. 7.                  A good PUF design should have good uniqueness.
We divide the FPGA side of the Zynq-7000 chip into 16           When different PUF instances are implemented on different
regions, implementing a 32-bit WPUF and a 32-bit p-SPUF         devices, different responses are produced for the same
in each region. Then we do the same work on 3 develop-          challenge. Uniqueness measures inter-chip variation by
ment boards and get 48 32-bit PUF instances in total. The       evaluating how the design differentiates d different devices.
challenges and responses are exchanged with PC through          It can be calculated with the inter-chip Hamming distance
the UART interface. The experimental data is recorded and       (HD) as shown in (3). Ri and Rj represent the n-bit
processed using Python.                                         responses generated from two chips i and j using the same
                                                                challenge C.
                                                                                  2     Xd1 Xd         HDðRi ; Rj Þ
                                                                Uniqueness ¼                                          100 ð3Þ
                                                                               dðd þ 1Þ    i¼1    j¼iþ1     n
                                                                Ideally, implemented on different devices, a PUF circuit is
                                                                expected to produce an average inter-chip HD close to 50%
                                                                when supplied with the same challenge, implying that half
                                                                the response bits are different between the two devices
                                                                even though the same challenge has been used.
                                                                     We test 48 PUF instances with four challenges and get
                                                                a total of 4512 HD values. The maximum inter-chip HD is
                                                                28, the minimum is 4, and the average is 16.19, which is
             Fig. 7. 32-bit p-SPUF implementation               close to the ideal value of 16. That is, the uniqueness value
                                                                of the PUF is 50.58%. The probability histogram of the
                                                                inter-chip HD is shown in Fig. 8.
4.2 Hardware overhead                                           3. Reliability
The core part of a 32-bit p-SPUF design uses 1536 of the             Ideally, a PUF design should have a fully reproducible
53200 LUTs as logic on the Zynq-7000 XC7Z020 FPGA               output response. We obtain a total of n-bit responses for the
(2.89%), 128 of the 17400 LUTs as memory (0.74%), and           s group. For each response, Ri is measured under normal
                                                                                                                                        4
                                                                                               IEICE Electronics Express, Vol.16, No.24, 1–6
                                                                                                                                            5
                                                                           IEICE Electronics Express, Vol.16, No.24, 1–6
     1587/elex.16.20190296).
 [6] X. Xu, et al.: “A highly reliable butterfly PUF in SRAM-based
     FPGAs,” IEICE Electron. Express 14 (2017) 20170551 (DOI:
     10.1587/elex.14.20170551).
 [7] C.-E. Yin, et al.: “Design and implementation of a group-based RO
     PUF,” IEEE/ACM DATE (2013) 416 (DOI: 10.7873/DATE.2013.
     094).
 [8] S. Pei, et al.: “A low-overhead RO PUF design for Xilinx FPGAs,”
     IEICE Electron. Express 15 (2018) 20180093 (DOI: 10.1587/elex.
     15.20180093).
 [9] G. E. Suh and S. Devadas: “Physical unclonable functions for
     device authentication and secret key generation,” ACM DAC
     (2007) 9 (DOI: 10.1145/1278480.1278484).
[10] W. Che, et al.: “PUF-based authentication,” IEEE ICCAD (2015)
     337 (DOI: 10.1109/ICCAD.2015.7372589).
[11] D. Lim, et al.: “Extracting secret keys from integrated circuits,”
     IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 13 (2005) 1200
     (DOI: 10.1109/TVLSI.2005.859470).
[12] S. Ghosh and R. Govindaraj: “A strong arbiter PUF using resistive
     RAM,” SAMOS (2016) 275 (DOI: 10.1109/SAMOS.2016.
     7818358).
[13] S. S. Kumar, et al.: “Extended abstract: The butterfly PUF pro-
     tecting IP on every FPGA,” IEEE HOST (2008) 67 (DOI: 10.1109/
     HST.2008.4559053).
[14] J. Guajardo, et al.: “FPGA intrinsic PUFs and their use for IP
     protection,” CHES (2007) 63 (DOI: 10.1007/978-3-540-74735-
     2_5).
[15] C. Gu and M. O’Neill: “Ultra-compact and robust FPGA-based
     PUF identification generator,” IEEE ISCAS (2015) 934 (DOI:
     10.1109/ISCAS.2015.7168788).
[16] U. Rührmair, et al.: “Modeling attacks on physical unclonable
     functions,” ACM CCS (2010) 237 (DOI: 10.1145/1866307.
     1866335).
[17] J. H. Anderson: “A PUF design for secure FPGA-based embedded
     systems,” ACM/IEEE ASP-DAC (2010) 1 (DOI: 10.1109/
     ASPDAC.2010.5419927).
[18] C. Gu, et al.: “A unique and robust single slice FPGA identification
     generator,” IEEE ISCAS (2014) 1223 (DOI: 10.1109/ISCAS.
     2014.6865362).
[19] C. Gu, et al.: “FPGA-based strong PUF with increased uniqueness
     and entropy properties,” IEEE ISCAS (2017) 1 (DOI: 10.1109/
     ISCAS.2017.8050838).
[20] Y. Gao, et al.: “Emerging physical unclonable functions with nano-
     technology,” IEEE Access 4 (2016) 61 (DOI: 10.1109/ACCESS.
     2015.2503432).
[21] Y. Hori, et al.: “Evaluation of physical unclonable functions for
     28-nm process field-programmable gate arrays,” J. Inf. Process. 22
     (2014) 344 (DOI: 10.2197/ipsjjip.22.344).
[22] M. Bhargava and K. Mai: “An efficient reliable PUF-based
     cryptographic key generator in 65 nm CMOS,” IEEE/ACM DATE
     (2014) 1 (DOI: 10.7873/DATE.2014.083).
[23] L. Santiago, et al.: “Realizing strong PUF from weak PUF via
     neural computing,” IEEE DFT (2017) 1 (DOI: 10.1109/DFT.2017.
     8244433).
[24] B. L. P. Gassend: “Physical random functions,” Master Disserta-
     tion, Massachusetts Institute of Technology (2003).
[25] M. Majzoobi, et al.: “Lightweight secure PUFs,” IEEE/ACM
     ICCAD (2008) 670 (DOI: 10.1109/ICCAD.2008.4681648).
[26] “Vivado Design Suite User Guide: Using Constraints (UG903)”
     (2018) 194.
[27] “Vivado Design Suite Tutorial: Using Constraints (UG945)”
     (2018) 47.
[28] B. Schneier: Applied Cryptography: Protocols, Algorithms, and
     Source Code in C (Wiley, New York, 1995) 2nd ed. 40.
[29] U. Rührmair, et al.: “PUF modeling attacks on simulated and
     silicon data,” IEEE Trans. Inf. Forensics Security 8 (2013) 1876
     (DOI: 10.1109/TIFS.2013.2279798).
[30] C. Gu, et al.: “Novel lightweight FF-APUF design for FPGA,”
     IEEE SOCC (2016) 75 (DOI: 10.1109/SOCC.2016.7905439).