0% found this document useful (0 votes)
28 views8 pages

FSMX

It's a research paper published in IEEE conference. It's about hardware security

Uploaded by

Abid Abdullah
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
28 views8 pages

FSMX

It's a research paper published in IEEE conference. It's about hardware security

Uploaded by

Abid Abdullah
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/361339703

FSMx: Finite State Machine Extraction from Flattened Netlist With Application
to Security

Conference Paper · April 2022


DOI: 10.1109/VTS52500.2021.9794151

CITATIONS READS
6 76

4 authors, including:

Rasheed Kibria Farimah Farahmandi


University of Florida University of Florida
5 PUBLICATIONS 9 CITATIONS 192 PUBLICATIONS 1,144 CITATIONS

SEE PROFILE SEE PROFILE

All content following this page was uploaded by Rasheed Kibria on 19 March 2023.

The user has requested enhancement of the downloaded file.


FSMx: Finite State Machine Extraction from
Flattened Netlist With Application to Security
Rasheed Kibria, Nusrat Farzana, Farimah Farahmandi and Mark Tehranipoor
Department of Electrical and Computer Engineering, University of Florida, Gainesville, Florida
{rasheed.kibria, ndipu}@ufl.edu, {farimah, tehranipoor}@ece.ufl.edu
2022 IEEE 40th VLSI Test Symposium (VTS) | 978-1-6654-1060-1/22/$31.00 ©2022 IEEE | DOI: 10.1109/VTS52500.2021.9794151

Abstract—A number of security vulnerability assessments structures at the gate-level [9]–[11]. Hence, in the pre-silicon
require accurate and fast extraction of the finite state machines design stages, the overall system security can be enhanced by
(FSMs) in the circuit. FSM should be accurately extracted for analyzing and mitigating fault injection and Trojan insertion-
watermark insertion, FSM-based logic locking, fault injection
assessment of control paths in a system-on-chip (SoC), infor- based vulnerabilities associated with the extracted FSMs [8].
mation leakage assessment, and reverse engineering at gate- In order to protect against IP piracy, some FSM-based
level. Unfortunately, as of today, there are no good solutions watermarking approaches embed the authorship information
available that can provide very fast and accurate extraction of in the states or transitions which require FSM extraction
FSMs from the flattened netlist to perform effective security [12]. Other FSM-based watermarking approaches implicitly
assessment. The difficulty of developing such a solution lies in
precisely identifying FSM state flip-flops present in a netlist that manipulate the State Transition Graph (STG) of the FSM to
contains numerous circuit flip-flops. In this paper, we propose implant the watermark as a property [12]. Besides, partitioned
to develop a framework called FSMx to extract FSMs from FSM-based sequential logic locking techniques have shown
highly unstructured synthesized designs. FSMx utilizes graph promise to be more resilient than combinational logic locking
theory to identify FSM state registers from other registers. to oracle-guided attacks while preventing overproduction [13]–
FSMx automatically extracts gate-level state transition graphs
(STGs) for each of the detected FSM state registers. Experimental [17]. In [18], the authors have shown that understanding a
results demonstrate that FSMx efficiently recovers FSMs from design’s extracted FSM is needed to alleviate information
synthesized netlists with various complexity and size in less than leakage vulnerability. Moreover, for safe design transformation
7 minutes in the worst case of NIST AES 128-bit design with and reducing the verification gap between specification and
12,976 gates on a personal desktop. implementation, equivalence checking should be performed
Keywords-FSM Extraction, Flattened Gate-Level Netlist Anal- between the extracted FSMs of higher abstraction levels and
ysis, Security Validation, FSM Automata Theory flattened gate-level netlist [19]. However, all these protection
and validation techniques to enhance the security of an SoC
can not be used in practice due to the limitation of the current
I. I NTRODUCTION gate-level FSM extraction tool. Therefore, an approach is
A modern System-on-Chip (SoC) is comprised of various highly required to extract all states and transactions of FSMs
functional blocks referred to as hardware intellectual property in an SoC, especially for the security-critical IPs.
(IP) cores that communicate and co-ordinate with each other to Although the extraction of design’s FSM is essential for
perform complex operations and deliver the expected service many security-critical applications, the methods and algo-
of the SoC [1]. The design houses heavily rely on third-party rithms for the extraction of FSM’s reported in the literature
vendors to design, integrate, and fabricate their designs to mostly focus on extracting FSMs from higher level of design
reduce the cost, and shrink the time-to-market (TMT). Thus, abstraction (e.g., RTL) [20], [21]. However, during synthesis,
the designer’s IPs become transparent to diverse potentially the FSM state registers get mixed up with non-FSM registers
untrusted stakeholders. Hence, IPs become inevitably suscepti- due to the design flattening and multiple optimizations (e.g.,
ble to IP piracy [2] and tampering [3]. Moreover, SoC security area, power, performance) performed by CAD tools. Thus,
may be endangered while in deployment. Attackers can exploit the FSM state registers, additional don’t care states, and don’t
design for test (DFT) structures, analyze timing, power, and care transitions added at the gate-level abstraction after logic
electromagnetic emission, and inject faults to gain access to synthesis become hard to identify from highly unstructured
the system or leak sensitive information [4]–[7]. synthesized designs. Moreover, identification of all the gates
Different security evaluation approaches have been pro- that exist in the feedback loop of the potential FSM state
posed over the years to assess and counter the aforementioned registers using [22] exhibits polynomial time complexity.
threats. These approaches have primarily focused on protecting Hence, it becomes practically impossible to retrieve all states
the control flow of the device upon which the whole system’s and transitions of the FSMs of a complex design. Another
functionality depends. Since control logic blocks are generally challenge of FSM extraction is distinguishing an FSM from
implemented using Finite State Machines (FSMs), these ap- an accumulator or similar arithmetic logic blocks with similar
proaches most often require the knowledge of FSM structures. feedback loop features. Some recent works have proposed
For example, the Computer-Aided Design (CAD) tools can techniques to extract FSMs of flattened gate-level netlists but
potentially introduce additional don’t care states to the design’s have several shortcomings (as discussed in Section II) in the
FSM during RTL to gate-level translation. Attackers can utilize case of control-intensive circuits [23]–[25].
the don’t care states to access the protected states of the design In this paper, we propose a framework called Finite State
utilizing fault injection techniques [8]. Moreover, untrusted Machine Extractor (FSMx) to reconstruct FSMs of security-
third-party IP (3PIP) vendors can insert sequential Trojan to critical flattened gate-level netlists systematically while tak-
the design’s FSM while integrating the design for test (DfT) ing a short computational time with 100% precision. FSMx
utilizes different industry-standard CAD tools based on our
978-1-6654-1060-1/22/$31.00 c 2022 IEEE proposed metrics to extract the FSM of designs with varying

Authorized licensed use limited to: University of Florida. Downloaded on July 22,2022 at 18:38:53 UTC from IEEE Xplore. Restrictions apply.
complexity and size. Specifically, our contributions in this focused on distinguishing potential FSM state registers from
paper are as follows- the non-FSM registers only and did not provide any technique
• Developing FSMx, an automatic framework for fast and to extract STGs of the detected FSMs.
accurate FSM extraction from highly unstructured flat- In [24], the authors proposed an FSM extraction technique
tened gate-level netlists; that considers two important properties of an FSM: self
• Developing and implementing Input Similarity Metric and cross flip-flop (FF) influence. Despite considering two
(ISM) and FSM Probability Metric (FPM) to eliminate important FSM properties, this approach cannot distinguish
non-FSM registers with 100% precision; control FSMs from counters reliably, although a technique
• Extracting human-readable STGs for each of the detected was proposed to remove counters from the set of FSM can-
FSMs present in the flattened gate-level netlists; didates after performing topological analysis. Moreover, this
• Demonstrating the effectiveness of FSMx on FSMs of method requires additional manual analysis on the topological
NIST benchmarks. analysis result after obtaining the final FSM candidates to
The rest of the paper is organized as follows. In Section II, we determine which FSMs are control FSMs. Furthermore, none
discuss the motivation of this work and provide an overview of these approaches can extract human-readable gate-level
of the terminologies used in the paper. Section III describes STGs separately for each detected FSM since those solely
our proposed FSMx framework in detail. We present the rely on SCCs for FSM detection. As a result, all the FSMs
experimental result with detailed timing complexity analysis existing in a design tend to be a part of the same SCC,
and effectiveness of FSMx in Section IV. Finally, Section V since mathematically, SCCs are regions having at least one
concludes the paper. loop. Finally, the loop detection technique for identifying
FSMs [22] exhibits polynomial time complexity and fails to
II. P RELIMINARIES AND M OTIVATION analyze even moderately sized netlists. The proposed FSMx
framework is geared towards addressing these shortcomings.
A. Basic Terminologies and Definitions Fast and accurate extraction of FSMs and automatic detection
Finite State Machine (FSM): An FSM is mathematically of control FSMs from flattened netlists, and generation of
defined as a 6-tuple (S, I, O, s0 , φ, λ), where S is a finite set human-readable STGs have motivated us to develop the FSMx
of states, I is a finite set of inputs, O is a finite set of outputs, framework. Currently, we focus on extracting FSMs from
s0 is the reset/initial state, φ : S × I → S is the next-state netlists synthesized using standard cell technology libraries.
function and λ : S × I → O is the output logic function [26]. However, we envision extending FSMx to provide support
An FSM is generally represented as a directed graph where for extracting FSMs from netlists synthesized using field-
each node represents a state s ∈ S and every edge represents programmable gate array (FPGA) libraries in the future.
a certain transition between two states, t = T (si , sj ) from the
current state si to its next state sj . This graph is widely known III. T HE P ROPOSED FSM X F RAMEWORK
as state transition graph (STG) [8]. If an FSM is implemented
As shown in Fig. 1, there are two main modules in the FSMx
as the control logic of a certain design, it is defined as a control
framework. The first module is Netlist Graph Analyzer, and
FSM. We provide this definition to differentiate control FSMs
the second one is Gate-Level Boolean Function Analyzer. The
from counters.
Reset State and Unreachable State: According to FSM fundamental purpose of Netlist Graph Analyzer is to identify
automata theory [27], the reset state of an FSM is the entrance the minimal fragments of the entire netlist that constitutes
state to the other states existing in an FSM. A particular state FSMs accurately. The Gate-Level Boolean Function Analyzer
of an FSM is said to be unreachable if it can never be reached analyzes these detected FSM candidates’ netlists to extract
starting from an initial state by any means. gate-level STGs for each candidate separately. After extracting
Flattened Gate-Level Netlist: A gate-level netlist repre- state transition logic from the FSM candidate netlists, they
sents a design through a complex interconnection of standard can easily be analyzed to construct the gate-level STGs. The
cells from a particular technology library. A flattened gate- technology library is also required to perform exhaustive gate-
level netlist does not possess any hierarchy. It is a highly level functional simulation of the state transition logic.
optimized netlist that seems to be a sea of complex logic gates
and exhibits a mixing of logic between hierarchical modules.

B. Related Work and Motivation


Efficient identification of FSM structures and distinguishing
them from non-FSM ones are quite difficult in a flattened gate-
level netlist due to multi-level optimizations during synthesis.
To the best of our knowledge, [23] was the first approach
towards extracting FSMs from a gate-level netlist and us-
ing topological analysis based on structural facts of FSMs.
Nevertheless, this method can not distinguish an FSM from
accumulators or similar arithmetic blocks. Moreover, it fails
to analyze netlists containing multiple FSMs and applies to Fig. 1: Overview of the FSMx framework.
small-sized netlists only. To overcome these drawbacks, a
strongly connected component (SCC) based FSM detection A. Netlist Graph Analyzer
scheme was proposed in [25], mounting on [28]. This method The main goal of this module is to perform graph algo-
only focuses on identifying and analyzing registers with pure rithmic analysis to identify potential FSM candidates’ netlists.
combinational feedback loops to detect FSMs but fails to iso- It can be decomposed into two sub-modules, as illustrated in
late control FSMs from counters since counters exhibit FSM- Fig. 1: Netlist-to-Graph Representation Converter and FSM
like structures. In addition, the proposed approach of [25] has Candidates Identifier. The Connected Components Report is

Authorized licensed use limited to: University of Florida. Downloaded on July 22,2022 at 18:38:53 UTC from IEEE Xplore. Restrictions apply.
an intermediate output of the first module and acts as the input cells after being synthesized since they represent data flow in
to the second module. a particular design [29]. Sometimes accumulator and counter
1) Netlist-to-Graph Representation Converter: This sub- register FFs also exhibit this property which is evident from
module converts a highly unstructured flattened netlist into a Fig. 3b. However, for control FSM FFs, D-inputs of those FFs
suitable graph representation for applying graph algorithms, as are driven by dissimilar standard cells as they form control
shown in Fig. 2. The first stage Intermediate Gate Level Netlist structures of a design [29] which is also apparent from Fig.
Processor generates an intermediate structured representation 3a. This property helps us to develop a metric, Input Similarity
of the unstructured input netlist, Structured Nets Report with Metric (ISM) that calculates the D-input similarity of the FFs
Instances, which acts as the input for the second stage, Graph present in a particular register. ISM is defined as follows:
Representation Generator. This intermediate representation of max(N1 , N2 , N3 , ....)
the netlist is later used for reconstructing the fragments of the ISM = × 100% (1)
input netlist that represent FSM structures. N
where, N is the size of a register, meaning it has total N FFs
and among them, N1 FFs have one type, N2 FFs have another
type of standard cells driving their D-inputs and so on. We
need to take the maximum of all these values since we must
consider the maximum similarity possible. The control FSM,
shown in Fig. 3a, has ISM of 50% since a FSM FF is driven
Fig. 2: Netlist-to-Graph Representation Converter framework. by a 2-input OR gate and a 3-input OR gate drives the other,
First, input and output pins of all the cells present in the making max(N1 , N2 ) = 1 and N = 2 (since the control FSM
technology library are identified. The intermediate represen- has 2 FFs). However, for the accumulator shown in Fig. 3b,
tation of the netlist contains the standard cell names, cell in- ISM is 100% since four 2-input XOR gates present in the adder
stance names, names of the pins, types of the pins, and the wire block drive the four FFs of the accumulator register, meaning
names connected to the pins in a structured way. Then, Graph all the four FFs are driven by a single type of standard cell with
Representation Generator stage analyzes this intermediate N = max(N1 ) = 4. We have set ISM of 85% as the threshold
representation of the netlist to generate the desired graph of the R to filter the non-FSM registers.
input netlist, Structured Connected Components Report. The
netlist graph is directed and is represented as a massive pair
of nodes. Each node in a pair represents a cell, and each pair
represents an edge of the graph referring to the interconnection
between two cells. The output of a particular cell connects to
an input port of another cell. Counting the number of pairs, we
can easily find the number of edges present in the input netlist
graph. The total cell count obtained from the synthesis report (a) Sequence detector FSM [30] (b) Accumulator [31]
generated provides the number of nodes. We have chosen an
adjacency list representation for the netlist graph. The netlist Fig. 3: Example FSM and accumulator.
graph can be represented mathematically as G = (V, E) where P-II is common in both accumulator and FSM structures
V is the number of standard cells i.e. number of nodes, and [25] which is also evident from Fig. 3. Therefore, a clear
E is the number of interconnections between two connected distinction must be made between these two. P-III helps in
cells present in the netlist i.e. number of edges. this regard. P-II ensures the self-updating structure of state
2) FSM Candidates Identifier: This sub-module is designed FFs prevalent in control FSMs. P-III stands for ensuring
to perform graph theory-based analysis on the netlist graph state-FFs’ cross-influence characteristics present in control
obtained from Netlist-to-Graph Representation Converter. Our FSM structures that are absent in accumulators. Moreover,
proposed scheme mounts on the structural facts of FSM these two structural facts help us in deriving another metric,
topology, and these facts can isolate FSM state registers from FSM Probability Metric (FPM) for a particular register which
other registers (data registers, accumulators, and counters). To calculates the probability of a register present in the netlist
develop an effective scheme for extracting FSMs, we have being an FSM. The maximum score, T, can be obtained from
thoroughly investigated the structural facts of simple FSMs, the following two parameters:
data registers, accumulators, and counters and found three • Self-influence Parameter: Each FF of a certain FSM
properties mounting on how these circuits are implemented. register should influence themselves, which implies that
These properties are used in mathematically deriving and for- starting from a particular FF, the same FF should be
mulating two important metrics for separating FSM structures traversed back through a combinational logic block which
from the non-FSM ones without adopting sophisticated post- is observable from Fig. 3a. A register of size N should
processing methods. Important properties (P) of potential state have a self-influence parameter of N.
flip-flops (FFs) constituting an FSM register are listed here: • Cross-influence Parameter: Each FF present in a particu-
• P-I: Data (D) inputs of potential state FFs are driven by lar FSM register should influence other FFs and should
dissimilar standard cells. also be influenced by others. It means that starting from
• P-II: Potential state FFs must contain pure combinational the data output port of a particular FSM FF, the data
self-feedback loops. input of another FSM FF of that FSM register should be
• P-III: Potential state FFs present in a particular FSM reached via a combinational logic block which is obvious
register must influence other state FFs and should also from Fig. 3a. One state FF of that register should have a
be influenced by other state FFs of that register [24]. parameter of (N – 1). So, for the register of size N, the
P-I is quite effective in distinguishing data registers from parameter should be N(N – 1).
accumulators, counters, and control FSMs. D-inputs of FFs Therefore, the maximum score will be T = N + N(N – 1) =
forming a data register tend to be driven by similar standard N 2 . For each of the registers, the calculated score, K, comes

Authorized licensed use limited to: University of Florida. Downloaded on July 22,2022 at 18:38:53 UTC from IEEE Xplore. Restrictions apply.
from the total number of self-influence paths, S, and cross- on the combinational DAG to calculate the self-influence and
influence paths, C. Finally, FPM can be calculated using Eq. cross-influence parameters, thus finding the value of the FPM
2. For the sequence detector FSM with N = 2, shown in Fig. metric according to Eq. 2. It not only improves overall run-
3a, has S = 2 and C = 2 since all the FFs have self-influence time significantly but also simplifies post-processing efforts
and cross-influence paths. As a result, it has an FPM of 100%. to extract final FSM Netlists accurately. Our post-processing
On the contrary, the accumulator with N = 4, presented in Fig. method is based on FPM to isolate counters from the control
3b, has S = 4 since all the FFs have self-influence paths (via FSMs. In this way, we have improved overall run-time for
the adder block), but C = 0 as no FFs have cross-influence FSM extraction and provided an automatic detection method
paths. So, FPM is only 25% for the example accumulator. of control FSMs eliminating manual analysis on the extracted
Theoretically, the FSM registers have FPM of exactly or near set of FSMs. The FSM Register Candidates Report contains
100%, but for non-FSM registers, FPM is way less than 100%. all the names and sizes of the potential state registers, the
calculated FPM, and the extracted FSM candidates region
K S+C
FPM = × 100% = × 100% (2) report (containing names of the FFs and combinational gates
T N2 for a particular register). From this report, we have observed
Our developed framework for this sub-module is shown that only registers which form FSMs have FPM of exactly or
in Fig. 4. The framework takes Structured Connected Com- near 100%. Counters (which were not filtered by ISM metric)
ponents Report generated by Netlist-to-Graph Representation show a very low FPM. ISM acts as the pre-processing filter,
Converter as its input that represents the graph representation and FPM acts as a post-processing filter to eliminate non-
of the input netlist. First, the input netlist containing FSMs is control FSM registers. Results also support this claim.
a directed and cyclic graph (DCG) since it contains multiple The overall time complexity of our proposed framework
cycles inside. Analyzing such netlist graphs using existing is O(Vr × (Vc + Ec )). In general, in large flattened netlist
efficient graph algorithms such as Johnson’s algorithm [22] graphs, Ec >> Vc since number of edges always supersedes
is computationally very expensive. Johnson’s algorithm can number of nodes by a large margin. As a result, we can
detect all possible cycles with a fixed length in a graph. ignore the effect of Vc and the overall time complexity can
Finding all cycles with cycle length k in a directed graph be approximated as O(Vr × Ec ). Vr and Ec are quite small
exhibits polynomial time complexity. The time complexity of numbers compared to V and E respectively. Therefore, the time
Johnson’s algorithm is O(V 2 ×log(V ) + V ×E), and it is quite complexity of our proposed approach is much lower compared
time-consuming even to analyze a medium-sized netlist having to existing SCC-based approaches [23]–[25] having the time
200 gates. To reduce time complexity and improve run-time, complexity of O(V + E). FSMx is better than the technique
we partition the netlist DCG into two directed acyclic sub- proposed in [24] in terms of detecting control FSMs accurately.
graphs (DAGs): Combinational and Sequential DAGs. The se- FSMx does not require additional manual analysis on the
quential DAG contains all the edges where one node represents extracted set of FSMs to identify control FSMs. Moreover,
a sequential cell (FFs and latches), and the other represents the space complexity of FSMx is O(Vr + Vc ), which is lower
a non-sequential cell. The combinational DAG contains the compared to existing SCC-based approaches [23]–[25] having
rest of the edges. Next, registers (groups of FFs) are formed space complexity of O(V). These characteristics make FSMx
by analyzing the sequential DAG since it contains all the distinctive from the existing methods proposed in [23]–[25]
sequential cells of the netlist. In other words, E is minimized and more efficient in terms of both run-time and memory.
to Ec that holds only the edges between two non-sequential
cells, and V is partitioned as: Vs which holds all the sequential B. Gate-Level Boolean Function Analyzer
cells (i.e. flip-flops and latches) and Vc that holds rest of the This module is intended to perform an exhaustive functional
non-sequential cells. The other part of E, containing edges analysis of the extracted netlists forming FSM candidates.
between a sequential cell and a non-sequential cell, belongs to The main objective is to extract human-readable gate-level
the sequential DAG. Then, ISM is calculated according to Eq. STGs separately for each of the detected candidates, which
1 and sequential DAG is minimized, taking only the registers makes our framework unique from the previous approaches
having ISM less than the threshold R. Registers existing in in [23]–[25]. Reconstruction of the localized parts of the
the minimized sequential DAG are defined as Potential State entire input netlist representing FSM structures is done in this
Registers. Thus, Vs is minimized to potential state FF vertices phase. After analyzing the FSM Register Candidates Report
Vr , applying our developed ISM metric, and Vr is a very small generated by FSM Candidates Identifier, only FSM candidates
fragment of Vs . with maximum FPM are chosen since these registers have
the highest possibility of constructing control FSMs. The
intermediate representation of the input netlist, Structured Nets
Report with Instances generated by Intermediate Gate-Level
Netlist Processor stage of Netlist-to-Graph Representation
Converter, as shown in Fig. 2, comes very handy at this stage
to automatically generate the desired HDL design modules
and corresponding testbench files for gate-level functional
simulation. As illustrated in Fig. 5, there are two desired de-
sign modules: FSM Netlist which contains state FFs and pure
Fig. 4: FSM Candidates Identifier framework. combinational state transition logic, also termed as Minimum
Finally, only the start points and endpoints of the FFs Extraction Region [23], [24], and the Combinational State
forming the potential state registers are identified, and Depth- Transition Logic (STL). A testbench is generated automatically
First Search (DFS) is applied on the combinational DAG for to perform gate-level functional simulation exhaustively on the
analyzing self-influence and cross-influence among the flip- extracted STL with all possible input patterns.
flops present in a certain register. Instead of considering the The extracted STL and its corresponding testbench are the
entire Vs , we analyze registers containing Vr and apply DFS two inputs to HDL simulator tools for performing the gate-

Authorized licensed use limited to: University of Florida. Downloaded on July 22,2022 at 18:38:53 UTC from IEEE Xplore. Restrictions apply.
implementation methods proposed in [24], [25] by running on
all the benchmarks presented in Table I and found that FSMx
is roughly 10 times faster than those approaches on average.
Since we have used brute-force simulation of the extracted
state transition logic to construct the STG, it takes a major
portion of the overall run-time of FSMx on several benchmarks
showed in Table I. However, we envision addressing this issue
in the future with a different implementation scheme.

Fig. 5: Gate-Level Boolean Function Analyzer framework. B. FSM Extraction Accuracy


level functional simulation. Simulation results are stored in We have provided illustrations of two case studies to
a structured format so that the Gate-Level State Transition demonstrate that the FSMx framework can extract control
Extractor module can quickly analyze that and generate FSMs accurately from designs varying in size, complexity, and
corresponding gate-level STG. As a result, we obtain all number of FSMs. The first benchmark is the implementation
the probable state transitions which might exist in the gate- of SHA-512 [37] which is the RTL implementation of NIST
level with their corresponding input patterns. Since we are FIPS 180-4. The second one is the RTL implementation of
performing brute-force simulation of the state transition logic, AES [39], the hardware implementation of NIST FIPS 197. To
there is a high degree of possibility that some of the states analyze the accuracy, we have compared the extracted gate-
will be unreachable and the reset state of the gate-level STG level STGs with corresponding RTL STGs and checked the
needs to be also detected [23], [24]. To extract the gate-level functional equivalence of the extracted FSM netlists with the
STG, we have implemented the sub-module Gate-Level State corresponding FSM RTLs using Synopsys Formality.
Transition Extractor. It is intended to analyze the gate-level 1) NIST SHA-512: SHA-512 implements an iterative and
functional simulation result of the Combinational STL and one-way hash function which can process a certain message
construct the corresponding gate-level STG of that particular to generate message digest. We have used sha512 as the top
FSM Netlist. First, it constructs an initial STG and identifies all module of [37] that contains the SHA-512 core along with an
the transitions with corresponding input patterns. Next, DFS interface, and obtained the benchmark flattened netlist using
is performed to identify the reset and unreachable states, if Cadence Genus. The complex netlist contains 10,763 gates
any. Finally, the unreachable states are removed, and the final (nodes) and 361,582 edges as shown in Table I. The design has
gate-level STG is displayed. We have used Python’s library a control FSM sha512 ctrl fsm and three counters: round ctr,
Graphviz to display the extracted final human-readable gate- w ctr and work factor ctr having widths of 7, 7 and 32 bits
level STGs generated by FSMx. Moreover, input-independent respectively [37], with ISM of 85.7%, 71.4% and 87.5%, and
FSM registers, forming counters, are detected and removed FPM of 28.6%, 42.9% and 52.1% respectively.
to construct the final control FSM set. Most of the counters
get removed with our developed ISM and FPM metrics. This
step acts as a further pruning stage as counters are sometimes
difficult to isolate from control FSMs.

IV. E XPERIMENTAL R ESULTS


A. FSM Extraction Run-time
There are two primary inputs to the FSMx framework: the
synthesized flattened gate-level netlist and the standard cell (a) RTL STG extracted by Quartus II (b) Gate-Level STG
technology library used during synthesis as depicted in Fig. 1. Fig. 6: RTL and gate-level STGs of SHA-512 control FSM.
The technology library in .v format that contains all the HDL
descriptions of the standard cells present is also required by the Obtained flattened netlist was analyzed using the FSMx
NC-Verilog tool to perform gate-level simulation and generate framework. After processing, FSMx extracted only one FSM
results to extract individual gate-level STGs for each of the and its corresponding gate-level STG (shown in Fig. 6b)
identified control FSMs. The benchmark circuits listed in from the netlist. Here, the ‘00’ state is the detected reset
Table I were synthesized using Cadence Genus, and Synopsys state displayed in light blue color. The only STG obtained
SAED90nm was used as the technology library. Then, the from the entire SHA-512 RTL [37], extracted by INTEL
obtained flattened netlists were analyzed using FSMx, and Quartus II shown in Fig. 6a, corresponds to the SHA-512
all the control FSMs present in the netlists were extracted core control FSM RTL STG. The states are encoded using
perfectly. All the experiments on the benchmark circuits have binary representation as ‘00’, ‘01’ and ‘10’ for CTRL IDLE,
been performed using an Intel Core i7-1065G7 processor CTRL ROUNDS and CTRL DONE respectively, as shown
clocked at 1.3 GHz with 16GB RAM on a personal desktop in the RTL. Table II shows all the state transitions extracted
and obtained worst-case run-times for extracting FSMs are from SHA-512 netlist where the transitions shown in red are
shown in Table I. Our experimental results showed that data common in both the RTL and extracted STG, indicating the
registers, accumulators, and counters exhibit ISM of exact or state transitions matched perfectly in the RTL and extracted
near 100%. However, to provide more flexibility, we have STG. The rest of the state transitions have been found due to
considered ISM of 85% to be high enough to filter out the exhaustive analysis of the extracted state transition logic and
data register and majority of counter and accumulator FFs correspond to don’t care state transitions added after synthesis.
and set it as the threshold R. Testing on 22 benchmarks ‘11’ is the only don’t care state. From Fig. 6 and Table II, it is
varying in size and complexity has aided us to get such a evident that the RTL STG is a subset of the extracted gate-level
good threshold. We compared the run-time of FSMx with the STG. It proves that FSMx can extract FSMs perfectly. State

Authorized licensed use limited to: University of Florida. Downloaded on July 22,2022 at 18:38:53 UTC from IEEE Xplore. Restrictions apply.
TABLE I: Worst-case run-time for sample benchmarks.
Name Gate Count Edge Count Control FSM Count Control FSM-FF Count Control FSM ISM(%) Control FSM FPM(%) FSMx Run-time
UART [32] 576 1402 2 3, 3 67, 33 100, 100 0.8 s.
XTEA Cipher [33] 955 2214 1 2 50 100 1 s.
SAYEH CPU [34] 1320 9601 1 4 50 100 10 s.
CMAC Cipher [35] 1549 3255 1 3 67 100 1 s.
SHA-256 [36] 5254 120075 1 2 50 100 19 s.
SHA-512 [37] 10763 361582 1 2 50 100 3 min. 28 s.
POLY1305 MAC [38] 11586 81709 2 3, 3 67, 33 100, 100 4 min. 30 s.
AES-128 [39] 12976 838121 4 2, 2, 2, 2 50, 50, 50, 50 100, 100, 100, 100 6 min. 20 s.
Tiny MIPS CPU [40] 17443 662773 1 4 50 100 4 min. 11 s.
Smart Card RSA [41] 35521 83882 3 2, 3, 4 50, 67, 75 100, 100, 100 2 min. 16 s.

transition conditions of the extracted FSM have been analyzed logic synthesis and optimization stage. Designers can easily
further. The state transitions are input-dependent, indicating all detect the vulnerable state transitions existing in the gate-level
the three counters were removed using ISM and FPM. Thus, abstraction and hence perform the fault-injection assessment
the control FSM of SHA-512 was identified successfully. using the proposed AVFSM framework [8]. Furthermore, the
Run-time of the techniques proposed in [24], [25] on this extracted human-readable STGs can also aid the designers in
benchmark were 29 minutes and 52 minutes, respectively, with developing novel schemes for FSM-based IP watermarking
additional manual analysis, but FSMx required only around 3.5 and logic obfuscation. STG extraction scheme proposed in [24]
minutes with no further manual analysis to make a decision. generates a single STG for a particular SCC region identified
TABLE II: State transitions extracted from SHA-512 netlist. using [28], hence fails to separate individual STGs of the
detected FSMs if multiple FSMs are present inside that SCC
Transitions Common in RTL & GL Transitions Existing in GL Only region. Finally, an attacker can localize all the control FSM
Present State Next State Present State Next State
00 00 01 00 regions quite easily using FSMx, launch powerful structural
00 01 01 11 attacks on a flattened netlist to perform malicious activities, for
10 00 10 11 example, to implant stealthy Hardware Trojan in the control
01 01 11 11 circuit to bypass certain state transitions, and eventually leak
01 10 11 00
10 01 – – secret information [8]. The accuracy and run-time of FSMx
can help significantly in localizing the control FSMs perfectly
2) NIST AES: Analysis on the synthesized flattened netlist and reducing the overall time required for a particular struc-
of the RTL implementation of NIST AES [39] seemed quite tural attack. However, from a security standpoint, FSMx can
challenging since it contains 4 control FSMs: aes core ctrl, assist the designers in evaluating the efficacy of a particular
decipher ctrl, encipher ctrl and key mem ctrl with 6 counters sequential logic locking scheme by analyzing the required time
used for various operations. The counters have widths of 2, 4, by an attacker to extract all the control FSMs from the netlist
2, 4, 2 and 4 bits respectively with ISM of 100%, 75%, 100%, and launch powerful structural attacks. In these scenarios,
50%, 100% and 50%, and FPM of 50%, 75%, 50%, 62.5%, extraction of all the control FSMs from a flattened netlist
50% and 62.5%, respectively. The flattened netlist obtained with their STGs is a pre-processing stage. FSMx is a better
using Cadence Genus contains 12,976 gates and 838,121 edges candidate compared to existing approaches [23]–[25] for this.
as mentioned in Table I. All extracted gate-level STGs of the
detected four control FSMs are shown in Fig. 7. The same
validation method, described in Section IV-B1 was used and
found that RTL STG is a subset of the extracted gate-level
STG in all of the cases. All the counters were successfully
filtered out using ISM and FPM. These two case studies
and experimental results presented in Table I strengthen our
claim that FSMx is capable of extracting all the control FSMs
present in the netlist 100% accurately. The control FSM count (a) Core Controller STG (b) Key Memory Controller STG
and control FSM-FF count obtained from the corresponding
extracted FSM netlists from the synthesized flattened netlists
of the benchmarks presented in Table I have been verified
by analyzing the corresponding RTL and comparing with the
control FSM state variables present in the RTL. Moreover, in
both of the presented case studies, we have used Formality to
check the functional equivalence of each of the extracted con-
trol FSM from the netlist and the corresponding control FSM
RTL. Results demonstrate that both are functionally equivalent (c) Encryption Controller STG (d) Decryption Controller STG
in every scenario, with all the FFs perfectly matched. Run-time Fig. 7: Extracted gate-level STGs from AES netlist.
of the existing methods [24], [25] on AES benchmark were
41 minutes and 58 minutes, respectively, with further manual V. C ONCLUSION
analysis. However, FSMx required around 6.5 minutes with This paper presents a fast and efficient method based on
no additional manual analysis to remove the counters. graph theory to automatically extract FSMs precisely control
From Fig. 6b and Fig. 7, it is evident that FSMx is capable FSMs from an optimized flattened gate-level netlist. We envi-
of extracting human-readable STGs separately for each of the sion incorporating our current work to develop novel FSM-
control FSMs present in the netlist. Moreover, all the probable based watermarking and logic obfuscation techniques from
hidden states and corresponding state transitions present in a hardware security perspective. We also envision extending
the gate-level abstraction can be easily identified (absent in FSMx to extract FSMs from netlists synthesized using FPGA
the RTL abstraction). These features of FSMx help designers libraries and netlists preserving hierarchy that lack mixing of
perform a number of security assessments quite easily after the logic blocks and design flattening stage.

Authorized licensed use limited to: University of Florida. Downloaded on July 22,2022 at 18:38:53 UTC from IEEE Xplore. Restrictions apply.
R EFERENCES [30] http://www.ee.ncu.edu.tw/∼jimmy/courses/DSD06/06 FSM.pdf.
[31] G. Vera, J. Parra, C. Kief, M. Pattichis, and H. Pollard, “Integrating
[1] N. Farzana, F. Rahman, M. Tehranipoor, and F. Farahmandi, “SoC se- Reconfigurable Logic in the First Digital Logic Course,” 2006.
curity verification using property checking,” in 2019 IEEE International [32] https://github.com/secworks/uart/tree/master/src/rtl.
Test Conference (ITC). IEEE, 2019, pp. 1–10. [33] https://github.com/secworks/xtea/tree/master/src/rtl.
[2] M. Fyrbiak, S. Strauß, C. Kison, S. Wallat, M. Elson, N. Rummel, and [34] https://opencores.org/projects/sayeh processor.
C. Paar, “Hardware reverse engineering: Overview and open challenges,” [35] https://github.com/secworks/cmac/tree/master/src/rtl.
in 2017 IEEE 2nd International Verification and Security Workshop [36] https://github.com/secworks/sha256/tree/master/src/rtl.
(IVSW). IEEE, 2017, pp. 88–94. [37] https://github.com/secworks/sha512/tree/master/src/rtl.
[3] R. Karri, J. Rajendran, K. Rosenfeld, and M. Tehranipoor, “Trustworthy [38] https://github.com/secworks/poly1305/tree/master/src/rtl.
hardware: Identifying and classifying hardware trojans,” Computer, [39] https://github.com/secworks/aes/tree/master/src/rtl.
vol. 43, no. 10, pp. 39–46, 2010. [40] https://github.com/gremerritt/multicycle-processor.
[4] D. Hély, M.-L. Flottes, F. Bancel, B. Rouzeyre, N. Berard, and M. Ren- [41] https://github.com/wvangansbeke/Smart-Card-RSA.
ovell, “Scan design and secure chip [secure IC testing],” in IOLTS:
International On-Line Testing Symposium. IEEE, 2004, pp. 219–224.
[5] P. C. Kocher, “Timing attacks on implementations of Diffie-Hellman,
RSA, DSS, and other systems,” in Annual International Cryptology
Conference. Springer, 1996, pp. 104–113.
[6] P. Kocher, J. Jaffe, and B. Jun, “Differential power analysis,” in Annual
international cryptology conference. Springer, 1999, pp. 388–397.
[7] E. Biham and A. Shamir, “Differential fault analysis of secret key
cryptosystems,” in Annual international cryptology conference, 1997.
[8] A. Nahiyan, K. Xiao, K. Yang, Y. Jin, D. Forte, and M. Tehranipoor,
“AVFSM: A framework for identifying and mitigating vulnerabilities in
FSMs,” in 2016 53nd ACM/EDAC/IEEE Design Automation Conference
(DAC), 2016, pp. 1–6.
[9] G. K. Contreras, A. Nahiyan, S. Bhunia, D. Forte, and M. Tehranipoor,
“Security vulnerability analysis of design-for-test exploits for asset
protection in socs,” in 2017 22nd Asia and South Pacific Design
Automation Conference (ASP-DAC). IEEE, 2017, pp. 617–622.
[10] M. Tehranipoor, H. Salmani, and X. Zhang, “Integrated circuit authen-
tication,” Switzerland: Springer, Cham. doi, vol. 10, pp. 978–3, 2014.
[11] S. Bhunia and M. Tehranipoor, “The Hardware Trojan War,” Cham,,
Switzerland: Springer, 2018.
[12] A. T. Abdel-Hamid, S. Tahar, and E. M. Aboulhamid, “A survey on IP
watermarking techniques,” Design Automation for Embedded Systems,
vol. 9, no. 3, pp. 211–227, 2004.
[13] K. Juretus and I. Savidis, “Synthesis of Hidden State Transitions for
Sequential Logic Locking,” IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, 2020.
[14] L. Zhang and C.-H. Chang, “State encoding watermarking for field
authentication of sequential circuit intellectual property,” in 2012 IEEE
International Symposium on Circuits and Systems (ISCAS). IEEE, 2012,
pp. 3013–3016.
[15] S. E. Quadir, J. Chen, D. Forte, N. Asadizanjani, S. Shahbazmohamadi,
L. Wang, J. Chandy, and M. Tehranipoor, “A survey on chip to
system reverse engineering,” ACM journal on emerging technologies
in computing systems (JETC), vol. 13, no. 1, pp. 1–34, 2016.
[16] M. T. Rahman, D. Forte, Q. Shi, G. K. Contreras, and M. Tehranipoor,
“CSST: Preventing distribution of unlicensed and rejected ICs by un-
trusted foundry and assembly,” in 2014 IEEE International symposium
on defect and fault tolerance in VLSI and nanotechnology systems
(DFT). IEEE, 2014, pp. 46–51.
[17] D. Forte, S. Bhunia, and M. M. Tehranipoor, Hardware protection
through obfuscation. Springer, 2017.
[18] M. Borowczak and R. Vemuri, “Mitigating information leakage during
critical communication using S* FSM,” IET Computers & Digital
Techniques, vol. 13, no. 4, pp. 292–301, 2019.
[19] F. Farahmandi and P. Mishra, “FSM Anomaly Detection Using Formal
Analysis,” in 2017 IEEE International Conference on Computer Design
(ICCD), 2017, pp. 313–320.
[20] M. E. Gilford et al., “Recognition of a state machine in high-level
integrated circuit description language code,” 2004, uS Patent 6,675,359.
[21] J.-C. Giomi, “Method of extracting implicit sequential behavior from
hardware description languages,” Jun. 30 1998, uS Patent 5,774,370.
[22] D. B. Johnson, “Finding all the elementary circuits of a directed graph,”
SIAM Journal on Computing, vol. 4, no. 1, pp. 77–84, 1975.
[23] K. S. McElvain, “Methods and apparatuses for automatic extraction of
finite state machines,” US Patent 6,182,268, Tech. Rep., Jan. 2001.
[24] M. Fyrbiak, S. Wallat, J. Déchelotte, N. Albartus, S. Böcker, R. Tessier,
and C. Paar, “On the difficulty of FSM-based hardware obfuscation,”
IACR Transactions on Cryptographic Hardware and Embedded Systems,
pp. 293–330, 2018.
[25] Y. Shi, C. W. Ting, B.-H. Gwee, and Y. Ren, “A highly efficient method
for extracting FSMs from flattened gate-level netlist,” in Proceedings of
2010 IEEE international symposium on circuits and systems. IEEE,
2010, pp. 2610–2613.
[26] E. F. Moore, “Gedanken-experiments on sequential machines,” in Au-
tomata Studies.(AM-34), Volume 34. Princeton University Press, 2016.
[27] Z. Kohavi and N. K. Jha, “Switching and Finite Automata Theory.”
Cambridge University Press, 2009.
[28] R. Tarjan, “Depth-First Search and Linear Graph Algorithms,” SIAM J.
Comput., vol. 1, pp. 146–160, 1972.
[29] T. Meade, Y. Jin, M. Tehranipoor, and S. Zhang, “Gate-level netlist
reverse engineering for hardware security: Control logic register identifi-
cation,” in 2016 IEEE International Symposium on Circuits and Systems
(ISCAS), 2016, pp. 1334–1337.

Authorized licensed use limited to: University of Florida. Downloaded on July 22,2022 at 18:38:53 UTC from IEEE Xplore. Restrictions apply.
View publication stats

You might also like