FPGA Chaotic Generator Design

This document discusses the model-based design and FPGA implementation of a Lorenz chaotic generator using Xilinx System Generator. It presents 32-bit and 16-bit fixed-point models of the Lorenz attractor and optimizations to improve timing performance and reduce resource usage. Implementation results on a Zynq FPGA show that the proposed design approach triples the maximum operating frequency for both fixed-point configurations.


2017 2nd Asia-Pacific Conference on Intelligent Robot Systems

System Generator Model-Based FPGA Design Optimization and Hardware Co-simulation for Lorenz Chaotic Generator

Lei Zhang
Faculty of Engineering and Applied Science
University of Regina, Regina
S4S0A2 Canada
e-mail: lei.zhang@uregina.ca

978-1-5090-6793-0/17/$31.00 ©2017 IEEE

Abstract: Chaotic systems can be synchronized and used for secure communication to transmit video, audio and text files. Field Programmable Gate Arrays (FPGAs) are beneficial for the implementation of high speed, low cost and low power embedded communication systems. In this paper, a model-based design approach is presented for FPGA implementation and optimization of chaotic generators. The Lorenz attractor has its significance in studying chaotic systems and is used as the design subject in this paper. The conceptual model design is built using MATLAB Simulink, and the equivalent hardware model is created using Xilinx System Generator for FPGA implementation. The design models are created with 32-bit fixed-point and 16-bit fixed-point data formats and implemented on FPGA to evaluate the design performance, including the maximum operating clock frequency, resource utilization and power consumption. The 32-bit and 16-bit fixed-point models are further optimized by using timing analysis and adding delays to break critical paths to improve timing performance. The implementation results show that the proposed design and optimization approach has achieved promising improvement on design performance by tripling the maximum operating frequency for both 32-bit and 16-bit fixed-point configurations. The FPGA hardware co-simulation results demonstrate the anticipated Lorenz chaotic generator outputs for both designs.

Keywords: chaotic generator; Lorenz attractor; hardware cosimulation; FPGA

I. INTRODUCTION

Chaotic systems are aperiodic and appear random in the time domain [1], but they can be synchronized and used for message encryption in communication systems. Various chaotic generators have been implemented on FPGA in real-time for synchronous communication applications [2]. It is proved that two chaotic systems with the same parameter settings will synchronize with each other [3]. One big challenge in embedded communication system design is to achieve the specified security level with constraints on hardware resources such as memory size and computational capacity, while meeting the performance requirements for speed and power.

A Lorenz chaotic generator conceptual model and its Xilinx System Generator model are presented in [4], using 32-bit signed fixed-point with 18 bits of fraction, at a clock step size (dt) of 0.01, and achieving a maximum frequency of 2.5 MHz. Another Lorenz attractor implementation on a Xilinx Spartan 3E FPGA device is reported in [5], using 32-bit signed fixed-point data format with 20 bits of fraction. One more Lorenz attractor hardware implementation is given by [6], [7] using 16Q16 fixed-point data format on a Virtex II-Pro FPGA. All the above mentioned designs use 32-bit fixed-point data format. In this paper, an extended model-based design approach and optimization methods are presented for FPGA implementation of chaotic systems using Xilinx System Generator on a Zynq 7z020 FPGA. 32-bit fixed-point and 16-bit fixed-point data formats are used for the model design. Timing analysis is carried out based on critical paths listed in FPGA implementation timing reports. These models are further optimized with added delays to break long data paths and improve timing performance.

The Lorenz attractor is represented by equation (1).

$$
\begin{aligned}
\frac{dx}{dt} &= \sigma (y - x) \\
\frac{dy}{dt} &= \rho x - y - xz \\
\frac{dz}{dt} &= -\beta z + xy
\end{aligned}
\tag{1}
$$

where the parameters are σ = 10, ρ = 28, β = 8/3, the initial values are x0 = 10, y0 = 20, z0 = 30, and the step size is dt = 0.01. The solutions of these three-dimensional ordinary differential equations (ODEs) depend on the initial values.

A. Forward Euler Method

When implementing a chaotic generator such as the Lorenz attractor on an FPGA, a simple discrete integration method can be used to reduce FPGA resource usage, but it may introduce rounding errors and cause the output not to converge. Fixed-point representation errors will always be present; they can be accepted so long as the solutions to the differential equations converge at a given step size [8]. The Euler method is a first-order numerical procedure for solving ODEs. The forward Euler method is based on a truncated Taylor series expansion. Given n ODEs in n variables as in equation (2), the forward Euler method for FPGA implementation is represented by equation (3). It is noted that a large step size (dt) could introduce anomalies in chaotic generators [9]. Other numerical methods, such as the fourth-order Runge-Kutta method (RK-4), can also be used for solving ODEs [7].

$$
\begin{aligned}
\frac{dx_1}{dt} &= f_1(x_1, \ldots, x_n) \\
&\;\;\vdots \\
\frac{dx_n}{dt} &= f_n(x_1, \ldots, x_n)
\end{aligned}
\tag{2}
$$

Applying the forward Euler method gives:

$$
\begin{aligned}
x_1(t + dt) &= x_1(t) + f_1(x_1(t), \ldots, x_n(t))\,dt \\
&\;\;\vdots \\
x_n(t + dt) &= x_n(t) + f_n(x_1(t), \ldots, x_n(t))\,dt
\end{aligned}
\tag{3}
$$

B. Fixed-point FPGA Implementation

The Lorenz attractor model is designed using Xilinx System Generator (XSG) and Simulink. The Simulink blocks are configured with 32-bit fixed-point and 16-bit fixed-point data formats respectively. The fixed-point models are further optimized to improve timing performance and reduce FPGA resource utilization. The optimization approach is to first generate the Simulink conceptual model for the Lorenz attractor and obtain simulation results for 32-bit floating point data format. Then, based on the output data range, an initial fixed-point data format is selected for the XSG blocks to create the hardware model for FPGA implementation. A commonly used signed fixed-point representation Qm.n gives m bits of integer, n bits of fraction, and 1 bit of sign. Its representation range is between -2^m and 2^m - 2^-n, and its precision is 2^-n. In a System Generator model, the signed fixed-point data format is represented as Fixaa_bb, where aa is the total number of bits and bb is the number of fractional bits. For example, Fix32_18 represents a 32-bit fixed-point data format with 1 sign bit, 18 fractional bits and 13 integer bits.

It is observed from the Lorenz attractor conceptual simulation that the intermediate values in the model are in an approximate range between -1024 and +1024. Therefore, at least 10 bits are required for the integer part to avoid overflow. In order to compare the implementation result with the referenced design [5], the Fix32_18 data format is used. When the Fix16_5 data format with 1 sign bit and 10 integer bits is used for the 16-bit model design, only 5 bits are left for the fraction. The fractional precision is 2^-5 (0.03125). This is problematic because when the clock step size (dt) is set to a value smaller than the minimum fraction, it will be rounded to 0. Therefore, the data format for the dt multiplier blocks is set to Fix16_9. Moreover, overflow is generated by the x * z multiplier block during simulation. Therefore, this individual multiplier block is configured with the Fix16_4 data format, with 11 bits for the integer part, to avoid overflow.

C. Maximum Clock Frequency

A Xilinx Zedboard with a Zynq 7020 FPGA is used for FPGA implementation and hardware co-simulation. This board has a 100 MHz system clock, with a 10 ns clock period (Ts). Therefore the timing constraint for the clock period of the target design is set to 10 ns. Timing closure is the process by which an FPGA design is modified to meet its timing requirements. The maximum frequency of an FPGA design is not generated directly by the Vivado software tool. It can be calculated using the clock period and the Worst Negative Slack (WNS) given by the implementation timing report, as in equation (4).

$$
f_{max} = \frac{1}{T_s - WNS}
\tag{4}
$$

where fmax is the maximum frequency and Ts is the clock period. When an implementation is completed successfully, meeting all timing constraints, the WNS value should be positive, which means a faster fmax or a shorter Ts can be used. On the other hand, if an implementation fails to meet all timing constraints, the WNS is negative, and the timing constraint for Ts can be increased to achieve timing closure for a successful implementation, without changing the design. The fmax can only be calculated when the implementation is completed without timing failure.

II. 32-BIT MODEL FPGA IMPLEMENTATION

The 32-bit fixed-point XSG model is created using the Fix32_18 data format, with 1 sign bit, 18 fractional bits and 13 integer bits. The model is used to generate a Vivado project for FPGA implementation. The same design model is implemented with different timing constraint settings for the clock period. This method can be used to find the maximum frequency for a design without a predefined timing requirement. The timing closure of the implementation is achieved by increasing the clock period Ts when a timing failure (negative WNS) occurs during implementation. The new Ts should be set greater than the current Ts minus the (negative) WNS, that is, greater than Ts - WNS. The implementation results are listed in Table I.

TABLE I. 32-BIT FIXED-POINT FPGA IMPLEMENTATION

                               Spartan 3E (a)   Zynq 7020   Zynq 7020   Zynq 7020
Clock period Ts (ns)           20               40          30          25
Worst Negative Slack (ns)      NA               9.816       2.457       (-0.683)
Maximum Frequency (MHz)        18.03            33.13       36.31       NA
Look-up Table (LUT)            1912             868         868         868
Registers/Flip-flop (FF)       144              96          96          96
Slices                         1029             338         338         343
DSP48E1                        8                8           8           8
Total On-chip Power (W)        NA               0.153       0.154       0.173
(a) Reference design in [5]

The implementation succeeds when Ts = 30 ns, but fails when Ts = 25 ns. The fmax at 30 ns is 36.31 MHz, which is twice as fast as the referenced design on the Spartan-3E FPGA. The implementation results also show that the power consumption increases as the clock frequency increases.

III. 32-BIT MODEL OPTIMIZATION

The 32-bit fixed-point design is optimized in order to meet the timing requirement for a 10 ns clock period at the 100 MHz system clock frequency. The implementation with a 25 ns clock period reports a number of critical paths with long delay for register 1. In order to increase the maximum frequency to the system clock frequency on the Zedboard at 100 MHz, several optimization methods are applied [10], [11].
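
The timing figures that drive this optimization all come from equation (4). As a minimal sketch, the Python below recomputes fmax from the (Ts, WNS) pairs in the Zynq 7020 columns of Table I and, for a failed run, the clock period to try next. The function names `fmax_mhz` and `next_clock_period_ns` are hypothetical helpers introduced for illustration; they are not part of Vivado or System Generator.

```python
def fmax_mhz(ts_ns, wns_ns):
    """Equation (4): f_max = 1 / (Ts - WNS), returned in MHz.

    Returns None for a negative WNS, because the text only defines
    f_max for an implementation that completed without timing failure.
    """
    if wns_ns < 0:
        return None
    return 1e3 / (ts_ns - wns_ns)  # 1 / (1 ns) = 1000 MHz


def next_clock_period_ns(ts_ns, wns_ns):
    """Lower bound for the next Ts after a timing failure: Ts - WNS, with WNS < 0."""
    return ts_ns - wns_ns


# (Ts, WNS) pairs copied from Table I, Zynq 7020 columns.
for ts, wns in [(40, 9.816), (30, 2.457), (25, -0.683)]:
    f = fmax_mhz(ts, wns)
    if f is None:
        bound = next_clock_period_ns(ts, wns)
        print(f"Ts = {ts} ns: timing failed, retry with Ts > {bound:.3f} ns")
    else:
        print(f"Ts = {ts} ns: fmax = {f:.2f} MHz")
```

Under these assumptions the two passing runs reproduce the 33.13 MHz and 36.31 MHz entries of Table I, and the failed 25 ns run suggests a period just above 25.683 ns.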

Figure 1. Lorenz Attractor 32-bit Fixed-point Optimization Model with 4 Delays
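
As a software cross-check of what the model in Fig. 1 computes, the forward Euler recurrence of equation (3) applied to the Lorenz system of equation (1) can be sketched in Python with every intermediate result rounded to a Fix32_18-style grid. This is an illustration only, not the System Generator model: the helper names `quantize` and `lorenz_euler` are hypothetical, round-to-nearest with saturation is an assumed overflow policy, quantization is applied per derivative and per state update rather than per hardware block, and the pipeline delays of the optimized design are not modeled.

```python
def quantize(value, frac_bits, total_bits=32):
    """Round to a signed fixed-point grid and saturate (assumed policy)."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    raw = max(lo, min(hi, round(value * scale)))
    return raw / scale


def lorenz_euler(steps, frac_bits=18, total_bits=32,
                 sigma=10.0, rho=28.0, beta=8.0 / 3.0,
                 x=10.0, y=20.0, z=30.0, dt=0.01):
    """Forward Euler integration of the Lorenz system on a fixed-point grid."""
    def q(v):
        return quantize(v, frac_bits, total_bits)

    x, y, z = q(x), q(y), q(z)
    trajectory = [(x, y, z)]
    for _ in range(steps):
        # Right-hand sides of equation (1), evaluated at the current state.
        dx = q(sigma * (y - x))
        dy = q(rho * x - y - x * z)
        dz = q(-beta * z + x * y)
        # Equation (3): x(t + dt) = x(t) + f(x(t)) * dt
        x, y, z = q(x + dx * dt), q(y + dy * dt), q(z + dz * dt)
        trajectory.append((x, y, z))
    return trajectory


traj = lorenz_euler(2000)
print(traj[1])  # state after one step from (10, 20, 30)
```

Plotting the three components of `traj` against each other gives a reference shape to compare with hardware co-simulation outputs, with the caveat that chaotic trajectories computed with different rounding diverge sample by sample even when the attractor looks the same.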

Firstly, double delay blocks are added to the outputs. Each delay block has a latency of 1. This setup allows one register to be placed next to the FPGA on-chip logic and the other one to be packed into an IOB (Input/Output Block) on the FPGA, which avoids generating a critical path from logic to IOB. A delay block with a latency of 2 is implemented by a shift register (SRL16) in the FPGA and does not give the same result for the model-based design. The implementation results are improved by this optimization, as shown in Table II. The timing performance at a 25 ns clock period meets the timing requirement for the design with a small margin of 0.02 ns WNS, but fails at a 20 ns sample period with a negative WNS of -5.181 ns.

TABLE II. 32-BIT IMPLEMENTATION RESULTS

Clock period Ts (ns)           40          30          25          20
Worst Negative Slack (ns)      10.167      2.983       0.020       (-5.181)
Maximum Frequency (MHz)        33.52       37.01       40.03       NA
Look-up Table (LUT)            868         868         868         868
Registers/Flip-flop (FF)       320         320         320         320
Slices                         375         388         400         389
DSP48E1                        8           8           8           8
Total On-chip Power (W)        0.154       0.165       0.175       0.188

Secondly, delay blocks are added to cut critical paths. Additional delays are also added on the non-critical paths to match the number of delays on all paths and ensure that the output signals are correctly aligned. Timing closure is achieved after adding four delay blocks on the critical data path. The optimization approach is taken by the following steps:

* Add two delays (latency) for Multiplier blocks and one delay for the Constant Gain block; add additional delays to match the delays for the three data paths.
* Add delay blocks to the feedback path for the three integrators; set the latency to 4 for the three feedback delay blocks to match the delays on the forward data paths.
* Set Adder/Subtracter blocks to be implemented using the dedicated FPGA resource DSP48. This will save FPGA logic resources and achieve better performance.
* Set Multiplier blocks to optimize for speed and use embedded multipliers (DSP48). This will save FPGA logic resources and achieve better performance.
* Set Constant Multiplier blocks to be implemented using Distributed RAM. Distributed RAMs are implemented using FPGA logic resources, which are suitable for implementing small memories. They can be relatively faster and more flexible than Block RAM for the 'Place and Routing' process in FPGA implementation to meet the timing requirement of the design.
* Add a Down Sample block to each output. The down sample rate equals the total number of delays on each of the three data paths for x, y and z. Each path has a total delay of 5. Set the down sample rate for the three output Down Sample blocks to 5. Select 'first value of frame' as the output sample.
* Add one additional delay block before the 'Gateway Out' block. Set the block to be implemented using 'behavioral HDL'. This block will be bound to the IOB and hence shorten the long timing path from the logic to the output ports.
* Balance the delays by cutting the critical paths with timing failures listed in the implementation report.

Fig. 1 shows the design optimization model. The implementation results for the original and optimized designs are listed in Table III. In the original design model without delay blocks, the three outputs are generated within one clock cycle. By adding delay blocks, critical long timing paths can be cut into shorter segments to meet the timing requirement. However, this means it will take multiple clock cycles to complete the calculation, which reduces the data throughput. The trade-off between timing performance and throughput needs to be considered for specific applications.

TABLE III. 32-BIT OPTIMIZATION RESULTS (Sample period Ts = 10 ns)

                               No delay    3-delay     4-delay
Worst Negative Slack (ns)      (-15.603)   (-2.454)    0.059
Maximum Frequency (MHz)        NA          NA          100.59
Look-up Table (LUT)            868         875         945
Registers/Flip-flop (FF)       96          875         1142
Slices                         336         408         444
DSP48E1                        8           8           8
Total On-chip Power (W)        0.252       0.171       0.177

IV. 16-BIT MODEL FPGA IMPLEMENTATION & OPTIMIZATION

Fig. 2 shows the optimized FPGA hardware implementation model with 16-bit fixed-point data format, using the same optimization approach as the 32-bit model. The challenge for the 16-bit fixed-point model design is to avoid overflow of the chaotic generator. Overflow occurs when all blocks are configured with the Fix16_5 data format. Trial and error is used with the model simulation to correctly represent the data range and data precision. The implementation results of the 16-bit fixed-point model and the optimized model for the Lorenz generator are listed in Table IV.

TABLE IV. 16-BIT FIXED-POINT FPGA IMPLEMENTATION AND OPTIMIZATION RESULTS

Model                          Non-Optimized   Non-Optimized   Optimized   Optimized
Sample period Ts (ns)          25              20              20          10
Worst Negative Slack (ns)      2.429           -1.117          4.808       2.564
Maximum Frequency (MHz)        44.30           NA              65.82       136.13
Look-up Table (LUT)            376             376             377         376
Registers/Flip-flop (FF)       48              48              48          362
Slices                         149             151             156         185
DSP48E1                        2               2               2           2
Total On-chip Power (W)        0.145           0.151           0.141       0.200

Figure 2. Lorenz Attractor 16-bit Fixed-point Optimization Model with 1 Delay
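
The format choices behind the 16-bit model follow from the range and precision rules for signed fixed-point numbers stated in Section I-B. The Python sketch below works through them for the formats named in the paper (Fix16_5, Fix16_9, Fix16_4). The helpers `fix_format` and `to_fix` are hypothetical names, round-to-nearest is an assumed quantization mode, and the product value 1500 is an illustrative number chosen to exceed the Fix16_5 range, not a figure reported in the paper.

```python
def fix_format(total_bits, frac_bits):
    """Range and precision of a signed Fix<total>_<frac> format (1 sign bit)."""
    int_bits = total_bits - 1 - frac_bits
    precision = 2.0 ** -frac_bits
    return {
        "int_bits": int_bits,
        "min": -(2.0 ** int_bits),
        "max": 2.0 ** int_bits - precision,
        "precision": precision,
    }


def to_fix(value, total_bits, frac_bits):
    """Nearest representable value, or None if it overflows the format."""
    fmt = fix_format(total_bits, frac_bits)
    q = round(value / fmt["precision"]) * fmt["precision"]
    if q < fmt["min"] or q > fmt["max"]:
        return None
    return q


print(fix_format(16, 5))      # 10 integer bits, precision 0.03125
print(to_fix(0.01, 16, 5))    # dt = 0.01 collapses to 0.0 in Fix16_5
print(to_fix(0.01, 16, 9))    # dt survives in Fix16_9 as 5/512
print(to_fix(1500.0, 16, 5))  # None: outside the Fix16_5 range
print(to_fix(1500.0, 16, 4))  # fits once 11 integer bits are available
```

The same two checks, whether a constant survives quantization and whether a product stays in range, are what the trial-and-error format selection has to satisfy block by block.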

The original 16-bit model can achieve a 44.3 MHz maximum frequency at 25 ns Ts, but has a negative WNS at 20 ns Ts and fails to meet the timing requirement. After adding one delay to each data path, the optimized model achieves timing closure at 20 ns Ts with a 65.82 MHz maximum frequency, as well as at 10 ns Ts with a 136.13 MHz maximum frequency, more than triple that of the original model. Adding one delay block reduces the data throughput by half, but this can be compensated by using pipelining with increased FPGA resource usage.

V. FPGA HARDWARE CO-SIMULATION

The Zedboard is used for hardware co-simulation to evaluate the Lorenz attractor outputs of the fixed-point models. The designed models are converted into an FPGA configuration image and downloaded to the FPGA on the Zedboard using the JTAG configuration port. The models run on the FPGA device, and the generated results are sent back to the PC via JTAG and displayed by MATLAB. The hardware co-simulation results demonstrate the same chaotic outputs for the Lorenz attractor for the 32-bit fixed-point model. The hardware co-simulation block generated from the optimized design is shown in Fig. 3. The hardware co-simulation outputs x, y, z and their 3D outputs are shown in Fig. 4. The hardware co-simulation for the 16-bit fixed-point optimized model can also generate correct outputs for the Lorenz attractor.

Figure 3. Lorenz Attractor Hardware Co-simulation Model

Figure 4. 32-bit Fixed-point Hardware Co-simulation Outputs

VI. CONCLUSIONS

This paper demonstrated the System Generator model-based FPGA design approach and implementation results for the Lorenz chaotic generator. This design approach can be extended to FPGA implementation of other chaotic generator designs. The aim is to increase the frequency of the Lorenz chaotic generator with FPGA acceleration. Timing closure is achieved by timing analysis of critical paths and adding delays to break them. The implementation results show that the optimized design models achieve better design performance by increasing the maximum operating frequency threefold for both 32-bit and 16-bit fixed-point models. The optimized 32-bit model achieves a maximum frequency of 100.59 MHz with 5 delays. The optimized 16-bit model achieves a maximum frequency of 136.13 MHz with 1 delay. The FPGA resource usage is reduced by approximately 75% for the 16-bit fixed-point model compared to the equivalent 32-bit fixed-point model. The correct Lorenz attractor outputs are generated by the hardware co-simulation.

REFERENCES

[1] A. Abel and W. Schwarz, “Chaos communications: Principles, schemes, and system analysis,” Proceedings of the IEEE, vol. 90, no. 5, pp. 691–710, May 2002.
[2] P. Wu, J. Alam, C. Hu, and J. Li, “Controlling unified hyperchaotic system to encryption digital information,” in Proceedings of the 3rd International Conference on Cloud Security and Management, 2015, pp. 118–121.
[3] H. Kamata, T. Endo, and Y. Ishida, “Practical private speech communication system with chaos using digital signal processor,” J. Acoust. Soc. Jpn. (E), vol. 19, no. 6, 1998.
[4] M. Aseeri, M. I. Sobhy, and P. Lee, “Lorenz chaotic model using filed programmable gate array (FPGA),” in The 45th Midwest Symposium on Circuits and Systems, vol. 1, 2002.
[5] M. Aseeri and M. I. Sobhy, “Field programmable gate array (FPGA) as a new approach to implement the chaotic generators,” in 3rd International Conference on Advanced Engineering Design AED, Prague, Czech Republic, June 2003.
[6] C. Tanougast, Chaos-Based Cryptography: Theory, Algorithms and Application, 2011, ch. 9: Hardware Implementation of Chaos Based Cipher: Design of Embedded Systems for Security Applications.
[7] M. S. Azzaz, C. Tanougast, S. Sadoudi, and A. Dandache, “Real-time FPGA implementation of Lorenz’s chaotic generator for ciphering telecommunications,” in Circuits and Systems and TAISA Conference, 2009. NEWCAS-TAISA ’09. Joint IEEE North-East Workshop on, June 2009, pp. 1–4.
[8] J. M. E. Cuautle and L. Fraga, Engineering Applications of FPGAs - Chaotic Systems, Artificial Neural Networks, Random Number Generators, and Secure Communication Systems. Switzerland: Springer, 2016.
[9] B. Muthuswamy and S. Banerjee, A Route to Chaos Using FPGAs, Volume I: Experimental Observations. Switzerland: Springer, 2015.
[10] Xilinx, Vivado Design Suite User Guide: Model-Based DSP Design Using System Generator, UG897, v2016.1 ed., Xilinx, Apr. 2016.
[11] Vivado Design Suite Tutorial: Model-Based DSP Design Using System Generator, UG948, v2015.3 ed., Xilinx, Oct. 2015.
