Introduction to Field Programmable Gate Arrays
Lecture 1/3
CERN Accelerator School on Digital Signal Processing
Sigtuna, Sweden, 31 May - 9 June 2007
Javier Serrano, CERN AB-CO-HT
Outline
• Historical introduction.
• Basics of digital design.
• FPGA structure.
• Traditional (HDL) design flow.
• Demo.
Historical Introduction
• In the beginning, digital design was done with the 74 series of chips.
• Some people designed their own chips based on Gate Arrays, which were nothing more than arrays of NAND gates.
• The first programmable chips were PLAs (Programmable Logic Arrays): two-level structures of AND and OR gates with user-programmable connections.
• Programmable Array Logic (PAL) devices were an improvement in structure and cost over PLAs. Today such devices are generically called Programmable Logic Devices (PLDs).
• A complex PLD (CPLD) is nothing more than a collection of multiple PLDs plus an interconnection structure.
• Compared to a CPLD, a Field Programmable Gate Array (FPGA) contains a much larger number of smaller individual blocks, plus a large interconnection structure that dominates the entire chip.
Basics of digital design
• Unless you really know what you are doing, stick to synchronous design: sandwiching blocks of combinational logic between flip-flops.
• Combinational logic: the state of the outputs depends on the current state of the inputs alone (ignoring propagation delays for the time being). E.g. AND, OR, mux, decoder, adder...
• D-type flip-flops propagate D to Q upon a rising edge on the clk input.
• Synchronous design simplifies design analysis, which is good given today's logic densities.
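The "combinational logic sandwiched between flip-flops" idea can be sketched as a small behavioural model. This is a minimal illustration in Python, not HDL; the names Register, adder and simulate are hypothetical, and the 32-bit width is an assumption.

```python
class Register:
    """Bank of D-type flip-flops: Q takes the value of D on each rising clock edge."""

    def __init__(self, reset_value=0):
        self.d = reset_value
        self.q = reset_value

    def clock(self):
        self.q = self.d


def adder(a, b, width=32):
    """Combinational logic: the output depends only on the current inputs."""
    return (a + b) & ((1 << width) - 1)


def simulate(samples):
    """Input registers -> combinational adder -> output register, one clock edge per sample."""
    in_a, in_b, out = Register(), Register(), Register()
    trace = []
    for a, b in samples:
        # Between edges the combinational logic settles, using the current Q values.
        in_a.d, in_b.d = a, b
        out.d = adder(in_a.q, in_b.q)
        # Rising edge: every flip-flop samples its D input at the same instant.
        for reg in (in_a, in_b, out):
            reg.clock()
        trace.append(out.q)
    return trace


print(simulate([(1, 2), (3, 4), (5, 6), (0, 0)]))
```

Each sum shows up at the output register one clock edge after its operands were captured by the input registers. Because all state changes happen on the clock edge, the analysis reduces to checking that the logic settles within one clock period.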
Don't do this!
Toggle flip-flops get triggered by glitches produced by different path lengths of counter bits.
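A rough way to see where such glitches come from: when several counter bits change in the same transition but reach the decoding logic at different times, the logic briefly sees values that were never part of the count sequence. The sketch below is a simplified model under stated assumptions; the function names and the per-bit delays are hypothetical, and bits are assumed to have distinct delays.

```python
def decoder_input_sequence(old, new, bit_delays):
    """Values seen by combinational logic while a counter goes from `old` to `new`.

    bit_delays[i] is a hypothetical path delay for bit i (arbitrary units).
    Bits on shorter paths change first, so intermediate values can appear.
    """
    value = old
    seen = [value]
    changing = [i for i in range(len(bit_delays)) if ((old ^ new) >> i) & 1]
    for i in sorted(changing, key=lambda i: bit_delays[i]):
        value ^= 1 << i
        seen.append(value)
    return seen


def count_rising_edges(seen, target):
    """Number of 0 -> 1 transitions of the decoded signal (value == target)."""
    decoded = [v == target for v in seen]
    return sum(1 for prev, cur in zip(decoded, decoded[1:]) if cur and not prev)


# Counter goes from 1 (binary 01) to 2 (binary 10); bit 1 has the shorter path.
seen = decoder_input_sequence(1, 2, bit_delays=[2, 1])
print(seen, count_rising_edges(seen, target=3))
```

With these delays the logic momentarily sees the value 3, so a "count equals 3" decode produces a rising edge that a toggle flip-flop clocked by it would respond to. In a synchronous design the same decode would only be sampled on the clock edge, after it has settled.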
Basics of (synchronous) Digital Design
[Figure: schematic of a registered 32-bit datapath clocked by Clk. Inputs DataInA[31:0], DataInB[31:0] and DataSelect are registered (dataAC, dataBC, dataSelectC); an adder (sum[31:0]) and a two-input multiplexer (DataOut_3[31:0]) feed the output register DataOut[31:0]. Annotated delay: 6.90 ns.]
High clock rate: 144.9 MHz on a Xilinx Spartan IIE.
[Figure: the same datapath with additional pipeline registers. The adder output sum_1[31:0] is registered as sum[31:0], and dataAC and dataSelectC are delayed by one clock cycle (dataACd1, dataSelectCD1) before the multiplexer (DataOut_3[31:0]) and the output register DataOut[31:0]. Annotated delay: 6.60 ns.]
Higher clock rate: 151.5 MHz on the same chip.
Illustrating the latency/throughput tradeoff
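The clock rates quoted for the two versions follow from the annotated delays if these are read as minimum clock periods (f = 1/T). The sketch below redoes that arithmetic and adds a latency estimate; the register-stage counts (2 and 3) are assumptions made for illustration, not figures given in the slides.

```python
def max_clock_mhz(critical_path_ns):
    """Maximum clock frequency in MHz for a given minimum clock period in ns."""
    return 1000.0 / critical_path_ns


def latency_ns(stages, critical_path_ns):
    """Input-to-output latency when running at the maximum clock rate."""
    return stages * critical_path_ns


designs = {
    # "stages" is an assumed count of register stages from input to output.
    "original": {"critical_path_ns": 6.90, "stages": 2},
    "pipelined": {"critical_path_ns": 6.60, "stages": 3},
}

for name, d in designs.items():
    f = max_clock_mhz(d["critical_path_ns"])
    t = latency_ns(d["stages"], d["critical_path_ns"])
    print(f"{name}: {f:.1f} MHz, latency {t:.1f} ns")
```

Under these assumptions the pipelined version accepts new data more often (higher throughput), but each result takes more clock cycles, and more time overall, to emerge.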
Basic FPGA architecture
The logic block: a summary view
Example: using a LUT as a full adder.
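A look-up table (LUT) is just a small memory addressed by the logic inputs, so any function of those inputs can be implemented by filling in its truth table. A minimal sketch of the full-adder case follows, assuming one 3-input table per output; the names build_lut, lut_read, SUM_LUT and CARRY_LUT are hypothetical, and the dedicated carry logic found in real devices is not modelled.

```python
def build_lut(func, n_inputs):
    """Fill a 2**n-entry table by evaluating `func` on every input combination."""
    table = []
    for address in range(1 << n_inputs):
        bits = [(address >> i) & 1 for i in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table


def lut_read(table, *bits):
    """The inputs form the address; the stored bit is the output."""
    address = sum(bit << i for i, bit in enumerate(bits))
    return table[address]


# "Configuring" the LUTs: one table for the sum bit, one for the carry out.
SUM_LUT = build_lut(lambda a, b, cin: a ^ b ^ cin, 3)
CARRY_LUT = build_lut(lambda a, b, cin: (a & b) | (cin & (a ^ b)), 3)


def full_adder(a, b, cin):
    """Return (sum, carry_out) using table look-ups only."""
    return lut_read(SUM_LUT, a, b, cin), lut_read(CARRY_LUT, a, b, cin)


print(SUM_LUT, CARRY_LUT, full_adder(1, 1, 0))
```

Once the tables are filled, the "logic" is a pure memory read: changing the function means changing the stored bits, not the circuit.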
A practical example: Xilinx Virtex II Pro family (used in the lab)
[Figures: device overview, with callouts for the Configurable Logic Block (CLB), the embedded PowerPC and Digitally Controlled Impedance (DCI); slice, with detail of a half-slice; routing resources.]
FPGA state of the art
• In addition to logic gates and routing, in a modern FPGA you can find:
  ◦ Embedded processors (soft or hard).
  ◦ Multi-Gb/s transceivers with equalization and hard IP for serial standards such as PCI Express and Gbit Ethernet.
  ◦ Lots of embedded MAC units, with enough bits to implement single precision floating point arithmetic efficiently.
  ◦ Lots of dual-port RAM.
  ◦ Sophisticated clock management through DLLs and PLLs.
  ◦ System monitoring infrastructure including ADCs.
  ◦ On-substrate decoupling capacitors to ease PCB design.
  ◦ Digitally Controlled Impedance to eliminate on-board termination resistors.
Embedded processors
Why use embedded processors?
Customization: take only the peripherals you need and replicate them as many times as needed. Create your own custom peripherals.
Strike optimum balance in system partitioning.
Serial signaling
• Avoids clock/data skew by using an embedded clock.
• Reduces EMI and power consumption.
• Simplifies PCB routing.
Clock management
Traditional design flow
Each design step is paired with a verification step:
• HDL (verified by Behavioral Simulation): implement your design using VHDL or Verilog.
• Synthesis (verified by Functional Simulation): synthesize the design to create an FPGA netlist.
• Implementation (verified by Timing Simulation): translate, place and route, and generate a bitstream to download into the FPGA.
• Download (verified by In-Circuit Verification).
VHDL 101
Both VHDL code segments produce exactly the same hardware.
VHDL 101: hierarchy
Demo
• Now, let's see how you go from design idea to hardware, using the traditional flow.
• Many thanks to Jeff Weintraub (Xilinx University Program), Bob Stewart (University of Strathclyde) and Silica for some of the slides.