Vivado ZynqMP S1
2021.1.7 08:33:18
Machine Translated by Google
Copyright Notice
Company website: http://www.alinx.com.cn
Technical forum: http://www.heijin.org
Official flagship store: http://alinx.jd.com
Email: avic@alinx.com.cn
Telephone: 021-67676997
Fax: 021-37737073
http://www.alinx.com.cn 2 / 244
We promise that this tutorial is not a one-off, fixed document. We will continuously revise and optimize it based on feedback from the forum and on experience accumulated in actual development practice.
Preface
First of all, thank you for purchasing the ZYNQ development boards AXU3EG, AXU4EV and AXU5EV produced by Xinyi Electronic Technology (Shanghai) Co., Ltd.! Your support and trust in us and our products give us the confidence and courage to move forward.
Some people ask whether it is possible to learn ZYNQ without any background. That depends on where "zero" is. If you cannot read a schematic, have not learned C, do not know what an array is and have no idea about pointers, you are starting from below zero. Learning ZYNQ requires basic hardware knowledge.
This tutorial covers the FPGA part. Through continuous practice, you can master the basic FPGA development flow.
The principle is simple: practice makes perfect. Practice more and you will gradually master the essentials.
Table of contents
Copyright Statement
Preface
Contents
Chapter 1 Introduction to UltraScale+ MPSoC
PS and PL Interconnection Technology
Introduction to ZYNQ Chip Development Process
What skills are required to learn ZYNQ
1.3.1 Software Developers
1.3.2 Logic Developers
Chapter 2 Introduction to Development Board Hardware
ACU3EG Core Board
2.1.1 Introduction
2.1.2 ZYNQ Chip
2.1.3 DDR4 DRAM
2.1.4 QSPI Flash
2.1.5 eMMC Flash
Simulation
Download
Online Debugging
4.9.1 Add ILA IP Core
4.9.2 MARK DEBUG
Experimental Summary
Chapter 5 PLL Experiment in Vivado
Experimental Principles
Create a Vivado Project
Simulation
On-Board Verification
Chapter 6 FPGA On-Chip RAM Read and Write Test Experiment
Experimental Principles
Create a Vivado Project
RAM Port Definition
13.2.1 Create a PL-side DDR4 test project and configure DDR4 IP
13.2.2 Add other test codes
Download and debug
Experimental summary
Chapter 14 HDMI output experiment
Hardware introduction
Program design
Add XDC constraint file
Download and debug
Experimental summary
Chapter 15 HDMI character display experiment
Experimental principle
Program design
Experimental phenomenon
Chapter 16
17.1.1 AD7606 Timing Sequence
17.1.2 AD7606 Timing Sequence
Hardware Introduction
20.1.1 Parameter Description of AN9767 Module
20.1.2 Principle Block Diagram of AN9767 Module
20.1.3 Introduction to AD9767 Chip
20.1.4 Current-to-Voltage Conversion and Amplification
20.1.5 Current-to-Voltage Conversion and Amplification
Program Design
20.2.1 Generate ROM Initialization File
20.2.2 Dual Channel Sine Wave Generator Program
Experimental Phenomenon
The Zynq UltraScale+ MPSoC series is Xilinx's second-generation Zynq platform. Its highlight is that the FPGA contains a complete ARM processing subsystem (PS), which includes a quad-core or dual-core Cortex-A53 processor plus a dual-core Cortex-R5 processor. The whole device is built around the processor. The processor subsystem integrates a memory controller and a large number of peripherals, making the processor cores completely independent of the programmable logic (PL) in Zynq. In other words, if the PL is not used for the time being, the ARM processor subsystem can still work on its own. This is the fundamental difference from earlier FPGAs: Zynq is processor-centric.
Zynq consists of two functional blocks, the PS part and the PL part. Put simply, these are the ARM SoC part and the FPGA part. The PS integrates the APU (ARM Cortex™-A53 processors), the RPU (Cortex-R5 processors), the AMBA® interconnect, the on-chip memory (OCM), the external memory interface (DDR controller) and the peripherals (IOU). The IOU peripherals mainly include USB, Ethernet, SD/eMMC, I2C, CAN, UART and GPIO interfaces, as well as high-speed interfaces such as PCIe, SATA and DisplayPort.
PS: Processing System, the ARM SoC part, which is independent of the FPGA.
PL: Programmable Logic, the FPGA part.
ZYNQ is a product that tightly combines high-performance ARM Cortex-A53 series processors with a high-performance FPGA in a single chip.
To achieve high-speed communication and data exchange between the ARM processor and the FPGA, and to exploit the performance advantages of both, efficient interconnect paths must be designed between the on-chip high-performance processor and the FPGA. The data path between the PL and the PS is therefore a top priority of ZYNQ chip design and one of the keys to the success or failure of a product design. This section introduces the connection between the PS and the PL so that users understand the technology behind it.
In practice, a specific design often requires little work on the connection itself. After we add an IP core, the system automatically uses the AXI interface to connect it to the processor, and we only need to make a few additions.
AXI stands for Advanced eXtensible Interface, an interface protocol that Xilinx introduced starting with its 6 series FPGAs. It is still used in ZYNQ, where the version is AXI4, so we often see "AXI4.0", and the devices inside Zynq all have AXI interfaces. AXI was actually proposed by ARM. Its first version, AXI3, is a new on-chip bus that replaces the previous AHB and APB buses. The second version, AXI4, was included in AMBA 4.0, which was released in 2010.
The AXI protocol mainly describes how data is transferred between a master device and a slave device. The master and the slave communicate through a handshake. The slave asserts the READY signal when it is ready to receive data. The master asserts and holds the VALID signal to indicate that its data is valid. Data is transferred only when both VALID and READY are asserted. While both signals remain asserted, the master continues with the next data item. The master can deassert VALID, or the slave can deassert READY, to end the transfer. The AXI handshake is shown in the figure: at T2 the slave's READY signal becomes valid, and at T3 the master's VALID signal becomes valid and data transfer begins.
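The VALID/READY handshake described above can be illustrated with a small cycle-by-cycle model. This is only a sketch in Python, not part of the tutorial's design files: the `Cycle` record, the `transferred_beats` function and the sample trace are assumptions made up for illustration. It shows that a data item moves only on clock edges where both signals are high.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cycle:
    """Signal values sampled at one rising clock edge (hypothetical record)."""
    valid: bool          # driven by the master
    ready: bool          # driven by the slave
    data: Optional[int]  # payload presented by the master while VALID is high


def transferred_beats(cycles: List[Cycle]) -> List[int]:
    """Return the payloads accepted by the slave.

    A beat is transferred only on clock edges where VALID and READY
    are both high; in every other cycle nothing moves.
    """
    return [c.data for c in cycles if c.valid and c.ready]


# Hypothetical trace: READY rises at T2, VALID rises at T3, as in the text.
trace = [
    Cycle(valid=False, ready=False, data=None),  # T1: idle
    Cycle(valid=False, ready=True,  data=None),  # T2: slave ready, no data yet
    Cycle(valid=True,  ready=True,  data=0xA0),  # T3: first beat transferred
    Cycle(valid=True,  ready=True,  data=0xA1),  # T4: both still high, next beat
    Cycle(valid=True,  ready=False, data=0xA2),  # T5: slave stalls, master holds data
    Cycle(valid=True,  ready=True,  data=0xA2),  # T6: stalled beat is accepted
    Cycle(valid=False, ready=True,  data=None),  # T7: master ends the transfer
]

beats = transferred_beats(trace)
print([hex(b) for b in beats])
```

In the T5 cycle the master keeps VALID and the data stable while the slave is not ready, so the value is not lost; it is transferred one cycle later.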
In Zynq, AXI4-Lite, AXI4 and AXI4-Stream are supported. Table 5-1 lists the characteristics of these three AXI interfaces:
Interface      Characteristics                             Application
AXI4           Address/burst data transfer                 Bulk transfer to addresses
AXI4-Stream    Data-only transfer, burst data transfer     Data stream and media stream transfer
AXI4-Lite:
It is lightweight and simple in structure, suitable for small amounts of data and simple control scenarios. It does not support burst transfers: each read or write moves only one word (32 bits). It is mainly used to access low-speed peripherals and for peripheral control.
AXI4:
The interface is similar to AXI4-Lite, but adds a burst transfer function that can read or write a contiguous range of addresses in a single operation. In other words, it supports burst reads and writes of data.
Both of the above interfaces are memory mapped: ARM maps the user-defined IP to a certain address range and accesses it there. Reading and writing the IP is just like reading and writing the processor's own on-chip RAM, so programming is convenient and the development difficulty is relatively low. The cost is that more resources are occupied, because additional signal lines are required: read address, write address, read data, write data and write response.
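The difference between single-word (AXI4-Lite style) and burst (AXI4 style) memory-mapped access can be sketched with a toy model. The class below is an illustration only, not a Xilinx API; the class name, its methods and the base address `0x8000_0000` are assumptions chosen for the example. It counts how many times an address has to be sent to move the same four words.

```python
class MemoryMappedSlave:
    """Toy model of a PL peripheral mapped into the processor's address space."""

    WORD_BYTES = 4  # one 32-bit word per AXI4-Lite transaction

    def __init__(self, base: int, num_words: int) -> None:
        self.base = base
        self.words = [0] * num_words
        self.address_phases = 0  # how many times an address had to be sent

    def _index(self, addr: int) -> int:
        """Translate a byte address into a word index, checking range and alignment."""
        offset = addr - self.base
        if offset < 0 or offset % self.WORD_BYTES != 0:
            raise ValueError(f"unaligned or out-of-range address 0x{addr:08X}")
        index = offset // self.WORD_BYTES
        if index >= len(self.words):
            raise ValueError(f"address 0x{addr:08X} is outside the slave")
        return index

    def write_word(self, addr: int, value: int) -> None:
        """AXI4-Lite style: one address, one 32-bit word."""
        self.words[self._index(addr)] = value & 0xFFFFFFFF
        self.address_phases += 1

    def read_word(self, addr: int) -> int:
        """AXI4-Lite style read of a single 32-bit word."""
        index = self._index(addr)
        self.address_phases += 1
        return self.words[index]

    def burst_write(self, addr: int, values: list) -> None:
        """AXI4 style: one start address, then consecutive words."""
        start = self._index(addr)
        if start + len(values) > len(self.words):
            raise ValueError("burst runs past the end of the slave")
        self.words[start:start + len(values)] = [v & 0xFFFFFFFF for v in values]
        self.address_phases += 1


BASE = 0x8000_0000  # hypothetical base address chosen for this example

lite = MemoryMappedSlave(BASE, num_words=8)
for i, v in enumerate([10, 20, 30, 40]):
    lite.write_word(BASE + 4 * i, v)      # four separate transactions

full = MemoryMappedSlave(BASE, num_words=8)
full.burst_write(BASE, [10, 20, 30, 40])  # one burst transaction

print(lite.address_phases, full.address_phases)
```

Both slaves end up holding the same data; the burst simply needs one address phase instead of four, which is why AXI4 is preferred for moving blocks of data.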
AXI4-Stream:
This is a continuous stream interface that does not need address lines (much like a FIFO, you simply keep reading or writing). ARM cannot control this type of IP through the memory mapping described above (a FIFO has no concept of address at all), so a conversion device such as the AXI-DMA module is required to convert between memory-mapped and streaming interfaces. AXI4-Stream suits many applications: video stream processing, communication protocol conversion, digital signal processing, wireless communication, and so on. In essence it is a data path built for data streams, carrying a continuous flow of data from a source (such as ARM memory, DMA, or a wireless receiver front end) to a sink (such as an HDMI display, high-speed AD, or audio output). This interface is suitable for real-time signal processing.
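The memory-mapped-to-stream conversion that a DMA performs can be pictured with a short sketch. It is a simplified Python model under stated assumptions, not the behaviour of the actual AXI-DMA IP: the functions `mm2s` and `sink`, the dictionary used as memory and the addresses are all hypothetical. The `last` flag plays the role of the TLAST signal of AXI4-Stream, which marks the final beat of a packet.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Beat = Tuple[int, bool]  # (data, last): one stream beat, no address attached


def mm2s(memory: Dict[int, int], start_addr: int, num_words: int) -> Iterator[Beat]:
    """Read consecutive 32-bit words from addressed memory and emit them as a stream."""
    for i in range(num_words):
        addr = start_addr + 4 * i            # addresses exist only on the memory side
        yield memory[addr], i == num_words - 1


def sink(stream: Iterable[Beat]) -> List[int]:
    """Stream consumer: takes data in arrival order until the last beat."""
    received = []
    for data, last in stream:
        received.append(data)
        if last:
            break
    return received


# Hypothetical buffer in PS memory: four words starting at 0x1000.
memory = {0x1000: 11, 0x1004: 22, 0x1008: 33, 0x100C: 44}

frame = sink(mm2s(memory, start_addr=0x1000, num_words=4))
print(frame)
```

The consumer never sees an address; it only sees ordered data and an end-of-packet marker, which is the property that makes the interface FIFO-like.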
Each channel is an independent AXI handshake protocol. The following two figures show the read and write models respectively:
The AXI bus protocol is implemented in hardware inside the ZYNQ chip, including 12 physical interfaces, namely
S_AXI_HP{0:3}_FPD interface: a high-performance/bandwidth AXI4 standard interface. There are four of them in total. The PL module connects to them as the master. Mainly used for the PL to access memory on the PS (DDR and the FPD Main Switch).
S_AXI_LPD interface: a high-performance port that connects the PL to the LPD. It gives low-latency access to OCM and TCM, and access to the PS-side DDR.
S_AXI_HPC{0,1}_FPD interface: connects the PL to the FPD and can be connected to the CCI to access the L1 and L2 caches. Because the access goes through the CCI, accessing the DDR controller has a larger delay.
M_AXI_HPM{0,1}_FPD interface: a high-performance bus with the PS as the master, connecting the FPD to the PL. It can be used by the CPU, DMA, PCIe, etc. to access the PL.
M_AXI_HPM0_LPD interface: a low-latency interface bus with the PS as the master, connecting the LPD to the PL. It can directly access BRAM, DDR, etc. on the PL side, and is also often used to configure registers on the PL side.
Only M_AXI_HPM{0,1}_FPD and M_AXI_HPM0_LPD are master ports, i.e. host interfaces; the rest are slave ports. A host interface has the authority to initiate reads and writes. ARM can use the host interfaces to actively access the PL logic, which in effect maps the PL to a certain address range: reading and writing PL registers is like reading and writing the processor's own memory. The slave interfaces are passive and accept reads and writes initiated by the PL. In PS and PL interconnection applications, the most commonly used interfaces are S_AXI_HP{0:3}_FPD and M_AXI_HPM{0,1}_FPD.
The ARM on the PS side supports the AXI interface directly in hardware, while the PL has to implement the corresponding AXI protocol in logic. Xilinx provides ready-made IPs in the Vivado development environment, such as AXI-DMA, AXI-GPIO, AXI-Datamover and AXI-Stream, which all implement the corresponding interfaces. To use them, simply add them from the Vivado IP list to implement the corresponding functions.
The following is an introduction to the functions of several commonly used AXI interface IPs:
AXI-DMA: converts between PS memory and the PL high-speed transfer channel, AXI-HP<---->AXI-Stream.
AXI-FIFO-MM2S: converts between PS memory and the PL general transfer channel, AXI-HPM<----->AXI-Stream.
AXI-VDMA: also converts between PS memory and the PL high-speed transfer channel, AXI-HP<---->AXI-Stream, but is designed specifically for two-dimensional data such as video and images.
AXI-CDMA: implemented in the PL, it moves data from one location in memory to another without CPU intervention.
We will give examples of how to use these IPs in the following chapters. Sometimes users need to develop their own IP to communicate with the PS; such an IP can be generated with the wizard. User-defined IP cores can have AXI4-Lite, AXI4, AXI-Stream, PLB and FSL interfaces. The latter two are not used because ARM does not support them.
With the official IPs above and custom IPs generated by the wizard, users do not need to know much about AXI timing (unless they actually run into problems), because Xilinx has encapsulated all the details related to AXI timing and users only need to focus on their own logic.
Strictly speaking, AXI is a point-to-point master-slave interface protocol. When multiple peripherals need to exchange data with each other, we need to add an AXI Interconnect module, that is, the AXI interconnect matrix. It provides a switching mechanism that connects one or more AXI master devices to one or more AXI slave devices (somewhat like the switching matrix in a network switch).
The AXI Interconnect IP core supports up to 16 master devices and 16 slave devices. If more interfaces are needed, more IP cores can be added.
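One way to picture what the interconnect does in the one-to-many case is address decoding: each slave owns an address window, and the interconnect routes a master's transaction to the slave whose window contains the address. The sketch below is a conceptual Python model, not the implementation of the Xilinx IP; the `SlaveWindow` type, the `decode` function, the slave names and the address map are all assumptions for illustration.

```python
from typing import List, NamedTuple, Optional


class SlaveWindow(NamedTuple):
    """Address window owned by one slave (hypothetical example type)."""
    name: str
    base: int
    size: int


def decode(windows: List[SlaveWindow], addr: int) -> Optional[str]:
    """Return the name of the slave whose window contains addr, or None."""
    for w in windows:
        if w.base <= addr < w.base + w.size:
            return w.name
    return None  # no slave claims this address


# Hypothetical address map, in the style of an address editor listing.
address_map = [
    SlaveWindow("axi_gpio_0", base=0x8000_0000, size=0x1_0000),
    SlaveWindow("axi_dma_0",  base=0x8001_0000, size=0x1_0000),
]

print(decode(address_map, 0x8000_0008))
print(decode(address_map, 0x8001_0030))
print(decode(address_map, 0x9000_0000))
```

In the many-to-one case the interconnect additionally has to arbitrate between masters that target the same slave, which this sketch does not model.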
Many-to-one situation
One-to-many situation
The AXI interface devices inside ZYNQ are interconnected through an interconnect matrix, which ensures the transmission of data.
In Vivado, Xilinx provides the IP core axi_interconnect to implement this interconnect matrix; we only need to instantiate it.
AXI Interconnect IP
Since Zynq integrates a CPU and an FPGA, developers need to design not only the ARM operating system applications and device drivers but also the hardware logic of the FPGA part. Development requires an understanding of the Linux operating system and the system architecture, and also requires building the hardware design platform between the FPGA and the ARM system. ZYNQ development therefore calls for collaborative design by software and hardware engineers. This is the so-called "software and hardware co-design" in ZYNQ development.
Designing and developing the hardware and software of a ZYNQ system requires the following development environments and debugging tools:
Xilinx Vivado.
The Vivado design suite covers the design and development of the FPGA part: pin and timing constraints, compilation and simulation, and the design flow from RTL to bitstream. Vivado is not a simple upgrade of the ISE design suite but a brand-new design suite. It replaces all the important tools of the ISE design suite, such as Project Navigator and Xilinx Synthesis Technology.
Xilinx SDK (Software Development Kit).
The SDK is Xilinx's software development kit. Based on the Vivado hardware system, it automatically configures some important parameters, including tool and library paths, compiler options, JTAG and flash settings, debugger connections, and bare-metal board support packages (BSP). The SDK also provides drivers for all supported Xilinx IP hard cores, and supports co-debugging of IP hard cores (on the FPGA) and processor software. We can use the high-level C or C++ languages to develop and debug the ARM and FPGA system and to test whether the hardware system works properly. The SDK software is also bundled with the Vivado software.
The development process is as follows:
1) Create a new project in Vivado and add an embedded source file.
2) Add and configure the basic PS and PL peripherals in Vivado, or add custom peripherals if necessary.
3) Generate the top-level HDL file in Vivado and add constraint files. Then compile and generate a bitstream file (*.bit).
4) Export the hardware information to the SDK software development environment, where you can write some debugging software to verify the hardware and software, and use it together with the bitstream file to debug the ZYNQ system on its own.
6) ... boot.elf and bootloader image in the VMware virtual machine.
7) Generate a BOOT.bin file in SDK.
8) Generate the Ubuntu kernel image file zImage and the Ubuntu root file system in VMware.
9) Put the BOOT, kernel, device tree, and root file system files onto the SD card, power on the development board, and Linux
The above is a typical ZYNQ development process, but ZYNQ can also be used as an ARM alone, so there is no need to close
ZYNQ can also use only the PL part, but the PL configuration still needs to be completed by PS, which means that it is impossible to solidify the
Learning ZYNQ requires higher standards than learning traditional development tools such as FPGA, MCU, ARM, etc., and learning ZYNQ well is not something that
- Principles of computer organization
- C, C++
- C language
The 2020 version of the development board (model: AXU3EG), based on the XILINX Zynq UltraScale+ MPSoCs development platform, has been officially released by Xinyi Electronic Technology (Shanghai) Co., Ltd. We have compiled this user manual to help you get to know the platform quickly.
This MPSoCs development platform uses a core board plus expansion board design, which makes it convenient for users to reuse the core board in their own secondary development. The core board uses the XILINX Zynq UltraScale+ EG chip ZU3EG, which uses Processing System (PS) + Programmable Logic (PL) technology to integrate quad-core ARM Cortex-A53 processors and FPGA programmable logic on a single chip. The PS side of the core board has 4 high-speed DDR4 SDRAM chips with a total of 4GB, 1 8GB eMMC storage chip and 1 256Mb QSPI FLASH chip; the PL side of the core board has a 1GB DDR4 SDRAM chip. The baseboard provides a wealth of peripheral interfaces, such as 1 M.2 interface, 1 DP interface, 4 USB3.0 interfaces, 2 Gigabit Ethernet interfaces, 2 UART serial ports, 1 SD card interface, 2 40-pin expansion ports, 2 CAN bus interfaces, 2 RS485 interfaces and 1 MIPI interface. It is a "professional-level" ZYNQ development platform that meets users' requirements for high-speed data exchange, data storage, video transmission and processing, deep learning, artificial intelligence, and industrial control, and it supports high-speed data transmission and exchange, early-stage verification of data processing, and later-stage application.
We believe this product is well suited to students, engineers and others engaged in MPSoCs development.
Here is a brief functional introduction to the AXU3EG MPSoCs development platform.
The overall structure of the development board follows our usual core board + expansion board model.
The core board is mainly composed of the minimum system of ZU3EG + 5 DDR4 + eMMC + 1 QSPI FLASH. ZU3EG is a chip from Xilinx's Zynq UltraScale+ MPSoCs EG series, model XCZU3EG-1SFVC784I. The ZU3EG chip can be divided into the processor system (PS) and the programmable logic (PL). Four DDR4 chips are connected to the PS and one DDR4 chip is connected to the PL of the ZU3EG chip. Each DDR4 chip has a capacity of 1GB, so the ARM system and the FPGA system can process and store data independently. The 8GB eMMC FLASH storage chip and the 256Mb QSPI FLASH on the PS side are used for static storage of the MPSoCs' operating system, file system, and user data.
The baseboard extends the core board with a variety of peripheral interfaces, including 1 M.2 interface, 1 DP interface, 4 USB3.0 interfaces, 2 Gigabit Ethernet interfaces, 2 UART serial ports, 1 SD card interface, 2 40-pin expansion ports, 2 CAN bus interfaces, 2 RS485 interfaces, 1 MIPI interface, and some keys and LEDs.
The following figure is a schematic diagram of the structure of the entire development system:
(Figure: block diagram of the development system. Recoverable labels: core board with XILINX UltraScale+ MPSoC XCZU3EG/XCZU4EV, DDR4, QSPI FLASH, eMMC FLASH, 33.333MHz and 200MHz clocks; USB3.0 interfaces and ports (USB3320, GL3523); DP output; two USB UARTs (CP2102); two Ethernet ports (KSZ9031R); CAN x2 (SN65HVD232); SD card (TXS0261); MIPI; JTAG; LED & KEY; Si5332; connectors.)
This diagram shows the interfaces and functions that the development platform provides.
The core board consists of ZU3EG + 4GB DDR4 (PS) + 1GB DDR4 (PL) + 8GB eMMC FLASH + 256Mb QSPI FLASH.
There are also two crystal oscillators to provide clocks, a single-ended 33.3333MHz crystal oscillator for the PS system and a differential 200MHz crystal oscillator
- M.2 interface
  1 PCIe x1 standard M.2 interface, used to connect an M.2 SSD, with a communication speed of up to 6Gbps.
- DP output interface
  1 standard DisplayPort output interface for video display. Supports up to 4K@30Hz or 1080P@60Hz output.
- USB3.0 interface
  4 USB3.0 HOST ports with Type-A connectors. Used to connect external USB peripherals, such as a mouse.
- Gigabit Ethernet interface
  2 10/100/1000M Ethernet RJ45 interfaces, 1 each for the PS and the PL. Used to exchange Ethernet data with computers or other network devices.
- USB UART interface
  2 UART-to-USB interfaces, 1 for the PS and 1 for the PL. Used to communicate with a computer for convenient debugging. The serial chip is the Silicon Labs CP2102GM USB-UART chip, and the USB connector is a Mini USB connector.
- SD card slot
  1 Micro SD card slot, used to store operating system images and file systems.
- 40-pin expansion ports
  2 expansion ports with 40 pins at 2.54mm pitch, which can connect to various Heijin modules (binocular camera, TFT LCD screen, high-speed AD module, etc.). The expansion port includes 1 5V power supply, 2 3.3V power supplies, 3 grounds, and 34 IO ports.
- CAN communication interface
  2 CAN bus interfaces using TI's SN65HVD232 chip. The connector is a 4-pin green terminal block.
- 485 communication interface
  2 RS485 communication interfaces using MAXIM's MAX3485 chip. The connector is a 6-pin green terminal block.
- MIPI interface
  A 2-lane MIPI camera input interface, used to connect a MIPI camera module (AN5641).
- JTAG debug port
  1 10-pin 2.54mm standard JTAG port for downloading and debugging FPGA programs. Users can use XILINX
- Temperature sensor
  An LM75 temperature sensor chip, used to detect the temperature of the board's surrounding environment.
- EEPROM
- LED lights
  5 LEDs, 2 on the core board and 3 on the baseboard. The core board has 1 power indicator and 1 DONE configuration indicator. The baseboard has 1 power indicator and 2 user indicators.
- Button
2.1.1 Introduction
The ZYNQ chip on the ACU3EG core board (ACU3EG is the core board model, the same below) is from XILINX's Zynq UltraScale+ MPSoCs EG series.
This core board uses 5 Micron DDR4 chips, model MT40A512M16GE. Four are mounted on the PS side, forming a 64-bit data bus with 4GB capacity. One is mounted on the PL side, with a 16-bit data bus and 1GB capacity. The DDR4 SDRAM on the PS side can run at up to 1200MHz (data rate 2400Mbps), and the DDR4 SDRAM on the PL side at up to 1066MHz (data rate 2133Mbps). In addition, a 256Mbit QSPI FLASH and an 8GB eMMC FLASH chip are integrated on the core board to store the boot configuration and system files. To connect to the baseboard, the 4 board-to-board connectors of the core board bring out the PS-side USB2.0 interface, Gigabit Ethernet interface, SD card interface and the remaining MIO ports; 4 pairs of PS MGT high-speed transceiver interfaces; and almost all IO ports on the PL side (HP I/O: 96, HD I/O: 84). The traces between the XCZU3EG chip and the connectors are length-matched and routed differentially, and the core board measures only 80*60 (mm), which makes it very suitable for secondary development.
The development board uses a Zynq UltraScale+ MPSoCs EG series chip from Xilinx, model XCZU3EG-1SFVC784I. The PS system of the ZU3EG chip integrates four ARM Cortex™-A53 processors running at up to 1.2GHz with two levels of cache, and two Cortex-R5 processors running at up to 500MHz.
The ZU3EG chip supports 32-bit or 64-bit DDR4, LPDDR4, DDR3, DDR3L and LPDDR3 memory chips. The PS side has a rich set of high-speed interfaces such as PCIe Gen2, USB3.0, SATA 3.1 and DisplayPort; it also supports USB2.0, Gigabit Ethernet, SD/SDIO, I2C, CAN, UART, GPIO and other interfaces. The PL side contains a rich set of programmable logic cells, DSPs and internal RAM.
The overall block diagram of the ZU3EG chip is shown in Figure 2-2-1.
- ARM quad-core Cortex™-A53 processor, up to 1.2GHz, 32KB level 1 instruction and data cache per CPU.
- ARM dual-core Cortex-R5 processor, up to 500MHz, 32KB level 1 instruction and data cache per CPU.
- External storage interface, supports 32/64-bit DDR4/3/3L and LPDDR4/3 interfaces.
- Static storage interface.
- High-speed interfaces: PCIe Gen2 x4, 2x USB3.0, SATA 3.1, DisplayPort, 4x tri-mode Gigabit Ethernet.
- Common connection interfaces: 2xUSB2.0, 2x SD/SDIO, 2x UART, 2x CAN 2.0B, 2x I2C, 2x SPI, 4x 32b GPIO.
- System monitoring: 10-bit 1Mbps AD sampling for temperature and voltage detection.
- Flip-flops: 141K;
The speed grade of the XCZU3EG-1SFVC784I chip is -1, industrial grade, and the package is SFVC784.
The ACU3EG core board is equipped with five Micron DDR4 chips of 1GB each, model MT40A512M16GE-083E. Four DDR4 chips are mounted on the PS side, forming a 64-bit data bus with 4GB capacity. One is mounted on the PL side, with a 16-bit data bus and 1GB capacity. The DDR4 SDRAM on the PS side can run at up to 1200MHz (data rate 2400Mbps); the 4 DDR4 chips are connected directly to the memory interface of PS BANK504. The DDR4 SDRAM on the PL side can run at up to 1066MHz (data rate 2133Mbps); this DDR4 chip is connected to BANK64 of the FPGA.
The hardware design of DDR4 requires strict attention to signal integrity. In the circuit and PCB design we have fully considered matching/termination resistors, trace impedance control and trace length matching to ensure that the DDR4 works stably at high speed.
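The capacity and bus-width figures above follow from the chip organization: the "512M16" in the part number means 512M locations of 16 bits each. The short calculation below reproduces the numbers; it is an illustrative sketch whose function names are made up for this example, and the bandwidth values are theoretical peaks (two transfers per clock, ignoring refresh and protocol overhead), not measured results.

```python
MEG = 1024 * 1024
GIB = 1024 * MEG


def chip_capacity_bytes(depth_words: int, width_bits: int) -> int:
    """Capacity of one DRAM chip from its organization (depth x width)."""
    return depth_words * width_bits // 8


def peak_bandwidth_mb_per_s(clock_mhz: int, bus_bits: int) -> int:
    """Theoretical peak: DDR moves two transfers per clock on every data pin."""
    return clock_mhz * 2 * bus_bits // 8


# MT40A512M16: 512M x 16 bits per chip.
chip = chip_capacity_bytes(512 * MEG, 16)

ps_capacity = 4 * chip        # four chips side by side on the PS
ps_bus_bits = 4 * 16          # 4 x 16-bit chips form a 64-bit bus
pl_capacity = 1 * chip        # one chip on the PL
pl_bus_bits = 16

print(chip // GIB, ps_capacity // GIB, ps_bus_bits)   # GiB per chip, GiB on PS, bus width
print(peak_bandwidth_mb_per_s(1200, ps_bus_bits))     # PS side, MB/s
print(peak_bandwidth_mb_per_s(1066, pl_bus_bits))     # PL side, MB/s
```

With a 1200MHz clock the data rate per pin is 2400Mbps, so the 64-bit PS interface peaks at 19200 MB/s in this idealized model, while the 16-bit PL interface at 1066MHz peaks at 4264 MB/s.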
The hardware connection method of DDR4 on the PS side is shown in Figure 2-3-1:
[Figure 2-3-1: U1 (ZYNQ UltraScale+, BANK504) is connected to the four DDR4 chips U12, U14, U15 and U16 (MT40A512M16GE-083E) through a 64-bit data bus and the clock, address and control lines.]
The hardware connection method of DDR4 DRAM on the PL side is shown in Figure 2-3-2:
[Figure 2-3-2: U1 is connected to the DDR4 chip U17 through a 16-bit data bus.]
The ACU3EG core board is equipped with a 256Mbit Quad-SPI FLASH chip to form an 8-bit data bus.
The FLASH model is MT25QU256ABA1EW9, which uses the 1.8V CMOS voltage standard.
Because QSPI FLASH is non-volatile, it can serve as the system boot device and store the system boot image, which mainly includes
FPGA bit files, ARM application code and other user data files.
QSPI FLASH is connected to the GPIO (MIO) pins of BANK500 in the PS part of the ZYNQ chip. In the system design, these PS-side pins
must be configured for the QSPI FLASH interface function. Figure 4-1 shows the QSPI Flash part of the schematic diagram.
[Figure 4-1: connection between U1 and the QSPI FLASH chip U5 (signals include QSPI0_CS).]
The ACU3EG core board is equipped with a large-capacity 8GB eMMC FLASH chip, model MTFC8GAKAJCN-4M. It supports the HS-MMC interface of the JEDEC e-MMC V5.0 standard at 1.8V or 3.3V levels. The data width of the connection between eMMC FLASH and ZYNQ is 8 bits. Because eMMC FLASH is large and non-volatile, it can be used in the ZYNQ system as a mass storage device, for example for ARM applications, system files and other user data files. The specific model and related parameters of eMMC FLASH are shown in Table 2-5-1.
eMMC FLASH is connected to the GPIO (MIO) pins of BANK500 in the PS part of ZYNQ UltraScale+. In the system design, these PS-side pins must be configured as the EMMC interface. Figure 2-5-1 shows the eMMC Flash part of the schematic diagram.
[Figure 2-5-1: U1 (ZYNQ UltraScale+, BANK500) is connected to the eMMC chip U19 (MTFC8GAKAJCN-4M) through MMC_CCLK, MMC_CMD and MMC_DAT0~MMC_DAT7.]
The core board provides reference clocks for the PS system and the PL logic, as well as an RTC real-time clock, so that the PS system and the PL logic can work independently.
[Figure: core board clock sources. The passive crystal Y2 (32.768KHz) connects to BANK503 of U1; X1 supplies the 33.33MHz single-ended clock PS_CLK; G1 supplies the 200MHz differential clock PL_CLK0_P/PL_CLK0_N to BANK64.]
The passive crystal Y2 on the core board provides a 32.768KHz real-time clock source for the PS system.
The crystal is connected to the PS_PADI_503 and PS_PADO_503 pins of the chip's BANK503. The schematic diagram is shown in Figure 2-6-2:
PS_PADI_503 N17
PS_PADO_503 N18
The X1 crystal oscillator on the core board provides a 33.333MHz clock input for the PS part. The clock input is connected to the PS_CLK pin of the ZYNQ chip:
PS_CLK R16
A differential 200MHz PL system clock source is provided on the board as the reference clock of the DDR4 controller.
It is connected to the global clock (MRCC) pins of PL BANK64 and can drive both the DDR4 controller and
user logic. The schematic diagram of the clock source is shown in Figure 2-6-4:
PL_CLK0_P AE5
PL_CLK0_N AF5
The ACU3EG core board has a red power indicator light (PWR) and a configuration LED light (DONE).
When the core board is powered, the power indicator lights up; after the FPGA is configured, the configuration LED lights up.
[Figure: core board LEDs. U1 (BANK503), the 3.3V and 1.8V rails, LEDs D21 and D2, and the DONE indicator light.]
The ACU3EG core board is powered by +12V, supplied through the connection to the baseboard.
The power chip TPS6508641 generates all the supplies required by the XCZU3EG chip. For the TPS6508641 power design, please refer to the power chip's documentation.
In addition, the VCCIO power supplies of BANK65 and BANK66 of the XCZU3EG chip are provided by the baseboard, so that users can change them easily.
Top View
The core board has four high-speed expansion ports, using four 120-pin board-to-board connectors (J29~J32) to mate with the baseboard.
The connector used is Panasonic's AXK5A2137YG, and the corresponding connector model on the baseboard is AXK6A2337YG.
J29 connects to the IO of BANK65 and BANK66; J30 connects to the IO of BANK25, BANK26 and BANK66 and to the BANK505 MGT transceiver signals;
J31 connects to the IO of BANK24 and BANK44; J32 connects to the MIO of the PS, VCCO_65, VCCO_66 and the +12V power supply.
The IO level standard of BANK43~46 is 3.3V. The level standard of BANK65 and BANK66 is determined by the VCCO_65 and VCCO_66 supplies
from the baseboard, which must not exceed +1.8V. The voltage level of MIO is also 1.8V.
J29 Pin Signal Name Pin Number J29 Pin Signal Name Pin Number
1 B65_L2_N V9 2 B65_L22_P K8
3 B65_L2_P U9 4 B65_L22_N K7
5 GND - 6 GND -
7 B65_L4_N T8 8 B65_L20_P J6
9 B65_L4_P R8 10 B65_L20_N H6
11 GND - 12 GND -
13 B65_L1_N Y8 14 B65_L6_N T6
15 B65_L1_P W8 16 B65_L6_P R6
17 GND - 18 GND -
19 B65_L7_P L1 20 B65_L17_P N9
21 B65_L7_N K1 22 B65_L17_N N8
23 GND - 24 GND -
25 B65_L15_P N7 26 B65_L9_P K2
27 B65_L15_N N6 28 B65_L9_N J2
29 GND - 30 GND -
31 B65_L16_P P7 32 B65_L3_N V8
33 B65_L16_N P6 34 B65_L3_P U8
35 GND - 36 GND -
37 B65_L14_P M6 38 B65_L19_P J5
39 B65_L14_N L5 40 B65_L19_N J4
41 GND - 42 GND -
43 B65_L5_N T7 44 B65_L18_P M8
45 B65_L5_P R7 46 B65_L18_N L8
47 GND - 48 GND -
49 B65_L11_N K3 50 B65_L8_P J1
51 B65_L11_P K4 52 B65_L8_N H1
53 GND - 54 GND -
55 B65_L10_N H3 56 B65_L24_N H8
57 B65_L10_P H4 58 B65_L24_P H9
59 GND - 60 GND -
61 B66_L3_P F2 62 B65_L12_P L3
63 B66_L3_N E2 64 B65_L12_N L2
65 GND - 66 GND -
67 B66_L1_P G1 68 B65_L13_N L6
69 B66_L1_N F1 70 B65_L13_P L7
71 GND - 72 GND -
73 B66_L6_P G5 74 B65_L21_P J7
75 B66_L6_N F5 76 B65_L21_N H7
77 GND - 78 GND -
79 B66_L16_P G8 80 B65_L23_P K9
81 B66_L16_N F7 82 B65_L23_N J9
83 GND - 84 GND -
85 B66_L15_P G6 86 B66_L5_N E3
87 B66_L15_N F6 88 B66_L5_P
89 GND - 90 GND -
91 B66_L4_P G3 92 B66_L2_P E1
93 B66_L4_N F3 94 B66_L2_N D1
95 GND - 96 GND -
97 B66_L11_P D4 98 B66_L20_P C6
99 B66_L11_N C4 100 B66_L20_N B6
101 GND - 102 GND -
J30 Pin Signal Name Pin Number J30 Pin Signal Name Pin Number
13 B66_L19_N A5 14 B66_L21_N A6
15 B66_L19_P B5 16 B66_L21_P A7
17 GND - 18 GND -
19 B66_L24_P C9 20 B66_L17_P F8
21 B66_L24_N B9 22 B66_L17_N E8
23 GND - 24 GND -
J31 Pin Signal Name Pin Number J31 Pin Signal Name Pin Number
31 B24_L3_P AG13 32 - -
33 B24_L3_N AH13 34 - -
35 GND - 36 GND -
79 - - 80 PS_POR_B P16
81 - - 82 - -
83 GND - 84 GND -
85 - - 86 - -
87 - - 88 - -
89 GND - 90 GND -
91 224_CLK0_P Y6 92 224_CLK1_P V6
93 224_CLK0_N Y5 94 224_CLK1_N V5
95 GND - 96 GND -
97 224_RX3_P P2 98 224_TX3_P N4
99 224_RX3_N P1 100 224_TX3_N N3
101 GND - 102 GND -
J32 Pin Signal Name Pin Number J32 Pin Signal Name Pin Number
7 - - 8 PS_MIO58 F18
9 - - 10 PS_MIO53 D16
11 GND - 12 GND -
31 - - 32 PS_MIO28 K15
33 PS_MIO77 F20 34 PS_MIO59 E17
35 GND - 36 GND -
49 - - 50 PS_MIO65 A18
51 PS_MIO40 K18 52 PS_MIO66 G19
53 GND - 54 GND -
97 - - 98 - -
99 VCCO_65 - 100 VCCO_66 -
101 VCCO_65 - 102 VCCO_66 -
103 VCCO_65 - 104 VCCO_66 -
105 GND - 106 GND -
Expansion Board
2.2.1 Introduction
The functions of the expansion board include:
- 1 DP output interface
- 2 485 communication interfaces
- JTAG debugging port
- 1 temperature sensor
- 1 EEPROM
- 1 RTC
The AXU3EG development board is equipped with a PCIE x1 standard M.2 interface for connecting an M.2 SSD solid state drive,
with a communication speed of up to 6Gbps. The M.2 interface uses an M key slot and supports only PCI-E, not SATA,
so users must choose a PCIE-type SSD solid state drive.
The PCIE signals are directly connected to the BANK505 PS MGT transceiver of the ZU3EG. Both the TX signal and the RX signal are connected
as differential signals to LANE1 of the MGT. The PCIE clock is provided by the Si5332 chip at a frequency of 100MHz.
The schematic diagram of the circuit design is shown in Figure 3-2-1 below:
[Figure 3-2-1: BANK505 of U1 carries PCIE_TX_P/PCIE_TX_N (PCIE_TX_C_P/PCIE_TX_C_N on the connector side) and PCIE_RX_P; PCIE_RSTn_MIO37 passes through level conversion to M2_PCIE_RST_N.]
2.2.3 DisplayPort
The AXU3EG development board has a standard DisplayPort output interface for video image display.
It supports the VESA DisplayPort V1.2a output standard, up to 4K x 2K@30Fps output, and the Y-only, YCbCr444, YCbCr422, YCbCr420 and RGB video formats.
The DisplayPort data lanes are driven directly by the BANK505 PS MGT of the ZU3EG: the TX signals of MGT LANE2 and
LANE3 are connected to the DP connector as differential signals. The DisplayPort auxiliary channel is connected to MIO pins of the PS.
The schematic diagram of the DP output interface design is shown in Figure 3-3-1 below:
[Figure 3-3-1: the Si5332 supplies the 27MHz DP reference clock; the BANK505 MGT of U1 drives GT0_DP_TX_P/N and GT1_DP_TX_P (GT0_DP_TX_C_P/N and GT1_DP_TX_C_P on the connector side); the BANK501 signals DP_AUX_OUT, DP_AUX_IN, DP_OE and DP_HPD pass through level conversion and single-ended to differential conversion (U46, U37) to DPAUX_P/DPAUX_N.]
The AXU3EG expansion board has 4 USB3.0 interfaces, supporting HOST working mode with data transmission speeds of up to
5.0Gb/s. USB3.0 is connected via the PIPE3 interface, and USB2.0 is connected to the external USB3320C chip via the ULPI interface.
The USB connectors are flat USB interfaces (USB Type A), convenient for connecting different USB slave peripherals (such as a USB
mouse, keyboard or USB flash drive). The schematic diagram of the USB3.0 connection is shown in Figure 3-4-1:
[Figure 3-4-1: BANK502 of U1 connects to the USB PHY (USB3320C) through USB_CLK, USB_STP, USB_NXT, USB_DIR and USB_RESET_N; the PHY's DP/DM pair and the BANK505 MGT signals USB_SSTXP/USB_SSTXN and USB_SSRXP/USB_SSRXN (USB_TXP_UP/USB_TXN_UP, USB_RXP_UP/USB_RXN_UP) connect to the USB3.0 HUB; the Si5332 supplies a 26MHz clock.]
The AXU3EG expansion board has two Gigabit Ethernet interfaces, one connected to the PS end and the other connected to the PL end.
Each uses Micrel's KSZ9031RNX Ethernet PHY chip to provide users with network communication services.
The chip supports 10/100/1000 Mbps network transmission rates and communicates with the MAC layer of the ZU3EG system through the RGMII interface.
KSZ9031RNX supports MDI/MDI-X adaptation, speed adaptation and Master/Slave adaptation, and supports the MDIO bus.
When KSZ9031RNX is powered on, it samples the level of certain IO pins to determine its working mode. Table 3-5-1 lists these settings:
Configuration pin    Description                                          Setting
PHYAD[2:0]           PHY address for MDIO/MDC mode                        PHY address is 011
CLK125_EN            Enable 125MHz clock output                           Enabled
LED_MODE             LED light mode                                       Single LED light mode
MODE0~MODE3          Link auto-adaptation and full-duplex configuration   10/100/1000 auto-adaptation, compatible with full-duplex and half-duplex
When the network is connected at Gigabit speed, data is transferred between ZYNQ and the PHY chip KSZ9031RNX over the RGMII bus.
The transmission clock is 125MHz and data is sampled on both the rising and falling edges of the clock.
When the network is connected at 100M, data is transferred between ZYNQ and the PHY chip KSZ9031RNX over the RGMII bus
with a transmission clock of 25MHz. Data is sampled on the rising and falling edges of the clock.
Figure 3-5-1 is a schematic diagram of the ZYNQ Ethernet PHY chip connection:
[Figure 3-5-1: BANK502 of U1 connects through RGMII TX/RX to the GPHY U4 (KSZ9031RNX) and connector J6; BANK66 connects through RGMII TX/RX to the GPHY U22 (KSZ9031RNX) and connector J11.]
The AXU3EG expansion board is equipped with two UART to USB interfaces, one connected to the PS end and one connected to the PL end.
The conversion chip is the Silicon Labs CP2102GM USB-UART chip, and the USB connector is a MINI USB interface, which can be connected
to the USB port of a PC with a USB cable for serial data communication. The schematic diagram of the USB UART circuit design is shown below:
[Figure: USB UART circuit. The BANK501 signals PS_UART0_TX/PS_UART0_RX of U1 pass through level conversion (U10) to the UART-USB chip U9 (CP2102-GM) and connector J7; the BANK43 signals PL_UART_TX/PL_UART_RX connect to the UART-USB chip U11 (CP2102-GM) and connector J8 (Micro USB).]
The AXU3EG expansion board includes a Micro SD card interface, giving users access to SD card storage for the
ZU3EG chip BOOT program, the Linux operating system kernel, the file system and other user data files.
The SDIO signals are connected to the IO of PS BANK501 of the ZU3EG. Because the VCCIO of BANK501 is set to 1.8V while the data level of the SD card
is 3.3V, a TXS02612 level converter is placed between them. The schematic diagram of the ZU3EG PS and SD card connector is as follows:
[Figure: SD card connection. The BANK501 signals SD_D0~D3, SD_CLK, SD_CMD and SD_CD of U1 connect through the TXS02612 (U24) to D0~D3, CLK and CMD of the MICRO SD socket.]
The AXU3EG expansion board has two 2.54mm standard-pitch 40-pin expansion ports, J45 and J46, reserved for connecting Heijin (ALINX) modules.
Each expansion port has 40 signals: one 5V power supply, two 3.3V power supplies,
three grounds and 34 IO ports. The IO of the expansion ports is connected to the IO of ZYNQ chip BANK44, 24, 25 and 26.
J45 Pin Signal Name Pin Number J45 Pin Signal Name Pin Number
1 GND - 2 +5V -
39 +3.3V - 40 +3.3V -
J46 Pin Signal Name Pin Number J46 Pin Signal Name Pin Number
1 GND - 2 +5V -
39 +3.3V - 40 +3.3V -
There are 2 CAN communication interfaces on the AXU3EG expansion board, connected to the MIO of BANK501 on the PS system side.
The CAN transceivers are TI's SN65HVD232C chips, which provide CAN communication services to the user.
Figure 3-9-1 is the connection diagram of the CAN transceiver chips on the PS side:
[Figure 3-9-1: the BANK501 signals PS_CAN1_TX, PS_CAN2_RX and PS_CAN2_TX of U1 connect to the TXD/RxD pins of the SN65HVD232 transceivers U28 and U30, which drive CANH and CANL.]
There are two 485 communication interfaces on the AXU3EG expansion board, connected to the IO of BANK43~45 at the PL end.
The 485 transceivers are MAXIM's MAX3485 chips, which provide 485 communication services to the user.
Figure 3-11-1 is the connection diagram of the PL end 485 transceiver chips:
[Figure 3-11-1: the BANK43, 44, 45 signals PL_485_RXD1, PL_485_TXD1 and PL_485_DE1 of U1 connect to RO, DI and DE//RE of the MAX3485 U12; PL_485_RXD2, PL_485_TXD2 and PL_485_DE2 connect in the same way to the MAX3485 U2; each transceiver drives the bus lines A and B.]
The baseboard contains a MIPI camera interface, which can be used to connect our MIPI OV5640 camera module
(AN5641). The MIPI interface is a 15-pin FPC connector carrying 2 data lanes and 1 clock pair, connected to differential IO pins of BANK65
at a voltage level of 1.2V; the other control signals are connected to the IO of BANK43 at a voltage level of
3.3V.
[Figure: MIPI connection. The BANK65 signals MIPI_CLK_P/N, MIPI_LAN0_P/N and MIPI_LAN1_P/N and the BANK43 signals CAM_GPIO, CAM_CLK, CAM_SCL and CAM_SDA of U1 connect to the MIPI connector J23.]
A JTAG interface is reserved on the AXU3EG expansion board for downloading ZYNQ UltraScale+ programs or firmware.
To prevent hot-plugging from damaging the ZYNQ UltraScale+ chip, protection diodes are added to the JTAG signals
to keep the signal voltage within the range accepted by the FPGA.
The ZU3EG chip has an internal RTC real-time clock with year, month, day, hour, minute, second and weekday timekeeping functions.
An external 32.768KHz passive crystal supplies an accurate clock source to the internal clock circuit so that the RTC can
provide accurate time information. To keep the real-time clock running after the product loses power, a battery is generally needed
to power the clock circuit. BT1 on the development board is a 1.5V button battery (model LR1130).
When the board is powered off, the button battery powers the RTC system so that it keeps providing time information. Figure 3-12-1 shows the RTC system.
[Figure 3-12-1: the crystal Y2 and the battery input VBAT_IN connected to U1.]
The AXU3EG development board has an onboard EEPROM, model 24LC04, with a capacity of 4Kbit (2*256*8bit), which communicates with
the PS end over the IIC bus. In addition, the board has a high-precision, low-power digital temperature sensor chip, model
LM75, with a temperature accuracy of 0.5 degrees. The EEPROM and the temperature sensor are attached through
the I2C bus to the Bank500 MIO of ZYNQ UltraScale+. Figure 3-14-1 is the schematic diagram of the EEPROM and temperature sensor.
[Figure 3-14-1: the signals PS_IIC_B_SCL and PS_IIC_B_SDA of U1 connect through level conversion to the LM75 (U27) and the EEPROM.]
There are 3 LEDs on the AXU3EG expansion board: 1 power indicator, 1 PS control indicator and 1 PL control indicator.
The user can turn the control indicators on and off from the program. When the IO connected to a user LED is low, the LED is off;
when the IO is high, the LED is lit. The user LED hardware connection diagram is as follows:
[Figure: user LED connection. PS_LED on BANK501 of U1 and the PL LED on BANK43, powered from 3.3V.]
2.2.16 Buttons
The AXU3EG expansion board has a reset button RESET and two user buttons. The reset signal is connected to the reset chip input
of the core board, and the user can press this button to reset the ZYNQ system. One user button is connected to the MIO of the PS and the other to the IO of the PL.
The reset button and the user buttons are all active low. The connection diagram of the user buttons is as follows:
[Figure: user button connection. PS KEY connects to PS_KEY1 on BANK501 of U1; PL KEY connects to PL_KEY1 on BANK43.]
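The polarity described for the user LED (lit when its IO is high) and the user buttons (active low) can be tied together in a few lines of PL logic. The following is a minimal sketch, not the board's reference design: the module name key_led_sketch and the port names pl_key1 and pl_led are assumptions chosen to echo the PL_KEY1 label in the figure, and pin assignments are left to the constraint file.

```verilog
// Hypothetical sketch: light the PL user LED while the PL user button is held.
module key_led_sketch(pl_key1, pl_led) ;
input  pl_key1 ;   // active low: reads 0 while the button is pressed
output pl_led ;    // active high: driving 1 lights the LED

// Invert the button level so that "pressed" (0) becomes "lit" (1).
assign pl_led = ~pl_key1 ;
endmodule
```

Because both signals are simple levels, a single continuous assignment is enough; a real design would normally add debouncing in front of it.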
There is a 4-bit DIP switch SW1 on the development board for configuring the startup mode of the ZYNQ system.
The platform supports 4 startup modes: JTAG debug mode, QSPI FLASH, EMMC and SD2.0 card.
After the ZU3EG chip is powered on, it samples the levels of PS_MODE0~3 to determine the startup mode. Users can
select the startup mode with the DIP switch SW1 on the expansion board. The SW1 startup mode configuration is shown in Table 3-17-1 below.
SW1 DIP switch position (1, 2, 3, 4)    MODE[3:0]    Startup Mode
ON, ON, ON, ON                          0000         PS JTAG
ON, OFF, ON, OFF                        0101         SD Card
ON, OFF, OFF, ON                        0110         EMMC
The power input voltage of the AXU3EG development board is DC12V. On the baseboard, one DC/DC power chip TPS54620 and two
DC/DC power chips MP1482 convert it into +5V, +3.3V and +1.8V. In addition, the baseboard generates +1.2V through an LDO to power
BANK65 of the core board, while BANK66 is powered by +1.8V. The power supply design on the board is shown in Figure 3-18-1:
[Figure 3-18-1: the +12V supply feeds the TPS54620 (U38), which outputs +5.0V/6A; the MP1482 (U40) outputs +3.3V/2A; the MP1482 (U39) outputs +1.8V/2A; the LDO (U35) outputs +1.2V.]
The functions of each power distribution are shown in the following table:
+5.0V    USB power supply
2.2.19 Fan
Because the ZU3EG generates a lot of heat in normal operation, a heat sink and a fan are fitted to the chip on the board.
The fan is controlled by the ZYNQ chip through an IO of BANK43
(AA11): if the IO outputs a low level, the MOSFET turns on and the fan runs; if the IO outputs a high level, the fan stops.
The fan is fixed to the development board with screws before leaving the factory. The fan's power lead plugs into the J42 socket, with the red wire being the positive pole.
Introduction
This chapter introduces the basic building blocks of Verilog. A solid grasp of them is of great help for in-depth study of FPGA.
Data Types
3.2.1 Constants
For example, 8'b00001111 represents a binary integer with a width of 8 bits, and 4'ha represents a hexadecimal integer with a width of 4 bits.
X and Z: x represents an indeterminate value and z represents a high-impedance value. For example, in 5'b00x11 the third bit is indeterminate, and in 3'b00z the lowest bit is high impedance.
Underscore: in a long number, underscores can be used to group the digits and improve readability, such as 8'b0000_1111.
Parameter: parameter defines a constant with an identifier. Code then uses only the identifier, which improves readability.
For example, after defining parameter width = 8, the declaration reg [width-1:0] a defines a register with a width of 8 bits.
Parameter passing: if a module defines parameters, another module can override them when it instantiates this module,
using #() after the module name, as shown below.
endmodule
r1
(
.addr(addr),
.data(data),
.result(result)
);
endmodule
Parameter can be used to pass parameters between modules, while localparam is only used within this module and cannot be used to pass parameters between modules.
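As a hedged illustration of parameter passing, the sketch below defines a small module with two parameters and overrides both from a top-level module with #(). Only the instance name r1 and the ports addr, data and result come from the text above; the module name rom, the parameter names depth and width, their default values and the XOR-reduction body are assumptions made purely for illustration.

```verilog
// Hypothetical parameterized module; names, defaults and body are illustrative.
module rom
#(
    parameter depth = 15,   // address width in bits (default)
    parameter width = 8     // data width in bits (default)
)
(
    input  [depth-1:0] addr,
    input  [width-1:0] data,
    output             result
);
// localparam is visible only inside this module and cannot be overridden.
localparam total = depth + width ;
wire [total-1:0] all_bits = {addr, data} ;
assign result = ^all_bits ;   // placeholder logic: XOR of all input bits
endmodule

module top() ;
wire [31:0] addr ;
wire [15:0] data ;
wire        result ;

// Override depth and width with #() when instantiating the module.
rom
#(
    .depth(32),
    .width(16)
)
r1
(
    .addr(addr),
    .data(data),
    .result(result)
);
endmodule
```

Named overrides such as .depth(32) keep the instantiation readable and do not depend on the order in which the parameters were declared.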
Variables
A variable is a quantity whose value can be changed while a program is running. The following introduces several commonly used variable types.
Wire type variables, also called net type variables, represent physical connections between structural entities, such as between gates.
A wire cannot store a value and is driven with the continuous assignment statement assign. It is defined as wire [n-1:0] a; where n represents the bit width.
For example, with wire a; assign a = b; the node b is connected to the wire a. As shown in the figure below, the connection between two entities is a wire
type variable.
Reg type variables, also called register variables, can store values and must be assigned inside a procedural block such as an always statement.
A reg is defined as reg [n-1:0] a; which declares a register with a width of n bits. For example, reg [7:0] a; declares a register a with a width of 8 bits.
The register q is defined as shown below. The generated circuit is sequential logic. The figure on the right
shows its structure, which is a D flip-flop.
module top(d, clk, q) ;
input d ;
input clk ;
output reg q ;
always @(posedge clk)
begin
q <= d ;
end
endmodule
A reg can also generate combinational logic, such as a data selector: when the sensitivity list contains no clock, a variable declared as reg mux still results in a combinational logic circuit.
end
endmodule
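To make the point about reg and combinational logic concrete, here is a minimal sketch of a two-to-one data selector whose output is declared as reg but has no clock in its sensitivity list. The module name mux2_sketch and its port names are assumptions for illustration and are not taken from the original listing.

```verilog
// Hypothetical two-to-one selector: a reg that synthesizes to combinational logic.
module mux2_sketch(a, b, sel, mux) ;
input      a ;
input      b ;
input      sel ;
output reg mux ;     // declared reg, but no flip-flop is inferred

always @(*)          // sensitive to all inputs, no clock edge
begin
    if (sel)
        mux = b ;    // blocking assignment, as usual for combinational logic
    else
        mux = a ;
end
endmodule
```

Every branch assigns mux, so the block describes a pure selector and no latch is implied.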
The memory type can be used to define RAM, ROM and other memories. Its form is reg [n-1:0] memory_name [m-1:0],
which means m registers with a width of n bits. For example, reg [7:0] ram [255:0] defines 256 8-bit registers; 256 is the depth of the memory and 8 is its data width.
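A short sketch shows how such a memory array is typically read and written. It assumes a single-port memory with synchronous write and synchronous read; the module name ram_sketch and the ports clk, we, addr, din and dout are hypothetical and only the declaration reg [7:0] ram [255:0] comes from the text.

```verilog
// Hypothetical single-port RAM built on the 256 x 8 memory array from the text.
module ram_sketch(clk, we, addr, din, dout) ;
input            clk ;
input            we ;     // write enable
input      [7:0] addr ;   // 8 address bits select one of 256 words
input      [7:0] din ;
output reg [7:0] dout ;

reg [7:0] ram [255:0] ;   // 256 words, each 8 bits wide

always @(posedge clk)
begin
    if (we)
        ram[addr] <= din ;   // write the addressed word
    dout <= ram[addr] ;      // read returns the word stored before this edge
end
endmodule
```

The address width follows from the depth: 8 address bits are needed to index 256 words.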
Operators
"+" (addition operator), "-" (subtraction operator), "*" (multiplication operator), "/" (division operator; integer division truncates, such as 7/3 = 2), "%" (modulus operator, such as 7%3 = 1).
"=" is blocking assignment and "<=" is non-blocking assignment. With blocking assignment, one assignment statement completes before the next one begins:
the statements execute in order and each assignment takes effect immediately. Non-blocking assignments can be understood as executing in parallel, regardless of order:
the assignments take effect after the statements of the always block have been executed. For example, the following blocking assignment:
end
endmodule
begin
#({$random}%100) din = ~din ;
end
end
top t0(.din(din),.a(a),.b(b),.c(c),.clk(clk)) ;
endmodule
The simulation results show that, at the rising edge of clk, a takes the value of din and is immediately assigned to b, and the value of b is immediately assigned to c.
If the code is changed to non-blocking assignment, the simulation results are as follows: at the rising edge of clk, the new value of a is not immediately assigned to b, so b takes the original value of a.
The RTL diagrams of the two show an obvious difference:
[Figures: blocking assignment RTL diagram; non-blocking assignment RTL diagram]
In general, non-blocking assignment is used in sequential logic circuits to avoid race conditions during simulation.
Blocking assignment is used in combinational logic, where the value changes immediately after the assignment statement executes; the assign statement must use "=".
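The difference can be sketched with two modules that are identical except for the assignment operator. This is an illustrative reconstruction rather than the original listing: the module names are assumptions, while the signals clk, din, a, b and c follow the instantiation shown in the text.

```verilog
// Blocking version: each statement sees the value written by the previous one.
module blocking_sketch(clk, din, a, b, c) ;
input      clk ;
input      din ;
output reg a ;
output reg b ;
output reg c ;

always @(posedge clk)
begin
    a = din ;
    b = a ;     // uses the new a, so b equals din
    c = b ;     // uses the new b, so c equals din
end
endmodule

// Non-blocking version: all right-hand sides use the values from before the edge.
module nonblocking_sketch(clk, din, a, b, c) ;
input      clk ;
input      din ;
output reg a ;
output reg b ;
output reg c ;

always @(posedge clk)
begin
    a <= din ;
    b <= a ;    // old value of a
    c <= b ;    // old value of b
end
endmodule
```

The non-blocking version behaves as a three-stage shift register, while in the blocking version a, b and c all follow din on the same clock edge.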
Relational operators express the relationship between two operands, such as a>b or a<b, and are often used in conditions. For example,
if the value of a is greater than or equal to the value of b, the value of q is 1, otherwise the value of q is 0.
"&&" (logical AND of two operands), "||" (logical OR of two operands), "!" (logical NOT of a single operand). For example, if (a>b && c<d) means the condition
is that a>b and c<d both hold; if (!a) means the condition is that the value of a is 0.
"?:" is the conditional operator, similar to if else. For example, assign a = (i>8)?1'b1:1'b0; checks whether the value of i is greater than 8: if so, a is assigned 1'b1, otherwise 1'b0.
"~" bitwise inversion, "|" bitwise OR, "^" bitwise XOR, "&" bitwise AND. Except for "~", which takes a single
operand, these operators all take two operands, such as a&b and a|b. Their application is explained in the later section on combinational logic.
"<<" is the left shift operator and ">>" is the right shift operator. For example, a<<1 shifts a left by 1 bit, and a>>2 shifts a right by 2 bits.
The "{ }" concatenation operator joins multiple signals bit by bit, such as {a[3:0], b[1:0]}, which concatenates the lower 4 bits of a and the lower 2 bits of b.
In addition, {n{a[3:0]}} means n copies of a[3:0] concatenated, and {n{1'b0}} means n bits of 0. For example, {8{1'b0}} means 8'b0000_0000.
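The operators above can be tried out in one small module. This is a sketch for experimentation only: the module name operator_sketch and all port names are assumptions, and the sign-extension line is an added illustration of the replication operator rather than something taken from the original text.

```verilog
// Hypothetical module exercising concatenation, replication, shift and ?: operators.
module operator_sketch(a, b, i, cat, sext, shl, shr, gt8) ;
input  [3:0] a ;
input  [1:0] b ;
input  [3:0] i ;
output [5:0] cat ;
output [7:0] sext ;
output [3:0] shl ;
output [3:0] shr ;
output       gt8 ;

assign cat  = {a[3:0], b[1:0]} ;       // a lands in bits [5:2], b in bits [1:0]
assign sext = {{4{a[3]}}, a[3:0]} ;    // four copies of a[3] in front of a
assign shl  = a << 1 ;                 // left shift, vacated low bit is 0
assign shr  = a >> 2 ;                 // right shift, vacated high bits are 0
assign gt8  = (i > 8) ? 1'b1 : 1'b0 ;  // conditional operator from the text
endmodule
```

Note that the result of a << 1 is truncated to the 4-bit width of shl, so the original top bit of a is lost.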
3.4.9 Priority
Combinational Logic
This section introduces combinational logic. The characteristic of a combinational logic circuit is that the output at any time depends only on the input signals at that time:
when an input signal changes, the output changes immediately, independent of any clock.
In Verilog, "&" represents bitwise AND, such as c=a&b. The truth table is as follows: the result is 1 only when a and b are both equal to 1.
begin
#({$random}%100) a = ~a ;
#({$random}%100) b = ~b ;
end
end
endmodule
If the bit width of a and b is greater than 1, for example with input [3:0] a and input [3:0] b, then a&b performs the AND operation on each pair of corresponding bits of a and b.
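A minimal sketch of the AND gate in both its single-bit and multi-bit forms follows. The module name and_sketch and the vector port names av, bv and cv are assumptions for illustration; a, b and c follow the names used in the text.

```verilog
// Hypothetical module showing 1-bit and 4-bit bitwise AND side by side.
module and_sketch(a, b, c, av, bv, cv) ;
input        a ;
input        b ;
output       c ;
input  [3:0] av ;
input  [3:0] bv ;
output [3:0] cv ;

assign c  = a & b ;     // 1 only when a and b are both 1
assign cv = av & bv ;   // bit n of cv is av[n] & bv[n]
endmodule
```

For example, with av = 4'b1100 and bv = 4'b1010 the bitwise result cv is 4'b1000.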
3.5.2 OR Gate
In Verilog, "|" represents bitwise OR, such as c = a|b. The truth table is as follows: the result is 0 only when a and b are both 0.
end
endmodule
In Verilog, “~” represents bitwise inversion, such as b=~a, the truth table is as follows, b is equal to the opposite of a.
assign b = ~a ;
endmodule
initial
begin
a = 0 ;
forever
begin
#({$random}%100) a = ~a ;
end
end
endmodule
3.5.4 XOR
In Verilog, “^” represents XOR, such as c= a^b, the truth table is as follows, when a and b are the same, the output is 0.
end
endmodule
3.5.5 Comparator
In Verilog, comparisons are written with greater than ">", equal to "==", less than "<", greater than or equal to ">=", less than or equal to "<=" and not equal to "!=".
Taking greater than as an example, with c = a > b; the value of c is 1 if a is greater than b, otherwise it is 0. The truth table is as follows:
a = 0 ;
b = 0 ;
forever
begin
#({$random}%100) a = ~a ;
#({$random}%100) b = ~b ;
end
end
endmodule
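Since the comparator design itself did not survive in the listing above, the following sketch shows what such a module can look like. The module name compare_sketch, the 4-bit operand width and the output names gt, eq and lt are assumptions; only the expression a > b comes from the text.

```verilog
// Hypothetical comparator producing three 1-bit results from unsigned operands.
module compare_sketch(a, b, gt, eq, lt) ;
input  [3:0] a ;
input  [3:0] b ;
output       gt ;
output       eq ;
output       lt ;

assign gt = (a > b) ;    // 1 when a is greater than b, otherwise 0
assign eq = (a == b) ;   // 1 when a equals b
assign lt = (a < b) ;    // 1 when a is less than b
endmodule
```

Because the ports are declared without the signed keyword, the comparisons treat a and b as unsigned values.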
Half adders and full adders are the basic units of arithmetic circuits. A half adder is so called because it does not take the carry from the lower bit into account.
sum represents the addition result and count represents the carry. The code is as follows:
module top(a, b, sum, count) ;
input a ;
input b ;
output sum ;
output count ;
assign sum = a ^ b ;
assign count = a & b ;
endmodule
The stimulus file is as follows:
`timescale 1 ns/1 ns
module top_tb() ;
reg a ;
reg b ;
wire sum ;
wire count ;
initial
begin
a = 0 ;
b = 0 ;
forever
begin
#({$random}%100) a = ~a ;
#({$random}%100) b = ~b ;
end
end
top t0(.a(a),.b(b),.sum(sum),.count(count)) ;
endmodule
The full adder also adds the carry signal cin from the lower bit. The truth table is as follows:
end
end
endmodule
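The full adder design is not recoverable from the listing above, so the following is a hedged sketch built from the standard full adder equations. It reuses the half adder's names sum and count and the carry-in name cin from the text; the module name full_adder_sketch is an assumption.

```verilog
// Hypothetical 1-bit full adder: adds a, b and the carry-in cin.
module full_adder_sketch(a, b, cin, sum, count) ;
input  a ;
input  b ;
input  cin ;
output sum ;
output count ;

assign sum   = a ^ b ^ cin ;                 // 1 when an odd number of inputs are 1
assign count = (a & b) | (cin & (a ^ b)) ;   // carry out when at least two inputs are 1
endmodule
```

With cin tied to 0 these equations reduce to sum = a ^ b and count = a & b, which is exactly the half adder.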
3.5.8 Multiplier
assign c = a * b ;
endmodule
initial
begin
a = 0 ;
b = 0 ;
forever
begin
#({$random}%100) a = ~a ;
#({$random}%100) b = ~b ;
end
end
endmodule
Data selectors are often used in Verilog: a select signal chooses which input signal is passed to the output. The truth table below
describes a four-to-one data selector, where sel[1:0] is the select signal, a, b, c and d are the input signals, and Mux is the output signal.
The code is as follows:
module top(a, b, c, d, sel, Mux) ;
input a ; input b ; input c ; input d ;
input [1:0] sel ;
output reg Mux ;
always @(sel or a or b or c or d)
begin
case(sel)
2'b00 : Mux = a ;
2'b01 : Mux = b ;
2'b10 : Mux = c ;
2'b11 : Mux = d ;
endcase
end
endmodule
The stimulus file is as follows:
`timescale 1 ns/1 ns
module top_tb() ;
reg a ; reg b ; reg c ; reg d ;
reg [1:0] sel ;
wire Mux ;
initial
begin
a = 0 ; b = 0 ; c = 0 ; d = 0 ;
forever
begin
#({$random}%100) a = {$random}%3 ;
#({$random}%100) b = {$random}%3 ;
#({$random}%100) c = {$random}%3 ;
#({$random}%100) d = {$random}%3 ;
end
end
initial
begin
sel = 2'b00 ;
#2000 sel = 2'b01 ;
#2000 sel = 2'b10 ;
#2000 sel = 2'b11 ;
end
top t0(.a(a),.b(b),.c(c),.d(d),.sel(sel),.Mux(Mux)) ;
endmodule
The 3-8 decoder is a very commonly used device. Its truth table is shown below: the output depends on the values of A2, A1 and A0.
The code is as follows:
module top(addr, decoder) ;
input [2:0] addr ;
output reg [7:0] decoder ;
always @(addr)
begin
case(addr)
3'b000 : decoder = 8'b1111_1110 ;
3'b001 : decoder = 8'b1111_1101 ;
3'b010 : decoder = 8'b1111_1011 ;
3'b011 : decoder = 8'b1111_0111 ;
3'b100 : decoder = 8'b1110_1111 ;
3'b101 : decoder = 8'b1101_1111 ;
3'b110 : decoder = 8'b1011_1111 ;
3'b111 : decoder = 8'b0111_1111 ;
endcase
end
endmodule
The stimulus file is as follows:
`timescale 1 ns/1 ns
module top_tb() ;
reg [2:0] addr ;
wire [7:0] decoder ;
top t0(.addr(addr),.decoder(decoder)) ;
endmodule
In FPGA designs, bidirectional IO is often needed and requires a tri-state gate, such as bio = en ? din : 1'bz ; where en is the enable signal that opens
and closes the tri-state gate. The following RTL diagram implements bidirectional IO, and you can refer to the code. The stimulus file connects two
bidirectional IOs together.
module top(en, din, dout, bio) ;
input din ;
input en ;
output dout ;
inout bio ;
assign bio = en ? din : 1'bz ;
assign dout = bio ;
endmodule
The stimulus file is as follows:
`timescale 1 ns/1 ns
module top_tb() ;
reg en0 ;
reg din0 ;
wire dout0 ;
reg en1 ;
reg din1 ;
wire dout1 ;
wire bio ;
initial
begin
din0 = 0 ;
din1 = 0 ;
forever
begin
#({$random}%100) din0 = ~din0 ;
#({$random}%100) din1 = ~din1 ;
end
end
initial
begin
en0 = 0 ;
en1 = 1 ;
#100000
en0 = 1 ;
en1 = 0 ;
end
top t0(.en(en0),.din(din0),.dout(dout0),.bio(bio)) ;
top t1(.en(en1),.din(din1),.dout(dout1),.bio(bio)) ;
endmodule
The simulation results are as follows. When en0 is 0 and en1 is 1, channel 1 is open: the bidirectional IO bio equals din1 of channel 1, channel 1 sends data outward, channel 0 receives data, and dout0 equals bio. When en0 is 1 and en1 is 0, channel 0 is open: bio equals din0 of channel 0, channel 0 sends data outward, channel 1 receives data, and dout1 equals bio.
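The tri-state behaviour described above can be sketched in a few lines. The drive expression is the one quoted in the text (bio = en ? din : 1'bz); connecting dout directly to bio is an assumption based on the simulation description, where dout equals bio on the receiving side.

```verilog
// Bidirectional IO with a tri-state driver (sketch).
module top(en, din, dout, bio);
  input  din;
  input  en;
  output dout;
  inout  bio;

  // Drive the pin only while en is high; otherwise release it (high impedance)
  assign bio  = en ? din : 1'bz;
  // Assumption: the input path always observes the pin
  assign dout = bio;
endmodule
```

When two instances share one bio net, only one en should be high at a time; otherwise both sides drive the net and the simulator shows x where the values differ.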
Sequential Logic
In a combinational logic circuit, the output at any moment depends only on the inputs at that moment and is independent of the circuit's previous state. In sequential logic, the output at any moment depends not only on the current input signals but also on the previous state of the circuit. Typical sequential logic is analyzed below.
3.6.1 D Flip-Flop
The D flip-flop stores data on the rising or falling edge of the clock; the output equals the state of the input signal just before the clock edge. The code is as follows:

...
initial
begin
  d = 0 ;
  clk = 0 ;
  forever
  begin
    #({$random}%100) d = ~d ;
  end
end
...
top t0(.d(d),.clk(clk),.q(q)) ;
endmodule
The simulation results are as follows. At time t0, the value of d is 0 and the value of q is also 0. At time t1, d has changed to 1, so q follows and becomes 1. Within the clock cycle between t0 and t1, no matter how the input signal d changes, the value of q remains unchanged. In other words, the flip-flop has a storage function, and the saved value is the value of d just before the clock edge.
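A D flip-flop with the ports used by the stimulus file (d, clk, q) can be sketched as below. The choice of the rising edge is an assumption; the text allows either edge.

```verilog
// D flip-flop (sketch). Assumption: data is captured on the rising edge.
module top(d, clk, q);
  input      d;
  input      clk;
  output reg q;

  // q takes the value d had just before the rising edge and holds it
  // until the next rising edge, regardless of how d changes in between.
  always @(posedge clk)
    q <= d;
endmodule
```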
The software performs timing analysis based on a model of two cascaded D flip-flops. Here we examine how the data output by the two D flip-flops differs at the same moment. The RTL diagram is as follows:
The code is as follows:

module top(d, clk, q, q1) ;
input d ;
input clk ;
output reg q ;
output reg q1 ;
...
endmodule

The stimulus file is as follows:

`timescale 1 ns/1 ns
module top_tb() ;
reg d ;
reg clk ;
wire q ;
wire q1 ;
...
  d = 0 ;
  clk = 0 ;
  forever
  ...
end
top t0(.d(d),.clk(clk),.q(q),.q1(q1)) ;
endmodule
The simulation results are as follows. At t0, d is 0 and q outputs 0. At t1, q takes the new value of d, but the value of q just before the clock edge was still 0, so q1 remains 0. At t2, the value of q just before the clock edge is 1, so q1 becomes 1.
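The two-stage behaviour described above follows directly from non-blocking assignments. The sketch below uses the port names from the module header in the text; putting both registers in a single always block is an assumption about layout, not necessarily how the original listing was written.

```verilog
// Two cascaded D flip-flops (sketch).
module top(d, clk, q, q1);
  input      d;
  input      clk;
  output reg q;
  output reg q1;

  always @(posedge clk) begin
    q  <= d;   // first stage samples d
    q1 <= q;   // second stage samples the OLD value of q,
               // so q1 lags q by exactly one clock cycle
  end
endmodule
```

Because both right-hand sides are evaluated before either register is updated, q1 always receives the value q held before the clock edge.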
Asynchronous reset is independent of the clock: as soon as the asynchronous reset signal is asserted, the reset takes effect. This feature is used frequently when writing code to reset and initialize signals. Its RTL diagram is as follows:
The code is as follows. Note that the asynchronous reset signal must be placed in the sensitivity list; for an active-low reset, use negedge.

module top(d, rst, clk, q);
input d ;
input rst ;
input clk ;
output reg q ;
...
endmodule

The stimulus file is as follows:

`timescale 1 ns/1 ns
module top_tb() ;
reg d ;
...
  begin
    #({$random}%100) d = ~d ;
  end
end
initial
begin
  rst = 0 ;
  #200 rst = 1 ;
end
top t0(.d(d),.rst(rst),.clk(clk),.q(q)) ;
endmodule
The simulation results are as follows. Before the reset signal is released, the circuit stays in the reset state and the output q is always 0 even though the input signal d changes; after reset is released, q takes its normal value.
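A D flip-flop with asynchronous reset can be sketched as follows. The reset is assumed to be active low, which matches the stimulus (rst starts at 0 and is released to 1 after 200 ns) and the note about negedge in the sensitivity list.

```verilog
// D flip-flop with asynchronous, active-low reset (sketch).
module top(d, rst, clk, q);
  input      d;
  input      rst;
  input      clk;
  output reg q;

  // negedge rst in the sensitivity list makes the reset asynchronous:
  // q is cleared as soon as rst falls, without waiting for a clock edge.
  always @(posedge clk or negedge rst) begin
    if (!rst)
      q <= 1'b0;
    else
      q <= d;
  end
endmodule
```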
As mentioned above, asynchronous reset operates independently of the clock, whereas synchronous clear is synchronized with the clock signal. This is not limited to synchronous clearing; other synchronous operations work the same way. The RTL diagram is as follows:
The code is as follows:

...
    q <= d ;
end
endmodule

The stimulus file is as follows:

...
  begin
    #({$random}%100) d = ~d ;
  end
end
initial
begin
  rst = 0 ;
  clr = 0 ;
  #200 rst = 1 ;
  #200 clr = 1 ;
  #100 clr = 0 ;
end
top t0(.d(d),.rst(rst),.clr(clr),.clk(clk),.q(q)) ;
endmodule
The simulation results are as follows. After the clr signal is pulled high, q is not cleared immediately; it is cleared at the next rising edge of clk.
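The difference between the two kinds of reset is visible in the sensitivity list. In the sketch below, rst is assumed to be an active-low asynchronous reset and clr an active-high synchronous clear, consistent with the stimulus file; clr is deliberately not in the sensitivity list.

```verilog
// D flip-flop with asynchronous reset and synchronous clear (sketch).
// Assumptions: rst is active low, clr is active high.
module top(d, rst, clr, clk, q);
  input      d;
  input      rst;
  input      clr;
  input      clk;
  output reg q;

  always @(posedge clk or negedge rst) begin
    if (!rst)
      q <= 1'b0;      // asynchronous: acts immediately
    else if (clr)
      q <= 1'b0;      // synchronous: only seen at a rising clock edge
    else
      q <= d;
  end
endmodule
```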
A shift register is a register whose contents move one bit to the left or right on each clock pulse. Because of the characteristics of the D flip-flop, the data output is synchronized with the clock edge. Its structure is as follows: on each clock edge, the output q of each D flip-flop takes the value held by the preceding flip-flop. The code implementation is as follows:

module top(d, rst, clk, q);
input d ;
input rst ;
input clk ;
output reg [7:0] q ;
...
end
endmodule

The stimulus file is as follows:

`timescale 1 ns/1 ns
module top_tb() ;
reg d ;
...
  begin
    #({$random}%100) d = ~d ;
  end
end
initial
begin
  rst = 0 ;
  #200 rst = 1 ;
end
top t0(.d(d),.rst(rst),.clk(clk),.q(q)) ;
endmodule
The simulation results are as follows. It can be seen that after reset, each clk rising edge shifts left by one position.
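An 8-bit left-shifting register with the ports from the module header can be sketched as below. The active-low asynchronous reset is an assumption based on the stimulus file, and the shift expression {q[6:0], d} is the one the tutorial uses later in its state machine example.

```verilog
// 8-bit shift register, shifting left by one bit per clock (sketch).
// Assumption: rst is an active-low asynchronous reset.
module top(d, rst, clk, q);
  input            d;
  input            rst;
  input            clk;
  output reg [7:0] q;

  always @(posedge clk or negedge rst) begin
    if (!rst)
      q <= 8'd0;
    else
      q <= {q[6:0], d};  // drop q[7], move q[6:0] up, insert d at bit 0
  end
endmodule
```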
In a single-port RAM, writes and reads share the same address. The code is as follows, where reg [7:0] ram [63:0] defines 64 data words, each 8 bits wide. addr_reg is defined to hold the read address, so the data is sent out after a delay of one cycle.
...
reg [7:0] ram[63:0]; //declare ram
reg [5:0] addr_reg; //addr register
always @ (posedge clk)
begin
  if (wr) //write
    ram[addr] <= data;
  ...

The stimulus file is as follows:

...
initial
begin
  data = 0 ;
  addr = 0 ;
  wr = 1 ;
  clk = 0 ;
end
always #10 clk = ~clk ;
...
The simulation results are as follows. It can be seen that the output of q is consistent with the written data.
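The single-port RAM described above, with a registered read address, can be sketched as follows. The port list is an assumption reconstructed from the signal names in the stimulus file (data, addr, wr, clk, q); the memory and addr_reg declarations are the ones quoted in the text.

```verilog
// Single-port RAM, 64 words x 8 bits (sketch).
// Assumption: port names and widths are inferred from the stimulus signals.
module top(
  input  [7:0] data,
  input  [5:0] addr,
  input        wr,
  input        clk,
  output [7:0] q
);
  reg [7:0] ram[63:0];   // declare ram
  reg [5:0] addr_reg;    // registered read address

  always @(posedge clk) begin
    if (wr)              // write
      ram[addr] <= data;
    addr_reg <= addr;    // capture the address for the read side
  end

  // Reading through addr_reg makes the data appear one cycle after
  // the address is presented.
  assign q = ram[addr_reg];
endmodule
```

Registering the address rather than the data is a common template for this kind of memory; whether the tools map it to block RAM depends on the device and settings.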
The read and write addresses of the pseudo dual-port RAM are independent, so any write address and any read address can be chosen and a read and a write can take place at the same time. As shown below, the rd signal is defined in the stimulus file, and the read address is sent while it is valid.
The code is as follows:

...
input [5:0] read_addr,
input wr,
input rd,
input clk,
output reg [7:0] q
);
reg [7:0] ram[63:0]; //declare ram
reg [5:0] addr_reg; //addr register
always @ (posedge clk)
begin
  //write
  if (wr)
    ram[write_addr] <= data;
  if (rd) //read
    q <= ram[read_addr];
end
endmodule

The stimulus file is as follows:

...
reg [5:0] read_addr ;
reg wr ;
reg clk ;
reg rd ;
wire [7:0] q ;
initial
begin
  data = 0 ;
  write_addr = 0 ;
  read_addr = 0 ;
  wr = 0 ;
  rd = 0 ;
  clk = 0 ;
  #100 wr = 1 ;
  #20 rd = 1 ;
end
always #10 clk = ~clk ;
always @(posedge clk)
begin
  if (wr)
  begin
    data <= data + 1'b1 ;
    write_addr <= write_addr + 1'b1 ;
    if (rd)
      read_addr <= read_addr + 1'b1 ;
  end
end
top
...
The simulation results are as follows. It can be seen that when rd is valid, the read address is operated and the data is read out.
True dual-port RAM has two sets of control lines and data lines, allowing two systems to read and write it. The code is as follows:

module top (
input [7:0] data_a, data_b,
input [5:0] addr_a, addr_b,
input wr_a, wr_b,
input rd_a, rd_b,
input clk,
...
reg [7:0] ram[63:0]; //declare ram
//Port A
always @ (posedge clk)
begin
  if (wr_a) //write
  begin
    ram[addr_a] <= data_a;
    q_a <= data_a ;
  end
  if (rd_a) //read
    q_a <= ram[addr_a];
end
...

The stimulus file is as follows:

`timescale 1 ns/1 ns
module top_tb() ;
reg [7:0] data_a, data_b ;
reg [5:0] addr_a, addr_b ;
reg wr_a, wr_b ;
reg rd_a, rd_b ;
reg clk ;
...
initial
begin
  data_a = 0 ;
  data_b = 0 ;
  addr_a = 0 ;
  addr_b = 0 ;
  wr_a = 0 ;
  wr_b = 0 ;
  rd_a = 0 ;
  rd_b = 0 ;
  clk = 0 ;
  #100 wr_a = 1 ;
  #100 rd_b = 1 ;
end
always #10 clk = ~clk ;
...
end
top t0(.data_a(data_a), .data_b(data_b),
       .addr_a(addr_a), .addr_b(addr_b),
       .wr_a(wr_a), .wr_b(wr_b),
       .rd_a(rd_a), .rd_b(rd_b),
       ...
ROM is used to store data. A ROM can be initialized in code in the form shown below, but this approach becomes cumbersome for a large-capacity ROM; in that case it is recommended to use the ROM IP core that comes with the FPGA and add an initialization file.
The code is as follows:

...
always @(posedge clk)
begin
  case(addr)
    4'd0 : q <= 8'd15 ;
    4'd1 : q <= 8'd24 ;
    4'd2 : q <= 8'd100 ;
    4'd3 : q <= 8'd78 ;
    4'd4 : q <= 8'd98 ;
    4'd5 : q <= 8'd105 ;
    4'd6 : q <= 8'd86 ;
    4'd7 : q <= 8'd254 ;
    4'd8 : q <= 8'd76 ;
    4'd9 : q <= 8'd35 ;
    4'd10 : q <= 8'd120 ;
    4'd11 : q <= 8'd85 ;
    4'd12 : q <= 8'd37 ;
    4'd13 : q <= 8'd19 ;
    4'd14 : q <= 8'd22 ;
    4'd15 : q <= 8'd67 ;
    default: q <= 8'd0 ;
  endcase
end
endmodule

The stimulus file is as follows:

...
initial
begin
  addr = 0 ;
  clk = 0 ;
end
always #10 clk = ~clk ;
always @(posedge clk)
begin
  addr <= addr + 1'b1 ;
end
...
Finite state machines are often used in Verilog to handle relatively complex logic by defining different states and the transitions between them.
The program designs an 8-bit shift register. In the Idle state, the shift_start signal is checked; if it is high, the machine enters the Start state, waits 100 cycles there, then enters the Run state and performs the shift. If the shift_stop signal is asserted, the machine enters the Stop state, where the value of q is cleared, and then jumps back to the Idle state.
In a Mealy finite state machine, the output depends not only on the current state but also on the input signals.
module top (
input shift_start,
input shift_stop,
input rst,
input clk,
input d,
output reg [7:0] q );
...
        // 99 needs 7 bits, so delay_cnt must be declared at least 7 bits wide
        if (delay_cnt == 7'd99)
        begin
          delay_cnt <= 0 ;
          state <= Run ;
        end
        else
          delay_cnt <= delay_cnt + 1'b1 ;
      end
      Run : begin
        if (shift_stop)
          state <= Stop ;
        else
          q <= {q[6:0], d} ;
      end
      Stop : begin
        q <= 0 ;
        state <= Idle ;
      end
      default: state <= Idle ;
    endcase
end
endmodule
In a Moore finite state machine, the output depends only on the current state and not on the input signals; the inputs only affect state transitions and do not affect the output directly. For example, the handling of delay_cnt and q depends only on the state.
module top (
input shift_start,
input shift_stop,
input rst,
input clk,
input d,
output reg [7:0] q );
...
begin
  if (!rst)
    current_state <= Idle ;
  else
    current_state <= next_state ;
end
//Second part: combinational logic, evaluates the state transition conditions
always @(*)
begin
  case(current_state)
    Idle : begin
      if (shift_start)
        next_state = Start ;
      else
        next_state = Idle ;
    end
    Start : begin
      // 99 needs 7 bits, so delay_cnt must be declared at least 7 bits wide
      if (delay_cnt == 7'd99)
        next_state = Run ;
      else
      ...
    if (current_state == Run)
      q <= {q[6:0], d} ;
    else
      q <= 0 ;
end
endmodule
The two programs above use two different coding styles. The first, the Mealy state machine, uses the one-segment style: a single always statement contains all state transitions, transition conditions, and data outputs. Its disadvantage is that the program becomes lengthy when there are many states. The second, the Moore state machine, uses the three-segment style: the state register uses one always statement, the state transition conditions are combinational logic in a second always statement, and the data output is a separate always statement as well. This is more intuitive and clearer to write, and it does not become cumbersome even when there are many states.
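The three-segment style can be sketched as a complete module. This is a sketch under stated assumptions rather than the tutorial's listing: the module name shift_fsm, the state encoding, the 7-bit width of delay_cnt (wide enough to count to 99), and the separate always block for delay_cnt are all choices made for this example. The state names, ports, and the rule that q shifts only in Run and is cleared otherwise come from the text.

```verilog
// Moore-style shift controller in the three-segment style (sketch).
// shift_fsm, the state encoding and the delay_cnt width are assumptions.
module shift_fsm(
  input            shift_start,
  input            shift_stop,
  input            rst,         // assumed active-low asynchronous reset
  input            clk,
  input            d,
  output reg [7:0] q
);
  localparam Idle  = 2'd0,
             Start = 2'd1,
             Run   = 2'd2,
             Stop  = 2'd3;

  reg [1:0] current_state, next_state;
  reg [6:0] delay_cnt;          // 7 bits: large enough to hold 99

  // First part: state register
  always @(posedge clk or negedge rst) begin
    if (!rst) current_state <= Idle;
    else      current_state <= next_state;
  end

  // Second part: combinational next-state logic
  always @(*) begin
    case (current_state)
      Idle:    next_state = shift_start ? Start : Idle;
      Start:   next_state = (delay_cnt == 7'd99) ? Run : Start;
      Run:     next_state = shift_stop ? Stop : Run;
      Stop:    next_state = Idle;
      default: next_state = Idle;
    endcase
  end

  // Third part: registered outputs that depend only on the current state
  always @(posedge clk or negedge rst) begin
    if (!rst)                        delay_cnt <= 7'd0;
    else if (current_state == Start) delay_cnt <= delay_cnt + 1'b1;
    else                             delay_cnt <= 7'd0;
  end

  always @(posedge clk or negedge rst) begin
    if (!rst)                      q <= 8'd0;
    else if (current_state == Run) q <= {q[6:0], d};
    else                           q <= 8'd0;
  end
endmodule
```

Blocking assignments are used in the combinational block and non-blocking assignments in the clocked blocks, which is the usual convention for avoiding simulation and synthesis mismatches.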
The stimulus file is as follows:

...
initial
begin
  rst = 0 ;
  clk = 0 ;
  d = 0 ;
  #200 rst = 1 ;
  forever
  begin
    #({$random}%100) d = ~d ;
  end
end
initial
begin
  shift_start = 0 ;
  shift_stop = 0 ;
  #300 shift_start = 1 ;
  #1000 shift_start = 0 ;
  shift_stop = 1 ;
  #50 shift_stop = 0 ;
end
top t0 (
...
Summary
This document introduced the commonly used modules in combinational logic and sequential logic. Among them, the finite state machine is relatively complex but used very often. We hope readers will study it carefully, apply it frequently in their code, and think it through; this helps to improve quickly.
PL (FPGA) development is crucial for ZYNQ, and it is where ZYNQ has an advantage over other ARM devices: many ARM-side peripherals can be customized. Before customizing ARM-side peripherals, let us first use an LED routine to get familiar with the PL (FPGA) development flow and the basic operation of the Vivado software. This flow is exactly the same as for FPGA chips without an ARM core.
In this example we run an LED control experiment, toggling the LED on the development board once per second. Once the LED can be controlled, other peripherals can be brought under control step by step.
1) The PL part of the development board is connected to a red user LED, which is controlled entirely by the PL. When PL_LED1 is high, the transistor conducts and the LED is on; otherwise it is off.
2) The binding between the LED and the PL pin can be determined from the connections in the schematic.
Base board
Core board
3) IOs labeled PS_MIO in the schematic are all PS-side IOs; they do not need to be bound, and cannot be bound.
At the beginning
1) Start Vivado. In Windows, you can double-click the Vivado shortcut to start Vivado.
2) Click “Create New Project” in the Vivado development environment to create a new project.
4) In the pop-up dialog box, enter the project name and the directory where the project is stored. Here we use the project name LED. Note that the project location must not contain Chinese characters or spaces, and the path must not be too long.
6) Select "Verilog" for the target language. Although Verilog is selected, VHDL can also be used, because mixed-language designs are supported.
8) Take the AXU3EG development board as an example. In the "Part" option, select "Zynq UltraScale+ MPSoCs" for the device family, "sfvc784" for the package type, "-1" for the speed, and "I" for the temperature to narrow the selection, then select "xczu3eg-sfvc784-1-i" in the drop-down list. "-1" indicates the speed grade: the larger the number, the higher the speed grade and the better the performance, and a chip with a higher speed grade is backward compatible with one of a lower speed grade. (The model selected in the example is "xazu3eg-
AXU4EV development board, in the "Part" option, select "Zynq UltraScale+ MPSoCs" for the device family, select
"sfvc784" for the package type, select "-1" for the speed, and select "I" for the temperature to reduce the selection range.
For the AXU5EV development board, in the "Part" option select "Zynq UltraScale+ MPSoCs" for the device family, "sfvc784" for the package type, "-1" for the speed, and "I" for the temperature to narrow the selection, then select "xczu5ev-sfvc784-1-i" in the drop-down list. (The model selected in the example is "xazu5ev-sfvc784-1-i"; the two are compatible.)
1) Click the Add Sources icon under Project Manager (or use the shortcut Alt+A)
6) In the pop-up "Define Module" dialog, you can specify the module name of the "led.v" file under "Module name"; the default is "led". You can also specify ports here, but leave them unspecified for now and click "OK".
9) Write "led.v". It defines a 32-bit register, timer_cnt, that counts cyclically from 0 to 199999999 (1 second at 200 MHz). When the count reaches 199999999, timer_cnt returns to 0 and the LED is toggled: if the LED was off it turns on, and if it was on it turns off. Because the input clock is a 200 MHz differential clock, an IBUFDS primitive is needed to receive the differential signal. The code is as follows:
...
IBUFDS IBUFDS_inst
(
  .O(sys_clk),    // Buffer output
  .I(sys_clk_p),  // Diff_p buffer input (connect directly to top-level port)
  ...
  end
  else
  begin
    led <= led;
    timer_cnt <= timer_cnt + 32'd1;
  end
end
endmodule
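The counter-and-toggle scheme described in step 9 can be sketched as a complete module. Several details here are assumptions: the module name led_test (the simulation section instantiates a module of that name), the port names (taken from the test bench signals), a synchronous active-low reset, and a single-bit led output. IBUFDS is a Xilinx primitive with ports O, I and IB that Vivado resolves from its built-in library, so this sketch will not elaborate in a simulator that lacks that library.

```verilog
// LED toggling once per second from a 200 MHz differential clock (sketch).
// Assumptions: module/port names, synchronous active-low reset, 1-bit led.
module led_test(
  input      sys_clk_p,   // differential clock, positive
  input      sys_clk_n,   // differential clock, negative
  input      rst_n,       // active-low reset
  output reg led
);
  reg [31:0] timer_cnt;
  wire       sys_clk;

  // Differential input buffer (Xilinx primitive)
  IBUFDS IBUFDS_inst (
    .O (sys_clk),     // Buffer output
    .I (sys_clk_p),   // Diff_p buffer input
    .IB(sys_clk_n)    // Diff_n buffer input
  );

  always @(posedge sys_clk) begin
    if (!rst_n) begin
      led       <= 1'b0;
      timer_cnt <= 32'd0;
    end
    else if (timer_cnt == 32'd199_999_999) begin
      // 200,000,000 clock periods of 5 ns = 1 second
      led       <= ~led;
      timer_cnt <= 32'd0;
    end
    else begin
      led       <= led;
      timer_cnt <= timer_cnt + 32'd1;
    end
  end
endmodule
```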
Vivado uses constraint files in xdc format. The xdc file mainly holds pin constraints and clock constraints, among others. Here we need to assign the input and output ports of the led.v program to real pins of the FPGA.
4) You can see the pin assignment in the pop-up I/O Ports
5) Bind the reset signal rst_n to the button on the PL side, assign pins and level standards to the LED and the clock, and click the Save icon when finished.
6) A window pops up asking you to save the constraint file. Enter "led" as the file name and "XDC" as the default file type. Click
"OK"
7) Open the "led.xdc" file just generated; we can see that it is a TCL script. Once we understand this syntax, we can also constrain the pins by writing the led.xdc file ourselves.
The following is an introduction to the most basic XDC syntax. For ordinary IO ports, only the pin number and the voltage level need to be constrained.
The pin constraint is as follows:
set_property PACKAGE_PIN "pin number" [get_ports "port name"]
The level constraint is as follows:
set_property IOSTANDARD "level standard" [get_ports "port name"]
Pay attention to capitalization here. If the port name is an array, enclose it in { }. The port name must match the name in the source code, and it must not be the same as a keyword.
The number at the end of the level standard, such as the 33 in "LVCMOS33", refers to the BANK voltage of the FPGA. The BANK that the LED is on runs at 3.3 V.
In addition to pin allocation, an FPGA design also has an important constraint, which is timing constraint.
6) The Timing Constraint Wizard analyzes the clock in the design. Here, set the frequency of "sys_clk_p" to 200Mhz, and then click
8) Click “Finish”
9) At this time, the led.xdc file has been updated. Click "Reload" to reload the file and save the file.
1) The compilation process can be divided into synthesis, place and route, bit file generation, and so on. Here we directly click "Generate Bitstream".
2) In the pop-up dialog box, you can select the number of jobs. This is related to the number of CPU cores; generally, the larger the number, the faster the compilation.
3) Compilation now starts, and a status message is shown in the upper right corner. During compilation, anti-virus software or PC-manager software may block the tools, causing the compilation to fail or to run for a long time without finishing.
4) If there are no errors, a dialog box pops up after compilation completes and asks us to choose the next operation. We can choose "Open Hardware Manager" or "Cancel". We choose "Cancel" here and do not download yet.
Vivado Simulation
Next, we use the simulation tool provided by Vivado to output waveforms and verify that the behaviour of the LED program matches our expectation. (Note: simulation can also be performed before generating the bit file.) The specific steps are as follows:
1) Set the simulation configuration of Vivado: right-click Simulation Settings under SIMULATION.
2) In the Simulation Settings window, configure the settings as shown below. Here the simulation run time is set to 50 ms (set it as needed).
3) Add the stimulus test file, click the Add Sources icon under Project Manager, set it as shown below and click Next.
In the pop-up dialog box, enter the name of the stimulus file. Here we enter the name vtf_led_test.
There is a new vtf_led_test file in the Simulation Sources directory. Double-click to open this file.
You can see that there is only the definition of the module name, nothing else.
6) Next we need to write the contents of the vtf_led_test.v file. First, define the input and output signals, then instantiate the led_test
module and make the led_test program part of this test program. Then add reset and clock stimuli.
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Module Name: vtf_led_test
//////////////////////////////////////////////////////////////////////////////////
module vtf_led_test;
// Inputs
reg sys_clk_p;
reg rst_n ;
wire sys_clk_n;
// Outputs
wire led;
...
initial begin
  // Initialize Inputs
  sys_clk_p = 0;
  rst_n = 0;
  // Wait for global reset to finish
  #1000;
  rst_n = 1;
end
//Create clock
always #2.5 sys_clk_p = ~ sys_clk_p;
assign sys_clk_n = ~sys_clk_p ;
endmodule
7) After writing and saving, vtf_led_test.v automatically becomes the top level of the simulation hierarchy, and below it are the design files
led_test.v.
8) Click the Run Simulation button, and then select Run Behavioral Simulation. Here we will do a behavioral simulation.
That's all.
10. When the simulation window opens as shown below, the simulator automatically runs for the configured 50 ms. Because the LED state changes slowly in this program and simulating that long is time-consuming, we observe the timer[31:0] counter instead. Add it to the Wave window: click uut in the Scope pane, then right-click timer in the Objects pane. Once added, timer is displayed in the Wave window, as shown in the figure below.
11. Click the Restart button marked below to reset, then click the Run All button (patience is required!). You can see that the simulation waveform is consistent with the design. (Note: the longer the simulation time, the more disk space the simulation waveform file occupies. The waveform file is in the xx.sim
We can see that the led signal becomes 1, which means the LED turns on.
Download
1) Connect the JTAG interface of the development board and power on the development board
Note that the DIP switches must be in JTAG mode, that is, all of them should be pulled to "ON". The value represented by "ON" is 0. If the JTAG mode is not used, the
2) Click "Auto Connect" on the "HARDWARE MANAGER" interface to automatically connect the device
3) You can see that JTAG scans the arm and FPGA cores
7) After the download is complete, the PL LED starts to toggle once per second. This completes the walkthrough of the basic Vivado flow. Later chapters introduce how to program the design into Flash, which requires the cooperation of the PS system; a PL-only project cannot be written directly to Flash. This is covered in the FAQ of the chapter "Experience ARM, bare metal output "Hello World"".
Online debugging
Simulation and downloading were introduced above. Simulation does not require programming the board and shows idealized results. The following introduces Vivado's online debugging method for observing internal signals on the running hardware. Vivado has an embedded logic analyzer called ILA, which observes changes of internal signals online and is very helpful for debugging. In this experiment, we observe the signal
1) Click IP Catalog, search for ila in the search box, and double-click the IP of ILA
2) Change the name to ila. Since two signals are to be sampled, set the number of Probes to 2. Sample Data Depth is the sampling depth: the larger it is, the more samples are captured and the more resources are consumed.
3) On the Probe_Ports page, set the Probe width and set the PROBE0 bit width to 32 for sampling timer_cnt.
5) Regenerate Bitstream
7) Now you will see bit and ltx files, click program
8) At this time, the online debugging window pops up and the signal we added appears.
Click the Run button and the signal data will appear.
You can also trigger the acquisition, click "+" in the Trigger Setup window, and select the timer_cnt signal in depth.
Change Radix to U, which is unsigned decimal, and set Value to 199999999, the maximum count of timer_cnt.
Click Run again and you can see that the trigger fires. At this point timer_cnt is displayed in hexadecimal, and the LED toggles at this moment.
The above describes how to add the ILA IP for online debugging. The following describes how to add synthesis attributes to the code to achieve online debugging.
2) Add (* MARK_DEBUG="true" *) before the definition of led and timer_cnt, and save the file.
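As an illustration of where the attribute goes, the sketch below places (* MARK_DEBUG="true" *) directly before the declarations of led and timer_cnt. The surrounding module (its name, the single-ended clk input and the led_out port) is hypothetical and exists only to make the fragment complete; in the actual project the attributes would be added to the existing declarations in led.v.

```verilog
// Marking signals for debug with a synthesis attribute (sketch).
// mark_debug_example, clk and led_out are hypothetical names.
module mark_debug_example(
  input  clk,
  input  rst_n,
  output led_out
);
  // The attribute is written immediately before each declaration
  (* MARK_DEBUG="true" *) reg        led;
  (* MARK_DEBUG="true" *) reg [31:0] timer_cnt;

  always @(posedge clk) begin
    if (!rst_n) begin
      led       <= 1'b0;
      timer_cnt <= 32'd0;
    end
    else if (timer_cnt == 32'd199_999_999) begin
      led       <= ~led;
      timer_cnt <= 32'd0;
    end
    else begin
      timer_cnt <= timer_cnt + 32'd1;
    end
  end

  assign led_out = led;
endmodule
```

Marked signals are preserved through synthesis so that they remain available when the debug core is set up afterwards.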
3) Click Run Synthesis
Click Finish
Click Save
The added ila core constraints can be seen in the xdc file
6) The debugging method is the same as before and will not be repeated here.
Experimental Summary
This chapter introduces how to develop programs on the PL side, including project establishment, constraints, simulation, online debugging, etc.
Many beginners are puzzled that the board has only one 200 MHz clock input. Why 200 MHz, and what if a design needs to run at 100 MHz or 150 MHz? In fact, many FPGA chips integrate PLLs; other manufacturers may use a different name, but they provide similar functional modules. A PLL can multiply and divide the frequency to generate many other clocks. This experiment uses the PLL IP core to
PLL (phase-locked loop) is an important resource in FPGA. Since a complex FPGA system often requires multiple clock signals with different
frequencies and phases, the number of PLLs in an FPGA chip is an important indicator to measure the capabilities of the FPGA chip. In the design of
FPGA, the high-speed design of the clock system is extremely important. A low-jitter, low-latency system clock will increase the success rate of FPGA
design.
This experiment will use PLL to output a square wave to the expansion port on the development board to demonstrate how to use PLL in
Vivado software.
Ultrascale+ series FPGAs use dedicated global and regional IO and clock resources to manage various clock requirements in the design. Clock
Management Tiles (CMT) provide clock frequency synthesis, deskew, and jitter filtering functions.
Each CMT contains an MMCM (mixed-mode clock manager) and a PLL. As shown in the figure below, the input of the CMT can be BUFR, IBUFG, BUFG, GT, BUFH, or local routing (not recommended), and the output must pass through a BUFG or BUFH before use.
MMCM is used to generate different clock signals with a set phase and frequency relationship with a given input clock. MMCM
Phase-locked loops (PLLs) are mainly used for frequency synthesis. Using a PLL, multiple clocks can be generated from one input clock signal. Compared with the MMCM, the PLL cannot perform clock deskew, has no advanced phase adjustment, and has a smaller adjustable range of multipliers and dividers.
For more information about clock resources, I recommend you to read the document "7 Series FPGAs Clocking
This experiment demonstrates how to use the PLL IP core provided by Xilinx to generate clocks of different frequencies and output one of them to an external IO of the FPGA. The detailed design steps are as follows.
1) Create a new pll_test project and click IP Catalog under the Project Manager interface.
2) In the IP Catalog interface, select Clocking Wizard under FPGA Features and Design\Clocking and double-click
3) By default, the name of the Clocking Wizard is clk_wiz_0, so we will not modify it here. In the first interface, Clocking Options, enter the clock
frequency as 200Mhz and select Differential clock capable pin, because the clock input
4) In the Output Clocks interface, select the output of the four clocks clk_out1~clk_out4, with the frequency of 200Mhz,
100Mhz, 50Mhz, 25Mhz. You can also set the phase of the clock output here. We do not set it and keep the default phase.
Click OK to finish.
5) In the pop-up dialog box, click the Generate button to generate the design file of the PLL IP.
6) At this time, a clk_wiz_0.xci IP will be automatically added to our pll_test project. The user can double-click it to modify
Select the IP Sources page and double-click to open the clk_wiz_0.veo file, which provides an example of this IP.
We just need to copy the content in the box into our Verilog program to instantiate the IP.
7) Write a top-level design file that instantiates this PLL IP; the pll_test.v code is as follows. Note that the reset of the PLL is active high: while it is high, the PLL stays in reset and does not work. Many beginners overlook this. Here rst_n is bound to a button, and the button provides an active-low reset, so it must be inverted before being connected to the reset of the PLL.
...
);
wire locked;
clk_wiz_0 clk_wiz_0_inst
(
  // Clock out ports
  .clk_out1(),             // output clk_out1
  .clk_out2(),             // output clk_out2
  .clk_out3(),             // output clk_out3
  .clk_out4(clk_out),      // output clk_out4
  // Status and control signals
  .reset(~rst_n),          // input reset
  .locked(locked),         // output locked
  // Clock in ports
  .clk_in1_p(sys_clk_p),   // input clk_in1_p
  .clk_in1_n(sys_clk_n));  // input clk_in1_n
endmodule
In the program, clk_wiz_0 is instantiated first, and the differential 200 MHz clock signal is fed to clk_in1_p and clk_in1_n of clk_wiz_0.
Note: the purpose of instantiation is to call the instantiated module from the upper-level module to complete the function of the code. In Verilog, the module name used in the instantiation must match the name of the module being instantiated, such as clk_wiz_0, and the module's signal names must match as well, such as clk_in1, clk_out1, clk_out2 and so on. The connection signals are the signals passed between the TOP program and the modules; connection signals between modules must not conflict with each other, otherwise compilation errors occur.
8) After saving the project, pll_test automatically becomes the top file, and clk_wiz_0 becomes a submodule of the pll_test file.
9) Add the xdc pin constraint file pll.xdc to the project. The adding method is as shown in the "PL "Hello World" LED experiment".
You can directly copy the following content and compile it to generate bitstream.
create_clock -period 5.000 -name sys_clk_p -waveform {0.000 2.500} [get_ports sys_clk_p]
Simulation
Add a vtf_pll_test simulation file. After the simulation runs, the PLL lock signal goes high, indicating that the PLL IP has finished initializing. clk_out then outputs a clock signal whose frequency is 1/8 of the input clock frequency, that is, 25MHz. For the simulation method, refer to the "PL "Hello World" LED experiment".
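The stimulus for such a simulation can be sketched as follows. This is only a minimal sketch under stated assumptions, not the original vtf_pll_test file: the module name vtf_pll_test_sketch, the 100 ns reset time, and the run length are hypothetical, and the pll_test instantiation is left as a comment because it needs the generated clk_wiz_0 IP.

```verilog
`timescale 1ns / 1ps
// Hypothetical stimulus sketch; names and timings are assumptions.
module vtf_pll_test_sketch;
    reg  sys_clk_p = 1'b0;          // positive leg of the 200 MHz differential clock
    wire sys_clk_n = ~sys_clk_p;    // negative leg is the inverse
    reg  rst_n     = 1'b0;          // active-low reset, as bound to the button
    wire clk_out;                   // expected 25 MHz, i.e. a 40 ns period (8 x 5 ns)

    // 200 MHz means a 5 ns period, so toggle every 2.5 ns
    always #2.5 sys_clk_p = ~sys_clk_p;

    initial begin
        #100   rst_n = 1'b1;        // release reset after 100 ns
        #20000 $finish;             // leave time for the PLL to lock
    end

    // Assumed port names, taken from the instantiation in pll_test.v:
    // pll_test uut (.sys_clk_p(sys_clk_p), .sys_clk_n(sys_clk_n),
    //               .rst_n(rst_n), .clk_out(clk_out));
endmodule
```

With the device under test connected, the period of clk_out measured after locked goes high should be about 40 ns if the PLL is configured as described.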
On-board verification
Compile the project and generate the pll_test.bit file, then download the bit file to the FPGA.
Connect the ground clip of the oscilloscope probe to ground on the development board (PIN1 of J46), and connect the signal tip to PIN3 of J46 (be careful when measuring so that the probe does not touch other pins and short the power supply to ground).
The oscilloscope now shows the 25MHz clock waveform. The amplitude of the waveform is 3.3V and the duty cycle is 1:1.
To output waveforms of other frequencies, you can switch the clock output to another output of clk_wiz_0, such as clk_out2 or clk_out3, or reconfigure clk_out4 of clk_wiz_0 to the frequency you want. Note that the output clock is derived from the input clock through the PLL's multiplication and division coefficients, so not every clock frequency can be generated exactly by the PLL; the PLL automatically calculates the closest frequency it can actually achieve.
Note also that some users' oscilloscopes have too little bandwidth or too low a sampling rate, so the high-frequency components are attenuated heavily when measuring high-frequency clock signals, and the measured amplitude appears lower.
RAM is a commonly used basic module in FPGA and can be widely used to cache data. It is also the core of ROM and FIFO.
This experiment will introduce how to use the RAM inside the FPGA and how to read and write data to the RAM through the program.
Experimental Principle
Xilinx provides a RAM IP core in VIVADO. We only need to instantiate a RAM through the IP core and then write and read the data stored in the RAM according to the RAM read and write timing. In the experiment, we can observe the RAM read and write timing and the data read from the RAM through the online logic analyzer ila integrated in VIVADO.
Before adding the RAM IP, create a new ram_test project, and then add the RAM IP to the project as follows:
1) Click IP Catalog as shown in the figure below, search for ram in the interface that pops up on the right, find Block Memory Generator, and double-click to open it.
2) Change Component Name to ram_ip, and under Basic, change Memory Type to Simple Dual Port RAM, also known as pseudo dual-port RAM. Generally speaking, "Simple Dual Port RAM" is the most commonly used because it has two ports, one for writing and one for reading.
3) Switch to the Port A Options column and change the RAM bit width Port A Width to 16, which is the data width. Change the RAM depth Port A Depth to 512; the depth is how many data words the RAM can store. Set the enable pin (Enable Port Type) to Always Enabled.
4) Switch to Port B Options, change the RAM bit width Port B Width to 16, and change the Enable Port Type to Always Enabled. You can also choose Use ENB Pin, which acts as a read enable signal. Uncheck Primitives Output Register. This option adds registers to the output data, which can effectively improve timing, but the read data then lags the address by two cycles. In many cases this option is left disabled so that the data lags the address by one cycle.
5) In the Other Options column, unlike ROM, the RAM data does not need to be initialized here; we can write it in the program.
The Simple Dual Port RAM module ports are described as follows:
Signal name    Direction    Description
clka           in           Port A clock input
wea            in           Port A write enable
addra          in           Port A address input
dina           in           Port A data input
clkb           in           Port B clock input
addrb          in           Port B address input
doutb          out          Port B data output
RAM data is written and read on the rising edge of the clock. To write data through Port A, the wea signal must be driven high while the address and the data to be written are provided at the same time. The following figure is the timing diagram for writing to the RAM.
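The port behavior described above can be illustrated with a small behavioral model. This is a sketch only, not the code that Block Memory Generator produces: the module name sdp_ram_sketch and the parameter names are hypothetical, while the port names follow the table above.

```verilog
// Behavioral sketch of a 512 x 16 simple dual-port RAM (assumed model,
// not the generated ram_ip).
module sdp_ram_sketch #(
    parameter DATA_WIDTH = 16,
    parameter ADDR_WIDTH = 9                // 2**9 = 512 words
)(
    input                       clka,
    input                       wea,
    input      [ADDR_WIDTH-1:0] addra,
    input      [DATA_WIDTH-1:0] dina,
    input                       clkb,
    input      [ADDR_WIDTH-1:0] addrb,
    output reg [DATA_WIDTH-1:0] doutb
);
    reg [DATA_WIDTH-1:0] mem [0:(1<<ADDR_WIDTH)-1];

    // Port A: write on the rising edge of clka while wea is high
    always @(posedge clka)
        if (wea)
            mem[addra] <= dina;

    // Port B: read only; doutb is updated one clock after addrb is presented
    always @(posedge clkb)
        doutb <= mem[addrb];
endmodule
```

Because doutb is registered once, the data lags the address by one cycle, which corresponds to leaving Primitives Output Register unchecked.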
Port B cannot write data; it can only read data from the RAM. As long as the address is provided, the data is usually valid in the next cycle.
Next, we write a RAM test program. To test the RAM, we write a string of consecutive data through Port A, writing it only once, and read it back through Port B, using the logic analyzer to view the data. The code is as follows:
//-------------------------------------------------------------
reg [8:0] w_addr; //RAM PORTA write address
reg [15:0] w_data; //RAM PORTA write data
reg wea; //RAM PORTA enable
reg [8:0] r_addr; //RAM PORTB read address
wire [15:0] r_data; //RAM PORTB read data
wire clk ;
IBUFDS IBUFDS_inst (
    .O(clk),            // Buffer output
    .I(sys_clk_p),      // Diff_p buffer input (connect directly to top-level port)
    .IB(sys_clk_n)      // Diff_n buffer input (connect directly to top-level port)
);
w_addr <= w_addr ; //Keep the address and data values and write RAM only once
w_data <= w_data ;
end
else
begin
w_addr <= w_addr + 1'b1;
w_data <= w_data + 1'b1;
end
end
end
end
//-------------------------------------------------------------
//Instantiate RAM
ram_ip ram_ip_inst (
    .clka  (clk    ),   // input clka
    .wea   (wea    ),   // input [0 : 0] wea
    .addra (w_addr ),   // input [8 : 0] addra
    .dina  (w_data ),   // input [15 : 0] dina
    .clkb  (clk    ),   // input clkb
    .addrb (r_addr ),   // input [8 : 0] addrb
    .doutb (r_data )    // output [15 : 0] doutb
);
endmodule
To see the data read from the RAM in real time, we add the ila tool to observe the data of RAM PORTB. For how to generate the ila, refer to the "PL's "Hello World" LED Experiment".
Binding Pins
create_clock -period 5.000 -name sys_clk_p -waveform {0.000 2.500} [get_ports sys_clk_p]
Simulation
For the simulation method, refer to the "PL's "Hello World" LED experiment". The simulation results are as follows. From the figure, we can see that the data written to address 1 is 0002, and in the next cycle, that is, at time 2, valid data is read out.
On-board verification
Generate the bitstream and download the bit file to the FPGA. Next, we use ila to observe the data read from the RAM.
In the Waveform window, set r_addr equal to 0 as the trigger condition. We can see that r_addr keeps increasing from 0 to 1ff, and r_data changes as r_addr changes. The data in r_data is exactly the 512 data words we wrote into the RAM. Note that when a new address appears on r_addr, the corresponding data appears on r_data two clock cycles later, which is consistent with the simulation results.
An FPGA has an SRAM architecture, so its program is lost after power-off. How, then, can a ROM be implemented in an FPGA? We can use the RAM resources inside the FPGA to implement a ROM. It is not a real ROM; instead, the initialization values are written into the RAM each time the FPGA is powered on. This experiment introduces how to use the ROM inside the FPGA and how the program reads data from the ROM.
Experimental Principle
Xilinx provides a ROM IP core in VIVADO. We only need to instantiate a ROM through the IP core and read the data stored in the ROM according to the ROM read timing. In the experiment, we can observe the ROM read timing and the data read from the ROM through the online logic analyzer ila integrated in VIVADO.
Programming
Since it is a ROM, we must prepare its data in advance; when the FPGA is actually running, we simply read the data pre-stored in the ROM. The on-chip ROM of a Xilinx FPGA supports initialization data configuration. As shown in the figure below, we can create a file named rom_init.coe. Note that the suffix must be ".coe"; the name in front can be arbitrary.
The content format of the ROM initialization file is very simple, as shown in the figure below. The first line defines the data format, where 16 means the ROM data is in hexadecimal. Lines 3 to 34 are the initialization data of this 32*8bit ROM. Each line of data is followed by a comma, and the last line ends with a semicolon.
After writing rom_init.coe, save it. Next, we will start designing and configuring the ROM IP core.
Before adding the ROM IP, create a new project called rom_test, and then add the ROM IP to the project. The method is as follows:
1) Click IP Catalog as shown in the figure below, search for rom in the interface that pops up on the right, find Block Memory Generator, and double-click to open it.
2) Change the Component Name to rom_ip, and under the Basic column, change the Memory Type to Single Port ROM.
3) Switch to Port A Options, change the ROM bit width Port A Width to 8 and the ROM depth Port A Depth to 32, change Enable Port Type to Always Enabled, and uncheck Primitives Output Register.
4) Switch to the Other Options column, check Load Init File, click Browse, and select the .coe file you created previously.
The ROM program design is very simple. In the program, we only need to change the ROM address on each clock, and the ROM outputs the data stored at the address presented on the previous clock. An ila is instantiated to observe how the address and data change. The code is as follows:
module rom_test(
    input sys_clk_p,    //system clock 200MHz positive pin
    input sys_clk_n,    //system clock 200MHz negative pin
    input rst_n         //reset, low level is effective
);
wire sys_clk ;
IBUFDS IBUFDS_inst (
    .O(sys_clk),        // Buffer output
    .I(sys_clk_p),      // Diff_p buffer input (connect directly to top-level port)
    .IB(sys_clk_n)      // Diff_n buffer input (connect directly to top-level port)
);
    if(!rst_n)
        rom_addr <= 10'd0;
    else
//Instantiate ROM
rom_ip rom_ip_inst (
);
//Instantiate logic analyzer
ila_0 ila_m0 (
    .clk    (sys_clk),
    .probe0 (rom_addr),
    .probe1 (rom_data)
);
endmodule
Binding Pins
create_clock -period 5.000 -name sys_clk_p -waveform {0.000 2.500} [get_ports sys_clk_p]
Simulation
The simulation results are as follows, which are in line with expectations. Like the RAM read data, the data also lags behind the address by one cycle.
On-board verification
Using address 0 as the trigger condition, we can see that the read data is consistent with the simulation.
FIFO is a very important module in FPGA applications and is widely used for data caching and for passing data across clock domains. Using FIFOs well is key to FPGA design, and using them flexibly is a necessary skill for an FPGA engineer. This chapter mainly introduces the use of the FIFO IP.
Experimental Principle
FIFO stands for First In, First Out: data that enters first leaves first, and data that enters last leaves last. Xilinx provides a FIFO IP core in VIVADO. We only need to instantiate a FIFO through the IP core and then write and read the data stored in the FIFO according to the FIFO read and write timing. In essence, a FIFO adds a number of functions on top of RAM. The typical structure of a FIFO is shown below. It is mainly divided into a read part and a write part. In addition, there are status signals: the empty and full signals and the data count signals. The biggest difference from RAM is that a FIFO has no address lines and cannot read data at random addresses. Random reading means that data at any chosen address can be read at will; a FIFO cannot do this. The advantage is that the address lines do not need to be controlled constantly.
Although the user cannot see any address lines, there are still address operations inside the FIFO that control the read and write interfaces of the RAM. The FIFO addresses during read and write operations are shown in the following figure, where the depth value is the maximum number of data words the FIFO can store. In the initial state, both the read and write addresses are 0. After one data word is written into the FIFO, the write address is incremented by 1. After one data word is read from the FIFO, the read address is incremented by 1. At this point the FIFO is empty, because one data word was written and one was read out.
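The internal address handling described above can be sketched with a single-clock FIFO. This is an illustrative model only, not the FIFO Generator implementation, and it is deliberately synchronous, whereas the experiment uses independent clocks: the module name sync_fifo_sketch, the synchronous reset, and the extra pointer bit used to tell full from empty are all assumptions.

```verilog
// Single-clock FIFO sketch showing the hidden read and write addresses.
module sync_fifo_sketch #(
    parameter DATA_WIDTH = 16,
    parameter ADDR_WIDTH = 9                // depth = 2**ADDR_WIDTH = 512
)(
    input                       clk,
    input                       rst,        // synchronous reset, active high
    input                       wr_en,
    input      [DATA_WIDTH-1:0] din,
    input                       rd_en,
    output reg [DATA_WIDTH-1:0] dout,
    output                      full,
    output                      empty
);
    reg [DATA_WIDTH-1:0] mem [0:(1<<ADDR_WIDTH)-1];
    reg [ADDR_WIDTH:0]   wr_ptr;            // write address plus one wrap bit
    reg [ADDR_WIDTH:0]   rd_ptr;            // read address plus one wrap bit

    // Equal pointers: everything written has been read out
    assign empty = (wr_ptr == rd_ptr);
    // Same address but different wrap bit: writer is one full lap ahead
    assign full  = (wr_ptr[ADDR_WIDTH] != rd_ptr[ADDR_WIDTH]) &&
                   (wr_ptr[ADDR_WIDTH-1:0] == rd_ptr[ADDR_WIDTH-1:0]);

    always @(posedge clk) begin
        if (rst) begin
            wr_ptr <= 0;
            rd_ptr <= 0;
        end else begin
            if (wr_en && !full) begin       // write, then advance write address
                mem[wr_ptr[ADDR_WIDTH-1:0]] <= din;
                wr_ptr <= wr_ptr + 1'b1;
            end
            if (rd_en && !empty) begin      // read, then advance read address
                dout   <= mem[rd_ptr[ADDR_WIDTH-1:0]];
                rd_ptr <= rd_ptr + 1'b1;
            end
        end
    end
endmodule
```

After one write followed by one read, both pointers hold the value 1, so empty is asserted again, which matches the sequence described in the text.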
You can think of a FIFO as a pool. The write channel adds water and the read channel drains water. If water is added and drained continuously and the water is added faster than it is drained, the FIFO will become full; if water is still added when it is full, it overflows. If water is drained faster than it is added, the FIFO will become empty. Therefore, it is a demanding task to control the timing and speed of adding and draining water so that there is always water in the pool, that is, to judge the empty and full states and choose the right time to write or read.
According to the read and write clocks, FIFOs can be divided into synchronous FIFOs (the read and write clocks are the same) and asynchronous FIFOs (the read and write clocks are different). Synchronous FIFO control is relatively simple, so it is not introduced here. This section mainly introduces the control of an asynchronous FIFO, where the read clock is 75MHz and the write clock is 100MHz. In the experiment, we can observe the FIFO read and write timing and the data read from the FIFO through the logic analyzer ila.
Before adding FIFO IP, create a new fifo_test project, and then add FIFO IP to the project as follows:
1) Click IP Catalog in the figure below, search for fifo in the interface that pops up on the right, find FIFO Generator, and double-click to open it.
2) In the pop-up configuration page, you can choose to separate the read and write clocks or use the same one. Generally speaking, we use
FIFO to cache data, and the clock speeds on both sides are usually different. Therefore, independent clocks are the most commonly
used. Here we select "Independent Clocks Block RAM" and click "Next" to the next configuration page.
3) Switch to the Native Ports column, select 16 for data width and 512 for FIFO depth. You can set it according to your needs in actual
use. There are two Read Modes: Standard FIFO, which is the common FIFO, where the data lags behind the read signal by one
cycle, and First Word Fall Through, a data pre-fetch mode, referred to as FWFT mode. That is, FIFO will pre-fetch a data, and when
the read signal is valid, the corresponding data is also valid. Let's first do the standard FIFO experiment.
4) Switch to the Data Counts column and enable Write Data Count (how much data has been written to the FIFO) and Read Data Count
(how much data can be read from the FIFO). In this way, we can use these two values to see how much data is inside the FIFO.
Signal name    Direction    Description
rst            in           Reset signal, active high
wr_clk         in           Write clock input
rd_clk         in           Read clock input
din            in           Write data
wr_en          in           Write enable, active high
rd_en          in           Read enable, active high
dout           out          Read data
full           out          Full signal
Data is written to and read from the FIFO on the rising edge of the clock. When the wr_en signal is high, data is written into the FIFO. When the almost_full signal is valid, only one more data word can be written into the FIFO; once that word is written, the full signal is pulled high. If wr_en is still valid while the FIFO is full, that is, writing continues, the overflow signal of the FIFO becomes valid, indicating an overflow.
When the rd_en signal is high, data is read from the FIFO, and the data is valid in the next cycle. valid is the data valid signal. almost_empty means that only one more data word can be read; once that word is read, the empty signal becomes valid. If reading continues, underflow becomes valid, indicating an underflow.
From the FWFT mode read timing diagram, it can be seen that when the rd_en signal becomes valid, the valid data D0 is already present on the data lines. The data is valid at the same time as rd_en, with no extra cycle of delay. This is the difference from the standard FIFO.
For details about FIFO, please refer to the pg057 document, which can be downloaded from the Xilinx official website.
We design for an asynchronous FIFO and use a PLL to generate two clocks, 100MHz and 75MHz, for the write clock and the read clock respectively; that is, the write clock frequency is higher than the read clock frequency.
assign fifo_rst_n = locked ;    //Assign the PLL LOCK signal to the reset signal of the fifo
assign wr_clk     = clk_100M ;  //Assign 100MHz clock to write clock
assign rd_clk     = clk_75M ;   //Assign 75MHz clock to read clock
reg[2:0] write_state;
reg[2:0] next_write_state;
always@(*)
begin
    case(write_state)
        W_IDLE:
            begin
                if(wcnt == 8'd79)   //Wait for a certain period of time after reset; in safety circuit mode the wait is 60 cycles of the slowest clock
                    next_write_state <= W_FIFO;
                else
                    next_write_state <= W_IDLE;
            end
        W_FIFO:
            next_write_state <= W_FIFO;   //Always writing FIFO status
        default:
            next_write_state <= W_IDLE;
    endcase
end
//In write FIFO state, if not full, write data to FIFO
assign wr_en = (write_state == W_FIFO) ? ~full : 1'b0;
//When write enable is valid, add 1 to the write data value
always@(posedge wr_clk or negedge fifo_rst_n)
begin
    if(!fifo_rst_n)
        w_data <= 16'd1;
    else if (wr_en)
        w_data <= w_data + 1'b1;
end
localparam R_IDLE =1 ;
localparam R_FIFO =2;
reg[2:0] read_state;
reg[2:0] next_read_state;
always@(*)
begin
    case(read_state)
        R_IDLE:
            begin
                if(rcnt == 8'd59)   //Wait for a certain period of time after reset
                    next_read_state <= R_FIFO;
                else
                    next_read_state <= R_IDLE;
            end
        R_FIFO:
            next_read_state <= R_FIFO ;   //Always reading FIFO status
        default:
            next_read_state <= R_IDLE;
    endcase
end
//In IDLE state, that is, after reset, the counter counts
always@(posedge rd_clk or negedge fifo_rst_n)
begin
    if(!fifo_rst_n)
        rcnt <= 8'd0;
    else if(read_state == R_IDLE)
        rcnt <= rcnt + 1'b1;
    else
        rcnt <= 8'd0;
end
//In read FIFO state, if it is not empty, read data from FIFO
assign rd_en = (read_state == R_FIFO) ? ~empty : 1'b0;
//-------------------------------------------------------------
//Instantiate FIFO
fifo_ip fifo_ip_inst (
endmodule
In the program, the lock signal of the PLL is used as the reset of the fifo, and the 100MHz clock is assigned to the write clock.
Note that the FIFO is configured with the safety circuit by default. This feature ensures that the input signals reaching the internal RAM are synchronous. In this case, after an asynchronous reset it is necessary to wait for 60 cycles of the slowest clock. In this experiment that means 60 cycles of the 75MHz clock, so the 100MHz clock needs approximately (100/75)x60=80 cycles.
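The wait counts can also be derived from the clock frequencies instead of being typed in by hand. The following is a minimal sketch with hypothetical names (fifo_reset_wait_sketch, WR_CLK_MHZ, RD_CLK_MHZ, SLOW_CYCLES); it only reproduces the arithmetic above and is not part of the original test program.

```verilog
// Sketch: derive the post-reset wait counts from the clock frequencies.
module fifo_reset_wait_sketch;
    localparam integer WR_CLK_MHZ  = 100;   // write clock
    localparam integer RD_CLK_MHZ  = 75;    // read clock, the slowest clock
    localparam integer SLOW_CYCLES = 60;    // safety circuit requirement

    // Read side already runs on the slowest clock
    localparam integer RD_WAIT = SLOW_CYCLES;
    // Write side: same time span in write clocks, rounded up
    localparam integer WR_WAIT =
        (SLOW_CYCLES * WR_CLK_MHZ + RD_CLK_MHZ - 1) / RD_CLK_MHZ;

    initial
        $display("write wait = %0d cycles, read wait = %0d cycles",
                 WR_WAIT, RD_WAIT);
endmodule
```

With these values the expression gives 80 write cycles and 60 read cycles. Counters that start from 0 reach these counts at 79 and 59, which is what the comparisons wcnt == 8'd79 and rcnt == 8'd59 in the state machines test for.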
Therefore, in the write state machine, wait for 80 cycles to enter the write FIFO state
In the read state machine, wait 60 cycles to enter the read state
If the FIFO is not empty, keep reading data from the FIFO
Instantiate two logic analyzers and connect the signals of the write channel and the read channel respectively.
Simulation
The following is the simulation result. After the write enable wr_en becomes valid, data starts to be written, with an initial value of 0001. It takes some time from the start of writing until the empty signal is deasserted, because internal synchronization is required. Once the FIFO is no longer empty, data starts to be read out. The read data lags rd_en by one cycle.
Later in the waveform you can see that when the FIFO is full, no more data is written, as designed in the program, and wr_en is pulled low. Why does the FIFO become full? Because the write clock is faster than the read clock. If the write clock and the read clock are swapped, that is, the read clock is faster, the FIFO will not become full.
If you change the FIFO Read Mode to First Word Fall Through, the simulation results are as follows. It can be seen that when rd_en is valid, the data is also valid, with no one-cycle difference.
On-board verification
Generate the bit file and download it; two ila cores will appear. Let's look at the write channel first. You can see that when the full signal is high, wr_en is low and no more data is written.
If the rising edge of rd_en is used as the trigger condition, click Run and then press Reset, which is the PL KEY1 we bound. The following result appears, which is consistent with the simulation. In standard FIFO mode, the data lags rd_en by one cycle.
The button is the simplest and most commonly used peripheral in FPGA design. This chapter uses a button detection experiment to check whether the keys on the development board work normally, to understand the concrete relationship between the hardware description language and the FPGA, and to learn how to use Vivado RTL ANALYSIS.
As can be seen from the figure, the circuit is at a high level when the button is released and at a low level when it is pressed.
In the LED circuit of the AXU3EG/AXU4EV/AXU5EV development boards, the LED lights up when the signal is at a high level.
Programming
This program is deliberately kept simple. It uses simple hardware description language so that the relationship between the hardware description language and the FPGA hardware can be understood.
First, we pass the key input through a NOT gate and then through two stages of D flip-flops. The signal passing through a D flip-flop is latched on the rising edge of the clock input.
Before coding in a hardware description language, we have already worked out the hardware design, which is the normal development flow. Once the hardware design idea is in place, it can be implemented by schematic entry, Verilog HDL, or VHDL.
1) First, create a button test project, add the Verilog test code, and complete the compilation and pin allocation processes.
);
wire clk ;
IBUFDS IBUFDS_inst (
    .O(clk),            // Buffer output
    .I(sys_clk_p),      // Diff_p buffer input (connect directly to top-level port)
    .IB(sys_clk_n)      // Diff_n buffer input (connect directly to top-level port)
);
endmodule
3) Analyzing the RTL diagram, we can see that the first-stage D flip-flop takes the inverted input and the second-stage D flip-flop takes its input directly, which matches the expected design.
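The structure seen in the RTL diagram, a NOT gate followed by two D flip-flops, can be sketched as below. This is an assumed minimal version for a single key and a single LED, not the tutorial's full source: the module name key_sync_sketch and the signal names key, led, q0, and q1 are hypothetical.

```verilog
// Sketch: inverted key input passed through two D flip-flops.
module key_sync_sketch(
    input  clk,         // clock from the IBUFDS output
    input  key,         // low when the button is pressed
    output led          // high lights the LED
);
    reg q0;             // first-stage D flip-flop
    reg q1;             // second-stage D flip-flop

    always @(posedge clk) begin
        q0 <= ~key;     // first stage latches the inverted key
        q1 <= q0;       // second stage latches the first stage directly
    end

    assign led = q1;
endmodule
```

When the key is released, key is high, so led is low and the LED is off. Pressing the key drives led high two clock edges later.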
On-board verification
After the Bit file is downloaded to the development board, the "PL LED" on the development board is off. Press the "PL KEY" button to turn on the "PL LED".
This article mainly explains how to use PWM to control LED to achieve the effect of breathing light.
Experimental Principle
As shown in the figure below, an N-bit counter is used, whose maximum value is 2 to the power of N minus 1 and whose minimum value is 0. The counter accumulates with "period" as the step value. When it passes the maximum value, it overflows and enters the next accumulation cycle. When the counter value is greater than or equal to "duty", the pulse output is high; otherwise the output is low. This gives the pulse output with adjustable duty cycle shown by the red line in the figure, while "period", which can be understood as the step value of the counter, adjusts the pulse frequency.
When square waves with different duty cycles drive the LED, the LED shows different brightness, so adjusting the duty cycle of the square wave adjusts the brightness of the LED.
Experimental design
The PWM module design is very simple, which has been explained in the above principle, so I will not explain the principle here.
Signal name    Direction    Description
clk            in           Clock input
rst            in           Asynchronous reset input, active high
period         in           PWM period (frequency) control. period = PWM output frequency * (2 to the power of N) / system clock frequency. The larger N is, the finer the frequency resolution.
duty           in           Duty cycle control. duty cycle = duty / (2 to the power of N) * 100%
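The period formula in the table can be checked with a few lines of simulation-only code. This is a sketch under assumptions: the module name pwm_period_sketch and the function name pwm_period are hypothetical, and the two clock frequencies are example values rather than a statement about which clock the design actually uses.

```verilog
// Sketch: evaluate period = f_pwm * 2^N / f_clk for N = 32.
module pwm_period_sketch;
    localparam integer N = 32;

    function [N-1:0] pwm_period;
        input [63:0] pwm_hz;    // desired PWM output frequency in Hz
        input [63:0] clk_hz;    // system clock frequency in Hz
        begin
            // 64-bit arithmetic so that pwm_hz * 2^32 does not overflow
            pwm_period = (pwm_hz << N) / clk_hz;
        end
    endfunction

    initial begin
        $display("200 Hz at 50 MHz  : period = %0d", pwm_period(200, 50_000_000));
        $display("200 Hz at 200 MHz : period = %0d", pwm_period(200, 200_000_000));
    end
endmodule
```

For 200 Hz with a 50 MHz clock the result is 17179, the value used in the test code later in this chapter. With a 200 MHz clock the same formula gives 4294, so the step value needs to be recalculated whenever the module is clocked at a different frequency.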
)(
reg[N - 1:0] period_r;      //period register
reg[N - 1:0] duty_r;        //duty register
reg[N - 1:0] period_cnt;    //period counter
reg pwm_r;
assign pwm_out = pwm_r;
always@(posedge clk or posedge rst)
begin
    if(rst==1)
    begin
        period_r <= { N {1'b0} };
        duty_r   <= { N {1'b0} };
    end
    else
    begin
        period_r <= period;
        duty_r   <= duty;
    end
end
//period counter, step is period value
always@(posedge clk or posedge rst)
begin
    if(rst==1)
        period_cnt <= { N {1'b0} };
    else
        period_cnt <= period_cnt + period_r;
end
always@(posedge clk or posedge rst)
begin
    if(rst==1)
        pwm_r <= 1'b0;
    else
    begin
        if(period_cnt >= duty_r)    //if period counter is bigger or equals to duty value, then set pwm value to high
            pwm_r <= 1'b1;
        else
            pwm_r <= 1'b0;
    end
end
So how do we achieve the breathing light effect? A breathing light goes from dark to bright and then from bright to dark. The brightness is adjusted by the duty cycle, so we mainly control the duty cycle, that is, the value of duty.
In the following test code, the value of period sets the PWM frequency to 200Hz. In the PWM_PLUS state the duty value is increased; when it reaches the maximum value, pwm_flag is set to 1 and the duty value starts to decrease; when it reaches the minimum value, the duty value starts to increase again, and the cycle repeats. The PWM_GAP state is the adjustment interval, which lasts 100us.
//200MHz
localparam US_COUNT = CLK_FREQ ; //1 us counter
localparam MS_COUNT = CLK_FREQ*1000 ; //1 ms counter
reg[3:0] state;
reg[31:0] timer; //duty adjustment counter
wire clk ;
IBUFDS IBUFDS_inst (
    .O(clk),            // Buffer output
    .I(sys_clk_p),      // Diff_p buffer input (connect directly to top-level port)
    .IB(sys_clk_n)      // Diff_n buffer input (connect directly to top-level port)
);
    else
        case(state)
            IDLE:
            begin
                period <= 32'd17179;    //The pwm step value, pwm 200Hz(period = 200*2^32/50000000)
                state  <= PWM_PLUS;
                duty   <= DUTY_MIN_VALUE;
            end
            PWM_PLUS :
            begin
                if(duty > DUTY_MAX_VALUE - DUTY_STEP)   //if duty is bigger than DUTY MAX VALUE minus DUTY_STEP, begin to minus duty value
                begin
                    pwm_flag <= 1'b1 ;
                    duty     <= duty - DUTY_STEP ;
                end
                else
                begin
                    pwm_flag <= 1'b0 ;
                    duty     <= duty + DUTY_STEP ;
                end
            end
ax_pwm_m0(
    .clk     (clk),
    .rst     (~rst_n),
    .period  (period),
    .duty    (duty),
    .pwm_out (pwm_out)
);
endmodule
Download Verification
Generate bitstream and download bit file, you can see the PL LED light produces breathing light effect. PWM is a commonly
used module, such as fan speed control, motor speed control and so on.
This chapter uses the UART interface circuit on the PL side of the development board to implement UART data transmission.
Programming
The serial port described here refers to asynchronous serial communication, that is, UART (Universal Asynchronous Receiver/Transmitter). This experimental program sends "HELLO ALINX" to the serial port every second. If data is received on RXD, the received data is sent back out, implementing a loopback function.
[Figure: system block diagram. Labels: USB Serial Port, USB to Serial chip CP2102, FPGA, RxD, UART Receive Program, UART control program]
The message frame starts with a low start bit, followed by 7 or 8 data bits, an optional parity bit, and one or more high stop bits. When the receiver sees the start bit, it knows that data is about to be sent and tries to synchronize with the transmitter clock frequency. If parity is selected, the UART adds a parity bit after the data bits. The parity bit can be used to assist in error detection. During reception, the UART removes the start and stop bits from the message frame, performs a parity check on the incoming byte, and converts the data byte from serial to parallel. The UART transmission timing is shown in the figure below:
From the waveform, we can see that the start bit is low, while the stop bit and the idle state are both high; that is, the line is high when no data is being transmitted. We can use this feature to receive data accurately: when a falling edge occurs, we consider that a data transfer is about to begin.
Common serial communication baud rates include 2400, 9600, 115200, and so on. The sending and receiving baud rates must be the same for correct communication. Baud rate refers to the number of bits transmitted in 1 second, including the start bit, data bits, check bit, and stop bit. If the communication baud rate is set to 9600, the duration of one data bit is 1/9600 seconds. The baud rate in this experiment is 115200.
The serial port receiving module uart_rx is a parameterized, configurable module. The parameter "CLK_FRE" defines the system clock frequency of the receiving module in MHz, and the parameter "BAUD_RATE" is the baud rate. The state transition diagram of the receiving state machine is as follows:
The "S_IDLE" state is the idle state, which is entered after power-on. If a falling edge occurs on the signal "rx_pin", it is taken as the start bit of the serial port and the state machine enters "S_START". After one BIT time the start bit ends and the state machine enters the data bit receiving state. The data in this experiment is 8 bits wide. After the data bits are received, the state machine enters the "S_STOP" state. "S_STOP" does not wait for a full BIT time but only half a BIT time, because waiting for a full BIT time could miss the start bit of the next data. Finally the state machine enters the "S_DATA" state and passes the received data to other modules. One point is worth mentioning for this module: to satisfy the sampling theorem, each data bit is sampled at the midpoint of the baud rate counter when receiving.
Note: There is no parity bit in this experiment.
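The falling edge on "rx_pin" that moves the state machine from "S_IDLE" to "S_START" can be detected with two registers. This is a minimal sketch and not the uart_rx source itself: the module name rx_negedge_sketch and the signals rx_d0, rx_d1, and rx_negedge are hypothetical.

```verilog
// Sketch: detect the start-bit falling edge on rx_pin.
module rx_negedge_sketch(
    input  clk,
    input  rst_n,       // asynchronous reset, active low
    input  rx_pin,      // serial input, high when idle
    output rx_negedge   // one-clock pulse on a high-to-low transition
);
    reg rx_d0;          // rx_pin delayed by one clock
    reg rx_d1;          // rx_pin delayed by two clocks

    always @(posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            rx_d0 <= 1'b1;      // reset to the idle level to avoid a false edge
            rx_d1 <= 1'b1;
        end else begin
            rx_d0 <= rx_pin;
            rx_d1 <= rx_d0;
        end
    end

    // Previous sample high and current sample low: falling edge
    assign rx_negedge = rx_d1 & ~rx_d0;
endmodule
```

Registering rx_pin twice also brings the asynchronous serial input into the clk domain before the state machine uses it.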
Signal name      Direction    Width (bit)    Description
clk              in           1              System clock
rst_n            in           1              Asynchronous reset, active low
rx_data          out          8              Received serial port data (8-bit data)
rx_data_valid    out          1              Received serial port data is valid (active high)
rx_data_ready    in           1              The user can receive data from the receiving module
rx_pin           in           1              Serial port receive data input
The design of the sending module uart_tx is similar to that of the receiving module and also uses a state machine. The state transition diagram is as follows:
After power-on, the state machine enters the "S_IDLE" idle state. If there is a send request, it enters the send start bit state "S_START". After the start bit is sent, it enters the send data bit state "S_SEND_BYTE"; after the data bits are sent, it enters the send stop bit state "S_STOP"; and after the stop bit is sent, it returns to the idle state. In the data sending module, the data written from the top module is passed directly to the register 'tx_reg', and the state transitions of the state machine drive the serial transmission protocol through the 'tx_reg' register.
Signal name      Direction    Width (bit)    Description
clk              in           1              System clock
rst_n            in           1              Asynchronous reset, active low
tx_data          in           8              Serial port data to be sent (8-bit data)
tx_data_valid    in           1              Serial port data to be sent is valid (active high)
tx_data_ready    out          1              The sending module is ready to send data
tx_pin           out          1              Serial port send data output
In the sending and receiving modules, the parameter CYCLE is declared, which is the count value for one UART bit period. The count is performed under the 50MHz clock. The user only needs to set the two parameters CLK_FRE and BAUD_RATE.
The test program sends "HELLO ALINX\r\n" to the serial port once every second. If serial port data is received, the received data is passed directly to the sending module and returned. "\r\n" here has the same meaning as in the C language: carriage return and line feed. The test program instantiates the sending module and the receiving module separately and passes the parameters in. The baud rate is set to 115200.
            if(tx_data_valid == 1'b1 && tx_data_ready == 1'b1 && tx_cnt < 8'd12) //Send 12 bytes data
            begin
                tx_cnt <= tx_cnt + 8'd1; //Send data counter
            end
            else if(tx_data_valid && tx_data_ready) //last byte sent is complete
            begin
                tx_cnt <= 8'd0;
            end
            else if(~tx_data_valid)
            begin
            end
        // ...
        default:
            state <= IDLE;
    endcase
end
//combinational logic
//Send "HELLO ALINX\r\n"
always@(*)
begin
    case(tx_cnt)
        8'd0 :  tx_str <= "H";
        8'd1 :  tx_str <= "E";
        8'd2 :  tx_str <= "L";
        8'd3 :  tx_str <= "L";
        8'd4 :  tx_str <= "O";
        8'd5 :  tx_str <= " ";
        8'd6 :  tx_str <= "A";
        8'd7 :  tx_str <= "L";
        8'd8 :  tx_str <= "I";
        8'd9 :  tx_str <= "N";
        8'd10:  tx_str <= "X";
        8'd11:  tx_str <= "\r";
        8'd12:  tx_str <= "\n";
        default:tx_str <= 8'd0;
    endcase
end
uart_rx#
(
    .CLK_FRE(CLK_FRE),
    .BAUD_RATE(115200)
) uart_rx_inst
(
    .clk            (sys_clk        ),
    .rst_n          (rst_n          ),
    .rx_data        (rx_data        ),
    .rx_data_valid  (rx_data_valid  ),
    .rx_data_ready  (rx_data_ready  ),
    .rx_pin         (uart_rx        )
);
uart_tx#
(
    .CLK_FRE(CLK_FRE),
    .BAUD_RATE(115200)
) uart_tx_inst
(
    .clk            (sys_clk        ),
    .rst_n          (rst_n          ),
    .tx_data        (tx_data        ),
    .tx_data_valid  (tx_data_valid  ),
    .tx_data_ready  (tx_data_ready  ),
    .tx_pin         (uart_tx        )
);
Simulation
Here we add a serial port receive stimulus file, vtf_uart_test.v, to simulate UART reception.
The stimulus sends the byte 0xa3 to the uart_rx module of the serial port design. Each bit is sent at a baud rate of 115200, preceded by one start bit.
The simulation results are as follows: once the program has received the 8 data bits, rx_data_valid goes high and rx_data[7:0] holds the value 0xa3.
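A stimulus of this kind might look like the sketch below. It is not the contents of vtf_uart_test.v: the module name, the task name send_byte and the rounded bit time are assumptions, and the frame follows the usual UART layout (one start bit, eight data bits sent LSB first, one stop bit).

```verilog
`timescale 1ns / 1ps
// Sketch only: in a real testbench rx_pin would be wired to the rx_pin port
// of the uart_rx instance under test.
module uart_rx_stimulus_sketch;

localparam BIT_NS = 8680;   // one bit at 115200 baud, about 8680.6 ns, rounded

reg rx_pin = 1'b1;          // the serial line idles high

// Drive one UART frame onto rx_pin.
task send_byte(input [7:0] data);
    integer i;
    begin
        rx_pin = 1'b0;                  // start bit
        #BIT_NS;
        for (i = 0; i < 8; i = i + 1) begin
            rx_pin = data[i];           // data bits, LSB first
            #BIT_NS;
        end
        rx_pin = 1'b1;                  // stop bit
        #BIT_NS;
    end
endtask

initial begin
    #(10 * BIT_NS);                     // leave the line idle for a while
    send_byte(8'ha3);
    #(2 * BIT_NS);
    $finish;
end

endmodule
```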
Experimental testing
Open the serial port debugging tool, select "COM79" as the port (choose the port that matches your own setup), set the baud rate to 115200, the parity to None, the data bits to 8 and the stop bits to 1, and then click "Open Serial Port". This software is in the example folder.
After opening the serial port, you receive "HELLO ALINX" once every second. Enter the text you want to send in the input box of the sending area and click "Manual Send"; you can see that the characters you sent are echoed back.
This chapter introduces RS485 data transmission using the AN3485 module.
Experimental Principle
Compared with the previous UART experiment, RS485 uses differential signal transmission and is half-duplex, that is, data can only be transmitted in one direction at a time. The bus has only the differential signals A and B, and the signals connected to the ARM or FPGA are DE (direction selection), DI (input signal) and RO (output signal).
According to the MAX3485 documentation, in the sending direction, when DE is 1 (output enabled) and DI is 1, the differential signals A and B are 1 and 0; when DI is 0, they are 0 and 1.
In the receiving direction, when DE is 0 and the difference between A and B is greater than or equal to +0.2V, RO is 1; otherwise it is 0.
Programming
Since RS485 is half-duplex, we need to define a transmission protocol for the handshake. The first byte is set to 8'h55, indicating the beginning of a data frame, and it is followed by the length of the transmitted data. Because of the FIFO size limit (256), the length range is 1~255.
uart_tx and uart_rx are the same as in the UART experiment, so here we only need to modify uart_test.
In the initial state, DE is set to 0 (input), and the design waits for data from the host computer and caches it in the FIFO, whose size is set to 256. It then switches DE to 1 (output), reads the received data from the FIFO and sends it out. Note that the cached data excludes the leading 8'h55 and the length byte. In the RCV_HEAD state, the design checks whether the received byte is the frame header 8'h55.
In the RCV_COUNT state, if the data length is not greater than 0, it jumps to the IDLE state; if it is greater than 0, it enters the RCV_DATA state. In the RCV_DATA state, the data is written into the FIFO and the data length is checked; once all the data has been received, the direction of RS485 is switched to output.
When switching the bus direction, to ensure reliable operation, a delay of 1ms is applied in the WAIT state before the direction changes.
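The receive side of this protocol can be sketched as a small state machine. This is an illustration under stated assumptions, not the project's uart_test: the state names, the 8'h55 header, the length byte and the 1ms delay come from the text, while the port list, the CLK_HZ parameter, the single-cycle rx_data_valid pulse, the send_done input and the SEND state are hypothetical.

```verilog
// Sketch only: ports, CLK_HZ, send_done and the SEND state are assumptions.
module rs485_frame_rx_sketch #(
    parameter CLK_HZ = 50000000       // assumed system clock frequency in Hz
)(
    input            clk,
    input            rst_n,
    input      [7:0] rx_data,
    input            rx_data_valid,   // assumed: one-cycle pulse per received byte
    input            send_done,       // assumed: pulse once the FIFO has been sent back
    output reg       de,              // 0: receive (input), 1: transmit (output)
    output reg       fifo_wren,
    output reg [7:0] fifo_din
);
localparam WAIT_1MS = CLK_HZ / 1000;  // clock cycles in 1 ms

localparam IDLE      = 3'd0;
localparam RCV_HEAD  = 3'd1;
localparam RCV_COUNT = 3'd2;
localparam RCV_DATA  = 3'd3;
localparam WAIT      = 3'd4;
localparam SEND      = 3'd5;

reg [2:0]  state;
reg [7:0]  remain;      // payload bytes still expected
reg [31:0] wait_cnt;

always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
        state     <= IDLE;
        de        <= 1'b0;
        fifo_wren <= 1'b0;
        fifo_din  <= 8'd0;
        remain    <= 8'd0;
        wait_cnt  <= 32'd0;
    end else begin
        fifo_wren <= 1'b0;                       // default: no FIFO write
        case (state)
        IDLE: begin
            de    <= 1'b0;                       // listen on the bus
            state <= RCV_HEAD;
        end
        RCV_HEAD:
            if (rx_data_valid && rx_data == 8'h55) state <= RCV_COUNT;
        RCV_COUNT:
            if (rx_data_valid) begin
                remain <= rx_data;
                state  <= (rx_data == 8'd0) ? IDLE : RCV_DATA;
            end
        RCV_DATA:
            if (rx_data_valid) begin
                fifo_wren <= 1'b1;               // header and length are not stored
                fifo_din  <= rx_data;
                remain    <= remain - 8'd1;
                if (remain == 8'd1) begin        // this was the last payload byte
                    wait_cnt <= 32'd0;
                    state    <= WAIT;
                end
            end
        WAIT:
            if (wait_cnt == WAIT_1MS - 1) begin
                de    <= 1'b1;                   // turn the bus around after 1 ms
                state <= SEND;
            end else
                wait_cnt <= wait_cnt + 32'd1;
        SEND:
            if (send_done) state <= IDLE;        // back to receiving
        default: state <= IDLE;
        endcase
    end
end
endmodule
```

Keeping DE low until the full frame has arrived, and waiting before raising it, avoids driving the bus while the host adapter may still be transmitting.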
Then the data in the FIFO is sent. The SEND_WAIT state controls the read enable signal fifo_rden and determines the data
Experimental testing
We use a USB to serial device to connect the A and B of RS485_1 to the A and B of the device respectively through Dupont cables.
Open the serial port tool, set the serial port number and baud rate, select hexadecimal transmission, and send data starting with 8'h55. Click
Send, and you can see the returned data in the receiving window.
Hardware Introduction
The PL side of the development board has a 16-bit DDR4, which greatly facilitates us to migrate the previous FPGA project to the ZYNQ system.
13.2.1 Create a PL-side DDR4 test project and configure the DDR4 IP
1) Search for "mig" in the search box of "IP Catalog" to quickly find "Memory Interface Generator".
2) Component Name can be modified. For Controller/PHY Mode, select "Controller and physical layer". Select 200MHz for the reference clock, i.e. 5003ps, select "MT40A512M16HA-083E" for Memory Part, and set Data Width to 16.
3) Generate
The main function of the other code is to read and write the DDR4 and check whether the data is consistent. It is not described in detail here; please refer to the project code.
Add mark_debug debugging in mem_test.v. For the specific operation process, please refer to PL's "Hello World" LED experiment
After the bit file is generated, use JTAG to download it to the development board. The MIG_1 window will display DDR4 calibration information.
Experimental Summary
This experiment uses the PL-side Verilog code to directly read and write ddr4. We can also configure ddr4 as an AXI interface.
Earlier, we introduced the LED flashing experiment just to understand the basic development process of Vivado. This experiment is more complicated than the LED flashing experiment. It generates color bars for HDMI output, which is also the basis for our later study of display and video processing. The experiment does not involve the PS system. From the experimental design, it can be seen that if you want to use the ZYNQ chip very well,
Hardware Introduction
Since the only display interface on the development board is DP, which is on the PS side, and the PL side has no HDMI interface, we use the AN9134 HDMI expansion module to realize HDMI display. It encodes 24-bit RGB and outputs TMDS differential signals. The SIL9134 has powerful functions, and this experiment only uses a small part of them to display RGB24 video data.
The SIL9134 chip must have its registers configured over the I2C bus to work properly. From the schematic diagram, we can see that the I2C bus is connected to IO on the PL side, so the chip can be configured directly from the PL.
Programming
[Block diagram: sys_clk feeds video_pll, which outputs video_clk and clk_100MHz. color_bar sends 8-bit R, G and B data and HS/VS/DE to the SI9134. i2c_config, together with lut_si9134 (lut_index, lut_data), drives SCL and SDA. The SI9134 outputs Tmds_data0_p/n, Tmds_data1_p/n, Tmds_data2_p/n and Tmds_clk_p/n.]
This experiment displays color bars over HDMI. The experiment contains the video timing and color bar generation module "color_bar.v", the I2C Master register configuration module "i2c_config.v" and the configuration data lookup table module "lut_si9134.v".
The code is not introduced here line by line; you can read it yourself.
A brief introduction:
The top-level module top.v is the top-level file of the project. It mainly instantiates four sub-modules: the clock module video_pll, the color bar generation module color_bar, the I2C configuration module i2c_config and the configuration lookup table module lut_si9134.
The color bar generation module color_bar.v generates VGA-format color bars in 8 colors, namely white, yellow, cyan, green, purple, red, blue and black. It generates color bars with a resolution of 1920x1080 and a refresh rate of 60Hz, which is the so-called 1080P high-definition video image. The module therefore outputs the R (8 bits), G (8 bits) and B (8 bits) image signals together with the row synchronization, column synchronization and data valid signals.
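The color selection alone can be sketched as a lookup on the horizontal pixel position. This is an illustration, not the contents of color_bar.v: the module and port names are hypothetical, the eight bars are assumed to be of equal width (1920 / 8 = 240 pixels), and "purple" is rendered as full-intensity magenta.

```verilog
// Sketch only: maps the active-pixel column to one of the eight bar colors.
module color_bar_sketch (
    input      [11:0] x,     // active-pixel column, 0..1919 (assumed input)
    output reg [23:0] rgb    // {R[7:0], G[7:0], B[7:0]}
);
localparam BAR_W = 1920 / 8; // 240 pixels per bar

always @(*) begin
    if      (x < BAR_W * 1) rgb = 24'hFFFFFF; // white
    else if (x < BAR_W * 2) rgb = 24'hFFFF00; // yellow
    else if (x < BAR_W * 3) rgb = 24'h00FFFF; // cyan
    else if (x < BAR_W * 4) rgb = 24'h00FF00; // green
    else if (x < BAR_W * 5) rgb = 24'hFF00FF; // purple (magenta)
    else if (x < BAR_W * 6) rgb = 24'hFF0000; // red
    else if (x < BAR_W * 7) rgb = 24'h0000FF; // blue
    else                    rgb = 24'h000000; // black
end
endmodule
```

A complete generator also needs horizontal and vertical counters that produce the sync and data valid signals; only the color mapping is shown here.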
The clock module video_pll outputs a 100MHz clock and a 1080P pixel clock of 148.5MHz. To generate the clock IP, click IP Catalog under the Project Manager directory, and then select FPGA
Add the following xdc constraint file to the project, and add clock and HDMI related pins in the constraint file.
Save the project and compile to generate a bit file, connect the HDMI module to the J45 expansion port, and connect the HDMI interface to the HDMI display.
Please note that 1920x1080@60Hz is used here, please make sure your monitor supports this resolution.
Experimental Summary
This experiment is a preliminary contact with video display and video knowledge, which is not the focus of Zynq learning, so it is not
introduced in detail, but Zynq is widely used in the field of video processing, and learners need to have a good basic knowledge. In the experiment,
only PL is used to drive the HDMI chip, including I2C register configuration. Of course, it is more appropriate to use PS to configure I2C.
In the HDMI output experiment, the HDMI display principle and display mode were explained. This experiment introduces how to use FPGA to realize digital
Through this experiment, we can have a deeper understanding of HDMI display mode.
Experimental Principle
The experiment uses a character conversion tool to convert characters into hexadecimal coe files and stores them in a single-port ROM IP core.
The converted data is read out from the ROM and displayed on the HDMI.
Programming
The character display routine adds an osd_display module to the HDMI display project.
It reads the converted character data stored in the ROM IP core and displays it in the specified area. The program block diagram is shown below:
[Block diagram: sys_clk feeds video_pll, which outputs video_clk and clk_100MHz. color_bar sends 8-bit R, G and B data and HS/VS/DE to osd_display, which contains timing_gen_xy and osd_rom and passes 8-bit R, G and B data and HS/VS/DE on to the SI9134. i2c_config, together with lut_si9134 (lut_index, lut_data), drives SCL and SDA. The SI9134 outputs Tmds_data0_p/n, Tmds_data1_p/n, Tmds_data2_p/n and Tmds_clk_p/n.]
1) In the "timing_gen_xy" module, two counters, "x_cnt" and "y_cnt", are defined according to the HDMI timing standard. These two counters generate the "x" and "y" coordinates of the HDMI display. The program uses "vs_edge" and "de_falling", which represent the field synchronization start signal and the data valid end signal respectively. The principle is shown in the figure below:
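The coordinate counting can be sketched as follows. This is a simplified illustration rather than the project's timing_gen_xy: the signal names vs_edge and de_falling come from the text, while the module name, the omission of the pass-through video ports, and the assumption that a rising edge of i_vs marks the start of a frame are mine.

```verilog
// Sketch only: assumes a rising edge of i_vs marks the start of a frame.
module timing_gen_xy_sketch (
    input             clk,
    input             rst_n,
    input             i_vs,
    input             i_de,
    output reg [11:0] x,     // column of the current active pixel
    output reg [11:0] y      // row of the current active line
);
reg vs_d, de_d;              // inputs delayed by one clock

wire vs_edge    =  i_vs & ~vs_d;   // field sync start: new frame
wire de_falling = ~i_de &  de_d;   // data valid end: one line finished

always @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
        vs_d <= 1'b0;
        de_d <= 1'b0;
    end else begin
        vs_d <= i_vs;
        de_d <= i_de;
    end
end

// x counts pixels while data is valid and returns to 0 between lines
always @(posedge clk or negedge rst_n) begin
    if (!rst_n)    x <= 12'd0;
    else if (i_de) x <= x + 12'd1;
    else           x <= 12'd0;
end

// y counts finished lines and returns to 0 at the start of each frame
always @(posedge clk or negedge rst_n) begin
    if (!rst_n)          y <= 12'd0;
    else if (vs_edge)    y <= 12'd0;
    else if (de_falling) y <= y + 12'd1;
end
endmodule
```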
Signal    Direction  Description
rst_n     in         Asynchronous reset input, active low
clk       in         External clock input
i_hs      in         Line sync signal
i_vs      in         Field sync signal
i_de      in         Data valid signal
i_data    in         color_bar data
2) The following describes how to store text information in ROM IP. First, you need to generate a .coe file that can be recognized by XILINX FPGA.
First, find the "FPGA Font Extraction" tool in the project folder.
Enter the characters you want to display in the "Character Input" box of the extraction tool. The font and character height can be customized.
After the settings are completed, click the "Convert" button. You can see the converted character dot matrix size, dot matrix width and height in the lower left corner of the interface.
The width and height of the dot matrix are 144x32 here, which needs to be consistent with the definition in the osd_display program:
Click the "Save" button to save the file to the source file directory of this example. It should be noted that the save type should be
Find the generated .coe file and open it, and you can see the following:
The process of calling the single-port ROM IP core has been introduced in the previous ROM usage. Set it to Single Port ROM
Add the osd.coe file as shown below (find the coe file generated earlier), and click the "OK" button when completed:
3) The osd_display module includes the timing_gen_xy module and the osd_rom module. It reads the character data stored in osd_rom: if the data is 1, the OSD area displays the foreground color red (showing the ALINX characters stored in the ROM); if the data is 0, OSD
Set the area valid signal so that the characters are displayed in this area. The starting coordinates are set to (9, 9), and the area size can be
Many people may not understand why the ROM read address is [15:3], which means that a new byte is read only once every eight clock cycles. This is because one dot of a character occupies only 1 bit, while the data width of the ROM is 8 bits. One byte is therefore fetched every eight cycles and each of its bits is examined in turn, converting each character dot into a pixel on the image.
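The relationship between the dot counter, the ROM address and the bit selection can be sketched as below. This is an illustration, not the project's osd_display: the port names, the frame_start and region_active inputs, the bit order inside each byte and the zero-latency ROM read are all assumptions (a real block-RAM ROM returns data one or two clocks later, so the video signals would need matching delays).

```verilog
// Sketch only: shows why the ROM address is the dot counter without its low 3 bits.
module osd_pixel_sketch (
    input             pclk,
    input             rst_n,
    input             frame_start,    // assumed: one-cycle pulse at the start of a frame
    input             region_active,  // assumed: high for every pixel inside the OSD area
    input      [7:0]  rom_q,          // byte returned by the ROM for rom_addr
    input      [23:0] i_data,         // background (color bar) pixel
    output     [12:0] rom_addr,
    output     [23:0] o_data
);
reg [15:0] dot_addr;                  // index of the current character dot

always @(posedge pclk or negedge rst_n) begin
    if (!rst_n)             dot_addr <= 16'd0;
    else if (frame_start)   dot_addr <= 16'd0;
    else if (region_active) dot_addr <= dot_addr + 16'd1;
end

assign rom_addr = dot_addr[15:3];         // advances once every eight dots
wire   dot      = rom_q[dot_addr[2:0]];   // one bit of the byte per pixel (bit order assumed)

// A dot value of 1 shows the red foreground; otherwise the background passes through.
assign o_data   = (region_active && dot) ? 24'hFF0000 : i_data;
endmodule
```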
Signal    Direction  Description
rst_n     in         Asynchronous reset input, active low
pclk      in         External clock input
i_hs      in         Line sync signal
i_vs      in         Field sync signal
i_de      in         Data valid signal
i_data    in         color_bar data
Experimental phenomenon
Connect the development board and the display. For the connection method, refer to the "HDMI Output Experiment" tutorial. Note that the connectors must not be hot-swapped while powered on. After downloading the experimental program, you can see the display showing characters with the color bars as the background.
The board is an HDMI output device and can only be displayed through an HDMI display device. Do not try to use the HDMI interface of a laptop for display, because the laptop is also an output device.
The default character display position is at coordinates (9, 9). In addition, users can modify the following pos_y and pos_x judgments
Based on the HDMI output experiment, this chapter introduces the display of 7-inch LCD screen.
Hardware Introduction
AN970 LCD touch screen module consists of TFT LCD screen, capacitive touch screen and driver board. For details, please refer to
Programming
The experiment in this chapter is actually very simple. The biggest difference from HDMI display is that it does not require i2c configuration and the output can be RGB.
Because the resolution of the LCD screen is 800x480, the macro definition in video_define.v needs to be modified.
The output clock frequency of the PLL is also changed to 33MHz, which is the pixel clock for 800x480.
In addition, ax_pwm is instantiated in top.v to adjust the brightness of the LCD screen; it is set to 200Hz with a 30% duty cycle.
Experimental phenomenon
Connect the LCD screen to the J45 expansion port, download the program, and you can see the color bar display.
Character Display
This experiment uses an ADC. The ADC module used in the experiment is the AN706, with a maximum sampling rate of 200kHz and a resolution of 16 bits. In the experiment, the two inputs of the AN706 are displayed on HDMI as waveforms, so we can observe the waveforms in a more intuitive way.
Experimental Principle
The AD7606 is an integrated 8-channel simultaneous sampling data acquisition system that integrates input amplifiers, overvoltage protection circuits, second-order analog antialiasing filters, analog multiplexers, a 16-bit 200 kSPS SAR ADC, a digital filter, a 2.5 V reference voltage source, a reference voltage buffer, and high speed serial and parallel interfaces.
The AD7606 is powered by a single +5V supply and can handle ±10V and ±5V true bipolar input signals while sampling at throughput rates of up to 200 kSPS. The input clamp protection circuitry can tolerate voltages up to ±16.5V.
Regardless of the sampling frequency, the analog input impedance of the AD7606 is 1M ohm. Single-supply operation, on-chip filtering and high input impedance eliminate the need for driver op amps and an external bipolar power supply.
The AD7606 antialiasing filter has a 3dB cutoff frequency of 22kHz and provides 40dB of antialias rejection when sampling at 200 kSPS. The flexible digital filter is pin driven; it improves the signal-to-noise ratio (SNR) and reduces the 3dB bandwidth.
The AD7606 can sample all eight analog input channels simultaneously. When the two conversion start pins (CONVSTA and CONVSTB) are tied together, all channels are sampled synchronously. The rising edge of this common CONVST signal starts sampling on all analog inputs.
The AD7606 has an internal oscillator for conversion. The conversion time for all ADC channels is tCONV. The BUSY signal tells the user that a conversion is in progress: when the rising edge of CONVST is applied, BUSY goes to a logic high level. The falling edge of the BUSY signal is used to return all eight sample-and-hold amplifiers to tracking mode. The falling edge also indicates that the data of the 8 channels can now be read from the parallel bus DB[15:0].
In the hardware circuit design of the AN706 8-channel AD module, we add pull-up resistors to the three configuration pins of the AD7606.
The AD7606 supports either an external reference voltage input or the internal reference voltage. If an external reference is used, the REFIN/REFOUT pin of the chip requires an external 2.5V reference source. If the internal reference is used, the REFIN/REFOUT pin outputs the internal 2.5V reference. The REF SELECT pin selects between the internal and the external reference. In this circuit design, the internal reference voltage of the AD7606 is chosen because of its high accuracy (2.49V~2.505V).
REF SELECT
High level Using the internal reference voltage 2.5V
The AD7606 AD conversion data acquisition can be in parallel mode or serial mode. The user can set
PAR/SER/BYTE SEL pin level to set the communication mode. When we design, we choose parallel mode to read AD7606
AD data.
PAR/SER/BYTE SEL
Low level Select Parallel Interface
The input range of the AD7606 AD analog signal can be set to ±5V or ±10V. When the ±5V input range is set,
1LSB=152.58uV; when the input range is set to ±10V, 1LSB=305.175uV. Users can set the RANGE pin voltage
RANGE
Low level Analog signal input range selection: ±5V
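The two LSB sizes quoted for the ±5V and ±10V ranges follow from dividing the full-scale range by the 65536 codes of a 16-bit converter. A short check:

```latex
\[
\text{LSB}_{\pm 5\,\text{V}} = \frac{10\ \text{V}}{2^{16}} = \frac{10\ \text{V}}{65536} \approx 152.588\ \mu\text{V},
\qquad
\text{LSB}_{\pm 10\,\text{V}} = \frac{20\ \text{V}}{2^{16}} = \frac{20\ \text{V}}{65536} \approx 305.176\ \mu\text{V}
\]
```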
The AD7606 contains an optional digital first-order sinc filter, which should be used in applications with lower throughput rates or where a higher signal-to-noise ratio is required. The oversampling ratio of the digital filter is controlled by the oversampling pins OS[2:0]. The following table provides the
In the hardware design of the AN706 module, OS[2:0] is brought out to the external interface, so the FPGA or CPU can control the OS[2:0] pin levels to select whether to use the filter and so achieve higher measurement accuracy.
The output coding of the AD7606 is two's complement. The designed code transitions occur midway between successive integer LSB values (i.e. 1/2LSB and 3/2LSB). The LSB size of the AD7606 is FSR/65536. The ideal transfer characteristic of the AD7606 is shown in the figure below:
Programming
The display part of this experiment is based on the previous HDMI display color bar experiment. Grid lines and waveforms are superimposed on the color bar.
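[Block diagram: the AN706 module connects to ad7606_if, which feeds two ad7606_sample instances (channel 1 acquisition and channel 2 acquisition) clocked from adc_pll. Each instance writes its waveform buffer (first channel and second channel) through buf_wr, buf_addr and buf_data. The diagram also shows video_pll and i2c_config with SCL and SDA.]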
The ad7606_if module is the interface module of the AN706 and acquires the 8 AD input channels of the AD7606. Following the timing of the AD7606 chip, it generates the AD conversion signal ad_convstab and, after the ADC busy signal is deasserted, generates the chip select signal.
Signal           Direction  Width (bit)  Description
clk              in         1            System clock
rst_n            in         1            Asynchronous reset, active low
adc_data         in         16           ADC data input
ad_busy          in         1            ADC busy signal
first_data       in         1            First channel data indication signal
ad_os            out        3            ADC oversampling
ad_cs            out        1            ADC chip select
ad_rd            out        1            ADC read signal
ad_reset         out        1            ADC reset signal
ad_convstab      out        1            ADC conversion start signal
adc_data_valid   out        1            ADC data valid
ad_ch1           out        16           ADC channel 1 data
ad_ch2           out        16           ADC channel 2 data
ad_ch3           out        16           ADC channel 3 data
ad_ch4           out        16           ADC channel 4 data
ad_ch5           out        16           ADC channel 5 data
ad_ch6           out        16           ADC channel 6 data
ad_ch7           out        16           ADC channel 7 data
ad_ch8           out        16           ADC channel 8 data
The ad7606_sample module mainly performs the data conversion for a single AD7606 channel. First, the input data is converted to unsigned data. Finally, only the upper 8 bits are kept, so the data width becomes 8 bits (to be compatible with other 8-bit AD module programs).
In addition, 1280 samples are collected each time, and the next 1280 samples are collected after a pause.
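The signed-to-unsigned conversion and the truncation to 8 bits can be sketched in a few lines. This is a combinational illustration with hypothetical module and wire names, not the project's ad7606_sample, which also handles the sample counter and buffer writes.

```verilog
// Sketch only: converts a two's-complement sample to unsigned and keeps the upper 8 bits.
module adc_to_u8_sketch (
    input  [15:0] adc_data,      // two's-complement sample from the AD7606
    output [7:0]  adc_buf_data   // unsigned 8-bit value for the waveform buffer
);
// Inverting the sign bit is the same as adding 32768:
// -32768 -> 0, 0 -> 32768, +32767 -> 65535.
wire [15:0] adc_unsigned = {~adc_data[15], adc_data[14:0]};

assign adc_buf_data = adc_unsigned[15:8];
endmodule
```

With this mapping a 0V input lands near the middle of the 8-bit range (8'h80), negative full scale at 8'h00 and positive full scale at 8'hFF.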
Signal           Direction  Width (bit)  Description
adc_clk          in         1            ADC system clock
rst              in         1            Asynchronous reset, active high
adc_data         in         16           ADC data input
adc_data_valid   in         1            ADC data valid
adc_buf_wr       out        1            ADC data write enable
adc_buf_addr     out        12           ADC data write address
adc_buf_data     out        8            Unsigned 8-bit ADC data
The grid_display module mainly overlays grid lines on the video image. In this experiment, the color bar video is input and then overlaid with a grid. The grid occupies video display positions 9 to 1018 horizontally (left to right) and 9 to 308 vertically (top to bottom).
Signal           Direction  Width (bit)  Description
pclk             in         1            Pixel clock
rst_n            in         1            Asynchronous reset, active low
i_hs             in         1            Video line sync input
i_vs             in         1            Video field sync input
i_de             in         1            Video data valid input
i_data           in         24           Video data input
o_hs             out        1            Video line sync output with grid
o_vs             out        1            Video field sync output with grid
o_de             out        1            Video data valid output with grid
o_data           out        24           Video data output with grid
The wav_display module mainly overlays the waveform data on the video. The module contains a dual-port RAM: the write port is written by the ADC acquisition module, and the read port is read by the display module. While the grid display area is valid, each displayed line reads the AD data values stored in the RAM and compares them with the Y coordinate to decide whether to draw the waveform at that pixel.
Signal           Direction  Width (bit)  Description
pclk             in         1            Pixel clock
rst_n            in         1            Asynchronous reset, active low
wave_color       in         24           Waveform color
adc_clk          in         1            ADC module clock
adc_buf_wr       in         1            ADC data write enable
adc_buf_addr     in         12           ADC data write address
adc_buf_data     in         8            ADC data, unsigned
i_hs             in         1            Video line sync input
i_vs             in         1            Video field sync input
i_de             in         1            Video data valid input
i_data           in         24           Video data input
o_hs             out        1            Video line sync output with grid
o_vs             out        1            Video field sync output with grid
o_de             out        1            Video data valid output with grid
o_data           out        24           Video data output with grid
The timing_gen_xy module is a submodule of the other modules. It generates the coordinates of the video image; the x coordinate increases from left to right.
Signal           Direction  Width (bit)  Description
clk              in         1            System clock
rst_n            in         1            Asynchronous reset, active low
i_hs             in         1            Video line sync input
i_vs             in         1            Video field sync input
i_de             in         1            Video data valid input
i_data           in         24           Video data input
o_hs             out        1            Video line sync output
o_vs             out        1            Video field sync output
o_de             out        1            Video data valid output
o_data           out        24           Video data output
x                out        12           Coordinate x output
y                out        12           Coordinate y output
Experimental phenomenon
The connection is as follows. Insert the AN706 module and connect the SMA connector to the waveform generator. To make the display easy to observe, set the waveform generator frequency in the range of 500Hz~10kHz with a maximum voltage amplitude of 10V. The result is the effect diagram at the beginning of this chapter.
Hardware Introduction
The BlackGold high-speed AD module AN9238 is a 2-channel, 65MSPS, 12-bit analog-to-digital converter module. The module uses the AD9238 chip from ADI for AD conversion. One AD9238 chip supports 2 channels of AD input conversion, so the module supports 2-channel AD input. The analog inputs are single-ended, the input voltage range is -5V~+5V, and the interface is an SMA socket. The module has a standard 2.54mm-pitch 40-pin female header for connecting to the FPGA development board.
Parameter
[Block diagram of the AN9238 module: each of the two inputs (AD1 and AD2) enters through an SMA interface, passes through an AD8065 op amp and an AD8138 single-ended-to-differential stage, and reaches the dual-channel AD chip AD9238. The 12-bit AD1 and AD2 data and the 65M AD clock connect to the 40-pin female header.]
For the specific reference design of AD9238 circuit, please refer to the chip manual of AD9238.
The single-ended inputs AD1 and AD2 enter through the two SMA connectors J5 and J6. The single-ended input voltage range is -5V~+5V.
The AD8065 chip and the voltage divider resistors on the board reduce the -5V~+5V input voltage to -1V~+1V. If the user wants to accept a wider input voltage range, only the resistance values of the front-end voltage divider need to be changed.
The following table is a comparison table of analog input signals and the voltage after the AD8065 op amp output:
-5V -1V
0V 0V
+5V +1V
The input voltage of -1V~+1V is converted into a differential signal (VIN+ - VIN-); the common mode level of the differential signal is determined by the CML pin of the AD.
The following table compares the analog input signal with the AD8138 differential output:
AD analog input value    AD8065 op amp output    AD8138 differential output (VIN+ - VIN-)
0V                       0V                      0V
3) AD9238 conversion
By default, AD is configured as offset binary. The value of AD conversion is shown in the figure below:
In the module circuit design, the VREF value of AD9238 is 1V, so the final analog signal input
0V 0V 0V 100000000000
From the table we can see that when -5V is input, the digital value converted by the AD9238 is the smallest, and when +5V is input it is the largest.
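As a rough illustration of offset binary coding, and assuming an ideal, linear signal chain in which the -5V~+5V module input maps onto the full 12-bit range (gain and offset errors ignored), the output code can be estimated as:

```latex
\[
\text{code} \approx 2048 + 2048 \cdot \frac{V_{in}}{5\ \text{V}}, \qquad 0 \le \text{code} \le 4095
\]
\[
V_{in} = 0\ \text{V} \;\Rightarrow\; \text{code} = 2048 = 1000\,0000\,0000_2,
\qquad
V_{in} = -5\ \text{V} \;\Rightarrow\; \text{code} = 0
\]
```

The 0V case agrees with the code 100000000000 listed in the table; at +5V the estimate saturates at the maximum code 4095.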
The digital output of the AD9238 dual-channel AD is a +3.3V CMOS output mode, with two channels (A and
B) Independent data and clock. AD data is converted on the rising and falling edges of the clock. When AD is available on the FPGA side
Programming
The display part of this experiment is based on the previous HDMI display color bar experiment. Grid lines and waveforms are superimposed on the color bar.
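[Block diagram: the AN9238 module feeds two ad9238_sample instances (channel 1 acquisition and channel 2 acquisition) clocked from adc_pll. Each instance writes its waveform buffer (first channel and second channel) through buf_wr, buf_addr and buf_data. The video path applies the grid line overlay followed by the two AD waveform superposition stages. The diagram also shows video_pll.]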
The ad9238_sample module mainly performs the data conversion for a single AN9238 channel. Only the upper 8 bits of the final data are kept, so the data width becomes 8 bits (to be compatible with other 8-bit AD module programs). In addition, 1280 samples are collected each time; the module then waits for a while and collects the next 1280 samples.
Signal           Direction  Width (bit)  Description
adc_clk          in         1            ADC system clock
rst              in         1            Asynchronous reset, active high
adc_data         in         12           ADC data input
adc_buf_wr       out        1            ADC data write enable
adc_buf_addr     out        12           ADC data write address
adc_buf_data     out        8            Unsigned 8-bit ADC data
The grid_display module mainly overlays grid lines on the video image. In this experiment, the color bar video is input and then overlaid with a grid. The grid occupies video display positions 9 to 1018 horizontally (left to right) and 9 to 308 vertically (top to bottom).
Signal           Direction  Width (bit)  Description
pclk             in         1            Pixel clock
rst_n            in         1            Asynchronous reset, active low
i_hs             in         1            Video line sync input
i_vs             in         1            Video field sync input
i_de             in         1            Video data valid input
i_data           in         24           Video data input
o_hs             out        1            Video line sync output with grid
o_vs             out        1            Video field sync output with grid
o_de             out        1            Video data valid output with grid
o_data           out        24           Video data output with grid
The wav_display module mainly overlays the waveform data on the video. The module contains a dual-port RAM: the write port is written by the ADC acquisition module, and the read port is read by the display module. While the grid display area is valid, each displayed line reads the AD data values stored in the RAM and compares them with the Y coordinate to decide whether to draw the waveform at that pixel.
Signal           Direction  Width (bit)  Description
pclk             in         1            Pixel clock
rst_n            in         1            Asynchronous reset, active low
wave_color       in         24           Waveform color
adc_clk          in         1            ADC module clock
adc_buf_wr       in         1            ADC data write enable
adc_buf_addr     in         12           ADC data write address
adc_buf_data     in         8            ADC data, unsigned
i_hs             in         1            Video line sync input
i_vs             in         1            Video field sync input
i_de             in         1            Video data valid input
i_data           in         24           Video data input
o_hs             out        1            Video line sync output with grid
o_vs             out        1            Video field sync output with grid
o_de             out        1            Video data valid output with grid
o_data           out        24           Video data output with grid
The timing_gen_xy module is a submodule of the other modules. It generates the coordinates of the video image; the x coordinate increases from left to right.
Signal           Direction  Width (bit)  Description
clk              in         1            System clock
rst_n            in         1            Asynchronous reset, active low
i_hs             in         1            Video line sync input
i_vs             in         1            Video field sync input
i_de             in         1            Video data valid input
i_data           in         24           Video data input
o_hs             out        1            Video line sync output
o_vs             out        1            Video field sync output
o_de             out        1            Video data valid output
o_data           out        24           Video data output
x                out        12           Coordinate x output
y                out        12           Coordinate y output
Experimental phenomenon
The circuit is connected as follows. Adjust the frequency and amplitude of the signal generator. The input range of the AN9238 is -5V~+5V. To make the waveform easy to observe, the recommended input frequency is 200kHz to 1MHz. Observe the display output: the red waveform is the CH1 input and the blue waveform is the CH2 input. The top horizontal line of the yellow grid represents 5V, the bottom horizontal line represents -5V, the middle horizontal line represents 0V, and each vertical line
This experiment uses ADC and DAC. The ADDA module model used in the experiment is AN108. The maximum sampling rate of ADC is 32Mhz,
the precision is 8 bits, and the maximum sampling rate of DAC is 125Mhz, the precision is 8 bits. In the experiment, DAC is used to output sine waves, and
then ADC is used to collect and display the waveform on the HDMI display.
ADDA Module
Hardware Introduction
As shown in the hardware structure diagram, the DA circuit consists of a high-speed DA chip, a 7th-order Butterworth low-pass filter, an amplitude
output interface. The high-speed DA chip we use is the AD9708 launched by AD. AD9708 is an 8-bit, 125MSPS DA conversion chip with a built-in
1.2V reference voltage and differential current output. The internal structure of the chip is shown in the figure below.
To suppress noise, the differential output of the AD9708 is followed by a 7th-order Butterworth low-pass filter with a bandwidth of 40 MHz. Its frequency response is shown in the figure below.
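As a rough stand-in for the frequency response figure, the following sketch evaluates the textbook magnitude response of an ideal Butterworth low-pass filter. It assumes that the 40 MHz bandwidth is the -3 dB cutoff and that the board filter behaves ideally; the function name `butterworth_gain_db` is ours, and the real circuit will deviate from these numbers.

```python
import math

def butterworth_gain_db(freq_hz, cutoff_hz=40e6, order=7):
    """Gain in dB of an ideal Butterworth low-pass filter (assumed model)."""
    ratio = freq_hz / cutoff_hz
    magnitude = 1.0 / math.sqrt(1.0 + ratio ** (2 * order))
    return 20.0 * math.log10(magnitude)

# Print the gain at a few frequencies of interest.
for f in (1e6, 40e6, 62.5e6, 125e6):
    print(f"{f / 1e6:6.1f} MHz: {butterworth_gain_db(f):7.2f} dB")
```

Under these assumptions the passband is essentially flat at the signal frequencies used in the experiment, while components near the 125 MHz update rate are strongly attenuated.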
After the filter, two high-performance AD8056 op amps with 145 MHz bandwidth convert the differential signal to single-ended and provide amplitude adjustment, so that the whole circuit performs at its best. A 5 kΩ potentiometer is used for amplitude adjustment.
Note: The circuit is not a precision design, so the final output has some error. The waveform amplitude may not reach 10 Vpp, and the waveform may clip. Both are normal.
As shown in the hardware structure diagram, the AD circuit consists of a high-speed AD chip, an attenuation circuit, and a signal input interface. The high-speed AD chip is the AD9280 from Analog Devices, an 8-bit converter with a maximum sampling rate of 32 MSPS.
With the configuration shown in the figure below, the AD voltage input range is set to 0 V to 2 V.
Before the signal enters the AD chip, an attenuation circuit built around an AD8056 scales it. The interface accepts -5 V to +5 V (10 Vpp); after attenuation, the signal fits the input range of the AD chip (0 V to 2 V). The conversion formula is as follows:
Vad = Vin / 5 + 1 (V)
When the input signal Vin = 5 V, the signal at the AD input is Vad = 2 V; when the input signal Vin = -5 V, Vad = 0 V.
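The range mapping described above can be checked numerically. The sketch below assumes the attenuation is linear between the stated endpoints (-5 V maps to 0 V, +5 V maps to 2 V) and models the AD9280 as an ideal 8-bit quantizer over 0 V to 2 V. The function names `attenuate` and `adc_code` are illustrative and do not come from the tutorial's source code.

```python
def attenuate(vin):
    """Map the module input (-5 V .. +5 V) to the AD input (0 V .. 2 V), assuming a linear circuit."""
    if not -5.0 <= vin <= 5.0:
        raise ValueError("input outside the -5 V .. +5 V range")
    return vin / 5.0 + 1.0

def adc_code(vad, full_scale=2.0, bits=8):
    """Ideal unsigned ADC code for a voltage in 0 .. full_scale (assumed ideal quantizer)."""
    max_code = (1 << bits) - 1
    code = round(vad / full_scale * max_code)
    return min(max(code, 0), max_code)   # clamp to the valid code range

for vin in (-5.0, 0.0, 5.0):
    vad = attenuate(vin)
    print(f"Vin = {vin:+.1f} V -> Vad = {vad:.1f} V -> code {adc_code(vad)}")
```

With this model, a 0 V input lands at mid-scale, which is why an unsigned 8-bit sample can represent a bipolar input signal.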
Programming
The program design of this experiment is basically the same as that of the AN706 waveform display experiment, except that the ADDA module has a single AD channel, so here only the collected waveform is overlaid. In addition, the FPGA generates sine wave data from a ROM IP and outputs it to the DA chip, which converts it into an analog sine wave signal. The user only needs to connect the DA and AD ports of the module with a BNC cable to form a loop. The DA sine wave is then shown on the HDMI display.
[Figure: program block diagram. Blocks: video_pll, color_bar, grid_display, wav_display, SI9134, i2c_config, ad9280_sample, ROM, adc_pll, and the ADDA module (DA output, AD collection). Signals: video_hs, video_vs, video_de; grid_hs, grid_vs, grid_de; wave0_hs, wave0_vs, wave0_de; buf_wr, buf_addr, buf_data; dac_clk, adc_clk; SDA, SCL.]
The ad9280_sample module acquires and converts the 8-bit AD data of the AD9280. It collects 1280 samples at a time, then waits for a while before collecting the next 1280 samples.
Signal name    Direction   Width (bit)   Description
adc_clk        in          1             ADC system clock
rst            in          1             Asynchronous reset, active high
adc_data       in          8             ADC data input
adc_buf_wr     out         1             ADC data write enable
adc_buf_addr   out         12            ADC data write address
adc_buf_data   out         8             Unsigned 8-bit ADC data
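The burst behaviour of ad9280_sample can be pictured with a small behavioural model. This is only a sketch: the 1280-sample burst length comes from the text, but the idle time between bursts (`wait_len`) is a hypothetical value, and the function `sample_bursts` is not part of the tutorial's Verilog.

```python
def sample_bursts(adc_samples, burst_len=1280, wait_len=100):
    """Yield (write_enable, address, data) once per ADC clock.

    burst_len follows the text (1280 samples per burst).
    wait_len is a hypothetical idle time; the source does not state it.
    """
    addr = 0
    idle = 0
    for data in adc_samples:
        if idle > 0:
            # Waiting between two bursts: nothing is written.
            idle -= 1
            yield (0, 0, 0)
        else:
            # Inside a burst: write one 8-bit sample at the current address.
            yield (1, addr, data & 0xFF)
            addr += 1
            if addr == burst_len:
                addr = 0
                idle = wait_len

# Small demonstration with a short burst so the pattern is visible.
for event in sample_bursts(range(8), burst_len=4, wait_len=2):
    print(event)
```

The write address restarts at 0 for every burst, so each burst overwrites the previous one in the display buffer.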
The grid_display module overlays grid lines on the video image. In this experiment, the color bar video is the input, and a grid is overlaid on it. The grid occupies display positions 9 to 1018 horizontally (left to right) and 9 to 308 vertically (top to bottom).
Signal name   Direction   Width (bit)   Description
pclk          in          1             Pixel clock
rst_n         in          1             Asynchronous reset, active low
i_hs          in          1             Video line sync input
i_vs          in          1             Video field sync input
i_de          in          1             Video data valid input
i_data        in          24            Video data input
o_hs          out         1             Video line sync output with grid
o_vs          out         1             Video field sync output with grid
o_de          out         1             Video data valid output with grid
o_data        out         24            Video data output with grid
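The region test behind the grid overlay can be sketched as follows. The region limits (9 to 1018 and 9 to 308) are taken from the text; the line spacing of 10 pixels and the yellow grid color are assumptions made for illustration, and the function names are ours.

```python
REGION_X = (9, 1018)   # horizontal extent of the grid, from the text
REGION_Y = (9, 308)    # vertical extent of the grid, from the text

def in_grid_region(x, y):
    """True when the pixel lies inside the grid display region."""
    return REGION_X[0] <= x <= REGION_X[1] and REGION_Y[0] <= y <= REGION_Y[1]

def is_grid_line(x, y, spacing=10):
    """True on the region border or on every `spacing`-th pixel (spacing is hypothetical)."""
    if not in_grid_region(x, y):
        return False
    on_border = x in REGION_X or y in REGION_Y
    on_line = (x - REGION_X[0]) % spacing == 0 or (y - REGION_Y[0]) % spacing == 0
    return on_border or on_line

def overlay(x, y, pixel, grid_color=0xFFFF00):
    """Replace the input pixel with the grid color on grid lines (color is assumed)."""
    return grid_color if is_grid_line(x, y) else pixel

print(hex(overlay(19, 14, 0x000000)))   # on a vertical grid line
print(hex(overlay(14, 14, 0x123456)))   # between grid lines, pixel passes through
```

Pixels outside the region always pass through unchanged, which matches the description of the grid as an overlay on the color bar video.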
The wav_display module overlays the waveform data on the video. It contains a dual-port RAM: the write port is driven by the ADC acquisition module, and the read port is used by the display logic. When the grid display area is valid, the module reads the AD value stored in the RAM for each displayed line and compares it with the Y coordinate to decide whether to draw the waveform at that pixel.
Signal name    Direction   Width (bit)   Description
pclk           in          1             Pixel clock
rst_n          in          1             Asynchronous reset, active low
wave_color     in          24            Waveform color
adc_clk        in          1             ADC module clock
adc_buf_wr     in          1             ADC data write enable
adc_buf_addr   in          12            ADC data write address
adc_buf_data   in          8             ADC data, unsigned
i_hs           in          1             Video line sync input
i_vs           in          1             Video field sync input
i_de           in          1             Video data valid input
i_data         in          24            Video data input
o_hs           out         1             Video line sync output with grid
o_vs           out         1             Video field sync output with grid
o_de           out         1             Video data valid output with grid
o_data         out         24            Video data output with grid
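The comparison between the stored AD value and the Y coordinate can be modelled in a few lines. This sketch makes several assumptions that the text does not confirm: the waveform region starts at x = 9 and y = 9, one RAM entry corresponds to one pixel column, and a sample value of 255 is drawn at the top of the region. The names `build_ram` and `wave_pixel` are illustrative.

```python
REGION_TOP = 9     # top row of the display region (assumed to match the grid)
REGION_LEFT = 9    # left column of the display region (assumed to match the grid)

def build_ram(samples, depth=1280):
    """Model the write port: store unsigned 8-bit samples at increasing addresses."""
    ram = [0] * depth
    for addr, value in enumerate(samples[:depth]):
        ram[addr] = value & 0xFF
    return ram

def wave_pixel(ram, x, y, pixel, wave_color=0xFF0000):
    """Model the read port: return wave_color where the sample matches the row."""
    addr = x - REGION_LEFT
    if not 0 <= addr < len(ram):
        return pixel
    row_of_sample = REGION_TOP + (255 - ram[addr])   # assumed mapping: 255 at the top
    return wave_color if y == row_of_sample else pixel

ram = build_ram([0, 255, 128])
print(hex(wave_pixel(ram, 10, 9, 0x000000)))    # sample 255 is drawn on the top row
print(hex(wave_pixel(ram, 10, 50, 0x000000)))   # other rows keep the input pixel
```

Because only one row per column matches, the waveform appears as a one-pixel trace on top of the grid video.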
The timing_gen_xy module is a submodule used by the other modules. It generates the pixel coordinates of the video image; the x coordinate increases from left to right.
Signal name   Direction   Width (bit)   Description
clk           in          1             System clock
rst_n         in          1             Asynchronous reset, active low
i_hs          in          1             Video line sync input
i_vs          in          1             Video field sync input
i_de          in          1             Video data valid input
i_data        in          24            Video data input
o_hs          out         1             Video line sync output
o_vs          out         1             Video field sync output
o_de          out         1             Video data valid output
o_data        out         24            Video data output
x             out         12            Coordinate x output
y             out         12            Coordinate y output
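One plausible way to derive the coordinates from the sync signals is sketched below as a behavioural model. It assumes that x counts pixels while i_de is high, that y advances on the falling edge of i_de, and that an active-high i_vs restarts the frame. The actual Verilog module may use different edges or polarities, and `gen_xy` is a name chosen here.

```python
def gen_xy(stream):
    """Yield (x, y) for every active pixel in a stream of (vs, de) samples."""
    x = 0
    y = 0
    prev_de = 0
    for vs, de in stream:
        if vs:
            # Field sync (assumed active high) restarts the frame.
            y = 0
        elif prev_de and not de:
            # Falling edge of de: one active line has finished.
            y += 1
        if de:
            yield (x, y)
            x += 1
        else:
            # x restarts at the left edge of each line.
            x = 0
        prev_de = de

# Two lines of three pixels each, separated by blanking.
stream = [(1, 0)] + [(0, 1)] * 3 + [(0, 0)] * 2 + [(0, 1)] * 3 + [(0, 0)]
print(list(gen_xy(stream)))
```

The downstream modules can then test these coordinates against fixed region limits instead of counting sync pulses themselves.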
In addition, this example adds a ROM IP module, which must be initialized with data.
1. Use the waveform data generation tool, found in the software tools and drivers folder. Its icon is as follows:
2. Select the waveform as needed. In this example, a sine wave is selected, and the data length and bit width are left at their defaults.
3. Click the Save button to save the generated data file to the project directory (pay attention to the saved file type):
4. After saving, the following dialog box appears, indicating that the save succeeded. Click OK to close the tool.
Load the .coe file into the generated ROM IP core. The procedure is not repeated here.
Experimental phenomenon
Connect the DAC input of the AN108 to the output of the signal generator. Here we use a dedicated shielded cable; other cables may introduce significant interference.
This chapter introduces an experiment that uses the AN9767 module to generate sine waves on two channels.
Hardware Introduction
The dual-channel 14-bit DA output module AN9767 uses the AD9767 chip from Analog Devices and supports independent dual-channel, 14-bit, 125 MSPS digital-to-analog conversion. The module has a 40-pin female header for connecting to the FPGA development board.
- Layers: 4 layers, with independent power and GND layers;
- Module interface: 40-pin female header;
- Operating temperature: the chips used in the module meet the industrial temperature range;
- Output interface: 2 BNC analog output interfaces (can be connected directly to an oscilloscope with a BNC cable);
[Figure: AN9767 block diagram. The 40-pin expansion port carries the digital data to the AD9767 high-speed dual-channel DAC chip. Each of the two output channels contains a first-stage op amp (current to voltage), a second-stage op amp (voltage amplification), a low-pass filter, and a BNC output interface.]
The AD9767 is a dual-port, high-speed, dual-channel, 14-bit CMOS DAC that integrates two high-quality TxDAC+® cores. It supports an update rate of up to 125 MSPS. The functional block diagram of the AD9767 is as follows:
Each of the two DA channels of the AD9767 has complementary current outputs, IoutA and IoutB. When all 14 bits of the DAC input data are high, IoutA outputs the full-scale current of 20 mA and IoutB outputs 0 mA. The specific relationship between the current and the DAC data is shown in the following formula:
Where IoutFS = 32 x Iref. In the AN9767 module design, the value of Iref is set by resistor R16. With R16 = 1.92 kΩ and the 1.2 V reference of the AD9767, Iref is 0.625 mA and IoutFS is 20 mA.
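The relationship between the DAC code and the two output currents can be illustrated numerically. The sketch below uses the values from the text (IoutFS = 32 x Iref, Iref = 0.625 mA) together with the complementary transfer function commonly given for this 14-bit DAC family, IoutA = code / 16384 x IoutFS and IoutB = (16383 - code) / 16384 x IoutFS. Treat that transfer function as an assumption to be checked against the datasheet; the function name `output_currents` is ours.

```python
I_REF = 0.625e-3         # reference current in amperes, from the text
I_FS = 32 * I_REF        # full-scale current, IoutFS = 32 x Iref
CODES = 1 << 14          # 16384 codes for a 14-bit DAC

def output_currents(code, i_fs=I_FS):
    """Return (IoutA, IoutB) in amperes for a 14-bit code (assumed transfer function)."""
    if not 0 <= code < CODES:
        raise ValueError("code must fit in 14 bits")
    i_a = code / CODES * i_fs
    i_b = (CODES - 1 - code) / CODES * i_fs
    return i_a, i_b

for code in (0x0000, 0x2000, 0x3FFF):
    i_a, i_b = output_currents(code)
    print(f"code 0x{code:04X}: IoutA = {i_a * 1e3:7.4f} mA, IoutB = {i_b * 1e3:7.4f} mA")
```

At the mid-scale code 0x2000 the two currents are almost equal, so their difference is close to zero, which is consistent with the 0 V mid-scale output described for the op amp stages.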
The current output by the AD9767 is converted into a voltage of -1 V to +1 V by the first-stage operational amplifier AD6045. The specific conversion circuit is as follows:
The second-stage op amp amplifies the -1 V to +1 V signal from the first stage to a larger amplitude. The gain can be changed with the adjustable resistor on the board. After the second-stage op amp, the analog output can reach up to -5 V to +5 V.
The following table compares the digital input value with the output after each stage:
DAC data input value   AD9767 current output   First-stage op amp output   Second-stage op amp output
2000 (median value)    0 mA                    0 V                         0 V
The digital interface of the AD9767 can be configured for dual-port mode (Dual) or interleaved mode through the chip's mode pin (MODE). In the AN9767 module design, the AD9767 works in dual-port mode, so the digital input interfaces of the two DA channels are independent. The data timing diagram of dual-port mode is shown below:
The DA data is latched into the chip for DA conversion on the rising edges of the clock CLK and the write signal WRT.
Programming
The example provides a DA test program for the AN9767 module, which outputs a sine wave signal through the module.
The sine wave test program reads the sine wave data stored in a ROM inside the FPGA and outputs the data to the AN9767 module for digital-to-analog conversion, producing an analog sine wave signal.
[Figure: test program block diagram. Recoverable labels: FPGA development board, DAC1.]
The program uses a ROM to store 1024 samples of 14-bit sine wave data. First, prepare the ROM initialization file (a .mif file for an ALTERA development board, a .coe file for a Xilinx development board).
The sine wave ROM data file is generated as follows:
1. Find the tool in the software tools and drivers folder. Its icon is as follows:
2. Select the waveform as needed. In this example, a sine wave is selected, the data length is 1024, the data bit width is 14, and the other settings are left at their defaults:
3. Click the Save button to save the generated data file to the project directory (pay attention to the saved file type):
4. After saving, the following dialog box appears, indicating that the save succeeded. Click OK to close the tool.
Load the .coe file into the generated ROM IP core. This was introduced in the character display experiment tutorial and is not repeated here.
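For readers without the waveform tool, an equivalent initialization file can be produced with a short script. This is a sketch and not the vendor tool: it assumes unsigned samples centred on mid-scale with one full sine period across the table, and writes them in the Xilinx .coe layout (`memory_initialization_radix` and `memory_initialization_vector`) in decimal. The function names and the output file name `sine_1024x14.coe` are hypothetical.

```python
import math

def sine_table(length=1024, bits=14):
    """Unsigned sine samples for one full period, centred on mid-scale (assumed format)."""
    max_code = (1 << bits) - 1
    mid = max_code / 2.0
    return [int(round(mid + mid * math.sin(2.0 * math.pi * n / length)))
            for n in range(length)]

def to_coe(values):
    """Format the values as the text of a .coe memory initialization file (decimal radix)."""
    header = "memory_initialization_radix=10;\nmemory_initialization_vector=\n"
    body = ",\n".join(str(v) for v in values)
    return header + body + ";\n"

if __name__ == "__main__":
    # Hypothetical output file name; choose one that suits the project.
    with open("sine_1024x14.coe", "w") as f:
        f.write(to_coe(sine_table()))
```

The table length and bit width must match the depth and data width configured for the ROM IP, here 1024 entries of 14 bits.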
);
assign da1_clk=clk_125M;
assign da1_wrt=clk_125M;
assign da1_data=rom_data;
assign da2_clk=clk_125M;
assign da2_wrt=clk_125M;
assign da2_data=rom_data;
ROM ROM_inst
(
.clka(clk_125M), // input clka
.addra(rom_addr), // input [9 : 0] addra, 1024 entries
.douta(rom_data) // output [13 : 0] douta, 14-bit sine data
);
PLL PLL_inst
(// Clock in ports
.clk_in1_p (sys_clk_p), // IN
.clk_in1_n (sys_clk_n), // IN
// Clock out ports
.clk_out1 (clk_125M) // OUT, port name assumed (Clocking Wizard default)
);
endmodule
The program generates a 125 MHz DA output clock with a PLL IP, then cyclically reads the 1024 data points stored in the ROM and outputs them to the DA data lines of channel 1 and channel 2 at the same time. In the program, the ROM address is incremented by 1.
Experimental phenomenon
Insert the AN9767 module into the J11 expansion port of the development board, and use the supplied BNC cable to connect the output of the AN9767 to the input of the oscilloscope, as shown below. Then power on the development board, download the program, and observe the analog signal from the DA module on the oscilloscope.
We can change the address increment in the program to +4 as follows, so that one sine wave period is output with 256 points and the output frequency becomes 4 times higher:
After the program is modified and downloaded to the FPGA again, the sine wave frequency is higher, and the oscilloscope shows the following waveform:
Users can also change the amplitude of the two output channels by adjusting the adjustable resistors on the AN9767 module.
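The effect of the address increment on the output frequency can be worked out with a few lines of arithmetic. The sketch assumes that the ROM address advances once per 125 MHz clock cycle and wraps around at 1024; the function names are illustrative.

```python
def output_frequency(clk_hz=125e6, table_len=1024, step=1):
    """Sine frequency when the ROM address advances by `step` on every clock (assumed)."""
    return clk_hz * step / table_len

def addresses(step, table_len=1024, count=8):
    """First `count` ROM addresses visited for a given address increment."""
    addr = 0
    visited = []
    for _ in range(count):
        visited.append(addr)
        addr = (addr + step) % table_len   # wrap around at the end of the table
    return visited

for step in (1, 4):
    points = len(set(addresses(step, count=1024)))
    print(f"step {step}: {points} points per period, "
          f"{output_frequency(step=step) / 1e3:.2f} kHz")
```

Under these assumptions, an increment of 1 gives about 122.07 kHz and an increment of 4 gives about 488.28 kHz, at the cost of fewer points per period.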