Xapp 858
Summary                                   This application note describes the controller and data capture technique for high-performance
                                          DDR2 SDRAM interfaces. This data capture technique uses the Input Serializer/Deserializer
                                          (ISERDES) and Output Double Data Rate (ODDR) features available in every Virtex™-5 I/O.
Introduction                              A DDR2 SDRAM interface is source-synchronous: the read data and read strobe are
                                          transmitted edge aligned. To capture this transmitted data using Virtex-5 FPGAs, either the
                                          strobe or the data can be delayed. In this design, the read data is captured in the delayed
                                          strobe domain and recaptured in the FPGA clock domain in the ISERDES. The ISERDES
                                          OCLK input and CLKDIV input are both driven by the FPGA fast clock. Therefore, the Q3 and
                                          Q4 outputs of the ISERDES are ignored. The differential strobe is placed on a clock-capable
                                          I/O pair in order to access the BUFIO clock resource. The BUFIO clocking resource routes the
                                          delayed read DQS to its associated data ISERDES clock inputs. The write data and strobe
                                          transmitted by the FPGA use the ODDR.
                                          A brief overview of the DDR2 SDRAM device features and a detailed explanation of the
                                          controller operation when interfacing to high-speed DDR2 memories are provided. The
                                          backend user interface to the controller is also explained.
DDR2 SDRAM                                DDR2 SDRAM devices are the next generation devices in the DDR SDRAM family. DDR2
Overview                                  SDRAM devices use the SSTL 1.8V I/O standard. The following section explains the features
                                          available in the DDR2 SDRAM devices and the key differences between DDR SDRAM and
                                          DDR2 SDRAM devices.
                                          DDR2 SDRAM devices use a DDR architecture to achieve high-speed operation. The memory
                                          operates using a differential clock provided by the controller. Commands are registered at every
                                          positive edge of the clock. A bidirectional data strobe (DQS) is transmitted along with the data
                                          for use in data capture at the receiver. DQS is a strobe transmitted by the DDR2 SDRAM device
                                          during Reads and by the controller during Writes. DQS is edge aligned with data for Reads and
                                          center aligned with data for Writes.
                                          Read and write accesses to the DDR2 SDRAM device are burst oriented. Accesses begin with
                                          the registration of an Active command, which is then followed by a Read or Write command.
                                          The address bits registered with the Active command are used to select the bank and row to be
                                          accessed. The address bits registered with the Read or Write command are used to select the
                                          bank and the starting column location for the burst access.
                                          The DDR2 controller reference design includes a user backend interface to generate the Write
                                          address, Write data, and Read addresses. This information is stored in three backend FIFOs
                                          for address and data synchronization between the backend and controller modules. Based on
                                          the availability of addresses in the address FIFO, the controller issues the correct commands to
                                          the memory, taking into account the timing requirements of the memory. The implementation
                                          details of the logic blocks are explained in the following sections.
© 2006–2007 Xilinx, Inc. All rights reserved. All Xilinx trademarks, registered trademarks, patents, and further disclaimers are as listed at http://www.xilinx.com/legal.htm. All other
trademarks and registered trademarks are the property of their respective owners. All specifications are subject to change without notice.
        Notes:
        1.   Address signal A10 is held High during Precharge All Banks and is held Low during single bank
             precharge.
        Mode register field encodings (figure X858_01_042006):

        A2  A1  A0     Burst Length
        0   1   0      4
        0   1   1      8
        Others         Reserved

        A6  A5  A4     CAS Latency
        0   1   0      2
        0   1   1      3
        1   0   0      4
        1   0   1      5
        Others         Reserved

        A11 A10 A9     Write Recovery
        0   0   1      2
        0   1   0      3
        0   1   1      4
        1   0   0      5
        1   0   1      6
        Others         Reserved
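The field encodings in the mode register figure can be captured as lookup tables. The following is a minimal sketch, not part of the reference design: the function name encode_mode_register is hypothetical, and every address bit outside the three fields shown in the figure is simply left at 0.

```python
# Field encodings transcribed from the mode register figure.
# Any bit pattern not listed is reserved.
BURST_LENGTH_BITS = {4: 0b010, 8: 0b011}                       # A2..A0
CAS_LATENCY_BITS = {2: 0b010, 3: 0b011, 4: 0b100, 5: 0b101}    # A6..A4
WRITE_RECOVERY_BITS = {2: 0b001, 3: 0b010, 4: 0b011,
                       5: 0b100, 6: 0b101}                     # A11..A9


def encode_mode_register(burst_length: int, cas_latency: int,
                         write_recovery: int) -> int:
    """Return the A11..A0 value for the three fields (hypothetical helper).

    Bits that belong to other mode register fields are left at 0.
    """
    try:
        bl = BURST_LENGTH_BITS[burst_length]
        cl = CAS_LATENCY_BITS[cas_latency]
        wr = WRITE_RECOVERY_BITS[write_recovery]
    except KeyError as exc:
        raise ValueError(f"reserved or unsupported setting: {exc.args[0]}") from exc
    return (wr << 9) | (cl << 4) | bl
```

For example, a burst length of 4, a CAS latency of 3, and a write recovery of 3 give the value 0x432.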
                            Initialization Sequence
                            The initialization sequence used in the controller state machine follows the DDR2 SDRAM
                            specifications. The voltage requirements of the memory need to be met by the interface. The
                            following is the sequence of commands issued for initialization.
                            1. After power and clock are stable, a NOP or Deselect command is applied for 200 μs.
                            2. CKE is asserted.
                            3. A Precharge All command is executed after 400 ns.
                            4. An EMR (2) command is executed. BA0 is held Low, and BA1 is held High.
                            5. An EMR (3) command is executed. BA0 and BA1 are both held High.
                            6. An EMR command is executed to enable the memory DLL. BA1 and A0 are held Low, and BA0
                               is held High.
                            7. A Mode Register Set command is executed for DLL reset. To lock the DLL, 200 clock cycles are
                               required.
                            8. A Precharge All command is executed.
                            9. Two Auto Refresh commands are executed.
                            10. A Mode Register Set command is executed with A8 held Low to initialize device operation.
                            11. An EMR command is executed to enable OCD default by setting bits E7, E8, and E9 to 1.
                            12. An EMR command is executed to enable OCD exit by setting bits E7, E8, and E9 to 0.
        After the initialization sequence is complete, the controller issues a dummy write followed by
        dummy reads to the DDR2 SDRAM memory for the datapath module to select the right number
        of taps in the Virtex-5 input delay block. The datapath module determines the right number of
        delay taps required and then asserts the dp_dly_slct_done signal to the controller. The
        controller then moves into the IDLE state.
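The initialization sequence can be modeled as an ordered table that a state machine walks through. This is an illustrative sketch only: the names InitStep, INIT_SEQUENCE, and cycles_for are hypothetical, and the 333 MHz clock in the usage note is an assumption borrowed from the frequency used elsewhere in this note for timing analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitStep:
    command: str
    note: str


# One entry per command issued; the two Auto Refresh commands are listed separately.
INIT_SEQUENCE = (
    InitStep("NOP", "hold NOP or Deselect for 200 us after stable power and clock"),
    InitStep("CKE", "assert CKE"),
    InitStep("PRECHARGE_ALL", "issued after 400 ns"),
    InitStep("EMR2", "BA1 High, BA0 Low"),
    InitStep("EMR3", "BA1 High, BA0 High"),
    InitStep("EMR", "enable DLL: BA1 Low, BA0 High, A0 Low"),
    InitStep("MRS", "DLL reset; 200 clock cycles to lock"),
    InitStep("PRECHARGE_ALL", ""),
    InitStep("AUTO_REFRESH", "first of two"),
    InitStep("AUTO_REFRESH", "second of two"),
    InitStep("MRS", "A8 Low to initialize device operation"),
    InitStep("EMR", "OCD default: E9..E7 = 111"),
    InitStep("EMR", "OCD exit: E9..E7 = 000"),
)


def cycles_for(duration_ns: int, clock_mhz: int) -> int:
    """Clock cycles needed to cover duration_ns, rounded up (integer math)."""
    return -(-duration_ns * clock_mhz // 1000)
```

Under the 333 MHz assumption, cycles_for(200_000, 333) gives 66600 cycles for the 200 μs wait and cycles_for(400, 333) gives 134 cycles for the 400 ns wait.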
        Precharge Command
        The Precharge command is used to deactivate the open row in a particular bank. The bank is
        available for a subsequent row activation a specified time (tRP) after the Precharge command is
        issued. Input A10 determines whether one or all banks are to be precharged.
        Active Command
        Before any Read or Write commands can be issued to a bank within the DDR2 SDRAM
        memory, a row in the bank must be activated using an Active command. After a row is opened,
        Read or Write commands can be issued to the row subject to the tRCD specification. DDR2
        SDRAM devices also support posted CAS additive latencies; these allow a Read or Write
        command to be issued prior to the tRCD specification by delaying the actual registration of the
        Read or Write command to the internal device using additive latency clock cycles.
        When the controller detects a conflict, it issues a Precharge command to deactivate the open
        row and then issues another Active command to the new row. A conflict occurs when an
        incoming address refers to a row in a bank other than the currently opened row.
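The conflict rule described above can be expressed as a small classification function. This is a sketch under simple assumptions: open_rows is a hypothetical dictionary mapping each open bank to its open row, and timing parameters such as tRP and tRCD are not modeled.

```python
def classify_access(open_rows: dict, bank: int, row: int) -> str:
    """Classify an incoming address against the currently open rows."""
    if bank not in open_rows:
        return "activate"      # bank is closed: open the row
    if open_rows[bank] == row:
        return "hit"           # row already open: go straight to Read/Write
    return "conflict"          # different row open in the same bank


def commands_for(open_rows: dict, bank: int, row: int) -> list:
    """Commands needed before the Read or Write command can be issued."""
    kind = classify_access(open_rows, bank, row)
    if kind == "conflict":
        return ["PRECHARGE", "ACTIVE"]
    if kind == "activate":
        return ["ACTIVE"]
    return []
```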
        Read Command
        The Read command is used to initiate a burst read access to an active row. The values on BA0
        and BA1 select the bank address. The address inputs provided on A0 – Ai select the starting
        column location. After the read burst is over, the row is still available for subsequent access until
        it is precharged.
        Figure 2 shows an example of a Read command with an additive latency of zero. Hence, in this
        example, the Read latency is three, the same as the CAS latency.
        [Figure 2 (X858_02_042606): timing diagram of a Read command issued at T0 to Bank a, Col n,
        followed by NOPs, with RL = 3 (AL = 0, CL = 3). Signals shown: CK (differential), Command,
        Address, DQS (differential), and DQ (DOn).]
                            Write Command
                            The Write command is used to initiate a burst access to an active row. The values on BA0 and
                            BA1 select the bank address, while the values on address inputs A0 – Ai select the starting
                            column location in the active row. DDR2 SDRAMs use a Write Latency (WL) equal to Read
                            Latency (RL) minus one clock cycle.
                                    Write Latency = Read Latency – 1 = (Additive Latency + CAS Latency) – 1
                            Figure 3 shows the case of a Write burst with a WL of 2. The time between the Write command
                            and the first rising edge of the DQS signal is determined by the WL.
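The latency relationship above is simple arithmetic. The helper names below are hypothetical and the sketch only restates the formula in code form.

```python
def read_latency(additive_latency: int, cas_latency: int) -> int:
    """RL = AL + CL, in clock cycles."""
    return additive_latency + cas_latency


def write_latency(additive_latency: int, cas_latency: int) -> int:
    """WL = RL - 1 = (AL + CL) - 1, in clock cycles."""
    return read_latency(additive_latency, cas_latency) - 1
```

With an additive latency of 0 and a CAS latency of 3, the read latency is 3 and the write latency is 2, which matches the cases shown in Figure 2 and Figure 3.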
                            [Figure 3 (X858_03_042006): timing diagram of a Write command issued at T0 to Bank a, Col b,
                            followed by NOPs. Signals shown: CK (differential), Command, Address, and DM.]
[Figure 4 (X858_04_042606): block diagram. Labels: User Backend; Read/Write Data & Addr FIFOs; User Interface;
Memory Interface Top; Controller (Main Command State Machine); Virtex-5 FPGA.]
User Backend                 The backend provides address and data patterns to test read and write accesses between the
                             memory device and the memory interface (DDR2 controller and Physical layer). The backend
                             includes the following blocks: backend state machine, read data comparator, and a data
                             generator module. The data generation module generates the various address and data
                             patterns that are written to the memory. The address locations are pre-stored in a block RAM,
                             being used here as a ROM. The address values stored have been selected to test accesses to
                             different rows and banks in the DDR2 SDRAM device. The data pattern generator includes a
                             state machine that issues patterns of data. The backend state machine emulates a user
                             backend. This state machine issues the write or read enable signals to determine the specific
                             FIFO to be accessed by the data generator module.
User Interface               The backend user interface has three FIFOs: the Address FIFO, the Write Data FIFO, and the
                             Read Data FIFO. The first two FIFOs are accessed by the user backend modules, while the
                             Read Data FIFO is accessed by the datapath module used to store the captured Read data.
User-to-Controller          Table 4 lists the signals between the user interface and the controller.
Interface
                            Table 4: Signals Between User Interface and Controller

                            Port Name               Port Width   Port Description               Notes
                                                    (in bits)
                            usr_ip_add_fifo_addr    36           Output of the Address FIFO     Monitor the FIFO-full status
                                                                 in the user interface.         flag before writing an
                                                                 Mapping of these address       address into the Address
                                                                 bits:                          FIFO.
                                                                 • Memory Address (CS, Bank,
                                                                   Row, Column) [31:0]
                                                                 • Reserved [33:32]
                                                                 • Command Request [35:34]

                            usr_ip_add_fifo_empty   1            The user interface Address     FIFO16 Empty Flag.
                                                                 FIFO empty status flag
                                                                 output. The controller
                                                                 processes the address on
                                                                 the output of the FIFO when
                                                                 this signal is deasserted.

                            ctrl_af_rden            1            Read Enable input to the       This signal is asserted for
                                                                 Address FIFO in the user       one clock cycle when the
                                                                 interface.                     controller state is Write or
                                                                                                Read.

                            ctrl_wdf_rden           1            Read Enable input to the       After the write state, the
                                                                 Write Data FIFO in the user    controller asserts this
                                                                 interface.                     signal for two clock cycles
                                                                                                for a burst length of 4, or
                                                                                                for four clock cycles for a
                                                                                                burst length of 8. Sufficient
                                                                                                data for the required burst
                                                                                                length must be available in
                                                                                                the Write Data FIFO for a
                                                                                                write address before the
                                                                                                Write command is issued.
                                                                                                For example, for a 64-bit
                                                                                                data bus and a burst length
                                                                                                of 4, the user should input
                                                                                                two 128-bit data words into
                                                                                                the Write Data FIFO for
                                                                                                every write address before
                                                                                                issuing the Write command.
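The number of Write Data FIFO words needed per write address follows from the bus width and burst length. The sketch below is illustrative: the function name is hypothetical, and the default FIFO word width of twice the data bus width is an assumption inferred from the 64-bit bus and 128-bit word example in Table 4.

```python
def write_fifo_words(data_bus_bits: int, burst_length: int,
                     fifo_word_bits=None) -> int:
    """FIFO words that must be queued for one write address."""
    if fifo_word_bits is None:
        # Assumption: one FIFO word carries a rising and a falling edge of data.
        fifo_word_bits = 2 * data_bus_bits
    total_bits = data_bus_bits * burst_length
    if total_bits % fifo_word_bits:
        raise ValueError("burst does not fill a whole number of FIFO words")
    return total_bits // fifo_word_bits
```

For a 64-bit data bus, a burst length of 4 needs two words and a burst length of 8 needs four words, matching the two and four clock cycles of ctrl_wdf_rden.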
                              The memory address (Af_addr) includes the column address, row address, bank address, and
                              chip-select width for deep memory interfaces (Table 5).
Command Request               Table 6 lists the Read and Write command request format.
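The 36-bit Address FIFO word layout from Table 4 (memory address in [31:0], reserved bits in [33:32], command request in [35:34]) can be packed and unpacked as shown in this sketch. The function names are hypothetical. The command request is passed as a raw 2-bit value because its encoding is defined in Table 6, and the split of the memory address into chip select, bank, row, and column depends on the widths in Table 5, so neither is modeled here.

```python
MEMORY_ADDRESS_BITS = 32   # bits [31:0]
COMMAND_SHIFT = 34         # command request occupies bits [35:34]


def pack_address_word(memory_address: int, command_request: int) -> int:
    """Build the 36-bit Address FIFO word; reserved bits [33:32] stay 0."""
    if not 0 <= memory_address < (1 << MEMORY_ADDRESS_BITS):
        raise ValueError("memory address must fit in 32 bits")
    if not 0 <= command_request < 4:
        raise ValueError("command request must fit in 2 bits")
    return (command_request << COMMAND_SHIFT) | memory_address


def unpack_address_word(word: int) -> tuple:
    """Return (memory_address, command_request) from a 36-bit word."""
    return word & 0xFFFF_FFFF, (word >> COMMAND_SHIFT) & 0b11
```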
                              Figure 5 shows four consecutive Writes followed by four consecutive Reads with a burst length
                              of 4. Table 7 lists the state signal values for Figure 5.
[Figure 5 (X858_05_042606): waveform of CLK, State, ctrl_af_rden, ctrl_wdf_Rden, and usr_ip_add_fifo_empty.
State sequence: 09 0A 09 0A 09 0A 09 0A 0B 07 08 07 08 07 08 07 08]
Physical Layer              The physical layer comprises the write datapath, the read datapath, the calibration state
                            machine for DQS and DQ calibration, calibration logic for read enable alignment, and the
                            memory initialization state machine. The write datapath generates the data and strobe signals
                            transmitted during a Write command. The read datapath captures the read data in the read
                            strobe domain.
Write Datapath              The write datapath uses the built-in ODDR available in every Virtex-5 I/O. The ODDR transmits
                            the data (DQ) and strobe (DQS) signals. The memory specification requires DQS to be
                            transmitted center aligned with DQ. The strobe (DQS) forwarded to the memory is 180° out of
                            phase with CLK0. Therefore, the write data transmitted using ODDR must be clocked by CLK90
                            as shown in Figure 6. The timing diagram for write DQS and DQ is shown in Figure 7.
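The center alignment follows from the 90° difference between the DQ and DQS clock phases. The sketch below checks that arithmetic. The function names are hypothetical, phases are measured relative to CLK0, and the 333 MHz figure in the usage note is an assumption taken from the frequency used for the read timing analysis in this note.

```python
def phase_to_ps(phase_deg: float, clock_mhz: float) -> float:
    """Convert a clock phase offset in degrees to picoseconds."""
    period_ps = 1e6 / clock_mhz
    return period_ps * phase_deg / 360.0


def dqs_position_in_bit(dq_phase_deg: float = 90.0,
                        dqs_phase_deg: float = 180.0) -> float:
    """Fraction of the DQ bit time at which a DQS edge lands.

    With DDR signaling each bit lasts half a clock cycle (180 degrees),
    so 0.5 means the strobe edge is centered in the data bit.
    """
    bit_deg = 180.0
    return ((dqs_phase_deg - dq_phase_deg) % bit_deg) / bit_deg
```

With DQ clocked by CLK90 and DQS at 180°, dqs_position_in_bit() returns 0.5, and at 333 MHz the 90° offset corresponds to roughly 751 ps.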
[Figures 6 and 7 (X858_07_041806): recovered labels: ODDR, CLK0, CLK Forwarded to Memory Device, Strobe (DQS).]
Figure 7: Write Strobe (DQS) and Data (DQ) Timing for a Write Latency of Four
     Uncertainty Parameters     Value     Uncertainties before DQS     Uncertainties after DQS     Description
Read Datapath               The read datapath comprises the read data capture and recapture stages. Both stages are
                            implemented in the built-in ISERDES available in every Virtex-5 I/O. The ISERDES has three
                            clock inputs: CLK, OCLK, and CLKDIV. The read data is captured in the CLK (DQS) domain,
                            recaptured in the OCLK (FPGA fast clock) domain, and finally transferred to the CLKDIV (also
                            clocked with FPGA fast clock) domain to provide parallel data.
                            •    CLK: The read DQS routed using BUFIO provides the CLK input of the ISERDES as
                                 shown in Figure 8.
                            •    OCLK: The OCLK input of ISERDES is connected to the CLK input of ODDR in hardware.
                                 In this design, the CLKfast_90 clock is provided to the ISERDES OCLK input and the
                                 ODDR CLK input. The clock phase used for OCLK is dictated by the phase required for
                                 write data.
                            •    CLKDIV: It is imperative for OCLK and CLKDIV clock inputs to be phase aligned for correct
                                 functionality. In this design, both OCLK and CLKDIV inputs are provided the same clock,
                                 CLKfast_90.
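[Figure 8 (X858_08_042606): read data capture block diagram. Labels: DQ and DQS inputs, each through an IDELAY;
ISERDES clock inputs CLK, OCLK, and CLKDIV; outputs Q1 and Q2; Read Data Rising; Read Data Falling; BUFIO;
FPGA Clock; IOB; CLB; User Interface FIFOs.]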
         Table 9 shows the read timing analysis at 333 MHz used to determine the delay required on
         DQ bits for centering DQS in the data valid window.
         Table 9: Read Timing Analysis at 333 MHz
                              Figure 9 shows the timing waveform for read data captured in the strobe domain and
                              recaptured in the FPGA clock domain in the ISERDES.
[Figure 9 (X858_09_042606): timing waveform for a four-word burst (D0 to D3). Signals shown: FPGA Clock;
DQS at FPGA; DQ at FPGA; DQS Delayed by BUFIO at IDDR; DQ; DQ Captured by DQS Domain; DQ Recaptured in
FPGA Clock Domain; Input to Rising FIFO (D0, D2).]
[Figure 10 (X858_10_042606): signals shown: CLK0; Command (READ); DQ at Memory Device; DQS at Memory Device;
Delayed DQS at IDDR CLK I/P; Delayed DQ at IDDR I/P; ctrl_RdEn Generated by Controller After CAS Latency; WrEn.]
Figure 10: Read-Enable Timing for CAS Latency of 5 and Burst Length of 4
                         The ctrl_RdEn signal is required to validate read data because the DDR2 SDRAM devices do
                         not provide a read valid or read-enable signal along with read data. The controller generates
                         this read-enable signal based on the CAS latency and the burst length. This read-enable signal
                         is asserted after the CAS latency has elapsed and is input to a set of pipeline registers. The number of register
                         stages required to align the read-enable signal to the ISERDES read data output is determined
                         during calibration. One read-enable signal is generated for each data byte. Figure 11 shows the
                         read-enable logic block diagram.
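The read-enable alignment can be pictured as a shift register whose depth is chosen during calibration. This is a behavioral sketch only: the class name ReadEnablePipeline is hypothetical, and the calibration procedure that selects the number of stages is not modeled.

```python
from collections import deque


class ReadEnablePipeline:
    """Delay a one-bit read-enable by a fixed number of clock cycles."""

    def __init__(self, stages: int):
        if stages < 1:
            raise ValueError("at least one register stage is required")
        # All registers start cleared; maxlen keeps the depth constant.
        self._regs = deque([0] * stages, maxlen=stages)

    def clock(self, rd_en_in: int) -> int:
        """Advance one clock; return the value sampled `stages` calls earlier."""
        out = self._regs[-1]
        self._regs.appendleft(rd_en_in)   # oldest value drops off the far end
        return out
```

With two stages, the input sequence 1, 1, 0, 0, 0 produces the output sequence 0, 0, 1, 1, 0: the two-cycle enable pulse emerges two clocks later.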
[Figure 11 (X858_11_041806): read-enable logic block diagram. Labels: Number of Registers Determined During
Calibration; CLK0.]
Controller                  The controller has the ability to keep four banks open at a time. The banks are opened in the
Implementation              order of the commands that are presented to the controller. If four banks are already
                            open and an access to a fifth bank arrives, the least recently used bank is closed
                            and the new bank is opened. All the banks are closed during auto refresh and will
                            be opened as commands are presented to the controller.
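The bank management policy described above behaves like a four-entry least-recently-used cache. The sketch below models that behavior. It is not the reference design's logic: the class name OpenBankTracker and the tuple command format are hypothetical, and memory timing is ignored.

```python
from collections import OrderedDict


class OpenBankTracker:
    """Track open banks, least recently used first."""

    def __init__(self, max_open: int = 4):
        self.max_open = max_open
        self._open = OrderedDict()   # bank -> open row

    def access(self, bank: int, row: int) -> list:
        """Return the commands needed before a Read or Write to (bank, row)."""
        cmds = []
        if bank in self._open:
            if self._open[bank] != row:          # row conflict in an open bank
                cmds += [("PRECHARGE", bank), ("ACTIVE", bank, row)]
                self._open[bank] = row
            self._open.move_to_end(bank)         # mark as most recently used
            return cmds
        if len(self._open) >= self.max_open:     # close the LRU bank first
            victim, _ = self._open.popitem(last=False)
            cmds.append(("PRECHARGE", victim))
        self._open[bank] = row
        cmds.append(("ACTIVE", bank, row))
        return cmds

    def auto_refresh(self) -> None:
        """All banks are closed during auto refresh."""
        self._open.clear()
```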
                            The controller state machine issues the commands in the correct order while taking into
                            account the timing requirements of the memory.
                            Along with Figure 12, the following sections explain in detail the various stages of the controller
                            state machine.
                            Before the controller issues the commands to the memory:
                            1. The controller decodes the address located in the FIFO.
                                 Note: The address FIFO is in first-word-fall-through (FWFT) mode. In FWFT mode, the first address
                                 written into the FIFO appears at the output of the FIFO without a read operation.
                            2. The controller opens a row in a bank if that bank and row are not already open. In the case
                               of an access to a different row in an already opened bank, the controller closes the row in
                               that bank and opens the new row. The controller moves to the Read/Write states after
                               opening the banks or if the banks are already open.
                            3. After arriving in the Write state, if the controller gets a Read command, the controller waits
                               for the write_to_read time before issuing the Read command. Similarly, in the Read state,
                               when the controller sees a Write command from the command logic block, the controller
                               waits for the read_to_write time before issuing the Write command. In the Read or Write
                               state, the controller also asserts the read enable to the address FIFO to get the next
                               address.
                            4. The commands are pipelined to synchronize with the Address signals before being issued
                               to the DDR2 memory.
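The first-word-fall-through behavior mentioned in step 1 is what lets the controller decode an address before asserting the read enable. The following behavioral sketch illustrates that ordering. The class name FwftFifo is hypothetical, and depth limits and full flags are left out.

```python
from collections import deque


class FwftFifo:
    """Behavioral model of a first-word-fall-through FIFO."""

    def __init__(self):
        self._items = deque()

    @property
    def empty(self) -> bool:
        return not self._items

    @property
    def dout(self):
        """Head word, visible without a read enable; None when empty."""
        return self._items[0] if self._items else None

    def write(self, word) -> None:
        self._items.append(word)

    def read_enable(self) -> None:
        """One cycle of read enable: discard the head so the next word appears."""
        if self._items:
            self._items.popleft()
```

In this model the controller would inspect dout while empty is deasserted, issue the command, and then pulse read_enable() to move on to the next address.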
[Figure 12 (X858_16_041806): controller state machine diagram. States: Idle, Active, Active Wait, Command Wait,
Command Wait Conf, Burst Write, Write Wait, Write Bank Conf, Burst_Read, Read_Wait, Read Wait Conf, Precharge,
Precharge Wait, Auto Refresh, Auto Refresh Wait. Transition labels: rst || ~phy_init_done, cmd, wr, rd, conflict,
rd || conflict, wr || conflict, auto refresh.]
Reference        The reference design for the Virtex-5 DDR2 SDRAM memory controller is integrated with the
Design           Memory Interface Generator (MIG) tool. This tool has been integrated with the
                 Xilinx CORE Generator™ software. For the latest version of the design, download the IP
                 update on the Xilinx website from the following URL:
                       http://www.xilinx.com/xlnx/xil_sw_updates_home.jsp
                 Table 11 lists the resource utilization for a 64-bit interface, including the physical layer, the
                 controller, the user interface, and a synthesizable testbench.
Conclusion                  The DDR2 SDRAM controller, along with the data capture technique using the ISERDES explained
                            in this application note, provides a good margin for high-performance memory interfaces. A high
                            margin is achieved when data capture in the DQS domain and data transfer to the FPGA clock
                            domain both occur in the ISERDES.
Revision                    The following table shows the revision history for this document.
History
                                  Date       Version                                   Revision
                                 05/12/06      1.0       Initial Xilinx release.
                                 01/09/07      1.1       Updated link to reference design.