Interfacing DDR SDRAM With Stratix & Stratix GX Devices: Application Note 342 December 2005 Ver. 2.0
Introduction
Traditionally, systems featuring FPGAs used single data rate (SDR)
SDRAM, which transmits data on each rising edge of the clock signal. The
total amount of data an SDR memory device can send or receive is equal
to the clock speed multiplied by the bus width. To increase the data-rate
transmission, one of those parameters must increase. With dual-edge
clocking, double data rate (DDR) SDRAM can transmit data on both the
rising and falling edge of the clock signal. DDR SDRAM effectively
doubles the amount of data sent compared to SDR SDRAM without
increasing the clock speed or the bus width.
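The relationship between clock speed, bus width, and data rate can be sketched numerically. The following is a minimal illustration only: the function name and the 64-bit bus width are assumptions chosen for the example and do not come from this application note.

```python
def peak_transfer_rate_mbps(clock_mhz, bus_width_bits, transfers_per_clock):
    """Peak raw transfer rate in Mbps: clock rate x bus width x transfers per clock."""
    return clock_mhz * bus_width_bits * transfers_per_clock


# Hypothetical example: a 64-bit bus clocked at 200 MHz.
sdr_rate = peak_transfer_rate_mbps(200, 64, transfers_per_clock=1)  # rising edge only
ddr_rate = peak_transfer_rate_mbps(200, 64, transfers_per_clock=2)  # both edges

print("SDR:", sdr_rate, "Mbps")
print("DDR:", ddr_rate, "Mbps")
print("Per-pin DDR rate:", peak_transfer_rate_mbps(200, 1, 2), "Mbps")
```

With a bus width of one bit, the same function gives the per-pin rate, which is how a 200-MHz DDR interface corresponds to 400 Mbps per pin.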
DDR SDRAM devices are widely used for a broad range of applications
such as embedded processor systems, image processing, storage,
communications, and networking. Stratix and Stratix GX devices can
interface with DDR SDRAM in component or module configurations up
to 200 MHz/400 Mbps. Tables 1 and 2 show the DDR SDRAM interface
support in Stratix and Stratix GX devices.
Table 1. DDR SDRAM Support in Stratix EP1S10 Through EP1S40 Devices & All Stratix GX Devices
Altera Corporation 1
AN-342-2.0 Preliminary
Notes to Table 1:
(1) These maximum clock rates apply if the Stratix or Stratix GX device uses DQS phase-shift circuitry to interface with
DDR SDRAM. DQS phase-shift circuitry is only available in the top and bottom I/O banks (I/O banks 3, 4, 7, and
8).
(2) You should use the minimum drive strength setting in the Quartus II software for DDR SDRAM interfaces in Stratix
and Stratix GX devices.
(3) To achieve 200 MHz interface speed, you should use loading of 10pF or less for Class II termination.
(4) DDR SDRAM is supported on the Stratix device side banks (I/O banks 1, 2, 5, and 6) without dedicated DQS
phase-shift circuitry. The read DQS signal is ignored in this mode.
DDR Memory Type I/O Standard -5 Speed Grade -6 Speed Grade -7 Speed Grade
Notes to Table 2:
(1) These maximum clock rates apply if the Altera® Stratix or Stratix GX device uses DQS phase-shift circuitry to
interface with DDR SDRAM. DQS phase-shift circuitry is only available in the top and bottom I/O banks (I/O
banks 3, 4, 7, and 8).
(2) You should use the minimum drive strength setting in the Quartus II software for DDR SDRAM interfaces in
Stratix and Stratix GX devices.
(3) DDR SDRAM is supported on the Stratix device side banks (I/O banks 1, 2, 5, and 6) without dedicated DQS
phase-shift circuitry. The read DQS signal is ignored in this mode.
Functional Description
DDR SDRAM is a 2n-prefetch architecture with two data transfers per
clock cycle. It uses a strobe, DQS, that is associated with a group of data
pins (DQ) for read and write operations. Both the DQS and DQ ports are
bidirectional. Address ports are shared for read and write operations.
Write and read operations are sent in bursts, and DDR SDRAM supports
burst lengths of two, four, and eight. You provide two, four, or eight
groups of data for each write transaction and receive two, four, or eight
groups of data for each read transaction. The interval between when the
read command is clocked into memory and the data is presented at the
memory pins is called the column address strobe (CAS) latency. DDR
SDRAM supports CAS latencies of 2, 2.5, and 3, depending on the
operating frequency. Both the burst length and CAS latency are set in the
DDR SDRAM mode register.
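As a rough illustration of how burst length and CAS latency translate into time, the sketch below converts both into clock cycles and nanoseconds. The function names are hypothetical; the allowed burst lengths and CAS latencies are the ones listed above, and the two-transfers-per-clock figure comes from the 2n-prefetch description.

```python
ALLOWED_BURST_LENGTHS = (2, 4, 8)
ALLOWED_CAS_LATENCIES = (2, 2.5, 3)


def clock_period_ns(clock_mhz):
    """Clock period in nanoseconds for a clock frequency in MHz."""
    return 1000.0 / clock_mhz


def burst_clock_cycles(burst_length):
    """Clock cycles occupied by one burst, at two data transfers per clock cycle."""
    if burst_length not in ALLOWED_BURST_LENGTHS:
        raise ValueError("burst length must be 2, 4, or 8")
    return burst_length / 2


def cas_latency_ns(cas_latency_cycles, clock_mhz):
    """CAS latency expressed in nanoseconds."""
    if cas_latency_cycles not in ALLOWED_CAS_LATENCIES:
        raise ValueError("CAS latency must be 2, 2.5, or 3")
    return cas_latency_cycles * clock_period_ns(clock_mhz)


print(burst_clock_cycles(8))     # cycles occupied by a burst of eight
print(cas_latency_ns(2.5, 200))  # CAS latency of 2.5 at 200 MHz, in ns
```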
DDR SDRAM devices use the SSTL-2 class II I/O standard and can hold
between 64 Mb and 1 Gb of data, according to the JEDEC specification. Each
device is divided into four banks, and each bank has a fixed number of
rows and columns. Only one row per bank can be accessed at one time.
The ACTIVE command opens a row, and the PRECHARGE command
closes a row.
For data reads, a delay-locked loop (DLL) inside the DDR SDRAM edge-aligns
the DQ and DQS signals with respect to CK. The DLL must be
turned on for normal operation, but can be turned off to save power or for
debugging purposes. (All timing analyses in this document assume that
the DLL is on.) DDR SDRAM also has adjustable output drive strength.
Altera recommends using a minimum drive strength setting on Altera
devices.
Interface Pins
Table 3 describes the DDR SDRAM interface pins and how to connect
them to Stratix and Stratix GX devices on the top and bottom I/O banks,
whether or not you are using the DQS phase-shift circuitry. On the side
banks, connect DQ and DQS pins to Stratix and Stratix GX user I/O pins.
The DQ and DQS signals are both bidirectional (the same signals are used
for both writes and reads). A group of DQ pins is associated with one
DQS pin. In ×8 and ×16 DDR SDRAM devices, one DQS pin is associated
with 8 DQ pins (the Stratix and Stratix GX ×8 mode definition). Use the DQS
pins and their associated DQ pins listed in the Stratix and Stratix GX pin
tables when interfacing with DDR SDRAM from Stratix and Stratix GX
I/O banks 3, 4, 7, or 8. When interfacing with DDR SDRAM from Stratix and
Stratix GX I/O banks 1, 2, 5, and 6, use any of the user I/O pins in those
banks as DQS pins. I/O banks 1, 2, 5, and 6 do not have dedicated
phase-shift circuitry and can only support DDR SDRAM interfaces up to
150 MHz.
Table 4. DQS & DQ Bus Mode Support in Stratix Devices Note (1)
Notes to Table 4:
(1) See the Using Selectable I/O Standards in Stratix & Stratix GX Devices chapter in the Stratix Device Handbook, volume 2
for VREF guidelines.
(2) These packages have six groups in I/O banks 3 and 4 and six groups in I/O banks 7 and 8.
(3) These packages have eight groups in I/O banks 3 and 4 and eight groups in I/O banks 7 and 8.
(4) This package has nine groups in I/O banks 3 and 4 and nine groups in I/O banks 7 and 8.
(5) These packages have three groups in I/O banks 3 and 4 and four groups in I/O banks 7 and 8.
Table 5. DQS & DQ Bus Mode Support in Stratix GX Devices Note (1)
Notes to Table 5:
(1) See the Using Selectable I/O Standards in Stratix & Stratix GX Devices chapter in the Stratix Device Handbook, volume 2
for VREF guidelines.
(2) These packages have six groups in I/O banks 3 and 4 and six groups in I/O banks 7 and 8.
(3) These packages have eight groups in I/O banks 3 and 4 and eight groups in I/O banks 7 and 8.
The data signals (DQ) are edge-aligned with the DQS signal during a read
from the memory and center-aligned with the DQS signal during a write
to the memory. The memory controller shifts the DQS signal during a
write to center-align the DQ and DQS signals, and shifts the DQS signal
during a read so that the DQ and DQS signals are center-aligned at the
capture register. Stratix and Stratix GX devices use a phase-locked loop
(PLL) to center-align the DQS signal with respect to the DQ signals
during writes, and use dedicated DQS phase-shift circuitry to shift the
incoming DQS signal during reads. Figure 1 shows an example where the
DQS signal is center-aligned during a burst-of-two read. Figure 2 shows
an example of the relationship between the data and the data strobe
during a burst-of-two write.
Figure 1 (waveform not reproduced) covers a burst-of-two read: DQS and DQ at the FPGA pin, with the DQS preamble and postamble marked, and DQS and DQ at the DQ IOE registers.
Figure 2 (waveform not reproduced) covers a burst-of-two write: DQS and DQ at the FPGA pin.
The memory device’s setup (tDS) and hold times (tDH) for the DQ and DM
pins during a write are relative to the edges of DQS write signals and not
the CK or CK# clock. These times are equal (tDS = tDH) and typically 0.4 ns
for a 200-MHz DDR SDRAM device.
The DQS signal is typically generated on the positive edge of the system
clock (because of the tDQSS requirement described below). The DQ and
data mask (DM) signals are clocked using a –90° shifted clock from the
system clock. The edges of DQS are centered on the DQ and DM signals
when they arrive at the DDR SDRAM.
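The effect of the –90° shift can be checked with a little arithmetic. The sketch below converts a phase shift into time and estimates the nominal setup and hold slack at the memory for the 200-MHz case, using the typical tDS = tDH = 0.4 ns quoted above. The function names are hypothetical, and the slack figure is an idealized estimate that ignores skew, jitter, and board effects.

```python
def phase_to_ns(phase_deg, clock_mhz):
    """Convert a phase shift in degrees into time in ns at the given clock frequency."""
    period_ns = 1000.0 / clock_mhz
    return phase_deg / 360.0 * period_ns


def nominal_write_slack_ns(clock_mhz, t_ds_ns, t_dh_ns):
    """Idealized setup and hold slack when DQS edges sit a quarter period inside each DQ bit."""
    quarter_period_ns = phase_to_ns(90, clock_mhz)
    return quarter_period_ns - t_ds_ns, quarter_period_ns - t_dh_ns


shift_ns = phase_to_ns(90, 200)
setup_slack, hold_slack = nominal_write_slack_ns(200, t_ds_ns=0.4, t_dh_ns=0.4)

print("90 degrees at 200 MHz =", shift_ns, "ns")
print("Nominal setup slack =", round(setup_slack, 3), "ns")
print("Nominal hold slack =", round(hold_slack, 3), "ns")
```

At 200 MHz each DQ bit lasts 2.5 ns, so a quarter-period offset places the DQS edge 1.25 ns from either end of the bit; whatever remains after tDS and tDH is the budget for skew and jitter.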
The DQS, DQ, and DM board trace lengths should be similar to minimize
the skew in the arrival time of these signals.
The DDR SDRAM address and command inputs require the same setup
and hold times with respect to the DDR SDRAM clock. The Stratix and
Stratix GX device’s address and command signals change at the same
time as the DQS write signal because they are both generated from the
system clock. The positive edge of the DDR SDRAM clock, CK, is aligned
with DQS to satisfy tDQSS. If the command and address outputs are
generated on the clock’s positive edge, they may not meet the hold time
requirements (Figure 3). Therefore, you should use the negative edge of
the system clock for the commands and addresses to the DDR SDRAM.
You can use any of the I/O pins for the commands and addresses.
Figure 3 shows the address and command timing and the DDR SDRAM
tDQSS, tDS, and tDH timing requirements.
Figure 3 (waveform not reproduced) plots the system clock, CK at the Stratix or Stratix GX device pin, and the address/command pins driven on the positive edge and on the negative edge, each with tCO (FPGA), against the tSU (SDRAM) and tH (SDRAM) input timing to the DDR SDRAM device.
Notes to Figure 3:
(1) The address and command timing shown in Figure 3 is applicable for both read and write.
(2) If the board trace lengths for the DQS, CK, address, and command pins are the same, the signal relationships at the
Stratix and Stratix GX device pins are maintained at the DDR SDRAM pins.
Read-Side Implementation Using the DQS Phase-Shift Circuitry
There is one DQS phase-shift circuit available on top of the device and one
on the bottom of the device. Each DQS phase-shift circuit requires an
input reference clock. The DQS phase-shift circuitry shifts the DQS signal
to center-align the signal with the DQ signal at the IOE register, ensuring
the data is latched at the IOE register. The DQS signal is then inverted
before going to the DQ IOE clock ports, as described in the External
Memory Interfaces chapter of the Stratix Device Handbook, volume 2.
Figure 4 shows how the Stratix and Stratix GX devices generate the DQ,
DQS, CK, and CK# signals. The write PLL generates the system clock and
the –90° shifted clock (write clock). The write PLL’s input clock can have the
same frequency as the DDR SDRAM frequency of operation or a different one.
If the frequencies are different, you must provide the input reference
clock to the DQS phase-shift circuitry from another input clock pin. For
details, see the External Memory Interfaces chapter in the Stratix Device
Handbook, volume 2. The system clock and write clock have the same
frequency as the DQS frequency. The write clock is –90° shifted from the
system clock.
Figure 4 (diagram not reproduced). The write PLL, fed by input_clk, generates the system clock, the –90° shifted write clock, and the resynchronization clock. DDR I/O registers in the IOE drive DM, DQ "write" (1), DQS, CK, and CK# to the DDR SDRAM and capture DQ "read" (1) through the DQS phase-shift circuitry. The CK and CK# traces have length l1; the DM, DQ, and DQS traces have length l2. Callouts (1), (2), and (3) refer to the notes below.
Notes to Figure 4:
(1) DQ and DQS signals are bidirectional. One DQS signal is associated with a group of DQ signals.
(2) The clock to the resynchronization register can be from the system clock, write clock, or an extra clock output from
the write PLL. Figure 4 shows the clock to the resynchronization register to be from an extra clock output from the
write PLL.
(3) The clock to this register can be either the system clock or another clock output of the write PLL. If another write
PLL clock output clocks the register feeding this register, another register is needed to transfer the data back to the
system clock domain.
Figure 5. DDR SDRAM Read Data Path in Stratix & Stratix GX Devices
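(Diagram not reproduced. It shows the dq[7..0] and dqs pins with their dq_oe (1), dq_out, dqs_oe (1), and dqs_out signals, the DQS phase-shift circuitry, the IOE input registers and latch whose outputs are waveforms A, B, and C, and the LE resynchronization registers, clocked by resynch_clock, whose outputs D and E form dataout[15..0] on the local bus.)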
Note to Figure 5:
(1) The output enable registers are not shown here, but dqs_oe and dq_oe are active low in silicon. However, the
Quartus II software implements it as active high and adds the inverter automatically during compilation.
DQS Postamble
The DDR SDRAM DQ and DQS pins use the SSTL-2 class II I/O standard.
If the Stratix, Stratix GX, or DDR SDRAM device does not drive the DQ and
DQS pins, the signals go to a high-impedance state. Because a pull-up
resistor terminates both DQ and DQS to VTT (1.25 V), the effective voltage
on the high-impedance line is 1.25 V. According to the JEDEC JESD 8-9
specification for SSTL-2 I/O standard, this is an indeterminate logic level,
and the input buffer can interpret this as either a logic high or logic low.
If there is any noise on the DQS line, the input buffer may interpret that
noise as actual strobe edges. Therefore, when the DQS signal goes to a
high-impedance state after a read postamble, you should disable the
clock to the input registers so that erroneous data is not latched in and all
the data from the memory is resynchronized properly.
Figure 7 shows a read operation example when the DQS postamble could
be a problem. Figure 5 on page 11 shows definitions of A, B, C, D, and E
waveforms. Waveform A shows the output of the active high IOE register.
Waveform B shows the active low register output of the Stratix and
Stratix GX IOE. The active low register output goes into the latch whose
output is illustrated in waveform C. Waveforms D and E show the output
signals after the resynchronization registers.
Figure 7 (waveform not reproduced) plots DQS at the pin, DQS at the IOE, waveforms A (D0L, D1L), B (D0H, D1H), and C (D0H, D1H), resynch_clock, and waveforms D (D0H) and E (D0L).
The first falling edge of the DQS at the IOE register occurs at 10 ns. At this
point, data D0H is clocked in by the active low register (waveform B). At
12.5 ns, data D0L is sampled in by the active high register (waveform A)
and data D0H passes through the latch (waveform C). In this example, the
positive edge of the resynch_clock signal occurs at 16.5 ns, where both
D0H and D0L are sampled by the LE’s resynchronization registers.
Similarly, data D1H is clocked in by the active low register at 15 ns, data
D1L is clocked in by the active high register, and data D1H passes
through the latch at 17.5 ns. At 20 ns, noise on the DQS line causes a valid
clock edge at the IOE registers that changes the values of waveforms A,
B, and C. The next rising edge of the resynch_clock signal does not
occur until 21.5 ns, but data D1L and D1H are not valid anymore at the
output of the latch and the active-high input register. Therefore, the
resynchronization registers do not sample D1L and D1H and may sample
the wrong data instead.
To avoid this possibility, add one register clocked by the undelayed DQS
signal in the LE closest to the associated DQ group to act as an enable for
each DQS/DQ group (Figure 8). The output of this register, dq_enable,
controls clock enable for the DQ IOE registers. The data input to the LE
register is set to GND, and the preset port of this register is connected to
the dq_enable_preset signal. The controller should generate
dq_enable_preset so that it is high when DQS is first detected low
(during read preamble) and low during the cycle prior to the last active
edge of DQS. This causes the dq_enable signal to go low with the last
active negative edge of the DQS signal. Registers AI and BI are then
disabled before DQS goes into a high-impedance state. Latch CI holds the
last data captured by register BI, whether it is in the transparent or
latched state.
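Figure 8 (diagram not reproduced). The DQ pin feeds IOE register AI, register BI, and latch CI, each with an enable (EN) input. Register 1 in the LE has an asynchronous preset and drives dq_enable. Labeled signals are dq_enable_preset, postamble_clk, dq_enable, and dq_capture_clk; the DQS delay (1), the optional LE buffers (2), and the t1 and t2 paths are marked.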
Notes to Figure 8:
(1) Invert combout of the IOE for the DQS pin before feeding into inclock of the IOE for the DQ pin. This inversion
is automatic if you use an altdq megafunction for the DQ pins.
(2) You can have 0, 1, or 2 LE buffers. These are required at lower frequencies to ensure that the capture registers are
not disabled too early.
(Waveform not reproduced: DQS, dq_enable_preset, and dq_enable.)
To analyze the timing of the circuit shown in Figure 8, assume that t1 is the
delay from the DQS pad through the compensated delay to registers AI
and BI, t2 is the delay from the DQS pad through register 1 (tCO) to the
enable pin of registers AI and BI, and T is the clock period. The timing
equations are then as follows:
Because t1 is the 72° phase shift plus some PVT variation, let t1a be the
part of the delay that varies with PVT, so that t1 = t1a + 0.2T. The
equations above then become:
The equation above shows that t2 and t1a vary the same way with PVT, so
when performing timing analysis, consider the maximum timing for t2
together with the maximum timing for t1a, and the minimum timing for t2
together with the minimum timing for t1a. The equations in Table 6 show
the timing requirements at specific frequencies of operation when using
the 72° phase shift:
Frequency   Equation
200 MHz     1 ns < t2 – t1a < 3 ns
166 MHz     1.2 ns < t2 – t1a < 3.6 ns
133 MHz     1.5 ns < t2 – t1a < 4.5 ns
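The three rows above are all consistent with bounds of 0.2T and 0.6T on t2 – t1a, where T is the clock period. That pattern is inferred here from the tabulated numbers, not quoted from an equation in this document, so treat the following as a sketch under that assumption. The function names are hypothetical.

```python
def postamble_bounds_ns(clock_mhz, lower_fraction=0.2, upper_fraction=0.6):
    """Bounds on (t2 - t1a) in ns, assuming limits of 0.2T and 0.6T (inferred from the table)."""
    period_ns = 1000.0 / clock_mhz
    return lower_fraction * period_ns, upper_fraction * period_ns


def meets_postamble_timing(t2_ns, t1a_ns, clock_mhz):
    """True if (t2 - t1a) falls strictly inside the assumed bounds."""
    lower, upper = postamble_bounds_ns(clock_mhz)
    return lower < (t2_ns - t1a_ns) < upper


for freq_mhz in (200, 166, 133):
    lower, upper = postamble_bounds_ns(freq_mhz)
    print(freq_mhz, "MHz:", round(lower, 1), "ns < t2 - t1a <", round(upper, 1), "ns")
```

Following the guidance above, check the condition once with the maximum t2 and maximum t1a, and once with the minimum t2 and minimum t1a.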
Figure 10 shows the read timing waveform when the Stratix and
Stratix GX DQS postamble circuitry is used.
Figure 10. Stratix & Stratix GX DQS Postamble Circuitry Read Timing Waveform
(Waveform not reproduced. Over a 0 ns to 25 ns time axis it plots DQS at the pin, DQ at the pin (D0H, D0L, D1H, D1L), DQS at the IOE, waveforms A (D0L, D1L), B (D0H, D1H), and C (D0H, D1H), resynch_clock, waveforms D (D0H, D1H) and E (D0L, D1L), Reset, and EnableN.)
Figure 11. DDR SDRAM Controller MegaCore System Level Block Diagram
(Diagram not reproduced. Blocks shown: example driver with a pass or fail output, local interface, control logic (encrypted), data path (open source), PLL fed by the input clock, the DDR SDRAM interface, and the DDR SDRAM.)
Table 7. Example 200-MHz Read Timing Analysis When Using DQS Circuitry in an EP1S25F1020C5 Device Note (1)
Maximum data delay (input)   0.599   0.989   DQ pin to IOE register delay from Quartus II + tPACKAGE
µtSU (4)                     0.133   0.276   Intrinsic setup time of the IOE register
µtH (4)                      0.032   0.068   Intrinsic hold time of the IOE register
Board specification: tEXT    0.020   0.020   Board trace variations on the DQ and DQS lines
Notes to Table 7:
(1) This example uses 72° phase shift.
(2) The memory numbers used here come from Micron MT16VDDT3264A.
(3) This example assumes that DLL is on only during initialization and refresh cycles. tDLLJITTER specifications for
Stratix and Stratix GX devices are available in the DC & Switching Characteristics chapter in the Stratix Device
Handbook, volume 1.
(4) These numbers are from the Quartus II software, version 5.0 SP1 using the Altera IP core DDR SDRAM Controller
version 3.2.0. Altera recommends using the latest version of the Quartus II software for your design.
(5) PLL phase shift is adjustable if you need to balance the setup and hold time margin.
Timing paths are analyzed by considering the data and clock arrival times
at the destination register. In Figure 12, the setup margin is defined as the
time between “earliest clock arrival time” and “latest valid data arrival
time” at the register ports. Similarly, hold margin is defined as the time
between “earliest invalid data arrival time” and the “latest clock arrival
time” at the register ports. These arrival times are calculated based on
propagation delay information with respect to a common reference point
(such as a DQS edge or system clock edge).
(Figure 12 is not reproduced. It marks the clock uncertainties, the earliest data invalid time, and tSU and tH at the register.)
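The margin definitions above can be written as two subtractions. The sketch below uses hypothetical arrival times, all measured from the same reference point, purely to show the arithmetic. It does not model the register's intrinsic setup and hold times (µtSU and µtH in Table 7), which a full analysis would also have to account for.

```python
def setup_margin_ns(earliest_clock_arrival_ns, latest_valid_data_arrival_ns):
    """Time between the latest valid data arrival and the earliest clock arrival."""
    return earliest_clock_arrival_ns - latest_valid_data_arrival_ns


def hold_margin_ns(earliest_invalid_data_arrival_ns, latest_clock_arrival_ns):
    """Time between the latest clock arrival and the earliest invalid data arrival."""
    return earliest_invalid_data_arrival_ns - latest_clock_arrival_ns


# Hypothetical arrival times in ns, relative to a common reference edge.
setup = setup_margin_ns(earliest_clock_arrival_ns=1.5, latest_valid_data_arrival_ns=0.75)
hold = hold_margin_ns(earliest_invalid_data_arrival_ns=3.0, latest_clock_arrival_ns=2.0)

print("setup margin:", setup, "ns")
print("hold margin:", hold, "ns")
```

A negative result from either function means the clock edge can fall outside the data valid window under worst-case conditions.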
You can perform a similar timing analysis for your interface with another
DDR SDRAM memory by replacing the tHP, tQHS, and tDQSQ values in
Table 7 on page 19 with those from your memory data sheet, and
Minimum_Clock_Delay, Maximum_Clock_Delay and Data_Delay, for
your device from the Quartus II software.
You can extract the clock and data delays from the file
<core name>_extraction_data.txt in the project folder. The data delay is
reported as dq_2_ddio in the text file. The clock delay is the sum
of dqsclk_2_ddio and dqspin_2_dqsclk. The largest and smallest
sums are the maximum and minimum clock delays, respectively.
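Once the dqspin_2_dqsclk and dqsclk_2_ddio values have been read out of the extraction file, finding the clock delay range is a matter of summing each pair and taking the extremes. The sketch below assumes the values have already been collected by hand into a list of pairs; it does not parse the file, whose layout is not described here, and the numbers are placeholders rather than real Quartus II output.

```python
def clock_delay_range_ns(delay_pairs):
    """Return (minimum, maximum) clock delay from (dqspin_2_dqsclk, dqsclk_2_ddio) pairs in ns."""
    sums = [pin_to_clk + clk_to_ddio for pin_to_clk, clk_to_ddio in delay_pairs]
    if not sums:
        raise ValueError("at least one delay pair is required")
    return min(sums), max(sums)


# Placeholder values in ns, not taken from any real compilation.
example_pairs = [(1.0, 0.5), (1.25, 0.75), (1.125, 0.5)]
minimum_clock_delay, maximum_clock_delay = clock_delay_range_ns(example_pairs)

print("Minimum clock delay:", minimum_clock_delay, "ns")
print("Maximum clock delay:", maximum_clock_delay, "ns")
```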
To obtain Quartus II software timing data for the target device, you
should instantiate and compile the DDR SDRAM Controller MegaCore.
If you are using your own controller logic, you should instantiate the
clear-text DDR SDRAM data path instead to obtain timing delays. For the
read interface, the MegaCore function extracts and reports timing delays
Figure 13 on page 23 shows the timing analysis and the round-trip delay
in Stratix and Stratix GX devices. The round-trip delay is the delay from
the FPGA clock to the DDR SDRAM and back to the FPGA (input to
register B). You can calculate whether the register outputs clocked by the
resynchronization clock need another resynchronization stage before
getting to the system clock domain. This analysis is required to reliably
transfer data from register A (in the IOE) to register B (in the LE).
Note: You can also use a feedback clock and a second PLL for the
resynchronization clock.
Figure 13 (diagram not reproduced) traces the round-trip path through points (A) to (I), from the FPGA clock out to the DDR SDRAM and back through DQS read and DQ read to capture register A and resynchronization register B (data_out). Labeled delays are tPD (DQS trace), tPD (capture), tCQ (capture), tPD (routing), and tSU and tH (resynchronization).
Once sampled by the negative edge of the 72° phase-shifted DQS pulse,
DQL and DQH are available for resynchronization.
To sample the Q output of register A into register B, you need the time
relationship between register B’s clock input and the D input, which
depends on the phase relationship between DQS and clock and involves
the following steps:
3. Apply the correct clock edge for the resynchronization logic in the
memory controller.
■ Clock delays between the FPGA global clock net and the DDR
SDRAM clock input.
■ DQS strobe delays between the DDR SDRAM clock input and DQS’s
arrival at the FPGA capture registers.
■ Read data delays between the output of register A and the input of
register B.
Figure 13 shows the individual delays between points (A) and (I). The
sum of all these delays is the round-trip delay. Figure 14 shows the timing
relationship of the signals for the delays between points (A) to (I) for a
CAS latency of 2.5.
Figure 14. Round-Trip Delay Calculation Without a Feedback Clock Note (1)
(Waveform not reproduced. It plots clk (A), the DQS strobe offset by tDQSCK, the 72° phase-shifted DQS (F), the clock input at register A (G) after tPD (capture), the Q output of register A (H) after tCQ (capture), and the captured DQ data at the D input of register B (I) after tPD (routing), and marks the round-trip delay and the resynchronization phase.)
Delay (A) to (B) is the clock-to-out time to generate the clock signals to the
DDR SDRAM device.
Delay (B) to (C) is the trace delay for the clock. If there are multiple
DIMMs or devices in the system, use the one furthest away from the
FPGA for the maximum calculation; use the one closest to the FPGA for
the minimum calculation.
Delay (C) to (D) is the relationship between the clock and the DQS strobe
timing during reads. This is tDQSCK in DDR SDRAM specifications,
nominally 0, but can vary by 0.75 ns, depending on the DDR SDRAM
device-speed grade. The DQS output strobe is only guaranteed to be
within tDQSCK of the clock input, so use tDQSCK (maximum), typically
+0.75 ns, for calculating the maximum round-trip delay; use tDQSCK
(minimum), typically –0.75 ns, for calculating the minimum delay.
Delay (D) to (E) is the trace delay for DQS, which typically matches the
trace delay for the DQ signals in the same byte group. To calculate the
maximum round-trip delay, use the byte group with the longest trace
lengths; to calculate the minimum round-trip delay, use the byte group
with the shortest. If there are multiple DIMMs or devices in the system,
use the one furthest from the FPGA for the maximum calculation and the
one closest to the FPGA for the minimum. Trace lengths between different
byte groups do not have to be tightly matched, but a difference between
the longest and shortest decreases the safe resynchronization window
where data can be reliably resynchronized.
PLL jitter and clock duty cycle also affect the round-trip delay. Add each
of these delays to the maximum value and subtract from the minimum
value. PLL jitter and clock duty cycles are not shown in Figure 13, but are
included in Table 8, which shows example round-trip delay calculations.
Delay                 Numbers in        Example Minimum   Example Maximum   Comments
                      Figures 13 & 14   Values (ns)       Values (ns)
tPD (clock to pin)    (A) to (B)        2.00              3.00              Equal to tCQ (DQS write)
tPD (clock trace)     (B) to (C)        0.33              0.50              2 to 3 inches at 166 ps per inch (2)
tPD (DQS trace)       (D) to (E)        0.33              0.50              2 to 3 inches at 166 ps per inch (2)
72° phase shift       (E) to (F)        0.90              1.10              Includes Stratix and Stratix GX DLL
                                                                            jitter and phase-shift error
Notes to Table 8:
(1) These numbers are not taken from a specific system or a specific device. The clock frequency in this example is
200 MHz.
(2) To know the exact delay for your system, perform a time domain reflectometry (TDR) analysis on your system.
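The round-trip delay is a running sum of minimum and maximum segment delays. The sketch below adds up only the Table 8 rows that appear above, plus the typical tDQSCK range of –0.75 ns to +0.75 ns quoted in the text for (C) to (D). Because the remaining segments, PLL jitter, and duty-cycle terms are not listed here, the result is a partial sum and not the full round-trip delay. The function and variable names are hypothetical.

```python
# (segment, minimum ns, maximum ns): only the segments listed in this section.
SEGMENTS = [
    ("tPD (clock to pin), (A) to (B)", 2.00, 3.00),
    ("tPD (clock trace), (B) to (C)", 0.33, 0.50),
    ("tDQSCK, (C) to (D)", -0.75, 0.75),
    ("tPD (DQS trace), (D) to (E)", 0.33, 0.50),
    ("72 degree phase shift, (E) to (F)", 0.90, 1.10),
]


def trace_delay_ns(length_inches, ps_per_inch=166):
    """Board trace delay in ns, using the 166 ps per inch figure from Table 8."""
    return length_inches * ps_per_inch / 1000.0


def partial_round_trip_ns(segments):
    """Sum the minimum and maximum columns separately, rounded to 0.01 ns."""
    minimum = sum(seg_min for _, seg_min, _ in segments)
    maximum = sum(seg_max for _, _, seg_max in segments)
    return round(minimum, 2), round(maximum, 2)


print("2-inch trace:", round(trace_delay_ns(2), 2), "ns")
print("3-inch trace:", round(trace_delay_ns(3), 2), "ns")
print("Partial sum, (A) to (F):", partial_round_trip_ns(SEGMENTS))
```

To extend the sketch to a complete analysis, append the remaining segments from your own timing data and add jitter and duty-cycle terms to the maximum while subtracting them from the minimum, as described above.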
Resynchronization Selections
When the DQS signal arrives at the Stratix or Stratix GX device, the
dedicated phase-shift circuitry shifts the signal to capture the DQ signals.
The DQ signals are then ready to be synchronized with the system clock.
The round-trip delay numbers vary depending on the board delay and
the device’s internal delay. Complete a timing analysis to decide whether
to use the falling or rising edge of the write PLL’s system clock for the
resynchronization registers. After calculating the maximum and minimum
round-trip delay, determine the equivalent number of system clock cycles
at your operating frequency to find the point at which the data becomes
valid relative to the clock. The example maximum delay shown in Table 8
represents 1.7 cycles at 200 MHz; the minimum represents 0.8 cycles. If
the CAS latency is included, which is equal to three in this example, the
example represents a minimum delay of 3.8 cycles and a maximum delay
of 4.7 cycles.
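The conversion between nanoseconds and clock cycles in this example can be sketched as follows. The 8.5 ns and 4.0 ns inputs are back-computed from the 1.7 and 0.8 cycles quoted above at 200 MHz; they are not read from Table 8. The function names are hypothetical.

```python
def delay_in_cycles(delay_ns, clock_mhz):
    """Express a delay in ns as a number of clock cycles."""
    return delay_ns * clock_mhz / 1000.0


def with_cas_latency(round_trip_cycles, cas_latency_cycles):
    """Total delay in cycles from the read command to valid data at the resynchronization register."""
    return round_trip_cycles + cas_latency_cycles


CLOCK_MHZ = 200
CAS_LATENCY = 3

max_cycles = delay_in_cycles(8.5, CLOCK_MHZ)  # back-computed maximum round-trip delay
min_cycles = delay_in_cycles(4.0, CLOCK_MHZ)  # back-computed minimum round-trip delay

print("Round-trip delay:", round(min_cycles, 2), "to", round(max_cycles, 2), "cycles")
print("Including CAS latency:",
      round(with_cas_latency(min_cycles, CAS_LATENCY), 2), "to",
      round(with_cas_latency(max_cycles, CAS_LATENCY), 2), "cycles")
```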
Figure 15 (waveform not reproduced) plots the system clock, CK at the FPGA (maximum), CK at the memory (maximum), DQS at the memory (maximum), DQS at the FPGA (maximum), DQS at the IOE input ports (maximum), DQ at the LE input ports (minimum and maximum), and DQ after the resynchronization. Labels in the diagram include "Read Command Latched Here", "Minimum Round-Trip Delay + CAS Latency + 0.5 period", "Maximum Round-Trip Delay + CAS Latency + 0.5 period", "Safe Resynchronization Window", and a marker at time = 5.4T.
The read command is clocked into the DDR SDRAM once it receives the
rising edge of clk (at time 0) from the Stratix or Stratix GX device. You
can calculate the safe resynchronization window valid time as follows:
The size of the safe resynchronization window in the example is then 0.1
cycle, calculated by the following equation:
The example in Table 8 on page 27, as depicted in Figure 15, shows that
numcycle is equal to 10 and that the safe resynchronization window
does not fall within a system clock edge.
Figure 16. Round-Trip Delay Diagram Clock Example Two Note (1)
(Waveform not reproduced. It plots the system clock, CK at the FPGA, CK at the memory, DQS at the memory, DQS at the FPGA, and DQ after the resynchronization, with markers at time = 0 s, 4.7T, 5.2T, and 5.7T. Labels include "Read Command Latched Here", "System Clock Edge to be used for Resynchronization", and LE µtCO.)
You can calculate the needed phase shift for the resynchronization clock
from the following equations:
You then need to convert the results to the equivalent degree phase shifts.
If the closest clock edge to the safe resynchronization window is negative,
add or subtract 180° after the conversion to shift the clock from the
positive edge. For the example in Table 8, the phase-shift
range is between 1.52 and 1.61 ns, based on the negative edge clock. The
midpoint of this range is 1.565 ns, which equates to ~113° (at a 200-MHz
clock). If you want to shift this clock from the positive edge of the system
clock, use either 293° (113° + 180°) or –67° (113° – 180°).
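The conversion from a time window to a phase setting can be reproduced with a few lines of arithmetic. The sketch below takes the 1.52 ns to 1.61 ns range from the example, converts its midpoint to degrees at 200 MHz, and then re-references the result to the positive clock edge. The function names are hypothetical.

```python
def delay_to_phase_deg(delay_ns, clock_mhz):
    """Convert a delay in ns into degrees of phase at the given clock frequency."""
    period_ns = 1000.0 / clock_mhz
    return delay_ns / period_ns * 360.0


def reference_to_positive_edge(phase_deg):
    """Return the two equivalent shifts (+180 and -180 degrees) measured from the positive edge."""
    return phase_deg + 180, phase_deg - 180


window_start_ns, window_end_ns = 1.52, 1.61
midpoint_ns = (window_start_ns + window_end_ns) / 2

phase_from_negative_edge = round(delay_to_phase_deg(midpoint_ns, 200))
print("Midpoint:", round(midpoint_ns, 3), "ns")
print("Phase from negative edge:", phase_from_negative_edge, "degrees")
print("From positive edge:", reference_to_positive_edge(phase_from_negative_edge))
```

Either of the two positive-edge values selects the same clock edge position; which one you enter depends on whether your PLL settings accept negative phase shifts.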
The Altera DDR SDRAM Controller MegaCore function allows you to set
the resynchronization cycle and phase (Figure 17). Resynchronization
cycle 0 starts at the first rising edge of the system clock
(clk) after the DQS signal’s first falling edge at the IOE register. Each
resynchronization cycle is one clock period. If there is no clock edge
within the safe resynchronization window, you must set the phase shift.
In the example shown in Figure 15, choose resynchronization cycle 0
with a 40° phase shift for a CAS latency of 2.0 or 3.0, and
resynchronization cycle 0 with a 220° phase shift for a CAS latency
of 2.5.
Figure 17. Effect of Read Round-Trip Delay on the Choice of Resynchronization Phase for RTL
Example Note (1)
(Waveform not reproduced. Against resynchronization cycles 0, 1, and 2, it plots dq (H, L), the theoretical Q output of register A at (H), and the actual data valid at the D input of register B at (I) after the maximum round-trip delay. For a CAS latency of 2.0 or 3.0 the resynchronization phase is one resynchronization cycle + 40°; for a CAS latency of 2.5 it is one resynchronization cycle + 220°.)
Read-Side Implementation Using a PLL
You can interface Stratix and Stratix GX devices with DDR SDRAM
devices without using the DQS phase-shift circuitry. This section
provides an example that uses two sets of PLLs (per I/O bank for best
performance), a write PLL and a read PLL, and a feedback clock between
the write PLL and the read PLL (Figure 18).
You can choose any of the user I/O pins for the DQ, DQS, and DM pins.
The board trace lengths for the DQ, DQS, and DM pins should be tightly
matched.
The write PLL generates the system clock, the –90° shifted clock, and the
feedback clock, FB_CLK. The feedback clock is routed outside the FPGA
and back into the FPGA. This board trace length should be equal to the
clock trace length from the FPGA to the memory, plus the DQ trace length
from the memory to the FPGA. If the clock trace length = l1 and the DQ
trace length = l2, the FB_CLK must have a trace length of l3 = (l1 + l2) (see
Figure 18).
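As a minimal sketch of the trace-length rule, the following Python computes the target FB_CLK length and the timing error that results if the routed length differs from l1 + l2. The function names, the board lengths, and the 0.17 ns/inch propagation delay are all assumptions chosen for illustration; use the figure for your own board stack-up.

```python
def fb_clk_length(l1: float, l2: float) -> float:
    """Target FB_CLK trace length l3 = l1 + l2 (same unit as the inputs)."""
    return l1 + l2


def fb_clk_skew_ns(l1, l2, l3_routed, delay_ns_per_unit):
    """Timing error from routing FB_CLK at l3_routed instead of l1 + l2."""
    return (l3_routed - fb_clk_length(l1, l2)) * delay_ns_per_unit


# Hypothetical board: lengths in inches; 0.17 ns/inch is an assumed value.
target = fb_clk_length(2.5, 2.0)
skew = fb_clk_skew_ns(2.5, 2.0, l3_routed=4.6, delay_ns_per_unit=0.17)
print(f"target l3 = {target:.2f} in, skew = {skew:.3f} ns")
```

Any residual skew computed this way is one of the terms the read PLL phase shift has to absorb.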
The read PLL uses the feedback clock as the input clock and generates the
clock needed to capture the DQ during reads. The DQS signal entering
the FPGA is ignored in this scheme. The read PLL is in normal mode, so
the PLL output at an LE register is in phase with the PLL input at the clock
pin. Because the trace length of the feedback clock is the same as the
CK/CK# and DQS trace, FB_CLK coming into the FPGA looks like the
DQS signal with a little bit of skew. The read PLL can then be shifted to
compensate for the skew and create the 90° PLL phase shift to capture the
DQ signals during reads.
Figure 18. DDR SDRAM Implementation on Side I/O Pins Notes (1), (2), (3)
[Block diagram: the write PLL, driven by input_clk, generates the system
clock, the –90° shifted write clock, and FB_CLK. CK or CK# (length = l1),
DQ "Write" (1) and DQ "Read" (1) (length = l2), and DQS (2) (length = l2)
connect the FPGA IOEs to the DDR SDRAM. FB_CLK (length = l3 = l1 + l2)
returns to the FPGA and drives the read PLL.]
Table 9. Example Read Timing Analysis for 150-MHz DDR SDRAM Interface in EP1S25F1020C5 Without
Using Dedicated DQS Circuitry

Parameter                  Specification                    Fast Corner  Slow Corner  Description
                                                            Model (ns)   Model (ns)
Memory Specifications (1)  tHP                              3.000        3.000        Half period as specified by the memory data sheet (including memory clock duty cycle distortion)
                           tAC                              0.700        0.700        Data-hold skew factor specified by the memory data sheet
FPGA Specifications        tPLLJITTER                       0.133        0.133        Output jitter specification for Stratix and Stratix GX device’s fast PLL
                           tPLLPSERR                        0.080        0.080        Phase shift error of the fast PLL
                           PLL Phase Shift (2)              1.000        1.000        Extra PLL phase shift to capture data (This is based on 54° PLL phase shift)
                           Minimum Clock Delay (Input) (3)  0.904        1.627        Minimum DQS pin to IOE register delay from Quartus II
                           Maximum Clock Delay (Input) (3)  0.958        1.738        Maximum DQS pin to IOE register delay from Quartus II
                           Data Delay (Input) (3)           0.590        1.037        DQ pin to IOE register delay from Quartus II
                           Minimum Data Delay (Input)       0.540        0.987        DQ pin to IOE register delay from Quartus II – package skew
                           Maximum Data Delay (Input)       0.640        1.087        DQ pin to IOE register delay from Quartus II + package skew
                           µtSU (4)                         0.133        0.280        Intrinsic setup time of the IOE register
                           µtH (4)                          0.032        0.276        Intrinsic hold time of the IOE register
Timing Calculation         tEARLY_CLOCK                     1.691        2.414        Earliest possible clock edge after DQS phase-shift circuitry and uncertainties (minimum clock delay + PLL phase shift – tPLLJITTER – tPLLPSERR)
                           tLATE_CLOCK                      2.171        2.951        Latest possible clock edge after DQS phase-shift circuitry and uncertainties (maximum clock delay + PLL phase shift + tPLLJITTER + tPLLPSERR)
                           tEARLY_DATA_INVALID              2.840        3.287        Time for earliest data to become invalid for sampling at FPGA flop (tHP – tAC + minimum data delay)
                           tLATE_DATA_VALID                 1.340        1.787        Time for latest data to become valid for sampling at FPGA flop (tAC + maximum data delay)
Results                    Read setup timing margin (2)     0.198        0.327        tEARLY_CLOCK – tLATE_DATA_VALID – µtSU – tEXT
                           Read hold timing margin (2)      0.617        0.331        tEARLY_DATA_INVALID – tLATE_CLOCK – µtH – tEXT
                           Total margin                     0.815        0.579        Setup margin + hold margin
Notes to Table 9:
(1) The memory numbers used here come from Micron MT16VDDT3264A clocked at 150 MHz.
(2) PLL phase shift is adjustable if you need to balance the setup and hold time margin.
(3) These numbers are from the Quartus II software, version 5.0 SP1 using the Altera IP core DDR SDRAM Controller
version 3.2.0. Altera recommends using the latest version of the Quartus II software for your design.
(4) These numbers are from the Quartus II software, version 5.0 SP1 using the Altera IP core DDR SDRAM Controller
version 3.2.0. Altera recommends using the latest version of the Quartus II software for your design.
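The Table 9 calculation can be reproduced with a short Python sketch. The class and function names are hypothetical, and the tEXT value of 0.020 ns is an assumption: it does not appear as a row in the table and is back-computed here from the printed setup margins.

```python
from dataclasses import dataclass


@dataclass
class ReadCorner:
    """One corner-model column of Table 9, all values in ns."""
    t_hp: float
    t_ac: float
    t_pll_jitter: float
    t_pll_pserr: float
    pll_phase_shift: float
    min_clock_delay: float
    max_clock_delay: float
    min_data_delay: float
    max_data_delay: float
    u_tsu: float
    u_th: float
    t_ext: float  # assumed: inferred from the printed margins, not a table row


def read_margins(c: ReadCorner):
    """Return (setup margin, hold margin) using the Table 9 formulas."""
    uncertainty = c.t_pll_jitter + c.t_pll_pserr
    early_clock = c.min_clock_delay + c.pll_phase_shift - uncertainty
    late_clock = c.max_clock_delay + c.pll_phase_shift + uncertainty
    early_data_invalid = c.t_hp - c.t_ac + c.min_data_delay
    late_data_valid = c.t_ac + c.max_data_delay
    setup = early_clock - late_data_valid - c.u_tsu - c.t_ext
    hold = early_data_invalid - late_clock - c.u_th - c.t_ext
    return round(setup, 3), round(hold, 3)


# Fast corner model column of Table 9.
fast = ReadCorner(t_hp=3.000, t_ac=0.700, t_pll_jitter=0.133,
                  t_pll_pserr=0.080, pll_phase_shift=1.000,
                  min_clock_delay=0.904, max_clock_delay=0.958,
                  min_data_delay=0.540, max_data_delay=0.640,
                  u_tsu=0.133, u_th=0.032, t_ext=0.020)
setup, hold = read_margins(fast)
print(setup, hold, round(setup + hold, 3))
```

Only the fast corner column is exercised here. The slow corner hold and total margins printed in the table do not sum consistently with the slow corner setup margin (0.327 + 0.331 is not 0.579), so check those values against your own Quartus II report before reusing them.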
To calculate the read PLL phase shift, add the 54° shift to the delay from
the DQ pin to the IOE register (tDQ2IOE) and account for the board trace
length skew. The calculation will have some error due to the skew
between DQ and CK from the memory itself. Figure 19 illustrates the DQ
and FB_CLK signal relationship for 150-MHz operation in an ideal
situation.
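A minimal sketch of this phase-shift calculation follows. The function name is hypothetical, and the example inputs are illustrative only: 0.590 ns is the fast corner Data Delay (Input) value from Table 9, and the board skew is assumed to be zero.

```python
def read_pll_phase_shift_deg(base_shift_deg, t_dq2ioe_ns, board_skew_ns, f_mhz):
    """Add the DQ-pin-to-IOE delay and board skew, in degrees, to the base shift."""
    period_ns = 1000.0 / f_mhz
    delay_deg = 360.0 * (t_dq2ioe_ns + board_skew_ns) / period_ns
    return base_shift_deg + delay_deg


# 150-MHz interface, 54 degree base shift, assumed zero board skew.
shift = read_pll_phase_shift_deg(54.0, t_dq2ioe_ns=0.590,
                                 board_skew_ns=0.0, f_mhz=150.0)
print(f"read PLL phase shift = {shift:.2f} degrees")
```

Because tDQ2IOE differs between the fast and slow corner models, a single phase setting is a compromise; rerun the calculation for both corners and confirm the result with a timing analysis.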
[Figure 19: timing diagram showing DQ at FPGA Pin and DQ at IOE Register,
with intervals of 3.333 ns and 1.5 ns marked.]
■ The tDQSCK
■ The skew between l3 and actual l1 + l2
■ The skew between tDQ2IOE used to compensate the PLL and the real
tDQ2IOE
■ The skew between the input to the PLL and the output of the PLL
Write-Side Implementation
Whether you are using the DQS phase-shift circuitry or the PLL to capture
data during a read operation from the DDR SDRAM device, there is only
one implementation for the write operation. As shown in Figure 4 on
page 10, the write side uses a PLL to generate the clocks listed in Table 10.
Clock                                         Description
System clock                                  Use this clock for the memory controller and to generate the DQS write and CK/CK# signals.
Write clock (–90° shifted from system clock)  Use this clock in the data path to generate the DQ write signals.
Feedback clock                                Use this optional clock only if you are not using the DQS phase-shift circuitry when reading from the DDR SDRAM device.
Resynchronization clock                       Use this optional clock only if you are using the DQS phase-shift circuitry and need a different clock phase shift than available for resynchronization.
Figure 20 shows the data path for DDR SDRAM write operations.
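Figure 20. Stratix & Stratix GX DDR SDRAM Write Data Path
[Block diagram of LE and IOE registers. The DQS path uses dqs_oe (1), VCC,
and clk to drive the DQS pin, with dqs_in as the input. The DQ path uses
dq_oe (1) and the write clock to register datain[15..0] as [15..8] and
[7..0] and drive DQ[7..0], with dq_in as the input.]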
Table 11. Example 200-MHz Write Timing Analysis for an EP1S25F1020C5 Device

Parameter              Specification               Fast Corner  Slow Corner  Description
                                                   Model (ns)   Model (ns)
Memory Specifications  tDS                         0.400        0.400        Memory Data Setup Requirement
                       tDH (1)                     0.400        0.400        Memory Data Hold Requirement
Timing Calculations    tEARLY_CLOCK                1.172        2.154        Earliest possible clock edge seen by memory device (minimum clock delay – tCLKSKEW – tPLLJITTER)
                       tLATE_CLOCK                 1.598        2.713        Latest possible clock edge seen by memory device (maximum clock delay + tCLKSKEW + tPLLJITTER)
                       tEARLY_DATA_INVALID         2.162        3.144        Time for earliest data to become invalid for sampling at the memory input pins (tHP – tDCD + minimum data delay – tIOSKEW)
                       tLATE_DATA_VALID            0.340        1.435        Time for latest data to become valid for sampling at the memory input pins (maximum data delay + tIOSKEW)
Results                Write setup timing margin   0.412        0.299        tEARLY_CLOCK – tLATE_DATA_VALID – tDS – tEXT
                       Write hold timing margin    0.144        0.011        tEARLY_DATA_INVALID – tLATE_CLOCK – tDH – tEXT
                       Total margin                0.556        0.310        Setup margin + hold margin
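The Results rows of Table 11 follow directly from the Timing Calculations rows, as the Python sketch below illustrates. The function name is hypothetical, and tEXT = 0.020 ns is an assumption back-computed from the printed margins rather than a value listed in the table.

```python
def write_margins(early_clock, late_clock, early_data_invalid,
                  late_data_valid, t_ds, t_dh, t_ext):
    """Return (setup, hold, total) write margins in ns per the Table 11 formulas."""
    setup = early_clock - late_data_valid - t_ds - t_ext
    hold = early_data_invalid - late_clock - t_dh - t_ext
    return round(setup, 3), round(hold, 3), round(setup + hold, 3)


T_EXT = 0.020  # assumed value, inferred from the printed margins

# Fast and slow corner model columns of Table 11.
fast = write_margins(1.172, 1.598, 2.162, 0.340, 0.400, 0.400, T_EXT)
slow = write_margins(2.154, 2.713, 3.144, 1.435, 0.400, 0.400, T_EXT)
print("fast:", fast)
print("slow:", slow)
```

The slow corner hold margin is the tightest figure in the table, so it is the first one to recheck if the memory device, board skew, or clock frequency changes.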
Stratix & Stratix GX DDR Characterization Data
The DDR SDRAM interface in Stratix and Stratix GX devices was
characterized under worst-case conditions. The Altera DDR SDRAM
Controller MegaCore function was used to access the DDR SDRAM
module. For more information on the characterization setup, contact
Altera Applications.
Board Design Guidelines
This section provides general guidelines for board design when using the
DDR SDRAM Controller MegaCore function and Stratix and Stratix GX
devices. It also provides information about decoupling capacitance. The
following general guidelines apply when designing with Stratix and
Stratix GX devices and DDR SDRAM.
[Figure: termination for address and control signals and for data strobe,
data mask, and data signals. Labels for each signal group: RS = 10 Ω, 50 Ω
trace, RT = 56 Ω, VTT, and DIMM pin.]
[Figure: CK and CK# trace spacing. Labels: CK Trace, CK# Trace, d, and >3×d.]
Decoupling Capacitance
Traditional methods for providing decoupling involve placing capacitors
in locations that are convenient based on the routing of the board, and
applying some predetermined ratio of capacitors to driver pins.
However, the higher switching speeds of DDR make typical ratios less
useful. Perform careful planning and analysis to ensure that sufficient
decoupling is provided. The amount of capacitance on a board is usually
not the critical limiting factor in designing a decoupling system. Typically,
the amount of inductance in the capacitor leads and the vias attaching the
capacitors to the power and ground planes creates limitations. Altera
recommends using 0.1-µF capacitors in an 0603-sized package to provide
sufficient capacitance without adding too much inductance. Place the VTT
voltage decoupling on the motherboard close to the parallel pull-up
resistors. Connect the decoupling capacitors between VTT and ground.
The Stratix and Stratix GX memory interface board has a 0.1-µF capacitor
for every other VTT pin. The Stratix and Stratix GX memory interface
board also has 0.1- and 0.01-µF capacitors for every VDD and VDDQ pin.
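To illustrate why lead and via inductance, rather than capacitance, tends to limit a decoupling network, the sketch below evaluates the standard series-resonance formula f = 1 / (2π√(LC)) for a 0.1-µF capacitor. The function name and the 1-nH and 3-nH mounted-inductance values are assumptions for illustration, not measured data for any board or package.

```python
import math


def self_resonant_freq_hz(c_farads: float, esl_henries: float) -> float:
    """Series self-resonant frequency of a capacitor with parasitic inductance."""
    return 1.0 / (2.0 * math.pi * math.sqrt(esl_henries * c_farads))


C = 0.1e-6  # 0.1-uF decoupling capacitor

# Assumed total mounted inductance (package + vias); hypothetical values.
f_low_esl = self_resonant_freq_hz(C, 1e-9)
f_high_esl = self_resonant_freq_hz(C, 3e-9)
print(f"1 nH: {f_low_esl / 1e6:.1f} MHz, 3 nH: {f_high_esl / 1e6:.1f} MHz")
```

Above the self-resonant frequency the capacitor behaves inductively, so reducing mounted inductance extends the frequency range over which it is effective.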
Conclusion
Stratix and Stratix GX devices have dedicated circuitry to interface with
up to 200-MHz DDR SDRAM with comfortable and consistent margins.
The circuitry dynamically adjusts with PVT variations and can be fine-tuned
for your system requirements. Stratix and Stratix GX devices use
dedicated circuitry when reading from the memory and use the PLL
when writing to the memory. This implementation simplifies board
layout and controller design.
Copyright © 2005 Altera Corporation. All rights reserved. Altera, The Programmable Solutions Company,
the stylized Altera logo, specific device designations, and all other words and logos that are identified as
trademarks and/or service marks are, unless noted otherwise, the trademarks and service marks of Altera
Corporation in the U.S. and other countries. All other product or service names are the property of their
respective holders. Altera products are protected under numerous U.S. and foreign patents and pending
applications, maskwork rights, and copyrights. Altera warrants performance of its semiconductor products
to current specifications in accordance with Altera's standard warranty, but reserves the right to make
changes to any products and services at any time without notice. Altera assumes no responsibility or liability
arising out of the application or use of any information, product, or service described herein except as
expressly agreed to in writing by Altera Corporation. Altera customers are advised to obtain the latest
version of device specifications before relying on any published information and before placing orders for
products or services.

101 Innovation Drive
San Jose, CA 95134
(408) 544-7000
www.altera.com
Applications Hotline: (800) 800-EPLD
Literature Services: literature@altera.com