Fundamentals of Ultra-Low Voltage Embedded Memory Design: Eric Karl
Eric Karl
Intel Fellow, Director of Embedded Memory Circuits & Technology
Technology Development, Intel Corporation
[Figure: System design challenge. Memory bandwidth (high to low) versus storage access (frequent to infrequent)]
• Reduced Data Movement Energy
• Energy Efficient Compute using
Array Structure
[Figure: SRAM read path. Wordline (WL); bitline (BL) and bitline bar (BLB) with precharge & equalize; multiplexor (optional); sense amplifier with sense amp enable (SAEN); small signal of 50-150mV for sensing]
E. Karl T7: Fundamentals of Ultra-Low Voltage Embedded Memory Design 12 of 66
© 2023 IEEE International Solid-State Circuits Conference
6T SRAM Read: Performance Concerns
Read performance failures are influenced by variation in the following parameters:
• Bitcell Read Current
• Bitline Capacitance
• Bitline Resistance
• Wordline RC
• Sense Amplifier Mismatch
[Figure: 6T bitcell (PU, PG, PD; nodes N0, N1) with WL at VCC - ∆VWLUD, and read waveforms of WL, BL/BLB and SENSE ENABLE comparing the typical bitline differential ∆V with a high-sigma cell that fails the sensing margin]
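The link between bitcell read current, bitline capacitance and sensing margin can be illustrated with a first-order estimate. The sketch below assumes a constant read current discharging a lumped bitline capacitance, so the time to develop a differential ∆V is roughly C·∆V/I. The function name and every numeric value are hypothetical and chosen only for illustration; bitline resistance, wordline RC and sense amplifier mismatch are not modeled.

```python
def bitline_develop_time(c_bl, delta_v, i_read):
    """Time (s) for a constant read current to discharge c_bl by delta_v."""
    if i_read <= 0:
        raise ValueError("read current must be positive")
    return c_bl * delta_v / i_read


# Hypothetical values, for illustration only (not from the tutorial).
C_BL = 20e-15         # lumped bitline capacitance (F)
DELTA_V = 0.100       # differential needed at the sense amplifier (V)
I_TYPICAL = 10e-6     # typical bitcell read current (A)
I_HIGH_SIGMA = 4e-6   # weak, high-sigma bitcell read current (A)

t_typ = bitline_develop_time(C_BL, DELTA_V, I_TYPICAL)
t_weak = bitline_develop_time(C_BL, DELTA_V, I_HIGH_SIGMA)
print(f"typical: {t_typ * 1e12:.0f} ps, high-sigma: {t_weak * 1e12:.0f} ps")
```

If the sense enable fires at a time set by the typical cell, the weaker cell has developed less than the required differential, which is the read performance failure described above.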
[Figure: 6T bitcell (PU, PG, PD; nodes N0, N1) during read with WL at VCC - ∆VWLUD and bitlines at VCC; N0/N1 waveforms showing an unstable cell; NMOS targeting versus PMOS targeting plot (strong to weak) marking the stability failure region]
Read stability failures result in a bitcell that loses state during a read
operation due to charge injection from bitline to internal nodes
Voltage transfer curves are superimposed to find SNM (the largest square that fits between the transfer curves)
[Figure: 6T bitcell (PUL, PUR, PGL, PGR, PDL, PDR; nodes n0, n1) with BL and BLB at VCC, and the two transfer curves plotted on n0 and n1 axes]
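The largest-square construction for SNM can be sketched numerically. The example below is a minimal sketch, assuming a hypothetical tanh-shaped inverter transfer curve in place of a real device model; `make_vtc`, `lobe_square` and `static_noise_margin` are illustrative names, not part of any tool. It walks 45-degree diagonals through each lobe of the superimposed curves, finds where each diagonal meets the two curves, and keeps the largest square side.

```python
import math


def make_vtc(vdd, vm, gain):
    """Hypothetical smooth inverter VTC (tanh model), not a device model."""
    def vtc(vin):
        return 0.5 * vdd * (1.0 - math.tanh(gain * (vin - vm) / vdd))
    return vtc


def bisect(fn, lo, hi, iters=60):
    """Root of fn on [lo, hi], assuming fn(lo) and fn(hi) differ in sign."""
    lo_positive = fn(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (fn(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lobe_square(f, g, vdd, steps=400):
    """Side of the largest square in the lobe between y = f(x) and x = g(y)."""
    best = 0.0
    for i in range(1, steps):
        c = vdd * i / steps  # 45-degree diagonal y = x + c
        # Corner on y = f(x): f(x) - (x + c) falls from positive to negative.
        x_a = bisect(lambda x: f(x) - (x + c), -c, vdd - c)
        # Corner on x = g(y): x - g(x + c) rises from negative to positive.
        x_b = bisect(lambda x: x - g(x + c), -c, vdd)
        best = max(best, x_a - x_b)
    return best


def static_noise_margin(f, g, vdd):
    """SNM is the smaller of the two lobe squares."""
    return min(lobe_square(f, g, vdd), lobe_square(g, f, vdd))


VDD = 1.0  # hypothetical supply (V)
inv = make_vtc(VDD, vm=0.5, gain=20.0)
print(f"SNM = {static_noise_margin(inv, inv, VDD):.3f} V")
```

Passing two different curves for `f` and `g` models a mismatched cell; the smaller lobe then sets the margin.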
[Figure: 6T bitcell (PU, PG, PD; nodes N0, N1) during write, with the write driver holding one bitline at VSS and the other at VCC, WL at VCC - ∆VWLUD and VCS at VCC→VCSMIN; N0/N1 waveforms showing a write failure; NMOS targeting versus PMOS targeting plot (strong to weak) marking the low write margin region]
Write margin failures occur when NMOS passgate devices are incapable of
overwriting state held by the cross-coupled inverter pair
Driven by NMOS passgate contention with PMOS pullup device
Static Write Voltage Margin (WVM)
Write voltage margin is the voltage at which the internal nodes flip
[Figure: 6T bitcell (PUL, PUR, PGL, PGR, PDL, PDR; nodes n0, n1) with BL = VCC; waveforms of WL, BLB, n0 and n1 voltage versus time]
In static write voltage margin measurement, bitline (BL) is held at VCC and
bitline bar (BLB) is ramped down
Random Variation analysis is usually applied to assess functionality under
specific process skew, voltage and temperature conditions
Write Voltage Margin is the bitline voltage at which the bit flips
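Extracting the write voltage margin from a BLB ramp can be sketched as a small post-processing step. The example assumes sampled waveforms for BLB, n0 and n1 are already available (for instance from a circuit simulation) and that n1, the node next to BLB, starts high. The function name and the synthetic waveforms are hypothetical and stand in for real simulation data.

```python
import math


def write_voltage_margin(blb, n0, n1):
    """Return the BLB voltage where n1 - n0 changes sign, or None if no flip."""
    for i in range(1, len(blb)):
        d_prev = n1[i - 1] - n0[i - 1]
        d_cur = n1[i] - n0[i]
        if d_prev > 0 >= d_cur:
            # Linear interpolation between the two samples around the flip.
            frac = d_prev / (d_prev - d_cur)
            return blb[i - 1] + frac * (blb[i] - blb[i - 1])
    return None


# Synthetic waveforms for illustration only: BLB ramps from 1.0 V to 0 V and
# the cell is made to flip near 0.3 V with a smooth transition.
blb = [1.0 - i / 200 for i in range(201)]
n1 = [0.5 * (1 + math.tanh((v - 0.3) / 0.01)) for v in blb]
n0 = [1.0 - v for v in n1]

wvm = write_voltage_margin(blb, n0, n1)
print(f"write voltage margin = {wvm:.3f} V")
```

Repeating this over many randomly varied cells at a given skew, voltage and temperature gives the distribution used for the random variation analysis mentioned above.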
peripheral circuits
Time-dependent writability (i.e. write-at-speed)
Bitline resistance and capacitance impact
Peripheral circuit impact
Simulation waveform of a write operation with TVC is shown
[Figure: simulated write waveforms with TVC: WL, WRDATA_b, WABIAS, sramvcc, vsswrdrv, n0, n1]
[Figure: Stability margin + write margin = combined margins, plotted over NMOS targeting and PMOS targeting (strong to weak), ranging from margins (better) to low margins (worse)]
Device sizing and targeting in SRAM is central to delivering adequate margins
Higher margin cells can operate at lower voltages without electrical failures
NMOS and PMOS targeting are constrained on the strong side by maximum acceptable leakage current
Landing zone for SRAM to meet VMIN (margins-driven), performance and leakage requirements is challenging in advanced CMOS technologies
[Figure: NMOS targeting (strong to weak) landing zone bounded by unacceptable read current (cell performance), unacceptable leakage and low margins (worse)]
[Figure: NMOS VTH; labels: Anneal, Etch]
[Figure: 6T bitcell (PU, PG, PD; nodes N0, N1) during write with VCS collapsed from VCC to VCSMIN (VCC→VCSMIN) and bitlines at VCC and VSS]
Collapse bitcell VCC supply node to weaken PU and reduce write contention
[Figure: unselected bitcell on the same column, Wordline = VSS]
Unselected bitcells along the column have dynamic retention risks (lose state)
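[Figure: unselected bitcell with BL and BL#, N0 storing "0" and N1 storing "1"; leakage paths discharge node capacitance CN1 toward VSS over time, setting the dynamic data retention time]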
VCS can be temporarily collapsed below Data Retention Voltage (DRV)
Timing and level are sensitive to leakage paths (transistor, defects, etc.)
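The sensitivity of collapse timing to leakage can be illustrated with a first-order estimate. The sketch assumes that, while VCS is collapsed, the stored "1" is held only by the node capacitance CN1 and that a constant leakage current discharges it, so the retention time is roughly C·∆V/I. The function name, the allowed droop and all numeric values are hypothetical; real leakage varies with voltage, temperature and defects.

```python
def dynamic_retention_time(c_node, droop_v, i_leak):
    """Time (s) for a constant leakage current to discharge c_node by droop_v."""
    if i_leak <= 0:
        raise ValueError("leakage current must be positive")
    return c_node * droop_v / i_leak


# Hypothetical values, for illustration only (not from the tutorial).
C_N1 = 0.2e-15   # storage node capacitance (F)
DROOP = 0.15     # droop the node can tolerate before losing state (V)

for i_leak in (0.1e-9, 1e-9, 10e-9):  # nominal, leaky and defective cells (A)
    t_ret = dynamic_retention_time(C_N1, DROOP, i_leak)
    print(f"I_leak = {i_leak * 1e9:5.1f} nA -> retention ~ {t_ret * 1e9:.2f} ns")
```

Under these assumptions a tenfold increase in leakage shortens the tolerable collapse window tenfold, so the collapse pulse has to be timed for the leakiest cell on the column.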
Negative Bitline (NBL) Write Assist
[Figure: NBL write-assist circuit and waveforms: WL0, BL/BL# between VCC and VSS, WREN, 2b TRIM, 3b TRIM, NBL-WA 1X/2X/4X]
[Figure: VMIN (A.U.) versus normalized write power at 675mV (1.0 to 1.6) for NOWA, TVC and NBL at narrow, mid and wide settings; 225mV annotation. LCV = Voltage Collapse, NBL = Negative Bitline]
Source: Kim, VLSI 2018
Both TVC and NBL circuits demonstrated to deliver 200-300mV VMIN enhancements on multiple nodes
Source: Chen, ISSCC 2014
[Figure: Wordline underdrive read assist (WLUD-RA). 6T bitcell (PU, PG, PD; nodes N0, N1) with VWL reduced by ∆VWLUD and bitlines at VCC; plot of Vmin (AU) versus VCCWL (V) from VCC to VCC - ∆V]
Source: Karl, ISSCC 2012
[Figure: 6T-SRAM cell (x7); sequential access versus parallel access]
“True” 2RW dual-port SRAM provides the most flexible access characteristics
for a 2-port memory, but introduces unique margin and timing challenges
2RW DP-SRAM: Supported Array Accesses
Different Row, Different Column
Different Row, Same Column
Same Row, Different Column
Same Row, Same Column
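The four supported access cases can be written down as a small classifier, which is handy when building a test plan for a dual-port array. This is a minimal sketch: the `AccessCase` enum and `classify_access` function are hypothetical names, and the row and column arguments are assumed to be the decoded addresses of port A and port B.

```python
from enum import Enum


class AccessCase(Enum):
    DIFF_ROW_DIFF_COL = "different row, different column"
    DIFF_ROW_SAME_COL = "different row, same column"
    SAME_ROW_DIFF_COL = "same row, different column"
    SAME_ROW_SAME_COL = "same row, same column"


def classify_access(row_a, col_a, row_b, col_b):
    """Classify a concurrent port A / port B access into one of four cases."""
    same_row = row_a == row_b
    same_col = col_a == col_b
    if same_row:
        return AccessCase.SAME_ROW_SAME_COL if same_col else AccessCase.SAME_ROW_DIFF_COL
    return AccessCase.DIFF_ROW_SAME_COL if same_col else AccessCase.DIFF_ROW_DIFF_COL


print(classify_access(3, 5, 3, 9).value)
```

A margin or timing sweep would typically be run once per case, since the slide's figure marks same-row accesses as the worse condition.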
[Figure: plots annotated "Worse for Same-Row Access", versus access WL (read or write) skew, including negative skew]
1R1W dual-port SRAM doesn’t support two concurrent write operations, but
has fewer timing and margin challenges than 2RW dual-port
[Figure: hierarchical 8T SRAM arrays, 64b/WL, with GBL merge and SDL]
Local Bitline (LBL): M0 or M2, ~32-128PP length
LBL Keeper: keeps LBL at VCC when DATA = 0
Primary-Secondary Flip-Flop
Vmin Failures in Logic Sequentials
Even fully interruptible state elements can fail at lower voltages,
depending upon the circuit topology employed
For the primary-secondary flip-flop from our example, the typical
limiters at lower voltages include:
Write-back charge-sharing across the pass-gate
Internal min-delay failures related to internal clock slopes
[Figure: Logic supply feeding the decoders; SRAM supply feeding the SRAM bitcells]
System Vmin
• Test Coverage Guardband: guardband for the very real, practical limitations in time-0 test coverage
• Regulator Tolerance
• Package Droop
• On-Die Droop
• RTN Noise: depending upon application and technology, random telegraph noise can introduce bit errors during operation
EOL Vmin
• Aging: aging guardband dependent upon lifetime stress condition required for application; some JEDEC standards may apply
• FoM Adjustment: Figure of Merit Vmin (standard way to quote Vmin) may need adjustment based upon temperature range, array size and distribution of functional parts desired
Time-0 Vmin
Time-0 Vmin @ Foundry FoM
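The way these guardbands build up from a foundry figure-of-merit Vmin to a system Vmin can be sketched as simple bookkeeping. The example assumes the guardbands add linearly, which is a simplification chosen here and not a method stated in the tutorial. Every number is a hypothetical placeholder in millivolts, and `system_vmin_mv` is an illustrative name.

```python
# Hypothetical guardband values in mV, for illustration only.
GUARDBANDS_MV = {
    "fom_adjustment": 30,   # temperature range, array size, yield target
    "aging": 40,            # end-of-life shift under lifetime stress
    "rtn_noise": 15,        # random telegraph noise
    "on_die_droop": 25,
    "package_droop": 20,
    "regulator_tolerance": 20,
    "test_coverage": 10,    # limits of time-0 test coverage
}


def system_vmin_mv(time0_fom_mv, guardbands_mv):
    """System Vmin as foundry FoM Vmin plus a linear sum of guardbands."""
    return time0_fom_mv + sum(guardbands_mv.values())


TIME0_FOM_MV = 550  # hypothetical time-0 foundry FoM Vmin (mV)
print(f"system Vmin ~ {system_vmin_mv(TIME0_FOM_MV, GUARDBANDS_MV)} mV")
```

Keeping each term as a named entry makes it easy to see which guardband dominates and to swap in a different combination rule if a linear sum is too pessimistic for a given application.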
[Figure: operating voltage by IP type (varies by IP), from nominal down to near-threshold: general purpose and LV optimized topologies, shown unassisted, +assist ckts and +ECC/repair. Limit: Vth + Random Variation + Systematic Variation]
Low voltage operation requires careful selection and optimization of storage elements
Assisted SRAM enables operation below nominal
2P SRAM, with independent optimization for read/write ports, can go further
Optimized Logic Sequentials can reach near-threshold regime