ECE 269
VLSI System Testing
Krish Chakrabarty
Built-In Self-Test (BIST)
ECE 269 Krish Chakrabarty 1
BIST Motivation
• Useful for field test and diagnosis (less expensive
than a local automatic test equipment)
• Software tests for field test and diagnosis:
Low hardware fault coverage
Low diagnostic resolution
Slow to operate
• Hardware BIST benefits:
Lower system test effort
Improved system maintenance and repair
Improved component repair
Better diagnosis
ECE 269 Krish Chakrabarty 2
1
Costly Test Problems
Alleviated by BIST
• Increasing chip logic-to-pin ratio – harder observability
• Increasingly dense devices and faster clocks
• Increasing test generation and application times
• Increasing size of test vectors stored in ATE
• Expensive ATE needed for 1 GHz clocking chips
• Hard testability insertion – designers unfamiliar with gate-
level logic, since they design at behavioral level
• Shortage of test engineers
• Circuit testing cannot be easily partitioned
ECE 269 Krish Chakrabarty 3
Typical Quality Requirements
• 98% single stuck-at fault coverage
• 100% interconnect fault coverage
• Reject ratio – 1 in 100,000
ECE 269 Krish Chakrabarty 4
2
Benefits and Costs of BIST
with DFT
Level Design Fabri- Manuf. Maintenance Diagnosis Service
and test cation Test test and repair interruption
Chips +/- + -
Boards +/- + - -
System +/- + - - - -
+ Cost increase
- Cost saving
+/- Cost increase may balance cost reduction
ECE 269 Krish Chakrabarty 5
Economics – BIST Costs
Chip area overhead for:
• Test controller
• Hardware pattern generator
• Hardware response compacter
• Testing of BIST hardware
Pin overhead -- At least 1 pin needed to activate BIST operation
Performance overhead – extra path delays due to BIST
Yield loss – due to increased chip area or more chips in system
because of BIST
Reliability reduction – due to increased area
Increased BIST hardware complexity – happens when BIST
hardware is made testable
ECE 269 Krish Chakrabarty 6
3
BIST Benefits
• Faults tested:
Single combinational / sequential stuck-at faults
Delay faults
Single stuck-at faults in BIST hardware
• BIST benefits
Reduced testing and maintenance cost
Lower test generation cost
Reduced storage / maintenance of test patterns
Simpler and less expensive ATE
Can test many units in parallel
Shorter test application times
Can test at functional system speed
ECE 269 Krish Chakrabarty 7
BIST Process
• Test controller – Hardware that activates self-test simultaneously on all
PCBs
• Each board controller activates parallel chip
• BIST Diagnosis effective only if very high fault coverage
ECE 269 Krish Chakrabarty 8
4
BIST Architecture
• Note: BIST cannot test wires and transistors:
From PI pins to Input MUX
From POs to output pins
ECE 269 Krish Chakrabarty 9
BILBO – Works as Both a PG
and a RC
• Built-in Logic Block Observer (BILBO) -- 4 modes:
1. Flip-flop
2. LFSR pattern generator
3. LFSR response compacter
4. Scan chain for flip-flops
ECE 269 Krish Chakrabarty 10
5
Complex BIST Architecture
• Testing epoch I:
LFSR1 generates tests for CUT1 and CUT2
BILBO2 (LFSR3) compacts CUT1 (CUT2)
• Testing epoch II:
BILBO2 generates test patterns for CUT3
LFSR3 compacts CUT3 response
ECE 269 Krish Chakrabarty 11
Bus-Based BIST Architecture
• Self-test control broadcasts patterns to each CUT over bus – parallel
pattern generation
• Awaits bus transactions showing CUT’s responses to the patterns:
serialized compaction
ECE 269 Krish Chakrabarty 12
6
Pattern Generation
• Store in ROM – too expensive
• Exhaustive
• Pseudo-exhaustive
• Pseudo-random (LFSR) – Preferred method
• Binary counters – use more hardware than LFSR
• Modified counters
• Test pattern augmentation
LFSR combined with a few patterns in ROM
Hardware diffracter – generates pattern cluster in neighborhood of
pattern stored in ROM
ECE 269 Krish Chakrabarty 13
Exhaustive Pattern Generation
• Shows that every state and transition works
• For n-input circuits, requires all 2n vectors
• Impractical for n > 20
ECE 269 Krish Chakrabarty 14
7
Pseudo-Exhaustive Method
Partition large circuit into fanin cones
Backtrace from each PO to PIs influencing it
Test fanin cones in parallel
Reduced # of tests from 28 = 256 to 25 x 2 = 64
Incomplete fault coverage
ECE 269 Krish Chakrabarty 15
Pseudo-Exhaustive Pattern
Generation
ECE 269 Krish Chakrabarty 16
8
Random Pattern Testing
Bottom:
random-pattern
resistant circuit
ECE 269 Krish Chakrabarty 17
Pseudo-Random Pattern
Generation
• Standard Linear Feedback Shift Register (LFSR)
Produces patterns algorithmically – repeatable
Has most of desirable random # properties
• Need not cover all 2n input combinations
• Long sequences needed for good fault coverage
ECE 269 Krish Chakrabarty 18
9
Matrix Equation for Standard
LFSR
X0 (t + 1) 0 1 0 … 0 0 X0 (t)
X1 (t + 1) 0 0 1 … 0 0 X1 (t)
. . . . . . .
. . . . . . .
. . . . . . .
Xn-3 (t + 1) = 0 0 0 … 1 0 Xn-3 (t)
Xn-2 (t + 1) 0 0 0 … 0 1 Xn-2 (t)
Xn-1 (t + 1) 1 h1 h2 … hn-2 hn-1 Xn-1 (t)
X (t + 1) = Ts X (t) (Ts is companion matrix)
ECE 269 Krish Chakrabarty 19
LFSR Implements a Galois
Field
Galois field (mathematical system):
Multiplication by x same as right shift of LFSR
Addition operator is XOR ( ⊕)
Ts companion matrix:
1st column 0, except nth element which is always 1 (X0 always
feeds Xn-1)
Rest of row n – feedback coefficients hi
Rest is identity matrix I – means a right shift
• Near-exhaustive (maximal length) LFSR
Cycles through 2n – 1 states (excluding all-0)
1 pattern of n 1’s, one of n-1 consecutive 0’s
ECE 269 Krish Chakrabarty 20
10
Standard n-Stage LFSR
Implementation
• Autocorrelation – any shifted sequence same as original in
2n-1 – 1 bits, differs in 2n-1 bits
• If hi = 0, that XOR gate is deleted
ECE 269 Krish Chakrabarty 21
LFSR Theory
• Cannot initialize to all 0’s – hangs
• If X is initial state, progresses through states X, Ts X,
Ts2 X, Ts3 X, …
• Matrix period:
Smallest k such that Tsk = I
k ≡ LFSR cycle length
• Described by characteristic polynomial:
f (x) = |Ts – I X |
= 1 + h1 x + h2 x2 + … + hn-1 xn-1 + xn
ECE 269 Krish Chakrabarty 22
11
Example External XOR LFSR
• Characteristic polynomial f (x) = 1 + x + x3
(read taps from right to left)
ECE 269 Krish Chakrabarty 23
External XOR LFSR
• Pattern sequence for example LFSR (earlier):
X0 1 0 0 1 0 1 1 1 0
X1 0 0 1 0 1 1 1 0 0 …
X2 0 1 0 1 1 1 0 0 1
• Always have 1 and xn terms in polynomial
• Never repeat an LFSR pattern more than 1 time –Repeats same
error vector, cancels fault effect
X0 (t + 1) 0 1 0 X0 (t)
X1 (t + 1) = 0 0 1 X1 (t)
X2 (t + 1) 1 1 0 X2 (t)
ECE 269 Krish Chakrabarty 24
12
Generic Modular LFSR
ECE 269 Krish Chakrabarty 25
Modular Internal XOR LFSR
• Described by companion matrix Tm = Ts T
• Internal XOR LFSR – XOR gates in between D flip-flops
• Equivalent to standard External XOR LFSR
With a different state assignment
Faster – usually does not matter
Same amount of hardware
• X (t + 1) = Tm x X (t)
• f (x) = | Tm – I X |
= 1 + h1 x + h2 x2 + … + hn-1 xn-1 + xn
• Right shift – equivalent to multiplying by x, and then
dividing by characteristic polynomial and storing the
remainder
ECE 269 Krish Chakrabarty 26
13
Modular LFSR Matrix
X0 (t + 1) 0 0 0 … 0 0 1 X0 (t)
X1 (t + 1) 1 0 0 … 0 0 h1 X1 (t)
X2 (t + 1) 0 1 0 … 0 0 h2 X2 (t)
. . . . . . . .
. = . . . . . . .
. . . . . . . .
Xn-3 (t + 1) 0 0 0 … 0 0 hn-3 Xn-3 (t)
Xn-2 (t + 1) 0 0 0 … 1 0 hn-2 Xn-2 (t)
Xn-1 (t + 1) 0 0 0 … 0 1 hn-1 Xn-1 (t)
ECE 269 Krish Chakrabarty 27
Example Modular LFSR
• f (x) = 1 + x2 + x7 + x8
• Read LFSR tap coefficients from left to right
ECE 269 Krish Chakrabarty 28
14
Primitive Polynomials
• Want LFSR to generate all possible 2n – 1 patterns (except
the all-0 pattern)
• Conditions for this – must have a primitive polynomial:
Monic – coefficient of xn term must be 1
• Modular LFSR – all D FF’s must right shift through
XOR’s from X0 through X1, …, through Xn-1, which
must feed back directly to X0
• Standard LFSR – all D FF’s must right shift directly
from Xn-1 through Xn-2, …, through X0, which must
feed back into Xn-1 through XORing feedback
network
ECE 269 Krish Chakrabarty 29
Primitive
Primitive Polynomials
Polynomials
(continued)
(continued)
Characteristic polynomial must divide the polynomial 1
+ xk for k = 2n – 1, but not for any smaller k value
See Appendix B of book for tables of primitive
polynomials
• If p (error) = 0.5, no difference between behavior
of primitive & non-primitive polynomial
• But p (error) is rarely = 0.5 In that case, non-
primitive polynomial LFSR takes much longer to
stabilize with random properties than primitive
polynomial LFSR
ECE 269 Krish Chakrabarty 30
15
Weighted Pseudo-Random
Pattern Generation
s-a-0
F
• If p (1) at all PIs is 0.5, pF (1) = 0.58 = 1
256
1
pF (0) = 1 – = 255
256 256
• Will need enormous # of random patterns to test a stuck-at
0 fault on F -- LFSR p (1) = 0.5
We must not use an ordinary LFSR to test this
• IBM – holds patents on weighted pseudo-random pattern
generator in ATE
ECE 269 Krish Chakrabarty 31
Weighted Pseudo-Random
Pattern Generator
• LFSR p (1) = 0.5
• Solution: Add programmable weight selection
and complement LFSR bits to get p (1)’s other
than 0.5
• Need 2-3 weight sets for a typical circuit
• Weighted pattern generator drastically shortens
pattern length for pseudo-random patterns
ECE 269 Krish Chakrabarty 32
16
Weighted Pattern Gen.
w1 w2 Inv. p (output) w1 w2 Inv. p (output)
0 0 0 ½ 1 0 0 1/8
0 0 1 ½ 1 0 1 7/8
0 1 0 ¼ 1 1 0 1/16
0 1 1 3/4 1 1 1 15/16
ECE 269 Krish Chakrabarty 33
Test Pattern Augmentation
• Secondary ROM – to get LFSR to 100% SAF coverage
Add a small ROM with missing test patterns
Add extra circuit mode to Input MUX – shift to ROM patterns
after LFSR done
Important to compact extra test patterns
• Use diffracter:
Generates cluster of patterns in neighborhood of stored ROM
pattern
• Transform LFSR patterns into new vector set
• Put LFSR and transformation hardware in full-scan chain
ECE 269 Krish Chakrabarty 34
17
Response Compaction
• Severe amounts of data in CUT response to LFSR patterns –
example:
Generate 5 million random patterns
CUT has 200 outputs
Leads to: 5 million x 200 = 1 billion bits response
• Uneconomical to store and check all of these responses on
chip
• Responses must be compacted
ECE 269 Krish Chakrabarty 35
Definitions
• Aliasing – Due to information loss, signatures of good and
some bad machines match
• Compaction – Drastically reduce # bits in original circuit
response – lose information
• Compression – Reduce # bits in original circuit response – no
information loss – fully invertible (can get back original
response)
• Signature analysis – Compact good machine response into
good machine signature. Actual signature generated during
testing, and compared with good machine signature
• Transition Count Response Compaction – Count # transitions
from 0 1 and 1 0 as a signature
ECE 269 Krish Chakrabarty 36
18
Transition Counting
ECE 269 Krish Chakrabarty 37
Transition Counting Details
Transition count:
m
C (R) = Σ (ri ⊕ ri-1) for all m primary outputs
i=1
To maximize fault coverage:
Make C (R0) – good machine transition
count – as large or as small as possible
ECE 269 Krish Chakrabarty 38
19
LFSR for Response Compaction
• Use cyclic redundancy check code (CRCC) generator
(LFSR) for response compacter
• Treat data bits from circuit POs to be compacted as a
decreasing order coefficient polynomial
• CRCC divides the PO polynomial by its characteristic
polynomial
Leaves remainder of division in LFSR
Must initialize LFSR to seed value (usually 0) before testing
• After testing – compare signature in LFSR to known good
machine signature
• Critical: Must compute good machine signature
ECE 269 Krish Chakrabarty 39
Example Modular LFSR
Response Compacter
• LFSR seed value is “00000”
ECE 269 Krish Chakrabarty 40
20
Polynomial Division
Inputs X0 X1 X2 X3 X4
Initial State 0 0 0 0 0
1 1 0 0 0 0
0 0 1 0 0 0
Logic 0 0 0 1 0 0
Simulation: 0 0 0 0 1 0
1 1 0 0 0 1
0 1 0 0 1 0
1 1 1 0 0 1
0 1 0 1 1 0
Logic simulation: Remainder = 1 + x2 + x3
0 1 0 1 0 0 0 1
0 x0 + 1 x1 + 0 x2 + 1 x3 + 0 x4 + 0 x5 + 0 x6 + 1 x7
.
ECE 269 Krish Chakrabarty 41
Symbolic Polynomial Division
x2 + 1
x7 + x3 +x
x5 + x3 + x + 1 7 5 3 2
x +x + x +x
x5 + x2 + x
x5 + x3 +x +1
remainder 3
x +x 2
+1
Remainder matches that from logic simulation
of the response compacter!
ECE 269 Krish Chakrabarty 42
21
Multiple-Input Signature
Register (MISR)
• Problem with ordinary LFSR response
compacter:
Too much hardware if one of these is put on each
primary output (PO)
• Solution: MISR – compacts all outputs into one
LFSR
Works because LFSR is linear – obeys superposition
principle
Superimpose all responses in one LFSR – final
remainder is XOR sum of remainders of polynomial
divisions of each PO by the characteristic polynomial
ECE 269 Krish Chakrabarty 43
Modular MISR Example
ECE 269 Krish Chakrabarty 44
22
Taxonomy
BIST methods
Off-line On-line
Functional Structural Concurrent Non-concurrent
ECE 269 Krish Chakrabarty 45
Test-Per-Scan BIST (Scan-
BIST)
LFSR Scan chain LFSR
Pattern Response
generator monitor
Circuit
under
test
• High testing time (one pattern every n+1 cycles)
• At-speed testing not possible
• Low performance degradation (none beyond scan)
• Fairy low overhead
ECE 269 Krish Chakrabarty 46
23
Test-Per-Clock BIST
Input scan
register (reconfigured Circuit
as TPG) under
test Output scan
register (reconfigured
as SA)
• Suitable for register-based designs
• Low testing time (one pattern every cycle)
• At-speed testing achieved
• Performance degradation (delay on functional paths)
• Overhead may be high
ECE 269 Krish Chakrabarty 47
BIST Pattern Generators
• LFSRs are good pseudorandom pattern generators
– Low cost, easy to implement
• Suitable for test-per-scan and test-per-clock
• LFSR-generated patterns exhibit high degree of
randomness (can be measured through statistical
correlation tests)
ECE 269 Krish Chakrabarty 48
24
BIST Pattern Generators
• For test-per-clock BIST, non-random patterns are often needed,
especially for random-pattern-resistant faults
• Test length with pseudorandom patterns only is excessive
• How to generate non-random patterns?
– Use weighted random patterns
– Use mapping logic
LFSR
LFSR
Mapping logic
ECE 269 Krish Chakrabarty 49
BIST Pattern Generators
• For scan-BIST, the bits in the LFSR sequence should have
low linear correlation and sequence length should be large
– Use primitive polynomials
– Use longer LFSRs
– Use bit-fixing logic
LFSR Scan chain
Bit-fixing logic
LFSR Scan chain
From LFSR
From control logic Fix-0 Test patterns
Fix-1
ECE 269 Krish Chakrabarty 50
25
BIST Architectures
Chip
• Centralized and separate
Distribution
Distribution
CUT
circuit
circuit
TPG SA BIST
– Lower overhead
CUT
– Lower fault coverage and
high testing time
BIST – Easy to control and
controller implement
ECE 269 Krish Chakrabarty 51
BIST Architectures
Chip
• Distributed and separate
TPG CUT SA BIST
– Higher overhead
TPG CUT SA – Low testing time
(parallelism)
– Higher fault coverage
Chip • Distributed and embedded
BIST
TPG SA
– Lower overhead
CUT – Difficult to control
– In-system reconfiguration
TPG SA employed
ECE 269 Krish Chakrabarty 52
26
STUMPS Architecture
• Self-testing using MISR and PRPG
• Centralized and separate BIST
• Multiple scan paths
Primary inputs
Scan path
Scan path
PRPG MISR
CUT
Scan path
Primary outputs
ECE 269 Krish Chakrabarty 53
Built-in Logic Block Observer
(BILBO)
• Reconfigure registers to act in four modes of operation:
functional (parallel load), scan (serial), LFSRs and MISRs
Z1 Z2 Z3
B1
B2
Si
0M
U D Q D Q D Q
1X
Q Q Q
Q1 Q2 Q3
ECE 269 Krish Chakrabarty 54
27
BILBO Operation Modes
B1 = B2 = 1 (normal mode)
Z1 Z2 Z3
D Q D Q D Q
Q1 Q2 Q3
B1 = B2 = 0 (shift register, scan mode)
Si D D D
Q Q Q
B1 = 1, B2 = 0 (MISR, response mode), B1 = 0, B2 = 1 (LFSR mode),
ECE 269 Krish Chakrabarty 55
28