Viterbi Algorithm
Advanced Architectures
Lecture 15
Vladimir Stojanović
6.973 Communication System Design – Spring 2006
Massachusetts Institute of Technology
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Radix 2 ACS
Gnx0 = min ( G 0nx- 1 + λ0nx-01, G 1x 1x 0
n - 1 + λn - 1 (
λ 0x0 1x0
n-1 n n λn
x0
λ 0x0 λ 0x0 x0 2-way dn
Gn0 x- 1
0 n G x0 n dn G0x
1 n n-1 ACS Gnx0
λ1x0 Gn0x- 1 Å λ n λ 1x1
0x1
Compare
n n
Select
0x1 λ 1x0
n Gnx0 x1
λn 2-way dn
Gn1x- 1 Gn1x- 1
G1nx- 1 1
0 Gnx1 ACS G nx1
λ 1nx1
Figure by MIT OpenCourseWare.
Radix-2 trellis 2-way ACS Radix-2 ACS Unit
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 2
Radix-4 trellis
n-2 n-1 n n-2 n-1 n n-2 n
0 0 0 0 0 0 0 0
1 1 1 2 1 1 2 1
2 2 2 4 4 2 4 2
3 3 3 6 5 3 6 3
4 4 4 1 2 4 1 4
5 5 5 3 3 5 3 5
6 6 6 5 6 6 5 6
7 7 7 7 7 7 7 7
Figure by MIT OpenCourseWare.
8-state Radix-2 trellis 4-state subtrellis 8-state Radix-4 trellis
Radix - 2k complexity speed measures
Ideal Complexity Area
Radix k speedup increase efficiency
2 1 1 1 1
4 2 2 2 1
8 3 3 4 0.75
16 4 4 8 0.5 Figure by MIT OpenCourseWare.
3
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Radix-4 ACS
λ00 x00 λ01 x00 λ10 x00 λ11 x00
n n n n
d nx 00
n λ00 x00 d nx 00 G 0n 0- 2x 4-way
n-2 n ACS Gnx00
Gn00- x2 Gnx 00 G 0n0-x2 λ00 x01 λ01 x01 λ10 x01 λ11 x01
n n n n
λ01 x00
n
d nx 01
Gn01- x2 Gnx 01 G 0n1-x2 Å G 0n 1- 2x 4-way
Compare
ACS Gnx01
Select
λ10
n
x00 Gnx 00
x10 λ01 x10 λ10 x10 λ11 x10
Gn10- x2 Gnx10 λ00
n n n n
G1n0-x2 Å
λ11 x00 d nx 10
Gn11- x2 Gnx11 n G1n 0- 2x 4-way
ACS Gnx10
Gn11-x2 Å
λ00 x11 λ01 x11 λ10 x11 λ11 x11
n n n n
4-way d nx 11
G1n 1- 2x ACS Gnx11
Radix-4 trellis 4-way ACS Radix-4 ACS unit
Figure by MIT OpenCourseWare.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 4
Radix-4 ACS implementation
Use ripple carry adders and
comparators Image removed due to copyright restrictions.
Take advantage of the ripple
profile to hide the compare
Delay 17% longer than 2-way
ACS due to
Increased adder fanout
4:1 mux instead of 2:1 mux
Overall, results in 1.7x
speedup compared to 2-way
ACS
Figure from from Black, P. J., and T. H. Meng. "A 140-Mb/s, 32-state, Radix-4 Viterbi Decoder."
IEEE Journal of Solid-State Circuits 27 (1992): 1877-1885. Copyright 1992 IEEE. Used with permission.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 5
Radix-4 placement and routing
Paths given as Hamiltonian cycles (visit each node
in the graph once)
Images removed due to copyright restrictions.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 6
Modulo arithmetic for ACS
Viterbi algorithm inherently bounds the maximum
dynamic range ∆max of state metrics
∆max ≤ λmaxlog2N (N-number of states, λmax maximum
branch metric of the radix-2 trellis)
Number theory
Given two numbers a and b such that |a-b|< ∆
Comparison |a-b| can be evaluated as |a-b| mod 2∆ without
ambiguity
Hence state metrics can be updated and compared
modulo 2∆max
Choose state metric precision to implement modulo by ignoring
the state metric overflow
Required state metric precision equal to twice the maximum
dynamic range of the updated state metrics
Required number of bits is Γbits=ceil[ log2(∆max+kλmax) ] + 1
k – accounts for branch metric addition
Example values (for the 32-state radix-4 decoder)
k=2, λmax=14, ∆max=70, Γbits=8
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 7
Branch metric unit
Example (8-level soft input, R=1/2, K=6 (32 state)
λ(S1S2)=|G1-S1|+|G2-S2| (G-received sample, S-expected
sample) (λmax=14)
4 bits required for radix-2 branch metrics
5 bits for the radix-4 branch metrics
Images removed due to copyright restrictions.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 8
State-metric initialization
Need to start from right state metrics for dynamic range
bound to hold (and for modulo arithmetic to be valid)
This is b/c there are constraints on the state metric values
imposed by the trellis structure
For example state-0 and state-1 have a common ancestor state
one iteration back
This constrains the state metrics to differ at most by the λmax
Find the right initial metric through simulation (with all zero
inputs) until steady state is reached
Figure removed due to copyright restriction.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 9
Decoder block diagram
Decoder output
4
LIFO
4 Decisions
Radix-16 trace-back unit
32 x 4 Radix-16 decisions
Decision memory
32 x 4 Radix-16 decisions
Radix-4 pretrace-back unit
32 x 2 Radix-4 decisions
Radix-4 ACS array
16 x 5 Metrics
Radix-4 branch metric unit
4x4 Metrics Metrics 4x4
Radix-2 BMU Radix-2 BMU
2x3 G 1, G 2 G 1, G 2 2x3
Decoder inputs
Figure by MIT OpenCourseWare.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 10
Decision memory organization
Survivor paths always merge L=5K steps back
Decode Survivor path Write
block merge block block
D L D
Decision vector
Trace-back Write
{
{ Read
region
Write
region
Figure by MIT OpenCourseWare.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 11
Advanced algorithmic transformations
Sliding block Viterbi decoder (SBVD)
Based on two important observations (µ+1=constraint
length of the code)
1. Survivor paths merge L=5µ iterations back into the trellis
2. After K=5µ steps, state metrics independent on the initial
value of state metrics
Unknown state at time n can be decoded using only
information from the block [n-K, n+L]
Cannot store all the values in the memory
Have to obtain them “on-the-fly”
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 12
SBVD implementation
Can find shortest path by running forward or backward
At step m
Forward processing
4 survivors
Backward processing
4 shortest paths
Combined
Smallest concatenated state metric
Starting state for trace-back of the shortest path
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder."
IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Continue fw and backw operation
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 13
Forward vs. Forward-Backward
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder."
IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Can decode more than one state (M – states)
Fw-Bw has reduced decoding delay and skew buffer memory
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 14
Continuous stream processing
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder."
IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Cut the incoming stream in overlapping chunks
Process in parallel
Outputs are non-overlapping
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 15
Systolic SBVD architecture
Since block is bounded
Can unroll and pipeline
ACS
Trace-back
M=2L
Has to be a multiple of L
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder."
IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 16
Example for L=2
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder."
IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 17
ACS units
Figures from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder."
IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 18
A 64-state example
Not fully parallel (8 radix-4 ACS units)
2 radix-4 butterflies in each cycle
8 cycles for 64 states radix-4 (i.e. two radix-2 steps)
Images removed due to copyright restrictions.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 19
Readings
[1] A.P. Hekstra "An alternative to metric rescaling in Viterbi
decoders," Communications, IEEE Transactions on vol. 37, no.
11, pp. 1220-1222, 1989.
[2] P.J. Black and T.H. Meng "A 140-Mb/s, 32-state, radix-4
Viterbi decoder," Solid-State Circuits, IEEE Journal of vol. 27,
no. 12, pp. 1877-1885, 1992.
[3] P.J. Black and T.Y. Meng "A 1-Gb/s, four-state, sliding block
Viterbi decoder," Solid-State Circuits, IEEE Journal of vol. 32,
no. 6, pp. 797-805, 1997.
[4] M. Anders, S. Mathew, R. Krishnamurthy and S. Borkar "A
64-state 2GHz 500Mbps 40mW Viterbi accelerator in 90nm
CMOS," VLSI Circuits, 2004. Digest of Technical Papers. 2004
Symposium on, pp. 174-175, 2004.
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
6.973 Communication System Design 20