DSP (2007)
Computation of DFT
Computation of DFT
Efficient algorithms for computing DFT Fast Fourier Transform. (a) Compute only a few points out of all N points (b) Compute all N points What are the efficiency criteria? Number of multiplications Number of additions Chip area in VLSI implementation
DFT as a Linear Transformation
Matrix representation of DFT Definition of DFT:
X (k ) = x(n) = 1 N
N 1
n =0 N 1 k =0
x(n)WNkn , X (k )WNkn ,
k = 0,1, K, N 1 n = 0,1, K, N 1
where
X ( 0) x (0) X (1) Let x (1) , , XN = xN = M M X ( N 1) x ( N 1)
and
1 1 1 W N 2 WN = 1 W N M M ( N 1) 1 W N 1 W W W
2 N 4 N
2 ( N 1) N
2 ( N 1) L WN L M ( N 1)( N 1) L WN L L 1
N 1 WN
Thus,
X N = WN x N xN = W XN 1 * = WN XN N
1 N
N - point DFT N - point IDFT
k Because the matrix (transformation) WN has a specific structure and because WN has par-
ticular values (for some k and n), we can reduce the number of arithmetic operations for computing this transform.
NCTU EE
DSP (2007)
Computation of DFT
Example
x[n ] = [0 1 2 3]
W40 W40 0 W W41 W4 = 40 W4 W42 0 3 W4 W4
1 1 W40 W40 1 1 1 2 3 2 3 W4 W4 1 W4 W4 W4 = 2 0 2 W44 W46 1 W4 W4 W4 3 2 1 W46 W49 1 W4 W4 W4 1 1 1 1 1 j 1 j = 1 1 1 1 1 j 1 j
Only additions are needed to compute this specific transform. (This is a well-known radix-4 FFT) Thus, the DFT of x[n ] is
6 2 + 2 j X 4 = W4 x 4 = 2 2 2 j
Fast Fourier Transform
-- Highly efficient algorithms for computing DFT General principle: Divide-and-conquer
k Specific properties of WN
kn kn * Complex conjugate symmetry: WN = (WN )
k+N 2 = W k Symmetry: WN N
k+N k Periodicity: WN = WN
Particular values of k and n: e.g., radix-4 FFT (no multiplications) Direct computation of DFT
X [k ] = =
N 1 n =0 N 1 n =0
x[n] WNkn ,
[Re(x[n]) Re(W ) Im(x[n]) Im(W )] + j[Re( x[n]) Im(W ) + Im( x[n])Re(W )]
kn N kn N kn N kn N
k = 0,1, K , N 1
For each k, we need N complex multiplications and N-1 complex additions. 4N real multiplications and 4N-2 real additions. We will show how to use the properties of WN to reduce computations. Radix-2 algorithms: Decimation-in-time; Decimation-in-frequency Composite N algorithms: Cooley-Tukey; Prime factor Winograd algorithm Chirp transform algorithm
k
NCTU EE
DSP (2007)
Computation of DFT
Radix-2 Decimation-in-time Algorithms
-- Assume N-point DFT and N = 2 Idea: N-point DFT N
-point DFT N
-point DFT
N -point DFT 4
N -point DFT N -point DFT 2 4
N -point DFT 4
Sequence: x[0] x[1] x[2] x[3] L Even index: x[0] x[2] Odd index: x[1] x[3]
X [k ] =
N 1
x[n ] L x[ N 1] 2
L x[ N 1]
L x[ N 2 ]
x[n]WNkn , k = 0,1,K, N 1 n =0 kn kn = x[n ]WN + x[n ]WN
n even N 1 2 r =0
14243
n =2 r
n odd
14243
n =2 r +1 N 1 2 r =0
x[2r ]WN2 rk + x[2r + 1]WN( 2 r+1) k
2 N 2 2 j N
QW = e
N 1 2
=e
2 j N /2
= WN / 2
rk k rk X [k ]= x[2r ]WN / 2 + WN x[ 2 r + 1]WN / 2 r= =04 0 4 1r 4 244 3 14 4 2444 3 N -point DFT 2 N -point DFT 2
N 1 2
k = G[ k ] + W N H [k ]
NCTU EE
DSP (2007)
Computation of DFT
Comparison: (a) Direct computation of N-point DFT (N frequency samples): ~ N 2 complex multiplications and N 2 complex adds (b) Direct computation of N -point DFT: 2 ~ N complex multiplications and N complex adds
2 2
2 2
+ additional
complex multis and
2 2
complex adds
~ (Total:) N + 2 N = N + N complex multis and adds 2 2 (c) log 2 N -stage FFT Since N = 2 , we can further break N -point DFT into two N -point DFT and
so on.
At each stage: ~
complex multis and adds
Total: ~ N log 2 N complex multis and adds (--> N log N ) 2 2
NCTU EE
DSP (2007)
Computation of DFT
Number of points, N 4 8 16 64 256 1024
Direct Computation: Complex Multis 16 64 256 4,096 65,536 1,048,576
FFT: Complex Multis 4 12 32 192 1,024 5,120
Speed Improvement Factor 4.0 5.3 8 21.3 64.0 204.8
Butterfly: Basic unit in FFT Two multiplications:
One multiplication:
NCTU EE
DSP (2007)
Computation of DFT
In-place computations Only two registers are needed for computing a butterfly unit.
r X m [ p ] = X m 1 [ p ] + W N X m 1 [ q] r X m [ q] = X m 1 [ p ] W N X m 1 [ q ]
Advantage: less storage! In order to retain the in-place computation property, the input data are accessed in the bit-reversed order. Note: The outputs are in the normal order (same as the position) Binary equivalent Bit reversed 6 110 011 2 010 010 Remark: Index 3 input data is placed at position 6. Position Sequence index 3 2
We may also place the inputs in the normal order; then the outputs are in the bit-reversed order.
NCTU EE
DSP (2007)
Computation of DFT
If we try to maintain the normal order of both inputs and outputs, then in-place computation structure is destroyed.
NCTU EE
DSP (2007)
Computation of DFT
Radix-2 Decimation-in-frequency Algorithms
Dividing the output sequence X [k ] into smaller pieces.
X (k ) =
N 1 n =0
x( n )W Nkn ,
k = 0,1,K , N 1
If k is even,
X [2 r ] = =
N 1 n =0 N 1 2 n =0
k = 2r .
r = 0,1, L ,
N 1
x[n ]WNkn ,
N 1 2 n (n + N ) 2
N
2 nr x[n ]WN x[n ]WN2nr + N n= 2 N 1 2
2 r n+ N = x[n ]W + x n + WN 2 2 n =0 n =0 2 r[ n+ N ] 2 rn 2 rn rN 2 Q WN = WN WN = WN 2 nr N
N 1 2
= =
N 1 2
n =0 N 1 2 n =0
n+ x[n ] + x n+ x[n ] + x
N 2 nr WN 2 N nr WN 2 2
Similarly, if k is odd,
X [2 r + 1] =
N 1 2 n =0
k = 2r + 1 .
N n nr WN WN 2 2
N nr WN 2 2 N n nr WN WN 2 2
x[n] x n +
N 1 2 n =0 N 1 2 n =0
X [2r ] = X [2r + 1] =
n+ x[n] + x n+ x[n] x
Let g[n ] = x[n ] + x n + 2
N h[n ] = x[n ] x n + 2
NCTU EE
DSP (2007)
Computation of DFT
We can further break X [2r ] into even and odd groups Again, we can reduce the two-multiplication butterfly into one multiplication. Hence, the computational complexity is bout N log N . The in-place computation property holds if the 2
2
outputs are in bit-reversed order (when inputs are in the normal order).
NCTU EE
DSP (2007)
Computation of DFT
FFT for Composite N
-- Cooley-Tukey Algorithm: N = N 1 N 2
Time index : n = N 2 n1 + n2 Freq. index : k = k1 + N 1k 2 0 n1 N 1 1 0 n2 N 2 1 0 k1 N 1 1 0 k 2 N 2 1
Remark: n ( n1 , n2 ) and k ( k1 , k 2 )
Goal: Decompose N-point DFT into two stages:
N 1 -point DFT N 2 -point DFT
X [k ] =
N 1 n =0
x[n]WNkn ,
0 k N 1
= X [k1 + N 1k 2 ] = =
N 2 1 N1 1 n2 =0 n1 =0 N 2 1 N1 1 n2 =0 n1 =0
x[ N 2 n1 + n2 ] WN(k + N k )( N n +n )
1 1 2 2 1 2
N kn W k n x[ N 2 n1 + n2 ] W N 1 23 N
2 1 1 k1n1 WN 1
1 2
N1N 2k2n1 k1N1n2 WN WN 123 1 4 24 3
k 2 n2 WN 2
k1n1 k1n2 k2n2 W = x[ N 2 n1 + n2 ] WN N3 WN 2 1 1 2 n2 =0 n1 =0 twiddle 1444 4 24444 3 factor N point 44444 44 1 14 44 2444444 3
N 2 1 N1 1 N 2 -point
Procedure (1) Compute
G[n2 , k1 ] =
N 1 -point DFT: (row transform)
N1 1 n1 =0
x[ N 2 n1 + n2 ] WNk n
1 1 1
(2) Multiply twiddle factors:
~ k1n2 G[n2 , k1 ] = WN G[n2 , k1 ]
(3) Compute
N 2 -point DFT: (column transform)
N 2 1 n2 = 0
X [k1 + N1k 2 ] =
G[n2 , k1 ] WNk n
2 2 2
NCTU EE
10
DSP (2007)
Computation of DFT
(Computation of N=15-point DFT by means of 3-point and 5-point DFTs.)
NCTU EE
11
DSP (2007)
Computation of DFT
Extension:
N = N 1 N 2 L N
Let ( N ) number of multiplications for N - point DFT If N = N 1 N 2
N 2 ( N1 ) 1. row transform : N1 N 2 = N 2. twiddle factors : 3. column transfrm : N ( N ) 1 2
( N ) = N 2 ( N1 ) + N1 ( N 2 ) + N ( N1 ) ( N 2 )
= N N 1 + N2
+ 1
In general,
(N ) = N
i =1
(Ni )
Ni
+ ( 1)
In fact, the term ( 1) should be ( 1)
because rearranging the butterfly structure
would make half of the branches becoming 1. Special Case: N 1 = N 2 = L = N = 2 Radix-2: N 1 = N 2 = L = N = 2 and = log 2 N
( N ) = N ( 1) 2 multiplications because ( 2) requires no multiplications.
Radix-4: N1 = N 2 = L = N = 4 and = log 4 N
( N ) = N ( 1) 2 multiplications because ( 4 ) requires no multiplications. This FFT
has fewer stages than Radix-2 ==> fewer multiplications.
W40 W40 0 W W41 W4 = 40 W4 W42 0 3 W4 W4 1 1 W40 W40 1 1 1 2 3 2 3 W4 W4 1 W4 W4 W4 = 2 0 2 = W44 W46 1 W4 W4 W4 3 2 1 W46 W49 1 W4 W4 W4 1 1 1 1 1 j 1 j 1 1 1 1 1 j 1 j
NCTU EE
12
DSP (2007)
Computation of DFT
Inverse FFT
IDFT: DFT:
x[n ] = 1 N
N 1 k =0
X [k ] WNkn
N 1 n =0
(*)
X [k ] =
x[n] W Nnk
*
Hence, take the conjugate of (*) :
x [n ] =
*
= = =
1 N 1 N 1 N 1 N
N 1 kn X [k ] W N k =0
N 1 k =0 N 1 k =0
(X [k ] W Nkn )
(X * [k ] W Nkn )
DFT X * ( k )
Take the conjugate of the above equation:
x[n ] = 1 DFT X * ( k ) N 1 = FFT X * ( k ) N
( (
[ [
]) ])
Thus, we can use the FFT algorithm to compute the inverse DFT.
NCTU EE
13