Proof of Optimality of Huffman Codes
CSC373 Spring 2009
1 Problem
P are given an alphabet A and a frequency function f : A → (0, 1) such that
You
x f (x) = 1. Find a binary tree T with |A| leaves (each leaf corresponding to
a unique symbol) that minimizes
X
ABL(T ) = f (x)depth(x)
leaves of T
Such a tree is called optimal.
2 Algorithm
HUF(A, f )
If |A| = 1 then return a single vertex.
Let w and y be the symbols with the lowest frequencies.
Let A0 = A \ {w, y} + {z}.
Let f 0 (x) = f (x) for all x ∈ A0 \ {z}, and let f 0 (z) = f (w) + f (y).
T 0 = HUF(A0 , f 0 ).
Create T from T 0 by adding w and y as children of z.
return T
3 Proof
Lemma 1 Let T be a tree for some f and A, and let y and w be two leaves. Let
T 0 be the tree obtained from T by swapping y and w. Then ABL(T 0 )−ABL(T ) =
(f (y) − f (w))(depth(w, T ) − depth(y, T )).
Proof
ABL(T 0 ) − ABL(T ) = f (y)depth(w, T ) + f (w)depth(y, T ) − f (w)depth(w, T ) − f (y)depth(y, T )
= f (y)(depth(w, T ) − depth(y, T )) + f (w)(depth(y, T ) − depth(w, T ))
= (f (y) − f (w))(depth(w, T ) − depth(y, T ))
Lemma 2 There exists an optimal tree such that the two symbols with the lowest
frequencies are siblings.
Proof Let T be an optimal tree. Let w and y be two symbols with the lowest
frequencies. If there is more than one symbol that has the lowest frequency, then
1
take two that have the biggest depth. If w and y are siblings, then we are done.
Otherwise, suppose without loss of generality, that depth(w, T ) ≥ depth(y, T ).
We have three cases:
• w has a sibling z. Let T 0 be the tree created from T by swapping z and
y, and thus making w and y siblings. By applying Lemma 1, we get that
ABL(T 0 ) ≤ ABL(T ). Since T is optimal, there cannot be another tree
with a smaller cost, and so ABL(T 0 ) = ABL(T ). Thus T 0 is also optimal.
• w is an only child. Create T 0 by removing w’s leaf and assigning w to
its old parent. T 0 is cheaper than T , contradiction the optimality of T .
Hence, this case is not possible.
• There exists a node z at a depth bigger then w. Create T 0 by swapping
w and z. By our choice of w, f (w) < f (z), so, applying Lemma 1, we
have that T 0 is cheaper than T , a contradiction. Hence, this case is not
possible.
Theorem 3 The algorithm HUF(A, f ) computes an optimal tree for frequencies
f and alphabet A.
Proof The proof is by induction on the size of the alphabet. The induction
hypothesis is that for all A with |A| = n and for all frequencies f , HUF(A, f )
computes the optimal tree.
In the base case (n = 1), the tree is only one vertex and the cost is zero,
which is the smallest possible.
For the general case, assume that the induction hypothesis holds for n − 1.
That is, T 0 is optimal for A0 and f 0 . First, let us show the following:
X
ABL(T ) = ( f (x)depth(x, T )) + f (w)depth(w, T ) + f (y)depth(y, T )
x∈A\{w,y}
X
=( f (x)depth(x, T )) + (f (w) + f (y))(depth(z, T 0 ) + 1)
x∈A\{w,y}
X
=( f (x)depth(x, T )) + f 0 (z)depth(z, T 0 ) + f (w) + f (y)
x∈A\{w,y}
X
=( f 0 (x)depth(x, T 0 )) + f (w) + f (y)
x∈A0
= ABL(T 0 ) + f (w) + f (y)
Now, assume for the sake of contradiction that T is not optimal, and let Z be
an optimal tree that has w and y as siblings (this exists by the above lemma).
Let Z 0 be the tree obtained from Z by removing w and y. We can view Z 0 as
a tree for the alphabet A0 and frequency function f 0 . We can then repeat the
calculation above and get ABL(Z) = ABL(Z 0 ) + f (w) + f (y). So, ABL(T 0 ) =
ABL(T )−f (w)−f (y) > ABL(Z)−f (w)−f (y) = ABL(Z 0 ). Since T 0 is optimal
for A0 and f 0 , this is a contradiction.