ROHINI COLLEGE OF ENGINEERING & TECHNOLOGY
4.2 CODING TECHNIQUES
Shannon-Fano Coding Techniques
In the field of data compression, Shannon-Fano coding is a suboptimal
technique for constructing a prefix code based on a set of symbols and their
probabilities (estimated or measured).
In Shannon-Fano coding, the symbols are arranged in order from most probable
to least probable, and then divided into two sets whose total probabilities are as
close as possible to being equal. All symbols then have the first digits of their
codes assigned; symbols in the first set receive "0" and symbols in the second set
receive "1". As long as any sets with more than one member remain, the same
process is repeated on those sets to determine successive digits of their codes.
When a set has been reduced to one symbol, that symbol's code is complete and
will not form the prefix of any other symbol's code.
The algorithm works, and it produces fairly efficient variable-length encodings;
when the two smaller sets produced by a partitioning have equal probability, the
one bit of information used to distinguish them is used most efficiently.
Unfortunately, Shannon-Fano does not always produce optimal prefix codes; for
example, the set of probabilities {0.35, 0.17, 0.17, 0.16, 0.15} is assigned a code
with an average length of 2.31 bits/symbol, whereas a Huffman code for the same
probabilities achieves 2.30 bits/symbol.
Shannon-Fano Algorithm
A Shannon-Fano tree is built according to a specification designed to define an
effective code table. The actual algorithm is simple:
1. For a given list of symbols, develop a corresponding list of probabilities or
frequency counts so that each symbol's relative frequency of occurrence is known.
2. Sort the list of symbols according to frequency, with the most frequently
occurring symbols at the left and the least common at the right.
3. Divide the list into two parts, with the total frequency count of the left half
being as close to the total of the right half as possible.
4. The left half of the list is assigned the binary digit 0, and the right half is
assigned the digit 1. This means that the codes for the symbols in the first half
will all start with 0, and the codes in the second half will all start with 1.
5. Recursively apply steps 3 and 4 to each of the two halves, subdividing
groups and adding bits to the codes until each symbol has become a
corresponding code leaf on the tree.
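
The recursive partitioning in steps 1-5 can be written out directly. Below is a
minimal Python sketch (the function name shannon_fano and the five-symbol
alphabet are this example's own choices, not part of any standard library); it
reuses the probability set {0.35, 0.17, 0.17, 0.16, 0.15} mentioned earlier.

# Minimal sketch of the Shannon-Fano procedure described in steps 1-5.
# The symbols and probabilities are illustrative only.
def shannon_fano(symbols, codes=None):
    """symbols: list of (symbol, probability) pairs, sorted by descending probability."""
    if codes is None:
        codes = {sym: "" for sym, _ in symbols}
    if len(symbols) <= 1:              # a single symbol: its code is complete
        return codes
    total = sum(p for _, p in symbols)
    # Step 3: find the split that makes the two halves' totals as equal as possible.
    running, best_split, best_diff = 0.0, 1, float("inf")
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(running - (total - running))
        if diff < best_diff:
            best_diff, best_split = diff, i
    left, right = symbols[:best_split], symbols[best_split:]
    # Step 4: the first half gets a 0, the second half gets a 1.
    for sym, _ in left:
        codes[sym] += "0"
    for sym, _ in right:
        codes[sym] += "1"
    # Step 5: recurse on each half.
    shannon_fano(left, codes)
    shannon_fano(right, codes)
    return codes

probs = [("A", 0.35), ("B", 0.17), ("C", 0.17), ("D", 0.16), ("E", 0.15)]
codes = shannon_fano(probs)
avg_len = sum(p * len(codes[s]) for s, p in probs)
print(codes)     # {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
print(avg_len)   # approx. 2.31 bits/symbol for this probability set

For this probability set the sketch assigns codeword lengths 2, 2, 2, 3 and 3,
giving the average of 2.31 bits/symbol quoted earlier, which a Huffman code
improves to 2.30 bits/symbol.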
Huffman Coding Techniques
Huffman coding is an entropy encoding algorithm used for lossless
data compression. The term refers to the use of a variable-length code table for
encoding a source symbol (such as a character in a file), where the code table is
derived from the estimated probability of occurrence of each possible symbol.
Huffman coding uses a specific method for choosing the representation for each
symbol, resulting in a prefix code (sometimes called a "prefix-free code"): the bit
string representing one symbol is never a prefix of the bit string representing any
other symbol. The code expresses the most common source symbols using
shorter strings of bits than are used for less common source symbols. Huffman
was able to design the most efficient compression method of this type: no other
mapping of individual source symbols to unique strings of bits will produce a
smaller average output size when the actual symbol frequencies agree with those
used to create the code.
Although Huffman coding is optimal for symbol-by-symbol coding (i.e. a
stream of unrelated symbols) with a known input probability distribution, its
optimality is easily over-stated: methods that are not restricted to coding one
symbol at a time, such as arithmetic coding and LZW coding, often achieve
better compression.
Given
A set of symbols and their weights (usually proportional to probabilities).
Find
A prefix-free binary code (a set of codewords) with minimum expected
codeword length (equivalently, a tree with minimum weighted path length).
Input.
Alphabet A = {a1, a2, ..., an}, which is the symbol alphabet of size n.
Set W = {w1, w2, ..., wn}, which is the set of the (positive) symbol weights
(usually proportional to probabilities), i.e. wi = weight(ai), 1 ≤ i ≤ n.
Output.
Code C = {c1, c2, ..., cn}, which is the set of (binary) codewords, where ci is the
codeword for ai, 1 ≤ i ≤ n.
Goal.
Let L(C) = Σ wi · length(ci), summed over i = 1, ..., n, be the weighted path
length of code C. Condition: L(C) ≤ L(T) for any code T(W).
For any code that is biunique, meaning that the code is uniquely decodable, the
sum of the probability budgets across all symbols is always less than or equal to
one. When this sum is strictly equal to one, the code is termed a complete code.
If this is not the case, one can always derive an equivalent code by adding extra
symbols (with associated null probabilities) to make the code complete while
keeping it biunique. In general, a Huffman code need not be unique, but it is
always one of the codes minimizing L(C).
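
As an illustration of the problem statement above, here is a minimal Python
sketch (using only the standard heapq module; the function name huffman_code
and the five-symbol weight set are this example's own choices) that builds one
Huffman code and evaluates the weighted path length L(C).

import heapq

# Minimal sketch: build a Huffman code for a weighted alphabet.
def huffman_code(weights):
    """weights: dict mapping symbol -> positive weight. Returns dict symbol -> codeword."""
    # Each heap entry is (subtree weight, tie-breaker, [(symbol, partial code), ...]).
    heap = [(w, i, [(sym, "")]) for i, (sym, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    count = len(heap)
    if count == 1:                                 # degenerate one-symbol alphabet
        return {next(iter(weights)): "0"}
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)        # two least-weight subtrees
        w2, _, group2 = heapq.heappop(heap)
        # Prepend 0 to the codewords of one subtree and 1 to the other, then merge.
        merged = [(s, "0" + c) for s, c in group1] + [(s, "1" + c) for s, c in group2]
        count += 1
        heapq.heappush(heap, (w1 + w2, count, merged))
    return dict(heap[0][2])

weights = {"A": 0.35, "B": 0.17, "C": 0.17, "D": 0.16, "E": 0.15}
code = huffman_code(weights)
L = sum(weights[s] * len(c) for s, c in code.items())   # weighted path length L(C)
print(code)   # one optimal code: "A" gets 1 bit, the other symbols get 3 bits each
print(L)      # approx. 2.30 bits/symbol

For these weights the codeword lengths are 1, 3, 3, 3 and 3, so L(C) = 2.30
bits/symbol, compared with 2.31 bits/symbol for the Shannon-Fano code of the
same source.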
MUTUAL INFORMATION
On an average we require H(X) bits of information to specify one input symbol.
However, if we are allowed to observe the output symbol produced by that input,
we then require only H(X | Y) bits of information to specify the input symbol.
Accordingly, we come to the conclusion that, on an average, observation of a
single output provides us with [H(X) - H(X | Y)] bits of information about the
input.
Notice that, in spite of the variations in the source probabilities p(xk) (which may
be due to noise in the channel), certain probabilistic information regarding the
state of the input is available once the conditional probability p(xk | yj) is
computed at the receiver end. The difference between the initial uncertainty
about the source symbol xk, i.e. log 1/p(xk), and the final uncertainty about the
same source symbol xk after receiving yj, i.e. log 1/p(xk | yj), is the information
gained through the channel. This difference is what we call the mutual
information between the symbols xk and yj. Thus

I(xk, yj) = log [1/p(xk)] - log [1/p(xk | yj)] = log [p(xk | yj) / p(xk)]

In particular, for a noiseless channel p(xk | yj) = 1 and I(xk, yj) = log 1/p(xk)
= I(xk). This is the definition with which we started our discussion on
information theory; accordingly, I(xk) is also referred to as 'Self Information'.
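
As a quick illustration of this definition (the numbers below are hypothetical and
chosen only for simplicity): suppose the a priori probability of a source symbol is
p(xk) = 1/4, and after observing the output yj the a posteriori probability becomes
p(xk | yj) = 1/2. Then

I(xk, yj) = log2 4 - log2 2 = 2 - 1 = 1 bit,

i.e. observing yj has removed one of the two bits of initial uncertainty about xk.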
Eq. (4.22) simply means that the mutual information is symmetrical with respect
to its arguments, i.e.
I(xk, yj) = I(yj, xk)
Averaging Eq. (4.21b) over all admissible symbols xk and yj, we obtain the
average information gain of the receiver:
I(X, Y) = E{I(xk, yj)} = Σk Σj p(xk, yj) I(xk, yj)
Or, I(X, Y) = H(X) + H(Y) - H(X, Y)
Further, we conclude that even though H(X) - H(X | yj) may be negative for a
particular received symbol yj, when all the admissible output symbols are
covered, the average mutual information is always non-negative. That is to say,
we cannot lose information on an average by observing the output of a channel.
An easy method of remembering the various relationships is given in Fig 4.2.
Although the diagram resembles a Venn diagram, it is not one; it is only a tool
for remembering the relationships and cannot be used to prove any result.
The entropy of X is represented by the circle on the left and that of Y by the circle
on the right. The overlap between the two circles (dark gray) is the mutual
information so that the remaining (light gray) portions
of H(X) and H(Y) represent respective equivocations. Thus we have
H(X | Y) = H(X) - I(X, Y) and H(Y | X) = H(Y) - I(X, Y)
The joint entropy H(X, Y) is the sum of H(X) and H(Y), except that the overlap
is counted twice, so that
H(X, Y) = H(X) + H(Y) - I(X, Y)
Also observe H(X, Y) = H(X) + H(Y | X)
= H(Y) + H(X | Y)
For the JPM of the worked example, I(X, Y) = 0.760751505 bits/symbol.
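
The averaging above can also be reproduced numerically from a joint probability
matrix. Below is a minimal Python sketch using a hypothetical 2 x 2 JPM (not the
matrix of the worked example whose result is quoted above); it computes H(X),
H(Y) and H(X, Y) from the marginal and joint probabilities and then forms
I(X, Y) = H(X) + H(Y) - H(X, Y).

from math import log2

def entropy(probs):
    """Entropy in bits of a list of probabilities (zero entries are skipped)."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Hypothetical joint probability matrix p(xk, yj); rows are inputs, columns are outputs.
jpm = [[0.4, 0.1],
       [0.1, 0.4]]

p_x = [sum(row) for row in jpm]                  # marginal p(xk)
p_y = [sum(col) for col in zip(*jpm)]            # marginal p(yj)
H_X = entropy(p_x)
H_Y = entropy(p_y)
H_XY = entropy([p for row in jpm for p in row])  # joint entropy H(X, Y)

I_XY = H_X + H_Y - H_XY
print(H_X, H_Y, H_XY, I_XY)
# For this JPM: H(X) = H(Y) = 1 bit, H(X, Y) is about 1.722 bits,
# so I(X, Y) is about 0.278 bits/symbol.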