ROHINI COLLEGE OF ENGINEERING & TECHNOLOGY
4.2 CODING TECHNIQUES
Shannon-Fano Coding Techniques
In the field of data compression, Shannon-Fano coding is a suboptimal
technique for constructing a prefix code based on a set of symbols and their
probabilities (estimated or measured).
In Shannon-Fano coding, the symbols are arranged in order from most probable
to least probable, and then divided into two sets whose total probabilities are as
close as possible to being equal. All symbols then have the first digits of their
codes assigned; symbols in the first set receive "0" and symbols in the second set
receive "1". As long as any sets with more than one member remain, the same
process is repeated on those sets to determine successive digits of their codes.
When a set has been reduced to one symbol, that symbol's code is complete and
will not form the prefix of any other symbol's code.
The algorithm works, and it produces fairly efficient variable-length encodings;
when the two smaller sets produced by a partitioning have equal probability, the
one bit of information used to distinguish them is used most efficiently.
Unfortunately, Shannon-Fano does not always produce optimal prefix codes; for
example, the set of probabilities {0.35, 0.17, 0.17, 0.16, 0.15} is assigned a code
with an average length of 2.31 bits/symbol, whereas a Huffman code for the same
probabilities achieves 2.30 bits/symbol.
Shannon-Fano Algorithm
A Shannon-Fano tree is built according to a specification designed to define an
effective code table. The actual algorithm is simple:
1. For a given list of symbols, develop a corresponding list of probabilities or
frequency counts so that each symbol's relative frequency of occurrence is known.
2. Sort the list of symbols according to frequency, with the most frequently
occurring symbols at the left and the least common at the right.
3. Divide the list into two parts, with the total frequency count of the left half
being as close to the total of the right half as possible.
4. The left half of the list is assigned the binary digit 0, and the right half is
assigned the digit 1. This means that the codes for the symbols in the first half
will all start with 0, and the codes in the second half will all start with 1.
5. Recursively apply steps 3 and 4 to each of the two halves, subdividing
groups and adding bits to the codes until each symbol has become a
corresponding code leaf on the tree.
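
The recursive partitioning in steps 1-5 can be written out directly. Below is a
minimal Python sketch (the function name shannon_fano and the five-symbol
alphabet are this example's own choices, not part of any standard library); it
reuses the probability set {0.35, 0.17, 0.17, 0.16, 0.15} mentioned earlier.

# Minimal sketch of the Shannon-Fano procedure described in steps 1-5.
# The symbols and probabilities are illustrative only.
def shannon_fano(symbols, codes=None):
    """symbols: list of (symbol, probability) pairs, sorted by descending probability."""
    if codes is None:
        codes = {sym: "" for sym, _ in symbols}
    if len(symbols) <= 1:              # a single symbol: its code is complete
        return codes
    total = sum(p for _, p in symbols)
    # Step 3: find the split that makes the two halves' totals as equal as possible.
    running, best_split, best_diff = 0.0, 1, float("inf")
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(running - (total - running))
        if diff < best_diff:
            best_diff, best_split = diff, i
    left, right = symbols[:best_split], symbols[best_split:]
    # Step 4: the first half gets a 0, the second half gets a 1.
    for sym, _ in left:
        codes[sym] += "0"
    for sym, _ in right:
        codes[sym] += "1"
    # Step 5: recurse on each half.
    shannon_fano(left, codes)
    shannon_fano(right, codes)
    return codes

probs = [("A", 0.35), ("B", 0.17), ("C", 0.17), ("D", 0.16), ("E", 0.15)]
codes = shannon_fano(probs)
avg_len = sum(p * len(codes[s]) for s, p in probs)
print(codes)     # {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}
print(avg_len)   # approx. 2.31 bits/symbol for this probability set

For this probability set the sketch assigns codeword lengths 2, 2, 2, 3 and 3,
giving the average of 2.31 bits/symbol quoted earlier, which a Huffman code
improves to 2.30 bits/symbol.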
Huffman Coding Techniques
Huffman coding is an entropy encoding algorithm used for lossless
data compression. The term refers to the use of a variable-length code table for
encoding a source symbol (such as a character in a file), where the code table is
derived from the estimated probability of occurrence of each possible symbol.
Huffman coding uses a specific method for choosing the representation for each
symbol, resulting in a prefix code (sometimes called a "prefix-free code"): the bit
string representing one symbol is never a prefix of the bit string representing any
other symbol. The code expresses the most common source symbols using
shorter strings of bits than are used for less common source symbols. Huffman
was able to design the most efficient compression method of this type: no other
mapping of individual source symbols to unique strings of bits will produce a
smaller average output size when the actual symbol frequencies agree with those
used to create the code.
Although Huffman coding is optimal for symbol-by-symbol coding (i.e. a
stream of unrelated symbols) with a known input probability distribution, its
optimality is easily over-stated: methods that are not restricted to coding one
symbol at a time, such as arithmetic coding and LZW coding, often achieve
better compression.
Given
A set of symbols and their weights (usually proportional to probabilities).
Find
A prefix-free binary code (a set of codewords) with minimum expected
codeword length (equivalently, a tree with minimum weighted path length).
Input.
Alphabet A = {a1, a2, ..., an}, which is the symbol alphabet of size n.
Set W = {w1, w2, ..., wn}, which is the set of the (positive) symbol weights
(usually proportional to probabilities), i.e. wi = weight(ai), 1 ≤ i ≤ n.
Output.
Code C = {c1, c2, ..., cn}, which is the set of (binary) codewords, where ci is the
codeword for ai, 1 ≤ i ≤ n.
Goal.
Let L(C) = Σ wi · length(ci), summed over i = 1, ..., n, be the weighted path
length of code C. Condition: L(C) ≤ L(T) for any code T(W).
For any code that is biunique, meaning that the code is uniquely decodable, the
sum of the probability budgets across all symbols is always less than or equal to
one. When this sum is strictly equal to one, the code is termed a complete code.
If this is not the case, one can always derive an equivalent code by adding extra
symbols (with associated null probabilities) to make the code complete while
keeping it biunique. In general, a Huffman code need not be unique, but it is
always one of the codes minimizing L(C).
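
As an illustration of the problem statement above, here is a minimal Python
sketch (using only the standard heapq module; the function name huffman_code
and the five-symbol weight set are this example's own choices) that builds one
Huffman code and evaluates the weighted path length L(C).

import heapq

# Minimal sketch: build a Huffman code for a weighted alphabet.
def huffman_code(weights):
    """weights: dict mapping symbol -> positive weight. Returns dict symbol -> codeword."""
    # Each heap entry is (subtree weight, tie-breaker, [(symbol, partial code), ...]).
    heap = [(w, i, [(sym, "")]) for i, (sym, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    count = len(heap)
    if count == 1:                                 # degenerate one-symbol alphabet
        return {next(iter(weights)): "0"}
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)        # two least-weight subtrees
        w2, _, group2 = heapq.heappop(heap)
        # Prepend 0 to the codewords of one subtree and 1 to the other, then merge.
        merged = [(s, "0" + c) for s, c in group1] + [(s, "1" + c) for s, c in group2]
        count += 1
        heapq.heappush(heap, (w1 + w2, count, merged))
    return dict(heap[0][2])

weights = {"A": 0.35, "B": 0.17, "C": 0.17, "D": 0.16, "E": 0.15}
code = huffman_code(weights)
L = sum(weights[s] * len(c) for s, c in code.items())   # weighted path length L(C)
print(code)   # one optimal code: "A" gets 1 bit, the other symbols get 3 bits each
print(L)      # approx. 2.30 bits/symbol

For these weights the codeword lengths are 1, 3, 3, 3 and 3, so L(C) = 2.30
bits/symbol, compared with 2.31 bits/symbol for the Shannon-Fano code of the
same source.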
MUTUAL INFORMATION
On an average we require H(X) bits of information to specify one input symbol.
However, if we are allowed to observe the output symbol produced by that input,
we then require only H(X | Y) bits of information to specify the input symbol.
Accordingly, we come to the conclusion that, on an average, observation of a
single output provides us with [H(X) - H(X | Y)] bits of information about the
input.
Notice that, in spite of the variations in the source probabilities p(xk) (which may
be due to noise in the channel), certain probabilistic information regarding the
state of the input is available once the conditional probability p(xk | yj) is
computed at the receiver end. The difference between the initial uncertainty
about the source symbol xk, i.e. log 1/p(xk), and the final uncertainty about the
same source symbol xk after receiving yj, i.e. log 1/p(xk | yj), is the information
gained through the channel. This difference is what we call the mutual
information between the symbols xk and yj. Thus

I(xk, yj) = log [1/p(xk)] - log [1/p(xk | yj)] = log [p(xk | yj) / p(xk)]

In particular, for a noiseless channel p(xk | yj) = 1 and I(xk, yj) = log 1/p(xk)
= I(xk). This is the definition with which we started our discussion on
information theory; accordingly, I(xk) is also referred to as 'Self Information'.
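
As a quick illustration of this definition (the numbers below are hypothetical and
chosen only for simplicity): suppose the a priori probability of a source symbol is
p(xk) = 1/4, and after observing the output yj the a posteriori probability becomes
p(xk | yj) = 1/2. Then

I(xk, yj) = log2 4 - log2 2 = 2 - 1 = 1 bit,

i.e. observing yj has removed one of the two bits of initial uncertainty about xk.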
Eq. (4.22) simply means that the mutual information is symmetrical with respect
to its arguments, i.e.
I(xk, yj) = I(yj, xk)
Averaging Eq. (4.21b) over all admissible symbols xk and yj, we obtain the
average information gain of the receiver:
I(X, Y) = E{I(xk, yj)} = Σk Σj p(xk, yj) I(xk, yj)
Or, I(X, Y) = H(X) + H(Y) - H(X, Y)
Further, we conclude that even though H(X) - H(X | yj) may be negative for a
particular received symbol yj, when all the admissible output symbols are
covered, the average mutual information is always non-negative. That is to say,
we cannot lose information on an average by observing the output of a channel.
An easy method of remembering the various relationships is given in Fig 4.2.
Although the diagram resembles a Venn diagram, it is not one; it is only a tool
for remembering the relationships and cannot be used to prove any result.
The entropy of X is represented by the circle on the left and that of Y by the circle
on the right. The overlap between the two circles (dark gray) is the mutual
information so that the remaining (light gray) portions
of H(X) and H(Y) represent respective equivocations. Thus we have
H(X | Y) = H(X) - I(X, Y) and H(Y | X) = H(Y) - I(X, Y)
The joint entropy H(X, Y) is the sum of H(X) and H(Y), except that the overlap
is counted twice, so that
H(X, Y) = H(X) + H(Y) - I(X, Y)
Also observe H(X, Y) = H(X) + H(Y | X)
= H(Y) + H(X | Y)
For the JPM of the worked example, I(X, Y) = 0.760751505 bits/symbol.
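
The averaging above can also be reproduced numerically from a joint probability
matrix. Below is a minimal Python sketch using a hypothetical 2 x 2 JPM (not the
matrix of the worked example whose result is quoted above); it computes H(X),
H(Y) and H(X, Y) from the marginal and joint probabilities and then forms
I(X, Y) = H(X) + H(Y) - H(X, Y).

from math import log2

def entropy(probs):
    """Entropy in bits of a list of probabilities (zero entries are skipped)."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Hypothetical joint probability matrix p(xk, yj); rows are inputs, columns are outputs.
jpm = [[0.4, 0.1],
       [0.1, 0.4]]

p_x = [sum(row) for row in jpm]                  # marginal p(xk)
p_y = [sum(col) for col in zip(*jpm)]            # marginal p(yj)
H_X = entropy(p_x)
H_Y = entropy(p_y)
H_XY = entropy([p for row in jpm for p in row])  # joint entropy H(X, Y)

I_XY = H_X + H_Y - H_XY
print(H_X, H_Y, H_XY, I_XY)
# For this JPM: H(X) = H(Y) = 1 bit, H(X, Y) is about 1.722 bits,
# so I(X, Y) is about 0.278 bits/symbol.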