Submitted by:
Usama Dastagir
14030027011
Hassan Humayoun 14030027043
University of Management & Technology
History
In 1951, David A. Huffman and his MIT information theory classmates were given the choice of
a term paper or a final exam. The professor, Robert M. Fano, assigned a term paper on the
problem of finding the most efficient binary code. Huffman, unable to prove any codes were the
most efficient, was about to give up and start studying for the final when he hit upon the idea of
using a frequency-sorted binary tree and quickly proved this method the most efficient.
Introduction
A Huffman Tree is a special type of binary tree used for data compression. A small Huffman
Tree appears below:
[Figure: a small Huffman tree whose leaves hold the characters e, b, c and d, with 0 labelling each left branch and 1 labelling each right branch]
Each leaf in the tree contains a symbol (in this case a Character) and an integer value. Each
non-leaf (internal node) contains only references to its left and right children. The 0's and 1's that
you see are not stored anywhere; we simply associate 0 with every left branch and 1 with every
right branch.
Each character that appears in the tree is assigned a unique code (a sequence of 0's and 1's)
obtained by following the path from the root of the tree to the leaf containing that character.
Below are the codes for the four characters found in this sample tree:
e is coded by "0"
b is coded by "100"
c is coded by "101"
d is coded by "11"
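To make the coding rule concrete, here is a minimal Python sketch (our own illustration, not part of any standard library): internal nodes are represented as (left, right) pairs, leaves as plain characters, and the tree shape is reconstructed from the codes listed above.

    # The sample tree: the root's left child is the leaf e; the right subtree
    # holds b, c and d so that b = "100", c = "101" and d = "11".
    sample_tree = ('e', (('b', 'c'), 'd'))

    def assign_codes(node, prefix="", codes=None):
        # Walk the tree, appending 0 for a left branch and 1 for a right branch.
        if codes is None:
            codes = {}
        if isinstance(node, str):      # a leaf: record the accumulated path
            codes[node] = prefix
        else:                          # an internal node: recurse into children
            left, right = node
            assign_codes(left, prefix + "0", codes)
            assign_codes(right, prefix + "1", codes)
        return codes

    print(assign_codes(sample_tree))
    # {'e': '0', 'b': '100', 'c': '101', 'd': '11'}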
Basic Technique
Encoding
The technique works by creating a binary tree of nodes. These can be stored in a regular array,
the size of which depends on the number of symbols, n. A node can be either a leaf node or
an internal node. Initially, all nodes are leaf nodes, which contain the symbol itself,
the weight (frequency of appearance) of the symbol and, optionally, a link to a parent node, which
makes it easy to read the code (in reverse) starting from a leaf node. Internal nodes contain
a weight, links to two child nodes and the optional link to a parent node. As a common
convention, bit '0' represents following the left child and bit '1' represents following the right
child. A finished tree has up to n leaf nodes and n - 1 internal nodes. A Huffman tree that
omits unused symbols produces the optimal code lengths.
The process essentially begins with the leaf nodes containing the probabilities of the symbols they
represent. Then a new node is created whose children are the two nodes with the smallest
probabilities, such that the new node's probability is equal to the sum of its children's
probabilities. With the previous two nodes merged into one (and thus no longer considered), and
with the new node now under consideration, the procedure is repeated until only one node
remains: the Huffman tree.
The simplest construction algorithm uses a priority queue where the node with lowest probability
is given highest priority:
1. Create a leaf node for each symbol and add it to the priority queue.
2. While there is more than one node in the queue:
1. Remove the two nodes of highest priority (lowest probability) from the queue
2. Create a new internal node with these two nodes as children and with probability
equal to the sum of the two nodes' probabilities.
3. Add the new node to the queue.
3. The remaining node is the root node and the tree is complete.
Since efficient priority queue data structures require O(log n) time per insertion, and a tree
with n leaves has 2n - 1 nodes, this algorithm operates in O(n log n) time, where n is the number
of symbols.
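The steps above map almost directly onto Python's heapq module. The following sketch is illustrative only (the function name and node representation are our own):

    import heapq
    from itertools import count

    def build_huffman_tree(frequencies):
        # frequencies is a {symbol: weight} dict. Queue entries are
        # (weight, tiebreak, node); the counter breaks ties so that entries
        # with equal weights never compare their node payloads.
        tiebreak = count()
        # Step 1: one leaf per symbol; a leaf is just the symbol itself.
        heap = [(w, next(tiebreak), sym) for sym, w in frequencies.items()]
        heapq.heapify(heap)
        # Step 2: repeatedly merge the two lowest-weight nodes.
        while len(heap) > 1:
            w1, _, n1 = heapq.heappop(heap)
            w2, _, n2 = heapq.heappop(heap)
            heapq.heappush(heap, (w1 + w2, next(tiebreak), (n1, n2)))
        # Step 3: the surviving node is the root of the Huffman tree.
        return heap[0][2]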
If the symbols are sorted by probability, there is a linear-time (O(n)) method to create a Huffman
tree using two queues, the first one containing the initial weights (along with pointers to the
associated leaves), and combined weights (along with pointers to the trees) being put in the back
of the second queue. This assures that the lowest weight is always kept at the front of one of the
two queues:
1. Start with as many leaves as there are symbols.
2. Enqueue all leaf nodes into the first queue (by probability in increasing order, so that the
least likely item is at the head of the queue).
3. While there is more than one node in the queues:
1. Dequeue the two nodes with the lowest weight by examining the fronts of both
queues.
2. Create a new internal node, with the two just-removed nodes as children (either
node can be either child) and the sum of their weights as the new weight.
3. Enqueue the new node into the rear of the second queue.
4. The remaining node is the root node; the tree has now been generated.
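Written out in the same style (again an illustrative sketch with our own naming), the two-queue procedure looks as follows; it assumes the input is already sorted by increasing weight:

    from collections import deque

    def build_huffman_tree_sorted(sorted_freqs):
        # sorted_freqs is a list of (weight, symbol) pairs, smallest first.
        leaves = deque(sorted_freqs)   # queue 1: the initial leaves
        merged = deque()               # queue 2: combined (internal) nodes

        def pop_smallest():
            # The lowest remaining weight sits at the front of one of the queues.
            if not merged or (leaves and leaves[0][0] <= merged[0][0]):
                return leaves.popleft()
            return merged.popleft()

        # n leaves require exactly n - 1 merges.
        total = len(leaves)
        for _ in range(total - 1):
            w1, n1 = pop_smallest()
            w2, n2 = pop_smallest()
            merged.append((w1 + w2, (n1, n2)))
        return merged[0][1] if merged else leaves[0][1]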
Example
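As a small worked illustration (the weights below are chosen so that the result matches the sample tree from the introduction; they do not come from any real data set), suppose b and c each occur 2 times, d occurs 3 times and e occurs 8 times.
1. The two smallest weights are b (2) and c (2); merge them into a node of weight 4.
2. The two smallest remaining weights are d (3) and the new node (4); merge them into a node of weight 7.
3. Finally, merge e (8) with that node (7) into the root, of weight 15.
Reading off the root-to-leaf paths gives e = "0", d = "11", b = "100" and c = "101" (up to the arbitrary left/right choice at each merge), exactly the codes listed earlier. The same tree results from calling build_huffman_tree({'b': 2, 'c': 2, 'd': 3, 'e': 8}) from the sketch above.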
Decoding
Generally speaking, the process of decompression is simply a matter of translating the stream of
prefix codes to individual byte values, usually by traversing the Huffman tree node by node as
each bit is read from the input stream (reaching a leaf node necessarily terminates the search for
that particular byte value). Before this can take place, however, the Huffman tree must be
somehow reconstructed. In the simplest case, where character frequencies are fairly predictable,
the tree can be preconstructed (and even statistically adjusted on each compression cycle) and
thus reused every time, at the expense of at least some measure of compression efficiency.
Otherwise, the information to reconstruct the tree must be sent a priori. A naive approach might
be to prepend the frequency count of each character to the compression stream. Unfortunately,
the overhead in such a case could amount to several kilobytes, so this method has little practical
use. If the data is compressed using canonical encoding, the compression model can be precisely
reconstructed with just B·2^B bits of information (where B is the number of bits per symbol).
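To illustrate what canonical encoding buys, the sketch below (our own naming, not a standard API) reassigns codes from code lengths alone, so only each symbol's code length needs to be transmitted:

    def canonical_codes(code_lengths):
        # code_lengths is {symbol: length}. Symbols are sorted by (length,
        # symbol); each code is the previous code plus one, left-shifted
        # whenever the code length increases.
        symbols = sorted(code_lengths, key=lambda s: (code_lengths[s], s))
        codes, code, prev_len = {}, 0, code_lengths[symbols[0]]
        for sym in symbols:
            code <<= code_lengths[sym] - prev_len   # pad when lengths grow
            codes[sym] = format(code, '0%db' % code_lengths[sym])
            prev_len = code_lengths[sym]
            code += 1
        return codes

    # The lengths from the sample tree (e:1, d:2, b:3, c:3) yield the
    # equivalent canonical codes e = "0", d = "10", b = "110", c = "111".
    print(canonical_codes({'e': 1, 'd': 2, 'b': 3, 'c': 3}))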
Another method is to simply prepend the Huffman tree, bit by bit, to the output stream. For
example, assuming that the value of 0 represents a parent node and 1 a leaf node, whenever the
latter is encountered the tree building routine simply reads the next 8 bits to determine the
character value of that particular leaf. The process continues recursively until the last leaf node is
reached; at that point, the Huffman tree will thus be faithfully reconstructed. The overhead using
such a method ranges from roughly 2 to 320 bytes (assuming an 8-bit alphabet). Many other
techniques are possible as well. In any case, since the compressed data can include unused
"trailing bits", the decompressor must be able to determine when to stop producing output. This
can be accomplished by either transmitting the length of the decompressed data along with the
compression model or by defining a special code symbol to signify the end of input (the latter
method can adversely affect code length optimality, however).
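Putting the two ideas together, here is an illustrative decoder sketch (our own naming, assuming the 0-for-internal / 1-for-leaf convention and an up-front symbol count as described above):

    def read_tree(bits):
        # Rebuild the prepended tree; bits is an iterator of '0'/'1' chars.
        # A '1' introduces a leaf followed by 8 bits of character value,
        # a '0' introduces an internal node with two recursive subtrees.
        if next(bits) == '1':
            return chr(int(''.join(next(bits) for _ in range(8)), 2))
        left = read_tree(bits)
        right = read_tree(bits)
        return (left, right)

    def decode(bitstring, n_symbols):
        # n_symbols is transmitted alongside the model so that the unused
        # trailing bits mentioned above are ignored.
        bits = iter(bitstring)
        tree = read_tree(bits)
        out = []
        for _ in range(n_symbols):
            node = tree
            while not isinstance(node, str):   # descend until a leaf is hit
                node = node[0] if next(bits) == '0' else node[1]
            out.append(node)
        return ''.join(out)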
Time Complexity
The time complexity of the Huffman algorithm is O(n log n). Using a heap to store the weight of
each tree, each iteration requires O(log n) time to determine the cheapest weight and insert the
new weight. There are O(n) iterations, one for each item.
Applications
Arithmetic coding can be viewed as a generalization of Huffman coding, in the sense that the two
produce the same output when every symbol has a probability of the form 1/2^k; in particular,
arithmetic coding tends to offer significantly better compression for small alphabet sizes.
Huffman coding nevertheless remains in wide use because of its simplicity, high speed, and lack
of patent coverage. Intuitively, arithmetic coding can offer better compression than Huffman
coding because its "code words" can have effectively non-integer bit lengths, whereas code words
in Huffman coding can only have an integer number of bits. Therefore, there is an inefficiency in
Huffman coding where a code word of length k only optimally matches a symbol of probability
1/2^k and other probabilities are not represented as optimally, whereas the code word length in
arithmetic coding can be made to exactly match the true probability of the symbol.
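For example, a symbol with probability 1/3 ideally needs log2(3) ≈ 1.585 bits, but a Huffman code must assign it a whole number of bits (1 or 2), wasting a fraction of a bit either way.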
Huffman coding today is often used as a "back-end" to some other compression
method. DEFLATE (PKZIP's algorithm) and multimedia codecs such as JPEG and MP3 have a
front-end model and quantization followed by Huffman coding (or variable-length prefix-free
codes with a similar structure, although not necessarily designed using Huffman's algorithm).
References
https://en.wikipedia.org/wiki/Huffman_coding#Formalized_description
https://www.cs.auckland.ac.nz/software/AlgAnim/huffman.html
https://www.siggraph.org/education/materials/HyperGraph/video/mpeg/mpegfaq/huffman_tutorial.html
http://www.cs.umd.edu/class/fall2012/cmsc132h/Projects/P7/project7.html
http://www.csee.umbc.edu/courses/undergraduate/341/fall11/projects/project3/HuffmanExplanation.shtml