0% found this document useful (0 votes)

15 views22 pages

Multimdia Word File

Text compression is the process of reducing data for storage or transmission, utilizing techniques like lossless and lossy compression, entropy encoding, and various coding algorithms such as Huffman and Lempel-Ziv. Lossless compression retains all original data, while lossy compression sacrifices some data for smaller file sizes. The document details multiple encoding methods, including static and dynamic Huffman coding, arithmetic coding, and LZW coding, emphasizing their principles and applications in efficiently transmitting information.

Uploaded by

Adila Kammadath

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

15 views22 pages

Multimdia Word File

Uploaded by

Adila Kammadath

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 22

UNIT -II

Text compression

It is the process of reducing the amount of data needed

for the storage or transmission of a given piece of
information.
A technique known as compression is first applied to
the source information prior to its transmission. This is
done either to reduce the volume of information to be
transmitted- text, fax and images or to reduce the
bandwidth required for its transmission-speech, audio
and video.

Compression principle
 source encoders and destination decoders.
 lossless and lossy compression.
 Entropy encoding
 Source encoding
The basic principles of text compression are set out to
achieve a reduction in file size by encoding data more
efficiently.
source encoders and destination decoders
Prior to transmitting the source information relating to
a multimedia application , a compression algorithm is
applied to it.
This implies that in order for destination to reproduce
the original source information , a matching
decompression algorithm must be applied to it.
The application of the compression algorithm is the
main function carried out by the source encoder and the
decompression algorithm is carried out by the
destination decoder.
Fig. source encoder and destination
decoder
(a) Software only (b) special
processor/hardware

Lossless and Lossy compression

Lossless compression :-In the case of lossless
compression algorithm the aim is to reduce the amount
of source information to be transmitted in such a way
that, when the compressed information is
decompressed, there is no loss of information. Lossless
compression is said to be reversible. An example of
lossless compression is for the transfer over a network
of a text file.
Lossless compression algorithms use statistic modelling
technique to reduce the repetitive information in a file.
Some of the methods may include removal of spacing
character, representing a string of repeated character
with a single character or replacing recurring characters
with smaller bit sequences.
Lossy compression:- the aim of lossy compression
algorithm is normally not to reproduce an exact copy of
the source information after decompression but rather a
version of it which is perceived by the recipient as a
true copy. example of lossy compression are for the
transfer of digitized images, audio, and video streams.
Any fine details that may be missing from the original
source signal after decompression are not detectable.
Entropy encoding
Entropy encoding is lossless and independent of the
type of information that is being compressed. It creates
and assigns a unique prefix code to each unique symbol
that occur in the input.
Run length encoding
This type of encoding is used when the source
information comprises long substrings of the same
character or binary digit. In this the source string is
transmitted as a different set of codewords which
indicates not only the particular character or bit being
transmitted but also the number of characters or bits in
the substring. Then providing the destination knows the
set of codewords being used, it simply interprets each
codeword received and outputs the appropriate number
of characters/bits
e.g. output from a scanner in a Fax Machine
000000011111111110000011
will be represented as
0,7 1,10 0,5 1,2
If we ensure the first substring always comprises 0s ,
then the string will be represented as 7,10,5,2.

Statistical encoding
A set of ASCII codewords are often used for the
transmission of strings of characters.
However, the symbols and hence the codewords in the
source information does not occur with the same
frequency.
E.g. in string of text A may occur more frequently than
P which may occur more frequently than Z and so on.
Therefore the statistical encoding uses a set of variable
length codewords with the shortest codeword used to
represent the most frequently appearing symbol.
It is necessary to ensure that a shorter codeword in the
set does not form the start of longer codeword
otherwise the decoder will interpret the string on the
wrong codeword boundaries. To avoid this a codeword
must possess the prefix property. Example of this
property is Huffman encoding algorithm.
Source encoding
Differential encoding
This type of coding is used where the amplitude of a
value/signal covers a large range but difference in
ampltitude between successive value/signals is
relatively small.
Instead of using a set of large codewords to represent
the amplitude of each value/signal, a set of smaller
codeword can be used which indicate the difference in
ampltitude between current value/signal being encoded
and immediately preceding value/signal.
Eg if the digitization of analog signal requires 12 bits
but the maximum difference in amplitude between
successive sample of the signal requires only 3 bits,
then by using only the difference values a saving of 75%
on transmission bandwidth can be obtained.
It can be either lossless or lossy.
If the number of bits is sufficient to cater for the
maximum difference value then it is lossless otherwise
it is lossy.

Transform encoding
It involves transforming the source information from
one form to another, the other form lending itself more
readily to the application of compression.
The transform is typically lossless.
This technique is used in a number of application
including images and video.
It is used to convert spatial image pixel values to
transform coefficient values.
The digitization of a continuous monochromatic image
produces a two dimensional matrix. As we go from one
position in the matrix to the next the magnitude of each
pixel value may vary. The rate of change in magnitude
as one transverse the matrix give rise to term known as
spatial frequency.
For a particular image there will be mix of different
spatial frequencies whose amplitude are determined by
the related change in the magnitude of the pixel.
If we scan the matrix in either horizontal or vertical
direction, it gives rise to the terms horizontal and
vertical frequency components of the image.
Discrete cosine transform technique is used for the
transformation of two-dimensional matrix of pixel
values into an equivalent matrix of spatial frequency
components

Fig. transform coding

(a) Pixal pattern (b) DCT transform
priniciples
DCT transform

principle Text

compression

technique
Static Huffman coding
The character string to be transmitted is first analysed
and the character types and their relative frequency is
determined.
The coding operation involves creating an unbalanced
tree with some branches shorter than the other. The
degree of imbalance is function of relative frequency of
occurrence of the characters.
The resulting tree is known as Huffman code tree. It is a
binary tree with branches assigned the value 0 or 1,
binary 0 for the left branch and binary 1 for the right
branch.
The base of the tree is known as root node.
The point at which branch divides is known as branch
node.
The termination point of a branch is known as leaf
node to which the symbols being encoded are assigned.
The codewords used for each characters are determined
by tracing the path from the root node out to each leaf
and forming a string of binary
values associated with each branch traced.
Huffman encoding example

(a) Codeword generation (b) Huffman code tree

Huffman code tree varies for different set of characters
being transmitted therefore receiver must known the
codewords relating to the data being transmitted .
This is done in two ways, one is the codeword relating
to the next data are sent before the data is transmitted
and the second is the receiver knows in advance what
codeword are being used.
A flowchart of a suitable decoding algorithm is shown in
figure. The algorithm assumes a table of codewords is
available at the receiver and this also holds the
corresponding ASCII codeword.
The received bit stream is held in the variable BIT
STREAM and the variable codeword is used to hold the
bits in codeword while it is being
constructed. Once a codeword is identified the
corresponding ASCII codeword is written into the
variable RECEIVE BUFFER. The procedure repeats until
all the bits in the received string have been processed.

Decoding algorithm

Dynamic Huffman coding

Static Huffman coding requires both the transmitter and
the receiver to know the table of codewords relating to
the data being transmitted. But with dynamic Huffman
coding the transmitter and the receiver build the
Huffman tree and hence the codeword table dynamically
as the characters are being transmitted / received.
Encoder starts by reading the first character ,since tree
is empty, it sends the first character in uncompressed
form. On reception , since the decoder tree is also
empty , it interprets the received bit string as an
uncompressed character and proceeds to assign the
character to its tree in the same way.
The encoder first checks whether the character is
already present in the tree. If it is, then the encoder
sends the current codeword for the character in normal
way ,if it is not present then the encoder sends the
current codeword for the empty leaf .
If the character is already present in the tree, then the
frequency of occurrence of the leaf node is incremented
by unity.
To ensure that both the encoder and decoder do in a
consistent way , first list the weights of a leaf and
branch in the updated tree from left to right and from
bottom to top starting at the empty leaf. The tree is left
unchanged , if they are all in weight order otherwise the
structure of tree is modified by exchanging the position
of the node with the other node in the tree.
Let us assume that the data to betransmitted starts
with the following character string : this is simple
Both transmitter and receiver start with a tree that
comprises the root node and a single empty leaf node (a
leaf node with zero frequency of occurrence assigned to
its zero(0) branch.
There is always one empty leaf node in the tree and its
position and codeword varies as the tree is begin
constructed.
Arithmetic coding
Huffman coding uses a separate codeword for each
character whereas arithmetic coding yields a single
codeword for each encoded string of charcaters.
Arithmetic coding encodes data by creating a code
string
which represent a fractional value on the number line
between 0 and 1.
The first step is to divide the numeric range from 0 to 1
into a number of different characters present in the
message to be sent and the size of the each segment by
the probability of the related character.
Eg. Consider the transmission of a message comprising
a string of characters with probabilities of
e=0.3, n=0.3,t=0.2,w=0.1, .=0.1
we assume the message to be encoded is the single
word went. .

Arithmetic coding principle

the character e has the probability of 0.3 and hence

assigned the range from 0.0 to 0.3, the character n also
has a probability of 0.3 , the range from 0.3 to 0.6 , the
character t has probability of 0.2 and hence the range
from 0.6 to 0.8 and so on.
The first character to be encoded w is in the range from
0.8 to 0.9 hence the range 0.8 to 0.9 is itself subdivided
into five further segments, the width of each segment
again determined by the probability of the five
characters.
Range of symbol =lower limit :lower limit + d *
probability of symbol Where d(difference) = upper
limit – lower limit
0.8 to 0.9 range the upper limit is 0.9 and
lower limit is 0.8 so the difference d is 0.1
Next range of symbol= 0.8 : 0.8+ 0.1(0.3)
i.e from 0.8 to 0.83
this procedure continue until the termination
character . is encoded. At this point, the segment
range of . is from 0.81602 to 0.812
Hence the codeword for the complete string is any
number within the range
0.81602≤ codeword >0.8162
The decoder can follow the same procedure as that
followed by the encoder to determine the character
string relating to each received codeword.
If the received codeword is 0.8161, then decoder can
determine from this that te first character is w (
because its range is 0.8 to 0.9)
This procedure then repeats until it decodes the
termination character.

Lempel-ziv (LZ) coding

It uses strings of characters, instead of using a single
characters as the basis of coding operation. Eg. In text
compression, a table containing all the possible
character strings that occur in the text to be transferred
held by both the encoder and decoder.
Instead of sending the word as a set of individual
(ASCII) codeword the encoder sends only the index of
where the word is stored in the table and on receipt of
each index, the decoder uses this to access the
corresponding word/string of characters from the table
and proceeds to reconstruct text into its original form.
Thus the table is used as a dictionary and the
algorithm is known as a dictionary-based
compression algorithm.
The basic requirement with the LZ algorithm is that a
copy of the dictionary is held by both the encoder and
the decoder.
It can be relatively inefficient if the text to be
transmitted comprises only a small subset of the words
stored in the dictionary.
Hence a variation of the LZ algorithm has been
developed which allow the dictionary to be built up
dynamically by the encoder and the decoder as the
compressed text is being transferred.
Most word –processing packages have a dictionary
associated with them which is used for both spell
checking and for the compression of text , they contain
in the region of 25,000 words and hence 15 bits
(32768 combination) are required to encode the index
Eg. to send the word ‘multimedia’ dictionary require
just 15 bits instead of 70 bits with 7-bit ASCII
codeword.
Shorter words will have a lower compression ratio as
compare to longer words

Lempel –ziv-walsh (LZW) coding

The principle of the Lempel –ziv-walsh(LZW) coding
algorithm is for the encoder and the decoder to build
the contents of the dictionary dynamically as the text is
being transferred.
Initially the dictionary held by both the encoder and the
decoder contains only the character set (i.e ASCII) that
has been used to create the text.
The remaining entries in the dictionary are then build
up dynamically by both the encoder and decoder and
contain the words that occur in the text.
Eg. If the character set comprises 128 characters and
the dictionary is limited to 4096 entries , then the first
128 entries would contains the single characters that
make up the character set and the remaining 3968
entries would each contain strings of two or more
characters that make up the words in the text being
transferred.
Eg. The text to be compressed starts with string:
This is simple as it is…..
Initially the dictionary held by both encoder and
decoder contains only
the individual character from the character set being
used (128 character in the ASCII character set)
Hence the first word in text is sent by the encoder using
the index of each of the four characters T, h, i, s.
When encoder reads the next character from the string
(first space character) ,it determines that this is an
alphanumeric characters therefore it transmits the
character using index as before, in addition to this it
also interprets it as terminating the first word.
Similarly the decoder on receipt of the first codeword,
read the character stored at each index and commence
to reconstruct the text. When it determines that the
fifth character is space character, it interprets this as a
word delimiter and proceeds to store the word this in its
dictionary.
The same procedure is followed by both the encoder
and the decoder transferring the other words.

Figure LZW algorithm

Fig. LZW compression algorithm (a) basic
operation
(b) dynamically extending the number of entries
in the dictionary
With static dictionary the no. of entries is fixed . eg.
The dictionary contains 25000 words require 15 bits to
encode the index but in dynamically dictionary at the
commencement of each transfer the no. of entries is set
to a relatively low but when the available space become
full, then the no. of entries is allowed to increase
incrementally.

MMC Module 3
No ratings yet
MMC Module 3
58 pages
Multimedia-Communication Module 2
No ratings yet
Multimedia-Communication Module 2
26 pages
Text & Image Compression Guide
100% (1)
Text & Image Compression Guide
54 pages
Unit 3
No ratings yet
Unit 3
51 pages
Notes Module2
No ratings yet
Notes Module2
16 pages
MC Module 3 Notes
No ratings yet
MC Module 3 Notes
48 pages
Bec613a MMC Module 3 Notes
No ratings yet
Bec613a MMC Module 3 Notes
35 pages
Bce613a-Mod 3
No ratings yet
Bce613a-Mod 3
22 pages
Data Compression Essentials
No ratings yet
Data Compression Essentials
19 pages
Multimedia Communication Module 3
No ratings yet
Multimedia Communication Module 3
68 pages
Lecture Notes - 17ec741 - Module 3a - Itext & Image Compression - Raja GV
No ratings yet
Lecture Notes - 17ec741 - Module 3a - Itext & Image Compression - Raja GV
50 pages
MMC Module Iii-1
No ratings yet
MMC Module Iii-1
73 pages
3 Chapter Text and Image Compression
No ratings yet
3 Chapter Text and Image Compression
132 pages
Unit 5 - Presentation Layer
No ratings yet
Unit 5 - Presentation Layer
8 pages
Dmslecture 7
No ratings yet
Dmslecture 7
22 pages
Data Compression
No ratings yet
Data Compression
20 pages
Data Compression Techniques: Pushpender Rana, Student
No ratings yet
Data Compression Techniques: Pushpender Rana, Student
4 pages
Bec613a MMC Mod3
No ratings yet
Bec613a MMC Mod3
50 pages
Text & Image Compression Guide
No ratings yet
Text & Image Compression Guide
51 pages
CHAPTER FOURmultimedia
No ratings yet
CHAPTER FOURmultimedia
23 pages
MT Notes
No ratings yet
MT Notes
23 pages
MMC 17ec741 Module 3 Notes
No ratings yet
MMC 17ec741 Module 3 Notes
45 pages
Module 3
No ratings yet
Module 3
23 pages
Chapter 5 Data Compression
No ratings yet
Chapter 5 Data Compression
18 pages
Chapter 5 New
No ratings yet
Chapter 5 New
19 pages
Chapter 7
No ratings yet
Chapter 7
70 pages
Unit 5 DC
No ratings yet
Unit 5 DC
33 pages
Vik
No ratings yet
Vik
23 pages
Module-3: Text and Image Compression
No ratings yet
Module-3: Text and Image Compression
66 pages
Now and Get: Best VTU Student Companion App You Can Get
No ratings yet
Now and Get: Best VTU Student Companion App You Can Get
51 pages
MMC Module 3 Material 1 Final New
No ratings yet
MMC Module 3 Material 1 Final New
50 pages
MMC 17EC741 Module 3 Notes
No ratings yet
MMC 17EC741 Module 3 Notes
23 pages
Compression
No ratings yet
Compression
21 pages
Multimedia Communication - ECE - VTU - 8th Sem - Unit 3 - Text and Image Compression, Ramisuniverse
83% (6)
Multimedia Communication - ECE - VTU - 8th Sem - Unit 3 - Text and Image Compression, Ramisuniverse
30 pages
Compression
100% (1)
Compression
38 pages
Multimedia Communication Vtu Unit 3
71% (7)
Multimedia Communication Vtu Unit 3
44 pages
Multimedia Communication Vtu Unit 3
No ratings yet
Multimedia Communication Vtu Unit 3
44 pages
MM05 1
No ratings yet
MM05 1
27 pages
Chap15 1473751047 598113
No ratings yet
Chap15 1473751047 598113
34 pages
Multimedia Class 11
No ratings yet
Multimedia Class 11
6 pages
Mod 3
No ratings yet
Mod 3
69 pages
Data Compression
No ratings yet
Data Compression
35 pages
Digital Coding for ELEC1010 Students
No ratings yet
Digital Coding for ELEC1010 Students
72 pages
Main Techniques and Performance of Each Compression
No ratings yet
Main Techniques and Performance of Each Compression
23 pages
Comparison of Lossless Data Compression Algorithms
No ratings yet
Comparison of Lossless Data Compression Algorithms
12 pages
Chapter 4 - Introduction To Source Coding PDF
No ratings yet
Chapter 4 - Introduction To Source Coding PDF
72 pages
Mod 2 Multimedia Communications
No ratings yet
Mod 2 Multimedia Communications
20 pages
Data Compression
No ratings yet
Data Compression
35 pages
Unit-Ii Itc
No ratings yet
Unit-Ii Itc
42 pages
MMC Chap3
100% (1)
MMC Chap3
22 pages
Chap 5 Compression
No ratings yet
Chap 5 Compression
43 pages
Multimedia Systems Chapter 7
No ratings yet
Multimedia Systems Chapter 7
21 pages
15 Data Compression: Foundations of Computer Science Cengage Learning
No ratings yet
15 Data Compression: Foundations of Computer Science Cengage Learning
33 pages
Discrete Source Coding Guide
No ratings yet
Discrete Source Coding Guide
21 pages
Why Needed?: Without Compression, These Applications Would Not Be Feasible
No ratings yet
Why Needed?: Without Compression, These Applications Would Not Be Feasible
11 pages
Unit3 Ece MMC 6th Sem
No ratings yet
Unit3 Ece MMC 6th Sem
96 pages
Data Compression Techniques
100% (1)
Data Compression Techniques
18 pages
Multimedia Compression Techniques
No ratings yet
Multimedia Compression Techniques
34 pages
Ayesha
No ratings yet
Ayesha
3 pages
WM55R Specification 12058896
No ratings yet
WM55R Specification 12058896
4 pages
Btech Cs 7 Sem Human Computer Interface rcs073 2021
No ratings yet
Btech Cs 7 Sem Human Computer Interface rcs073 2021
1 page
Bleak 2
No ratings yet
Bleak 2
9 pages
DynamoPrimer Print
No ratings yet
DynamoPrimer Print
569 pages
Understanding Threads and Processes
No ratings yet
Understanding Threads and Processes
10 pages
IoT 4417 Lec 11
No ratings yet
IoT 4417 Lec 11
29 pages
POS Printer Test V3.5
No ratings yet
POS Printer Test V3.5
13 pages
C++ Header Files
No ratings yet
C++ Header Files
5 pages
02 Sn2010eu13sn 0002 System Overview Ewsd PDF
100% (2)
02 Sn2010eu13sn 0002 System Overview Ewsd PDF
126 pages
Software Asset Management Policy Template For CIS Control 2
No ratings yet
Software Asset Management Policy Template For CIS Control 2
19 pages
MAS055 Ifho l1800 Sec1 Sec3
No ratings yet
MAS055 Ifho l1800 Sec1 Sec3
113 pages
Python Metaprogramming
100% (1)
Python Metaprogramming
93 pages
TMUX1109 5 V, 2.5 V, Low-Leakage-Current, 4:1, 2-Channel Precision Multiplexer
No ratings yet
TMUX1109 5 V, 2.5 V, Low-Leakage-Current, 4:1, 2-Channel Precision Multiplexer
41 pages
Object Oriented Programming Assignment 1
No ratings yet
Object Oriented Programming Assignment 1
3 pages
RICOH MP 2014/ MP 2014D/ MP 2014AD: General Scanner
No ratings yet
RICOH MP 2014/ MP 2014D/ MP 2014AD: General Scanner
4 pages
Xbox Series S 512 GB All Digital Gaming Console
No ratings yet
Xbox Series S 512 GB All Digital Gaming Console
1 page
Analog Devices Welcomes Hittite Microwave Corporation: No Content On The Attached Document Has Changed
No ratings yet
Analog Devices Welcomes Hittite Microwave Corporation: No Content On The Attached Document Has Changed
10 pages
Chapter 2 Software Development Life Cycle Models
100% (6)
Chapter 2 Software Development Life Cycle Models
48 pages
Tuned Amplifiers: Concepts & Analysis
No ratings yet
Tuned Amplifiers: Concepts & Analysis
54 pages
Isd4003 Series: Single-Chip, Multiple-Messages Voice Record/Playback Devices 4-, 5-, 6-, AND 8-MINUTE DURATION
No ratings yet
Isd4003 Series: Single-Chip, Multiple-Messages Voice Record/Playback Devices 4-, 5-, 6-, AND 8-MINUTE DURATION
37 pages
SRS of Intra Mailing System
75% (4)
SRS of Intra Mailing System
17 pages
Year 9 Python Programming Basics
No ratings yet
Year 9 Python Programming Basics
13 pages
Intel® Desktop Board DG41TY: Technical Product Specification
No ratings yet
Intel® Desktop Board DG41TY: Technical Product Specification
88 pages
Security Basics Pap, Chap, Eap Radius Examples
No ratings yet
Security Basics Pap, Chap, Eap Radius Examples
35 pages
RTUFRTUTesting
No ratings yet
RTUFRTUTesting
5 pages
Cancer Database System: Manalo, Mathieu Lois Navarette, Joniel Reformado, Juan Paolo
No ratings yet
Cancer Database System: Manalo, Mathieu Lois Navarette, Joniel Reformado, Juan Paolo
41 pages
Jump Statements in C
No ratings yet
Jump Statements in C
6 pages
Understanding Algorithms & Flowcharts
No ratings yet
Understanding Algorithms & Flowcharts
24 pages
Harmonic Filter (Detuned)
No ratings yet
Harmonic Filter (Detuned)
3 pages

Multimdia Word File

Uploaded by

Multimdia Word File

Uploaded by

UNIT -II

It is the process of reducing the amount of data needed

Lossless and Lossy compression

Fig. transform coding

(a) Codeword generation (b) Huffman code tree

Dynamic Huffman coding

Arithmetic coding principle

the character e has the probability of 0.3 and hence

Lempel-ziv (LZ) coding

Lempel –ziv-walsh (LZW) coding

Figure LZW algorithm

You might also like