IMAGE COMPRESSION
Motivation towards Compression
• Digital images require huge amounts of space for
storage and large bandwidths for transmission.
– A 640 x 480 color image requires close to 1MB of
space.
• Calculate the space required for an SD (Standard
Definition, 720 × 480) movie of 2 hours running at
30 fps…!! (Answer ≈ 224 GB)
• Imagine how movies fit on two CDs/DVDs, far
smaller than the required 224 GB…?
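The 224 GB figure can be checked with a few lines, assuming 3 bytes per pixel (24-bit color), no audio track, and decimal gigabytes (1 GB = 10⁹ bytes):

```python
# Back-of-the-envelope check of the uncompressed SD movie size quoted above.
width, height = 720, 480          # SD frame dimensions
bytes_per_pixel = 3               # 24-bit RGB color (assumption)
fps = 30
duration_s = 2 * 60 * 60          # 2 hours in seconds

total_bytes = width * height * bytes_per_pixel * fps * duration_s
print(round(total_bytes / 1e9, 1))   # 223.9 -> the ~224 GB quoted above
```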
Motivation Continued…
• Have you checked the size of an image when
it is clicked by your camera and compared it to
the ones you share on social media…?
• Imagine the network congestion if all images
shared over WhatsApp or any social media had
no provision for compression.
• What would have been the cost of data
transmission then?
Data Compression
• It aims at reducing the amount of data
required to represent a given quantity of
information.
• Data is the means by which information is
conveyed.
• Representations of information often contain
redundant data.
Data vs Information
• The same amount of information can be represented by
varying amounts of data.
• For example:-
• Your wife, Helen, will meet you at Logan Airport in Boston
at 5 minutes past 6:00 pm tomorrow night
• Your wife will meet you at Logan Airport at 5 minutes past
6:00 pm tomorrow night
• Helen will meet you at Logan at 6:00 pm tomorrow night
• All 3 statements convey the same information with
different levels of data redundancy, the first statement
containing the most redundant data.
Compression Ratio
• If n1 and n2 denote the number of information-carrying
units (e.g., bits) in the original and compressed
representations, the compression ratio is:
  C = n1 / n2
Data Redundancy
• Relative data redundancy of the original representation:
  R = 1 − 1/C
• Example: if C = 10 (a 10:1 ratio), then R = 0.9, i.e.,
90% of the data in the original representation is redundant.
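The two quantities are straightforward to compute; a minimal sketch with hypothetical sizes:

```python
# Compression ratio C = n1/n2 and relative redundancy R = 1 - 1/C.
def compression_ratio(n1, n2):
    """n1: units in the original representation, n2: units in the compressed one."""
    return n1 / n2

def relative_redundancy(c):
    return 1 - 1 / c

c = compression_ratio(1_000_000, 100_000)   # hypothetical 10:1 compression
r = relative_redundancy(c)
print(c, r)   # 10.0 0.9 -> 90% of the original data is redundant
```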
Types of Data Redundancy
(1) Coding
(2) Interpixel
(3) Psychovisual
• Compression attempts to reduce one or
more of these redundancy types.
Image Compression
• It is the art and science of reducing the
amount of data required to represent an
image.
Fig 1: Idea of Compression
Coding Redundancy
• A code is a system of symbols used to represent a
body of information or set of events.
• Each piece of information or event is assigned a
sequence of code symbols called code word.
• The number of symbols in each codeword is
called its length.
• For eg:- Each pixel in a gray scale image is
represented by 8-bits binary. ( Pixel value = 255,
code = 11111111, code length per pixel= 8)
Spatial & Temporal Redundancy
• Spatial Redundancy means the neighbouring
pixels are correlated in a 2-D image and hence
it is redundant to store all the pixel values.
• Temporal Redundancy is observed in case of
videos where the adjacent frames are similar
to each other i.e. the frames are correlated in
time.
Psychovisual Redundancy
• The human eye does not respond with equal
sensitivity to all visual information.
• It is more sensitive to the lower frequencies
than to the higher frequencies in the visual
spectrum.
• Idea: Discard data that is perceptually
insignificant to human eyes.
Types of Image Compression
• Based on the loss incurred in the compression
scheme, it is classified into the following:-
• Lossless
– Information preserving
– Low compression ratios
• Lossy
– Not information preserving
– High compression ratios
• Trade-off: image quality vs compression ratio
Fidelity Criteria
• How close is the reconstructed image f̂(x, y) to the original f(x, y)?
• Criteria
– Subjective: based on human observers
– Objective: mathematically defined criteria
Subjective Fidelity Criteria
Objective Fidelity Criteria
• Root-mean-square (RMS) error:
  e_rms = [ (1/MN) Σx Σy ( f̂(x, y) − f(x, y) )² ]^(1/2)
• Mean-square signal-to-noise ratio (SNR):
  SNR_ms = Σx Σy f̂(x, y)² / Σx Σy ( f̂(x, y) − f(x, y) )²
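Both criteria are easy to evaluate with NumPy; a small sketch on a hypothetical 2 × 2 block:

```python
import numpy as np

def rms_error(f, f_hat):
    # e_rms = sqrt( (1/MN) * sum over x,y of (f_hat - f)^2 )
    return np.sqrt(np.mean((f_hat.astype(float) - f.astype(float)) ** 2))

def ms_snr(f, f_hat):
    # SNR_ms = sum f_hat^2 / sum (f_hat - f)^2
    err = f_hat.astype(float) - f.astype(float)
    return np.sum(f_hat.astype(float) ** 2) / np.sum(err ** 2)

f = np.array([[100, 102], [98, 100]])        # hypothetical original
f_hat = np.array([[101, 101], [99, 100]])    # hypothetical reconstruction
e = rms_error(f, f_hat)
snr = ms_snr(f, f_hat)
print(e)   # sqrt(3/4) ≈ 0.866
```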
Lossless Compression
Lossless Methods: Taxonomy
Types of Lossless Coding Techniques
(Entropy Coding)
1. Repetitive Sequence Encoding (e.g., Run-Length Encoding)
2. Statistical Encoding (e.g., Huffman, Arithmetic, LZW)
3. Lossless Predictive Coding (e.g., DPCM)
4. Bit-plane Encoding
Huffman Coding (coding redundancy)
• A variable-length coding technique.
• Optimal code (i.e., minimizes the number
of code symbols per source symbol).
• Assumption: symbols are encoded one at a
time!
Huffman Coding (cont’d)
• Forward Pass
1. Sort the symbol probabilities
2. Combine the two lowest probabilities
3. Repeat Step 2 until only two probabilities remain.
Huffman Coding (cont’d)
Backward Pass
Assign code symbols going backwards
Huffman Coding (cont’d)
• Average code length (bits per symbol):
  Lavg = Σk l(ak) P(ak)
where l(ak) is the length of the codeword assigned to
symbol ak and P(ak) is its probability.
• Huffman coding yields a smaller Lavg than a
fixed-length binary code for the same source.
Huffman Coding/Decoding
• After the code has been created,
coding/decoding can be implemented using
a look-up table.
• Note that decoding is done unambiguously
in this case.
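The forward/backward passes above can be sketched compactly with a heap; the six-symbol probability table here is an assumed example (the slide does not give one):

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code from {symbol: probability}; returns {symbol: bitstring}."""
    # Heap items: (probability, unique tie-breaker, {symbol: code-so-far})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # combine the two lowest probabilities
        p2, i, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, i, merged))
    return heap[0][2]

# Hypothetical six-symbol source:
probs = {"a1": 0.4, "a2": 0.3, "a3": 0.1, "a4": 0.1, "a5": 0.06, "a6": 0.04}
code = huffman_code(probs)
L_avg = sum(probs[s] * len(code[s]) for s in probs)
print(round(L_avg, 2))   # 2.2 bits/symbol, vs 3 bits for a fixed-length code
```

Because the code is prefix-free, decoding is unambiguous, as noted above.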
Arithmetic (or Range) Coding
(coding redundancy)
• No assumption on encoding source symbols
one at a time.
– Sequences of source symbols are encoded
together.
– There is no one-to-one correspondence
between source symbols and code words.
• Slower than Huffman coding but typically
achieves better compression.
Arithmetic Coding (cont’d)
• A sequence of source symbols is assigned a
single arithmetic code word which
corresponds to a sub-interval in [0,1].
• As the number of symbols in the message
increases, the interval used to represent it
becomes smaller.
• Smaller intervals require more information
units (i.e., bits) to be represented.
Arithmetic Coding (cont’d)
Encode the message: a1 a2 a3 a3 a4
1) Assume the message occupies the interval [0, 1)
2) Subdivide [0, 1) in proportion to the symbol probabilities P(ai)
3) Narrow the interval as each source symbol is processed
Example of Arithmetic Coding
Encoding a1 a2 a3 a3 a4 yields the final interval
[0.06752, 0.0688); any number inside it, e.g., 0.068,
represents the message.
Example contd…
• The message a1 a2 a3 a3 a4 is encoded using
3 decimal digits or 3/5 = 0.6 decimal digits
per source symbol.
• The entropy of this message (with P(a1) = P(a2) = P(a4) = 0.2
and P(a3) = 0.4) is:
  −(3 × 0.2 log10(0.2) + 0.4 log10(0.4)) = 0.5786 digits/symbol
Note: finite precision arithmetic might
cause problems due to truncations!
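The interval-narrowing steps above can be reproduced directly, assuming the probability model implied by the entropy computation (P(a1) = P(a2) = P(a4) = 0.2, P(a3) = 0.4):

```python
probs = {"a1": 0.2, "a2": 0.2, "a3": 0.4, "a4": 0.2}

def arithmetic_encode(message, probs):
    """Return the final [low, high) interval for the message."""
    # Sub-intervals of [0, 1): a1 -> [0, 0.2), a2 -> [0.2, 0.4), a3 -> [0.4, 0.8), ...
    cum, cdf = 0.0, {}
    for s, p in probs.items():
        cdf[s] = (cum, cum + p)
        cum += p
    low, high = 0.0, 1.0
    for s in message:              # narrow the interval per source symbol
        width = high - low
        lo_s, hi_s = cdf[s]
        low, high = low + width * lo_s, low + width * hi_s
    return low, high

low, high = arithmetic_encode(["a1", "a2", "a3", "a3", "a4"], probs)
print(low, high)   # ≈ 0.06752 0.0688 — any value inside encodes the message
```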
Arithmetic Decoding
• Decode 0.572: at each step, find the symbol whose
subinterval contains the current value, output that
symbol, and rescale the value back to [0, 1).
• For this model, 0.572 decodes to: a3 a3 a1 a2 a4
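The decoding procedure is the mirror image of encoding; a sketch under the same assumed model (P(a1) = P(a2) = P(a4) = 0.2, P(a3) = 0.4):

```python
probs = {"a1": 0.2, "a2": 0.2, "a3": 0.4, "a4": 0.2}

def arithmetic_decode(value, probs, n_symbols):
    """Decode n_symbols from a single arithmetic code value in [0, 1)."""
    cum, intervals = 0.0, []
    for s, p in probs.items():
        intervals.append((s, cum, cum + p))
        cum += p
    out = []
    for _ in range(n_symbols):
        for s, lo, hi in intervals:
            if lo <= value < hi:              # locate the containing subinterval
                out.append(s)
                value = (value - lo) / (hi - lo)   # rescale to [0, 1)
                break
    return out

print(arithmetic_decode(0.572, probs, 5))   # ['a3', 'a3', 'a1', 'a2', 'a4']
```

Note that the decoder must know how many symbols to emit (or use a terminator symbol), since every prefix of the computation is itself a valid message.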
LZW Coding (Interpixel
Redundancy)
• Requires no prior knowledge of pixel
probability distribution values.
• Assigns fixed length code words to variable
length sequences.
• It is a type of Dictionary based Coding.
• Included in the GIF, TIFF, and PDF file formats
LZW Coding
• A codebook (or dictionary) needs to be
constructed.
• Initially, the first 256 entries of the dictionary are assigned
to the gray levels 0,1,2,..,255 (i.e., assuming 8 bits/pixel)
Consider a 4×4, 8-bit image:
39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126

Initial Dictionary:
Location   Entry
0          0
1          1
.          .
255        255
256        -
...        -
511        -
LZW Coding (cont’d)
As the encoder examines image pixels, gray-level
sequences (i.e., blocks) that are not in the dictionary
are assigned a new entry.
- Is 39 in the dictionary? …….. Yes
- What about 39-39? …………. No
- Then add 39-39 at entry 256

Dictionary Location   Entry
0                     0
1                     1
.                     .
255                   255
256                   39-39
...                   -
511                   -
Example
Notation: CS = concatenated sequence (CS = CR + P),
CR = currently recognized sequence, P = next pixel,
D = dictionary.

CR = empty
For each pixel P:
  CS = CR + P
  If CS is found in D:
    (1) No output
    (2) CR = CS
  else:
    (1) Output D(CR)
    (2) Add CS to D
    (3) CR = P
At the end, output D(CR) for the last recognized sequence.
Decoding LZW
• The dictionary which was used for encoding need not
be sent with the image.
• Can be built on the “fly” by the decoder as it reads
the received code words.
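The encoding loop described above, run on the 4×4 image from the example, can be sketched as:

```python
def lzw_encode(pixels):
    """LZW-encode a sequence of 8-bit values; dictionary starts with 0..255."""
    dictionary = {(i,): i for i in range(256)}
    next_code = 256
    cr = ()                    # CR: currently recognized sequence
    out = []
    for p in pixels:
        cs = cr + (p,)         # CS = CR + P
        if cs in dictionary:
            cr = cs            # keep growing, no output
        else:
            out.append(dictionary[cr])
            dictionary[cs] = next_code   # new entries at 256, 257, ...
            next_code += 1
            cr = (p,)
    if cr:
        out.append(dictionary[cr])       # flush the last recognized sequence
    return out

image = [39, 39, 126, 126] * 4           # the 4x4 image above, row by row
print(lzw_encode(image))   # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
```

Sixteen 8-bit pixels are thus represented by ten codes, because repeated blocks such as 39-39 and 126-126 are emitted as single dictionary indices.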
Run-length coding (RLC)
(Interpixel redundancy)
• Used to reduce the size of a repeating string of characters
(i.e., runs):
1 1 1 1 1 0 0 0 0 0 0 1  →  (1,5) (0,6) (1,1)
a a a b b b b b b c c  →  (a,3) (b,6) (c,2)
• Encodes a run of symbols into two bytes: (symbol, count)
• Can compress any type of data but cannot achieve high
compression ratios compared to other compression
methods.
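The (symbol, count) encoding above is a few lines of code:

```python
def run_length_encode(seq):
    """Encode runs of identical symbols as (symbol, count) pairs."""
    out = []
    for s in seq:
        if out and out[-1][0] == s:
            out[-1] = (s, out[-1][1] + 1)   # extend the current run
        else:
            out.append((s, 1))              # start a new run
    return out

print(run_length_encode([1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]))
# [(1, 5), (0, 6), (1, 1)]
print(run_length_encode("aaabbbbbbcc"))   # [('a', 3), ('b', 6), ('c', 2)]
```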
Bit-plane coding (Interpixel redundancy)
• An effective technique to reduce interpixel
redundancy is to process each bit plane
individually.
(1) Decompose an image into a series of binary
images.
(2) Compress each binary image (e.g., using
run-length coding)
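Step (1), the decomposition into binary images, can be sketched with bit shifts (a tiny 2 × 2 example; real inputs would be full images):

```python
import numpy as np

def bit_planes(image):
    """Decompose an 8-bit image into 8 binary images (plane 0 = LSB)."""
    return [(image >> k) & 1 for k in range(8)]

img = np.array([[255, 128], [0, 129]], dtype=np.uint8)
planes = bit_planes(img)
# Plane 7 (the MSB) separates values >= 128 from the rest:
print(planes[7])   # [[1 1]
                   #  [0 1]]
# The planes reconstruct the original exactly (lossless):
recon = sum(p.astype(np.uint16) << k for k, p in enumerate(planes))
```

Higher-order planes tend to contain long uniform runs, which is why run-length coding compresses them well.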
Combining Huffman Coding
with Run-length Coding
• Assuming that a message has been
encoded using Huffman coding, additional
compression can be achieved using run-
length coding.
e.g., (0,1)(1,1)(0,1)(1,0)(0,2)(1,4)(0,2)
Transform based Coding
• Transform the image into a domain where
compression can be performed more
efficiently (i.e. reduce interpixel
redundancies).
• Transform types:- DFT, DCT, K-L Transform
• Dividing an N × N image into n × n blocks yields ~(N/n)² subimages.
Transform selection
- Consider an image f(x, y) of size N × N
- The forward discrete transform is defined as:
  T(u, v) = Σ(x=0..N−1) Σ(y=0..N−1) f(x, y) g(x, y, u, v),
  u, v = 0, 1, ..., N−1
- The inverse discrete transform is defined as:
  f(x, y) = Σ(u=0..N−1) Σ(v=0..N−1) T(u, v) h(x, y, u, v),
  x, y = 0, 1, ..., N−1
Example : Discrete Fourier Transform
- Forward transform kernel:
  g(x, y, u, v) = e^(−j2π(ux + vy)/N)
- Inverse transform kernel:
  h(x, y, u, v) = (1/N²) e^(j2π(ux + vy)/N)
Discrete Cosine Transform (DCT)
The discrete cosine transform (DCT) gets its name from the fact
that the rows of the N × N transform matrix C are obtained as a
function of cosines.
  g(x, y, u, v) = h(x, y, u, v)
  = α(u) α(v) cos[(2x + 1)uπ / 2N] cos[(2y + 1)vπ / 2N]
where
  α(u) = √(1/N) for u = 0
  α(u) = √(2/N) for u = 1, 2, ..., N−1
4/9/2020 Prof. K.P. Nayak
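The transform matrix C defined by these kernels can be built directly; because C is orthonormal, the 2-D transform T = C f Cᵀ is trivially inverted as f = Cᵀ T C:

```python
import numpy as np

def dct_matrix(N):
    """N x N DCT matrix C: row u is alpha(u) * cos((2x+1) u pi / 2N)."""
    C = np.zeros((N, N))
    for u in range(N):
        alpha = np.sqrt(1.0 / N) if u == 0 else np.sqrt(2.0 / N)
        for x in range(N):
            C[u, x] = alpha * np.cos((2 * x + 1) * u * np.pi / (2 * N))
    return C

C = dct_matrix(8)
# Orthonormality: C @ C.T is the identity, so the inverse transform is C.T.
print(np.allclose(C @ C.T, np.eye(8)))   # True
```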
Why DCT
- Blocking artifacts are less pronounced in the DCT than
in the DFT.
- The DCT is a good approximation of the Karhunen-Loeve
Transform (KLT), which is optimal in terms of energy
compaction.
- However, unlike the KLT, the DCT has image-independent
basis functions.
- The DCT is used in the JPEG compression standard.
JPEG ( Joint Photographic Experts Group )
-> It is a compression standard for still images
-> It defines three different coding systems :
1. A lossy baseline coding system based on DCT (adequate
for most compression applications )
2. An extended system for greater compression, higher
precision, or progressive reconstruction applications
3. A lossless independent coding system for reversible
compression.
Steps of JPEG Compression
JPEG – Baseline Coding System
1. Computing the DCT: the image is divided into 8 × 8 blocks,
and each block is transformed into frequency components.
2. Each block is represented by 64 frequency components,
one of which (the DC component) is the average color of
the 64 pixels in the block.
3. Applying the FDCT changes the blocks from the spatial
domain into the frequency domain.
4. Roughly speaking, F(0,0) is the average value within the
block (called the DC component).
5. The other values store the variation within a block (the AC
components).
Example : Fast DCT
f =
183 160  94 153 194 163 132 165
183 153 116 176 187 166 130 169
179 168 171 182 179 170 131 167
177 177 179 177 179 165 131 167
178 178 179 176 182 164 130 171
179 180 180 179 183 164 130 171
179 179 180 182 183 170 129 173
180 179 181 179 181 170 130 169
Example : Level Shifting
fs = f - 128 =
 55  32 -34  25  66  35   4  37
 55  25 -12  48  59  38   2  41
 51  40  43  54  51  42   3  39
 49  49  51  49  51  37   3  39
 50  50  51  48  54  36   2  43
 51  52  52  51  55  36   2  43
 51  51  52  54  55  42   1  45
 52  51  53  51  53  42   2  41
Example : Computing the DCT
312 56 -27 17 79 -60 26 -26
-38 -28 13 45 31 -1 -24 -10
-20 -18 10 33 21 -6 -16 -9
dcts=round(dct2(fs))
-11 -7 9 15 10 -11 -13 1
-6 1 6 5 -4 -7 -5 5
3 3 0 -2 -7 -4 1 2
3 5 0 -4 -8 -1 2 4
3 1 -1 -2 -3 -1 4 1
Example : The Normalization Matrix
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
qmat=
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
Example : Normalization
Each DCT coefficient is divided by the corresponding entry of
qmat and rounded; e.g., the DC coefficient becomes
312/16 = 19.5 ≈ 20 (the quantized value). Applying this
element-wise to dcts produces the thresholded matrix t.
Example : Thresholding
t = round(dcts ./ qmat) =
20  5 -3  1  3 -2  1  0
-3 -2  1  2  1  0  0  0
-1 -1  1  1  1  0  0  0
-1  0  0  1  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
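The whole forward path (level shift, 2-D DCT, normalization) and the decoder's inverse path can be reproduced in a few lines. This sketch assumes `scipy.fft.dctn`/`idctn` with `norm='ortho'` match MATLAB's `dct2`/`idct2` used on the slides:

```python
import numpy as np
from scipy.fft import dctn, idctn

f = np.array([[183, 160,  94, 153, 194, 163, 132, 165],
              [183, 153, 116, 176, 187, 166, 130, 169],
              [179, 168, 171, 182, 179, 170, 131, 167],
              [177, 177, 179, 177, 179, 165, 131, 167],
              [178, 178, 179, 176, 182, 164, 130, 171],
              [179, 180, 180, 179, 183, 164, 130, 171],
              [179, 179, 180, 182, 183, 170, 129, 173],
              [180, 179, 181, 179, 181, 170, 130, 169]], dtype=float)
qmat = np.array([[16, 11, 10, 16, 24, 40, 51, 61],
                 [12, 12, 14, 19, 26, 58, 60, 55],
                 [14, 13, 16, 24, 40, 57, 69, 56],
                 [14, 17, 22, 29, 51, 87, 80, 62],
                 [18, 22, 37, 56, 68, 109, 103, 77],
                 [24, 35, 55, 64, 81, 104, 113, 92],
                 [49, 64, 78, 87, 103, 121, 120, 101],
                 [72, 92, 95, 98, 112, 100, 103, 99]], dtype=float)

fs = f - 128                                   # level shifting
dcts = np.round(dctn(fs, norm='ortho'))        # 2-D DCT
t = np.round(dcts / qmat)                      # normalization / thresholding
ds_hat = t * qmat                              # decoder: denormalize
fs_hat = np.round(idctn(ds_hat, norm='ortho')) # inverse DCT
f_hat = fs_hat + 128                           # shift back
print(int(dcts[0, 0]), int(t[0, 0]))           # 312 20
```

The reconstruction f_hat is close to f but not identical: this baseline system is lossy, and the loss is introduced entirely by the rounding in the normalization step.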
Zig-Zag Scanning of the Coefficients
t = round(dcts ./ qmat) =
20  5 -3  1  3 -2  1  0
-3 -2  1  2  1  0  0  0
-1 -1  1  1  1  0  0  0
-1  0  0  1  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
[ 20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,0,EOB ]
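The zig-zag order traverses the anti-diagonals of the block, alternating direction, so that low-frequency coefficients come first and the trailing zeros can be replaced by a single EOB marker:

```python
def zigzag_order(n=8):
    """(row, col) pairs in JPEG zig-zag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):                  # anti-diagonals i + j = s
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()                      # even diagonals run bottom-left to top-right
        order.extend(diag)
    return order

t = [[20, 5, -3, 1, 3, -2, 1, 0],
     [-3, -2, 1, 2, 1, 0, 0, 0],
     [-1, -1, 1, 1, 1, 0, 0, 0],
     [-1, 0, 0, 1, 0, 0, 0, 0],
     [0] * 8, [0] * 8, [0] * 8, [0] * 8]

scan = [t[i][j] for i, j in zigzag_order()]
print(scan[:29])   # matches the sequence above; every later entry is zero -> EOB
```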
Decoding the Coefficients
ds_hat = t .* qmat =
320  55 -30  16  72 -80  51   0
-36 -24  14  38  26   0   0   0
-14 -13  16  24  40   0   0   0
-14   0   0  29   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
Computing the IDCT
fs_hat = round(idct2(ds_hat)) =
67 12 -9 20 69 43 -8 42
58 25 15 30 65 40 -4 47
46 41 44 40 59 38  0 49
41 52 59 43 57 42  3 42
44 54 58 40 58 47  3 33
49 52 53 40 61 47  1 33
53 50 53 46 63 41  0 45
55 50 56 53 64 34 -1 57
Shifting Back the Coefficients
f_hat = fs_hat + 128 =
195 140 119 148 197 171 120 170
186 153 143 158 193 168 124 175
174 169 172 168 187 166 128 177
169 180 187 171 185 170 131 170
172 182 186 168 186 175 131 161
177 180 181 168 189 175 129 161
181 178 181 174 191 169 128 173
183 178 184 181 192 162 127 185
For comparison, the original block:
f =
183 160  94 153 194 163 132 165
183 153 116 176 187 166 130 169
179 168 171 182 179 170 131 167
177 177 179 177 179 165 131 167
178 178 179 176 182 164 130 171
179 180 180 179 183 164 130 171
179 179 180 182 183 170 129 173
180 179 181 179 181 170 130 169
JPEG as Block Transform Encoding
What is Wavelet Transform ?
- Wavelets are functions defined over a finite interval and
having an average value of zero.
- The basic idea of the wavelet transform is to represent any
arbitrary function f(t) as a superposition of a set of such
wavelets or basis functions.
- These basis functions or baby wavelets are obtained from a
single prototype wavelet, called the mother wavelet, by
contractions (scaling) and translations (shifts).
- The Discrete Wavelet Transform of a finite-length signal x(n)
having N components, for example, is expressed by an N × N
matrix.
Why Wavelet-based Compression?
• No need to divide the input image into blocks; the basis functions
have variable length, which avoids blocking artifacts.
• More robust under transmission and decoding errors.
• Better matched to the human visual system (HVS) characteristics:
good frequency resolution at lower frequencies and good time
resolution at higher frequencies, which suits natural images.
Subband coding
-The fundamental concept behind Subband Coding (SBC) is to
split up the frequency band of a signal (image in our case) and
then to code each subband using a coder and bit rate accurately
matched to the statistics of the band.
- SBC has been used extensively first in speech coding and later
in image coding.
Wavelets as filter
Essentially a multiresolution filtering process:
− HPF (Mother wavelet)
− LPF (Father scaling)
− Apply in different resolutions
One Stage Filtering
Approximations and details:
- The low-frequency content is the most important part in
many applications, and gives the signal its identity.
This part is called “Approximations”.
- The high-frequency gives the ‘flavor’, and is called “Details”.
Approximations and Details:
- Approximations: low- frequency components of the signal
- Details: high- frequency components
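One filtering stage can be sketched with the Haar wavelet, the simplest choice (an assumption; the slides do not fix a particular wavelet). The low-pass output is the approximations, the high-pass output the details, and the two together reconstruct the signal exactly:

```python
import math

def haar_step(signal):
    """One stage: pairwise averages (approximations) and differences (details)."""
    assert len(signal) % 2 == 0
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2) for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2) for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Perfect reconstruction from one decomposition stage."""
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) / math.sqrt(2), (a - d) / math.sqrt(2)]
    return out

x = [4, 6, 10, 12, 8, 6, 5, 5]        # hypothetical 1-D signal
a, d = haar_step(x)
print([round(v, 3) for v in a])       # smooth trend at half the resolution
print([round(v, 3) for v in d])       # local fluctuations (mostly small)
recon = haar_inverse(a, d)            # equals x up to rounding error
```

Iterating `haar_step` on the approximations yields exactly the multi-level decomposition tree described in the next slide.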
Multi-level Decomposition
Iterating the decomposition process breaks the input signal
into many lower-resolution components.
The wavelet decomposition tree is given below:
Wavelet Image Decomposition
• Multilevel decomposition: filtering is applied to every row,
then to every column of the image.
• A 512 × 512 image is thus reduced to 256 × 256 subbands,
then to 128 × 128, and so on.
Decomposition and Compression
Quantization
Two important observations:
1. Natural images in general have a low pass spectrum, so the
wavelet coefficients will, on average, be smaller in the higher
subbands than in the lower subbands.
2. Large wavelet coefficients are more important than smaller
wavelet coefficients.
Example
• Here is a two-stage wavelet decomposition of an image.
Notice the large number of zeros (black):
Matlab toolbox
Comparison
• JPEG (DCT based) vs JPEG-2000 (Wavelet based)
• Original (979 KB) | JPEG (6.21 KB) | JPEG-2000 (1.83 KB)