Isp Mod 2 Complete
Module 2
IMAGE COMPRESSION
Part b
Lossless compression is a type of data compression in which the original data can
be reconstructed exactly from the compressed data. This means that there is no loss
of quality or data when the file is decompressed. Lossless compression is used for
data that needs to be stored or transmitted exactly as it is, such as medical images, text documents, and program files.
On the other hand, lossy compression is a type of data compression in which some
of the original data is lost during the compression process. Lossy compression is
used for data that can tolerate some loss of quality, such as audio and video files.
While lossy compression can achieve higher levels of compression than lossless
compression, it does so at the expense of some loss of quality in the original data.
In general, lossless compression is used for data that needs to be preserved exactly,
while lossy compression is used for data that can tolerate some loss of quality in exchange for a smaller file size.
Run length encoding is a lossless data compression technique that encodes data by
representing a sequence of repeated values as a single value and a count of the number of times the value appears. For example, consider the string:
"WWWWWWBBBRRRR"
Using run length encoding, this string can be compressed to the following:
"6W3B4R"
In this example, the letter "W" appears 6 times in a row, so it is represented as "6W". The
letter "B" appears 3 times in a row, so it is represented as "3B". The letter "R" appears 4
Overall, this results in a compression ratio of 1:4, as the original string has a length of 14
Note that run length encoding is most effective on data that contains long sequences of
repeated values, as it can significantly reduce the size of the data. It is less effective on
data that does not have many repeated values, as the compressed data may not be much smaller than the original, and may even be larger.
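As an illustration, here is a minimal sketch of run length encoding in Python; the function name and the count-then-value output format are choices made for this example, matching the "6W3B4R" style above.

    def rle_encode(data: str) -> str:
        if not data:
            return ""
        out, run_char, run_len = [], data[0], 1
        for ch in data[1:]:
            if ch == run_char:
                run_len += 1  # extend the current run
            else:
                out.append(f"{run_len}{run_char}")  # emit count + value
                run_char, run_len = ch, 1
        out.append(f"{run_len}{run_char}")  # emit the final run
        return "".join(out)

    print(rle_encode("WWWWWWBBBRRRR"))  # -> "6W3B4R"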
There is a need for image compression because images are often very large files that
take up a lot of space and can be difficult to transmit over the internet or other networks.
Image compression is a way of reducing the file size of an image while maintaining as much visual quality as possible.
One approach to image compression is run length encoding, which is a lossless data compression technique that represents a sequence of repeated values as a single value and a count of the number of times the value appears. In the
context of image compression, run length encoding can be used to compress an image
by identifying and encoding sequences of pixels that have the same color.
For example, consider an image with the following pixel values:
(255, 255, 255, 255, 255, 0, 0, 0, 0, 0, 255, 255, 255, 255, 255)
Using run length encoding, this image can be compressed to the following:
(5, 255), (5, 0), (5, 255)
In this example, the pixel value "255" appears 5 times in a row, so it is represented as "5, 255". The pixel value "0" appears 5 times in a row, so it is represented as "5, 0".
Run length encoding is a lossless compression technique because the original data can
be exactly reconstructed from the compressed data. This means that there is no loss of quality or information when the image is decompressed.
Overall, the use of run length encoding for image compression can help to reduce the file
size of an image, making it easier to store and transmit over the internet or other
networks.
For example, suppose a source emits the characters A, B, C, D, and E with the following probabilities:
● A: 0.1
● B: 0.2
● C: 0.3
● D: 0.1
● E: 0.3
We can then use these probabilities to assign a range of values to each character. For
example:
● A: [0.0, 0.1)
● B: [0.1, 0.3)
● C: [0.3, 0.6)
● D: [0.6, 0.7)
● E: [0.7, 1.0)
To encode the string "ABBCCCDEE", we can start with the range 0.0 - 1.0 and then
continually update the range based on the character being encoded. For example:
Start: [0.0, 1.0)
A → [0.0, 0.1)
B → [0.01, 0.03)
B → [0.012, 0.016)
C → [0.0132, 0.0144)
C → [0.01356, 0.01392)
C → [0.013668, 0.013776)
D → [0.0137328, 0.0137436)
E → [0.01374036, 0.0137436)
E → [0.01374263, 0.01374360)
The final range, approximately [0.0137426, 0.0137436), represents the encoded string "ABBCCCDEE". This range can be represented by any single value within it, such as 0.013743.
To decode the string, we can use the same probabilities and ranges to reconstruct the original characters one at a time, at each step identifying the symbol whose range contains the encoded value. Arithmetic coding is a lossless compression technique because the original data can be exactly reconstructed from the compressed data. This means that there is no loss of information or quality when the data is decompressed.
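The interval-narrowing computation above can be sketched in a few lines of Python; the range table mirrors the A-E probabilities listed earlier, and the function name is an illustrative choice.

    RANGES = {"A": (0.0, 0.1), "B": (0.1, 0.3), "C": (0.3, 0.6),
              "D": (0.6, 0.7), "E": (0.7, 1.0)}

    def arithmetic_encode(message: str):
        low, high = 0.0, 1.0
        for sym in message:
            s_low, s_high = RANGES[sym]
            width = high - low
            # Narrow [low, high) to the subinterval assigned to sym.
            low, high = low + width * s_low, low + width * s_high
        return low, high

    low, high = arithmetic_encode("ABBCCCDEE")
    print(low, high)  # ~0.0137426 and ~0.0137436; any value between them encodes the string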
The average length of the code in a Huffman coding system is the average number of
bits used to represent each symbol in the original data. In a Huffman coding system, the
average length of the code is directly related to the probabilities of the symbols in the
original data.
Symbols with higher probabilities are assigned shorter codes, while symbols with lower probabilities are assigned longer codes. The average length of the code, L = sum over all symbols of (probability of the symbol × length of its code), will therefore be lower for data with a skewed distribution of symbol probabilities, and will approach the length of a fixed-length code when the symbols are close to equally probable.
Huffman coding is a uniquely decodable code, which means that there is a unique way
to decode the compressed data to obtain the original data. This is because the Huffman
coding system assigns a unique code to each symbol in the original data, and the codes
are designed in such a way that they can be easily distinguished from one another.
For example, consider the following set of symbol codes:
Symbol Code
A 0
B 10
C 110
D 111
In this example, the codes for the symbols A, B, C, and D are all distinct from one
another, and there is no overlap between the codes. This means that it is always
possible to unambiguously decode the compressed data to obtain the original data.
Overall, the uniquely decodable nature of Huffman coding makes it an effective and reliable method for lossless data compression.
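To see why decoding is unambiguous, consider this small sketch, which decodes a bitstring with the code table above by reading bits until a complete codeword is matched (the bitstring in the example is hypothetical).

    CODES = {"A": "0", "B": "10", "C": "110", "D": "111"}
    DECODE = {code: sym for sym, code in CODES.items()}

    def decode(bits: str) -> str:
        out, current = [], ""
        for bit in bits:
            current += bit
            if current in DECODE:  # a complete codeword has been read
                out.append(DECODE[current])
                current = ""
        return "".join(out)

    print(decode("010110111"))  # -> "ABCD": no codeword is a prefix of another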
JPEG (Joint Photographic Experts Group) is a widely used image compression standard
that is designed to reduce the file size of digital images while maintaining as much visual quality as possible. To compress an image, it is first divided into 8 x 8 blocks of pixels (macroblocks); each block is transformed to the frequency domain with a discrete cosine transform (DCT), the DCT coefficients are quantized, and the quantized coefficients are entropy coded using Huffman coding.
To decompress the image, the process is reversed: the compressed data is decoded
using the Huffman coding, the quantized DCT coefficients are dequantized, the DCT
coefficients are transformed back to the spatial domain using an inverse discrete cosine
transform (IDCT), and the macroblocks are combined to form the decompressed image.
Overall, the use of the DCT and quantization in the JPEG compression process allows for
a high level of data reduction while maintaining relatively good image quality, making it a popular choice for compressing photographic images.
The basic idea behind arithmetic coding is to assign a range of values to each symbol in
the original data, and then to update the range based on the symbol being encoded. The
final range, which can be represented as a single fractional value, is the encoded version
of the original data. To decode the data, the process is reversed: the encoded value is
used to reconstruct the original range, and the original data is obtained by decoding the symbols within the range.
1. Assign probabilities to each symbol in the original data based on their frequency
of occurrence.
2. Use the probabilities to assign a range of values to each symbol. The range of
values for each symbol should be a contiguous subrange of the overall range of
possible values.
3. Initialize the range to the full range of possible values (e.g., 0.0 to 1.0).
4. Encode each symbol in the original data by updating the range based on the
symbol's range of values. For example, if the symbol is "A" and its range of values
is 0.0 to 0.499, the overall range would be updated to 0.0 to 0.499.
5. The final range, which can be represented as a single fractional value, is the
encoded version of the original data.
6. To decode the data, the process is reversed: the encoded value is used to
reconstruct the original range, and the original data is obtained by decoding the
symbols within the range.
Arithmetic coding is a lossless compression technique because the original data can be
exactly reconstructed from the compressed data. This means that there is no loss of information or quality when the data is decompressed.
Overall, arithmetic coding is a powerful data compression technique that can achieve
very high levels of compression, particularly for data with a skewed distribution of
symbol probabilities. However, it can be more complex to implement than other data compression techniques and can require more computational resources.
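The decoding process described in step 6 can be sketched as follows, assuming the decoder knows the symbol ranges and the message length (in practice an end-of-message symbol is often used instead); the range table is the A-E example used earlier in this module.

    RANGES = {"A": (0.0, 0.1), "B": (0.1, 0.3), "C": (0.3, 0.6),
              "D": (0.6, 0.7), "E": (0.7, 1.0)}

    def arithmetic_decode(value: float, length: int) -> str:
        out = []
        for _ in range(length):
            for sym, (s_low, s_high) in RANGES.items():
                if s_low <= value < s_high:  # find the range containing value
                    out.append(sym)
                    # Rescale value to [0, 1) within that range and repeat.
                    value = (value - s_low) / (s_high - s_low)
                    break
        return "".join(out)

    print(arithmetic_decode(0.013743, 9))  # -> "ABBCCCDEE"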
Run length encoding is a lossless data compression technique that encodes data by
representing a sequence of repeated values as a single value and a count of the number of times the value appears.
Example 1:
"WWWWWWBBBBRRR"
Using run length encoding, this string can be compressed to the following:
"6W3B4R"
In this example, the letter "W" appears 6 times in a row, so it is represented as "6W". The
letter "B" appears 3 times in a row, so it is represented as "3B". The letter "R" appears 4
Overall, this results in a compression ratio of 1:4, as the original string has a length of 14
Example 2:
"AAAAABBBBCCCCDDDDEEEE"
Using run length encoding, this string can be compressed to the following:
"5A4B5C4D4E"
In this example, the letter "A" appears 5 times in a row, so it is represented as "5A". The
letter "B" appears 4 times in a row, so it is represented as "4B". The letter "C" appears 5
times in a row, so it is represented as "5C". The letter "D" appears 4 times in a row, so it is
represented as "4D". The letter "E" appears 4 times in a row, so it is represented as "4E".
Overall, this results in a compression ratio of 1:9, as the original string has a length of 20
Run length encoding is most effective on data that contains long sequences of repeated
values, as it can significantly reduce the size of the data. It is less effective on data that
does not have many repeated values, as the compressed data may not be much smaller than the original data.
Run length encoding is a lossless compression technique because the original data can be exactly reconstructed from the compressed data. This means that there is no loss of information or quality when the data is decompressed.
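Decoding simply expands each count-and-value pair back into a run; here is a minimal sketch for the format used in the two examples above.

    import re

    def rle_decode(encoded: str) -> str:
        # Each token is a run length followed by a single character.
        return "".join(ch * int(n) for n, ch in re.findall(r"(\d+)(\D)", encoded))

    print(rle_decode("6W3B4R"))      # -> "WWWWWWBBBRRRR"
    print(rle_decode("5A4B4C4D4E"))  # -> "AAAAABBBBCCCCDDDDEEEE"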
Bit plane slicing is a technique used in image processing and data compression that separates the bits of each pixel value into a set of binary images, or bit planes.
An image can be thought of as a matrix of pixel values, where each pixel value is a digital
representation of the color and intensity of the pixel. The pixel values are typically
represented as a fixed number of bits, such as 8 bits (for a 256-color image) or 24 bits (for a true-color image).
Bit plane slicing involves separating the individual bits that make up each pixel value into
separate planes or layers. For example, in an 8-bit image, the first bit plane would
contain the least significant bit (LSB) of each pixel value, the second bit plane would
contain the second least significant bit (LSB-1) of each pixel value, and so on up to the
eighth bit plane, which would contain the most significant bit (MSB) of each pixel value. Bit plane slicing has several applications:
1. Image compression: Bit plane slicing can be used to identify and encode the
most important bits of an image (i.e., the bits that contribute the most to the
overall visual quality of the image), while discarding the less important bits or encoding them more coarsely. This can help to reduce the file size of the image while
maintaining as much visual quality as possible.
2. Image enhancement: Bit plane slicing can be used to manipulate the individual
bits of an image to improve its visual quality. For example, the LSBs of an image
can be modified to smooth out noise or to adjust the contrast of the image.
3. Image representation: Bit plane slicing can be used to represent an image in a
compact and efficient way. For example, the LSBs of an image can be used to
encode a rough approximation of the image, while the MSBs can be used to
encode the fine details of the image. This can be useful for applications such as
image transmission, where the available bandwidth is limited.
Overall, bit plane slicing is a useful technique for manipulating and analyzing the
individual bits that make up an image, and can be an effective tool in image processing and compression applications.
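A minimal sketch of bit plane slicing for an 8-bit image, assuming NumPy is available and using a small random array in place of a real image:

    import numpy as np

    image = np.random.randint(0, 256, (4, 4), dtype=np.uint8)

    # Plane k holds bit k of every pixel: plane 0 is the LSB, plane 7 the MSB.
    planes = [(image >> k) & 1 for k in range(8)]

    # Summing the weighted planes reconstructs the image exactly.
    reconstructed = sum(p.astype(np.uint8) << k for k, p in enumerate(planes))
    assert np.array_equal(image, reconstructed)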
There are several techniques that can be used for image compression, including:
JPEG (Joint Photographic Experts Group) is a widely used image compression standard
that is designed to reduce the file size of digital images while maintaining as much visual quality as possible. To compress an image, it is first divided into 8 x 8 blocks of pixels (macroblocks); each block is transformed to the frequency domain with a discrete cosine transform (DCT), the DCT coefficients are quantized, and the quantized coefficients are entropy coded using Huffman coding.
To decompress the image, the process is reversed: the compressed data is decoded
using the Huffman coding, the quantized DCT coefficients are dequantized, the DCT
coefficients are transformed back to the spatial domain using an inverse discrete cosine
transform (IDCT), and the macroblocks are combined to form the decompressed image.
Overall, the use of the DCT and quantization in the JPEG compression process allows for
a high level of data reduction while maintaining relatively good image quality, making it a popular choice for compressing photographic images.
Huffman coding is a technique for generating variable length codes for data
compression. It works by assigning shorter codes to symbols that occur more
frequently in the data, and longer codes to symbols that occur less frequently.
"AAABBCCCCCDDDDDD"
Symbol Probability
A 0.2
B 0.2
C 0.4
D 0.2
Next, the Huffman tree is built by repeatedly merging the two least probable nodes: B (0.125) and A (0.1875) are merged into a node with probability 0.3125; this node and C (0.3125) are merged into a node with probability 0.625; finally, this node and D (0.375) are merged into the root, which has probability 1.0.
Finally, the symbols are assigned codes based on the path from the root of the tree to each leaf, assigning 0 to a left branch and 1 to a right branch:
Symbol Code
D 0
C 10
A 110
B 111
Using these codes, the string "AAABBCCCCCDDDDDD" is encoded as:
"110 110 110 111 111 10 10 10 10 10 0 0 0 0 0 0"
which is 31 bits in total, compared with 32 bits for a fixed length code that uses 2 bits per symbol.
Overall, Huffman coding is a simple and efficient technique for generating variable length codes, and it is used in many applications, including image and video compression, data transmission, and data storage.
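A compact sketch of the tree construction using a heap (the standard greedy algorithm; within ties the exact codes may differ from the table above, but the code lengths and the average length are the same):

    import heapq
    from collections import Counter

    def huffman_codes(text: str) -> dict:
        counts = Counter(text)
        # Heap entries are (probability, tiebreak, tree); a tree is either
        # a symbol or a (left, right) pair of subtrees.
        heap = [(n / len(text), i, sym) for i, (sym, n) in enumerate(counts.items())]
        heapq.heapify(heap)
        tiebreak = len(heap)
        while len(heap) > 1:
            p1, _, t1 = heapq.heappop(heap)  # merge the two least probable
            p2, _, t2 = heapq.heappop(heap)  # nodes into one subtree
            heapq.heappush(heap, (p1 + p2, tiebreak, (t1, t2)))
            tiebreak += 1
        codes = {}
        def walk(tree, prefix=""):
            if isinstance(tree, tuple):
                walk(tree[0], prefix + "0")  # left branch: append 0
                walk(tree[1], prefix + "1")  # right branch: append 1
            else:
                codes[tree] = prefix or "0"
        walk(heap[0][2])
        return codes

    codes = huffman_codes("AAABBCCCCCDDDDDD")
    encoded = "".join(codes[s] for s in "AAABBCCCCCDDDDDD")
    print(codes, len(encoded))  # 31 bits, versus 32 for a fixed-length code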
"AAABBCCCCCDDDDDD"
The first step in arithmetic coding is to assign probabilities to each symbol based on
their frequency of occurrence. For this string, the probabilities would be:
Symbol Probability
A 3/16 = 0.1875
B 2/16 = 0.125
C 5/16 = 0.3125
D 6/16 = 0.375
Next, the range of values for each symbol is determined based on the probabilities. If the overall range of possible values is 0.0 to 1.0, the range of values for symbol "A" would be [0, 0.1875), the range for symbol "B" would be [0.1875, 0.3125), the range for symbol "C" would be [0.3125, 0.625), and the range for symbol "D" would be [0.625, 1.0). The encoding then proceeds as follows:
1. Initialize the range to the full range of possible values (e.g., 0.0 to 1.0).
2. Encode the first symbol in the original data by updating the range based on the symbol's range of values. The first symbol is "A", so the overall range is narrowed to [0, 0.1875).
3. Encode the next symbol in the original data by further narrowing the range to the corresponding subinterval of the current range. The next symbol is again "A", so the range [0, 0.1875) is narrowed to its "A" subinterval: [0 + 0.1875 × 0, 0 + 0.1875 × 0.1875) = [0, 0.0352).
4. Repeat the process for each remaining symbol in the original data. The final range identifies the encoded string, and any single value within it can be used as the code.
14) What is LZW coding and why is it needed for image compression? Relate with an
example.
LZW (Lempel-Ziv-Welch) is a lossless data compression technique that works by building a dictionary of strings (i.e., sequences of symbols) that occur in the data, and replacing each occurrence of a string in the data with a code that represents the string in the dictionary.
The basic idea behind LZW is to take advantage of the fact that many strings in the data are likely to be repeated, and to replace the repeated strings with a single code, which takes less space than the string itself.
The first step in LZW is to initialize the dictionary with the individual symbols in the data. For example, for the string "ABABABA", the dictionary would be initialized with "A" = 1 and "B" = 2.
Next, the encoder repeatedly reads the longest string that is already in the dictionary, outputs its code, and adds that string plus the following symbol to the dictionary as a new entry:
1. "A" is in the dictionary: output 1, and add "AB" = 3 to the dictionary.
2. "B" is in the dictionary: output 2, and add "BA" = 4.
3. "AB" is now in the dictionary: output 3, and add "ABA" = 5.
4. "ABA" is now in the dictionary: output 5.
The resulting encoded string is:
"1 2 3 5"
Seven symbols have been reduced to four codes, and the savings grow as longer repeated strings accumulate in the dictionary.
Overall, LZW is an effective data compression technique that can significantly reduce
the size of the data by replacing repeated strings with compact codes. It is widely used in applications such as GIF image compression and the Unix compress utility.
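A minimal sketch of the LZW encoder in Python; dictionary codes start at 1 to match the numbering used in the example above.

    def lzw_encode(data: str) -> list:
        dictionary = {sym: i + 1 for i, sym in enumerate(sorted(set(data)))}
        current, output = "", []
        for ch in data:
            if current + ch in dictionary:
                current += ch  # keep extending the longest known string
            else:
                output.append(dictionary[current])  # emit code for the match
                dictionary[current + ch] = len(dictionary) + 1  # learn new string
                current = ch
        output.append(dictionary[current])  # emit the final match
        return output

    print(lzw_encode("ABABABA"))  # -> [1, 2, 3, 5]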
JPEG (Joint Photographic Experts Group) is a widely used image compression standard
that is designed to reduce the file size of digital images while maintaining as much visual quality as possible. To compress an image, it is first divided into 8 x 8 blocks of pixels (macroblocks); each block is transformed to the frequency domain with a discrete cosine transform (DCT), the DCT coefficients are quantized, and the quantized coefficients are entropy coded using Huffman coding.
To decompress the image, the process is reversed: the compressed data is decoded
using the Huffman coding, the quantized DCT coefficients are dequantized, the DCT
coefficients are transformed back to the spatial domain using an inverse discrete cosine
transform (IDCT), and the macroblocks are combined to form the decompressed image.
Overall, the use of the DCT and quantization in the JPEG compression process allows for
a high level of data reduction while maintaining relatively good image quality, making it a popular choice for compressing photographic images.
Compression techniques work by removing different types of redundancy from the data:
1. Spatial redundancy refers to the correlation between neighboring pixels in an image. Lossless compression techniques, such as run length encoding, can be used to remove spatial redundancy.
For example, consider an image that consists of a large area of solid color. Run length encoding could be used to represent the image by encoding the length of the solid color region and the color value, rather than encoding the individual pixels. This would result in a much smaller file size for the image.
2. Statistical redundancy refers to the presence of patterns in the data that can be
predicted based on the statistical properties of the data. Lossy compression
techniques, such as JPEG image compression, can be used to remove statistical
redundancy by approximating the original data with a simpler representation that
captures the most important features of the data.
For example, consider an image that contains smooth gradations of color. JPEG image
compression could be used to approximate the gradations with a set of discrete colors,
resulting in a smaller file size for the image while maintaining most of the visual quality.
3. Temporal redundancy refers to the similarity between successive frames of a video. For example, consider a video of a person talking. Interframe compression could be used to encode the video by representing the changes in the person's facial features and mouth movements from frame to frame, rather than encoding each frame as a separate image. This would result in a much smaller file size for the video.
A source encoder is a device or system that converts a data source, such as an image, video, or audio signal, into a form that can be more efficiently stored or transmitted. A source
decoder is a device or system that converts the encoded data back into its original form.
Overall, source encoders and decoders are essential components of many digital
systems that transmit or store data, such as video streaming platforms, video
conferencing systems, and digital video recording devices. They enable efficient storage
and transmission of data by reducing the amount of data required to represent the
original source.
Error-free compression and lossy compression are two types of data compression techniques that are used to reduce the size of a data file or signal.
Error-free compression is a type of data compression that allows the original data to be exactly reconstructed from the compressed data. This means that there is no loss of information or quality when the data is decompressed.
Lossy compression, on the other hand, is a type of data compression that involves some loss of information or quality. Lossy compression techniques are designed to remove certain types of redundancy in the data, such as statistical or perceptual redundancy, in order to achieve a higher level of data reduction. However, this also means that the decompressed data is not an exact copy of the original data. Examples of lossy compression techniques include JPEG image compression and MP3 audio compression.
Overall, error-free compression and lossy compression are useful in different situations. Error-free compression is preferred for data that needs to be preserved exactly, such as financial records or scientific data, while lossy compression is more suitable for data where some loss of quality is acceptable, such as images, audio, and video.
JPEG 2000 is a still image compression standard that was developed by the Joint
Photographic Experts Group (JPEG) in the early 2000s. It is an improvement over the
original JPEG standard and is designed to provide a higher level of image quality and compression efficiency. However, JPEG 2000 also has some drawbacks:
● Complexity: JPEG 2000 uses a more complex image compression algorithm than
the original JPEG standard, which can make it more difficult to implement and
slower to encode and decode.
● Compatibility issues: JPEG 2000 is not as widely supported as the original JPEG
standard, which can make it more difficult to use in some applications.
● Larger file sizes: In some cases, JPEG 2000 may result in larger file sizes than the
original JPEG standard, particularly for small or simple images.
Overall, JPEG 2000 is a useful image compression standard that offers improved image
quality and higher compression efficiency compared to the original JPEG standard.
However, it may not be the best choice in all situations due to its complexity and
compatibility issues.
A transform coding system is a data compression system that converts a signal from one domain (e.g., the spatial domain of an image) to another domain
(e.g., the frequency domain) and applies a compression algorithm to the transformed
signal in order to reduce its size. Transform coding systems are commonly used in image and video compression. The encoder and decoder can be represented as follows:
[Input image] ---> [Transform] ---> [Quantization] ---> [Entropy encoding] ---> [Compressed image]
[Compressed image] ---> [Entropy decoding] ---> [Inverse quantization] ---> [Inverse transform] ---> [Output image]
The input image is first transformed from the spatial domain to the frequency
domain using a transform such as the discrete cosine transform (DCT) or the discrete wavelet transform (DWT). The transformed coefficients are then quantized, which involves reducing the precision of the transformed coefficients in order to reduce the data size. The quantized coefficients are then entropy encoded, which involves applying a lossless compression algorithm such as Huffman coding to the data. The resulting bitstream forms the compressed image.
To decompress the image, the process is reversed: the compressed image is entropy decoded, the coefficients are dequantized, the data is transformed back to the spatial domain using an inverse transform, and the output
image is produced.
Overall, transform coding systems are an effective way to compress images and
other signals by exploiting redundancy in the data and representing the data more
efficiently.
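The forward and inverse paths of the diagram can be sketched for a single 8 x 8 block, assuming NumPy and SciPy are available; a single uniform quantizer step size stands in for the standardized quantization matrix that JPEG actually uses.

    import numpy as np
    from scipy.fftpack import dct, idct

    def dct2(block):
        # 2-D DCT: apply the 1-D DCT along rows, then along columns.
        return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

    def idct2(coeffs):
        return idct(idct(coeffs, axis=0, norm="ortho"), axis=1, norm="ortho")

    block = np.random.randint(0, 256, (8, 8)).astype(float)  # one image block
    step = 16.0  # quantizer step size: larger -> more compression, more loss

    coeffs = dct2(block - 128)           # transform (level shifted, as in JPEG)
    quantized = np.round(coeffs / step)  # quantization: the lossy step
    restored = idct2(quantized * step) + 128  # dequantize + inverse transform

    print(np.abs(block - restored).max())  # small, but nonzero: lossy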
Part c
Image compression is the process of reducing the amount of data needed to represent a digital image, that is, representing the image with fewer bits of information. Image compression is useful because digital images can be very large in size, making them difficult to store or transmit efficiently. By compressing an image, we can reduce the amount of data required to store or transmit the image, which can save time and resources.
There are two main types of image compression: lossless and lossy. Lossless image
compression techniques allow the original image to be exactly reconstructed from the
compressed data, while lossy techniques involve some loss of information or quality
when the image is decompressed. Lossless techniques are generally used for images
that need to be preserved exactly, such as medical or scientific images, while lossy
techniques are more suitable for images where some loss of quality is acceptable, such as photographs on the web.
There are many different image compression techniques and standards, including JPEG,
JPEG 2000, PNG, GIF, and BMP. Each technique or standard has its own strengths and
limitations, and the best choice for a particular application will depend on the requirements of the application and the resources available. Compression of digital data is needed for several reasons:
1. Storage: Digital data, such as images, videos, and audio files, can be very large
in size, making it difficult to store large amounts of data efficiently.
Compression can significantly reduce the size of the data, making it more
practical to store.
2. Transmission: When transmitting data over the internet or other networks, the
amount of data that needs to be transferred can be very large. Compression
can reduce the amount of data that needs to be transmitted, which can help to
reduce the amount of time it takes to transfer the data and the amount of
bandwidth required.
3. Performance: Large data files can take up a lot of memory and processing
power when they are being used or accessed. Compression can help to
reduce the amount of memory and processing power required, which can
improve the overall performance of the system.
4. Cost: Storing and transmitting large amounts of data can be expensive in
terms of hardware, software, and other resources. Compression can help to
reduce the cost of storing and transmitting data by reducing the amount of
resources required.
Overall, compression is a useful tool for reducing the size of digital data and making it easier to store, transmit, and process. It is widely used in many applications, including data storage, data transmission, and image and video processing.
Data redundancy is the presence of duplicate or unnecessary information in a data file or signal. There are several types of redundancy that can occur in data, such as coding redundancy, spatial (interpixel) redundancy, and psychovisual redundancy, and removing them can help to reduce the size of the data and make it more efficient to store and transmit.
Coding redundancy is the presence of more bits than necessary in a coded data file or signal. It can occur when a coding scheme or algorithm is used to represent data in a more compact or efficient form, but the chosen codes are still longer than they need to be. Coding redundancy can be reduced by using a coding scheme or algorithm that removes the unnecessary information from the data. This can help to reduce the size of the coded data and make it more efficient to store and transmit.
For example, consider a data file that consists of a sequence of numbers. A simple coding scheme might represent each number as an 8-bit binary value, resulting in a file size of 8 bits per number. However, if the numbers are all between 0 and 15, it would be more efficient to represent each number as a 4-bit binary value, resulting in a file size of 4 bits per number. This would remove the coding redundancy in the data and halve its size.
Coding redundancy is an important concept in data compression and information theory, as it can be used to reduce the size of data and improve its efficiency.
Run length coding (RLC) is a lossless data compression technique that is used to
encode runs of data with the same value. It is based on the idea that long runs of data
with the same value are more common in many types of data, such as images, and can
be represented more efficiently by encoding the value and the length of the run rather than each individual value. For example, consider the following data:
1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1
Using run length coding, this data can be encoded as:
(1, 3), (2, 3), (3, 5), (1, 5)
Each tuple in the encoded data represents a run of data with a particular value, with the first element of the tuple representing the value and the second element representing the length of the run.
Run length coding is simple and efficient, and it can be used to compress many types of
data, including images, audio, and text. However, it is not as effective at compressing
data with a high level of randomness or complexity, as it relies on the presence of long runs of repeated values.
Compression ratio is a measure of how much a data compression algorithm reduces the size of the data, defined as the ratio of the original size to the compressed size. For example, if a data compression algorithm reduces the size of a data file
from 100 MB to 50 MB, the compression ratio is 100/50 = 2. This means that
the algorithm was able to reduce the size of the data by a factor of 2.
A higher compression ratio indicates that the data compression algorithm was
more effective at reducing the size of the data, while a lower compression
ratio indicates that the data was not as efficiently compressed.
Compression ratio is an important factor to consider when selecting a data
compression algorithm or system, as it can affect the efficiency of storing and
transmitting the data. However, it is not the only factor to consider, as other
factors such as the speed of the algorithm and the quality of the compressed
data may also be important.
A source encoder is a device or system that converts a data source, such as a video
or audio stream, into a compressed form that is more efficient to store or transmit.
The goal of source encoding is to reduce the size of the data by removing redundancy and representing the data more
efficiently.
There are many different source encoding techniques and algorithms that can be
used, depending on the characteristics of the data and the requirements for the compressed data. Examples of source encoding techniques include Huffman coding, arithmetic coding, and transform coding.
Source encoding is often used in combination with other types of data compression and error-control techniques, such as channel encoding.
A channel encoder is a device or system that converts a digital data stream into a form
that is more suitable for transmission over a communication channel, such as a wireless link or a cable. Channel encoding is an important part of digital communication and is used to improve the reliability and efficiency of data transmission.
There are many different channel encoding techniques and algorithms that can be used, depending on the characteristics of the channel and the requirements for the transmitted data. Some examples of channel encoding techniques include error correcting codes, such as Hamming codes and convolutional codes.
Error-free compression is a type of data compression that allows the original data to be
exactly reconstructed from the compressed data. Error-free compression techniques are
generally used for data that needs to be preserved exactly, such as medical or scientific data.
There are several operations that are commonly performed by error-free compression techniques, including run length encoding, variable length (Huffman) coding, arithmetic coding, and dictionary-based (LZW) coding.
Overall, error-free compression techniques are designed to preserve the integrity of the
original data and allow it to be exactly reconstructed from the compressed data. They
are used in applications where even a small error in the data could be significant, such as medical imaging and scientific data analysis.
Variable length coding is a data compression technique that uses codes of different lengths to represent the symbols in the data. Each character is assigned a unique code, and the length of the code is chosen based on the frequency of the character in the data.
For example, consider a data file that consists of the characters "a", "b", and "c". If the
character "a" occurs much more frequently than the characters "b" and "c", it would be
more efficient to assign a shorter code to "a" and longer codes to "b" and "c". This would
result in a smaller data size overall, as the shorter code for "a" would be used more
often.
Variable length coding is used in many data compression techniques, such as Huffman
coding and LZW (Lempel-Ziv-Welch). It is an effective way to reduce the size of data by
assigning shorter codes to the most frequent symbols and longer codes to the less
frequent symbols.
Overall, variable length coding is a useful technique for reducing the size of data and improving the efficiency of storage and transmission.
13) What is JPEG?
JPEG (Joint Photographic Experts Group) is a popular image compression standard that
is used to reduce the size of digital image files while maintaining a reasonable level of image quality. It is particularly well-suited for photographs and other continuous-tone images, and it is widely used on the web, in digital cameras, and in other digital imaging applications. JPEG is a lossy format, which means that some of the information in the original image is lost during the compression process. The amount of information that is lost can be controlled by adjusting the compression level, with higher levels of compression resulting in smaller file sizes but lower image quality.
The JPEG standard defines a number of different image formats, including JPEG, JPEG
2000, and JPEG XR. Each format has its own strengths and limitations, and the best
choice for a particular application will depend on the requirements for the image and the
resources available.
Overall, JPEG is a widely used image compression standard that is effective at reducing
the size of digital images while maintaining a reasonable level of image quality.
1. Discrete Cosine Transform (DCT): JPEG uses the DCT to transform the pixel values in an image from the spatial domain to the frequency domain, where the frequencies correspond to the spatial patterns in the image. The DCT is used to concentrate most of the image information into a small number of coefficients.
2. Quantization: After the DCT is applied, the resulting coefficients are quantized, or divided into a set of discrete values. The quantization process reduces the precision of the coefficients and discards the least visually significant information; this is the step where loss occurs.
3. Entropy coding: After quantization, the remaining image data is encoded using an entropy coding algorithm, such as Huffman coding, which represents the data in a more efficient form. The entropy coding step is used to remove coding redundancy from the quantized data.
In addition, the image may be preprocessed before compression. This can include resizing the image, adjusting the contrast, or applying filters.
To decompress the image, the encoded data is passed through a decoder, which applies the inverse of the DCT and of the other steps in reverse order.
Overall, these steps are used to compress and decompress digital images while maintaining as much visual quality as possible.
In order for a Huffman code to be uniquely decodable, it must meet the following
conditions:
1. No code is the prefix of any other code. This means that no code can be the
beginning of another code. For example, if the code for symbol "a" is "01" and the
code for symbol "b" is "011", the code for "a" is not uniquely decodable, because it
2. The code for each symbol is unique. This means that each symbol must have a
distinct code, and no two symbols can share the same code.
To determine whether the given Huffman code is uniquely decodable, we would need to
know the complete set of symbols and their corresponding codes. Without this
information, it is not possible to determine whether the code satisfies the above
conditions.
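When the full code table is available, the two conditions can be checked mechanically; here is a small sketch, with hypothetical code tables:

    def is_prefix_code(codes: dict) -> bool:
        words = list(codes.values())
        # Condition 2: every symbol has a distinct code.
        if len(set(words)) != len(words):
            return False
        # Condition 1: no code is the prefix (beginning) of another code.
        return not any(a != b and b.startswith(a) for a in words for b in words)

    print(is_prefix_code({"A": "0", "B": "10", "C": "110", "D": "111"}))  # True
    print(is_prefix_code({"a": "01", "b": "011"}))                        # False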
Several lossless compression techniques can be illustrated with brief examples:
1. Run length coding: This technique encodes runs of data with the same value using a tuple of the form (value, length). For example, the data "1, 1, 1, 2, 2, 2, 3, 3, 3" could be encoded as (1, 3), (2, 3), (3, 3).
2. Huffman coding: This technique assigns variable length codes to symbols based on their frequency. For example, the data "aaabbbccc" could be encoded with the codes "0" for "a", "10" for "b", and "11" for "c".
3. LZW coding: This technique replaces repeated sequences of data with references to earlier occurrences of the same data, using a dictionary of strings that is built up during encoding.
4. Arithmetic coding: In this technique, the data is encoded as a single number between 0 and 1, with the most frequent symbols being represented by larger ranges within the interval and the less frequent symbols by smaller ranges.
Overall, these lossless compression techniques can be used to reduce the size of data without any loss of information.
Data compression is the process of reducing the size of a data file or signal in order to save storage space or reduce the amount of bandwidth needed to transmit the
data. Data compression is typically achieved by removing redundancy from the data,
which is defined as any information that can be derived from other information in the
data.
There are two main types of data compression: lossless and lossy. Lossless
compression techniques preserve the original data exactly and allow it to be exactly reconstructed from the compressed data. Lossy compression techniques, on the other hand, remove some of the information from the original data and do not allow
it to be exactly reconstructed.
Data redundancy is the presence of duplicate or unnecessary information in a data
file or stream. Data redundancy can occur in many forms, such as redundant
symbols, patterns, or structures within the data. Data redundancy can be reduced by
identifying and removing the redundant information, which can improve the efficiency of storing and transmitting the data.
Overall, data compression and data redundancy are important concepts in data
management and are used to improve the efficiency of storing and transmitting data.
A typical lossy compression system consists of preprocessing, transform, quantization, and entropy coding stages. In this system, the input data is first preprocessed to improve the efficiency of the compression process. This can include resizing the data, adjusting the contrast, or applying filters.
Next, the data is transformed using a transform algorithm, such as the Discrete
Cosine Transform (DCT) or the Discrete Wavelet Transform (DWT). The transform
algorithm converts the data from the spatial domain to the frequency domain, where redundancy in the data can be more easily identified and removed.
The transformed data is then quantized, or divided into a set of discrete values, in
order to reduce the precision of the data and remove some of the less significant information.
Finally, the quantized data is encoded using an entropy coding algorithm, such as
Huffman coding or arithmetic coding, which represents the data in a more efficient form.
To decompress the data, the encoded data is passed through a decoder, which reverses the entropy coding, dequantizes the data, and applies the inverse transform to reconstruct an approximation of the original data.
Overall, this lossy compression system reduces the size of the input data by
removing redundant and less significant information and representing the data in a
more efficient form. The quality of the reconstructed data will depend on the level of quantization used, with coarser quantization producing smaller files but lower quality.
Part a
2) A source emits letters from an alphabet with known probabilities. Find a Huffman code for this source, the average length of the code, and its redundancy.
To determine the average length of the code and the redundancy for a source
that emits letters from an alphabet, we would need to know the probability
distribution of the letters in the alphabet. The probability distribution specifies
the probability of each letter occurring in the source.
For example, consider a source that emits letters from the alphabet {a, b, c, d,
e} with the following probability distribution:
Letter Probability
a 0.4
b 0.2
c 0.2
d 0.1
e 0.1
To find the average length of the code for this source, we can use the formula:
L = ∑ (p_i × l_i)
where p_i is the probability of letter i and l_i is the length of its code.
For example, if we use a Huffman code to encode the letters, we might assign the following codes:
Letter Code
a 0
b 10
c 110
d 1110
e 1111
Using these codes, the average length of the code would be:
L = 0.4 × 1 + 0.2 × 2 + 0.2 × 3 + 0.1 × 4 + 0.1 × 4 = 2.2 bits per letter
The redundancy of the code is a measure of how much the code deviates
from the theoretical minimum length, which is known as the entropy of the
source. The entropy is defined as:
Entropy = -∑ (p_i × log2(p_i))
For this source:
H = -(0.4 log2 0.4 + 0.2 log2 0.2 + 0.2 log2 0.2 + 0.1 log2 0.1 + 0.1 log2 0.1) ≈ 2.122 bits per letter
The redundancy of the code is therefore L - H ≈ 2.2 - 2.122 = 0.078 bits per letter.
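The same numbers can be reproduced with a few lines of Python:

    import math

    probs = {"a": 0.4, "b": 0.2, "c": 0.2, "d": 0.1, "e": 0.1}
    lengths = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 4}  # Huffman code lengths

    avg_len = sum(probs[s] * lengths[s] for s in probs)       # L = sum(p * l)
    entropy = -sum(p * math.log2(p) for p in probs.values())  # H = -sum(p log2 p)
    print(avg_len, entropy, avg_len - entropy)  # 2.2, ~2.122, redundancy ~0.078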
3) For the image shown below, compute the compression ratio that can be achieved using Huffman coding.
4) A source emits three symbols A, B, C with probabilities {0.5, 0.25, 0.25} respectively. Construct an arithmetic code to encode the word 'C A B'.
To encode the word "C A B" using arithmetic coding, we can follow these steps:
1. Determine the range of the alphabet: First, we need to determine the range of the
alphabet. Since the alphabet consists of three symbols {A, B, C} with probabilities
{0.5, 0.25, 0.25}, the range of the alphabet can be represented as follows:
A: [0, 0.5)
B: [0.5, 0.75)
C: [0.75, 1.0)
2. Initialize the range: Next, we need to initialize the range to the full range of the
alphabet, which is [0, 1.0).
3. Encode the first symbol: To encode the first symbol "C", we narrow the range to the range of symbol "C", which is [0.75, 1.0).
4. Encode the second symbol: To encode the second symbol "A", we narrow the range to the subinterval of [0.75, 1.0) corresponding to "A" ([0, 0.5)): [0.75 + 0.25 × 0, 0.75 + 0.25 × 0.5) = [0.75, 0.875).
5. Encode the third symbol: To encode the third symbol "B", we narrow the range to the subinterval of [0.75, 0.875) corresponding to "B" ([0.5, 0.75)): [0.75 + 0.125 × 0.5, 0.75 + 0.125 × 0.75) = [0.8125, 0.84375).
6. Output the encoded value: Finally, any value in the final range [0.8125, 0.84375) can be used as the code; a convenient choice is the midpoint, 0.828125.
Overall, the arithmetic code for the word "C A B" is 0.828125. This single value identifies the nested sequence of ranges, and hence the original word, given the probabilities of the symbols in the alphabet.
5) Encode the word a1, a2, a3, a4 using arithmetic coding and find the tag for the given probabilities: a1 = 0.2, a2 = 0.2, a3 = 0.4, a4 = 0.2.
To encode the word "a1, a2, a3, a4" using arithmetic coding, we can follow these
steps:
1. Determine the range of the alphabet: First, we need to determine the range of
the alphabet. Since the alphabet consists of four symbols {a1, a2, a3, a4} with
probabilities {0.2, 0.2, 0.4, 0.2}, the range of the alphabet can be represented
as follows:
a1: [0, 0.2)
a2: [0.2, 0.4)
a3: [0.4, 0.8)
a4: [0.8, 1.0)
2. Initialize the range: Next, we need to initialize the range to the full range of the
alphabet, which is [0, 1.0).
3. Encode the first symbol: To encode the first symbol "a1", we narrow the range to the range of symbol "a1", which is [0, 0.2).
4. Encode the second symbol: To encode the second symbol "a2", we narrow the range to the subinterval of [0, 0.2) corresponding to "a2" ([0.2, 0.4)): [0 + 0.2 × 0.2, 0 + 0.2 × 0.4) = [0.04, 0.08).
5. Encode the third symbol: To encode the third symbol "a3", we narrow the range to the subinterval of [0.04, 0.08) corresponding to "a3" ([0.4, 0.8)): [0.04 + 0.04 × 0.4, 0.04 + 0.04 × 0.8) = [0.056, 0.072).
6. Encode the fourth symbol: To encode the fourth symbol "a4", we narrow the range to the subinterval of [0.056, 0.072) corresponding to "a4" ([0.8, 1.0)): [0.056 + 0.016 × 0.8, 0.056 + 0.016 × 1.0) = [0.0688, 0.072).
7. Output the encoded value: Finally, the tag is the midpoint of the final range [0.0688, 0.072), which is 0.0704. This is the arithmetic code for the word "a1, a2, a3, a4".
Overall, the arithmetic tag for the word "a1, a2, a3, a4" is 0.0704. This code represents a single value that lies inside the nested ranges of all four symbols, so the decoder can recover the original word.
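Both worked answers can be checked with a short sketch that nests the ranges and returns the midpoint of the final interval as the tag:

    def arithmetic_tag(message, ranges):
        low, high = 0.0, 1.0
        for sym in message:
            s_low, s_high = ranges[sym]
            width = high - low
            # Narrow [low, high) to the subinterval assigned to sym.
            low, high = low + width * s_low, low + width * s_high
        return (low + high) / 2  # midpoint of the final range

    print(arithmetic_tag("CAB", {"A": (0, 0.5), "B": (0.5, 0.75), "C": (0.75, 1.0)}))
    # -> 0.828125
    print(arithmetic_tag(["a1", "a2", "a3", "a4"],
                         {"a1": (0, 0.2), "a2": (0.2, 0.4),
                          "a3": (0.4, 0.8), "a4": (0.8, 1.0)}))
    # -> ~0.0704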
To perform the Huffman algorithm for the given intensity distribution of a 64 x 64 image, we can follow these steps:
1. Calculate the probability of each intensity level: First, we need to calculate the
probability of each intensity level by dividing the frequency of each level by the
total number of pixels in the image. The total number of pixels in a 64 x 64 image
is 64 * 64 = 4096. Therefore, the probability of each intensity level is its frequency divided by 4096.
2. Construct the Huffman tree: Next, we need to construct the Huffman tree by
combining the intensity levels with the lowest probabilities until there is only one
node left. The resulting tree will have a set of branches, with each branch
representing a unique intensity level. The length of each branch will be equal to
the depth of the node in the tree, with the root node at depth 0.
3. Assign codes to each intensity level: Finally, we need to assign a code to each
intensity level by traversing the tree and assigning a 0 to a left branch and a 1 to a
right branch. The code for each intensity level will be the concatenation of the
branches from the root to the node representing the intensity level.
For example, the Huffman code for intensity level r3 would be "11", since it is located on
the right branch of the root node and the right branch of the next node. The Huffman
code for intensity level r5 would be "000", since it is located on the left branch of the root node and the left branches of the next two nodes.
The coding efficiency of the Huffman code can be calculated by dividing the entropy of the source, which is the theoretical minimum average code length, by the average length of the code.
For the given intensity distribution, the coding efficiency of the Huffman code can be
compared with the coding efficiency of a uniform length code, which assigns the same
number of bits to each intensity level. The coding efficiency of the uniform length code can be calculated by dividing the entropy of the source by the fixed number of bits per intensity level.
Overall, the Huffman algorithm is a technique for constructing a prefix code for a set of
symbols with probabilities, with the goal of minimizing the average length of the code.
By constructing a code with shorter average length, the Huffman code can achieve a
higher coding efficiency than a uniform length code, especially if the probabilities of the intensity levels are unevenly distributed.
Intensity transformation is a technique that modifies the pixel values in an image to achieve a desired effect. It is often used to adjust the contrast and brightness of an image, as well as to enhance the detail and clarity of the image.
8) Obtain Huffman coding for the source symbols S = {S0, S1, S2, S3, S4} and the corresponding probabilities P = {0.4, 0.2, 0.2, 0.1, 0.1}
To obtain the Huffman coding for the source symbols S={S0, S1, S2, S3, S4} and the
corresponding probabilities P={0.4, 0.2, 0.2, 0.1, 0.1}, we can follow these steps:
1. Construct the Huffman tree: First, we need to construct the Huffman tree by
combining the symbols with the lowest probabilities until there is only one node
left. The resulting tree will have a set of branches, with each branch representing
a unique symbol. The length of each branch will be equal to the depth of the node
in the tree, with the root node at depth 0.
2. Assign codes to each symbol: Next, we need to assign a code to each symbol by
traversing the tree and assigning a 0 to a left branch and a 1 to a right branch.
The code for each symbol will be the concatenation of the branches from the root
to the node representing the symbol.
For the given probabilities, the two least probable symbols, S3 and S4, are merged first and therefore receive the longest codes. One valid assignment of codes is:
Symbol Probability Code
S0 0.4 0
S1 0.2 10
S2 0.2 110
S3 0.1 1110
S4 0.1 1111
The average code length is 0.4 × 1 + 0.2 × 2 + 0.2 × 3 + 0.1 × 4 + 0.1 × 4 = 2.2 bits per symbol.
Overall, the Huffman coding is a technique for constructing a prefix code for a set of
symbols with probabilities, with the goal of minimizing the average length of the code.
By constructing a code with shorter average length, the Huffman code can achieve a
higher coding efficiency than a uniform length code, especially if the probabilities of the symbols are unevenly distributed.
Histogram equalization is a technique that redistributes the pixel values of an image so that they span the full intensity range of the image. It is often used to enhance the detail in images that are over- or under-exposed. To perform histogram equalization, we can follow these steps:
1. Calculate the histogram: First, we need to calculate the histogram of the image,
which is a plot of the number of pixels at each intensity level. The histogram can
be used to visualize the distribution of pixel values in the image.
2. Normalize the histogram: Next, we need to normalize the histogram by dividing
the number of pixels at each intensity level by the total number of pixels in the
image. This step converts the histogram into a probability distribution, with the
probability of each intensity level representing the fraction of pixels at that
intensity level.
3. Accumulate the normalized histogram: Next, we need to accumulate the
normalized histogram by adding up the probabilities of each intensity level. This
step converts the probability distribution into a cumulative distribution function
(CDF), with the CDF representing the accumulated probability of each intensity
level.
4. Map the CDF to the full intensity range: Finally, we need to map the CDF to the full
intensity range of the image by scaling the CDF to the desired intensity range.
This step redistributes the pixel values more uniformly across the intensity scale, resulting in an increase in contrast and better use of the full intensity range of the image.
Overall, histogram equalization is a simple but effective technique for enhancing the contrast and detail of an image.
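The four steps map directly onto a few NumPy operations; here is a minimal sketch for an 8-bit image, using a random low-intensity array in place of a real image:

    import numpy as np

    image = np.random.randint(0, 64, (32, 32), dtype=np.uint8)  # a dark image

    hist = np.bincount(image.ravel(), minlength=256)  # 1. histogram
    pdf = hist / image.size                           # 2. normalize
    cdf = np.cumsum(pdf)                              # 3. accumulate
    mapping = np.round(255 * cdf).astype(np.uint8)    # 4. scale CDF to [0, 255]

    equalized = mapping[image]  # apply the mapping to every pixel
    print(image.max(), equalized.max())  # the output now spans the full range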
10) Why is JPEG better than a Raw file? Summarize the merits and demerits.
JPEG is a compressed image format that offers a good balance between image quality and storage efficiency. It is often used for storing and transmitting images on the internet, as well as in digital cameras.
There are several reasons why JPEG is often considered to be better than a Raw image format:
1. File size: JPEG files are much smaller than Raw files, making them easier to store, share, and transmit.
2. Compatibility: JPEG is supported by virtually all image viewers, editors, and web browsers, while Raw files typically require special software.
However, there are also some drawbacks to using JPEG compared to Raw:
1. Quality: JPEG images are lossy, which means that they suffer from some loss
of quality compared to Raw images. This can be especially noticeable in
images with a lot of detail or high contrast, as these details may be lost or
distorted during the compression process.
2. Dynamic range: JPEG images typically have a lower dynamic range compared
to Raw images, which means that they may not be able to capture as much
detail in the highlights and shadows. This can be a problem in high-contrast
scenes or when shooting in difficult lighting conditions.
Overall, JPEG is a popular and efficient image format that is well-suited for a wide
range of applications. However, it is not always the best choice for all situations, and
users may prefer to use Raw images in some cases in order to achieve the highest possible image quality and editing flexibility.