Digital Image Processing Guide
1. Image Acquisition
This is the first fundamental step in digital image processing. Image acquisition
could be as simple as being given an image that is already in digital form.
Generally, the image acquisition stage also involves pre-processing, such as scaling.
2. Image Enhancement
Image enhancement is among the simplest and most appealing areas of digital
image processing. Basically, the idea behind enhancement techniques is to bring
out detail that is obscured, or simply to highlight certain features of interest in
an image, for example by adjusting brightness and contrast.
3. Image Restoration
Image restoration is an area that also deals with improving the appearance of an
image. However, unlike enhancement, which is subjective, image restoration is
objective, in the sense that restoration techniques tend to be based on
mathematical or probabilistic models of image degradation.
6. Compression
Compression deals with techniques for reducing the storage required to save an
image or the bandwidth to transmit it. Compression is particularly important when
transmitting data over the internet.
7. Morphological Processing
Morphological processing deals with tools for extracting image components that
are useful in the representation and description of shape.
8. Segmentation
1. Structure of Eye
2. Image Formation in the Eye
3. Brightness Adaptation and Discrimination
Structure of Eye:
The human eye is a slightly asymmetrical sphere with an average diameter of about
20 mm to 25 mm and a volume of about 6.5 cc. The eye works much like a camera:
an external object is imaged in the same way that a camera takes a picture of an
object. Light enters the eye through a small opening called the pupil, a black-looking
aperture that contracts when the eye is exposed to bright light, and is focused on
the retina, which acts like camera film.
The lens, iris, and cornea are nourished by a clear fluid, the aqueous humour, which
fills the anterior chamber. The fluid flows from the ciliary body to the pupil and is
absorbed through the channels in the angle of the anterior chamber. The delicate
balance of aqueous production and absorption controls pressure within the eye.
The eye contains between 6 and 7 million cones, which are highly sensitive to
colour. Humans perceive coloured images in daylight because of these cones; cone
vision is also called photopic or bright-light vision.
Rods are far more numerous, between 75 and 150 million, and are distributed
over the retinal surface. Rods are not involved in colour vision and are
sensitive to low levels of illumination.
Image Formation in the Eye:
An image is formed when the lens of the eye focuses an image of the outside world
onto a light-sensitive membrane at the back of the eye, called the retina. The lens
focuses light on the photoreceptive cells of the retina, which detect the photons of
light and respond by producing neural impulses.
The distance between the lens and the retina is about 17mm and the focal length
is approximately 14mm to 17mm.
Brightness Adaptation and Discrimination:
Digital images are displayed as a discrete set of intensities. The eye's ability to
discriminate black and white at different intensity levels is an important
consideration in presenting image processing results.
The range of light intensity levels to which the human visual system can adapt
is of the order of 10^10, from the scotopic threshold to the glare limit. In
photopic vision alone, the range is about 10^6.
Pixels:
Pixel is the smallest unit of a digital graphic which can be illuminated on a
display screen and a set of such illuminated pixels form an image on screen. A
pixel is usually represented as a square or a dot on any display screen like a
mobile, TV, or computer monitor. They can be called the building blocks of a
digital image and can be controlled to accurately show the desired picture.
The quality of a picture, in terms of clarity, size and colour combination, is
largely controlled by the number and density of pixels in the display. The higher
the resolution, the smaller the pixel size and the better the clarity, and vice
versa.
Each pixel has a unique geometric co-ordinate, dimensions (length and breadth),
a size (eight bits or more), and the ability to display a multitude of colours.
Voxels:
Voxels are fairly complicated to understand but can be defined in the easiest of
language as a Volumetric Pixel. In 3D printing, we can define a voxel as a value
on a grid in a three-dimensional space, like a pixel with volume. Each voxel
contains certain volumetric information which helps to create a three
dimensional object with required properties.
One important aspect of every voxel is its repeatability: voxels have a
defined shape and size and can be stacked over each other to create a 3D object.
Black is the result of combining the three CMY colors at their highest
intensities.
Pixel Data
This is the section where the numerical values of the pixels are stored.
According to the data type, pixel data are stored as integers or
floating-point numbers, using the minimum number of bytes required
to represent the values.
DICOM
The Dicom standard was established by the American College of
Radiology and the National Electrical Manufacturers Association. Today,
the Dicom standard is the backbone of every medical imaging
department. The added value of its adoption in terms of access,
exchange, and usability of diagnostic medical images is, in general,
huge. Dicom is not only a file format but also a network
communication protocol.
The innovation of Dicom as a file format has been to establish that the
pixel data cannot be separated from the description of the medical
procedure which led to the formation of the image itself. In other
words, the standard stressed the concept that an image that is
separate from its metadata becomes “meaningless” as medical image.
Metadata and pixel data are merged in a unique file, and the Dicom
header, in addition to the information about the image matrix,
contains the most complete description of the entire procedure used
to generate the image ever conceived in terms of acquisition protocol
and scanning parameters. The header also contains patient
information such as name, gender, age, weight, and height. For these
reasons, the Dicom header is modality-dependent and varies in size. In
practice, the header allows the image to be self-descriptive. In order to
easily understand the power of this approach, just think of the
software which Siemens first introduced for its MRI systems to
replicate an acquisition protocol. The software, known as “Phoenix”, is
able to extract from a Dicom image series dragged into the acquisition
window the protocol and to replicate it for a new acquisition. There
are similar tools for all the major manufacturers.
Regarding the pixel data, Dicom can only store pixel values as integers.
Dicom cannot currently save pixel data in floating-point, although it
supports various data types, including floats, to store metadata.
Whenever the values stored in each voxel have to be scaled to different
units, Dicom makes use of a scale factor, using two fields in the
header that define the slope and the intercept of the linear
transformation to be used to convert stored pixel values to real-world values.
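As an illustration of how such a slope and intercept are applied, here is a minimal MATLAB sketch that converts stored pixel values to real-world units (for CT, Hounsfield units). The numeric values and variable names are illustrative assumptions, not taken from the source; in Dicom headers the fields are commonly called RescaleSlope and RescaleIntercept.

% Minimal sketch: apply the Dicom rescale slope/intercept to stored pixel values.
stored = int16([ -24 105 980; 40 65 2000 ]);    % hypothetical raw stored pixel values
slope = 1.0;                                    % e.g. header field RescaleSlope
intercept = -1024;                              % e.g. header field RescaleIntercept
realWorld = double(stored) * slope + intercept; % values in real-world units (e.g. HU)
disp(realWorld)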
NIFTI
The Nifti format provides two ways to store the orientation of the
image volume in space. The first, comprising a rotation plus a
translation, is used to map voxel coordinates to the scanner frame
of reference; this "rigid body" transformation is encoded using a
"quaternion". The second method is used to save the 12 parameters of
a more general linear transformation which defines the alignment of
the image volume to a standard or template-based coordinate system.
This spatial normalization task is common in brain functional image
analysis.
INTERFILE
Interfile is a file format that was developed for the exchange of nuclear medicine image data.
An Interfile data set consists of two files:
● Header file — Provides information about dimensions, identification and processing
history. You use the interfileinfo function to read the header information. The header
file has the .hdr file extension.
● Image file — Image data, whose data type and ordering are described by the header file.
You use interfileread to read the image data into the workspace. The image file has
the .img file extension.
● Sampling
● Quantization
The sampling rate determines the spatial resolution of the digitized image, while the
quantization level determines the number of grey levels in the digitized image. A magnitude
of the sampled image is expressed as a digital value in image processing. The transition
between continuous values of the image function and its digital equivalent is called
quantization.
The number of quantization levels should be high enough for human perception of fine
shading details in the image. The occurrence of false contours is the main problem in
images that have been quantized with an insufficient number of brightness levels.
In this lecture we will talk about two key stages in digital image processing. Sampling and
quantization will be defined properly. Spatial and grey-level resolutions will be introduced
and examples will be provided.
To create a digital image, we need to convert the continuous sensed data into
digital form.
This process includes 2 processes:
1. Sampling: Digitizing the co-ordinate value is called sampling.
2. Quantization: Digitizing the amplitude value is called quantization.
To convert a continuous image f(x, y) into digital form, we have to sample the
function in both co-ordinates and amplitude.
Sampling
Since an analog image is continuous not just in its co-ordinates (x
axis), but also in its amplitude (y axis), the part that deals with
digitizing the co-ordinates is known as sampling. In sampling,
digitization is performed on the independent variable; in the case of
the equation y = sin(x), it is done on the x variable.
When looking at such a signal, we can see random variations caused by
noise. Taking more samples gives a better digital approximation of the
signal: the more samples we take, the better the quality of the image,
and vice versa. However, sampling on the x axis alone does not convert
the signal to digital form; the y axis must also be digitized, which is
known as quantization.
Quantization
Quantization is the counterpart of sampling: it is done on the y axis,
while sampling is done on the x axis. Quantization is the process of
transforming a real-valued sampled image into one taking only a finite
number of distinct values. Under the quantization process, the
amplitude values of the image are digitized. In simple words, when
you quantize an image, you divide the signal into quanta (partitions).
Now let us see how quantization is done. Here we assign levels to the
values generated by the sampling process. In the example described
under sampling, although the samples have been taken, they still span
a continuous range of gray-level values. After quantization, these
vertically ranging values are assigned to a finite set of levels or
partitions, for example 5 levels ranging from 0 (black) to 4 (white).
The number of levels can vary according to the type of image you want.
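A minimal MATLAB sketch of the two steps, using a 1-D signal for clarity; the sampling interval and the number of levels are arbitrary illustrative choices.

% Sampling: evaluate the continuous signal at discrete x positions.
x = 0 : 0.1 : 2*pi;                      % sampling the independent variable
y = sin(x);                              % continuous-valued samples
% Quantization: map each sample amplitude to one of L discrete levels.
L = 5;                                   % number of quantization levels
yMin = min(y);  yMax = max(y);
step = (yMax - yMin) / (L - 1);          % spacing between levels
yQuant = round((y - yMin) / step) * step + yMin;   % quantized amplitudes
plot(x, y, '-', x, yQuant, 'o')          % compare original and quantized samples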
Industry standards define sensitivity in terms of the ISO film speed equivalent, using SNR
thresholds (at average scene luminance) of 40:1 for "excellent" image quality and 10:1 for
"acceptable" image quality.[1]
SNR is sometimes quantified in decibels (dB) of signal power relative to noise power, though in
the imaging field the concept of "power" is sometimes taken to be the power of a voltage signal
proportional to optical power; so a 20 dB SNR may mean either 10:1 or 100:1 optical power,
depending on which definition is in use.
DCT:
The discrete cosine transform (DCT) represents an image as a sum of sinusoids
of varying magnitudes and frequencies. The DCT has the property that, for a
typical image, most of the visually significant information about the image is
concentrated in just a few coefficients of the DCT. For this reason, the DCT is
often used in image compression applications. For example, the DCT is at the
heart of the international standard lossy image compression algorithm known as
JPEG. The two-dimensional DCT of an M-by-N matrix A is defined as follows.
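The definition itself was not reproduced above; the standard 2-D DCT of an M-by-N matrix A (the form used by MATLAB's dct2) is:

\[
B_{pq} = \alpha_p \alpha_q \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} A_{mn}
\cos\frac{\pi(2m+1)p}{2M}\,\cos\frac{\pi(2n+1)q}{2N},
\qquad 0 \le p \le M-1,\; 0 \le q \le N-1,
\]
\[
\alpha_p = \begin{cases}1/\sqrt{M}, & p = 0\\ \sqrt{2/M}, & 1 \le p \le M-1\end{cases}
\qquad
\alpha_q = \begin{cases}1/\sqrt{N}, & q = 0\\ \sqrt{2/N}, & 1 \le q \le N-1.\end{cases}
\]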
The values Bpq are called the DCT coefficients of A. (Note that matrix indices in
MATLAB® always start at 1 rather than 0; therefore, the MATLAB matrix
elements A(1,1) and B(1,1) correspond to the mathematical
quantities A00 and B00, respectively.)
The DCT is an invertible transform, and its inverse is given by
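The inverse was likewise not reproduced; in the same notation it is:

\[
A_{mn} = \sum_{p=0}^{M-1}\sum_{q=0}^{N-1} \alpha_p \alpha_q B_{pq}
\cos\frac{\pi(2m+1)p}{2M}\,\cos\frac{\pi(2n+1)q}{2N},
\qquad 0 \le m \le M-1,\; 0 \le n \le N-1.
\]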
KL Transform:
i. The KL Transform is also known as the Hotelling transform or the Eigenvector transform.
The KL Transform is based on the statistical properties of the image and has several
important properties that make it useful for image processing particularly for image
compression.
ii. The main purpose of image compression is to store the image in fewer bits as compared
to the original image. Data from neighbouring pixels in an image are highly correlated.
iii. More image compression can be achieved by de-correlating this data. The KL transform
does the task of de-correlating the data thus facilitating higher degree of compression.
(I) Find the mean vector and covariance matrix of the given image x
(II) Find the Eigen values and then the eigen vectors of the covariance matrix
(III) Create the transformation matrix T, such that rows of T are eigen vectors
(IV) Find the KL Transform
The mean vector is found as
mx = E{x}    (Equation 1)
where E{x} is the expected value of x; in practice it is estimated by averaging the N
available sample vectors (N = number of vectors, e.g. columns of the data matrix).
The covariance of the vector population is defined as
Covariance(x) = Cx = E[(x − mx)(x − mx)′]
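A minimal MATLAB sketch of steps (I)-(IV), assuming the image data have been arranged as a matrix X whose columns are the sample vectors; the data values and variable names are illustrative only.

% X: d-by-N matrix, each column is one sample vector taken from the image
X = [4 6 5 7; 2 1 3 2; 9 8 7 9];          % small illustrative data
mx = mean(X, 2);                           % (I) mean vector
Xc = X - mx;                               % subtract the mean from every column
Cx = (Xc * Xc') / size(X, 2);              % (I) covariance matrix
[V, D] = eig(Cx);                          % (II) eigenvalues and eigenvectors
[~, order] = sort(diag(D), 'descend');     % order by decreasing eigenvalue
T = V(:, order)';                          % (III) rows of T are the eigenvectors
Y = T * Xc;                                % (IV) KL transform of the data
% Rows of Y are decorrelated; keeping only the first few rows compresses the data.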
14. Arithmetic and Logical Operations on Images (Image Algebra)
These operations are applied on pixel-by-pixel basis. So, to add two
images together, we add the value at pixel (0 , 0) in image 1 to the value at
pixel (0 , 0) in image 2 and store the result in a new image at pixel (0 , 0). Then
we move to the next pixel and repeat the process, continuing until all pixels
have been visited.
Clearly, this can work properly only if the two images have identical
dimensions. If they do not, then combination is still possible, but a meaningful
result can be obtained only in the area of overlap. If our images have
dimensions of w1*h1, and w2*h2 and we assume that their origins are aligned,
then the new image will have dimensions w*h, where:
w = min (w1, w2)
h = min (h1, h2)
Addition is commonly used to reduce the effects of noise by averaging several
observations of the same scene: positive perturbations of a pixel's grey level are
as likely as negative perturbations by the same amount, and there will be a tendency
for the perturbations to cancel out when several noisy values are added.
Addition can also be used to combine the information of two images,
such as in image morphing in motion pictures.
Figure (4): (a) noisy image; (b) average of five observations; (c) average of ten
observations.
Subtraction
Subtracting two 8-bit grayscale images can produce values between -255
and +255. This necessitates the use of 16-bit signed integers in the output
image, unless sign is unimportant, in which case we can simply take the
modulus (absolute value) of the result and store it using 8-bit integers:
The main application for image subtraction is in change detection (or motion
detection). If we make two observations of a scene and compute their difference
using the above equation, then changes will be indicated by pixels in the difference
image which have non-zero values. Sensor noise, slight changes in illumination and
various other factors can result in small differences which are of no significance so it
is usual to apply a threshold to the difference image. Differences below this threshold
are set to zero. Differences above the threshold can, if desired, be set to the maximum
pixel value. Subtraction can also be used in medical imaging to remove static
background information.
Figure (5): (a), (b) two frames of a video sequence; (c) their difference.
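A minimal MATLAB sketch of change detection by subtraction and thresholding, assuming two registered grayscale frames of identical size; the frame values and the threshold are arbitrary illustrative choices.

% frame1, frame2: two uint8 grayscale images of identical dimensions
frame1 = uint8(100 * ones(4, 4));                % illustrative data
frame2 = frame1;  frame2(2, 3) = 180;            % one pixel changes between frames
diffImg = abs(double(frame2) - double(frame1));  % difference, then modulus
T = 30;                                          % significance threshold
changeMask = diffImg > T;                        % nonzero where a significant change occurred
changeMask = uint8(changeMask) * 255;            % optional: set differences above T to the maximum value
disp(changeMask)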
Logical AND and OR operations are useful for the masking and
compositing of images. For example, if we compute the AND of a binary image
with some other image, then pixels for which the corresponding value in the
binary image is 1 will be preserved, but pixels for which the corresponding
binary value is 0 will be set to 0 (erased). Thus the binary image acts as a mask.
AND
This operation can be used to find the similarity between the white regions of two
different images (it requires two images).
g (x,y) = a (x,y) ^ b (x,y)
Exclusive OR
This operator can be used to find the differences between white regions of two
different images (it requires two images).
NOT
Photons incident on the CCD convert to photoelectrons within the silicon layer. These
photoelectrons comprise the signal but also carry a statistical variation of fluctuations in the
photon arrival rate at a given point. This phenomenon is known as “photon noise” and
follows Poisson statistics. Additionally, inherent CCD noise sources create electrons that are
indistinguishable from the photoelectrons. When calculating overall SNR, all noise sources
need to be taken into consideration:
Photon noise refers to the inherent natural variation of the incident photon flux.
Photoelectrons collected by a CCD exhibit a Poisson distribution and have a
square root relationship between signal and noise.
(noise=√signal)
Dark noise arises from the statistical variation of thermally generated electrons
within the silicon layers comprising the CCD. Dark current describes the rate of
generation of thermal electrons at a given CCD temperature. Dark noise, which
also follows a Poisson relationship, is the square root of the number of thermal
electrons generated within a given exposure. Cooling the CCD from room
temperature to -25°C will reduce dark current by more than 100 times.
Taken together, the SNR for a CCD camera can be calculated by combining the
photon, dark, and read noise contributions.
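The equation and its symbol definitions were not reproduced in the source; a commonly quoted form, assuming Poisson photon and dark noise and a fixed rms read noise, is:

\[
\mathrm{SNR} = \frac{P\,Q_e\,t}{\sqrt{P\,Q_e\,t + D\,t + N_r^2}}
\]

where P is the incident photon flux (photons per pixel per second), Q_e is the quantum efficiency, t is the integration time in seconds, D is the dark current (electrons per pixel per second), and N_r is the read noise (electrons rms).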
Under low-light-level conditions, read noise exceeds photon noise and the image
data is said to be “read-noise limited”. The integration time can be increased
until photon noise exceeds both read noise and dark noise. At this point, the
image data is said to be “photon limited”.
Once you have determined acceptable values for SNR, integration time, and the
degree to which you are prepared to bin pixels, the above equation can be
solved for the minimum photon flux required. This is, therefore, the lowest light
level that can be measured for given experimental conditions and camera
specifications.
Neighbors of a Pixel
1. N4 (p) : 4-neighbors of p.
• Any pixel p(x, y) has two vertical and two horizontal neighbors, given by
(x+1,y), (x-1, y), (x, y+1), (x, y-1)
• This set of pixels is called the 4-neighbors of P, and is denoted by N4(P).
• Each of them is at a unit distance from P.
2. ND(p): diagonal neighbors of p.
• This set of pixels is called the diagonal neighbors of p and is denoted by ND(p).
• The four diagonal neighbors of p have coordinates:
(x+1,y+1), (x+1,y-1), (x-1,y+1), (x-1,y-1)
• Each of them are at Euclidean distance of 1.414 from P.
3. N8 (p): 8-neighbors of p.
• N4(P)and ND(p) together are called 8-neighbors of p, denoted by N8(p).
• N8 = N4 U ND
• Some of the points in N4, ND and N8 may fall outside the image when P lies on the
border of the image.
Adjacency
• Two pixels are connected if they are neighbors and their gray levels satisfy some
specified criterion of similarity.
• For example, in a binary image two pixels are connected if they are 4-neighbors and
have same value (0/1)
• Let v: a set of intensity values used to define adjacency and connectivity.
• In a binary Image v={1}, if we are referring to adjacency of pixels with value 1.
• In a Gray scale image, the idea is the same, but v typically contains more elements,
for example v= {180, 181, 182,....,200}.
• If the possible intensity values 0 to 255, v set could be any subset of these 256 values.
Types of adjacency
1. 4-adjacency: Two pixels p and q with values from v are 4-adjacent if q is in the
set N4 (p).
2. 8-adjacency: Two pixels p and q with values from v are 8-adjacent if q is in the
set N8 (p).
3. m-adjacency (mixed): two pixels p and q with values from v are m-adjacent if:
q is in N4 (p), or
q is in ND (p) and the set N4 (p) ∩ N4 (q) has no pixels whose values are from v (no intersection).
• Mixed adjacency is a modification of 8-adjacency introduced to eliminate the
ambiguities that often arise when 8-adjacency is used (it eliminates multiple path
connections).
• Example: pixel arrangement as shown in the figure for v = {1}.
Path
• A digital path (or curve) from pixel p with coordinate (x,y) to pixel q with
coordinate (s,t) is a sequence of distinct pixels with coordinates (x0, y0), (x1, y1),
..., (xn, yn), where (x0, y0)= (x,y), (xn, yn)= (s,t)
• (xi, yi) is adjacent to (xi-1, yi-1) for 1 ≤ i ≤ n,
• n- The length of the path.
• If (x0, y0)= (xn, yn):the path is closed path.
• We can define 4- ,8- , or m-paths depending on the type of adjacency specified.
Connectivity
• Let S represent a subset of pixels in an image, Two pixels p and q are said to be
connected in S if there exists a path between them.
• Two image subsets S1 and S2 are adjacent if some pixel in S1 is adjacent to some
pixel in S2
Region
• Let R be a subset of pixels in an image. We call R a region of the image if R is a
connected set.
• Regions that are not adjacent are said to be disjoint.
• Example: the two regions (of 1s) in the figure below are adjacent only if 8-adjacency is used.
Ri:
1 1 1
1 0 1
0 1 0
Rj:
0 0 1
1 1 1
1 1 1
• A 4-path between the two regions does not exist (so their union is not a connected set).
• Boundary (border): suppose an image contains K disjoint regions Rk, k = 1, 2, ..., K,
none of which touches the image border.
(Figure: an image containing disjoint regions R1, R2, R3, ..., RK.)
• Let Ru denote the union of all the K regions, and (Ru)c denote its complement
(the complement of a set S is the set of points that are not in S).
Ru is called the foreground; (Ru)c is called the background of the image.
• The boundary (border or contour) of a region R is the set of points that are adjacent to
points in the complement of R (another way: the border of a region is the set of pixels
in the region that have at least one background neighbor).
We must specify the connectivity being used to define adjacency
Distance Measures
• For pixels p, q and z, with coordinates (x,y), (s,t) and (u,v) respectively, D is
a distance function or metric if:
D(p,q) ≥ 0, and D(p,q) = 0 if and only if p = q,
D(p,q) = D(q,p), and
D(p,z) ≤ D(p,q) + D(q,z).
• The following are the different distance measures:
1. Euclidean distance (De): De(p,q) = [(x − s)^2 + (y − t)^2]^(1/2)
2. City-block distance (D4): D4(p,q) = |x − s| + |y − t|
3. Chessboard distance (D8): D8(p,q) = max(|x − s|, |y − t|)
Example 1: the pixels with D4 = 1 are the 4-neighbors of (x, y).
4. Dm distance:
• It is defined as the shortest m-path between the points.
• In this case the distance between two pixels depends on the values of the pixels
along the path, as well as on the values of their neighbors.
Example: consider the following arrangement of pixels
P3 P4
P1 P2
P
and assume that P, P2 and P4 have value 1 and that P1 and P3 can have a value of 0 or 1.
Suppose that we consider adjacency of pixels of value 1 (v = {1}):
a) if P1 and P3 are 0: Dm distance = 2
b) if P1 = 1 and P3 = 0: m-distance = 3
c) if P1 = 0 and P3 = 1: m-distance = 3
d) if P1 = P3 = 1: m-distance = 4, path = p p1 p2 p3 p4
Matlab Example
Matlab Code
bw = zeros(200,200); bw(50,50) = 1; bw(50,150) = 1;
bw(150,100) = 1;
D1 = bwdist(bw,'euclidean');
D2 = bwdist(bw,'cityblock');
D3 = bwdist(bw,'chessboard');
D4 = bwdist(bw,'quasi-euclidean');
figure
subplot(2,2,1), subimage(mat2gray(D1)), title('Euclidean')
hold on, imcontour(D1)
subplot(2,2,2), subimage(mat2gray(D2)), title('City block')
hold on, imcontour(D2)
subplot(2,2,3), subimage(mat2gray(D3)), title('Chessboard')
hold on, imcontour(D3)
subplot(2,2,4), subimage(mat2gray(D4)), title('Quasi-Euclidean')
hold on, imcontour(D4)
International Journal of Emerging Technologies in Engineering Research (IJETER)
Volume 5, Issue 4, April (2017) www.ijeter.everscience.org
Abstract – Images are used in various fields to help monitoring processes such as images in fingerprint evaluation, satellite monitoring, medical diagnostics, underwater areas, etc. Image processing techniques are adopted as an optimized method to help carry out the processing tasks efficiently. The development of image processing software helps the image editing process effectively. Image enhancement algorithms offer a wide variety of approaches for modifying original captured images to achieve visually acceptable images. In this paper, we apply frequency domain filters to generate an enhanced image. Simulation output results in noise reduction, contrast enhancement, smoothening and sharpening of the enhanced image.

Index Terms – Digital Image Processing, Fourier Transforms, High-pass Filters, Low-pass Filters, Image Enhancement.

1. INTRODUCTION

Image processing is a form of signal processing in which the input is an image, such as a photograph or video frame, and the output may be either an image or a set of characteristics or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques. It deals with the improvement of pictorial information for human interpretation and processing of images for storage, transmission and representation for machine perception.

Image processing can be defined as analysis of pictures using techniques that can basically identify regions of interest from all those images in bitmapped graphic format that have been scanned or captured with a digital camera. Image enhancement techniques aim at realizing an improvement in the quality of a given image. An image can be enhanced by changing any attribute of the image. There exist many techniques that can enhance an image without spoiling it. Enhancement methods can be broadly divided into two categories, i.e. spatial domain techniques and frequency domain techniques.

The spatial domain deals with direct manipulation of the pixels of an image, whereas the frequency domain filters the image by modifying the Fourier transform of the image. In this paper, the main focus is laid on enhancing an image using the frequency domain technique. The objective is to show how a digital image is processed to generate a better-quality image.

The content of this paper is organized as follows: Section I gives an introduction to the topic and presents fundamental background. Section II describes the types of image enhancement techniques. Section III defines the operations applied for image filtering. Section IV shows results and discussions. Section V concludes the proposed approach and its outcome.

1.1 Digital Image Processing

Digital image processing is a part of signal processing which uses computer algorithms to perform image processing on digital images. It has numerous applications in different studies and researches of science and technology. The fundamental steps in digital image processing are image acquisition, image enhancement, image analysis, image reconstruction, image restoration, image compression, image segmentation, image recognition, and visualization of the image.

The main sources of noise in digital image processing come under image acquisition and image transmission. Image enhancement basically improves the visual quality of the image by providing clear images for the human observer and for the machine in automatic image processing techniques. Digital image processing has fundamental classes depending on their operations:

A. Image enhancement

Image enhancement deals with contrast enhancement, spatial filtering, frequency domain filtering, edge enhancement and noise reduction. This project briefly shows the theoretical and practical approaches in the frequency domain.

B. Image analysis

It deals with the statistical details of an image. It is possible to examine the information of an image in detail. This information helps in image restoration and enhancement. One of the representations of the information is the histogram representation. During image analysis, the main tasks include image segmentation, feature extraction and object classification.
C. Image restoration

In this class, the image is corrected using different correction methods, like inverse filtering and feature extraction, in order to restore an image to its original form.

D. Image compression

It deals with the compression of the size of the image so that it can easily be stored electronically. The compressed images are then decompressed to their original forms. Image compression and decompression can either reduce the size while maintaining high quality, or preserve the original data without any loss.

E. Image synthesis

This class of digital image processing is well known nowadays in the film and game industry and is very advanced in 3-dimensional and 4-dimensional productions. In both cases the image and video scenes are constructed using certain techniques of visualization.

1.2 Image Enhancement

Image enhancement is basically improving the interpretability or perception of information in images for human viewers and providing 'better' input for other automated image processing techniques. The principal objective of image enhancement is to process a given image so that the result is more suitable than the original image for a specific application.

Image enhancement simply means transforming an image f into an image g using a transformation function T. Let the values of pixels in images f and g be denoted by r and s respectively. Then the pixel values r and s are related by the expression

s = T(r)

where T is a transformation that maps a pixel value r into a pixel value s.

2. IMAGE ENHANCEMENT TECHNIQUES

Fig. 1. Types of Enhancement Technique

The enhancement technique differs from one field to another according to its objective. Advancement in technology brings development in digital image processing techniques in both domains:

A. Spatial domain

The term spatial domain refers to the image plane itself, and approaches in this category are based on direct manipulation of the pixel values of an image. It enhances the whole image in a uniform manner. The value of the pixel with coordinates (x, y) in the enhanced image 'F' is the result of performing some operation on the pixels in the neighbourhood of (x, y) in the input image 'f'. This method is straightforward and is chiefly utilized in real-time applications. But it lags in producing adequate robustness and imperceptibility requirements.

B. Frequency domain

The frequency domain processing techniques are based on modifying the Fourier transform of an image. The basic idea in using this technique is to enhance the image by manipulating the transform coefficients of the image, such as the Discrete Fourier Transform (DFT), Discrete Wavelet Transform (DWT), and Discrete Cosine Transform (DCT). This method's advantages include low complexity of computation, ease of viewing and manipulating the frequency composition of the image, and the easy applicability of special transformed-domain properties.

3. IMAGE ENHANCEMENT USING FREQUENCY DOMAIN TECHNIQUE

In frequency domain methods, the image is first transferred into the frequency domain. All the enhancement operations are performed on the Fourier transform of the image. The image enhancement function in the frequency domain is denoted by the expression:

g(x, y) = T[f(x, y)]

where f(x, y) is the input image and g(x, y) is the enhanced image formed as the result of performing some operation T on the frequency components of the transformed image.
3.1 Filtering in the Frequency Domain
The procedures required to enhance an image using frequency
domain technique are:
i. Transform the input image into the Fourier domain.
ii. Multiply the Fourier transformed image by a filter.
iii. Take the inverse Fourier transform of the image to get
the resulting enhanced image.
3.2 Basic Steps for Filtering in the Frequency Domain:
1. Given an input image f(x, y) of size M x N.
2. Compute F (u, v), the DFT of the image.
3. Multiply F(u, v) by a filter function H(u, v), i.e., G(u, v) = H(u, v)F(u, v).
4. Compute the inverse DFT of G(u, v).
5. Obtain the real part of the result.

Step-1 Input Image

An input image may be defined as a two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or grey level of the image at that point.

Step-2 Compute the Fourier transform of the image

F(u, v) = sum over x = 0..M-1 and y = 0..N-1 of f(x, y) exp[-j2*pi*(ux/M + vy/N)]

where u represents the frequency and x represents time/space. The exponential in the above formula can be expanded into sines and cosines, with the variables u and v determining these frequencies.

Step-3 Filtering of the Fourier Transformed image

A filter is a tool designed to suppress certain frequency components of an input image and return the image in a modified format. They are used to compensate for image imperfections such as noise and insufficient sharpness. By filter design we can create filters that pass signals with frequency components in some bands, and attenuate signals with content in other frequency bands. The general formula for filtering is given as:

G(u, v) = F(u, v) . H(u, v)

where H(u, v) is the transfer function, and F(u, v) is the Fourier transform of the image function. G(u, v) is the filtered final function.

In all the filters, it is important to find the right filter function H(u, v), as it amplifies some frequencies and suppresses certain frequency components in an image. There are many filters that are used for blurring/smoothing, sharpening and edge detection in an image. Based on the property of using the frequency domain, the image filters are broadly classified into two categories:

1. Low-pass filters / Smoothing filters.
2. High-pass filters / Sharpening filters.

A low-pass filter attenuates (suppresses) high frequencies while passing the low frequencies, which results in creating a blurred (smoothed) image. It leaves the low frequencies of the Fourier transform relatively unchanged and ignores the high-frequency noise components. Three main low-pass filters are:

i. Ideal low-pass filter (ILPF)

An ideal low-pass filter deals with the removal of all high-frequency values of the Fourier transform that are at a distance greater than a specified distance from the origin of the transformed image. The filter transfer function for the Ideal low-pass filter is given by:

H(u, v) = 1 if D(u, v) <= D0
H(u, v) = 0 if D(u, v) > D0

where D(u, v) = [(u - M/2)^2 + (v - N/2)^2]^(1/2) is the distance from the centre of the frequency rectangle and D0 is the cut-off frequency.

ii. Butterworth low-pass filter (BLPF)

The filter transfer function for the Butterworth low-pass filter of order n is given by:

H(u, v) = 1 / [1 + (D(u, v)/D0)^(2n)]

iii. Gaussian low-pass filter (GLPF)

The filter transfer function for the Gaussian low-pass filter is given by:

H(u, v) = exp[-D^2(u, v) / (2 D0^2)]

Step-4 Compute the Inverse Fourier Transform to get the enhanced image

We then need to convert the data back to a real image to use in any application. After the needed frequencies are removed it is easy to return to the spatial domain. A function represented by a Fourier transform can be completely reconstructed by an inverse transform with no loss of information. For this, the Inverse Fourier Transform of the filtered image is calculated by the following equation:

g(x, y) = (1/(MN)) sum over u = 0..M-1 and v = 0..N-1 of G(u, v) exp[j2*pi*(ux/M + vy/N)]
Fig. 4. Original Input Image

ii. Butterworth High-pass Filter

The transfer function of the Butterworth high-pass filter of order n and with a specified cut-off frequency D0 is given by:

H(u, v) = 1 / [1 + (D0/D(u, v))^(2n)]

The transfer function of the Gaussian high-pass filter is given by:

H(u, v) = 1 - exp[-D^2(u, v) / (2 D0^2)]
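As an illustration of the five basic filtering steps above, here is a minimal MATLAB sketch of frequency-domain smoothing with a Gaussian low-pass filter. The test image and the cut-off frequency D0 are arbitrary choices, and the filter is built directly from the H(u, v) expression given earlier.

% Step 1: input image f(x, y) of size M x N (synthetic test pattern)
f = zeros(128, 128);  f(40:90, 40:90) = 1;      % white square on a black background
[M, N] = size(f);
% Step 2: compute F(u, v), the DFT of the image (centred with fftshift)
F = fftshift(fft2(f));
% Step 3: multiply F(u, v) by the Gaussian low-pass transfer function H(u, v)
D0 = 15;                                        % cut-off frequency
[u, v] = meshgrid(1:N, 1:M);
D = sqrt((u - N/2).^2 + (v - M/2).^2);          % distance from the centre of the spectrum
H = exp(-(D.^2) / (2 * D0^2));                  % Gaussian low-pass filter
G = H .* F;
% Steps 4-5: inverse DFT and real part give the smoothed (blurred) image
g = real(ifft2(ifftshift(G)));
subplot(1,2,1), imagesc(f), colormap(gray), axis image, title('Original')
subplot(1,2,2), imagesc(g), colormap(gray), axis image, title('Gaussian LPF result')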
The results obtained in Fig. 11 are smoother than with the previous two filters. Even the filtering of the smaller objects and thin bars is cleaner with the Gaussian filter.

5. CONCLUSION AND FUTURE SCOPE

In this project, we focus on existing frequency domain based image enhancement techniques, which include filters that are useful in many application areas such as medical diagnosis, army and industrial areas. A program is developed to compute and display the image after applying various low-pass and high-pass filters to it.

In this project frequency domain filters are implemented in MATLAB. It is found that low-pass filters smoothen the input image by removing noise and result in blurring of the image, and high-pass filters sharpen the inside details of an image. Ideal filters result in a ringing effect in the enhanced image. Using the Butterworth filters the ringing effect gets reduced, since there are no sharp frequency transitions, whereas the use of Gaussian filters gives the filtered image without any ringing effect.

The future scope can be the development of adaptive algorithms for effective image enhancement using Fuzzy Logic and Neural Networks. Many more filters can be added to the functionality. The same work can be extended to further digital image processing applications such as image restoration, image data compression, etc.

REFERENCES
[1] Maini, Raman, and Himanshu Aggarwal. "A comprehensive review of image enhancement techniques." arXiv preprint arXiv:1003.4053 (2010).
[2] Singh, L. Shyam Sundar, et al. "A Review on Image Enhancement Methods on Different Domains."
[3] Upadhye, Mrs Smita Y., and Mrs Swapnali B. Karole. "Comparision of different Image Resolution Enhancement techniques using wavelet transform." (2016).
[4] Nguchu, Benedictor Alexander. "Critical Analysis of Image Enhancement Techniques."
[5] Shaikh, Md Shahnawaz, Ankita Choudhry, and Rakhi Wadhwani. "Analysis of Digital Image Filters in Frequency Domain." Analysis 140.6 (2016).
[6] Singh, Palwinder. "Image Enhancement Techniques: A Comprehensive."
[7] Bedi, S. S., and Rati Khandelwal. "Various image enhancement techniques - a critical review." International Journal of Advanced Research in Computer and Communication Engineering 2.3 (2013).
[8] Wang, David CC, Anthony H. Vagnucci, and C. C. Li. "Digital image enhancement: a survey." Computer Vision, Graphics, and Image Processing 24.3 (1983): 363-381.
[9] Bansal, Atul, Rochak Bajpai, and J. P. Saini. "Simulation of image enhancement techniques using Matlab." Modelling & Simulation, 2007. AMS'07. First Asia International Conference on. IEEE, 2007.
[10] Sawant, H. K., and Mahentra Deore. "A comprehensive review of image enhancement techniques." International Journal of Computer Technology and Electronics Engineering (IJCTEE) 1.2 (2010): 39-44.
[11] Rajput, Seema, and S. R. Suralkar. "Comparative study of image enhancement techniques." International Journal of Computer Science and Mobile Computing 2.1 (2013): 11-21.
[12] Chen, Qiang, et al. "A solution to the deficiencies of image enhancement." Signal Processing 90.1 (2010): 44-56.
Homomorphic filters are widely used in image processing for compensating the effect
of non-uniform illumination in an image. Pixel intensities in an image represent the
light reflected from the corresponding points in the objects. As per the image model,
an image f(x,y) may be characterized by two components: (1) the amount of source light
incident on the scene being viewed, and (2) the amount of light reflected by the
objects in the scene. These portions of light are called the illumination and reflectance
components, and are denoted i ( x , y) and r ( x , y) respectively.
The functions i ( x , y) and r ( x , y) combine multiplicatively to give the image
function f ( x , y):
f ( x , y) = i ( x , y).r(x, y) (1)
Homomorphic filters are used in such situations where the image is subjected to the
multiplicative interference or noise as depicted in Eq. 1. We cannot easily use the
above product to operate separately on the frequency components of illumination and
reflection because the Fourier transform of f ( x , y) is not separable; that is
F{f(x,y)} ≠ F{i(x,y)} · F{r(x,y)}. We can separate the two components by
taking the logarithm of both sides: ln f(x,y) = ln i(x,y) + ln r(x,y).
Taking Fourier transforms on both sides we get,
Z(u,v) = Fi(u,v) + Fr(u,v)    (3)
where Z, Fi and Fr are the Fourier transforms of ln f(x,y), ln i(x,y), and ln r(x,y)
respectively. The function Z represents the Fourier
transform of the sum of two images: a low-frequency illumination image and a high-
frequency reflectance image. If we now apply a filter with a transfer function that
suppresses low- frequency components and enhances high-frequency components,
then we can suppress the illumination component and enhance the reflectance
component.
Features & Application:
Images normally consist of light reflected from objects. The basic nature
of the image f(x,y) may be characterized by two components:
(1) the amount of source light incident on the scene being viewed, and
(2) the amount of light reflected by the objects in the scene.
Taking the logarithm of this product model gives ln f(x,y) = ln[i(x,y)] + ln[r(x,y)].
If we now apply a filter with a transfer function that suppresses low frequency
components and enhances high frequency components, then we can suppress the
illumination component and enhance the reflectance component. Thus ,the Fourier
transform of the output image is obtained by multiplying the DFT of the input image
with the filter function H(u,v).
i.e., S(u,v) = H(u,v) Z(u,v)    (4)
where S(u,v) is the Fourier transform of the output image.
Substituting equation (3) in (4), we get
S(u,v) = H(u,v) [ Fi(u,v) + Fr(u,v) ] = H(u,v) Fi(u,v) + H(u,v) Fr(u,v)    (5)
Applying the IDFT to equation (5), we get
T⁻¹[S(u,v)] = T⁻¹[ H(u,v) Fi(u,v) + H(u,v) Fr(u,v) ] = T⁻¹[ H(u,v) Fi(u,v) ] + T⁻¹[ H(u,v) Fr(u,v) ]
s(x,y) = i'(x,y) + r'(x,y)    (6)
where i0(x,y) = e^(i'(x,y)) and r0(x,y) = e^(r'(x,y)) are the illumination and reflectance
components of the enhanced output image.
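A minimal MATLAB sketch of the homomorphic pipeline described above (logarithm, DFT, high-emphasis filter, inverse DFT, exponential). The Gaussian-shaped high-emphasis transfer function and the values of gammaL, gammaH and D0 are illustrative assumptions, not values from the source.

% Build a test image that follows the multiplicative model f(x,y) = i(x,y) .* r(x,y)
[xg, yg] = meshgrid(1:128, 1:128);
illum = 0.2 + 0.8 * (xg / 128);                           % slowly varying illumination
refl  = 50 + 150 * mod(floor(xg/16) + floor(yg/16), 2);   % checkerboard reflectance
f = illum .* refl;
[M, N] = size(f);
z = log(f + 1);                                  % logarithm makes i and r additive
Z = fftshift(fft2(z));                           % Fourier transform of the log image
[u, v] = meshgrid(1:N, 1:M);
D2 = (u - N/2).^2 + (v - M/2).^2;                % squared distance from the spectrum centre
gammaL = 0.5;  gammaH = 2.0;  D0 = 10;           % assumed filter parameters
H = (gammaH - gammaL) * (1 - exp(-D2 / (2*D0^2))) + gammaL;  % attenuate low, boost high frequencies
s = real(ifft2(ifftshift(H .* Z)));              % filtered log image back in the spatial domain
g = exp(s) - 1;                                  % exponentiate to undo the logarithm
subplot(1,2,1), imagesc(f), colormap(gray), axis image, title('Input (uneven illumination)')
subplot(1,2,2), imagesc(g), colormap(gray), axis image, title('Homomorphic result')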
Image Processing Lecture 6
Note:
The size of mask must be odd (i.e. 3×3, 5×5, etc.) to ensure it has a
center. The smallest meaningful size is 3×3.
g(x, y) = ΣΣ w(s, t) f(x + s, y + t)
(the sums run over the mask coordinates, s = -a ... a and t = -b ... b)
Example:
Use the following 3×3 mask to perform the convolution process on the
shaded pixels in the 5×5 image below. Write the filtered image.

3×3 mask:
0     1/6   0
1/6   1/3   1/6
0     1/6   0

5×5 image:
30   40   50   70   90
40   50   80   60  100
35  255   70    0  120
30   45   80  100  130
40   50   90  125  140
Solution:
0×30 + (1/6)×40 + 0×50 + (1/6)×40 + (1/3)×50 + (1/6)×80 + 0×35 + (1/6)×255 + 0×70 = 85
0×40 + (1/6)×50 + 0×70 + (1/6)×50 + (1/3)×80 + (1/6)×60 + 0×255 + (1/6)×70 + 0×0 = 65
0×50 + (1/6)×70 + 0×90 + (1/6)×80 + (1/3)×60 + (1/6)×100 + 0×70 + (1/6)×0 + 0×120 = 61
0×40 + (1/6)×50 + 0×80 + (1/6)×35 + (1/3)×255 + (1/6)×70 + 0×30 + (1/6)×45 + 0×80 = 118
and so on …
Filtered image =
30   40   50   70   90
40   85   65   61  100
35  118   92   58  120
30   84   77   89  130
40   50   90  125  140
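A minimal MATLAB sketch that reproduces this kind of neighbourhood operation with filter2. Note that filter2 performs correlation (which for this symmetric mask equals convolution), that the border pixels differ from the hand computation because of zero padding, and that the results are truncated to integers as in the worked example.

img = [30 40 50 70 90; 40 50 80 60 100; 35 255 70 0 120; ...
       30 45 80 100 130; 40 50 90 125 140];
w = [0 1/6 0; 1/6 1/3 1/6; 0 1/6 0];     % the 3x3 weighted averaging mask
g = filter2(w, img, 'same');             % apply the mask at every pixel (zero-padded borders)
disp(floor(g))                           % interior values match the worked example (85, 65, 61, 118, ...)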
Spatial Filters
Spatial filters can be classified by effect into:
1. Smoothing Spatial Filters: also called lowpass filters. They include:
1.1 Averaging linear filters
1.2 Order-statistics nonlinear filters.
2. Sharpening Spatial Filters: also called highpass filters. For example,
the Laplacian linear filter.
Standard average filter: (1/9) multiplied by the mask
1 1 1
1 1 1
1 1 1

Weighted average filter: (1/16) multiplied by the mask
1 2 1
2 4 2
1 2 1
Note:
Weighted average filter has different coefficients to give more
importance (weight) to some pixels at the expense of others. The idea
behind that is to reduce blurring in the smoothing process.
Figure 6.2 Effect of averaging filter. (a) Original image. (b)-(f) Results of smoothing with
square averaging filter masks of sizes n = 3, 5, 9, 15, and 35, respectively.
Order-statistics filters
are nonlinear spatial filters whose response is based on ordering (ranking)
the pixels contained in the neighborhood, and then replacing the value of
the center pixel with the value determined by the ranking result.
Examples include Max, Min, and Median filters.
Median filter
It replaces the value at the center by the median pixel value in the
neighborhood, (i.e. the middle element after they are sorted). Median
filters are particularly useful in removing impulse noise (also known as
salt-and-pepper noise). Salt = 255, pepper = 0 gray levels.
In a 3×3 neighborhood the median is the 5th largest value, in a 5×5
neighborhood the 13th largest value, and so on.
For example, suppose that a 3×3 neighborhood has gray levels (10,
20, 0, 20, 255, 20, 20, 25, 15). These values are sorted as
(0,10,15,20,20,20,20,25,255), which results in a median of 20 that
replaces the original pixel value 255 (salt noise).
Example:
Consider the following 5×5 image:
20 30 50 80 100
30 20 80 100 110
25 255 70 0 120
30 30 80 100 130
40 50 90 125 140
Apply a 3×3 median filter on the shaded pixels, and write the filtered
image.
Solution
First shaded pixel (value 255): its 3×3 neighborhood, sorted, is
20, 25, 30, 30, 30, 70, 80, 80, 255 → median = 30.
Second shaded pixel (value 70): sorted neighborhood
0, 20, 30, 70, 80, 80, 100, 100, 255 → median = 80.
Third shaded pixel (value 0): sorted neighborhood
0, 70, 80, 80, 100, 100, 110, 120, 130 → median = 100.
20 30 50 80 100
30 20 80 100 110
Filtered Image = 25 30 80 100 120
30 30 80 100 130
40 50 90 125 140
Figure 6.3 Effect of median filter. (a) Image corrupted by salt & pepper noise. (b) Result of
applying 3×3 standard averaging filter on (a). (c) Result of applying 3×3 median filter on (a).
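A minimal MATLAB sketch of a 3×3 median filter applied to the example image above, written with basic operations only. It filters every interior pixel (the worked example filtered only the shaded ones), and border pixels are left unchanged.

img = [20 30 50 80 100; 30 20 80 100 110; 25 255 70 0 120; ...
       30 30 80 100 130; 40 50 90 125 140];
g = img;                                       % output image; borders stay unchanged
for r = 2 : size(img,1) - 1
    for c = 2 : size(img,2) - 1
        nb = img(r-1:r+1, c-1:c+1);            % 3x3 neighborhood
        g(r, c) = median(nb(:));               % replace the centre by the median value
    end
end
disp(g)    % the shaded row becomes 25 30 80 100 120, as in the filtered image above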
∂f/∂x = f(x+1, y) − f(x, y)   and   ∂f/∂y = f(x, y+1) − f(x, y)
The second order partial derivatives of the digital image f(x,y) are:
∂²f/∂x² = f(x+1, y) + f(x−1, y) − 2f(x, y)
∂²f/∂y² = f(x, y+1) + f(x, y−1) − 2f(x, y)
We conclude that:
• 1st derivative detects thick edges while 2nd derivative detects thin
edges.
• 2nd derivative has much stronger response at gray-level step than 1st
derivative.
Thus, we can expect a second-order derivative to enhance fine detail (thin
lines, edges, including noise) much more than a first-order derivative.
∇²f = ∂²f/∂x² + ∂²f/∂y²
Figure 6.5 Example of applying Laplacian filter. (a) Original image. (b) Laplacian image.
(c) Sharpened image.
1. s = T(r)
Where T is a transformation, r is the pixel value before processing, and s is the pixel
value after processing.
Let,
1. r = f(x,y)
2. s = g(x,y)
'r' and 's' are used to denote gray levels of f and g at(x,y)
There are three basic types of gray-level transformation:
1. Linear
2. Logarithmic
3. Power - law
In the identity transformation, each value of the input image is directly mapped to the
same value in the output image.
Logarithmic transformations
Logarithmic transformation is divided into two types:
1. Log transformation
2. Inverse log transformation
The formula for Logarithmic transformation
1. s = c log(r + 1)
Here, r and s are the pixel values of the input and output image respectively, and c is a
constant. In the formula, we can see that 1 is added to each pixel value; this is because if
a pixel intensity is zero in the image then log(0) is undefined, so 1 is added to give a
minimum value. When the log transformation is applied, dark pixel values are expanded
relative to higher pixel values, while higher pixel values are compressed.
In the above image (a) Fourier Spectrum and (b) result of applying Log Transformation.
Power - Law transformations
Power-law transformation has two forms: nth power transformation
and nth root transformation.
Formula:
1. s = cr ^ γ
All display devices have their own gamma correction. That is why images are displayed
at different intensity.
For example:
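The example that followed here is not reproduced in the source; as an illustration, with an assumed display gamma of 2.5, a minimal MATLAB sketch of gamma correction is:

r = 0 : 0.05 : 1;                  % normalised input intensities
gamma = 2.5;                       % assumed display gamma
c = 1;                             % constant in s = c * r^gamma
displayed = c * r .^ gamma;        % what an uncorrected display would show (darker mid-tones)
corrected = c * r .^ (1/gamma);    % pre-correcting with the inverse power compensates for this
plot(r, displayed, r, corrected, r, r)
legend('r^{2.5}', 'r^{1/2.5}', 'identity')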
Image Enhancement
The main objective of Image Enhancement is to process the given image into a more
suitable form for a specific application. It makes an image more noticeable by
enhancing features such as edges, boundaries, or contrast. Enhancement does not increase
the information content of the data, but it increases the dynamic range of the chosen
features so that they can be detected easily.
In image enhancement, the difficulty arises to quantify the criterion for enhancement
for which enhancement techniques are required to obtain satisfying results.
Histogram Processing:
Applications of Histograms
1. In digital image processing, histograms are used for simple calculations in software.
2. It is used to analyze an image. Properties of an image can be predicted by the detailed
study of the histogram.
3. The brightness of the image can be adjusted by having the details of its histogram.
4. The contrast of the image can be adjusted according to the need by having details of
the x-axis of a histogram.
5. It is used for image equalization. Gray level intensities are expanded along the x-axis to
produce a high contrast image.
6. Histograms are used in thresholding as it improves the appearance of the image.
7. If we have input and output histogram of an image, we can determine which type of
transformation is applied in the algorithm.
Implementation
Example
The following input grayscale image is to be changed to match the reference histogram.
The input image has the following histogram
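The input image and the reference histogram referred to above are not reproduced in the source. As a related, self-contained illustration of histogram processing, here is a minimal MATLAB sketch of histogram equalization on a small 8-level image; the image values are arbitrary.

img = uint8([0 1 1 2; 2 2 3 3; 3 3 3 4; 4 5 6 7]);   % tiny image with gray levels 0..7
L = 8;                                                % number of gray levels
counts = accumarray(double(img(:)) + 1, 1, [L, 1]);   % histogram of the input image
p = counts / numel(img);                              % normalised histogram (PDF)
cdf = cumsum(p);                                      % cumulative distribution function
map = round((L - 1) * cdf);                           % equalization mapping s = (L-1)*CDF(r)
eq = uint8(map(double(img) + 1));                     % apply the mapping to every pixel
disp(eq)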
Equation
A simple snake model can be denoted by a set of n points, vi for i=0,….n-1, the
internal elastic energy term EInternal and the external edge-based energy term Eexternal. The
internal energy term’s aim is to regulate the snake’s deformations, while the exterior energy term’s
function is to control the contour’s fitting onto the image. The external energy is typically a
combination of forces caused by the picture Eimage and constraint forces imposed by the user Econ.
The snake’s energy function is the total of its exterior and internal energy,
which can be written as below.
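The expression referred to above is not reproduced in the source; in the standard formulation of Kass, Witkin and Terzopoulos, the snake energy is written as:

\[
E_{\text{snake}} = \int_{0}^{1} \Big( E_{\text{internal}}\big(v(s)\big) + E_{\text{image}}\big(v(s)\big) + E_{\text{con}}\big(v(s)\big) \Big)\, ds
\]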
Advantage
The applications of the active snake model are expanding rapidly, particularly
in the many imaging domains. In the field of medical imaging, the snake model
is used to segment one portion of an image that has unique characteristics
when compared to other regions of the picture. Traditional snake model
applications in medical imaging include optic disc and cup segmentation to
identify glaucoma, cell image segmentation, vascular region segmentation,
and several other regions segmentation for diagnosis and study of disorders
or anomalies.
Disadvantage
Equation
In 2D, the GVF vector field FGVF minimizes the energy functional
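The functional itself is not reproduced in the source; in the standard gradient vector flow formulation (Xu and Prince), writing F_GVF(x, y) = (u(x, y), v(x, y)) and letting f be an edge map of the image, it is:

\[
E = \iint \mu \left( u_x^2 + u_y^2 + v_x^2 + v_y^2 \right)
+ \lvert \nabla f \rvert^2 \, \lvert F_{\text{GVF}} - \nabla f \rvert^2 \, dx\, dy
\]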
3. Balloon Model
A snake model isn’t drawn to far-off edges. If no significant image forces
apply to the snake model, its inner side will shrink. A snake that is larger than
the minima contour will eventually shrink into it, whereas a snake that is
smaller than the minima contour will not discover the minima and will instead
continue to shrink. To address the constraints of the snake model, the balloon
model was developed, in which an inflation factor is incorporated into the
forces acting on the snake. The inflation force can overwhelm forces from
weak edges, exacerbating the problem with first guess localization.
Equation
Inflation term into the forces acting on the snake is introduced in the balloon
model.
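The equation is not reproduced in the source; in Cohen's balloon model the inflation term added to the forces is usually written as:

\[
F_{\text{inflation}} = k_1\, \mathbf{n}(s)
\]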
where n(s) is the normal unitary vector of the curve at v(s) and k1 is the
magnitude of the force.
2.1 Point detection
• A point has been detected at the location p(i,j) on which the mask is centered if |R| > T,
where T is a nonnegative threshold, and R is obtained with the following mask:
−1 −1 −1
−1  8 −1
−1 −1 −1
• The idea is that the gray level of an isolated point will be quite different from the gray
level of its neighbors.
(Figure: original, noise added, filtered output, thresholded output.)

2.2 Line detection
• Line masks:
Horizontal line:
−1 −1 −1
 2  2  2
−1 −1 −1
+45° line:
−1 −1  2
−1  2 −1
 2 −1 −1
Vertical line:
−1  2 −1
−1  2 −1
−1  2 −1
−45° line:
 2 −1 −1
−1  2 −1
−1 −1  2
• If, at a certain point in the image, |Ri| > |Rj| for all j ≠ i, that point is said to be more
likely associated with a line in the direction of mask i.
2.3 Edge detection
• It locates sharp changes in the intensity function.
• Edges are pixels where brightness changes abruptly.
• A change of the image function can be described by
a gradient that points in the direction of the largest
growth of the image function.
• An edge is a property attached to an individual pixel
and is calculated from the image function behavior
in a neighborhood of the pixel.
• Magnitude of the first derivative detects the
presence of the edge.
• Sign of the second derivative determines whether
the edge pixel lies on the dark side or the light side.
(Figure: line detection results for the horizontal, +45°, vertical and −45° masks.)
(a) Gradient operator
|∇f(x', y')| = [ (∂f/∂x)² + (∂f/∂y)² ]^(1/2), evaluated at (x', y')
(Fig.: Edge detection by derivative operators. (a) light stripe on a dark background;
(b) dark stripe on a light background.)
• Direction of the vector ∇f(x', y'):
α(x', y') = tan⁻¹[ (∂f/∂y) / (∂f/∂x) ], evaluated at (x', y')
Sobel operator:
• It provides both a differentiating and a smoothing effect, which is particularly
attractive as derivatives typically enhance noise.
Gx:
−1 −2 −1
 0  0  0
 1  2  1
Gy:
−1  0  1
−2  0  2
−1  0  1

(b) Laplacian Operator
• The Laplacian of a 2D function f(x,y) is a 2nd-order derivative defined as
∇²f(x', y') = ∂²f/∂x² + ∂²f/∂y², evaluated at (x', y')
• The Laplacian has the same properties in all directions and is therefore invariant
to rotation in the image.
2.4 Combined Detection:
• The Laplacian usually plays the secondary role of detector for establishing whether a
pixel is on the dark or light side of an edge.
Basis of the line subspace:
W5 = (1/2) ×
−1  0  1
 0  0  0
 1  0 −1
W6 = (1/2) ×
 0  1  0
−1  0 −1
 0  1  0
W7 = (1/6) ×
 1 −2  1
−2  4 −2
 1 −2  1
W8 = (1/6) ×
−2  1 −2
 1  4  1
−2  1 −2
"Average" subspace:
W9 = (1/3) ×
1 1 1
1 1 1
1 1 1
• Given a 3x3 region represented by {f(i,j) | −2 < i, j < 2}, we have
Rm = Σ over i = −1..1 and j = −1..1 of f(i, j) wm(i, j)
Pline = [ Σ over m = 5..8 of Rm² ]^(1/2)
Pedge = [ Σ over m = 1..4 of Rm² ]^(1/2)
Paverage = R9
where Pline, Paverage and Pedge are the magnitudes of the projections onto the edge,
line and average subspaces respectively, which tell how likely it is associated with
either an edge, a line or nothing.
• Example: What's the attribute of the center of the region
4 7 1
3 5 2
2 0 0  ?
R1 = 4.5607    R2 = 2.2678
R3 = −2.6213   R4 = −0.8284
R5 = −0.5000   R6 = 1.0000
R7 = 0.5000    R8 = 3.0000
R9 = 8.0000
Pedge = 5.7879
Pline = 3.2404
Paver = 8.0000
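A minimal MATLAB sketch that reproduces the line-subspace and average-subspace projections for this example using the masks W5-W9 listed above. The edge masks W1-W4 are not reproduced in the source, so Pedge is not computed here.

f = [4 7 1; 3 5 2; 2 0 0];                       % the 3x3 region from the example
W5 = (1/2) * [-1 0 1; 0 0 0; 1 0 -1];
W6 = (1/2) * [ 0 1 0; -1 0 -1; 0 1 0];
W7 = (1/6) * [ 1 -2 1; -2 4 -2; 1 -2 1];
W8 = (1/6) * [-2 1 -2; 1 4 1; -2 1 -2];
W9 = (1/3) * ones(3);
R = zeros(1, 9);
masks = {W5, W6, W7, W8, W9};
for m = 1:5
    R(m + 4) = sum(sum(f .* masks{m}));          % Rm = sum of f(i,j) * wm(i,j)
end
Pline = sqrt(sum(R(5:8).^2));                    % projection onto the line subspace
Paver = R(9);                                    % projection onto the "average" subspace
fprintf('R5..R9 = %.4f %.4f %.4f %.4f %.4f\n', R(5:9));
fprintf('Pline = %.4f, Paver = %.4f\n', Pline, Paver);   % gives 3.2404 and 8.0000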
2.5 Edge linking and boundary detection
• The techniques of detecting intensity discontinuities yield pixels lying only on the
boundary between regions.
• In practice, this set of pixels seldom characterizes a boundary completely because of
noise, breaks in the boundary from nonuniform illumination, and other effects that
introduce spurious intensity discontinuities.
• Edge detection algorithms are typically followed by linking and other boundary
detection procedures designed to assemble edge pixels into meaningful boundaries.
(a) Local processing
• A point (x', y') in the neighborhood of (x, y) is linked to the pixel at (x, y) if both the
following magnitude and direction criteria are satisfied:
|∇f(x', y') − ∇f(x, y)| ≤ threshold Tm
|α(x', y') − α(x, y)| ≤ threshold Td
3. Thresholding
• Thresholding is one of the most important approaches to image segmentation.
• If background and object pixels have gray levels grouped into 2 dominant modes, they
can be separated with a threshold easily.
(Figure: histogram; non-adaptive thresholding result.)
• Special cases: if T depends on
1. f(x,y) only - global threshold
2. Both f(x,y) & p(x,y) - local threshold
3. (x,y) - dynamic threshold
• Multilevel thresholding is in general less reliable, as it is difficult to establish effective
thresholds to isolate the regions of interest.
3.1 Adaptive thresholding
3.2 Threshold selection based on boundary characteristics
• A reliable threshold must be selected to identify the mode peaks of a given histogram.
• This capability is very important for automatic threshold selection in situations where
image characteristics can change over a broad range of intensity distributions.
s(x, y) = 0   if G[f(x, y)] < T
s(x, y) = 1   if G[f(x, y)] ≥ T and L[f(x, y)] ≥ 0
s(x, y) = −1  if G[f(x, y)] ≥ T and L[f(x, y)] < 0
where T is a threshold.
4.1 Region growing by pixel aggregation
• Region growing is a procedure that groups pixels or subregions into larger regions.
• Pixel aggregation starts with a set of "seed" points and from those grows regions by
appending to each seed point those neighboring pixels that have similar properties,
such as gray level, texture and color.
(Fig.: (a) Original image with seed point; (b) early stage of region growth; (c) final region.)

Example of region growing using known starting points:
Original intensity array:
0 0 5 6 7
1 1 5 8 7
0 1 6 7 7
2 0 7 6 6
0 1 5 6 5
Result of threshold = 3:
a a b b b
a a b b b
a a b b b
a a b b b
a a b b b
(Results for threshold = 5.5 and threshold = 9 appear in the corresponding figure.)

• Problems that have to be resolved:
1. Selection of initial seeds that properly represent regions of interest.
2. Selection of suitable properties for including points in the various regions during the
growing process.
3. The formulation of a stopping rule.
&<+,PDJH6HJPHQWDWLRQS &<+,PDJH6HJPHQWDWLRQS
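A minimal region-growing sketch along these lines, assuming one seed per region and "similar gray level" as the inclusion property (a threshold on the absolute difference to the seed value); it reproduces the flavour of the worked example rather than the exact figure.

import numpy as np
from collections import deque

def grow_region(image, seed, threshold):
    """Grow one region from a seed pixel, appending 4-connected neighbours
    whose gray level differs from the seed value by at most `threshold`."""
    rows, cols = image.shape
    seed_value = float(image[seed])
    region = np.zeros((rows, cols), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):   # 4-neighbours
            nx, ny = x + dx, y + dy
            if 0 <= nx < rows and 0 <= ny < cols and not region[nx, ny]:
                if abs(float(image[nx, ny]) - seed_value) <= threshold:
                    region[nx, ny] = True
                    queue.append((nx, ny))
    return region

f = np.array([[0, 0, 5, 6, 7],
              [1, 1, 5, 8, 7],
              [0, 1, 6, 7, 7],
              [2, 0, 7, 6, 6],
              [0, 1, 5, 6, 5]])
region_a = grow_region(f, seed=(0, 0), threshold=3)   # low-intensity region (the 'a' pixels)
region_b = grow_region(f, seed=(1, 3), threshold=3)   # high-intensity region (the 'b' pixels)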
4.2 Region splitting and merging

• Subdivide an image initially into a set of arbitrary, disjoint regions and then merge and/or split the regions in an attempt to satisfy the conditions stated above.
• A split-and-merge algorithm is summarized by the following procedure, in which, at each step, we:
(1) split into 4 disjoint quadrants any region Ri for which P(Ri) = false;
(2) merge any adjacent regions Rj and Rk for which P(Rj ∪ Rk) = true; and
(3) stop when no further merging or splitting is possible.

Fig.: Partitioned image (R split into R1, R2, ...) and the corresponding quadtree.

• Example (panels (a)-(d) of the split-and-merge figure):
(a) The entire image is split into 4 quadrants.
(b) Only the top left region satisfies the predicate, so it is not changed, while the other 3 quadrants are split into subquadrants.

Fig.: (a) Original image, (b) result of the split-and-merge algorithm, (c) result of thresholding (b).
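A compact sketch of the splitting half of this procedure, assuming a square image with power-of-two side length and taking P(R) to be "the region's intensity standard deviation is below a limit"; the merge pass is omitted for brevity, so this is only the quadtree-splitting skeleton, not the full algorithm from the notes.

import numpy as np

def predicate(region, limit=10.0):
    """P(R): true when the region is homogeneous enough (low standard deviation)."""
    return region.std() <= limit

def split(image, x, y, size, leaves, limit=10.0):
    """Recursively split a square region into 4 quadrants until P(R) holds
    (or the region is a single pixel); collect the accepted leaf regions."""
    region = image[x:x + size, y:y + size]
    if size == 1 or predicate(region, limit):
        leaves.append((x, y, size))
        return
    half = size // 2
    for dx in (0, half):
        for dy in (0, half):
            split(image, x + dx, y + dy, half, leaves, limit)

image = (np.random.rand(64, 64) * 255).astype(float)  # stand-in test image
leaves = []
split(image, 0, 0, 64, leaves)
# `leaves` now holds the quadtree blocks where P(R) is true; a merge pass would
# next join adjacent blocks Rj, Rk whenever P(Rj ∪ Rk) is also true.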
Estimation of the degradation function

⮚ Observation
⮚ Experimentation
⮚ Mathematical Modeling

Estimation by Image Observation
Select a small subimage gs(x,y) containing a simple structure, construct an estimate f̂s(x,y) of what the unblurred subimage should look like, and form
Hs(u,v) = Gs(u,v) / F̂s(u,v).
Based on the assumption of position invariance, we can deduce the complete degradation function H(u,v) from the characteristics of this function. For example, if the radial plot of Hs(u,v) has the approximate shape of a Gaussian curve, the same shape is used for H(u,v) on the larger scale.

Estimation by Experimentation
If equipment similar to that used to acquire the degraded image is available, an impulse (a small, bright dot of light) is imaged and the system response recorded; the degradation function is then H(u,v) = G(u,v)/A, where G(u,v) is the Fourier transform of the observed impulse response and A is a constant describing the strength of the impulse.

Estimation by Mathematical Modeling
For uniform linear motion between the image and the sensor during exposure, the blurred image is modelled as
g(x,y) = ∫₀ᵀ f[x − x₀(t), y − y₀(t)] dt
where g(x,y) is the blurred image, T is the exposure time and x₀(t), y₀(t) describe the motion.
UNIT 3: Image Restoration
Image restoration is the operation of taking a corrupt/noisy image and estimating the clean, original image. Corruption may come in many forms such as motion blur, noise and camera mis-focus.[1] Image restoration is performed by reversing the process that blurred the image; this is typically done by imaging a point source and using the point-source image, which is called the Point Spread Function (PSF), to restore the image information lost to the blurring process.
Image restoration is different from image enhancement in that the latter is designed
to emphasize features of the image that make the image more pleasing to the
observer, but not necessarily to produce realistic data from a scientific point of
view. Image enhancement techniques (like contrast stretching or de-blurring by a
nearest neighbor procedure) provided by imaging packages use no a priori model
of the process that created the image.
With image enhancement noise can effectively be removed by sacrificing some
resolution, but this is not acceptable in many applications. In a fluorescence
microscope, resolution in the z-direction is bad as it is. More advanced image
processing techniques must be applied to recover the object.
Noise models:
Gaussian Noise:
Because of its mathematical simplicity, the Gaussian noise model is often used in practice, even in situations where it is only marginally applicable. The noise is described by the probability density function
p(z) = (1 / (√(2π) σ)) exp( −(z − m)² / (2σ²) )
where m is the mean and σ² is the variance, i.e. m = ∫ z p(z) dz and σ² = ∫ (z − m)² p(z) dz.
Gaussian noise arises in an image due to factors such as electronic circuit noise and sensor noise caused by poor illumination or high temperature.
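As a quick illustration (not part of the original notes), Gaussian noise with a chosen mean and standard deviation can be added to a gray-scale image as follows; the parameter values are placeholders.

import numpy as np

def add_gaussian_noise(image, mean=0.0, sigma=15.0):
    """Add Gaussian noise N(mean, sigma^2) to an 8-bit gray-scale image."""
    noise = np.random.normal(mean, sigma, image.shape)
    noisy = image.astype(float) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)   # keep the valid 8-bit range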
Spatial filtering techniques operate directly on the pixels of an image. The mask is usually taken to be odd in size so that it has a well-defined centre pixel. This mask is moved over the image such that the centre of the mask traverses all image pixels.
The following exercises are covered below:
● To write a program in Python to implement a spatial-domain averaging filter and to observe its blurring effect on the image, without using inbuilt functions
● To write a program in Python to implement a spatial-domain median filter to remove salt-and-pepper noise, without using inbuilt functions
Theory
The mean filter is one of the techniques used to reduce noise in images. It is a local averaging operation and one of the simplest linear filters. The value of each pixel is replaced by the average of all the values in its local neighborhood. If f(i,j) is a noisy image, the smoothed image g(x,y) is obtained by
g(x,y) = (1/M) ∑_{(i,j) ∈ N(x,y)} f(i,j)
where N(x,y) is the neighborhood of (x,y) and M is the number of pixels in it.
A single pixel with a very uncommon value(outlier) can significantly affect the
mean value of all the pixels in its neighborhood.
(Order statistics, adaptive filters, notch filters, band-pass and band-reject filters, optimum notch filters, … refer notes)
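A sketch of the two exercises listed above: a 3x3 averaging filter and a 3x3 median filter written without library filtering routines (only basic NumPy array operations are used); border pixels are left unfiltered for simplicity.

import numpy as np

def mean_filter(image, k=3):
    """k x k averaging filter implemented with explicit loops."""
    pad = k // 2
    out = image.astype(float).copy()
    for x in range(pad, image.shape[0] - pad):
        for y in range(pad, image.shape[1] - pad):
            window = image[x - pad:x + pad + 1, y - pad:y + pad + 1]
            out[x, y] = window.mean()          # replace by the neighbourhood average
    return out

def median_filter(image, k=3):
    """k x k median filter, effective against salt-and-pepper noise."""
    pad = k // 2
    out = image.astype(float).copy()
    for x in range(pad, image.shape[0] - pad):
        for y in range(pad, image.shape[1] - pad):
            window = image[x - pad:x + pad + 1, y - pad:y + pad + 1]
            out[x, y] = np.median(window)      # replace by the neighbourhood median
    return out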
Restoration Filters
There are basically two classes of restoration filters.
1 Deterministic-Based
These methods ignore effects of noise and statistics of the image,
e.g., inverse filter and Least Squares (LS) filter.
2 Stochastic-Based
Statistical information of the noise and image is used to generate
the restoration filters, e.g., 2-D Wiener filter and 2-D Kalman filter.
Inverse Filter
(a) Direct Inverse Filter: Attempts to recover the original image from the
observed blurred image using an inverse system, hI (m, n), corresponding
to the blur PSF, h(m, n).
Example 1: Derive the transfer function of the LS filter and show that it is the same as the inverse filter.
Solution: The LS solution minimizes the LS cost function
J(x̂) = ∑_m ∑_n |y(m,n) − ŷ(m,n)|²
where Ŷ(k,l) = H(k,l) X̂(k,l). Taking the derivative of J with respect to X̂(k,l) gives
∂J/∂X̂(k,l) = 0  ⟹  −H*(k,l) ( Y(k,l) − H(k,l) X̂(k,l) ) = 0
which gives the same solution as the inverse filter, i.e.
X̂_LS(k,l) = Y(k,l) / H(k,l)
Question: What would you change in the cost function to yield an LS solution that is the same as that of the pseudo-inverse?
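A minimal frequency-domain sketch of the direct inverse filter, with a small ε guard so that near-zero values of H(k,l) do not blow up (a pseudo-inverse-style safeguard); the 5x5 box blur PSF at the end is only an illustrative example.

import numpy as np

def inverse_filter(blurred, psf, eps=1e-3):
    """Recover an estimate X̂ = Y / H in the frequency domain.
    Frequencies where |H| < eps are zeroed instead of divided."""
    H = np.fft.fft2(psf, s=blurred.shape)   # transfer function of the blur
    Y = np.fft.fft2(blurred)
    bad = np.abs(H) < eps
    Hsafe = H.copy()
    Hsafe[bad] = 1.0                        # placeholder to avoid division by ~0
    Xhat = Y / Hsafe
    Xhat[bad] = 0.0                         # drop the unreliable frequencies
    return np.real(np.fft.ifft2(Xhat))

psf = np.ones((5, 5)) / 25.0                # illustrative uniform (box) blur PSF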
Wiener Filter
This filter takes into account the 1st and 2nd order statistics of the noise and image to generate the restoration filter transfer function. Additionally, it provides the best linear estimate of the image based upon minimizing the MSE (i.e. MMSE). However, it assumes wide-sense stationarity of the image field. We begin with the 1-D case.

1-D Wiener Filter:
Goal: Find the g(n)'s, or FIR filter coefficients, so that z(n) is as close as possible to the desired signal d(n) by minimizing the MSE J(g) = E[e²(n)]. Then

J(g) = E[ ( d(n) − ∑_{i=0}^{N} g(i) y(n−i) )² ]

To find the optimum g(k)'s,

∂J(g)/∂g(k) = 0  ⟹  −2 E[ y(n−k) ( d(n) − ∑_{i=0}^{N} g(i) y(n−i) ) ] = 0,   k ∈ [0, N]

Thus, we get

∑_{i=0}^{N} g(i) E[y(n−i) y(n−k)] = E[d(n) y(n−k)]

or

∑_{i=0}^{N} g(i) r_yy(k−i) = r_dy(k)

Now, let d(n) = x(n) (during the training phase); then, assuming that the noise and signal are uncorrelated,

y(n) = h(n) ∗ x(n) + η(n)
r_yy(k) = y(k) ∗ y(−k)
        = ( h(k) ∗ x(k) + η(k) ) ∗ ( h(−k) ∗ x(−k) + η(−k) )
        = h(k) ∗ h(−k) ∗ r_xx(k) + r_ηη(k)
r_dy(k) = d(k) ∗ y(−k)
        = x(k) ∗ ( h(−k) ∗ x(−k) + η(−k) )
        = h(−k) ∗ r_xx(k)

Thus, we get

S_yy(e^{jΩ}) = |H(e^{jΩ})|² S_xx(e^{jΩ}) + S_ηη(e^{jΩ})
S_dy(e^{jΩ}) = H*(e^{jΩ}) S_xx(e^{jΩ})

leading to the Wiener filter transfer function,

G(e^{jΩ}) = H*(e^{jΩ}) S_xx(e^{jΩ}) / ( |H(e^{jΩ})|² S_xx(e^{jΩ}) + S_ηη(e^{jΩ}) )

Important Remarks
1. Once the transfer function of the Wiener filter is designed, the filtering can simply be done as shown in Fig. 3.
4. The general Wiener filter transfer function is

G(k,l) = H*(k,l) S_xx(k,l) / ( |H(k,l)|² S_xx(k,l) + S_ηη(k,l) )

where H(k,l) is the 2-D DFT of h(m,n), and S_xx(k,l) and S_ηη(k,l) are the power spectra of the original image x(m,n) and the additive noise η(m,n), respectively. The process is depicted in Fig. 5.

Example 2: Derive the transfer function of the Wiener filter for the above observation model (blurred image plus additive noise).
Important Remarks
1. The above transfer function can be written as

G(k,l) = H*(k,l) / ( |H(k,l)|² + S_ηη(k,l)/S_xx(k,l) )

where S_ηη(k,l)/S_xx(k,l) > 0 acts like ε in the pseudo-inverse filter, but is dependent on the SNR. Thus, the Wiener filter does not have the singularity problem of the inverse filter.
2. If S_ηη >> S_xx, G(k,l) de-emphasizes the noisy observation at that frequency. Recall that we have the opposite effect in the inverse filter; i.e. the Wiener filter does not have ill-conditioning problems.
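A minimal frequency-domain sketch of this restoration step, assuming the blur PSF is known and approximating the spectral ratio S_ηη/S_xx by a single constant K (a common simplification when the power spectra are unknown); K and the PSF below are illustrative, not values from the notes.

import numpy as np

def wiener_filter(blurred, psf, K=0.01):
    """Wiener restoration: X̂ = H* Y / (|H|^2 + K), with K ≈ Sηη/Sxx."""
    H = np.fft.fft2(psf, s=blurred.shape)
    Y = np.fft.fft2(blurred)
    G = np.conj(H) / (np.abs(H) ** 2 + K)    # Wiener transfer function
    return np.real(np.fft.ifft2(G * Y))

psf = np.zeros((5, 5))
psf[2, :] = 1.0 / 5.0                        # illustrative horizontal motion-blur PSF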
Figs. 6(a)-(d) show original, blurred (motion blur size 5), blurred and
noisy (SNR=7 dB), and finally the Wiener filtered Baboon. Note: even
some of the whisker details are restored.
Wiener filter assumes WSS of the image field which leads to smearing of
the edges and subtle textural features. This can be circumvented by
recursive Kalman Filter (KF). KF requires:
1 1-D or 2-D AR/ARMA model representation of the image field and
covariance of the noise.
2 1-D or 2-D state and observation equations with states being image
pixels to estimate.
3 Recursive implementation of the filtering equations.
Remark: For WSS cases, KF becomes the same as Wiener filter.
Here, we discuss one type of recursive image restoration using Strip
Kalman Filter (Azimi-Sadjadi, IEEE Trans. CAS, June 1989).
Note that the model coefficient matrices can be identified using the vector Yule-Walker equation. To see this, let's form the covariance matrix of the data from the vector AR model above. This yields

[ Rz(0)     Rz^T(1)    ···   Rz^T(M)    ] [  I_W ]   [ Re ]
[ Rz(1)     Rz(0)      ···   Rz^T(M−1)  ] [ −A1  ]   [ 0  ]
[   ⋮          ⋮        ⋱       ⋮       ] [  ⋮   ] = [ ⋮  ]
[ Rz(M)     Rz(M−1)    ···   Rz(0)      ] [ −AM  ]   [ 0  ]

Here Rz(−m) = Rz^T(m). Solving this vector Yule-Walker equation leads to the parameter matrices of the AR model. The sample data covariance matrices R̂z(m) in this equation are estimated from the training images using

R̂z(m) = (1/N) ∑_{k=0}^{N−1} z(k) z^T(k−m).
Next, we form the state equation by defining the state vector (size W·M × 1) as

x(k) = [ z^T(k)  z^T(k−1)  ···  z^T(k−M+1) ]^T

Then the state equation (in canonical form) is

x(k) = F x(k−1) + G e(k)

where

    [ A1^T   A2^T   ···   ···   AM^T ]        [ I_W ]
    [ I_W     0     ···   ···    0   ]        [  0  ]
F = [  ⋮      ⋱                  ⋮   ] ,  G = [  ⋮  ]
    [  0      0    ···    I_W    0   ]        [  0  ]

The observation vector y(k) is corrupted by the additive noise vector n(k), with E[n(k)] = 0 and E[n(k) n^T(j)] = Rn δ(k−j), Rn being the covariance matrix of n(k). In vector form, the observation equation becomes

y(k) = H x(k) + n(k)

where H = [ H0 ··· HM−1 ].
Figs. 8(a) and (b) show the noisy (additive WGN) Baboon with SNR_input = −1.7 dB and the strip-KF-processed Baboon with SNR_output = 2.83 dB (i.e. more than 4.5 dB improvement). A vector AR(1) model was used with W = 64. Clearly, the results are impressive.
The function of a band-pass filter is opposite to that of a band-reject filter: it allows a specific frequency band of the image to pass and blocks the rest of the frequencies. The transfer function of a band-pass filter can be obtained from a corresponding band-reject filter with transfer function Hbr(u,v) using the equation
Hbp(u,v) = 1 − Hbr(u,v)
Fig 3.4.2: Perspectives plot of ideal, Butterworth , Gaussian and Notch Filter.
(Source: D,E. Dudgeon and RM. Mersereau, Multidimensional Digital Signal
Processing‘, Prentice Hall Professional Technical Reference, 1990.-Page-335)
These filters cannot usually be applied directly to an image, because they may remove too much image detail, but they are effective in isolating the effect of selected frequency bands on an image.
Image Registration is the process of estimating an optimal transformation between two
images.
• Sometimes also known as “Spatial Normalization”.
The image registration process is an automated or manual operation that attempts to discover
matching spots between two photos and spatially align them to minimise the desired error, i.e.
a uniform proximity measurement between two images. Medical sciences, remote sensing,
and computer vision all use image registration.
It could be said that Image registration is the process of calculating spatial transforms which
align a set of images to a common observational frame of reference, often one of the images
in the set. Registration is a key step in any image analysis or understanding task where
different sources of data must be combined. During the registration process, two situations
become evident:
1. Establishing the correspondence between the two images, known as the matching problem; this is also the most time-consuming step of the algorithm's execution.
2. The three-dimensional information of one of the images must be transformed, in terms of its coordinate system, so that it relates to the image chosen as the reference.
There are major four steps that every method of image registration has to go through for
image alignment. These could be listed as follows:
● Feature detection: A domain expert detects salient and distinctive objects (closed
boundary areas, edges, contours, line intersections, corners, etc.) in both the reference
and sensed images.
● Feature matching: It establishes the correlation between the features in the reference
and sensed images. The matching approach is based on the content of the picture or
the symbolic description of the control point-set.
● Estimating the transform model: The parameters and kind of the so-called mapping
functions are calculated, which align the detected picture with the reference image.
● Image resampling and transformation: The detected image is changed using
mapping functions.
Image registration methods are broadly classified into two types: area-based approaches and feature-based methods. When significant features are lacking in the images and the distinguishing information is provided by gray levels/colours rather than by local shapes and structure, area-based approaches are preferred.
When local structural information is more significant than the information carried by the image intensities, feature-based matching algorithms are used. Image characteristics produced by a feature extraction technique are used in these procedures. These two classes can be further divided into various methods; the correlation-based (area-based) approach is outlined next.
If the template fits the image, the cross-correlation attains its peak. Because the measure can be influenced by the local picture intensity, the cross-correlation is usually normalized.
The key disadvantages of correlation approaches are the flatness of the similarity-measure maximum (owing to the self-similarity of the pictures) and the high processing complexity. The maximum can be sharpened by pre-processing or by applying edge or vector correlation.
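A small sketch of normalized cross-correlation for area-based matching, written with plain NumPy; the template and search image are assumed to be gray-scale arrays, and the loop form is chosen for clarity rather than speed.

import numpy as np

def normalized_cross_correlation(image, template):
    """Slide the template over the image and return the NCC surface;
    the peak location indicates the best match."""
    th, tw = template.shape
    t = template.astype(float) - template.mean()
    t_norm = np.sqrt(np.sum(t ** 2))
    out_h = image.shape[0] - th + 1
    out_w = image.shape[1] - tw + 1
    ncc = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + th, j:j + tw].astype(float)
            w = window - window.mean()
            denom = np.sqrt(np.sum(w ** 2)) * t_norm
            ncc[i, j] = np.sum(w * t) / denom if denom > 0 else 0.0
    return ncc

# peak = np.unravel_index(np.argmax(ncc), ncc.shape) gives the best-match position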
Before a projection operation, an (x,y,z) vertex represents a location in three-dimensional space. After a projection, the meaning of the values has changed.
Multiplanar Reformation
MPR is an image processing technique which extracts two-dimensional (2D) slices from a 3D volume using arbitrarily positioned orthogonal or oblique planes [8]. Although it is still a 2D method, it has the advantages of ease of use, high speed, and no information loss. The observer can display a structure of interest in any desired plane within the data set, and four-dimensional (4D) MPR can be performed in real time using graphics hardware [9]. In addition, to accurately visualize tubular structures such as blood vessels, curved MPR may be employed to sample a given artery along a predefined curved anatomic plane [10]. Curved MPR is an important visualization method for patients with bypass grafts and tortuous coronary arteries.
Volume rendering:
DVR displays the entire 3D dataset as a 2D image, without computing any intermediate geometry representations [27-29]. This algorithm can be further divided into image-space DVR, such as software-based [30,31] and GPU-based ray casting [32], and object-space DVR, such as splatting [33,34], shell rendering [35], texture mapping (TM) [36], and cell projection [37,38]. Shear-warp can be considered as a combination of these two categories. In addition, MIP [39], minimum intensity projection (MinIP), and X-ray projection [40] are also widely used methods for displaying 3D medical images. The discussion here focuses on DVR, and the datasets considered are assumed to be represented on cubic and uniform rectilinear grids, such as are provided by standard 3D medical imaging modalities.
Data compression is defined as the process of encoding data using a representation that reduces the overall size of the data. The reduction is possible when the original data set contains some type of redundancy, and compression is achieved by eliminating these redundancies. Three basic types of data redundancy can be identified and reduced in digital images: coding redundancy, interpixel redundancy and psychovisual redundancy. Coding redundancy is eliminated by Huffman coding. The discrete cosine transform is used because of its decorrelation and energy-compaction properties. The image compression algorithm takes a gray-scale image as input. For lossless compression we use Huffman coding, and for lossy compression we use the discrete cosine transform (DCT). The method provides a high compression ratio (CR) for medical images with no loss of diagnostic quality. The algorithm can also be called a hybrid scheme because it combines a lossy and a lossless technique. We use the two-dimensional discrete cosine transform for image compression. Due to the nature of most images, the maximum energy (information) lies in the low frequencies as opposed to the high frequencies. The zero-frequency component is called the DC coefficient and the remaining (higher-frequency) components are called AC coefficients. We can represent the high-frequency components coarsely, or drop them altogether, without strongly affecting the quality of the resulting image reconstruction. This leads to a lot of (lossy) compression.
The input image is first resized to 256 rows and 256 columns and then padded if required. In this technique, block processing is done before the discrete cosine transform: the DCT is applied to 8x8 pixel blocks of the image. Hence, if the input image is 256x256 pixels in size, we break it into 32 × 32 = 1024 square blocks of size 8x8 and treat each block independently.
Transform Coding: In transform coding, a block of correlated pixels is transformed into a set of less
correlated coefficients. The transform to be used for data compression should satisfy two objectives.
Firstly, it should provide energy compaction: i.e. the energy in the transform coefficients should be
concentrated to as few coefficients as possible. This is referred to as the energy compaction property
of the transform. Secondly, it should minimize the statistical correlation between the transform
coefficients. As consequence transform coding has a good capability of data compression, because
not all transform coefficients need to be transmitted in order to obtain good image quality and even
those that are transmitted need not be represented with full accuracy in order to obtain good image
quality. In addition the transform domain coefficients are generally related to the spatial frequencies
in the image and hence the compression techniques can exploit the psychovisual properties of the
HVS, by quantizing the higher frequency coefficients more coarsely, as the HVS is more sensitive to
the lower frequency coefficients [2]. The Discrete Cosine Transform: The discrete cosine transform
(DCT) represents an image as a sum of sinusoids of varying magnitudes and frequencies. In DCT for
image, most of the visually significant information about the image is concentrated in just a few
coefficients of the DCT[19]. For this reason, the DCT is often used in image compression applications.
For example, the DCT is at the heart of the international standard lossy image compression algorithm
known as JPEG. The important feature of the DCT is that it takes correlated input data and
concentrates its energy in just the first few transform coefficients. If the input data consists of
correlated quantities, then most of the n transform coefficients produced by the DCT are zeros or
small numbers, and only a few are large (normally the first ones). The early coefficients contain the
important (low-frequency) image information and the later coefficients contain the less-important
(high-frequency) image information. Compressing data with the DCT is therefore done by quantizing
the coefficients.
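A short sketch of the 8x8 block-DCT idea described above, using an explicitly constructed orthonormal DCT-II matrix (so no transform library is needed); the "keep only the top-left coefficients" rule is a crude stand-in for a JPEG-style quantization table, used here purely for illustration.

import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    C = np.zeros((n, n))
    for k in range(n):
        for m in range(n):
            C[k, m] = np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0, :] *= np.sqrt(1.0 / n)
    C[1:, :] *= np.sqrt(2.0 / n)
    return C

def compress_block(block, C, keep=4):
    """2-D DCT of an 8x8 block, keep only the low-frequency keep x keep
    corner of coefficients, then inverse-transform."""
    coeffs = C @ block @ C.T            # forward 2-D DCT
    mask = np.zeros_like(coeffs)
    mask[:keep, :keep] = 1.0            # retain low-frequency coefficients only
    return C.T @ (coeffs * mask) @ C    # inverse 2-D DCT

def block_dct_compress(image, keep=4):
    """Apply compress_block to every 8x8 block of a 256x256 image."""
    C = dct_matrix(8)
    out = np.zeros(image.shape, dtype=float)
    for i in range(0, image.shape[0], 8):
        for j in range(0, image.shape[1], 8):
            out[i:i + 8, j:j + 8] = compress_block(
                image[i:i + 8, j:j + 8].astype(float), C, keep)
    return out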
In the medical imaging field, computer-aided detection (CADe) or computer-aided diagnosis (CADx) refers to computer-based systems that help doctors take decisions swiftly. Medical imaging deals with image information that medical practitioners and doctors have to evaluate and analyze for abnormalities in a short time. Analysis of medical images is a crucial task because imaging is the basic modality for diagnosing diseases at the earliest stage, yet image acquisition must not harm the human body. Imaging techniques like MRI, X-ray, endoscopy, ultrasound, etc. would provide good-quality images if acquired with high energy, but high energy would harm the human body; hence images are acquired at lower energy and are therefore poorer in quality and lower in contrast. CAD systems are used to improve the quality of the image, which helps to interpret medical images correctly and to process the images so as to highlight their conspicuous parts.
CAD is a technology which includes multiple elements such as concepts of artificial intelligence (AI), computer vision, and medical image processing. The main application of a CAD system is finding abnormality in the human body. Among these, detection of tumors is a typical application, because a tumor missed in basic screening can develop into cancer.
The main goal of CAD systems is to identify abnormal signs at the earliest stage, including those a human professional might fail to find. In mammography, examples include identification of small lumps in dense tissue, finding architectural distortion, and predicting the type of a mass as benign or malignant from its shape, size, etc.
The Discrete Wavelet Transform (DWT) is based on sub-band coding. In the DWT, the signal to be analyzed is passed through filters with different cutoff frequencies at different scales. Wavelets can be realized by iteration of filters with rescaling. The resolution of the signal, which is the measure of the amount of detail information in the signal, is determined by the filtering operations, and the scale is determined by up-sampling and down-sampling. The DWT is computed by successive low-pass and high-pass filtering of the discrete time-domain signal. Images are treated as two-dimensional signals: they change horizontally and vertically, so 2D wavelet analysis must be used for images. 2D wavelet analysis uses the same 'mother wavelets' but requires an additional step at each level of decomposition. In 2D, the images are considered to be matrices with N rows and M columns. At every level of decomposition the horizontal data (rows) are filtered, and then the approximation and details produced from this are filtered on columns. At every level, four sub-images are obtained: the approximation, the vertical detail, the horizontal detail and the diagonal detail.
Thresholding: Once the DWT is performed, the next task is thresholding, which means neglecting certain wavelet coefficients at levels 1 to N. There are two types of threshold: (a) hard threshold; (b) soft threshold. When a hard threshold is applied, the coefficients below the threshold level are zeroed; the output after hard thresholding is defined by
y = x if |x| ≥ T,  y = 0 if |x| < T
where T is the threshold.
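A minimal single-level sketch of this decompose, hard-threshold, reconstruct idea, using a hand-written 2D Haar transform so that no wavelet library is required; it assumes an image with even dimensions, and the threshold value is illustrative.

import numpy as np

def haar2d(x):
    """One level of a 2-D Haar DWT: returns the approximation and three
    detail sub-images."""
    a = (x[:, 0::2] + x[:, 1::2]) / 2.0   # row low-pass
    d = (x[:, 0::2] - x[:, 1::2]) / 2.0   # row high-pass
    LL = (a[0::2, :] + a[1::2, :]) / 2.0
    LH = (a[0::2, :] - a[1::2, :]) / 2.0
    HL = (d[0::2, :] + d[1::2, :]) / 2.0
    HH = (d[0::2, :] - d[1::2, :]) / 2.0
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    """Inverse of haar2d."""
    a = np.zeros((LL.shape[0] * 2, LL.shape[1]))
    d = np.zeros_like(a)
    a[0::2, :], a[1::2, :] = LL + LH, LL - LH
    d[0::2, :], d[1::2, :] = HL + HH, HL - HH
    x = np.zeros((a.shape[0], a.shape[1] * 2))
    x[:, 0::2], x[:, 1::2] = a + d, a - d
    return x

def hard_threshold(c, T):
    """Zero every coefficient whose magnitude is below the threshold T."""
    return np.where(np.abs(c) >= T, c, 0.0)

def denoise(image, T=10.0):
    LL, LH, HL, HH = haar2d(image.astype(float))
    return ihaar2d(LL, hard_threshold(LH, T),
                   hard_threshold(HL, T), hard_threshold(HH, T))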
Feature Extraction uses an object-based approach to classify imagery, where an object (also
called segment) is a group of pixels with similar spectral, spatial, and/or texture attributes.
Traditional classification methods are pixel-based, meaning that spectral information in each
pixel is used to classify imagery. With high-resolution panchromatic or multispectral imagery,
an object-based method offers more flexibility in the types of features to extract.
The workflow involves the following steps:
● Dividing an image into segments
● Computing various attributes for the segments
● Creating several new classes
● Interactively assigning segments (called training samples) to each class
● Classifying the entire image with a K Nearest Neighbor (KNN), Support Vector
Machine (SVM), or Principal Components Analysis (PCA) supervised
classification method, based on your training samples.
● Exporting the classes to a shapefile or classification image.
What is AUC?
Area Under the Curve, aka the AUC, is another important evaluation metric
generally used for binary classification problems. AUC represents the degree
of separability. The higher the AUC, the better the classifier can distinguish
between the positive and negative classes.
The graph below demonstrates the area under the curve. AUC is used to
summarize the ROC Curve as it measures the entire 2D area present
underneath the ROC curve.
The value of AUC ranges from 0 to 1. If AUC=1, the model can distinguish
between the Positive and the Negative class points correctly.
Conversely, a model that makes 100% wrong predictions, i.e., predicting all Negatives as Positives and all Positives as Negatives, would have AUC = 0.
When AUC = 0.5, the model possesses no class-separation capacity and cannot distinguish between the Positive and Negative class points.
The Sensitivity and Specificity of a classifier trade off against each other: as the decision threshold is varied, when Sensitivity increases, Specificity decreases, and vice versa.
Confusion matrix:
A confusion matrix is a performance measurement for machine learning classification problems where the output can be two or more classes. Performance measurement is an essential task in machine learning, so when it comes to a classification problem, we can count on the confusion matrix and the AUC-ROC curve.
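A brief sketch of computing these metrics with scikit-learn, assuming binary ground-truth labels and predicted scores are available; the arrays below are made-up toy data.

import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])                     # ground-truth labels (toy data)
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.6])   # classifier scores
y_pred = (y_score >= 0.5).astype(int)                            # hard decisions at threshold 0.5

cm = confusion_matrix(y_true, y_pred)   # rows: true class, columns: predicted class
auc = roc_auc_score(y_true, y_score)    # area under the ROC curve
print(cm)
print(auc)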
Denoising
Medical imaging modalities are susceptible to noise, which introduces random intensity fluctuations
in an image. To reduce noise, you can filter images in the spatial and frequency domains.
Resampling
Use resampling to change the pixel or voxel size of an image without changing its spatial limits in the
patient coordinate system. Resampling is useful for standardizing image resolution across a data set
that contains images from multiple scanners.
Intensity Normalization
Intensity normalization standardizes the range of image intensity values across a data set. Typically,
you perform this process in two steps. First, clip intensities to a smaller range. Second, normalize the
clipped intensity range to the range of the image data type, such as [0, 1] for double or [0, 255]
for uint8. Whereas visualizing image data using a display window changes how you view the data,
intensity normalization actually updates the image values.
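A tiny sketch of the two-step normalization described above (clip, then rescale), assuming a floating-point image; the clipping bounds in the usage comment are illustrative values only.

import numpy as np

def normalize_intensity(image, low, high):
    """Clip intensities to [low, high], then rescale the result to [0, 1]."""
    clipped = np.clip(image.astype(float), low, high)
    return (clipped - low) / (high - low)

# e.g. normalized = normalize_intensity(scan_slice, low=-1000, high=1000)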
Preprocessing of retinal images:
Segmentation of liver:
Segmentation of ROI: Lung Nodules
Segmentation of ROI blood vessels:
Tumour detection:
Segmentation of lesions:
Kidney ultrasound image processing:
Unit 1: Question Bank
Part A
1. Define 4 neighbours of a pixel.
N4 (p) : 4-neighbors of p.
• Any pixel p(x, y) has two vertical and two horizontal neighbors, given by (x+1, y), (x−1, y), (x, y+1), (x, y−1).
• This set of pixels is called the 4-neighbors of p, and is denoted by N4(p).
• Each of them is at a unit distance from p.
2. Define a path.
A digital path (or curve) from pixel p with coordinates (x,y) to pixel q with coordinates (s,t) is a sequence of distinct pixels with coordinates (x0, y0), (x1, y1), ..., (xn, yn), where (x0, y0) = (x,y), (xn, yn) = (s,t), and pixels (xi, yi) and (xi−1, yi−1) are adjacent for 1 ≤ i ≤ n.
3. Name the elements of visual perception.
Structure of the eye, image formation in the eye, brightness adaptation and
discrimination.
4. Define a pixel.
Pixel is the smallest unit of a digital graphic which can be illuminated on a display
screen and a set of such illuminated pixels form an image on screen. A pixel is usually
represented as a square or a dot on any display screen like a mobile, TV, or computer
monitor.
5. Define a voxel.
In 3D printing, we can define a voxel as a value on a grid in a three-dimensional
space, like a pixel with volume. Each voxel contains certain volumetric information
which helps to create a three dimensional object with required properties.
6. What is image acquisition?
The image is captured by a camera and digitized (if the camera output is not
digitized automatically) using an analogue-to-digital converter for further processing in a
computer.
7. State the significance of wavelets in image processing
Wavelets are the building blocks for representing images at various degrees of resolution. Images are subdivided successively into smaller regions for data compression and for pyramidal representation.
8. Define hue.
The Hue component describes the color itself in the form of an angle between 0 and 360 degrees: 0 degrees means red, 120 degrees means green and 240 degrees means blue; 60 degrees is yellow and 300 degrees is magenta.
9. Define Intensity.
An image is defined as a two-dimensional function f(x, y) the amplitude of f at any
pair of coordinates (x, y) is called the intensity or gray level of the image at that point.
10. What is the meaning of brightness and contrast in image processing?
Brightness increases the overall lightness of the image—for example, making dark colors
lighter and light colors whiter—while contrast adjusts the difference between the darkest
and lightest colors.
Part B
1. Elaborate on the components of image processing system.
2. Elucidate on the elements of visual perception.
3. Write a detailed note on image quality with respect to signal to noise ratio.
4. Give an account of arithmetic and logical operations done on pixels.
5. Explain in detail about the relationship between pixels.
6. Provide an account of the purpose and use of DFT and DCT on images.
Part C
1. Give an account of discrete sampling and quantization.
2. Provide a brief account of KLT on images.
Unit 2: Question Bank
Part A
1. What is the application of homomorphic filtering?
a.Homomorphic filter is used for image enhancement.
b.It simultaneously normalizes the brightness across an image and increases contrast.
c.It is also used to remove multiplicative noise.
2. What is spatial filtering?
Spatial filtering is a process by which we can alter properties of an optical image by
selectively removing certain spatial frequencies that make up an object, for example, filtering
video data received from satellite and space probes, or removal of raster from a television
picture or scanned image.
3.Define a histogram.
An image histogram is a gray-scale value distribution showing the frequency of
occurrence of each gray-level value. For an image size of 1024 × 1024 × 8 bits, the
abscissa ranges from 0 to 255; the total number of pixels is equal to 1024 × 1024.
4. Differentiate between smoothing and sharpening filters.
While linear smoothing is based on the weighted summation or integral operation on the
neighborhood, the sharpening is based on the derivative (gradient) or finite difference.
5. Define image enhancement.
Image enhancement is the process of digitally manipulating a stored image using
software. The tools used for image enhancement include many different kinds of software
such as filters, image editors and other tools for changing various properties of an entire
image or parts of an image.
6. What are hybrid filters?
The hybrid filter takes advantage of the image decomposition and reconstruction
processes of the MMWT where reconstruction of specific subimages is used to selectively
enhance the masses and separate the background structures.
7. What is frequency domain filtering?
Filtering in the frequency domain consists of modifying the Fourier transform of an
image and then computing the inverse transform to obtain the processed result.
8. What is the significance of Fourier Transform in image processing?
In Image processing, the Fourier Transform tells you what is happening in the image in
terms of the frequencies of those sinusoidal. For example, eliminating high frequencies
blurs the image. Eliminating low frequencies gives you edges.
9. Define power law transformation.
Power-law transformation lets us obtain both logarithmic-like and exponential-like behaviour simply by choosing the value of the power. If the power is less than one,
then it behaves like a logarithmic transformation. If the power is more than one, it behaves
like an exponential transformation.
10. Define piecewise linear transformation.
Piece-wise Linear Transformation is type of gray level transformation that is used for
image enhancement. It is a spatial domain method. It is used for manipulation of an image
so that the result is more suitable than the original for a specific application.
Part B
1. Give an in detailed account of smoothing and sharpening filters.
2. Elucidate on homomorphic filtering.
3. Elucidate on medical image enhancement using hybrid filters.
4. Elaborate on frequency domain filtering.
5. With an example, explain power law transformation.
6. Explain in detail piecewise linear transformation.
Part C
1. With an example elaborate on histogram equalization technique.
2. With an example, explain in detail histogram matching technique.
Unit 3: Question Bank
Part A
1. Define region of interest.
A region of interest (ROI) is a portion of an image that you want to filter or operate on
in some way. You can represent an ROI as a binary mask image. In the mask image,
pixels that belong to the ROI are set to 1 and pixels outside the ROI are set to 0 .
2. What are active contours?
Active contour is a segmentation method that uses energy forces and constraints to
separate the pixels of interest from a picture for further processing and analysis.
3. What are the three methods of estimating degradation function?
● Observation
● Experimentation
● Mathematical modelling
4. What is image restoration?
Image restoration is the process of recovering an image from a degraded
version—usually a blurred and noisy image. Image restoration is a fundamental
problem in image processing, and it also provides a testbed for more general inverse
problems.
5. Differentiate between image enhancement and restoration.
Enhancement aims to improve the visual appearance of an image without changing its content, while restoration aims to recover the original content of an image that has been degraded or damaged.
6. How is periodic noise removed?
Periodic noise can be reduced significantly via frequency-domain filtering, for example by using a notch reject filter with an appropriate radius to completely enclose the noise spikes in the Fourier domain. The notch filter rejects frequencies in predefined neighborhoods around a center frequency.
7. Define image segmentation.
Image segmentation is a commonly used technique in digital image processing and
analysis to partition an image into multiple parts or regions, often based on the
characteristics of the pixels in the image.
8. What is the purpose of noise models in image processing?
Noise is always present in digital images during the acquisition, coding, transmission, and processing steps. Noise is very difficult to remove from digital images without prior knowledge of the noise model. That is why a review of noise models is essential in the study of image denoising techniques.
9. Define inverse filtering.
Inverse filtering is a technique used in signal processing and image processing to
recover an original signal or image from a degraded or distorted version of it. It's
based on the idea of reversing the effects of a known filter or degradation process.
10. Define Wiener filtering.
The Wiener filter can be used to filter out the noise from the corrupted signal to
provide an estimate of the underlying signal of interest.
Part B
1. Explain in detail noise models and image restoration.
2. How is periodic noise removed?
3. Elaborate on invariant degradation.
4. Explain in detail inverse filtering.
5. Explain in detail Wiener filtering.
Part C
1. Explain edge linking and boundary detection.
2. Explain in detail various active contour models.
UNIT 4: Question bank
Part A
1. Define image registration.
Image registration is the process of transforming different sets of data into a single
unified coordinate system, and can be thought of as aligning images so that
comparable characteristics can be related easily. It involves mapping points from one
image to corresponding points in another image.
Principal axes for a body. For the cylinder the principal axes are represented by the
frame O2 positioned at the mass centre of the body. In this case each of the principal planes is
a plane of symmetry for the body.
Part B
1. Explain in detail rigid body transformation of an image.
2. Explain in detail principal axes registration.
3. Explain in detail volume visualization of images.
Part C
1. Explain in detail feature based visualization of images.
2. Explain orthogonal and perspective projection in medicine.
UNIT 5: Question bank
Part A
1. What is the need of image compression?
The objective of image compression is to reduce irrelevance and redundancy of the
image data to be able to store or transmit data in an efficient form. It is concerned
with minimizing the number of bits required to represent an image. Image
compression may be lossy or lossless.
2. Differentiate between lossy and lossless compression.
Lossy compression reduces file size by permanently removing some of the original data. Lossless compression reduces file size by removing statistical redundancy, so the original data can be reconstructed exactly.
3. What is wavelet transform based image compression?
Wavelets allow one to compress an image into less storage space while retaining more of the image detail.
4. What are the basic preprocessing steps in image processing?
Read image.
Resize image.
Remove noise (denoise)
Segmentation.
Morphology (smoothing edges)
Part B
1. Briefly explain the concept of mammogram in image processing.
2. How is image preprocessing done?
3. Describe the salient features of ROI of blood vessels, tumour.
Part C
1. Explain in detail about computer aided diagnosis system.
2. Explain in detail about feature extraction in medical images.