Digital Image Processing Guide

MEDICAL IMAGE PROCESSING REGULATION 2021

Overview of Image Processing system:

1. Image Acquisition

This is the first fundamental step in digital image processing. Image acquisition
could be as simple as being given an image that is already in digital form.
Generally, the image acquisition stage also involves pre-processing, such as
scaling.

2. Image Enhancement

Image enhancement is among the simplest and most appealing areas of digital
image processing. Basically, the idea behind enhancement techniques is to bring
out detail that is obscured, or simply to highlight certain features of interest in
an image, for example by adjusting brightness and contrast.

3. Image Restoration

Image restoration is an area that also deals with improving the appearance of an
image. However, unlike enhancement, which is subjective, image restoration is
objective, in the sense that restoration techniques tend to be based on
mathematical or probabilistic models of image degradation.

4. Color Image Processing


Color image processing is an area that has been gaining importance because
of the significant increase in the use of digital images over the Internet. It
includes color modeling and processing in a digital domain.

5. Wavelets and Multi-Resolution Processing

Wavelets are the foundation for representing images in various degrees of
resolution. Images are subdivided successively into smaller regions for data
compression and for pyramidal representation.

6. Compression

Compression deals with techniques for reducing the storage required to save an
image or the bandwidth needed to transmit it. Compression is particularly
important for images that are stored or transmitted over the Internet.

7. Morphological Processing

Morphological processing deals with tools for extracting image components that
are useful in the representation and description of shape.

8. Segmentation

Segmentation procedures partition an image into its constituent parts or objects.
In general, autonomous segmentation is one of the most difficult tasks in digital
image processing. A rugged segmentation procedure brings the process a long
way toward successful solution of imaging problems that require objects to be
identified individually.

9. Representation and Description

Representation and description almost always follow the output of a
segmentation stage, which usually is raw pixel data, constituting either the
boundary of a region or all the points in the region itself. Choosing a
representation is only part of the solution for transforming raw data into a form
suitable for subsequent computer processing. Description deals with extracting
attributes that result in some quantitative information of interest or are basic for
differentiating one class of objects from another.

10. Object recognition


Recognition is the process that assigns a label, such as "vehicle", to an object
based on its descriptors.

11. Knowledge Base

Knowledge may be as simple as detailing regions of an image where the
information of interest is known to be located, thus limiting the search that has
to be conducted in seeking that information. The knowledge base also can be
quite complex, such as an interrelated list of all major possible defects in a
materials inspection problem or an image database containing high-resolution
satellite images of a region in connection with change-detection applications.

Elements of Visual Perception:


The field of digital image processing is built on a foundation of mathematical
and probabilistic formulation, but human intuition and analysis play the main
role in choosing between the various techniques, and that choice is usually
made on subjective, visual judgements.
In human visual perception, the eyes act as the sensor or camera, neurons act as
the connecting cable and the brain acts as the processor.
The basic elements of visual perceptions are:

1. Structure of Eye
2. Image Formation in the Eye
3. Brightness Adaptation and Discrimination
Structure of Eye:
The human eye is a slightly asymmetrical sphere with an average diameter of
20 mm to 25 mm and a volume of about 6.5 cc. The eye works much like a
camera: the external object is imaged in the same way that a camera takes a
picture. Light enters the eye through a small opening called the pupil, a
black-looking aperture that contracts when the eye is exposed to bright light,
and is focused on the retina, which acts like camera film.
The lens, iris, and cornea are nourished by a clear fluid, the aqueous humour,
which fills the anterior chamber. The fluid flows from the ciliary body to the
pupil and is absorbed through the channels in the angle of the anterior chamber.
The delicate balance of aqueous production and absorption controls the pressure
within the eye.
The eye contains between 6 and 7 million cones, which are highly sensitive to
colour. Humans perceive coloured images in daylight because of these cones.
Cone vision is also called photopic or bright-light vision.
Rods are far more numerous, between 75 and 150 million, and are distributed
over the retinal surface. Rods are not involved in colour vision and are
sensitive to low levels of illumination.
Image Formation in the Eye:
The image is formed when the lens of the eye focuses an image of the outside
world onto a light-sensitive membrane at the back of the eye called the retina.
The lens focuses light on the photoreceptive cells of the retina, which detect
photons of light and respond by producing neural impulses.

The distance between the lens and the retina is about 17 mm, and the focal
length ranges from approximately 14 mm to 17 mm.
Brightness Adaptation and Discrimination:
Digital images are displayed as a discrete set of intensities. The eye's ability to
discriminate between black and white at different intensity levels is an important
consideration in presenting image processing results.
The range of light intensity levels to which the human visual system can adapt
is of the order of 10^10, from the scotopic threshold to the glare limit. In
photopic vision alone, the range is about 10^6.

Pixels:
A pixel is the smallest unit of a digital graphic that can be illuminated on a
display screen; a set of such illuminated pixels forms an image on the screen. A
pixel is usually represented as a square or a dot on a display screen such as a
mobile phone, TV, or computer monitor. Pixels can be called the building blocks
of a digital image and can be controlled to accurately show the desired picture.

The quality of a picture, in terms of clarity, size and colour reproduction, is
largely controlled by the number and density of pixels in the display: the higher
the resolution, the smaller the pixel and the better the clarity, and vice versa.

Each pixel has a unique geometric co-ordinate, dimensions (length and breadth),
a size (eight bits or more), and the ability to display a multitude of colours.

Voxels:
Voxels can be defined most simply as volumetric pixels. In 3D printing and 3D
imaging, a voxel is a value on a grid in three-dimensional space, like a pixel
with volume. Each voxel contains volumetric information that helps to create a
three-dimensional object with the required properties.

A voxel is the smallest distinguishable element of a 3D printed object and
represents a certain grid value. However, unlike a pixel, a voxel does not have a
specific absolute position in three-dimensional space: voxels are not bound by
absolute coordinates but are defined by their position relative to the surrounding
voxels. We can liken voxels to bricks, where the position of a brick is defined
by the relative position of the neighbouring bricks.

One important property of voxels is repeatability: voxels have a defined shape
and size and can be stacked on each other to create a 3D object.

Understanding color models


You need a precise method to define colors. Color models provide various
methods to define colors, each model defining colors through the use of specific
color components. There is a range of color models to choose from when
creating graphics.

CMYK color model


The CMYK color model, which is used in printing, uses the components cyan
(C), magenta (M), yellow (Y), and black (K) to define color. Values for these
components range from 0 to 100 and represent percentages.
In subtractive color models, such as CMYK, color (that is, ink) is added to a
surface, such as white paper. The color then “subtracts” brightness from the
surface. When the value of each color component (C,M,Y) is 100, the resulting
color is black. When the value of each component is 0, no color is added to the
surface, so the surface itself is revealed —in this case, the white paper. Black
(K) is included in the color model for printing purposes because black ink is
more neutral and darker than blending equal amounts of C, M, and Y. Black ink
produces sharper results, especially for printed text. In addition, black ink is
usually less expensive than using colored ink.

Black is the result of combining the three CMY colors at their highest
intensities.

RGB color model


The RGB color model uses the components red (R), green (G), and blue (B) to
define the amounts of red, green, and blue light in a given color. In a 24-bit
image, each component is expressed as a number from 0 to 255. In an image
with a higher bit depth, such as a 48-bit image, the value range is greater. The
combination of these components defines a single color.
In additive color models, such as RGB, color is produced from transmitted light.
RGB is therefore used on monitors, where red, blue, and green lights are
blended in various ways to reproduce a wide range of colors. When red, blue,
and green lights are combined at their maximum intensities, the eye perceives
the resulting color as white. In theory, the colors are still red, green and blue, but
the pixels on a monitor are too close together for the eye to differentiate the
three colors. When the value of each component is 0, light is absent and the
eye perceives the color as black.
White is the result of combining the three RGB colors at their maximum
intensities.
RGB is the most commonly used color model, because it allows a broad range
of colors to be stored and displayed.

HSB color model


The HSB color model uses hue (H), saturation (S), and brightness (B) as
components for defining color. HSB is also known as HSV (with the
components hue, saturation, and value). Hue describes the pigment of a color
and is expressed in degrees to represent the location on the standard color
wheel. For example, red is 0 degrees, yellow is 60 degrees, green is 120
degrees, cyan is 180 degrees, blue is 240 degrees, and magenta is 300 degrees.
Saturation describes the vividness or dullness of a color. Values of saturation
range from 0 to 100 and represent percentages (the higher the value, the more
vivid the color). Brightness describes the amount of white in the color. Like
saturation values, brightness values range from 0 to 100 and represent
percentages (the higher the value, the brighter the color).
Figure: the HSB color model

Grayscale color model


The grayscale color model defines color by using only one component,
lightness, which is measured in values ranging from 0 to 255. Each grayscale
color has equal values of the red, green, and blue components of the RGB color
model. Changing a color photo to grayscale creates a black-and-white photo.
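As a rough illustration of how these component models relate, the following MATLAB sketch converts one RGB colour to its CMY percentages and to a grayscale value. This is a minimal sketch with illustrative values only; the grayscale line uses a common luminance weighting, which is one convention among several and is not prescribed by the text above.

% Relate an RGB colour to the CMY and grayscale models described above.
rgb  = [200 30 120];                                % 24-bit RGB components, 0-255
cmy  = (1 - rgb/255) * 100;                         % CMY percentages (K channel ignored here)
gray = 0.299*rgb(1) + 0.587*rgb(2) + 0.114*rgb(3);  % common luminance weighting (assumed convention)
gray_rgb = round([gray gray gray]);                 % a grayscale pixel has equal R, G and B components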

Pixel Data

This is the section where the numerical values of the pixels are stored.
According to the data type, pixel data are stored as integers or
floating-point numbers, using the minimum number of bytes required
to represent the values.

DICOM
The Dicom standard was established by the American College of
Radiology and the National Electric Manufacturers Association. Today,
the Dicom standard is the backbone of every medical imaging
department. The added value of its adoption in terms of access,
exchange, and usability of diagnostic medical images is, in general,
huge. Dicom is not only a file format but also a network
communication protocol.

The innovation of Dicom as a file format has been to establish that the
pixel data cannot be separated from the description of the medical
procedure which led to the formation of the image itself. In other
words, the standard stressed the concept that an image that is
separated from its metadata becomes "meaningless" as a medical image.
Metadata and pixel data are merged in a unique file, and the Dicom
header, in addition to the information about the image matrix,
contains the most complete description of the entire procedure used
to generate the image ever conceived in terms of acquisition protocol
and scanning parameters. The header also contains patient
information such as name, gender, age, weight, and height. For these
reasons, the Dicom header is modality-dependent and varies in size. In
practice, the header allows the image to be self-descriptive. In order to
easily understand the power of this approach, just think of the
software which Siemens first introduced for its MRI systems to
replicate an acquisition protocol. The software, known as "Phoenix", is
able to extract the acquisition protocol from a Dicom image series
dragged into the acquisition window and to replicate it for a new
acquisition. There are similar tools from all the major manufacturers.

Regarding the pixel data, Dicom can only store pixel values as integers.
Dicom cannot currently save pixel data in floating point, although it
supports various data types, including floats, to store metadata.
Whenever the values stored in each voxel have to be scaled to different
units, Dicom makes use of a scale factor, using two fields in the
header that define the slope and the intercept of the linear
transformation used to convert stored pixel values to real-world values.
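A minimal MATLAB sketch of how this slope/intercept scaling might be applied when reading a Dicom file. The file name is a placeholder; RescaleSlope and RescaleIntercept are the standard Dicom attribute names exposed by dicominfo, but not every file contains them.

% Convert stored pixel values to real-world values (e.g., Hounsfield units).
info = dicominfo('ct_slice.dcm');          % 'ct_slice.dcm' is a hypothetical file name
raw  = double(dicomread(info));            % stored (integer) pixel data
if isfield(info, 'RescaleSlope') && isfield(info, 'RescaleIntercept')
    realValues = info.RescaleSlope * raw + info.RescaleIntercept;
else
    realValues = raw;                      % no linear transformation defined in this header
end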

Dicom supports compressed image data through a mechanism that
allows a non-Dicom-formatted document to be encapsulated in a Dicom
file.
Analyze 7.5

Analyze 7.5 was created at the end of the 1980s as the format employed by
the commercial software Analyze developed at the Mayo Clinic in
Rochester, MN, USA. For more than a decade, the format was the de facto
standard for medical imaging post-processing. The big insight of the
Analyze format was that it was designed for multidimensional data
(volumes). Indeed, it is possible to store in one file 3D or 4D data (the
fourth dimension typically being temporal information). An Analyze 7.5
volume consists of two binary files: an image file with extension ".img"
that contains the voxel raw data, and a header file with extension ".hdr"
that contains the metadata, such as the number of pixels in the x, y, and z
directions, the voxel size, and the data type. The header has a fixed size
of 348 bytes and is described as a structure in the C programming
language. Reading and editing the header require a software utility. The
format is today considered "old", but it is still widely used and supported
by many processing software packages, viewers, and conversion utilities.
A new version of the format (AnalyzeAVW), used in the latest versions of
the Analyze software, is not discussed here since it is not widespread.

As summarized in Table 1, Analyze 7.5 does not support certain basic
data types, including unsigned 16-bit integers, and this can sometimes
be a limitation, forcing users to use a scale factor or to switch to a
pixel depth of 32 bits. Moreover, the format does not store enough
information to unambiguously establish the image orientation.

NIFTI

Nifti is a file format created at the beginning of the 2000s by a committee
based at the National Institutes of Health, with the intent to create a
format for neuroimaging that maintained the advantages of the Analyze
format while solving its weaknesses. Nifti can in fact be thought of as
a revised Analyze format. The format fills some of the unused or
little-used fields present in the Analyze 7.5 header to store new
information, such as image orientation, with the intent of avoiding the
left–right ambiguity in brain studies. Moreover, Nifti includes support
for data types not contemplated in the Analyze format, such as unsigned
16-bit integers. Although the format also allows the storage of the
header and pixel data in separate files, images are typically saved as a
single ".nii" file in which the header and the pixel data are merged. The
header has a size of 348 bytes in the case of ".hdr" and ".img" data
storage, and a size of 352 bytes in the case of a single ".nii" file,
because of the presence of four additional bytes at the end, essentially
to make the size a multiple of 16 and also to provide a way to store
additional metadata, in which case these 4 bytes are nonzero. A practical
implementation of an extended Nifti format for the processing of
diffusion-weighted magnetic resonance data has also been described.

The Nifti format provides two ways to store the orientation of the
image volume in space. The first, comprising a rotation plus a
translation, is used to map voxel coordinates to the scanner frame
of reference; this "rigid body" transformation is encoded using a
quaternion. The second method is used to save the 12 parameters of
a more general linear transformation, which defines the alignment of
the image volume to a standard or template-based coordinate system.
This spatial normalization task is common in brain functional image
analysis.

INTERFILE

Interfile is a file format that was developed for the exchange of nuclear medicine image data.
An Interfile data set consists of two files:
● Header file — Provides information about dimensions, identification and processing
history. You use the interfileinfo function to read the header information. The header
file has the .hdr file extension.
● Image file — Image data, whose data type and ordering are described by the header file.
You use interfileread to read the image data into the workspace. The image file has
the .img file extension.
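For example, in MATLAB (a minimal sketch with a hypothetical file name):

% Read an Interfile data set: metadata from the .hdr file, pixel data
% from the matching .img file. 'brain_scan.hdr' is a placeholder name.
info = interfileinfo('brain_scan.hdr');   % header metadata as a structure
X    = interfileread('brain_scan.hdr');   % image data into the workspace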

Sampling and quantization


In order to become suitable for digital processing, an image function f(x,y) must be digitized
both spatially and in amplitude. Typically, a frame grabber or digitizer is used to sample and
quantize the analogue video signal. Hence, in order to create a digital image, we need to
convert continuous data into digital form. This is done in two steps:

● Sampling
● Quantization

The sampling rate determines the spatial resolution of the digitized image, while the
quantization level determines the number of grey levels in the digitized image. The magnitude
of the sampled image is expressed as a digital value in image processing. The transition
between continuous values of the image function and its digital equivalent is called
quantization.

The number of quantization levels should be high enough for human perception of fine
shading details in the image. The occurrence of false contours is the main problem in an
image which has been quantized with insufficient brightness levels.

In this lecture we will talk about two key stages in digital image processing. Sampling and
quantization will be defined properly. Spatial and grey-level resolutions will be introduced
and examples will be provided.

To create a digital image, we need to convert the continuous sensed data into
digital form.
This process includes 2 processes:
1. Sampling: Digitizing the co-ordinate value is called sampling.
2. Quantization: Digitizing the amplitude value is called quantization.
To convert a continuous image f(x, y) into digital form, we have to sample the
function in both co-ordinates and amplitude.

Sampling
Since an analog image is continuous not just in its co-ordinates (x
axis) but also in its amplitude (y axis), the part that deals with
digitizing the co-ordinates is known as sampling. In digitizing,
sampling is done on the independent variable; in the case of the
equation y = sin(x), it is done on the x variable.

A continuous signal contains random variations caused by noise. In
sampling we reduce the effect of this noise by taking samples: the more
samples we take, the better the quality of the image and the more the
noise is reduced, and vice versa. However, sampling on the x axis alone
does not convert the signal to digital form; the y axis must also be
sampled, which is known as quantization.

Sampling has a relationship with image pixels. The total number of
pixels in an image can be calculated as pixels = total number of rows ×
total number of columns. For example, if we have a total of 36 pixels,
we have a square image of 6 × 6. As noted above, more samples
eventually result in more pixels, so taking 36 samples of our continuous
signal on the x axis corresponds to the 36 pixels of this image. The
number of samples also corresponds directly to the number of sensors on
the CCD array.

Quantization
Quantization is the opposite of sampling: it is done on the y axis,
while sampling is done on the x axis. Quantization is the process of
transforming a real-valued sampled image into one taking only a finite
number of distinct values. In the quantization process the amplitude
values of the image are digitized. In simple words, when you quantize
an image you divide the signal into quanta (partitions).
Now let us see how quantization is done. Here we assign levels to the
values generated by the sampling process. Although samples have been
taken, they still span a continuous range of gray-level values; under
quantization these vertically ranging values are mapped to a set of
discrete levels or partitions, for example 5 levels ranging from 0
(black) to 4 (white). The number of levels can vary according to the
type of image you want.

There is a relationship between quantization and gray-level resolution.
An image quantized to 5 different levels of gray would be formed from
only 5 different tones: it would be more or less a black-and-white
image with a few shades of gray.

When we want to improve the quality of the image, we can increase the
number of levels assigned to the sampled image. If we increase this
number to 256, we have an ordinary gray-scale image. The number of
levels we assign is called the gray-level resolution. Most digital image
processing devices quantize into k equal intervals; if b bits per pixel
are used, the number of levels is k = 2^b.
The number of quantization levels should be high enough for human
perception of fine shading details in the image. The occurrence of
false contours is the main problem in an image that has been quantized
with insufficient brightness levels.
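The effect of the number of quantization levels can be illustrated with a short MATLAB sketch. This is a minimal example assuming the standard 'cameraman.tif' test image shipped with the Image Processing Toolbox; the choice of b is arbitrary.

% Requantize an 8-bit grayscale image to k = 2^b gray levels.
f = imread('cameraman.tif');          % 8-bit image, values 0-255
b = 3;                                % bits per pixel after requantization
k = 2^b;                              % number of gray levels
step = 256 / k;
g = floor(double(f) / step) * step;   % map each pixel to the lower edge of its interval
imshowpair(f, uint8(g), 'montage')    % coarse quantization shows false contours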

Signal-to-noise ratio (SNR)


It is used in imaging to characterize image quality. The sensitivity of a (digital or film) imaging
system is typically described in terms of the signal level that yields a threshold level of SNR.

Industry standards define sensitivity in terms of the ISO film speed equivalent, using SNR
thresholds (at average scene luminance) of 40:1 for "excellent" image quality and 10:1 for
"acceptable" image quality.[1]

SNR is sometimes quantified in decibels (dB) of signal power relative to noise power, though in
the imaging field the concept of "power" is sometimes taken to be the power of a voltage signal
proportional to optical power; so a 20 dB SNR may mean either 10:1 or 100:1 optical power,
depending on which definition is in use.

DCT:
The discrete cosine transform (DCT) represents an image as a sum of sinusoids
of varying magnitudes and frequencies. The DCT has the property that, for a
typical image, most of the visually significant information about the image is
concentrated in just a few coefficients of the DCT. For this reason, the DCT is
often used in image compression applications. For example, the DCT is at the
heart of the international standard lossy image compression algorithm known as
JPEG. The two-dimensional DCT of an M-by-N matrix A is defined as

Bpq = αp αq Σ(m=0 to M−1) Σ(n=0 to N−1) Amn cos[π(2m+1)p / 2M] cos[π(2n+1)q / 2N],  0 ≤ p ≤ M−1, 0 ≤ q ≤ N−1,

where αp = 1/√M for p = 0, αp = √(2/M) for 1 ≤ p ≤ M−1, and αq is defined analogously with N.
The values Bpq are called the DCT coefficients of A. (Note that matrix indices in
MATLAB® always start at 1 rather than 0; therefore, the MATLAB matrix
elements A(1,1) and B(1,1) correspond to the mathematical
quantities A00 and B00, respectively.)
The DCT is an invertible transform, and its inverse is given by

Amn = Σ(p=0 to M−1) Σ(q=0 to N−1) αp αq Bpq cos[π(2m+1)p / 2M] cos[π(2n+1)q / 2N],  0 ≤ m ≤ M−1, 0 ≤ n ≤ N−1.
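A minimal MATLAB sketch of DCT-based compression, assuming the Image Processing Toolbox functions dct2/idct2 and the standard 'cameraman.tif' test image; the 1% threshold is an arbitrary illustration, not a recommended value.

% Keep only the largest DCT coefficients and reconstruct the image.
A = im2double(imread('cameraman.tif'));
B = dct2(A);                            % 2-D DCT coefficients
mask = abs(B) >= 0.01 * max(abs(B(:))); % discard visually insignificant coefficients
Ahat = idct2(B .* mask);                % inverse DCT of the thresholded coefficients
fprintf('Coefficients kept: %.1f%%\n', 100*nnz(mask)/numel(mask));
imshowpair(A, Ahat, 'montage')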

KL Transform:
i. The KL Transform is also known as the Hotelling transform or the eigenvector transform.
The KL Transform is based on the statistical properties of the image and has several
important properties that make it useful for image processing particularly for image
compression.
ii. The main purpose of image compression is to store the image in fewer bits as compared
to original image, now data from neighbouring pixels in an image are highly correlated.
iii. More image compression can be achieved by de-correlating this data. The KL transform
does the task of de-correlating the data thus facilitating higher degree of compression.
(I) Find the mean vector and covariance matrix of the given image x
(II) Find the Eigen values and then the eigen vectors of the covariance matrix
(III) Create the transformation matrix T, such that rows of T are eigen vectors
(IV) Find the KL Transform
The mean vector is found as mx = E{x} (Equation 1), where E{x} is the expected value of x; in
practice it is estimated by averaging the N sample vectors of the population.
The covariance matrix of the vector population is defined as
Covariance(x) = Cx = E[(x − mx)(x − mx)′]
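A minimal MATLAB sketch of steps (I)–(IV), assuming the data have already been arranged so that each column of X is one pixel vector; the random data below are placeholders, not image data.

% KL (Hotelling) transform of a population of vectors, one vector per column.
X  = randn(3, 1000);                     % placeholder data: 3-dimensional vectors
mx = mean(X, 2);                         % (I) mean vector
Xc = X - mx;                             % centre the population
Cx = (Xc * Xc') / size(X, 2);            % (I) covariance matrix
[V, D] = eig(Cx);                        % (II) eigenvalues and eigenvectors
[~, idx] = sort(diag(D), 'descend');     % order by decreasing eigenvalue
T  = V(:, idx)';                         % (III) rows of T are the eigenvectors
Y  = T * Xc;                             % (IV) KL transform (decorrelated data)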
14. Arithmetic and Logical Operations on Images (Image Algebra)
These operations are applied on a pixel-by-pixel basis. So, to add two
images together, we add the value at pixel (0 , 0) in image 1 to the value at
pixel (0 , 0) in image 2 and store the result in a new image at pixel (0 , 0). Then
we move to the next pixel and repeat the process, continuing until all pixels
have been visited.
Clearly, this can work properly only if the two images have identical
dimensions. If they do not, then combination is still possible, but a meaningful
result can be obtained only in the area of overlap. If our images have
dimensions of w1*h1, and w2*h2 and we assume that their origins are aligned,
then the new image will have dimensions w*h, where:
w = min (w1, w2)
h = min (h1, h2)

Addition and Averaging


If we add two 8-bit gray scale images, then pixels in the resulting image
can have values in the range 0–510. We should therefore either choose a 16-bit
representation for the output image or divide every pixel value by two. If we do
the latter, then we are computing an average of the two images.
The main application of image averaging is noise removal. Every image
acquired by a real sensor is afflicted to some degree by random noise. However,
the level of noise represented in the image can be reduced, provided that the
scene is static and unchanging, by averaging multiple observations of that
scene. This works because the noise distribution can be regarded as
approximately symmetrical with a mean of zero. As a result, positive
perturbations of a pixel value are as likely as negative perturbations of the
same magnitude, and there is a tendency for the perturbations to cancel out
when several noisy values are added.
Addition can also be used to combine the information of two images,
such as in image morphing in motion pictures.

Figure (4): a) noisy image; b) average of five observations; c) average of ten observations.
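A minimal sketch of noise reduction by averaging, assuming a static scene simulated from the standard 'cameraman.tif' test image; the noise level and number of observations are arbitrary.

% Average K noisy observations of the same scene.
f = im2double(imread('cameraman.tif'));
K = 10;                                        % number of noisy observations
g = zeros(size(f));
for k = 1:K
    g = g + imnoise(f, 'gaussian', 0, 0.01);   % accumulate noisy observations
end
g = g / K;                                     % averaging reduces the noise variance
imshowpair(imnoise(f, 'gaussian', 0, 0.01), g, 'montage')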

Subtraction
Subtracting two 8-bit grayscale images can produce values between −255
and +255. This necessitates the use of 16-bit signed integers in the output
image, unless the sign is unimportant, in which case we can simply take the
modulus of the result and store it using 8-bit integers:

g(x,y) = |f1(x,y) − f2(x,y)|

The main application for image subtraction is in change detection (or motion
detection). If we make two observations of a scene and compute their difference
using the above equation, then changes will be indicated by pixels in the difference
image which have non-zero values. Sensor noise, slight changes in illumination and
various other factors can result in small differences which are of no significance so it
is usual to apply a threshold to the difference image. Differences below this threshold
are set to zero. Differences above the threshold can, if desired, be set to the maximum
pixel value. Subtraction can also be used in medical imaging to remove static
background information.
Figure (5): a), b) two frames of a video sequence; c) their difference.
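A minimal change-detection sketch, assuming two registered grayscale frames of the same scene; 'frame1.png', 'frame2.png' and the threshold value are hypothetical.

% Threshold the absolute difference of two frames to detect changes.
f1 = im2double(imread('frame1.png'));
f2 = im2double(imread('frame2.png'));
d  = abs(f1 - f2);      % absolute difference image
T  = 0.1;               % threshold chosen empirically for the sensor noise level
mask = d > T;           % nonzero (changed) pixels above the threshold
imshow(mask)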

Multiplication and Division


Multiplication and division can be used to adjust brightness of an image.
Multiplication of pixel values by a number greater than one will brighten the
image, and division by a factor greater than one will darken the image.
Brightness adjustment is often used as a preprocessing step in image
enhancement.
One of the principal uses of image multiplication (or division) is to correct
grey-level shading resulting from non-uniformities in illumination or in the
sensor used to acquire the image.
Figure: a) original image; b) image multiplied by 2; c) image divided by 2.
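A minimal sketch of the brightness adjustment described above, assuming the standard 'cameraman.tif' test image; in MATLAB, uint8 arithmetic saturates at 0 and 255, so no explicit clipping is needed.

% Multiply to brighten, divide to darken.
f = imread('cameraman.tif');
brighter = f * 2;                                  % multiplication by a factor > 1 brightens
darker   = f / 2;                                  % division by a factor > 1 darkens
figure
subplot(1,3,1), imshow(f),        title('original')
subplot(1,3,2), imshow(brighter), title('multiplied by 2')
subplot(1,3,3), imshow(darker),   title('divided by 2')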

15. Logical Operation:


Logical operations apply only to binary images, whereas arithmetic
operations apply to multi-valued pixels. Logical operations are basic tools in
binary image processing, where they are used for tasks such as masking,
feature detection, and shape analysis. Logical operations on an entire image are
performed pixel by pixel. Because the AND operation of two binary
variables is 1 only when both variables are 1, the result at any location in a
resulting AND image is 1 only if the corresponding pixels in the two input
images are 1. As logical operations involve only one pixel location at a time,
they can be done in place, as in the case of arithmetic operations. The XOR
(exclusive OR) operation yields a 1 when one or the other pixel (but not both) is 1,
and it yields a 0 otherwise. This is unlike the OR operation, which is
1 when one or the other pixel is 1, or when both pixels are 1.

Logical AND and OR operations are useful for the masking and
compositing of images. For example, if we compute the AND of a binary image
with some other image, then pixels for which the corresponding value in the
binary image is 1 will be preserved, but pixels for which the corresponding
binary value is 0 will be set to 0 (erased). Thus the binary image acts as a
mask that removes information from certain parts of the image.

On the other hand, if we compute the OR of a binary image with some
other image, the pixels for which the corresponding value in the binary image
is 0 will be preserved, but pixels for which the corresponding binary value is 1
will be set to 1 (white).
So, masking is a simple method to extract a region of interest from an
image.

Figure: image masking
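A minimal masking sketch in MATLAB: a binary mask ANDed with a grayscale image keeps only the region of interest. The image name and the rectangular region below are purely illustrative.

% Keep pixels where the mask is 1; erase (set to 0) the rest.
f = imread('cameraman.tif');
mask = false(size(f));
mask(100:180, 100:180) = true;   % hypothetical rectangular region of interest
roi = f;
roi(~mask) = 0;                  % pixels where the mask is 0 are erased
imshow(roi)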

In addition to masking, logical operations can be used in feature detection.
Logical operations can be used to compare two images, as shown below:

AND (^)
This operation can be used to find the common white regions of two
different images (it requires two images).
g(x,y) = a(x,y) ^ b(x,y)

Exclusive OR
This operator can be used to find the differences between white regions of two
different images (it requires two images).

NOT
The NOT operation can be performed on gray-scale images; it requires
only one image, and the result of this operation is the negative of the original
image.
g(x,y) = 255 − f(x,y)

Figure: a) input image a(x,y); b) input image b(x,y); c) a(x,y) ^ b(x,y); d) a(x,y) ^ ~b(x,y)
Signal-to-noise ratio (SNR) describes the quality of a measurement. In CCD imaging, SNR
refers to the relative magnitude of the signal compared to the uncertainty in that signal on a
per-pixel basis. Specifically, it is the ratio of the measured signal to the overall measured
noise (frame-to-frame) at that pixel. High SNR is particularly important in applications
requiring precise light measurement.

Photons incident on the CCD convert to photoelectrons within the silicon layer. These
photoelectrons comprise the signal but also carry a statistical variation of fluctuations in the
photon arrival rate at a given point. This phenomenon is known as “photon noise” and
follows Poisson statistics. Additionally, inherent CCD noise sources create electrons that are
indistinguishable from the photoelectrons. When calculating overall SNR, all noise sources
need to be taken into consideration:

Photon noise refers to the inherent natural variation of the incident photon flux.
Photoelectrons collected by a CCD exhibit a Poisson distribution and have a
square root relationship between signal and noise.

(noise=√signal)

Read noise refers to the uncertainty introduced during the process of
quantifying the electronic signal on the CCD. The major component of readout
noise arises from the on-chip preamplifier.

Dark noise arises from the statistical variation of thermally generated electrons
within the silicon layers comprising the CCD. Dark current describes the rate of
generation of thermal electrons at a given CCD temperature. Dark noise, which
also follows a Poisson relationship, is the square root of the number of thermal
electrons generated within a given exposure. Cooling the CCD from room
temperature to -25°C will reduce dark current by more than 100 times.

Taken together, the SNR for a CCD camera can be calculated from the following
equation:

SNR = (I × QE × t) / √(I × QE × t + Nd × t + Nr²)

where:
I = Photon flux (photons/pixel/second)
QE = Quantum efficiency
t = Integration time (seconds)
Nd = Dark current (electrons/pixel/sec)
Nr = Read noise (electrons)

Under low-light-level conditions, read noise exceeds photon noise and the image
data is said to be “read-noise limited”. The integration time can be increased
until photon noise exceeds both read noise and dark noise. At this point, the
image data is said to be “photon limited”.

An alternative means of raising the SNR is to use a technique known as binning.
Binning is the process of combining charge from adjacent pixels in a CCD during
readout into a single "superpixel". Binning neighboring pixels on the CCD array
may allow you to reach a photon-limited signal more quickly, albeit at the
expense of spatial resolution.

Once you have determined acceptable values for SNR, integration time, and the
degree to which you are prepared to bin pixels, the above equation can be
solved for the minimum photon flux required. This is, therefore, the lowest light
level that can be measured for given experimental conditions and camera
specifications.
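A minimal numerical sketch of the SNR equation above in MATLAB; all values are illustrative, not measured camera specifications.

% Evaluate the CCD SNR equation for one set of illustrative parameters.
I  = 1000;    % photon flux (photons/pixel/second)
QE = 0.6;     % quantum efficiency
t  = 2;       % integration time (seconds)
Nd = 5;       % dark current (electrons/pixel/second)
Nr = 10;      % read noise (electrons, rms)
signal = I * QE * t;                           % photoelectrons collected
SNR    = signal / sqrt(signal + Nd*t + Nr^2);  % photon, dark and read noise combined
SNR_dB = 20 * log10(SNR);                      % decibel form (amplitude convention)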
Dr. Qadri Hamarsheh

Basic Relationships between Pixels


Outline of the Lecture
• Neighbourhood
• Adjacency
• Connectivity
• Paths
• Regions and boundaries
• Distance Measures
• Matlab Example

Neighbors of a Pixel
1. N4 (p) : 4-neighbors of p.
• Any pixel p(x, y) has two vertical and two horizontal neighbors, given by
(x+1,y), (x-1, y), (x, y+1), (x, y-1)
• This set of pixels is called the 4-neighbors of P and is denoted by N4(P).
• Each of them is at a unit distance from P.
2. ND(p): diagonal neighbors of p.
• This set of pixels is called the diagonal neighbors of p and is denoted by ND(p).
• The four diagonal neighbors of p have coordinates:
(x+1,y+1), (x+1,y-1), (x-1,y+1), (x-1,y-1)
• Each of them is at a Euclidean distance of 1.414 from P.
3. N8 (p): 8-neighbors of p.
• N4(p) and ND(p) together are called the 8-neighbors of p, denoted by N8(p).
• N8 = N4 ∪ ND
• Some of the points in N4, ND and N8 may fall outside the image when P lies on the
border of the image.

F(x-1, y-1) F(x-1, y) F(x-1, y+1)

F(x, y-1) F(x,y) F(x, y+1)

F(x+1, y-1) F(x+1, y) F(x+1, y+1)


N8 (p)

Adjacency
• Two pixels are connected if they are neighbors and their gray levels satisfy some
specified criterion of similarity.
• For example, in a binary image two pixels are connected if they are 4-neighbors and
have the same value (0/1).
• Let v: a set of intensity values used to define adjacency and connectivity.
• In a binary Image v={1}, if we are referring to adjacency of pixels with value 1.
• In a Gray scale image, the idea is the same, but v typically contains more elements,
for example v= {180, 181, 182,....,200}.
• If the possible intensity values are 0 to 255, the set v could be any subset of these 256 values.
Types of adjacency
1. 4-adjacency: Two pixels p and q with values from v are 4-adjacent if q is in the
set N4 (p).
2. 8-adjacency: Two pixels p and q with values from v are 8-adjacent if q is in the
set N8 (p).
3. m-adjacency (mixed): two pixels p and q with values from v are m-adjacent if:
q is in N4(p), or
q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from v (no intersection).
• Mixed adjacency is a modification of 8-adjacency, introduced to eliminate the
ambiguities (multiple path connections) that often arise when 8-adjacency is used.
Example: Figure: pixel arrangement for v = {1}.

Path
• A digital path (or curve) from pixel p with coordinate (x,y) to pixel q with
coordinate (s,t) is a sequence of distinct pixels with coordinates (x0, y0), (x1, y1),
..., (xn, yn), where (x0, y0)= (x,y), (xn, yn)= (s,t)
• (xi, yi) is adjacent to (xi-1, yi-1) for 1 ≤ i ≤ n.
• n is the length of the path.
• If (x0, y0) = (xn, yn), the path is a closed path.
• We can define 4- ,8- , or m-paths depending on the type of adjacency specified.

Connectivity
• Let S represent a subset of pixels in an image. Two pixels p and q are said to be
connected in S if there exists a path between them.
• Two image subsets S1 and S2 are adjacent if some pixel in S1 is adjacent to some
pixel in S2

Region
• Let R be a subset of pixels in an image. We call R a region of the image if R is a
connected set.
• Regions that are not adjacent are said to be disjoint.
• Example: the two regions (of 1s) in the figure below are adjacent only if 8-adjacency is used.

1 1 1
1 0 1 Ri
0 1 0
0 0 1
1 1 1 Rj
1 1 1
• A 4-path between the two regions does not exist, so their union is not a connected set.
• Boundary (border): suppose an image contains K disjoint regions, Rk, k = 1, 2, ..., K, none of
which touches the image border.
Figure: K disjoint regions R1, R2, ..., Rk.
• Let Ru denote the union of all the K regions, and (Ru)c denote its complement
(the complement of a set S is the set of points that are not in S).
Ru is called the foreground; (Ru)c is called the background of the image.
• The boundary (border or contour) of a region R is the set of points that are adjacent to
points in the complement of R (put another way: the border of a region is the set of pixels
in the region that have at least one background neighbor).
We must specify the connectivity being used to define adjacency.

Distance Measures
• For pixels p, q and z, with coordinates (x,y), (s,t) and (u,v) respectively, D is
a distance function or metric if:
D(p,q) ≥ 0, with D(p,q) = 0 if and only if p = q,
D(p,q) = D(q,p), and
D(p,z) ≤ D(p,q) + D(q,z).
• The following are the different distance measures:
1. Euclidean distance (De)
De(p,q) = [(x − s)² + (y − t)²]^(1/2)
• The pixels with De(p,q) ≤ r are contained in a disk of radius r centred at (x,y).

2. D4 distance (city-block distance)
D4(p,q) = |x − s| + |y − t|
• Pixels having a D4 distance from (x,y) less than or equal to some value r form a
diamond centred at (x,y).
Example 1: the pixels with D4 = 1 are the 4-neighbors of (x, y).

3. D8 distance (chessboard distance)
D8(p,q) = max(|x − s|, |y − t|)
• Pixels having a D8 distance from (x,y) less than or equal to some value r form a
square centred at (x, y).
• The pixels with D8 = 1 are the 8-neighbors of (x,y).
Example: the pixels with D8 distance ≤ 2 from (x,y) form a 5 × 5 square centred at (x,y).

4. Dm distance:
• Defined as the shortest m-path between the points.
• In this case the distance between two pixels depends on the values of the pixels along
the path, as well as on the values of their neighbors.
Example: consider the following arrangement of pixels
   P3 P4
P1 P2
P
and assume that P, P2 and P4 have value 1 and that P1 and P3 can have a value of 0 or 1.
Suppose that we consider adjacency of pixels with value 1 (v = {1}):
a) if P1 and P3 are 0: the Dm distance between P and P4 is 2 (path P P2 P4);
b) if P1 = 1 and P3 = 0: the m-distance is 3 (path P P1 P2 P4);
c) if P1 = 0 and P3 = 1: the m-distance is 3 (path P P2 P3 P4);
d) if P1 = P3 = 1: the m-distance is 4 (path P P1 P2 P3 P4).
Matlab Example

Matlab Code
bw = zeros(200,200); bw(50,50) = 1; bw(50,150) = 1;
bw(150,100) = 1;
D1 = bwdist(bw,'euclidean');
D2 = bwdist(bw,'cityblock');
D3 = bwdist(bw,'chessboard');
D4 = bwdist(bw,'quasi-euclidean');
figure
subplot(2,2,1), subimage(mat2gray(D1)), title('Euclidean')
hold on, imcontour(D1)
subplot(2,2,2), subimage(mat2gray(D2)), title('City block')
hold on, imcontour(D2)
subplot(2,2,3), subimage(mat2gray(D3)), title('Chessboard')
hold on, imcontour(D3)
subplot(2,2,4), subimage(mat2gray(D4)), title('Quasi-Euclidean')
hold on, imcontour(D4)
International Journal of Emerging Technologies in Engineering Research (IJETER)
Volume 5, Issue 4, April (2017) www.ijeter.everscience.org

Image Smoothening and Sharpening using Frequency Domain Filtering Technique

Swati Dewangan
M.Tech. Scholar, Computer Networks, Bhilai Institute of Technology, Durg, India.

Anup Kumar Sharma
M.Tech. Scholar, Computer Networks, Bhilai Institute of Technology, Durg, India.

Abstract – Images are used in various fields to help monitoring processes, such as images in fingerprint evaluation, satellite monitoring, medical diagnostics, underwater areas, etc. Image processing techniques are adopted as an optimized method to help carry out these processing tasks efficiently. The development of image processing software helps the image editing process effectively. Image enhancement algorithms offer a wide variety of approaches for modifying original captured images to achieve visually acceptable images. In this paper, we apply frequency domain filters to generate an enhanced image. Simulation outputs result in noise reduction, contrast enhancement, smoothening and sharpening of the enhanced image.

Index Terms – Digital Image Processing, Fourier Transforms, High-pass Filters, Low-pass Filters, Image Enhancement.

1. INTRODUCTION

Image processing is a form of signal processing in which the input is an image, such as a photograph or video frame, and the output may be either an image or a set of characteristics or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques. It deals with the improvement of pictorial information for human interpretation and the processing of images for storage, transmission and representation for machine perception.

Image processing can be defined as the analysis of pictures using techniques that can identify regions of interest from images in bitmapped graphic format that have been scanned or captured with a digital camera. Image enhancement techniques aim at realizing an improvement in the quality of a given image. An image can be enhanced by changing any attribute of the image. There exist many techniques that can enhance an image without spoiling it. Enhancement methods can be broadly divided into two categories, i.e. spatial domain techniques and frequency domain techniques.

The spatial domain deals with direct manipulation of the pixels of an image, whereas the frequency domain filters the image by modifying the Fourier transform of the image. In this paper, the main focus is laid on enhancing an image using the frequency domain technique. The objective is to show how a digital image is processed to generate a better-quality image.

The content of this paper is organized as follows: Section I gives an introduction to the topic and presents fundamental background. Section II describes the types of image enhancement techniques. Section III defines the operations applied for image filtering. Section IV shows results and discussions. Section V concludes the proposed approach and its outcome.

1.1 Digital Image Processing

Digital image processing is a part of signal processing which uses computer algorithms to perform image processing on digital images. It has numerous applications in different studies and researches of science and technology. The fundamental steps in digital image processing are image acquisition, image enhancement, image analysis, image reconstruction, image restoration, image compression, image segmentation, image recognition, and visualization of the image.

The main sources of noise in digital image processing come from image acquisition and image transmission. Image enhancement basically improves the visual quality of the image by providing clear images for human observers and for machines in automatic image processing techniques. Digital image processing has fundamental classes depending on their operations:

A. Image enhancement

Image enhancement deals with contrast enhancement, spatial filtering, frequency domain filtering, edge enhancement and noise reduction. This project briefly shows the theoretical and practical approaches in the frequency domain.

B. Image analysis

It deals with the statistical details of an image. It is possible to examine the information of an image in detail. This information helps in image restoration and enhancement. One of the representations of this information is the histogram representation. During image analysis, the main tasks include image segmentation, feature extraction and object classification.


C. Image restoration

In this class, the image is corrected using different correction methods, like inverse filtering and feature extraction, in order to restore an image to its original form.

D. Image compression

It deals with the compression of the size of the image so that it can easily be stored electronically. The compressed images are then decompressed to their original forms. Image compression and decompression can either reduce the size while maintaining high quality, or preserve the original data without any loss.

E. Image synthesis

This class of digital image processing is well known nowadays in the film and game industry and is very advanced in 3-dimensional and 4-dimensional productions. In both cases the image and video scenes are constructed using certain techniques of visualization.

1.2 Image Enhancement

Image enhancement is basically improving the interpretability or perception of information in images for human viewers and providing 'better' input for other automated image processing techniques. The principal objective of image enhancement is to process a given image so that the result is more suitable than the original image for a specific application.

Image enhancement simply means transforming an image f into an image g using a transformation function T. Let the values of pixels in images f and g be denoted by r and s respectively. Then the pixel values r and s are related by the expression

s = T(r)

where T is a transformation that maps a pixel value r into a pixel value s.

2. IMAGE ENHANCEMENT TECHNIQUES

Fig. 1. Types of Enhancement Technique

The enhancement technique differs from one field to another according to its objective. Advancement in technology brings the development of digital image processing techniques in both domains:

A. Spatial domain

The term spatial domain refers to the image plane itself, and approaches in this category are based on direct manipulation of the pixel values of an image. It enhances the whole image in a uniform manner. The value of the pixel with coordinates (x, y) in an enhanced image 'F' is the result of performing some operation on the pixels in the neighbourhood of (x, y) in the input image 'f'. This method is straightforward and is chiefly utilized in real-time applications, but it lags in producing adequate robustness and imperceptibility.

B. Frequency domain

The frequency domain processing techniques are based on modifying the Fourier transform of an image. The basic idea in using this technique is to enhance the image by manipulating the transform coefficients of the image, using transforms such as the Discrete Fourier Transform (DFT), Discrete Wavelet Transform (DWT), and Discrete Cosine Transform (DCT). The advantages of these methods include low computational complexity, ease of viewing and manipulating the frequency composition of the image, and the easy applicability of special transform-domain properties.

3. IMAGE ENHANCEMENT USING FREQUENCY DOMAIN TECHNIQUE

In frequency domain methods, the image is first transferred into the frequency domain. All the enhancement operations are performed on the Fourier transform of the image. The image enhancement function in the frequency domain is denoted by the expression:

g(x, y) = T[f(x, y)]

where f(x, y) is the input image and g(x, y) is an enhanced image formed by the result of performing some operation T on the frequency components of the transformed image.

3.1 Filtering in the Frequency Domain

The procedures required to enhance an image using the frequency domain technique are:
i. Transform the input image into the Fourier domain.
ii. Multiply the Fourier transformed image by a filter.
iii. Take the inverse Fourier transform of the image to get the resulting enhanced image.

3.2 Basic Steps for Filtering in the Frequency Domain:
1. Given an input image f(x, y) of size M x N.
2. Compute F(u, v), the DFT of the image.


3. Multiply F(u, v) by a filter function H(u, v), i.e., G(u, v) = H(u, v)F(u, v).
4. Compute the inverse DFT of G(u, v).
5. Obtain the real part of the result.

Step-1 Input Image

An input image may be defined as a two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or grey level of the image at that point.

Fig. 2. Frequency Domain Filtering Operations

Step-2 Compute the Fourier Transform of the input image.

The image f(x, y) of size M x N will be represented in the frequency domain F(u, v) using the Discrete Fourier Transform (DFT). The concept behind the Fourier transform is that any waveform can be constructed using a sum of sine and cosine waves of different frequencies. The Discrete Fourier Transform (DFT) of an image takes a discrete signal and transforms it into its discrete frequency domain representation. The Fourier transform F(u) of a single-variable continuous function f(x) is defined by:

F(u) = ∫ f(x) e^(−j2πux) dx

where u represents the frequency and x represents time/space. The exponential in the above formula can be expanded into sines and cosines, with the variables u and v determining these frequencies.

Step-3 Filtering of the Fourier Transformed image.

A filter is a tool designed to suppress certain frequency components of an input image and return the image in a modified format. Filters are used to compensate for image imperfections such as noise and insufficient sharpness. By filter design we can create filters that pass signals with frequency components in some bands, and attenuate signals with content in other frequency bands. The general formula for filtering is given as:

G(u, v) = F(u, v)·H(u, v)

where H(u, v) is the transfer function and F(u, v) is the Fourier transform of the image function. G(u, v) is the final filtered function.

In all the filters, it is important to find the right filter function H(u, v), as it amplifies some frequencies and suppresses certain frequency components in an image. There are many filters that are used for blurring/smoothing, sharpening and edge detection in an image. Based on how they use the frequency domain, the image filters are broadly classified into two categories:
1. Low-pass filters / Smoothing filters.
2. High-pass filters / Sharpening filters.

Fig. 3. Types of Frequency Domain Filters

A. Image Smoothing (Low-pass Frequency Domain Filters)

A low-pass filter attenuates (suppresses) high frequencies while passing the low frequencies, which results in a blurred (smoothed) image. It leaves the low frequencies of the Fourier transform relatively unchanged and removes the high-frequency noise components. The three main low-pass filters are:

i. Ideal low-pass filter (ILPF)

An ideal low-pass filter removes all high-frequency values of the Fourier transform that are at a distance greater than a specified distance from the origin of the transformed image. The filter transfer function for the ideal low-pass filter is given by:


H(u, v) = 1 if D(u, v) ≤ D0
H(u, v) = 0 if D(u, v) > D0

where

D(u, v) = [(u − M/2)² + (v − N/2)²]^(1/2)

ii. Butterworth low-pass filter (BLPF)

The filter transfer function for the Butterworth low-pass filter is given by:

H(u, v) = 1 / [1 + (D(u, v)/D0)^(2n)]

iii. Gaussian low-pass filter (GLPF)

The filter transfer function for the Gaussian low-pass filter is given by:

H(u, v) = e^(−D²(u, v) / 2D0²)

B. Image Sharpening (High-pass Frequency Domain Filters)

Sharpening of an image in the frequency domain can be achieved by a high-pass filtering process, which attenuates (suppresses) low-frequency components without disturbing the high-frequency information in the Fourier transform of the image. The high-pass filter Hhp is often represented by its relationship to the low-pass filter (Hlp) as:

Hhp(u, v) = 1 − Hlp(u, v)

i. Ideal High-Pass Filter (IHPF)

The ideal high-pass filter simply cuts off all the low frequencies lower than the specified cut-off frequency. The filter transfer function is given as:

H(u, v) = 0 if D(u, v) ≤ D0
H(u, v) = 1 if D(u, v) > D0

ii. Butterworth High-Pass Filter (BHPF)

The transfer function of the Butterworth high-pass filter of order n and with a specified cut-off frequency is given by:

H(u, v) = 1 / [1 + (D0/D(u, v))^(2n)]

iii. Gaussian High-Pass Filter (GHPF)

The transfer function of the Gaussian high-pass filter with cut-off frequency locus at a distance D0 from the origin is given by:

H(u, v) = 1 − e^(−D²(u, v) / 2D0²)

In the above formulas, D0 is the cut-off frequency, a specified nonnegative number, and D(u, v) is the distance from the point (u, v) to the center of the filter.

Step-4 Compute the Inverse Fourier Transform to get the enhanced image.

We then need to convert the data back to a real image to use in any application. After the filtering removes the unwanted frequencies, it is easy to return to the spatial domain. A function represented by its Fourier transform can be completely reconstructed by an inverse transform with no loss of information. For this, the inverse Fourier transform of the filtered image is calculated by the following equation:

g(x, y) = (1/MN) Σ(u=0 to M−1) Σ(v=0 to N−1) G(u, v) e^(j2π(ux/M + vy/N))

4. RESULTS AND DISCUSSIONS

Fig. 4. Original Input Image

Fig. 5. Fourier Transform of an Image


International Journal of Emerging Technologies in Engineering Research (IJETER)
Volume 5, Issue 4, April (2017) www.ijeter.everscience.org

Fig. 6. Ideal Low-Pass Filtered Image

In Fig. 6, the central component of the filtered spectrum is responsible for the blurring, while the circular components are responsible for the ringing effect. The severe ringing in the blurred image is a characteristic of ideal filters.

Fig. 7. Butterworth Low-Pass Filtered Image

In Fig. 7, the BLPF of low order shows hardly any ringing; as the order increases, the BLPF produces increasing ringing. The reduced ringing is due to the filter's smooth transition between low and high frequencies.

Fig. 8. Gaussian Low-Pass Filtered Image

In Fig. 8, the GLPF shows no ringing at all. Ringing artifacts are not acceptable in fields such as medical imaging, hence the Gaussian low-pass filter is used more often than the ILPF/BLPF.

Fig. 9. Ideal High-Pass Filtered Image

The severe ringing effect in Fig. 9 is again a characteristic of ideal filters; it is due to the discontinuity in the filter transfer function. The ringing in this filter is so severe that it produces distorted and thickened object boundaries.

Fig. 10. Butterworth High-Pass Filtered Image

Boundaries in Fig. 10 are much less distorted than with the IHPF. The BHPF is therefore more appropriate for image sharpening than the ideal HPF, since it does not introduce ringing.

Fig. 11. Gaussian High-Pass Filtered Image


The results obtained in Fig. 11 are smoother than with the previous two filters. Even the filtering of the smaller objects and thin bars is cleaner with the Gaussian filter.

5. CONCLUSION AND FUTURE SCOPE

In this project we focus on existing frequency-domain image enhancement techniques, which include filters that are useful in many application areas such as medical diagnosis, military and industrial applications. A program is developed to compute and display the image after applying various low-pass and high-pass filters to it.

In this project the frequency-domain filters are implemented in MATLAB. It is found that low-pass filters smooth the input image by removing noise, which results in blurring of the image, while high-pass filters sharpen the fine details of an image. Ideal filters result in a ringing effect in the enhanced image. With Butterworth filters the ringing effect is reduced, since there are no sharp frequency transitions, whereas Gaussian filters give a filtered image without any ringing effect at all.

The future scope can be the development of adaptive algorithms for effective image enhancement using fuzzy logic and neural networks. Many more filters can be added to the functionality, and the same work can be extended to further digital image processing applications such as image restoration and image data compression.



Homomorphic Filtering

Homomorphic filters are widely used in image processing to compensate for the effect of non-uniform illumination in an image. Pixel intensities in an image represent the light reflected from the corresponding points on the objects. As per the image model, an image f(x, y) may be characterized by two components: (1) the amount of source light incident on the scene being viewed, and (2) the amount of light reflected by the objects in the scene. These portions of light are called the illumination and reflectance components, and are denoted i(x, y) and r(x, y) respectively.
The functions i(x, y) and r(x, y) combine multiplicatively to give the image function f(x, y):

f(x, y) = i(x, y) · r(x, y)        (1)

where 0 < i(x, y) < ∞ and 0 < r(x, y) < 1.

Homomorphic filters are used in situations where the image is subjected to multiplicative interference or noise as depicted in Eq. (1). We cannot easily operate separately on the frequency components of illumination and reflectance, because the Fourier transform of a product is not the product of the transforms; that is,

F[f(x, y)] ≠ F[i(x, y)] · F[r(x, y)]

We can separate the two components by taking the logarithm of both sides:

ln f(x, y) = ln i(x, y) + ln r(x, y)

Taking Fourier transforms on both sides we get

F[ln f(x, y)] = F[ln i(x, y)] + F[ln r(x, y)],

that is, F(u, v) = I(u, v) + R(u, v), where F, I and R are the Fourier transforms of ln f(x, y), ln i(x, y) and ln r(x, y) respectively. The function F represents the Fourier transform of the sum of two images: a low-frequency illumination image and a high-frequency reflectance image. If we now apply a filter with a transfer function that suppresses low-frequency components and enhances high-frequency components, we can suppress the illumination component and enhance the reflectance component.
Features & Application:

1. Homomorphic filter is used for image enhancement.


2. It simultaneously normalizes the brightness across an image and increases
contrast.
3. It is also used to remove multiplicative noise.

Images normally consist of light reflected from objects. The basic nature of the image f(x,y) may be characterized by two components:
(1) the amount of source light incident on the scene being viewed, and
(2) the amount of light reflected by the objects in the scene.

These portions of light are called the illumination and reflectance components, and are denoted i(x,y) and r(x,y) respectively. The functions i and r combine multiplicatively to give the image function f:

f(x,y) = i(x,y) r(x,y),

where 0 < i(x,y) < ∞ and 0 < r(x,y) < 1, with r(x,y) → 0 indicating perfect absorption (a black body) and r(x,y) → 1 indicating perfect reflection (a white body).

Since i and r combine multiplicatively, they can be made additive by taking the logarithm of the image intensity, so that they can be separated in the frequency domain. Illumination variations can be thought of as multiplicative noise and can be reduced by filtering in the log domain. To make the illumination of an image more even, the high-frequency components are increased and the low-frequency components are attenuated, because the high-frequency components are assumed to represent the reflectance in the scene whereas the low-frequency components are assumed to represent the illumination. In other words, a high-pass (high-emphasis) filter is used to suppress low frequencies and amplify high frequencies in the log-intensity domain: the illumination component tends to vary slowly across the image while the reflectance tends to vary rapidly. Therefore, by applying a frequency-domain filter, the intensity variation across the image can be reduced while detail is highlighted.

Fig. 2.4.1: Summary of steps in homomorphic filtering.
(Source: Rafael C. Gonzalez, Richard E. Woods, Digital Image Processing, Pearson, Third Edition, 2010, p. 292)
Analysis:

We know that f(x,y) = i(x,y) r(x,y)        (1)

Taking the natural log on both sides of the above equation, we get

z(x,y) = ln[f(x,y)] = ln[i(x,y)] + ln[r(x,y)]

Taking the DFT on both sides of this equation, we get

DFT[z(x,y)] = DFT{ln[f(x,y)]} = DFT{ln[i(x,y)]} + DFT{ln[r(x,y)]}        (2)

Since DFT[f(x,y)] = F(u,v), equation (2) becomes

Z(u,v) = Fi(u,v) + Fr(u,v)        (3)

The function Z represents the Fourier transform of the sum of two images: a low-frequency illumination image and a high-frequency reflectance image.

Fig. 2.4.2: Radial cross section of a circularly symmetric homomorphic filter function. The vertical axis is at the center of the frequency rectangle and D is the distance from the center.
(Source: Rafael C. Gonzalez, Richard E. Woods, Digital Image Processing, Pearson, Third Edition, 2010, p. 292)

If we now apply a filter with a transfer function that suppresses low-frequency components and enhances high-frequency components, we can suppress the illumination component and enhance the reflectance component. Thus the Fourier transform of the output image is obtained by multiplying the DFT of the input image by the filter function H(u,v):

S(u,v) = H(u,v) Z(u,v)        (4)

where S(u,v) is the Fourier transform of the output image. Substituting equation (3) into (4), we get

S(u,v) = H(u,v) [Fi(u,v) + Fr(u,v)] = H(u,v) Fi(u,v) + H(u,v) Fr(u,v)        (5)

Applying the inverse DFT to equation (5), we get

T⁻¹[S(u,v)] = T⁻¹[H(u,v) Fi(u,v)] + T⁻¹[H(u,v) Fr(u,v)]

s(x,y) = i'(x,y) + r'(x,y)        (6)

The enhanced image is obtained by taking the exponential of s(x,y):

g(x,y) = e^s(x,y) = e^i'(x,y) · e^r'(x,y) = i0(x,y) r0(x,y)

where i0(x,y) = e^i'(x,y) and r0(x,y) = e^r'(x,y) are the illumination and reflectance components of the enhanced output image.
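A minimal Python sketch of this pipeline (log → DFT → high-emphasis filter → inverse DFT → exponential). The Gaussian-shaped high-emphasis transfer function and the parameter values gamma_l, gamma_h, c and D0 are assumptions chosen to mimic the radial profile of Fig. 2.4.2, not values taken from the text.

import numpy as np

def homomorphic(img, D0=30.0, gamma_l=0.5, gamma_h=2.0, c=1.0):
    # z = ln f (log1p adds 1 to avoid ln(0))
    z = np.log1p(img.astype(float))
    Z = np.fft.fftshift(np.fft.fft2(z))

    # high-emphasis filter: attenuates low frequencies (illumination),
    # boosts high frequencies (reflectance)
    M, N = img.shape
    u = np.arange(M) - M / 2
    v = np.arange(N) - N / 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2
    H = (gamma_h - gamma_l) * (1 - np.exp(-c * D2 / (D0 ** 2))) + gamma_l

    s = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z)))   # s = i' + r'
    return np.expm1(s)                                   # map back from the log domain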

Filtering in the spatial domain (Spatial Filtering)


refers to image operators that change the gray value at any pixel (x,y)
depending on the pixel values in a square neighborhood centered at (x,y)
using a fixed integer matrix of the same size. The integer matrix is called
a filter, mask, kernel or a window.
The mechanism of spatial filtering, shown below, consists simply
of moving the filter mask from pixel to pixel in an image. At each pixel
(x,y), the response of the filter at that pixel is calculated using a
predefined relationship (linear or nonlinear).

Figure 6.1 Spatial filtering

Note:
The size of mask must be odd (i.e. 3×3, 5×5, etc.) to ensure it has a
center. The smallest meaningful size is 3×3.


Linear Spatial Filtering (Convolution)


The process consists of moving the filter mask from pixel to pixel in an
image. At each pixel (x,y), the response is given by a sum of products of
the filter coefficients and the corresponding image pixels in the area
spanned by the filter mask.
For the 3×3 mask shown in the previous figure, the result (or response),
R, of linear filtering is:

R = w(−1,−1) f(x−1, y−1) + w(−1,0) f(x−1, y) + ⋯ + w(0,0) f(x, y) + ⋯ + w(1,0) f(x+1, y) + w(1,1) f(x+1, y+1)

In general, linear filtering of an image f of size M×N with a filter mask of size m×n is given by the expression:

g(x, y) = Σ(s=−a..a) Σ(t=−b..b) w(s, t) f(x+s, y+t)

where a = (m − 1)/2 and b = (n − 1)/2. To generate a complete filtered image, this equation must be applied for x = 0, 1, 2, ..., M−1 and y = 0, 1, 2, ..., N−1.

Nonlinear Spatial Filtering


The operation also consists of moving the filter mask from pixel to pixel
in an image. The filtering operation is based conditionally on the values
of the pixels in the neighborhood, and they do not explicitly use
coefficients in the sum-of-products manner.
For example, noise reduction can be achieved effectively with a
nonlinear filter whose basic function is to compute the median gray-level
value in the neighborhood in which the filter is located. Computation of
the median is a nonlinear operation.


Example:
Use the following 3×3 mask to perform the convolution process on the shaded (interior) pixels of the 5×5 image below. Write the filtered image.

3×3 mask:
0     1/6   0
1/6   1/3   1/6
0     1/6   0

5×5 image:
30    40    50    70    90
40    50    80    60    100
35    255   70    0     120
30    45    80    100   130
40    50    90    125   140

Solution:
First shaded pixel (value 50):
0×30 + (1/6)×40 + 0×50 + (1/6)×40 + (1/3)×50 + (1/6)×80 + 0×35 + (1/6)×255 + 0×70 ≈ 85

Second shaded pixel (value 80):
0×40 + (1/6)×50 + 0×70 + (1/6)×50 + (1/3)×80 + (1/6)×60 + 0×255 + (1/6)×70 + 0×0 = 65

Third shaded pixel (value 60):
0×50 + (1/6)×70 + 0×90 + (1/6)×80 + (1/3)×60 + (1/6)×100 + 0×70 + (1/6)×0 + 0×120 ≈ 61

First shaded pixel of the next row (value 255):
0×40 + (1/6)×50 + 0×80 + (1/6)×35 + (1/3)×255 + (1/6)×70 + 0×30 + (1/6)×45 + 0×80 ≈ 118

and so on …

Filtered image =
30    40    50    70    90
40    85    65    61    100
35    118   92    58    120
30    84    77    89    130
40    50    90    125   140
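The hand computation above can be checked with a few lines of Python; the arrays are copied from the exercise, scipy.ndimage.correlate performs the same sum-of-products operation, and flooring mimics the truncation used in the lecture (only the shaded interior pixels are replaced).

import numpy as np
from scipy import ndimage

img = np.array([[30, 40, 50, 70, 90],
                [40, 50, 80, 60, 100],
                [35, 255, 70, 0, 120],
                [30, 45, 80, 100, 130],
                [40, 50, 90, 125, 140]], dtype=float)

mask = np.array([[0, 1/6, 0],
                 [1/6, 1/3, 1/6],
                 [0, 1/6, 0]])

response = ndimage.correlate(img, mask, mode='nearest')
filtered = img.copy()
filtered[1:4, 1:4] = np.floor(response[1:4, 1:4])   # only the shaded (interior) pixels change
print(filtered.astype(int))
# interior comes out as [[85, 65, 61], [118, 92, 58], [84, 77, 89]], matching the hand calculation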


Spatial Filters
Spatial filters can be classified by effect into:
1. Smoothing Spatial Filters: also called lowpass filters. They include:
1.1 Averaging linear filters
1.2 Order-statistics nonlinear filters.
2. Sharpening Spatial Filters: also called highpass filters. For example,
the Laplacian linear filter.

Smoothing Spatial Filters


are used for blurring and for noise reduction. Blurring is used in
preprocessing steps to:
§ remove small details from an image prior to (large) object
extraction
§ bridge small gaps in lines or curves.
Noise reduction can be accomplished by blurring with a linear filter and
also by nonlinear filtering.

Averaging linear filters


The response of averaging filter is simply the average of the pixels
contained in the neighborhood of the filter mask.
The output of averaging filters is a smoothed image with reduced "sharp"
transitions in gray levels.
Noise and edges consist of sharp transitions in gray levels. Thus
smoothing filters are used for noise reduction; however, they have the
undesirable side effect that they blur edges.


The figure below shows two 3×3 averaging filters.

Standard average filter (1/9 ×):
1  1  1
1  1  1
1  1  1

Weighted average filter (1/16 ×):
1  2  1
2  4  2
1  2  1
Note:
Weighted average filter has different coefficients to give more
importance (weight) to some pixels at the expense of others. The idea
behind that is to reduce blurring in the smoothing process.

Averaging linear filtering of an image f of size M×N with a filter mask of size m×n is given by the expression:

g(x, y) = [Σ(s=−a..a) Σ(t=−b..b) w(s, t) f(x+s, y+t)] / [Σ(s=−a..a) Σ(t=−b..b) w(s, t)]

To generate a complete filtered image, this equation must be applied for x = 0, 1, 2, ..., M−1 and y = 0, 1, 2, ..., N−1.

Figure below shows an example of applying the standard averaging filter.


(a) (b)

(c) (d)

(e) (f)
Figure 6.2 Effect of averaging filter. (a) Original image. (b)-(f) Results of smoothing with
square averaging filter masks of sizes n = 3,5,9,15, and 35, respectively.


As shown in the figure, the effects of averaging linear filter are:


1. Blurring which is increased whenever the mask size increases.
2. Blending (removing) small objects with the background. The size
of the mask establishes the relative size of the blended objects.
3. Black border because of padding the borders of the original image.
4. Reduced image quality.

Order-statistics filters
are nonlinear spatial filters whose response is based on ordering (ranking)
the pixels contained in the neighborhood, and then replacing the value of
the center pixel with the value determined by the ranking result.
Examples include Max, Min, and Median filters.

Median filter
It replaces the value at the center by the median pixel value in the
neighborhood, (i.e. the middle element after they are sorted). Median
filters are particularly useful in removing impulse noise (also known as
salt-and-pepper noise). Salt = 255, pepper = 0 gray levels.
In a 3×3 neighborhood the median is the 5th largest value, in a 5×5
neighborhood the 13th largest value, and so on.
For example, suppose that a 3×3 neighborhood has gray levels (10,
20, 0, 20, 255, 20, 20, 25, 15). These values are sorted as
(0,10,15,20,20,20,20,25,255), which results in a median of 20 that
replaces the original pixel value 255 (salt noise).


Example:
Consider the following 5×5 image:
20 30 50 80 100
30 20 80 100 110
25 255 70 0 120
30 30 80 100 130
40 50 90 125 140
Apply a 3×3 median filter on the shaded pixels, and write the filtered
image.

Solution
For the first shaded pixel (255), the 3×3 neighborhood values are 30, 20, 80, 25, 255, 70, 30, 30, 80.
Sorted: 20, 25, 30, 30, 30, 70, 80, 80, 255  →  median = 30

For the second shaded pixel (70), the neighborhood values are 20, 80, 100, 255, 70, 0, 30, 80, 100.
Sorted: 0, 20, 30, 70, 80, 80, 100, 100, 255  →  median = 80

For the third shaded pixel (0), the neighborhood values are 80, 100, 110, 70, 0, 120, 80, 100, 130.
Sorted: 0, 70, 80, 80, 100, 100, 110, 120, 130  →  median = 100

Filtered image =
20    30    50    80    100
30    20    80    100   110
25    30    80    100   120
30    30    80    100   130
40    50    90    125   140
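The same worked example can be reproduced with SciPy's median filter (a small sketch; only the three shaded pixels are replaced, as in the lecture).

import numpy as np
from scipy import ndimage

img = np.array([[20, 30, 50, 80, 100],
                [30, 20, 80, 100, 110],
                [25, 255, 70, 0, 120],
                [30, 30, 80, 100, 130],
                [40, 50, 90, 125, 140]])

med = ndimage.median_filter(img, size=3)
filtered = img.copy()
filtered[2, 1:4] = med[2, 1:4]      # shaded pixels: 255 -> 30, 70 -> 80, 0 -> 100
print(filtered)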


Figure below shows an example of applying the median filter on an


image corrupted with salt-and-pepper noise.

(a) (b)

(c)
Figure 6.3 Effect of median filter. (a) Image corrupted by salt & pepper noise. (b) Result of
applying 3×3 standard averaging filter on (a). (c) Result of applying 3×3 median filter on (a).

As shown in the figure, the effects of median filter are:


1. Noise reduction
2. Less blurring than averaging linear filter


Sharpening Spatial Filters


Sharpening aims to highlight fine details (e.g. edges) in an image, or
enhance detail that has been blurred through errors or imperfect capturing
devices.
Image blurring can be achieved using averaging filters, and hence
sharpening can be achieved by operators that invert averaging operators.
In mathematics, averaging is equivalent to the concept of integration, and
differentiation inverts integration. Thus, sharpening spatial filters can be
represented by partial derivatives.

Partial derivatives of digital functions


The first-order partial derivatives of the digital image f(x,y) are:

∂f/∂x = f(x+1, y) − f(x, y)    and    ∂f/∂y = f(x, y+1) − f(x, y)

The first derivative must be:


1) zero along flat segments (i.e. constant gray values).
2) non-zero at the outset of gray level step or ramp (edges or
noise)
3) non-zero along segments of continuing changes (i.e. ramps).

The second-order partial derivatives of the digital image f(x,y) are:

∂²f/∂x² = f(x+1, y) + f(x−1, y) − 2 f(x, y)

∂²f/∂y² = f(x, y+1) + f(x, y−1) − 2 f(x, y)

The second derivative must be:


1) zero along flat segments.
2) nonzero at the outset and end of a gray-level step or ramp;


3) zero along ramps

Consider the example below:

Figure 6.4 Example of partial derivatives

We conclude that:
• 1st derivative detects thick edges while 2nd derivative detects thin
edges.
• 2nd derivative has much stronger response at gray-level step than 1st
derivative.
Thus, we can expect a second-order derivative to enhance fine detail (thin
lines, edges, including noise) much more than a first-order derivative.


The Laplacian Filter


The Laplacian operator of an image f(x,y) is:

∇²f = ∂²f/∂x² + ∂²f/∂y²

This equation can be implemented using the 3×3 mask:

−1  −1  −1
−1   8  −1
−1  −1  −1

Since the Laplacian filter is a linear spatial filter, we can apply it using the same mechanism as the convolution process. This produces a Laplacian image that has grayish edge lines and other discontinuities, all superimposed on a dark, featureless background.
Background features can be "recovered" while still preserving the sharpening effect of the Laplacian operation simply by adding the original and Laplacian images:

g(x, y) = f(x, y) + ∇²f(x, y)
The figure below shows an example of using Laplacian filter to sharpen
an image.


(a) (b)

(c)

Figure 6.5 Example of applying Laplacian filter. (a) Original image. (b) Laplacian image.
(c) Sharpened image.
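A short sketch of this sharpening procedure using the 8-neighbour mask given above (NumPy/SciPy); the clipping to [0, 255] is an added practical detail, not something stated in the lecture.

import numpy as np
from scipy import ndimage

lap_mask = np.array([[-1, -1, -1],
                     [-1,  8, -1],
                     [-1, -1, -1]], dtype=float)

def laplacian_sharpen(img):
    f = img.astype(float)
    lap = ndimage.correlate(f, lap_mask, mode='nearest')  # the "Laplacian image"
    g = f + lap                                           # g(x,y) = f(x,y) + Laplacian
    return np.clip(g, 0, 255).astype(np.uint8)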



Gray level transformation:

Many image processing techniques are based on gray-level transformations, which operate directly on pixels. A gray-level image involves 256 levels of gray; in its histogram the horizontal axis spans from 0 to 255, and the vertical axis depends on the number of pixels at each level.

The basic formulation of a gray-level transformation is:

s = T(r)

where T is the transformation, r is the pixel value before processing (input) and s is the pixel value after processing (output).

Let r = f(x,y) and s = g(x,y); 'r' and 's' denote the gray levels of f and g at (x,y).

There are three types of transformation:

1. Linear
2. Logarithmic
3. Power - law

The overall graph is shown below:


Linear Transformation
The linear transformations include the identity transformation and the negative transformation.

In the identity transformation, each input gray level is mapped directly to the same output gray level (s = r).

The negative transformation is the opposite of the identity transformation. Here, each value of the input image is subtracted from L−1 and the result is mapped onto the output image (s = (L−1) − r).

Logarithmic transformations
Logarithmic transformation is divided into two types:

1. Log transformation
2. Inverse log transformation

The formula for the log transformation is:

s = c log(1 + r)

Here, r and s are the input and output pixel values and c is a constant. In the formula, 1 is added to each pixel value because if a pixel intensity is zero then log(0) is undefined, so adding one guarantees a minimum value of log(1) = 0.

When the log transformation is applied, dark pixel values are expanded while higher pixel values are compressed.

(Figure: (a) Fourier spectrum and (b) result of applying the log transformation.)
Power-law transformations
Power-law transformation comes in two forms: nth-power transformation and nth-root transformation.

Formula:

s = c r^γ

Here, γ is gamma, which is why this transformation is also known as the gamma transformation.

Every display device has its own gamma, which is why the same image appears at different intensities on different devices; gamma correction compensates for this. These transformations are widely used for enhancing images.

For example, the gamma of a CRT is between 1.8 and 2.5.
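A minimal sketch of the log and power-law (gamma) transformations described above, assuming an 8-bit input image; the particular choice of c and gamma here is arbitrary.

import numpy as np

def log_transform(img):
    # s = c * log(1 + r), with c chosen so the output again spans 0..255
    r = img.astype(float)
    c = 255.0 / np.log(1.0 + r.max())
    return (c * np.log1p(r)).astype(np.uint8)

def gamma_transform(img, gamma=2.2, c=1.0):
    # s = c * r^gamma, applied to intensities normalised to [0, 1]
    r = img.astype(float) / 255.0
    s = c * np.power(r, gamma)
    return np.clip(s * 255.0, 0, 255).astype(np.uint8)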

Image Enhancement
The main objective of image enhancement is to process a given image into a form more suitable for a specific application. It makes features such as edges, boundaries or contrast more noticeable. Enhancement does not add data to the image, but it increases the dynamic range of the chosen features so that they can be detected easily.

The difficulty in image enhancement is that the criterion for enhancement is hard to quantify, which is why a variety of enhancement techniques is required to obtain satisfying results.

Histogram Processing:

In digital image processing, the histogram is a graphical representation of a digital image: a plot of the number of pixels at each tonal value. Nowadays an image histogram is available in digital cameras, and photographers use it to see the distribution of the tones captured.

The horizontal axis of the graph represents the tonal values, whereas the vertical axis represents the number of pixels having each particular tonal value. Black and dark tones appear on the left side of the horizontal axis, medium gray in the middle, and bright tones on the right; the height along the vertical axis indicates how large an area of the image falls into each tonal range.

Applications of Histograms
1. In digital image processing, histograms are used for simple calculations in software.
2. It is used to analyze an image. Properties of an image can be predicted by the detailed
study of the histogram.
3. The brightness of the image can be adjusted by having the details of its histogram.
4. The contrast of the image can be adjusted according to the need by having details of
the x-axis of a histogram.
5. It is used for image equalization. Gray level intensities are expanded along the x-axis to
produce a high contrast image.
6. Histograms are used in thresholding as it improves the appearance of the image.
7. If we have input and output histogram of an image, we can determine which type of
transformation is applied in the algorithm.

Histogram Processing Techniques


Histogram Sliding
In histogram sliding, the complete histogram is shifted towards the right or towards the left. When a histogram is shifted right or left, a clear change is seen in the brightness of the image. The brightness of the image is related to the intensity of light emitted by the light source.

Histogram Stretching
In histogram stretching, the contrast of an image is increased. The contrast of an image is defined by the difference between the maximum and minimum pixel intensity values. To increase the contrast of an image, its histogram is stretched so that it covers the full dynamic range. From the histogram of an image we can check whether the image has low or high contrast.

Histogram Equalization
Histogram equalization redistributes the pixel values of an image. The transformation is chosen so that an approximately uniform, flattened histogram is produced: it increases the dynamic range of pixel values and tends towards an equal count of pixels at each level, which gives a high-contrast image. In histogram stretching the shape of the histogram remains the same, whereas in histogram equalization the shape of the histogram changes, and equalization produces only one fixed result for a given image.

In image processing, histogram matching or histogram specification is the transformation of


an image so that its histogram matches a specified histogram. The well-known histogram
equalization method is a special case in which the specified histogram is uniformly distributed.

It is possible to use histogram matching to balance detector responses as a relative detector


calibration technique. It can be used to normalize two images, when the images were acquired at
the same local illumination (such as shadows) over the same location, but by different sensors,
atmospheric conditions or global illumination.

Implementation example
The following illustrates histogram matching on a grayscale image: the input image has a given histogram, it is matched to a reference histogram chosen to emphasize the lower gray levels, and after matching the histogram of the output image closely follows the reference.
(Figures: histogram of the input image, desired reference histogram, and histogram of the output image after matching.)
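A compact NumPy sketch of histogram equalization and histogram matching (specification) for 8-bit grayscale images; the reference image used for matching is whatever the caller supplies and is an assumption of this example.

import numpy as np

def equalize(img):
    # map gray levels through the normalised cumulative histogram
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist) / img.size
    lut = np.round(255 * cdf).astype(np.uint8)
    return lut[img]

def match_histogram(img, reference):
    # for each input level, find the reference level with the closest CDF value
    cdf_in = np.cumsum(np.bincount(img.ravel(), minlength=256)) / img.size
    cdf_ref = np.cumsum(np.bincount(reference.ravel(), minlength=256)) / reference.size
    lut = np.searchsorted(cdf_ref, cdf_in).clip(0, 255).astype(np.uint8)
    return lut[img]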
What are Active Contours?
Active contour is a segmentation method that uses energy forces and
constraints to separate the pixels of interest from a picture for further
processing and analysis.

Active contour is defined as an active model for the segmentation process.


Contours are the boundaries that define the region of interest in an image. A
contour is a collection of points that have been interpolated. The
interpolation procedure might be linear, splines, or polynomial, depending on
how the curve in the image is described.

Why Active Contours is needed?


The primary use of active contours in image processing is to define smooth
shapes in images and to construct closed contours for regions. It is mainly
used to identify uneven shapes in images.

Active contours are used in a variety of medical image segmentation


applications. Various forms of active contour models are employed in a
variety of medical applications, particularly for the separation of desired
regions from a variety of medical images. A slice of a brain CT scan, for
example, is examined for segmentation using active contour models.

How does Active Contour work?


Active contours are the technique of obtaining deformable models or
structures in an image with constraints and forces for segmentation. Contour
models define the object borders or other picture features to generate a
parametric curve or contour.

The curvature of the models is determined using several contour techniques


that employ external and internal forces. The energy function is always
related to the image’s curve. External energy is described as the sum of
forces caused by the picture that is specifically used to control the location of
the contour onto the image, and internal energy, which is used to govern
deformable changes.
The contour segmentation constraints for a certain image are determined by
the needs. The desired shape is obtained by defining the energy function. A
collection of points that locate a contour is used to describe contour
deformation. This shape corresponds to the desired image contour, which
was defined by minimizing the energy function.

Active Contour Segmentation Models


1. Snake Model
The snake model is a technique that has the ability to solve a broad range of
segmentation problems. The model’s primary function is to identify and
outline the target object for segmentation. It requires some prior knowledge
of the target object’s shape, especially for complicated things. Active snake
models, often known as snakes, are generally configured by the use of spline
focused on minimizing energy, followed by various forces governing the
image.

Equation

A simple snake model can be denoted by a set of n points v_i, i = 0, …, n−1, an internal elastic energy term E_internal and an external edge-based energy term E_external. The internal energy term regulates the snake's deformations, while the external energy term controls the fitting of the contour onto the image. The external energy is typically a combination of forces caused by the image, E_image, and constraint forces imposed by the user, E_con.

The snake's energy is the total of its external and internal energy, which can be written as

E_snake = ∫ [E_internal(v(s)) + E_image(v(s)) + E_con(v(s))] ds,

the integral being taken over the contour parameter s from 0 to 1.

Advantage

The applications of the active snake model are expanding rapidly, particularly
in the many imaging domains. In the field of medical imaging, the snake model
is used to segment one portion of an image that has unique characteristics
when compared to other regions of the picture. Traditional snake model
applications in medical imaging include optic disc and cup segmentation to
identify glaucoma, cell image segmentation, vascular region segmentation,
and several other regions segmentation for diagnosis and study of disorders
or anomalies.

Disadvantage

The conventional active snake model approach has various inefficiencies,


such as noise sensitivity and erroneous contour detection in high-complexity
objects, which are addressed in advanced contour methods.
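For experimentation, scikit-image ships an implementation of the classical snake. The sketch below is only illustrative: the synthetic disc image, the circular initial contour and all parameter values (alpha, beta, gamma, w_edge) are assumptions of this example, not values from the text.

import numpy as np
from skimage.filters import gaussian
from skimage.segmentation import active_contour

# hypothetical test image: a bright disc on a dark background
yy, xx = np.mgrid[0:200, 0:200]
img = ((xx - 100) ** 2 + (yy - 100) ** 2 < 40 ** 2).astype(float)

# initial contour: a circle of radius 70 around the object
s = np.linspace(0, 2 * np.pi, 200)
init = np.column_stack([100 + 70 * np.sin(s), 100 + 70 * np.cos(s)])

# alpha/beta weight the internal (elasticity/rigidity) energy,
# w_edge weights the external image energy
snake = active_contour(gaussian(img, sigma=3), init,
                       alpha=0.015, beta=10, gamma=0.001, w_edge=1)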

2. Gradient Vector Flow Model


The gradient vector flow model is a more developed and well-defined version
of the snake or active contour models. The traditional snake model has two
limitations: inadequate contour convergence for concave borders and when
the snake curve flow is commenced at a great distance from the minimum. As
an extension, the gradient vector flow model uses the gradient vector flow
field as an energy constraint to determine the contour flow.

Equation

In 2D, the GVF vector field F_GVF = [u(x,y), v(x,y)] minimizes the energy functional

E = ∬ μ (u_x² + u_y² + v_x² + v_y²) + |∇f|² |F_GVF − ∇f|² dx dy,

where f is an edge map of the image and μ controls the smoothness of the field.
3. Balloon Model
A snake model isn’t drawn to far-off edges. If no significant image forces
apply to the snake model, its inner side will shrink. A snake that is larger than
the minima contour will eventually shrink into it, whereas a snake that is
smaller than the minima contour will not discover the minima and will instead
continue to shrink. To address the constraints of the snake model, the balloon
model was developed, in which an inflation factor is incorporated into the
forces acting on the snake. The inflation force can overwhelm forces from
weak edges, exacerbating the problem with first guess localization.

Equation

In the balloon model an inflation term is added to the forces acting on the snake:

F_inflation = k1 · n(s),

where n(s) is the unit normal vector of the curve at v(s) and k1 is the magnitude of the force.

4. Geometric or geodesic active contour models


Geometric active contour (GAC) is a form of contour model that adjusts the
smooth curve established in the Euclidean plan by moving the curve’s points
perpendicular. The points move at a rate proportionate to the curvature of
the image’s region. The geometric flow of the curve and the recognition of
items in the image are used to characterize contours. Geometric flow
encompasses both internal and external geometric measures in the region of
interest. In the process of detecting items in an image, a geometric
replacement for snakes is utilized. These contour models rely heavily on the
level set functions that specify the image’s unique regions for segmentation.

Equation

For example, the gradient-descent curve evolution equation of GAC is

∂C/∂t = g(I) (c + K) N,

where g(I) is a halting (edge-stopping) function, c is a Lagrange multiplier, K is the curvature, and N is the unit inward normal. This particular form of curve evolution equation depends only on the velocity in the normal direction.
Image Segmentation

Outline:
• Introduction
• Detection of Discontinuities
  • Point detection
  • Line detection
  • Edge detection
  • Combined detection
• Edge linking and boundary detection
• Thresholding
  • Adaptive thresholding
  • Threshold selection based on boundary characteristics
• Region-oriented segmentation
  • Region growing by pixel aggregation
  • Region splitting and merging

1. Introduction
• The objective is to subdivide an image into its constituent parts or objects for subsequent processing such as recognition.
• It is one of the most important steps leading to the analysis of processed image data.

Complete vs. partial segmentation
• In complete segmentation,
  - the disjoint regions segmented correspond uniquely to objects in the input image;
  - cooperation with higher processing levels which use specific knowledge of the problem domain is necessary.
• In partial segmentation,
  - the regions segmented do not correspond directly to image objects.
• Totally correct and complete segmentation of complex scenes usually can't be achieved.
• A reasonable aim is to use partial segmentation as an input to higher-level processing.
Applications:
• Simple segmentation problems:
  1. Contrasted objects on a uniform background
  2. Simple assembly tasks, blood cells, printed characters, etc.

How to achieve segmentation?
• The image is divided into separate regions that are homogeneous with respect to a chosen property such as color, brightness, texture, etc.
• Segmentation algorithms generally are based on 2 basic properties of gray-level values:
  1. Discontinuity - isolated points, lines and edges of the image.
  2. Similarity - thresholding, region growing, region splitting and merging.
• Segmentation methods:
  1. Global approaches such as thresholding
  2. Edge-based segmentation
  3. Region-based segmentation

2. Detection of Discontinuities:
• There are 3 basic types of discontinuities: points, lines and edges.
• The detection is based on convolving the image with a spatial mask.
• A general 3×3 mask:

  w(−1,−1)  w(−1,0)  w(−1,1)
  w(0,−1)   w(0,0)   w(0,1)
  w(1,−1)   w(1,0)   w(1,1)

• The response of the mask at any point (x,y) in the image is

  R(x,y) = Σ(i=−1..1) Σ(j=−1..1) p(x−i, y−j) w(i,j)
2.1 Point detection
• A point has been detected at the location p(i,j) on which the mask is centered if |R| > T, where T is a nonnegative threshold and R is obtained with the following mask:

  −1  −1  −1
  −1   8  −1
  −1  −1  −1

• The idea is that the gray level of an isolated point will be quite different from the gray level of its neighbors.

(Figures: original image, noise-added image, filtered output, thresholded output.)

2.2 Line detection
• Line masks:

  Horizontal line:      +45° line:          Vertical line:       −45° line:
  −1  −1  −1            −1  −1   2          −1   2  −1            2  −1  −1
   2   2   2            −1   2  −1          −1   2  −1           −1   2  −1
  −1  −1  −1             2  −1  −1          −1   2  −1           −1  −1   2

• If, at a certain point in the image, |Ri| > |Rj| for all j ≠ i, that point is said to be more likely associated with a line in the direction of mask i.
2.3 Edge detection
• Edge detection locates sharp changes in the intensity function.
• Edges are pixels where the brightness changes abruptly.
• A change of the image function can be described by a gradient that points in the direction of the largest growth of the image function.
• An edge is a property attached to an individual pixel and is calculated from the image function behavior in a neighborhood of the pixel.
• The magnitude of the first derivative detects the presence of an edge.
• The sign of the second derivative determines whether the edge pixel lies on the dark side or the light side.

(Figures: line detection example — original image; outputs of the horizontal, +45°, vertical and −45° line masks.)
(a) Gradient operator

• For a function f(x,y), the gradient of f at coordinates (x',y') is defined as the vector

  ∇f(x', y') = [ ∂f/∂x , ∂f/∂y ] evaluated at (x', y')

• Magnitude of the vector ∇f(x', y'):

  |∇f(x', y')| = [ (∂f/∂x)² + (∂f/∂y)² ]^(1/2) evaluated at (x', y')

• Direction of the vector ∇f(x', y'):

  α(x', y') = tan⁻¹( (∂f/∂y) / (∂f/∂x) ) evaluated at (x', y')

• Its magnitude can be approximated in the digital domain in a number of ways, which result in a number of operators such as the Roberts, Prewitt and Sobel operators for computing its value.

(Fig.: Edge detection by derivative operators — (a) a light stripe on a dark background, (b) a dark stripe on a light background.)
Sobel operator:

• It provides both a differentiating and a smoothing effect, which is particularly attractive as derivatives typically enhance noise.

  Gx:                    Gy:
  −1  −2  −1             −1   0   1
   0   0   0             −2   0   2
   1   2   1             −1   0   1

(Figure: original image and Sobel-processed image.)

(b) Laplacian Operator

• The Laplacian of a 2D function f(x,y) is a 2nd-order derivative defined as

  ∇²f(x', y') = ∂²f/∂x² + ∂²f/∂y² evaluated at (x', y')

• The Laplacian has the same properties in all directions and is therefore invariant to rotation in the image.
• It can also be implemented in digital form in various ways.
• For a 3×3 region, the mask is given as

   0  −1   0
  −1   4  −1
   0  −1   0
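A small sketch computing the Sobel gradient magnitude and direction with the masks given above (NumPy/SciPy); the threshold used to mark edge pixels is an arbitrary choice for illustration.

import numpy as np
from scipy import ndimage

Gx = np.array([[-1, -2, -1],
               [ 0,  0,  0],
               [ 1,  2,  1]], dtype=float)
Gy = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)

def sobel_edges(img, thresh=100.0):
    f = img.astype(float)
    gx = ndimage.correlate(f, Gx, mode='nearest')
    gy = ndimage.correlate(f, Gy, mode='nearest')
    magnitude = np.hypot(gx, gy)       # |grad f|
    direction = np.arctan2(gy, gx)     # alpha(x, y)
    return magnitude > thresh, magnitude, direction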

&<+,PDJH6HJPHQWDWLRQS &<+,PDJH6HJPHQWDWLRQS
(Figure: original image and Laplacian-processed image.)

• The Laplacian is seldom used in practice for edge detection for the following reasons:
  1. As a 2nd-order derivative, it is unacceptably sensitive to noise.
  2. It produces double edges and is unable to detect edge direction.
• The Laplacian usually plays the secondary role of a detector for establishing whether a pixel is on the dark or light side of an edge.

2.4 Combined Detection:

• Detection of combinations of points, lines and edges can be achieved by using sets of orthogonal masks.
• A set of 9 3×3 masks was proposed by Frei and Chen (1977).

Basis of the edge subspace:

  W1 = 1/(2√2) ×   1   √2   1        W2 = 1/(2√2) ×   1   0  −1
                   0    0   0                        √2   0  −√2
                  −1  −√2  −1                         1   0  −1

  W3 = 1/(2√2) ×   0  −1   √2        W4 = 1/(2√2) ×  √2  −1   0
                   1   0  −1                         −1   0   1
                 −√2   1   0                          0   1  −√2

Basis of the line subspace:

  W5 = 1/2 ×  −1   0   1             W6 = 1/2 ×   0   1   0
               0   0   0                         −1   0  −1
               1   0  −1                          0   1   0

  W7 = 1/6 ×   1  −2   1             W8 = 1/6 ×  −2   1  −2
              −2   4  −2                          1   4   1
               1  −2   1                         −2   1  −2
1 1 1
W9= 1 1 1
1
"Average" subspace : • Example:
3 
4 7 1 
1 1 1
What's the attribute of the center of 3 5 2 ?
 
• Given a 3x3 region represented by {f(i,j)|-2<i,j<2}, 2 0 0
we have
R1 = 4.5607 R2 = 2.2678
1 1
Rm = ∑ ∑ f (i, j ) wm (i, j ) R3 = -2.6213 R4 = -0.8284
i = −1 j = −1 R5 = -0.5000 R6 = 1.0000
1/ 2
Pline =  ∑ Rm  R7 = 0.5000 R8 = 3.0000
8 2

 m =5  R9 = 8.0000
1/ 2
 4
Pedge =  ∑ Rm 
2

 m =1  Pedge = 5.7879
Paverage = R9 Pline = 3.2404
Paver = 8.0000
where Pline , Paverage and Pedge are the magnitudes of the
projections onto edge, line and average subspaces
respectively, which tell how likely it is associated
with either an edge, a line or nothing.

Conclusion: It's likely not to be an edge or a line.
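The projections in this example can be verified numerically. The sketch below builds the nine Frei–Chen masks listed above and reproduces Pedge, Pline and Paverage for the given 3×3 region.

import numpy as np

r2 = np.sqrt(2.0)
W = [np.array(m, dtype=float) / d for m, d in [
    ([[1, r2, 1], [0, 0, 0], [-1, -r2, -1]], 2 * r2),   # W1..W4: edge subspace
    ([[1, 0, -1], [r2, 0, -r2], [1, 0, -1]], 2 * r2),
    ([[0, -1, r2], [1, 0, -1], [-r2, 1, 0]], 2 * r2),
    ([[r2, -1, 0], [-1, 0, 1], [0, 1, -r2]], 2 * r2),
    ([[-1, 0, 1], [0, 0, 0], [1, 0, -1]], 2),            # W5..W8: line subspace
    ([[0, 1, 0], [-1, 0, -1], [0, 1, 0]], 2),
    ([[1, -2, 1], [-2, 4, -2], [1, -2, 1]], 6),
    ([[-2, 1, -2], [1, 4, 1], [-2, 1, -2]], 6),
    ([[1, 1, 1], [1, 1, 1], [1, 1, 1]], 3),              # W9: average
]]

region = np.array([[4, 7, 1], [3, 5, 2], [2, 0, 0]], dtype=float)
R = np.array([np.sum(w * region) for w in W])

P_edge = np.sqrt(np.sum(R[0:4] ** 2))   # ~5.79
P_line = np.sqrt(np.sum(R[4:8] ** 2))   # ~3.24
P_aver = abs(R[8])                      # 8.0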

&<+,PDJH6HJPHQWDWLRQS &<+,PDJH6HJPHQWDWLRQS
2.5 Edge linking and boundary detection
• The techniques for detecting intensity discontinuities yield pixels lying only on the boundary between regions.
• In practice, this set of pixels seldom characterizes a boundary completely, because of noise, breaks in the boundary caused by nonuniform illumination, and other effects that introduce spurious intensity discontinuities.
• Edge detection algorithms are therefore typically followed by linking and other boundary detection procedures designed to assemble edge pixels into meaningful boundaries.

(a) Local processing
• Two principal properties used for establishing the similarity of edge pixels in this kind of analysis are:
  1. The strength of the response of the gradient operator used to produce the edge pixel.
  2. The direction of the gradient.
• In a small neighborhood, e.g. 3×3 or 5×5, all points with common properties are linked.
• A point (x',y') in the neighborhood of (x,y) is linked to the pixel at (x,y) if both the following magnitude and direction criteria are satisfied:

  | |∇f(x', y')| − |∇f(x, y)| | ≤ threshold Tm
  | α(x', y') − α(x, y) | ≤ threshold Td

(Fig.: (a) original image, (b) detection result without local processing, (c) detection result with local processing; Tm is set as a fraction of the maximum gradient magnitude and Td as a fraction of π.)
3. Thresholding
• Thresholding is one of the most important approaches to image segmentation.
• If background and object pixels have gray levels grouped into 2 dominant modes, they can easily be separated with a threshold.
• Thresholding may be viewed as an operation that involves tests against a function T of the form T = T[x, y, p(x,y), f(x,y)], where f(x,y) is the gray level of the point (x,y) and p(x,y) denotes some local property of this point, such as the average gray level of a neighborhood centered on (x,y).
• Special cases — if T depends on:
  1. f(x,y) only - global threshold
  2. both f(x,y) and p(x,y) - local threshold
  3. (x,y) as well - dynamic threshold
• Multilevel thresholding is in general less reliable, as it is difficult to establish effective thresholds to isolate the regions of interest.

(Fig.: non-adaptive thresholding result — original image, its histogram, and the thresholded result.)
3.1 Adaptive thresholding
• The threshold value varies over the image as a function of local image characteristics.
• The image f is divided into subimages.
• A threshold is determined independently in each subimage.
• If a threshold can't be determined in a subimage, it can be interpolated from the thresholds obtained in neighboring subimages.
• Each subimage is then processed with respect to its local threshold.

(Fig.: histograms of the subimages, and the adaptive thresholding result obtained with the four subimage thresholds TUL, TUR, TDL and TDR.)
3.2 Threshold selection based on boundary characteristics
• A reliable threshold must be selected to identify the mode peaks of a given histogram.
• This capability is very important for automatic threshold selection in situations where the image characteristics can change over a broad range of intensity distributions.
• We can consider only those pixels that lie on or near the boundary between objects and the background, so that the associated histogram is well shaped and gives a good chance of selecting a good threshold.
• The gradient can indicate whether a pixel is on an edge or not.
• The Laplacian can tell whether a given pixel lies on the dark or light (background or object) side of an edge.
• The gradient and Laplacian can produce a 3-level image:

  s(x,y) =   0   if G[f(x,y)] < T
             1   if G[f(x,y)] ≥ T and L[f(x,y)] ≥ 0
            −1   if G[f(x,y)] ≥ T and L[f(x,y)] < 0

  where G is the gradient magnitude, L is the Laplacian and T is a threshold.

(Fig.: (a) original image, (b) processed result without using boundary characteristics, (c) processed result using boundary characteristics.)
4. Region-oriented segmentation
• In the previous methods, we partition an image into regions by finding boundaries between regions based on intensity discontinuities.
• Here, segmentation is accomplished via thresholds based on the distribution of pixel properties, such as intensity or color.
• Basic formulation: let R represent the entire image, which is partitioned into subregions R1, R2, ..., Rn such that
  - ∪(i=1..n) Ri = R
  - Ri is a connected region, i = 1, 2, ..., n
  - Ri ∩ Rj = {} for all i ≠ j
  - P(Ri) = TRUE for i = 1, 2, ..., n
  - P(Ri ∪ Rj) = FALSE for i ≠ j
  where P(Ri) is a logical predicate defined over the points in set Ri.
• Physical meaning of the formulation:
  - The segmentation must be complete, i.e. every point must be in a region.
  - Points in a region must be connected.
  - The regions must be disjoint.
  - The predicate describes the properties that must be satisfied by the pixels in a segmented region — for example P(Ri) = TRUE if all pixels in Ri have the same intensity.
  - Regions Ri and Rj are different in the sense of the predicate P.
4.1 Region growing by pixel aggregation
• Region growing is a procedure that groups pixels or subregions into larger regions.
• Pixel aggregation starts with a set of "seed" points and grows regions from them by appending to each seed point those neighboring pixels that have similar properties such as gray level, texture or color.

(Fig.: (a) original image with seed point, (b) early stage of region growth, (c) final region.)

Example of region growing using known starting points:

Original intensity array:          Result with threshold = 3:
0  0  5  6  7                      a  a  b  b  b
1  1  5  8  7                      a  a  b  b  b
0  1  6  7  7                      a  a  b  b  b
2  0  7  6  6                      a  a  b  b  b
0  1  5  6  5                      a  a  b  b  b

The lecture also shows the corresponding results for thresholds of 5.5 and 9; with a large enough threshold the whole image is grouped into a single region.

• Problems that have to be resolved:
  1. Selection of initial seeds that properly represent the regions of interest.
  2. Selection of suitable properties for including points in the various regions during the growing process.
  3. The formulation of a stopping rule.
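A minimal region-growing sketch that reproduces the threshold T = 3 result above. A single seed in the low-intensity region and 4-connectivity are assumptions of this example; the remaining pixels are simply labelled b, which gives the same partition as the lecture.

import numpy as np
from collections import deque

img = np.array([[0, 0, 5, 6, 7],
                [1, 1, 5, 8, 7],
                [0, 1, 6, 7, 7],
                [2, 0, 7, 6, 6],
                [0, 1, 5, 6, 5]])

def grow(img, seed, T):
    """Aggregate 4-connected pixels whose value differs from the seed pixel by <= T."""
    H, W = img.shape
    seed_val = int(img[seed])
    region = np.zeros((H, W), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < H and 0 <= nx < W and not region[ny, nx] \
               and abs(int(img[ny, nx]) - seed_val) <= T:
                region[ny, nx] = True
                queue.append((ny, nx))
    return region

labels = np.full(img.shape, 'b')
labels[grow(img, seed=(0, 0), T=3)] = 'a'   # seed in the low-intensity region
print(labels)   # columns 0-1 come out 'a', columns 2-4 'b', as in the lecture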

&<+,PDJH6HJPHQWDWLRQS &<+,PDJH6HJPHQWDWLRQS
4.2 Region splitting and merging
• The idea is to subdivide an image initially into a set of arbitrary, disjoint regions and then merge and/or split the regions in an attempt to satisfy the conditions stated above.
• A split-and-merge algorithm is summarized by the following procedure in which, at each step, we:
  (1) split into 4 disjoint quadrants any region Ri for which P(Ri) = FALSE;
  (2) merge any adjacent regions Rj and Rk for which P(Rj ∪ Rk) = TRUE; and
  (3) stop when no further merging or splitting is possible.

(Fig.: partitioned image and the corresponding quadtree — R is split into R1, R2, R3, R4, and R4 is further split into R41, R42, R43, R44.)

• Example (Fig.: stages (a)-(d) of the split-and-merge algorithm):
  (a) The entire image is split into 4 quadrants.
  (b) Only the top-left region satisfies the predicate, so it is not changed, while the other 3 quadrants are split into subquadrants.
  (c) At this point several regions can be merged, with the exception of the 2 subquadrants that include the lower part of the object; these do not satisfy the predicate and must be split further.
(Fig.: (a) original image, (b) result of the split-and-merge algorithm, (c) result of thresholding (b).)

• Image segmentation is a preliminary step in most automatic pictorial pattern-recognition and scene-analysis problems.
• The choice of one segmentation technique over another is dictated mostly by the peculiar characteristics of the problem being considered.
Estimation of the Degradation Function

There are three ways it can be estimated

⮚ Observation

⮚ Experimentation

⮚ Mathematical Modeling
Estimation by Image Observation

⮚ Let the image be degraded by an unknown degradation function H.

⮚ Assume that H is linear and position invariant.

⮚ Gather information from the image itself.

⮚ Look at a small rectangular section of the blurred image that contains sample structures.

⮚ To reduce the effect of noise, look at an area of strong signal content (an area of high contrast).

⮚ Now process this sub-image to arrive at a result that is as unblurred as possible (for example, using a sharpening filter or by processing the small area by hand).

⮚ Let the observed sub-image be denoted by gs(x, y) and the processed (restored) sub-image by f̂s(x, y). Assuming a negligible noise effect, we have

Hs(u, v) = Gs(u, v) / F̂s(u, v)

⮚ Based on the assumption of position invariance, we can deduce the complete degradation function H(u, v) from the characteristics of this function. For example, if the radial plot of Hs(u, v) has the approximate shape of a Gaussian curve, the same shape is used to construct H(u, v) on the larger scale.
Estimation by Experimentation

⮚ If the equipment with which we obtained the degraded image is available, an accurate estimate of the degradation function can be obtained.

⮚ With various system settings of the equipment, obtain a degraded image as close as possible to the given degraded image.

⮚ Using those same system settings, obtain the impulse response of the system by imaging an impulse (a small dot of light).

⮚ The impulse is simulated by a dot of light that is as bright as possible, to reduce the effect of noise to a negligible value.

Since the Fourier transform of an impulse is a constant A,

H(u, v) = G(u, v) / A

where A is a constant describing the strength of the impulse and G(u, v) is the Fourier transform of the observed (imaged) impulse.
Estimation by modeling

The mathematical degradation model proposed by Hufnagel and Stanley [1964] is based on the physical characteristics of atmospheric turbulence:

H(u, v) = e^(−k (u² + v²)^(5/6))

where k is a constant that depends on the nature of the turbulence.
(Figure: images degraded with negligible turbulence, severe turbulence (k = 0.0025), mild turbulence (k = 0.001), and low turbulence (k = 0.00025).)
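A sketch of how the Hufnagel–Stanley model can be used to simulate a turbulence-degraded image (NumPy); frequencies are measured from the centred spectrum, and the default k is simply one of the severities quoted above.

import numpy as np

def atmospheric_blur(img, k=0.0025):
    M, N = img.shape
    u = np.arange(M) - M / 2
    v = np.arange(N) - N / 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2
    H = np.exp(-k * D2 ** (5.0 / 6.0))          # H(u,v) = exp(-k (u^2 + v^2)^(5/6))
    F = np.fft.fftshift(np.fft.fft2(img))
    return np.real(np.fft.ifft2(np.fft.ifftshift(H * F)))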
Another way of mathematical modeling is to start from basic principles. Consider the case where the image is blurred due to uniform linear motion between the image and the sensor during image acquisition.

Let an image f(x, y) undergo planar motion, and let x0(t) and y0(t) be the time-varying components of motion in the x and y directions respectively.

The total exposure at any point of the recording medium is obtained by integrating the instantaneous exposure over the time interval during which the imaging system shutter is open:

g(x, y) = ∫(0..T) f[x − x0(t), y − y0(t)] dt

where g(x, y) is the blurred image and T is the duration of the exposure.
UNIT 3: Image Restoration
Image restoration is the operation of taking a corrupt/noisy image and estimating the clean, original image. Corruption may come in many forms such as motion blur, noise and camera mis-focus.[1] Image restoration is performed by reversing the process that blurred the image; this is done by imaging a point source and using the point source image, which is called the Point Spread Function (PSF), to restore the image information lost to the blurring process.
Image restoration is different from image enhancement in that the latter is designed
to emphasize features of the image that make the image more pleasing to the
observer, but not necessarily to produce realistic data from a scientific point of
view. Image enhancement techniques (like contrast stretching or de-blurring by a
nearest neighbor procedure) provided by imaging packages use no a priori model
of the process that created the image.
With image enhancement noise can effectively be removed by sacrificing some
resolution, but this is not acceptable in many applications. In a fluorescence
microscope, resolution in the z-direction is bad as it is. More advanced image
processing techniques must be applied to recover the object.

Noise models:

Gaussian Noise:
Because of its mathematical simplicity, the Gaussian noise model is often used in practice, even in situations where it is marginally applicable at best. Its probability density function is

p(z) = (1 / (σ√(2π))) e^(−(z − m)² / 2σ²)

where z is the gray-level value, m is the mean and σ² is the variance.
Gaussian noise arises in an image due to factors such as electronic circuit noise and sensor noise caused by poor illumination or high temperature.
The mean m and variance σ² are defined as m = ∫ z p(z) dz and σ² = ∫ (z − m)² p(z) dz.
The spatial filtering technique operates directly on the pixels of an image. The mask is usually taken to be odd in size so that it has a specific center pixel. This mask is moved over the image such that the center of the mask traverses all image pixels.
In this section, we are going to cover the following topics:
● To write a program in Python to implement spatial domain averaging filter and
to observe its blurring effect on the image without using inbuilt functions
● To write a program in Python to implement spatial domain median filter to
remove salt and pepper noise without using inbuilt functions

Theory

● Neighborhood processing in spatial domain: Here, to modify one pixel, we


consider values of the immediate neighboring pixels also. For this purpose,
3X3, 5X5, or 7X7 neighborhood mask can be considered. An example of a 3X3
mask is shown below.
f(x-1, y-1) f(x-1, y) f(x-1, y+1)
f(x, y-1) f(x, y) f(x, y+1)
f(x+1, y-1) f(x+1, y) f(x+1, y+1)
● Low Pass filtering: It is also known as the smoothing filter. It removes the
high-frequency content from the image. It is also used to blur an image. A low
pass averaging filter mask is as shown.
1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9
● High Pass Filtering: It eliminates low-frequency regions while retaining or
enhancing the high-frequency components. A high pass filtering mask is as
shown.
-1/9 -1/9 -1/9
-1/9 8/9 -1/9
-1/9 -1/9 -1/9
● Median Filtering: It is also known as nonlinear filtering. It is used to eliminate
salt and pepper noise. Here the pixel value is replaced by the median value of
the neighboring pixel.
● Mean Filter:

The mean filter is one of the techniques used to reduce noise in images. It is a local averaging operation and one of the simplest linear filters: the value of each pixel is replaced by the average of all the values in its local neighborhood. If f(i,j) is a noisy image, the smoothed image g(x,y) can be obtained by

g(x, y) = (1/M) Σ f(i, j), the sum taken over the M pixels (i, j) in the neighborhood of (x, y).

Mean filtering is a simple, intuitive and easy-to-implement method of smoothing images and is often used to reduce noise. It also reduces the amount of intensity variation between one pixel and the next.

However, the mean filter has a problem: a single pixel with a very uncommon value (an outlier) can significantly affect the mean value of all the pixels in its neighborhood.

(Order statistics, adaptive filters, notch filters, band pass and reject filters,optimum
notch filters,…refer notes)

Digital Image Processing


Lectures 23 & 24

M.R. Azimi, Professor

Department of Electrical and Computer Engineering


Colorado State University


Restoration Filters
There are basically two classes of restoration filters.
1 Deterministic-Based
These methods ignore effects of noise and statistics of the image,
e.g., inverse filter and Least Squares (LS) filter.
2 Stochastic-Based
Statistical information of the noise and image is used to generate
the restoration filters, e.g., 2-D Wiener filter and 2-D Kalman filter.
Inverse Filter
(a) Direct Inverse Filter: Attempts to recover the original image from the
observed blurred image using an inverse system, hI (m, n), corresponding
to the blur PSF, h(m, n).

Figure 1: Inverse Filtering.


If we assume the no-noise case, we have

y(m, n) = h(m, n) ∗∗ x(m, n)
Y(k, l) = H(k, l) X(k, l)

The inverse filter produces

x̂(m, n) = y(m, n) ∗∗ hI(m, n)
X̂(k, l) = Y(k, l) HI(k, l)

Then,

x̂(m, n) = x(m, n) ∗∗ h(m, n) ∗∗ hI(m, n)

or

X̂(k, l) = X(k, l) H(k, l) HI(k, l)

Clearly, x̂(m, n) = x(m, n) or X̂(k, l) = X(k, l) iff HI(k, l) = 1/H(k, l) or, equivalently, h(m, n) ∗∗ hI(m, n) = δ(m, n). Thus

X̂(k, l) = Y(k, l)/H(k, l)

Now, if there is slight noise (e.g., quantization noise) in the image,

Y(k, l) = H(k, l) X(k, l) + N(k, l)

The inverse filter gives

X̂(k, l) = X(k, l) + N(k, l)/H(k, l)

At those frequencies where H(k, l) ≈ 0, the term N(k, l)/H(k, l) becomes very large, i.e. the noise is amplified.

(b) Pseudo Inverse Filter
To overcome the problems with the direct inverse filter, modify the transfer function of the inverse filter to

HI(k, l) = H*(k, l) / (|H(k, l)|² + ε)

where ε is a small positive quantity. For ε = 0, we have HI(k, l) = 1/H(k, l). Alternatively, we can use

HI(k, l) = 1/H(k, l)   if |H(k, l)| ≥ ε
HI(k, l) = 0           if |H(k, l)| < ε

While the first form of the pseudo-inverse filter corresponds to a special case of the Wiener filter (discussed next), it does not take into account the statistics of the noise and image.
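A sketch of pseudo-inverse filtering in the DFT domain (NumPy, using the second form above). The blur transfer function H is assumed to be known and already sampled on the same grid as the image; eps plays the role of ε.

import numpy as np

def pseudo_inverse_restore(y, H, eps=1e-3):
    """Restore a blurred image y given its blur transfer function H (same shape as y)."""
    Y = np.fft.fft2(y)
    HI = np.zeros_like(H, dtype=complex)
    keep = np.abs(H) >= eps
    HI[keep] = 1.0 / H[keep]          # 1/H where |H| >= eps, 0 elsewhere
    return np.real(np.fft.ifft2(Y * HI))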

Example 1: Derive the transfer function of the LS filter and show that it is the same as the inverse filter.
Solution: The LS solution minimizes the LS cost function

J(x̂) = Σm Σn |y(m, n) − ŷ(m, n)|²

where ŷ(m, n) = h(m, n) ∗∗ x̂(m, n) is an estimate of the observed image based upon the LS estimate x̂(m, n). Using energy preservation,

J(x̂) = J(X̂) = (1/N²) Σk Σl |Y(k, l) − Ŷ(k, l)|²

where Ŷ(k, l) = H(k, l) X̂(k, l). Taking the derivative of J w.r.t. X̂(k, l) gives

∂J/∂X̂(k, l) = 0  ⟹  −H*(k, l) (Y(k, l) − H(k, l) X̂(k, l)) = 0

which gives the same solution as the inverse filter, i.e.,

X̂_LS(k, l) = Y(k, l)/H(k, l)

Question: What would you change in the cost function to yield an LS solution that is the same as that of the pseudo-inverse?

Wiener Filter
This filter takes into account 1st and 2nd order statistics of the noise and
image to generate the restoration filter transfer function. Additionally, it
provides the best linear estimate of the image based upon minimizing the MSE
(i.e. MMSE). However, it assumes wide-sense stationarity of the image
field. We begin with the 1-D case.
1-D Wiener Filter:
Goal: Find g(n)’s or FIR filter coefficients so that z(n) is as close as
possible to the desired signal d(n) by minimizing MSE J(g) = E[e2 (n)].

Figure 2: Wiener Filtering Process.


The MSE is given by

J(g(n)) = E[e²(n)],   with  e(n) = d(n) − z(n)

z(n) = g(n) ∗ y(n) = Σ_{i=0}^{N} g(i) y(n − i)

Then

J(g) = E[(d(n) − Σ_{i=0}^{N} g(i) y(n − i))²]

To find the optimum g(k)'s,

∂J(g)/∂g(k) = 0  ⟹  −2 E[y(n − k)(d(n) − Σ_{i=0}^{N} g(i) y(n − i))] = 0,   k ∈ [0, N]

Thus, we get

Σ_{i=0}^{N} g(i) E[y(n − i) y(n − k)] = E[d(n) y(n − k)]


or

Σ_{i=0}^{N} g(i) r_yy(k − i) = r_dy(k)

g(k) ∗ r_yy(k) = r_dy(k)

where r_dy(k) = E[d(n) y(n − k)] is the cross-correlation between d(n) and y(n), and r_yy(k − i) = E[y(n − i) y(n − k)] is the auto-correlation of y(n).
Taking the DTFT gives the general expression for the Wiener filter transfer function

G(e^{jΩ}) = S_dy(e^{jΩ}) / S_yy(e^{jΩ})

where

S_dy(e^{jΩ}) = DTFT{r_dy(k)}   (cross-power spectrum)
S_yy(e^{jΩ}) = DTFT{r_yy(k)}   (power spectrum)


Now, let d(n) = x(n) (during the training phase). Then, assuming that the noise and signal are uncorrelated,

y(n) = h(n) ∗ x(n) + η(n)

r_yy(k) = y(k) ∗ y(−k)
        = (h(k) ∗ x(k) + η(k)) ∗ (h(−k) ∗ x(−k) + η(−k))
        = h(k) ∗ h(−k) ∗ r_xx(k) + r_ηη(k)

r_dy(k) = d(k) ∗ y(−k)
        = x(k) ∗ (h(−k) ∗ x(−k) + η(−k))
        = h(−k) ∗ r_xx(k)

Thus, we get

S_yy(e^{jΩ}) = |H(e^{jΩ})|² S_xx(e^{jΩ}) + S_ηη(e^{jΩ})
S_dy(e^{jΩ}) = H*(e^{jΩ}) S_xx(e^{jΩ})

leading to the Wiener filter transfer function

G(e^{jΩ}) = H*(e^{jΩ}) S_xx(e^{jΩ}) / [ |H(e^{jΩ})|² S_xx(e^{jΩ}) + S_ηη(e^{jΩ}) ]


Important Remarks
1. Once the transfer function of the Wiener filter is designed, the filtering can simply be done as shown in Fig. 3.

Figure 3: Wiener Filter Transfer Function.

2. If h(n) = δ(n) → H(e^{jΩ}) = 1, i.e. the signal-plus-noise case y(n) = x(n) + η(n), the transfer function simplifies to

G(e^{jΩ}) = S_xx(e^{jΩ}) / [ S_xx(e^{jΩ}) + S_ηη(e^{jΩ}) ]

In this case, the power spectra of the noise and signal can be measured from the power spectrum of the observed signal as shown in Fig. 4 (note that for zero-mean white Gaussian noise S_ηη(e^{jΩ}) = σ_η²).

Figure 4: Estimating Noise Power.


3. A more useful form of the Wiener filter for frequency-domain implementation is to work with the DFT instead of the DTFT, in which case the transfer function becomes

G(k) = H*(k) S_xx(k) / [ |H(k)|² S_xx(k) + S_ηη(k) ]

4. The general Wiener filter transfer function,

G(e^{jΩ}) = S_dy(e^{jΩ}) / S_yy(e^{jΩ})   or   G(k) = S_dy(k) / S_yy(k)

can be applied to other types of observation models, e.g., the multiplicative noise case (see next example).

Example 2: Derive the transfer function of the Wiener filter for the following observation model.

y(n) = γ(n)[h(n) ∗ x(n)] + η(n)

where γ(n) and η(n) are uncorrelated multiplicative and additive random noise with means µ_γ and µ_η, and correlations r_γγ(k) and r_ηη(k), respectively.
Solution: We use the result in Remark 4 to obtain G(e^{jΩ}). Since γ(n), η(n) and x(n) are mutually independent, using convolution in the frequency domain, the auto-correlation and power spectrum of the observed signal become

r_yy(k) = y(k) ∗ y(−k)
        = r_γγ(k)[h(k) ∗ h(−k) ∗ r_xx(k)] + r_ηη(k)

and  S_yy(e^{jΩ}) = S_γγ(e^{jΩ}) ∗∗ [ |H(e^{jΩ})|² S_xx(e^{jΩ}) ] + S_ηη(e^{jΩ})

On the other hand, the cross-correlation and cross-power spectrum for d(k) = x(k) become

r_dy(k) = d(k) ∗ y(−k)
        = µ_γ r_xx(k) ∗ h(−k)

S_dy(e^{jΩ}) = µ_γ S_xx(e^{jΩ}) H*(e^{jΩ})

Thus, the transfer function becomes

G(e^{jΩ}) = µ_γ H*(e^{jΩ}) S_xx(e^{jΩ}) / [ S_γγ(e^{jΩ}) ∗∗ ( |H(e^{jΩ})|² S_xx(e^{jΩ}) ) + S_ηη(e^{jΩ}) ]

2-D Wiener Filter:

The 2-D extension is straightforward, leading to the 2-D Wiener transfer function

G(k, l) = H*(k, l) S_xx(k, l) / [ |H(k, l)|² S_xx(k, l) + S_ηη(k, l) ]

where H(k, l) = 2-D DFT{h(m, n)}, and S_xx(k, l) and S_ηη(k, l) are the power spectra of the original image x(m, n) and the additive noise η(m, n), respectively. The process is depicted in Fig. 5.

Figure 5: Image formation system and 2-D Wiener filtering.

Important Remarks
1. The above transfer function can be written as

G(k, l) = H*(k, l) / [ |H(k, l)|² + S_ηη(k, l)/S_xx(k, l) ]

where S_ηη(k, l)/S_xx(k, l) > 0 plays the role of ε in the pseudo inverse filter, but is dependent on the SNR. Thus, the Wiener filter does not have the singularity problem of the inverse filter.
2. If S_ηη >> S_xx, G(k, l) becomes small, de-emphasizing the noisy observation at that frequency. Recall that we have the opposite effect in the inverse filter; i.e. the Wiener filter does not have ill-conditioning problems.

3. Assume the no-blur case, h(m, n) = δ(m, n) → H(k, l) = 1. Then

G(k, l) = S_xx(k, l) / [ S_xx(k, l) + S_ηη(k, l) ] = SNR(k, l) / [ SNR(k, l) + 1 ]

where

SNR(k, l) ≜ S_xx(k, l) / S_ηη(k, l)

If SNR >> 1 → G(k, l) ≈ 1, i.e. such frequency components are in the passband, while if SNR << 1 → G(k, l) ≈ SNR(k, l), i.e. such frequencies are attenuated in proportion to their SNR values. This shows the adaptive nature of the Wiener filter.
4. Since the auto-correlation is real and even, S_xx and S_ηη are also real and non-negative. Thus, the phase of G(k, l) is

Φ_G = Φ_{H*} = −Φ_H

i.e. the phase of the inverse filter.
5. The main drawback of the Wiener filter is the assumption of wide-sense stationarity (WSS) for both image and noise. This is generally not true for images. As a result, the filtered images exhibit some edge smearing artifacts.
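As a concrete illustration of the 2-D Wiener restoration formula discussed above, the sketch below implements G(k, l) = H*(k, l) / (|H(k, l)|² + S_ηη/S_xx) with NumPy, assuming the blur PSF is known; the constant nsr value used here is an illustrative stand-in for the noise-to-signal power ratio S_ηη(k, l)/S_xx(k, l).

import numpy as np

def wiener_restore(y, h, nsr=0.01):
    """2-D Wiener deconvolution: G = conj(H) / (|H|^2 + Snn/Sxx),
    with nsr approximating a constant noise-to-signal power ratio."""
    H = np.fft.fft2(h, s=y.shape)      # blur transfer function (zero-padded PSF)
    Y = np.fft.fft2(y)                 # spectrum of the observed image
    G = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(G * Y))

Setting nsr = 0 recovers the (ill-conditioned) inverse filter, which is the singularity problem that the Wiener filter avoids.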

Figs. 6(a)-(d) show original, blurred (motion blur size 5), blurred and
noisy (SNR=7 dB), and finally the Wiener filtered Baboon. Note: even
some of the whisker details are restored.

Figure 6: Blurred and Noisy Baboon and Wiener Filter Results.


Recursive Restoration & Kalman Filtering

The Wiener filter assumes WSS of the image field, which leads to smearing of the edges and subtle textural features. This can be circumvented by the recursive Kalman Filter (KF). The KF requires:
1. A 1-D or 2-D AR/ARMA model representation of the image field and the covariance of the noise.
2. 1-D or 2-D state and observation equations, with the states being the image pixels to estimate.
3. Recursive implementation of the filtering equations.
Remark: For WSS cases, the KF becomes the same as the Wiener filter.
Here, we discuss one type of recursive image restoration using Strip
Kalman Filter (Azimi-Sadjadi, IEEE Trans. CAS, June 1989).


Strip Kalman Filter

Consider an image which is vector scanned horizontally in strips of width W. The image is assumed to be represented by an Mth-order vector AR process with a causal region of support (ROS), as shown. Then,

z(k) = A_1^T z(k − 1) + · · · + A_M^T z(k − M) + e(k)

where z(k) is a vector of pixels of size W, e(k) is the driving noise vector, and the A_i's are model coefficient matrices of size W × W. We assume that E[e(k)] = 0 and E[e(k) e^T(j)] = R_e δ(k − j), with R_e being the covariance matrix of e(k).

Figure 7: Strip Kalman Filtering.


Note that the model coefficient matrices can be identified using the vector Yule-Walker equation. To see this, let's form the covariance matrix of the data from the vector AR model above. This gives

R_z(m) = E[z(k) z^T(k − m)] = R_z(m − 1) A_1 + · · · + R_z(m − M) A_M + R_e δ(m)

which yields

⎡ R_z(0)    R_z^T(1)     · · ·   R_z^T(M)     ⎤ ⎡  I_W  ⎤   ⎡ R_e ⎤
⎢ R_z(1)    R_z(0)       · · ·   R_z^T(M − 1) ⎥ ⎢ −A_1  ⎥   ⎢  0  ⎥
⎢   ⋮          ⋮           ⋱         ⋮         ⎥ ⎢   ⋮   ⎥ = ⎢  ⋮  ⎥
⎣ R_z(M)    R_z(M − 1)   · · ·   R_z(0)       ⎦ ⎣ −A_M  ⎦   ⎣  0  ⎦

Here R_z(−m) = R_z^T(m). Solving this vector Yule-Walker equation leads to the parameter matrices of the AR model. The sample data covariance matrices R̂_z(m) in this equation are estimated using the training images as

R̂_z(m) = (1/N) Σ_{k=0}^{N−1} z(k) z^T(k − m).

Next, we form the state equation by defining the state vector (of size WM × 1) as

x(k) = [z^T(k)  z^T(k − 1)  · · ·  z^T(k − M + 1)]^T

Then the state equation (in canonical form) is

x(k) = F x(k − 1) + G e(k)

where

    ⎡ A_1^T   A_2^T   · · ·   A_M^T ⎤        ⎡ I_W ⎤
    ⎢ I_W     0       · · ·   0     ⎥        ⎢  0  ⎥
F = ⎢  0      I_W     · · ·   0     ⎥ ,  G = ⎢  ⋮  ⎥
    ⎢  ⋮        ⋱       ⋱      ⋮     ⎥        ⎢  ⋮  ⎥
    ⎣  0      · · ·   I_W     0     ⎦        ⎣  0  ⎦

Now, consider the vector observation equation in which the blur is modeled as an MA (or FIR) process with ROS limited to that of the AR model, i.e.

y(k) = H_0 z(k) + H_1 z(k − 1) + · · · + H_{M−1} z(k − M + 1) + n(k)


where y(k) is the observation vector and n(k) represents the additive noise vector with E[n(k)] = 0 and E[n(k) n^T(j)] = R_n δ(k − j), with R_n being the covariance matrix of n(k). In vector form, the observation equation becomes

y(k) = H x(k) + n(k)

where H = [H_0 · · · H_{M−1}].

The aim of the Kalman filter is to generate estimates of the state vector (image) recursively based upon all available observations, i.e. x̂(k|k) = E[x(k)|Z(k)], where Z(k) = {z(1), · · · , z(k)} is the set of all observations up to time k. The Kalman filter equations, in order, are:
1. Initial Estimate: x̂(k|k − 1) = F x̂(k − 1|k − 1).
2. A priori Error Covariance Matrix: P(k|k − 1) = F P(k − 1|k − 1) F^T + G R_e G^T, where P(k|k − 1) = E[(x(k) − x̂(k|k − 1))(x(k) − x̂(k|k − 1))^T].
3. Kalman Gain: K(k) = P(k|k − 1) H^T [H P(k|k − 1) H^T + R_n]^{−1}.


4. Update Estimate: x̂(k|k) = x̂(k|k − 1) + K(k)[y(k) − H x̂(k|k − 1)].
5. A posteriori Error Covariance Matrix: P(k|k) = [I − K(k)H] P(k|k − 1), where P(k|k) = E[(x(k) − x̂(k|k))(x(k) − x̂(k|k))^T].
Important Remarks:
1. The initial conditions are x̂(0|0) = µ_x [1, · · · , 1]^T and P(0|0) = P_x(0), which is a measure of confidence in µ_x, i.e. a larger P_x(0) for lesser confidence.
2. The KF is driven by the innovation process ξ(k) = y(k) − H x̂(k|k − 1).
3. The equations in steps 2, 3, and 5 can be iterated off-line until convergence of K(k) is reached (stationary case), and then K(k) is used on-line in the state updating process.
4. Note that the Kalman gain matrix decreases when the effects of R_e or P_x(0) decrease, or when R_n increases. When R_e ≈ 0, K(k) asymptotically approaches zero, and hence the KF becomes independent of the observations (data saturation may occur).
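The five equations above map directly onto code. The sketch below is a generic vector Kalman filter recursion in NumPy, not the full strip implementation (in that case F, G, H, R_e and R_n would come from the identified AR model and the blur model described above); it is only meant to show the order of the prediction and update computations.

import numpy as np

def kalman_step(x_prev, P_prev, y, F, G, H, Re, Rn):
    """One recursion of the Kalman filter (steps 1-5 above)."""
    # 1. Initial (predicted) estimate
    x_pred = F @ x_prev
    # 2. A priori error covariance
    P_pred = F @ P_prev @ F.T + G @ Re @ G.T
    # 3. Kalman gain
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + Rn)
    # 4. Updated estimate, driven by the innovation y - H x_pred
    x_upd = x_pred + K @ (y - H @ x_pred)
    # 5. A posteriori error covariance
    P_upd = (np.eye(P_pred.shape[0]) - K @ H) @ P_pred
    return x_upd, P_upd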


Figs. 8(a) and (b) show the noisy (additive WGN) Baboon with SNR_input = −1.7 dB and the strip-KF-processed Baboon with SNR_output = 2.83 dB (i.e. a >4.5 dB improvement). A vector AR(1) model was used with W = 64. Clearly, the results are impressive.

Figure 8: Noisy Baboon and Strip Kalman Filter Results.



Linear position invariant degradation models
Types of Morphological Operations
Morphology is a broad set of image processing operations that process images based on shapes.
Morphological operations apply a structuring element to an input image, creating an output image of
the same size. In a morphological operation, the value of each pixel in the output image is based on a
comparison of the corresponding pixel in the input image with its neighbors.

Morphological Dilation and Erosion


The most basic morphological operations are dilation and erosion. Dilation adds pixels to the
boundaries of objects in an image, while erosion removes pixels on object boundaries. The
number of pixels added or removed from the objects in an image depends on the size and
shape of the structuring element used to process the image. In the morphological dilation and
erosion operations, the state of any given pixel in the output image is determined by applying
a rule to the corresponding pixel and its neighbors in the input image. The rule used to
process the pixels defines the operation as a dilation or an erosion. The rules for the two operations are:
Rules for Dilation and Erosion
Dilation: the value of the output pixel is the maximum value of all pixels in the input pixel's neighborhood (in a binary image, a pixel is set to 1 if any of its neighbors has the value 1).
Erosion: the value of the output pixel is the minimum value of all pixels in the input pixel's neighborhood (in a binary image, a pixel is set to 0 if any of its neighbors has the value 0).
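A minimal sketch of these two rules, assuming a square k x k structuring element; library routines (e.g. in scipy.ndimage) provide more general versions with arbitrary structuring elements.

import numpy as np

def dilate(img, k=3):
    """Dilation: output pixel = maximum over the k x k neighborhood."""
    pad = k // 2
    padded = np.pad(img, pad, mode='constant', constant_values=img.min())
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].max()
    return out

def erode(img, k=3):
    """Erosion: output pixel = minimum over the k x k neighborhood."""
    pad = k // 2
    padded = np.pad(img, pad, mode='constant', constant_values=img.max())
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].min()
    return out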
Periodic Noise Reduction by Frequency Domain Filtering:
The following types of filters are used for this purpose.
Band Reject Filters:
A band reject filter removes a band of frequencies about the origin of the Fourier transform.
Ideal Band Reject Filter:
An ideal band reject filter is given by the expression

H(u, v) = 1   if D(u, v) < D0 − W/2
H(u, v) = 0   if D0 − W/2 ≤ D(u, v) ≤ D0 + W/2
H(u, v) = 1   if D(u, v) > D0 + W/2

where
D(u, v) – the distance from the origin of the centered frequency rectangle,
W – the width of the band,
D0 – the radial center of the band.

(Fig 3.4.1: Perspective plots of ideal, Butterworth, Gaussian and notch filters.
Source: D.E. Dudgeon and R.M. Mersereau, Multidimensional Digital Signal Processing, Prentice Hall Professional Technical Reference, 1990, p. 312)

Band Pass Filter:
The function of a band pass filter is opposite to that of a band reject filter. It allows a specific frequency band of the image to be passed and blocks the rest of the frequencies. The transfer function of a band pass filter can be obtained from a corresponding band reject filter with transfer function Hbr(u, v) by using the equation

Hbp(u, v) = 1 − Hbr(u, v)

Butterworth Band Pass Filter:



Notch Filters:
A notch filter rejects (or passes) frequencies in predefined neighborhoods about a center frequency.
Due to the symmetry of the Fourier transform, notch filters must appear in symmetric pairs about the origin.
The transfer function of an ideal notch reject filter of radius D0, with centers at (u0, v0) and, by symmetry, at (−u0, −v0), is

H(u, v) = 0   if D1(u, v) ≤ D0  or  D2(u, v) ≤ D0
H(u, v) = 1   otherwise

where D1(u, v) and D2(u, v) are the distances of (u, v) from the notch centers (u0, v0) and (−u0, −v0) in the centered frequency rectangle.

Ideal, Butterworth, Gaussian notch filters

Fig 3.4.2: Perspective plots of ideal, Butterworth and Gaussian notch filters.
(Source: D.E. Dudgeon and R.M. Mersereau, Multidimensional Digital Signal Processing, Prentice Hall Professional Technical Reference, 1990, p. 335)

These filters cannot be applied directly on an image because they may remove too much image detail, but they are effective in isolating the effect of selected frequency bands on an image.
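A minimal sketch of how such masks can be generated over the centered frequency rectangle, assuming the image spectrum has been shifted with np.fft.fftshift; the radii and band width are illustrative parameters.

import numpy as np

def ideal_band_reject(shape, D0, W):
    """Ideal band reject mask: 0 inside the ring D0 - W/2 <= D(u,v) <= D0 + W/2, 1 elsewhere."""
    M, N = shape
    u = np.arange(M) - M / 2
    v = np.arange(N) - N / 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    H = np.ones(shape)
    H[(D >= D0 - W / 2) & (D <= D0 + W / 2)] = 0.0
    return H

def ideal_notch_reject(shape, D0, u0, v0):
    """Ideal notch reject mask: 0 within radius D0 of (u0, v0) and its symmetric pair (-u0, -v0)."""
    M, N = shape
    u = np.arange(M) - M / 2
    v = np.arange(N) - N / 2
    D1 = np.sqrt((u[:, None] - u0) ** 2 + (v[None, :] - v0) ** 2)
    D2 = np.sqrt((u[:, None] + u0) ** 2 + (v[None, :] + v0) ** 2)
    H = np.ones(shape)
    H[(D1 <= D0) | (D2 <= D0)] = 0.0
    return H

Filtering then amounts to multiplying the shifted spectrum by the mask and inverse transforming.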
Image Registration is the process of estimating an optimal transformation between two
images.
• Sometimes also known as “Spatial Normalization”.
The image registration process is an automated or manual operation that attempts to discover matching points between two images and spatially align them so as to minimise a desired error, i.e. a uniform proximity measure between the two images. Medical sciences, remote sensing, and computer vision all use image registration.

It could be said that image registration is the process of calculating spatial transforms which align a set of images to a common observational frame of reference, often one of the images in the set. Registration is a key step in any image analysis or understanding task where different sources of data must be combined. During the registration process, two problems become evident:

1. Correspondences between the two images have to be found; this is known as the matching problem, and it is also the most time-consuming step of the algorithm's execution.
2. The spatial information of one of the images has to be transformed, in terms of its coordinate system, so that it relates to the image chosen as the reference.

There are four major steps that every method of image registration has to go through for
image alignment. These could be listed as follows:

● Feature detection: A domain expert detects salient and distinctive objects (closed
boundary areas, edges, contours, line intersections, corners, etc.) in both the reference
and sensed images.
● Feature matching: It establishes the correlation between the features in the reference
and sensed images. The matching approach is based on the content of the picture or
the symbolic description of the control point-set.
● Estimating the transform model: The parameters and kind of the so-called mapping functions are calculated, which align the sensed image with the reference image.
● Image resampling and transformation: The sensed image is transformed using the mapping functions.
Image registration methods are majorly classified into two types: area-based approaches and
feature-based methods. When significant features are lacking in photos and distinguishing
information is given by grey levels/colours rather than local forms and structure, area-based
approaches are preferred.

When picture intensities provide more local structural information, feature-based matching
algorithms are used. Image characteristics produced from the feature extraction technique are
used in these procedures. But these two classifications could be further classified into various
methods. Let’s have a look at those classifications.

Pixel Based Method

For registration, a cross-correlation statistical methodology is employed in this procedure. It


is frequently used for template matching or pattern recognition, which involves finding the
location and orientation of a template or pattern in an image. Cross-correlation is a measure
of similarity or a match metric.

For example, the two-dimensional normalised cross-correlation function assesses the similarity of the template (a reference image) and the image for each translation, where the template is small in comparison to the image.

If the template matches the image at a given translation, the cross-correlation will be at its peak there. Because the measure can be influenced by local image intensity, the cross-correlation should be normalised.

The key disadvantages of correlation approaches are the flatness of the similarity measure
maximum (owing to the self-similarity of the pictures) and the high processing complexity.
The maximum can be successfully sharpened by pre-processing or by applying edge or vector
correlation.
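A sketch of normalised cross-correlation for template matching, as used in this pixel based (area based) registration approach; the brute-force loop is for clarity only and assumes the template is much smaller than the image.

import numpy as np

def normalized_cross_correlation(image, template):
    """Return the NCC map; the peak location gives the best matching translation."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    H, W = image.shape
    ncc = np.zeros((H - th + 1, W - tw + 1))
    for i in range(ncc.shape[0]):
        for j in range(ncc.shape[1]):
            win = image[i:i + th, j:j + tw]
            w = win - win.mean()
            denom = np.sqrt((w ** 2).sum()) * t_norm
            ncc[i, j] = (w * t).sum() / denom if denom > 0 else 0.0
    return ncc

# The matched offset is then np.unravel_index(np.argmax(ncc), ncc.shape).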

Rigid body transformation:


Principal Axes registration:
Visualization: Orthogonal and perspective projection in medicine
When you render a 3-dimensional computer graphics scene, you create a 2-dimensional picture
of the 3D scene. The picture is a projection of the models in the scene onto a 2-dimensional
“screen”. Therefore it is logical to call this operation in the graphics pipeline a projection.
There are two standard projections used in computer graphics. An orthographic projection
maintains parallel lines but provides no sense of depth. A perspective projection provides for a
sense of depth, but parallel lines are skewed toward vanishing points. Orthographic projections
are used in the engineering fields when an accurate representation of a model is
desired. Perspective projections are used when a “real life” view of a scene is desired.
Perspective projections simulate how the human eye sees the real world.
For an orthographic projection, notice that:

● Parallel lines stay parallel.


● There is no perception of depth.

For a perspective projection notice that:

● Parallel lines of the model are not parallel in the rendering.


● You can perceive depth.

● The major task of a projection transformation is to project a 3D scene onto a 2D screen.


● A projection transformation also prepares for these follow-on tasks:
o Clipping - the removal of elements that are not in the camera’s line of sight.
o Viewport mapping - convert a camera’s viewing window into the pixels of an
image.
o Hidden surface removal - determining which objects are in front of other objects.

Before a projection operation a (x,y,z) vertex represents a location in 3-dimensional space. After
a projection the meaning of the values have changed:

● The (x,y) values of a vertex represent its location on a 2-dimensional screen.


● The z value represents the distance of the vertex from the virtual camera and is used
for hidden surface removal.
● All 3 values, x, y, and z, are used for clipping, which removes points, lines, or triangles
that are not visible to the scene’s virtual camera.
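As an illustration of the difference between the two projections, the sketch below applies a simple orthographic projection and a simple pinhole-style perspective projection to 3-D points; the focal length f and the sign/axis conventions are illustrative assumptions, not taken from the text.

import numpy as np

def orthographic_project(points):
    """Drop the z coordinate: parallel lines stay parallel, no depth cue."""
    return points[:, :2].copy()

def perspective_project(points, f=1.0):
    """Pinhole projection: scale x and y by f / z, so distant points shrink (depth cue)."""
    z = points[:, 2]
    return np.column_stack((f * points[:, 0] / z, f * points[:, 1] / z))

pts = np.array([[1.0, 1.0, 2.0],
                [1.0, 1.0, 4.0]])   # same (x, y), different depth
print(orthographic_project(pts))    # identical 2-D points
print(perspective_project(pts))     # the farther point projects closer to the origin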

Shaded Surface Rendering:


A. Image Acquisition
Principally volume data have to be acquired for surface rendered models. In radiological
imaging CT or MR imaging are suitable for 3D postprocessing. Three-dimensional
ultrasound is a recently introduced modality which is capable of creating 3D models.
Nevertheless, for surface rendering CT and MR imaging are the two main modalities. The
quality and accuracy of 3D models is always dependent on the resolution of the original 2D
acquisitions. To obtain optimal 3D models the dimensions of the voxels should be isotropic,
which means that the voxel data cube is of similar lengths in all three room axes. In CT
imaging there is a high resolution of image data within the axial slice. In spiral CT an entire
volume is acquired and the axial slices are computed with a certain slice thickness. Volume
acquisition is a prerequisite for 3D postprocessing.
Especially in the thorax and abdomen organ position is breath-dependent. The acquisition has
to be performed within one continuous breath-hold to avoid different positions of the scanned
organ. In spiral CT there is always a trade-off between volume coverage and resolution along
the body axis (Z-axis). The recently introduced multislice scanners can at least improve the
Z-axis resolution by covering the same volume with thinner slices.
With a fixed length which covers the organs of interest for 3D imaging the slice thickness
should always be as thin as possible to achieve near-isotropic voxels. For MR imaging
so-called 3D sequences are achieving near-isotropic voxel resolution. Resolution in all three
room axes should be as good as possible to avoid "stair step artifacts". A good contrast of a
surface on the desired object is extremely helpful for automated segmentation. Thus, for
example, bones in CT imaging are very easy to segment for surface rendering.
B. Image Enhancement and Filtering
After an image stack has been acquired it may be preprocessed to improve image quality
prior to 3D reconstruction. The preprocessing typically involves application of image filters
(mathematical algorithms implemented in software) to the entire data set to remove noise and
artifacts, smooth or sharpen the images, or to correct for problems with contrast or brightness.
Median and Gaussian filters have the general effect of smoothing images. These are used to
eliminate noise and background artifacts and to smooth sharp edges, but also tend to remove
some of the detail in small objects. Sharpening filters can be used to accentuate details in the
image stack, but also have the effect of highlighting noise and other small artifacts.
The application of sharpening filters is most useful when the image stack consists of fine
structural components or when edge enhancement is desired. It is important to realize that the
application of filters to the data set can ultimately affect quantitative measurements of 3D
reconstructions produced from it. Therefore in some instances filters are only used for display
purposes, and quantitative measurements are made on the unprocessed data.
C. Segmentation
To create a shaded surface 3D model the surface of the desired object has to be defined on
every 2D sectional image. The term segmentation refers to the process of extracting the
desired object of interest from the background in an image or data volume. There exist
multiple techniques that are used to do this. A simple but time-consuming approach is to
outline the desired structure on every image slice manually. But there are semiautomatic
methods ranging from the simple, such as thresholding and masking, to the complex such as
boundary detection, region growing and clustering algorithms. Moreover there are extremely
subtle and computationally challenging automated approaches such as the adaptive template
moderated (ATM) approach in combination with spatially varying statistical classification .
Thresholding is a commonly used segmentation method suitable for high-contrast structures.
Thresholding involves limiting the intensity values within an individual image or the entire
image stack to a certain bounded range. It may be decided that all pixels below a certain
value do not contribute significantly to the object of interest and hence can be eliminated.
This can be done by scanning the image one pixel at a time, and keeping that pixel if it is
above the selected intensity value or excluding it if it is below that value. In a similar manner,
thresholding can also be used to eliminate non-consecutive ranges of intensities.
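A minimal thresholding segmentation sketch following the description above: keep the voxels whose intensities fall inside a bounded range and discard the rest; the bounds shown in the comment are illustrative.

import numpy as np

def threshold_segment(volume, lower, upper):
    """Return a binary mask that is True where intensities fall inside [lower, upper]."""
    return (volume >= lower) & (volume <= upper)

# Example (illustrative bound): bone in CT is roughly above ~300 HU
# bone_mask = threshold_segment(ct_volume, 300, ct_volume.max())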
Volume Visualization in medical image:

Multiplanar Reformation

MPR is an image processing technique, which extracts two-dimensional (2D) slices from a
3D volume using arbitrarily positioned orthogonal or oblique planes.8 Although it is still a 2D
method, it has the advantages of ease of use, high speed, and no information loss. The
observer can display a structure of interest in any desired plane within the data set, and
four-dimensional (4D) MPR can be performed in real time using graphics hardware.9 In
addition, to accurately visualize tubular structures such as blood vessels, curved MPR may be
employed to sample a given artery along a predefined curved anatomic plane.10 Curved MPR
is an important visualization method for patients with bypass grafts and tortuous coronary
arteries.

Volume rendering:

DVR displays the entire 3D dataset as a 2D image, without computing any intermediate
geometry representations.27–29 This algorithm can be further divided into image-space DVR,
such as software-based30,31 and GPU-based raycasting,32 and object-space DVR, such as splatting,33,34 shell rendering,35 texture mapping (TM),36 and cell projection.37 Shear-warp38 can be considered as a combination of these two categories. In addition, MIP,39 minimum intensity projection (MinIP), and X-ray projection40 are also widely used methods for displaying 3D medical images. The discussion here focuses on DVR, and the datasets discussed are assumed to be represented on cubic and uniform
rectilinear grids, such as are provided by standard 3D medical imaging modalities.
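As an example of one of the simpler projection methods listed above, a maximum intensity projection (MIP) of a 3-D volume along one axis can be written in a single NumPy call; MinIP simply uses min instead of max.

import numpy as np

def mip(volume, axis=0):
    """Maximum intensity projection: collapse the volume along one axis by taking the maximum."""
    return volume.max(axis=axis)

def minip(volume, axis=0):
    """Minimum intensity projection."""
    return volume.min(axis=axis)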
Data compression is defined as the process of encoding data using a representation that reduces the overall size of the data. The reduction is possible when the original data set contains some type of redundancy, and compression can be achieved by eliminating these redundancies. Three basic data redundancies can be identified and reduced in digital images: coding redundancy, interpixel redundancy and psychovisual redundancy. Coding redundancy is eliminated by Huffman coding. Due to its decorrelation and energy compaction properties, the discrete cosine transform is used. The image compression algorithm takes a gray scale image as input. For lossless coding we have used Huffman coding, and for lossy coding we have used the discrete cosine transform (DCT). The method provides a high compression ratio (CR) for medical images with no loss of diagnostic quality. The algorithm can also be called a hybrid scheme because it combines a lossy and a lossless technique. We have used the two-dimensional discrete cosine transform for image compression. However, due to the nature of most images, maximum energy (information) lies in the low frequencies as opposed to the high frequencies. The low frequency component is called the DC coefficient and the high frequency components are called AC coefficients. We can represent the high frequency components coarsely, or drop them altogether, without strongly affecting the quality of the resulting image reconstruction. This leads to a lot of compression (lossy).

The input image is first resized (to 256 rows and 256 columns) and then padded if required. In this image reduction technique, block processing is done before the discrete cosine transform, which is applied to 8x8 pixel blocks of the image. Hence, if the input image is 256x256 pixels in size, we break it into 32 x 32 = 1024 square blocks of size 8x8 and treat each block independently.

Discrete Cosine Transform:

Transform Coding: In transform coding, a block of correlated pixels is transformed into a set of less
correlated coefficients. The transform to be used for data compression should satisfy two objectives.
Firstly, it should provide energy compaction: i.e. the energy in the transform coefficients should be
concentrated to as few coefficients as possible. This is referred to as the energy compaction property
of the transform. Secondly, it should minimize the statistical correlation between the transform
coefficients. As a consequence, transform coding has a good capability for data compression, because not all transform coefficients need to be transmitted in order to obtain good image quality, and even those that are transmitted need not be represented with full accuracy. In addition, the transform domain coefficients are generally related to the spatial frequencies
in the image and hence the compression techniques can exploit the psychovisual properties of the
HVS, by quantizing the higher frequency coefficients more coarsely, as the HVS is more sensitive to
the lower frequency coefficients [2]. The Discrete Cosine Transform: The discrete cosine transform
(DCT) represents an image as a sum of sinusoids of varying magnitudes and frequencies. In DCT for
image, most of the visually significant information about the image is concentrated in just a few
coefficients of the DCT[19]. For this reason, the DCT is often used in image compression applications.
For example, the DCT is at the heart of the international standard lossy image compression algorithm
known as JPEG. The important feature of the DCT is that it takes correlated input data and
concentrates its energy in just the first few transform coefficients. If the input data consists of
correlated quantities, then most of the n transform coefficients produced by the DCT are zeros or
small numbers, and only a few are large (normally the first ones). The early coefficients contain the
important (low-frequency) image information and the later coefficients contain the less-important
(high-frequency) image information. Compressing data with the DCT is therefore done by quantizing
the coefficients.
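A sketch of the 8x8 block DCT step described above, using SciPy's 1-D DCT applied along rows and columns; the crude quantization here (keeping only the top-left low-frequency coefficients of each block) is an illustrative stand-in for a real quantization table such as the one used in JPEG.

import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(block):
    return idct(idct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def block_dct_compress(img, keep=4):
    """Process 8x8 blocks: keep only the `keep` x `keep` low-frequency coefficients of each block."""
    h, w = img.shape                         # assumed multiples of 8 (e.g. a 256 x 256 image)
    out = np.zeros(img.shape, dtype=float)
    mask = np.zeros((8, 8))
    mask[:keep, :keep] = 1                   # low frequencies (DC + first AC terms) survive
    for i in range(0, h, 8):
        for j in range(0, w, 8):
            coeffs = dct2(img[i:i + 8, j:j + 8].astype(float))
            out[i:i + 8, j:j + 8] = idct2(coeffs * mask)
    return out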
In the medical imaging field, computer-aided detection (CADe) or computer-aided diagnosis (CADx) refers to computer-based systems that help doctors take decisions swiftly. Medical imaging deals with image information that medical practitioners and doctors have to evaluate and analyze for abnormalities in a short time. Analysis of imaging in the medical field is a very crucial task because imaging is the basic modality to diagnose any disease at the earliest stage, but image acquisition must not harm the human body. Imaging techniques like MRI, X-ray, endoscopy, ultrasound, etc., if acquired with high energy, will provide good quality images but will harm the human body; hence, images are taken at lower energy, and therefore the images will be of poorer quality and lower contrast. CAD systems are used to improve the quality of the image, which helps to interpret the medical images correctly and to process the images for highlighting the conspicuous parts.
CAD is a technology which includes multiple elements like concepts of artificial intelligence (AI), computer vision, and medical image processing. The main application of a CAD system is finding abnormalities in the human body. Among all these, detection of tumors is the typical application, because if a tumor is missed in basic screening, it can lead to cancer.

Objectives of CAD system:

The main goal of CAD systems is to identify, at the earliest stage, abnormal signs that a human professional might fail to find. Examples in mammography include the identification of small lumps in dense tissue, finding architectural distortion, and prediction of mass type as benign or malignant from its shape, size, etc.

Significance of the CAD system:


CADe is usually restricted to marking the visible parts or structures in an image, whereas CADx helps to evaluate the structures identified by CADe. Together, the CAD models are more effective at identifying abnormalities at the earliest stage. For example, CAD highlights microcalcification clusters, the marginal structure of masses, and highly dense tissue structures in mammography. This helps the radiologist to draw conclusions. Though CAD has been used for over 40 years, it still does not reach the expected outcomes. We agree that CAD cannot substitute for the doctor, but it definitely makes radiologists better decision makers. It plays a supporting and final interpretative role in medical diagnosis.

Applications of CAD system


CAD is used in the diagnosis of breast cancer, lung cancer, colon cancer, prostate cancer,
bone metastases, coronary artery disease, congenital heart defect, pathological brain
detection, Alzheimer’s disease, and diabetic retinopathy.
Discrete wavelet transform (DWT) has gained widespread acceptance in signal processing and image
compression. Because of their inherent multi-resolution nature, wavelet-coding schemes are
especially suitable for applications where scalability and tolerable degradation are important. The
performance of discrete wavelet transforms based coding depends on the wavelet decomposition
level and threshold value. The approximation sub-signal shows the general trend of pixel value and
three detail sub-signals shows the vertical, horizontal and diagonal details or change in the image. If
these details are very small then they can be set to zero without significantly changing the image.
The value below which details are considered small enough to set to zero is known as the threshold.
The greater the number of zeros, the greater the compression achieved. The amount of information
retained by an image after compression and decompression is known as the PSNR. If the PSNR is
100% then the compression is known as lossless as the image can be reconstructed exactly.

The Discrete Wavelet Transform (DWT), which is based on sub-band coding. In DWT, the signal to be
analyzed is passed through filters with different cutoff frequencies at different scales. Wavelets can
be realized by iteration of filters with rescaling. The resolution of the signal, which is the measure of
the amount of detail information in the signal, is determined by the filtering operations, and the
scale is determined by up-sampling and down-sampling. The DWT is computed by successive
low-pass and high-pass filtering of the discrete time-domain signal. Images are treated as two
dimensional signals, they change horizontally and vertically, thus 2D wavelet analysis must be used
for images .2D wavelet analysis uses the same ‘mother wavelets’ but requires an additional step at
each level of decomposition. In 2D, the images are considered to be matrices with N rows and M
columns. At every level of decomposition the horizontal data is filtered, and then the approximation
and details produced from this are filtered on columns. At every level, four sub-images are obtained;
the approximation, the vertical detail, the horizontal detail and the diagonal detail.

Thresholding: Once the DWT is performed, the next task is thresholding, which means neglecting certain wavelet coefficients at levels 1 to N. There are two types of threshold: (a) hard threshold and (b) soft threshold. By applying a hard threshold, the coefficients whose magnitude falls below the threshold level T are zeroed, so the output after a hard threshold is

y = x   if |x| ≥ T
y = 0   if |x| < T
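A sketch of DWT-based compression with hard thresholding using the PyWavelets package; the wavelet name, decomposition level, and threshold value are illustrative choices.

import numpy as np
import pywt

def dwt_hard_threshold_compress(img, wavelet='haar', level=2, thresh=10.0):
    """Decompose, zero the detail coefficients below `thresh` (hard threshold), and reconstruct."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    new_coeffs = [coeffs[0]]                       # keep the approximation sub-signal
    for (cH, cV, cD) in coeffs[1:]:                # horizontal, vertical, diagonal details
        new_coeffs.append(tuple(pywt.threshold(c, thresh, mode='hard') for c in (cH, cV, cD)))
    return pywt.waverec2(new_coeffs, wavelet)

The larger the threshold, the more detail coefficients become zero, and hence the greater the achievable compression (at the cost of PSNR).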
Feature Extraction uses an object-based approach to classify imagery, where an object (also
called segment) is a group of pixels with similar spectral, spatial, and/or texture attributes.
Traditional classification methods are pixel-based, meaning that spectral information in each
pixel is used to classify imagery. With high-resolution panchromatic or multispectral imagery,
an object-based method offers more flexibility in the types of features to extract.
The workflow involves the following steps:
● Dividing an image into segments
● Computing various attributes for the segments
● Creating several new classes
● Interactively assigning segments (called training samples) to each class
● Classifying the entire image with a K Nearest Neighbor (KNN), Support Vector
Machine (SVM), or Principal Components Analysis (PCA) supervised
classification method, based on your training samples.
● Exporting the classes to a shapefile or classification image.

● Gray level co-occurrence matrix


A GLCM uses the texture classification concept. The texture is characterized using a homogeneity value, which is calculated for every pixel present inside the image. After calculating the homogeneity values, a matrix of values is created. If there is a change in the homogeneity value of a particular pixel, then the GLCM value is calculated. In a brain X-ray, the tumor region differs from the rest of the gray mass: the gray mass has a different texture compared to the tumor texture. At that point, GLCM is a suitable approach. If there is a sharp change in the matrix values, there is a high chance that a tumor is present. GLCM is well suited to classifying pixel-by-pixel values. The gray level co-occurrence value is mostly used in X-ray analysis; an X-ray is a black and white film, and it contains different shades of gray.
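A minimal gray level co-occurrence matrix sketch for a single offset (one pixel to the right), written directly in NumPy; libraries such as scikit-image provide more general GLCM routines with multiple distances and angles. The number of gray levels is an illustrative parameter.

import numpy as np

def glcm(img, levels=8):
    """Co-occurrence probabilities of gray-level pairs (i, j) for horizontally adjacent pixels."""
    # quantize the image to `levels` gray levels
    q = (img.astype(float) / (img.max() + 1e-9) * (levels - 1)).astype(int)
    M = np.zeros((levels, levels), dtype=float)
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        M[a, b] += 1
    M /= M.sum()                                    # normalize counts to joint probabilities
    return M

def homogeneity(M):
    """Homogeneity texture feature derived from the GLCM."""
    i, j = np.indices(M.shape)
    return (M / (1.0 + (i - j) ** 2)).sum()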
What is ROC?
The Receiver Operating Characteristics Curve, aka the ROC Curve, is an
evaluation metric used for binary classification problems. ROC is a probability
curve that depicts the TPR (rate of true positives) on the y-axis against the
FPR (rate of false positives) on the x-axis. It essentially distinguishes the
‘signal’ from the ‘noise’, thereby highlighting the Sensitivity of the classifier
model.
The graph below demonstrates theoretically how the ROC curve would be
plotted for a classifier model. A perfect classifier will have a ROC where the
graph would hit a true positive rate of 100% with zero false positives. The grey
diagonal line represents a classifier with no predictive power – one that
guesses randomly.

What is AUC?
Area Under the Curve, aka the AUC, is another important evaluation metric
generally used for binary classification problems. AUC represents the degree
of separability. The higher the AUC, the better the classifier can distinguish
between the positive and negative classes.
The graph below demonstrates the area under the curve. AUC is used to
summarize the ROC Curve as it measures the entire 2D area present
underneath the ROC curve.
The value of AUC ranges from 0 to 1. If AUC=1, the model can distinguish
between the Positive and the Negative class points correctly.
Similarly, a model with 100% false predictions, i.e., predicting all Negatives as
Positives, and all Positives as Negatives, would have AUC=0.
When an AUC=0.5, it means the model possesses no class separation capacity
and cannot distinguish between the Positive and Negative class points.
The Sensitivity and Specificity of a classifier are inversely proportional to each
other. So, when Sensitivity increases, Specificity decreases, and vice versa.
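A sketch of computing the ROC curve and AUC for a binary classifier with scikit-learn; the labels and scores below are made-up illustrative values.

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1])                 # ground-truth labels
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9])   # classifier scores for the positive class

fpr, tpr, thresholds = roc_curve(y_true, y_score)     # points of the ROC curve (FPR vs TPR)
auc = roc_auc_score(y_true, y_score)                  # area under that curve
print(fpr, tpr, auc)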
Confusion matrix:
The confusion matrix is a performance measurement for machine learning classification problems where the output can be two or more classes. Performance measurement is an essential task in machine learning, so when it comes to a classification problem, we can count on the confusion matrix and the AUC-ROC curve.

● True Positive: Actual Positive and Predicted as Positive


● True Negative: Actual Negative and Predicted as Negative
● False Positive(Type I Error): Actual Negative but predicted as Positive
● False Negative(Type II Error): Actual Positive but predicted as Negative.
● The AUC-ROC curve is one of the most important evaluation metrics for checking a model's performance. AUC means 'Area Under The Curve' and ROC means 'Receiver Operating Characteristics' curve.
● The higher the AUC, the better the model. If the AUC value is nearer to 1, the model has high accuracy.
Medical Image Preprocessing
Image preprocessing prepares data for a target workflow. The main goals of medical image
preprocessing are to reduce image acquisition artifacts and to standardize images across a
data set. Your exact preprocessing requirements depend on the modality and procedure used
to acquire data, as well as your target workflow. Some common preprocessing steps include
background removal, denoising, resampling, registration, and intensity normalization.
Background Removal
Background removal involves segmenting the region of interest from the image background. By
limiting the image to the region of interest, you can improve the efficiency and accuracy of your target
workflow. One example of background removal is skull stripping, which removes the skull and other
background regions from MRI images of the brain. Background removal typically consists of applying
a mask of the region of interest that you create using morphological operations or other segmentation
techniques.
To perform background removal, multiply the mask image and the original image. For example,
consider a grayscale image, im, and a mask image, mask, that is the same size as im and has a value
of 1 for every element in the region of interest and 0 for each element of the background. Multiplying the two elementwise returns a new image, imROI, in which the elements in the region of interest have the same values as in im, and the background elements all have values of 0.
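A sketch of that masking operation in Python/NumPy, mirroring the im / mask / imROI names used above.

import numpy as np

def apply_roi_mask(im, mask):
    """Zero out the background: keep im where mask == 1, set everything else to 0."""
    return im * mask

# im:   grayscale image array
# mask: array of the same shape, 1 inside the region of interest and 0 outside
# imROI = apply_roi_mask(im, mask)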

Denoising
Medical imaging modalities are susceptible to noise, which introduces random intensity fluctuations
in an image. To reduce noise, you can filter images in the spatial and frequency domains.

Resampling
Use resampling to change the pixel or voxel size of an image without changing its spatial limits in the
patient coordinate system. Resampling is useful for standardizing image resolution across a data set
that contains images from multiple scanners.

Intensity Normalization
Intensity normalization standardizes the range of image intensity values across a data set. Typically,
you perform this process in two steps. First, clip intensities to a smaller range. Second, normalize the
clipped intensity range to the range of the image data type, such as [0, 1] for double or [0, 255]
for uint8. Whereas visualizing image data using a display window changes how you view the data,
intensity normalization actually updates the image values.
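A sketch of the two-step intensity normalization described above: clip to a smaller range, then rescale the result to [0, 1]; the clipping bounds are illustrative.

import numpy as np

def normalize_intensity(img, lower, upper):
    """Clip intensities to [lower, upper], then rescale the clipped range to [0, 1]."""
    clipped = np.clip(img.astype(float), lower, upper)
    return (clipped - lower) / (upper - lower)

# e.g. for CT one might clip to an illustrative soft-tissue window such as [-200, 300] HU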
Preprocessing of retinal images:

Segmentation of liver:
Segmentation of ROI: Lung Nodules
Segmentation of ROI blood vessels:

Tumour detection:
Segmentation of lesions:
Kidney ultrasound image processing:
Unit 1: Question Bank
Part A
1. Define 4 neighbours of a pixel.
N4 (p) : 4-neighbors of p.
• Any pixel p(x, y) has two vertical and two horizontal neighbors, given by (x+1,y),
(x-1, y), (x, y+1), (x, y-1)
• This set of pixels are called the 4-neighbors of P, and is denoted by N4 (P)
• Each of them is at a unit distance from P.
2. Define a path.
A digital path (or curve) from pixel p with coordinate (x,y) to pixel q with coordinate
(s,t) is a sequence of distinct pixels with coordinates (x0 , y0 ), (x1 , y1 ), ..., (xn , yn
), where (x0 , y0 )= (x,y), (xn , yn )= (s,t).
3. Name the elements of visual perception.
Structure of the eye, image formation in the eye, brightness adaptation and
discrimination.
4. Define a pixel.
Pixel is the smallest unit of a digital graphic which can be illuminated on a display
screen and a set of such illuminated pixels form an image on screen. A pixel is usually
represented as a square or a dot on any display screen like a mobile, TV, or computer
monitor.
5. Define a voxel.
In 3D printing, we can define a voxel as a value on a grid in a three-dimensional
space, like a pixel with volume. Each voxel contains certain volumetric information
which helps to create a three dimensional object with required properties.
6. What is image acquisition?
The image is captured by a camera and digitized (if the camera output is not
digitized automatically) using an analogue-to-digital converter for further processing in a
computer.
7. State the significance of wavelets in image processing
Wavelets are the building blocks for representing images in various degrees of
resolution. Images subdivision successively into smaller regions for data compression and
for pyramidal representation.
8. Define hue.

The Hue component describes the color itself in the form of an angle between [0, 360] degrees: 0 degrees means red, 120 means green, and 240 means blue; 60 degrees is yellow and 300 degrees is magenta.

9. Define Intensity.
An image is defined as a two-dimensional function f(x, y) the amplitude of f at any
pair of coordinates (x, y) is called the intensity or gray level of the image at that point.
10. What is the meaning of brightness and contrast in image processing?
Brightness increases the overall lightness of the image—for example, making dark colors
lighter and light colors whiter—while contrast adjusts the difference between the darkest
and lightest colors.

Part B
1. Elaborate on the components of image processing system.
2. Elucidate on the elements of visual perception.
3. Write a detailed note on image quality with respect to signal to noise ratio.
4. Give an account of arithmetic and logical operations done on pixels.
5. Explain in detail about the relationship between pixels.
6. Provide an account of the purpose and use of DFT and DCT on images.

Part C
1. Give an account of discrete sampling and quantization.
2. Provide a brief account of KLT on images.
Unit 2: Question Bank
Part A
1. What is the application of homomorphic filtering?
a.Homomorphic filter is used for image enhancement.
b.It simultaneously normalizes the brightness across an image and increases contrast.
c.It is also used to remove multiplicative noise.
2. What is spatial filtering?
Spatial filtering is a process by which we can alter properties of an optical image by
selectively removing certain spatial frequencies that make up an object, for example, filtering
video data received from satellite and space probes, or removal of raster from a television
picture or scanned image.
3.Define a histogram.
An image histogram is a gray-scale value distribution showing the frequency of
occurrence of each gray-level value. For an image size of 1024 × 1024 × 8 bits, the
abscissa ranges from 0 to 255; the total number of pixels is equal to 1024 × 1024.
4. Differentiate between smoothing and sharpening filters.
While linear smoothing is based on the weighted summation or integral operation on the
neighborhood, the sharpening is based on the derivative (gradient) or finite difference.
5. Define image enhancement.
Image enhancement is the process of digitally manipulating a stored image using
software. The tools used for image enhancement include many different kinds of software
such as filters, image editors and other tools for changing various properties of an entire
image or parts of an image.
6. What are hybrid filters?
The hybrid filter takes advantage of the image decomposition and reconstruction
processes of the MMWT where reconstruction of specific subimages is used to selectively
enhance the masses and separate the background structures.
7. What is frequency domain filtering?
Filtering in the frequency domain consists of modifying the Fourier transform of an
image and then computing the inverse transform to obtain the processed result.
8. What is the significance of Fourier Transform in image processing?
In Image processing, the Fourier Transform tells you what is happening in the image in
terms of the frequencies of those sinusoidal. For example, eliminating high frequencies
blurs the image. Eliminating low frequencies gives you edges.
9. Define power law transformation.
Power-law transformation enables both logarithmic-like and exponential-like transformations simply by choosing the value of the power. If the power is less than one, it behaves like a logarithmic transformation. If the power is more than one, it behaves like an exponential transformation.
10. Define piecewise linear transformation.
Piece-wise Linear Transformation is type of gray level transformation that is used for
image enhancement. It is a spatial domain method. It is used for manipulation of an image
so that the result is more suitable than the original for a specific application.
Part B
1. Give an in detailed account of smoothing and sharpening filters.
2. Elucidate on homomorphic filtering.
3. Elucidate on medical image enhancement using hybrid filters.
4. Elaborate on frequency domain filtering.
5. With an example, explain power law transformation.
6. Explain in detail piecewise linear transformation.
Part C
1. With an example elaborate on histogram equalization technique.
2. With an example, explain in detail histogram matching technique.
Unit 3: Question Bank
Part A
1. Define region of interest.
A region of interest (ROI) is a portion of an image that you want to filter or operate on
in some way. You can represent an ROI as a binary mask image. In the mask image,
pixels that belong to the ROI are set to 1 and pixels outside the ROI are set to 0 .
2. What are active contours?
Active contour is a segmentation method that uses energy forces and constraints to
separate the pixels of interest from a picture for further processing and analysis.
3. What are the three methods of estimating degradation function?
● Observation
● Experimentation
● Mathematical modelling
4. What is image restoration?
Image restoration is the process of recovering an image from a degraded
version—usually a blurred and noisy image. Image restoration is a fundamental
problem in image processing, and it also provides a testbed for more general inverse
problems.
5. Differentiate between image enhancement and restoration.
Enhancement aims to improve the visual appearance of an image without changing its content, while restoration aims to recover the original content of an image that has been degraded or damaged.
6. How is periodic noise removed?
Periodic noise can be reduced significantly via frequency domain filtering. On this
page we use a notch reject filter with an appropriate radius to completely enclose the
noise spikes in the Fourier domain. The notch filter rejects frequencies in predefined
neighborhoods around a center frequency.
7. Define image segmentation.
Image segmentation is a commonly used technique in digital image processing and
analysis to partition an image into multiple parts or regions, often based on the
characteristics of the pixels in the image.
8. What is the purpose of noise models in image processing?
Noise is always present in digital images during the image acquisition, coding, transmission, and processing steps. Noise is very difficult to remove from digital images without prior knowledge of the noise model. That is why a review of noise models is essential in the study of image denoising techniques.
9. Define inverse filtering.
Inverse filtering is a technique used in signal processing and image processing to
recover an original signal or image from a degraded or distorted version of it. It's
based on the idea of reversing the effects of a known filter or degradation process.
10. Define weiner filtering.
The Wiener filter can be used to filter out the noise from the corrupted signal to
provide an estimate of the underlying signal of interest.
Part B
1. Explain in detail noise models and image restoration.
2. How is periodic noise removed?
3. Elaborate on invariant degradation.
4. Explain in detail inverse filtering.
5. Explain in detail weiner filtering.

Part C
1. Explain edge linking and boundary detection.
2. Explain in detail various active contour models.
UNIT 4: Question bank
Part A
1. Define image registration.
Image registration is the process of transforming different sets of data into a single
unified coordinate system, and can be thought of as aligning images so that
comparable characteristics can be related easily. It involves mapping points from one
image to corresponding points in another image.

2. What are the two types of image registration?


Image registration or image alignment algorithms can be classified
into intensity-based and feature-based.

3. How do you represent principal axis?

Principal axes for a body. For the cylinder the principal axes are represented by the
frame O2 positioned at the mass centre of the body. In this case each of the principal planes is
a plane of symmetry for the body.

4. Explain the significance of surface based rendering.

Surface rendering represents a visualization technique which is well established for


three-dimensional imaging of sectional image data. This chapter describes image acquisition
and data preparation for surface shaded rendering.

5. How is volume visualization of the image is done?


The task is to display volumetric data as a meaningful two-dimensional image which reveals insights to the user. In contrast to conventional computer graphics, where one has to deal with surfaces, volume visualization takes structured or unstructured 3D data, which is then rendered into a two-dimensional image.

6. Explain the significance of volume based rendering.


In scientific visualization and computer graphics, volume rendering is a set of techniques
used to display a 2D projection of a 3D discretely sampled data set, typically a 3D scalar
field. A typical 3D data set is a group of 2D slice images acquired by a CT, MRI,
or MicroCT scanner. Usually these are acquired in a regular pattern (e.g., one slice for each
millimeter of depth) and usually have a regular number of image pixels in a regular pattern.

7. What is a rigid body in image processing?


a rigid body transform is a mapping from this set to another subset of the Euclidean
space, such that the Euclidean distances between points are preserved. Any such
mapping can be represented as a composition of one translation and one rotation.
8. Define orthographic projection.
Orthographic projection (also orthogonal projection and analemma) is a means of representing three-dimensional objects in two dimensions. Orthographic projection is a form of parallel projection in which all the projection lines are orthogonal to the projection plane, resulting in every plane of the scene appearing in affine transformation on the viewing surface.
9. Define Euclidean distance.
The Euclidean distance is the straight-line distance between two pixels.

10. Define City Block.


The city block distance metric measures the path between the pixels based on a
4-connected neighborhood. Pixels whose edges touch are 1 unit apart; pixels
diagonally touching are 2 units apart.

Part B
1. Explain in detail rigid body transformation of an image.
2. Explain in detail principal axes registration.
3. Explain in detail volume visualization of images.

Part C
1. Explain in detail feature based visualization of images.
2. Explain orthogonal and perspective projection in medicine.
UNIT 5: Question bank
Part A
1. What is the need of image compression?
The objective of image compression is to reduce irrelevance and redundancy of the
image data to be able to store or transmit data in an efficient form. It is concerned
with minimizing the number of bits required to represent an image. Image
compression may be lossy or lossless.
2. Differentiate between lossy and lossless compression.
Lossy reduces file size by permanently removing some of the original data. Lossless
reduces file size by removing unnecessary metadata.
3. What is wavelet transform based image compression?
Wavelets allow one to compress the image using less storage space with more details of
the image.
4. What are the basic preprocessing steps in image processing?
Read image.
Resize image.
Remove noise(Denoise)
Segmentation.
Morphology(smoothing edges)

5. What is the significance of segmentation of region of interest?


The Region of Interest (ROI) in image segmentation is a specific part of an image
selected for further analysis. This region typically contains the objects or features of
interest. By focusing only on the ROI, the efficiency of the image processing task can be
significantly increased.
6. State the significance of preprocessing of retinal images.
Automatic segmentation and analysis of retinal images can be used to detect
pathological risk or damage, and to assist in diagnosis.
7. State the significance of preprocessing of mammographic images.
Mammography is the basic screening test for breast cancer. Mammograms contain many artefacts, which negatively influence the detection of breast cancer. Therefore, removing artefacts and enhancing the image quality is a required process in a Computer Aided Diagnosis (CAD) system.
8. Give the uses of segmentation of region of interest in blood vessels.
Blood vessel segmentation is a topic of high interest in medical image analysis since the
analysis of vessels is crucial for diagnosis, treatment planning and execution, and
evaluation of clinical outcomes in different fields,
including laryngology, neurosurgery and ophthalmology. Automatic or semi-automatic
vessel segmentation can support clinicians in performing these tasks.
9. Give the significance of ultrasound image analysis of liver and kidney.
The method is highly useful in detecting stones in the kidney and liver, detecting fatty liver, enlargement of the kidney, etc.
10. Why is feature extraction essential in image processing?
Feature extraction plays an important role in image processing. This technique is
used to detect features in digital images such as edges, shapes, or motion. Once these
are identified, the data can be processed to perform various tasks related to analyzing
an image.

Part B
1. Briefly explain the concept of mammogram in image processing.
2. How is image preprocessing done?
3. Describe the salient features of ROI of blood vessels, tumour.

Part C
1. Explain in detail about computer aided diagnosis system.
2. Explain in detail about feature extraction in medical images.
