
UNIT I

FUNDAMENTALS OF MEDICAL IMAGE PROCESSING AND TRANSFORMS

Overview of Image Processing system and human Visual system - Image representation - pixels
and voxels, Gray scale and color models - Medical image file formats - DICOM, ANALYZE 7.5, NIFTI
and INTERFILE - Discrete sampling model and Quantization - Relationship between the pixels,
Arithmetic and logical operations - Image quality and Signal to Noise ratio - Image Transforms - 2D DFT,
DCT, KLT.

INTRODUCTION:
An image contains descriptive information about the object it represents. An image is defined as
a two-dimensional function, f(x, y) that carries some information, where x and y are known as
spatial or plane coordinates. The amplitude of ‘f’ at any pair of coordinates (x, y) is called the
intensity or gray level of the image at that point.
PIXELS:
Pixels are the small individual elements of a digital image, also known as picture elements, image
elements, or pels. Each pixel has a particular location and a brightness or intensity value. A finite
number of pixels forms a digital image.
IMAGE PROCESSING:
Image processing is defined as the process of analyzing and manipulating images using a
computer.
ANALOG IMAGE PROCESSING:
Any image processing task which is conducted on two - dimensional analog signals by analog
means is known as analog image processing.
DIGITAL IMAGE PROCESSING:
Using computer algorithms to perform image processing on digital images is referred as digital
image processing i.e. processing digital images by means of a digital computer.
MEDICAL IMAGE PROCESSING:
MIP mainly includes medical image segmentation, registration, structural analysis, 3D
reconstruction and motion analysis. Since these research directions all rely on accurate image
segmentation, segmentation is the cornerstone of medical image processing.
Advantages of DIP:
It allows a wide range of algorithms to be applied to the input data.
It is far less susceptible to noise build-up and signal distortion than analog processing.
Fundamental Steps
The fundamental steps in digital image processing are

 Image acquisition
 Image enhancement
 Image restoration
 Image compression
 Image segmentation
 Image representation and description.
Applications - Digital image processing is mainly applied in the following fields,
 Gamma-Ray Imaging
 X-ray Imaging
 Imaging in the Ultra-Violet (UV) Band
 Imaging in the Visible and Infrared (IR) Band
 Imaging in the Microwave Band
 Imaging in the Radio Band
 Ultrasound imaging

1. OVERVIEW OF IMAGE PROCESSING SYSTEM (COMPONENTS / ELEMENTS OF AN IMAGE PROCESSING SYSTEM):

Block Diagram

Elements of Digital Image Processing System

a) Image sensors:

Image sensing or image acquisition is used to acquire digital images. It
requires two elements: a physical device that is sensitive to the energy radiated
by the object to be imaged (for example, a digital video camera), and a
digitizer to convert the output of the physical sensing device into digital form.

b) Specialized Image Processing Hardware:


This hardware consists of the digitizer and some hardware to perform other basic
operations. Example: an Arithmetic Logic Unit (ALU), which performs
arithmetic and logical operations on entire images in parallel. This type of
hardware is also known as a front-end subsystem. Its main feature is high
speed, so functions that cannot be performed fast enough by the main computer
are handled by this unit.

c) Computer:
The computer used in an image processing system is a general - purpose
computer. It can range from a personal computer (PC) to a supercomputer. In
some dedicated applications, specially designed computers are used to achieve
the required performance.
d) Image processing software
The software for image processing has specialized modules which perform specific
tasks. Some software packages have the facility for the user to write code using
the specialized modules.
e) Mass storage
Since images require a large amount of storage space, mass storage capability is
very important in image processing applications. Example: a 1024 x 1024 image
with each pixel represented in 8 bits requires 1 megabyte of storage space,
without compression.
Measurement: Storage is measured in the following units:
 Bytes = 8 bits
 K bytes (Kilobytes) = One thousand bytes
 M bytes (Megabytes) = One million bytes
 G bytes (Gigabytes) = One billion bytes
 T bytes (Terabytes) = One trillion bytes
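The one-megabyte example above can be checked with a short calculation. A minimal Python sketch (the function name and the 1024 x 1024 example are illustrative only):

def storage_bytes(rows, cols, bits_per_pixel=8):
    # uncompressed size in bytes: each pixel needs bits_per_pixel / 8 bytes
    return rows * cols * bits_per_pixel // 8

print(storage_bytes(1024, 1024, 8))   # 1048576 bytes = 1 megabyte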
f) Image displays:
Commonly used displays are color TV monitors. These monitors are driven by the
output of “image and graphics display cards” which are a part of the computer
system. If stereo displays are needed in some applications, a headgear containing
two small displays is used.

g) Hardcopy devices
Hardcopy devices are used for recording images. These devices include,
 Laser printers
 Film cameras
 Heat - sensitive devices
 Inkjet units
 Digital units like optical and CD – ROM disks etc.
Although camera film provides the highest resolution, paper is the preferred medium for written
material.

h) Network:
Networking is a default function in all computer systems today. Since image
processing applications involve large amounts of data, the main consideration here is
bandwidth. Communication with remote sites is done through the Internet, which uses
optical fiber and other broadband technologies.

2. ELEMENTS OF VISUAL PERCEPTION:

Vision is the most advanced human sense. So, images play the most important role in
human perception. Human visual perception is very important because the selection of image
processing techniques is based only on visual judgements.
Structure of the Human Eye

Human Eye – Cross Section

The human eye is nearly spherical, with an average diameter of approximately
20 mm. The eye, also called the optical globe, is enclosed by three membranes:
(1) The Cornea and Sclera outer cover
(2) The Choroid and

(3) The Retina


(1) The Cornea and Sclera outer cover
The cornea is a tough, transparent tissue that covers the anterior i.e. front surface of the
eye. The sclera is an opaque (i.e. not transparent) membrane that is continuous with the
cornea and encloses the remaining portion of the eye.

(2) The Choroid


The choroid is located directly below the sclera. It has a network of blood vessels which
are the major nutrition source to the eye. Slight injury to the choroid can lead to severe eye
damage as it causes restriction of blood flow. The outer cover of the choroid is heavily
pigmented i.e. colored.
This reduces the amount of light entering the eye from outside and backscatter within
the optical globe. The choroid is divided into two at its anterior extreme as,
The Ciliary Body and
The Iris Diaphragm
Iris Diaphragm:
- It contracts and expands to control the amount of light that enters the eye.
- The central opening of the iris is known as the pupil, whose diameter varies from
2 to 8 mm.
- The front of the iris contains the visible pigment of the eye and the back has a black
pigment.
Lens:
- The lens is made up of many layers of fibrous cells.
- It is suspended (i.e. hung up) by fibers attached to the ciliary body.
- It contains 60% to 70% water, about 6% fat, and more protein than any other tissue in the eye.
Cataracts:
- The lens is colored by a slightly yellow pigmentation. This coloring increases with age,
which can lead to clouding of the lens.
- Excessive clouding of the lens, which occurs in extreme cases, is known as “cataracts”.
- This leads to poor color discrimination (i.e. differentiation) and loss of clear vision.

(3) The Retina


The retina is the innermost membrane of the eye.
It lines the inside of the wall’s entire posterior (i.e. back) portion.
Fovea:
The central portion of the retina is called the fovea.
It is a circular indentation with a diameter of 1.5mm.
Light receptors:
When the eye is properly focused, light from an object outside the eye is imaged
on the retina. Light receptors provide this “pattern vision” to the eye. These receptors are
distributed over the surface of the retina. There are two classes of discrete light receptors,
known as
Cones:
In each eye, 6 to 7 million cones are present. They are highly sensitive to color and
are located mainly in the fovea. Each cone is connected to its own nerve end. Therefore,
humans can resolve fine details with the use of cones. Cone vision is called photopic or
bright-light vision.
Rods
The number of rods in each eye ranges from 75 to 150 million. They are
sensitive to low levels of illumination (lighting) and are not involved in color vision.
Several rods are connected to a single, common nerve end, so the amount of
detail they can resolve is smaller. Therefore, rods provide only a general, overall picture of the
field of view.
3. IMAGE REPRESENTATION:
Representation:

Representation is the process of characterizing the quantity represented by each pixel. This is
done after the segmentation of an image. Because, the output of image segmentation is a group
of pixels. Therefore, to change this raw pixel data into a form suitable for further computer
processing, representation is used.

Types:

Boundary (or) External Representation


 It represents a region in terms of its external characteristics i.e. its boundary.
 It is selected when importance is given to shape characteristics.
Example: Corners, inflections.
Regional (or) Internal Representation
 It represents a region in terms of its internal characteristics.
 It is chosen when the main focus is on regional properties.
Example: texture.
In some cases, both types of representation may be used.

Description
Representation alone does not make the data useful to a computer. The next task,
description of the region based on the chosen representation, is also needed to complete the
process.

Objective:
The objective of the description process is to ‘capture’ the essential differences between
objects or classes of objects while maintaining as much independence as possible to changes
in factors such as location, size and orientation.
Features:
Some of the features used to describe the boundary of a region are:
 Length
 The orientation of the straight line joining the extreme points
 The number of concavities i.e. curves in the boundary etc.
Whatever the type of representation, the features selected for description should be
insensitive to the following variations.

 Size
 Translation and
 Rotation
4. RELATIONSHIPS BETWEEN PIXELS
The relationships between the pixels in an image should be known clearly to
understand the image processing techniques. Those relationships for an image f(x, y) are
explained below:

Neighbors of a Pixel

A pixel, p can have three types of neighbors known as,

(1) 4 – Neighbors, N4(p)


(2) Diagonal Neighbors, ND(p)

(3) 8 – Neighbors, N8(p)

(1) 4 – neighbors, N4(p):


The 4-neighbors of a pixel ‘p’ at coordinates (x, y) includes two horizontal and two vertical
neighbors. The coordinates of these neighbors are given by,

(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)

        (x - 1, y - 1)   (x - 1, y)   (x - 1, y + 1)
        (x, y - 1)       (x, y)       (x, y + 1)
        (x + 1, y - 1)   (x + 1, y)   (x + 1, y + 1)

Fig. 1.24 A Region of an Image Centered at (x, y)

Here, each pixel is at unit distance from (x, y) as shown in fig. 1.24. If (x, y) is on the border
of the image, some of the neighbors of p lie outside the digital image.

(2) Diagonal Neighbors, ND(p):

The coordinates of the four diagonal neighbors of ‘p’ are given by

(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)

Here also, some of the neighbors lie outside the image if (x, y) is on the border of the
image.

(3) 8 – Neighbors, N8(p):


The diagonal neighbors together with the 4-neighbors are called the 8-neighbors of the pixel ‘p’.
It is denoted by N8(p).
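These neighborhood definitions translate directly into code. A minimal Python sketch (the function names are illustrative; neighbors that fall outside the image border are not filtered out here, matching the note above):

def n4(x, y):
    # 4-neighbors: two horizontal and two vertical neighbors of p at (x, y)
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    # diagonal neighbors of p
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    # 8-neighbors: union of the 4-neighbors and the diagonal neighbors
    return n4(x, y) + nd(x, y)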

Adjacency
Let V be the set of gray-level values used to define adjacency (for example, V = {1} for
adjacency of pixels with value 1 in a binary image). There are three types of adjacency.

(1) 4 – Adjacency

(2) 8 – Adjacency

(3) m – Adjacency (or) Mixed Adjacency

4 – Adjacency:

Two pixels p and q with values from {V} in an image are 4 – adjacent if q is in the set N4(p).

8 – Adjacency:

Two pixels p and q with values from {V} in an image are 8 – adjacent if q is in the set N8(p).
In some pixel arrangements, a pixel can be 8-adjacent to the center pixel along more than one
path; this multiple 8-adjacency creates an ambiguity (i.e. confusion). The ambiguity is removed by using m-
adjacency.

m-adjacency (mixed adjacency):

Two pixels p and q with values from V (or of specified similarity) are m-adjacent if either
i. q is in N4(p), or
ii. q is in ND(p) and p and q have no common 4-neighbors whose values are from V.
Conditions (i) and (ii) cannot both hold for the same pair of pixels.

Path
A path is also known as a digital path or curve. A path from pixel p with coordinates (x, y) to
pixel q with coordinates (s, t) is defined as a sequence of distinct pixels with coordinates

(x0, y0), (x1, y1), ..., (xn, yn)

where (x0, y0) = (x, y), (xn, yn) = (s, t), and pixels (xi, yi) and (xi-1, yi-1) are adjacent for 1 ≤ i ≤ n.

Path Length:
Path length is the number of steps between the pixels of a path; it is given by the value of ‘n’
above.

Closed Path:

In a path, if (x0, y0) = (xn, yn) i.e. the first and last pixel are the same, it is known as a
‘closed path’.

Types:

According to the adjacency present, paths can be classified as:

(1) 4 – path

(2) 8 – path and

(3) m – path

Connectivity
Connectivity between pixels is a fundamental concept of digital image processing.

Connected Component:
For any pixel p in S, the set of pixels in S that are connected to p is called a “connected
component” of S.

Connected Set:
If S has only one connected component, then S is called a ‘connected set’.
Region
Let ‘R’ represent a subset of pixels in an image.

If R is a connected set, it is called a ‘region’ of the image.

Boundary
Boundary is also known as border or contour (contour = outline). The boundary of a region
R is defined as the set of pixels in the region that have one or more neighbors which are not in
the same region R. If R is an entire image, its boundary is defined as the set of pixels in the
first and last rows and columns of the image. The boundary of a finite region forms a closed
path. Therefore, it is a “global” concept.

Edge
Edges are formed by pixels whose derivative values are higher than a preset threshold.
Thus, edges correspond to gray-level or intensity discontinuities. Therefore, an edge is a
“local” concept.
Distance Measures
Various distance measures are used to determine the distance between different pixels.
Conditions:
Consider three pixels, p, q and z, p has coordinates (x, y), q has coordinates (s, t) and z has
coordinates (v, w). For these three pixels D is a distance function or metric if

(a) D(p, q) ≥ 0, [D(p, q) = 0 if p = q]

(b) D(p, q) = D(q, p) and

(c) D(p, z) ≤ D(p, q) + D(q, z)


Types:

Some of the important distance measures are,

(1) Euclidean distance

(2) City - Block (or) D4 distance

(3) Chessboard (or) D8 distance

(4) Dm distance

(1) Euclidean Distance, De (p,q):

The Euclidean distance between two pixels p and q is defined as

De(p, q) = [(x - s)^2 + (y - t)^2]^(1/2)

(2) City - Block (or) D4 distance

The city-block distance between two pixels p and q is defined as,

D4 (p, q) = |x – s| + |y – t|

(3) Chessboard (or) D8 distance


The chessboard distance between two pixels p and q is defined as,
D8 p, q max  x  s , y  t 
(4) Dm distance
Dm distance between two points is defined as the shortest m-path between the points. It considers
m-adjacency.
Here, the distance between two pixels depends on,
 The values of the pixels along the path and
 The values of their neighbors
But, D4 and D8 distances depend only on the coordinates of the points.
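The three metrics can be written directly from their definitions. A minimal Python sketch for pixels p = (x, y) and q = (s, t) (the function names are illustrative):

import math

def d_euclidean(p, q):
    # De(p, q) = sqrt((x - s)^2 + (y - t)^2)
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

def d4(p, q):
    # city-block distance: |x - s| + |y - t|
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    # chessboard distance: max(|x - s|, |y - t|)
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))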

IMAGE SAMPLING AND QUANTIZATION:

To create a digital image, we need to convert the continuous sensed data into digital form. This involves two
processes - sampling and quantization. An image may be continuous with respect to the x and y coordinates and
also in amplitude. To convert it into digital form, we have to sample the function in both coordinates and in
amplitude.

Digitizing the coordinate values is called sampling.

Digitizing the amplitude values is called quantization.
Consider the continuous intensity of the image along a line segment AB. To sample this function, we take equally spaced
samples along line AB. The location of each sample is marked by a vertical tick mark in the bottom part of the figure. The
samples are shown as small squares superimposed on the function, and the set of these discrete locations gives the sampled
function.

Sampling and Quantization

In order to form a digital image, the gray-level values must also be converted (quantized) into discrete quantities.
Suppose we divide the gray-level scale into eight discrete levels ranging from black to white. The vertical tick marks
indicate the specific value assigned to each of the eight levels.
The continuous gray levels are quantized simply by assigning one of the eight discrete gray levels to each
sample. The assignment is made depending on the vertical proximity of a sample to a vertical tick mark.
Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital
image.
If a signal is sampled at more than twice its highest frequency component, then it can be reconstructed exactly
from its samples.
But if it is sampled at less than that rate (called undersampling), then aliasing will result.
This causes frequencies to appear in the sampled signal that were not in the original signal.
Note that subsampling of a digital image will cause undersampling if the subsampling rate is less than twice
the maximum frequency in the digital image.
Aliasing can be prevented if a signal is filtered to eliminate high frequencies, so that its highest frequency
component is less than half the sampling rate.
Gating function: exists for all space (or time) and has value zero everywhere except for a finite range of
space/time. It is often used for theoretical analysis of signals. However, because a gating signal is strictly limited in
space/time, it contains unbounded frequencies.
A periodic signal, x(t) = x(t+T) for all t where T is the period, whose spectrum contains only a finite number of
harmonics has a finite maximum frequency component, and is therefore a bandlimited signal.
Sampling at a higher sampling rate (usually twice or more) than necessary to prevent aliasing is called
oversampling.
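As an illustration of quantization and subsampling, here is a minimal Python/NumPy sketch, assuming the input is an 8-bit grayscale image stored as a NumPy array (the function names and the choice of eight levels are illustrative):

import numpy as np

def quantize(image, levels=8):
    # map the 256 input gray levels onto 'levels' equally spaced output levels
    step = 256 // levels
    return ((image // step) * step).astype(np.uint8)   # e.g. 0, 32, 64, ..., 224

def subsample(image, k=2):
    # keep every k-th pixel in each direction, reducing the spatial sampling rate
    return image[::k, ::k]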

VOXELS AND PIXELS:

A pixel is usually represented as a square or a dot on a display screen such as a mobile phone, TV, or computer
monitor. A pixel is the smallest unit of a digital graphic that can be illuminated on a display
screen, and a set of such illuminated pixels forms an image on the screen. Pixels can be regarded as the
building blocks of a digital image and can be controlled to accurately show the desired picture.
The quality of a picture, concerning its clarity, size and colour combination, is largely
controlled by the number and density of pixels present in the display. The higher the resolution,
the smaller the pixels and the better the clarity, and vice versa. Each pixel has a
unique geometric coordinate, dimensions (length and breadth), size (eight bits or more),
and the ability to display a multitude of colours.

VOXELS:
Voxels are fairly complicated to understand but can be defined most simply as
Volumetric Pixels. In 3D imaging and 3D printing, a voxel is a value on a grid in
three-dimensional space, like a pixel with volume.
Each voxel contains certain volumetric information which helps to create a three-
dimensional object with the required properties. A voxel is the smallest distinguishable element
of any 3D object and represents a certain grid value. However, unlike a pixel, a voxel
does not have a specific absolute position in three-dimensional space.
They are not bound by absolute coordinates but are defined by the relative position
of the surrounding voxels. We can equate a voxel to bricks, where the position of a brick is
defined by the relative position of the neighbouring bricks. One important aspect of every
voxel is repeatability: voxels have a defined shape and size and can be
stacked on top of each other to create a 3D object.
A voxel need not be a cube; it can take many forms, such as spherical, triangular, square,
rectangular or diamond-shaped, as long as it follows the principle of
repeatability.

Difference between Pixel and Voxel

Parameter    Pixel                            Voxel

Dimensions   Two-dimensional                  Three-dimensional

Position     Absolute position is known       Relative position is known

Volume       A pixel does not have volume     A voxel has volume, hence also called a ‘Volumetric Pixel’

Gray - Level Resolution:


 Gray - level resolution is defined as the smallest discernible change in gray level.
 Measuring the discernible changes in gray level depends on human perception. Therefore, it
is a ‘subjective’ process.
 Gray level resolution of an L-level digital image of size M x N = L levels.
Aliasing Effect
Aliasing is an unwanted effect which is always present in a sampled image.
Band-Limited Functions:
A function can be represented as a sum of sines and cosines of various frequencies. The sine/cosine
component with the highest frequency determines the ‘highest frequency content’ of the function.
If this highest frequency is finite and the function is of unlimited duration, it is known as
a ‘band-limited function’.
Sampling Rate:
Sampling rate is defined as the number of samples taken per unit distance.

COLOR MODELS

A color model is a specification of a coordinate system and a subspace within that system where each
color is represented by a single point. The color model can also be called as color space or color system.
It is used to specify the colors easily in a standard and accepted way.

Classification

A number of color models are in use today. Based on the use, they can be classified as
below.

(i) Hardware - Oriented Color Models

RGB (Red, Green, Blue) model for color monitors and color video cameras.

CMY (Cyan, Magenta, Yellow) model and CMYK (Cyan, Magenta, Yellow, Black) model
for color printing.

HSI (Hue, Saturation, Intensity) model.


(ii) Application - Oriented Color Models

These models are used in applications where color manipulation is a goal.

One example is the creation of color graphics for animation.

The RGB (Red, Green, Blue) Color Model


In the RGB color model, each color appears in its primary spectral components of red, green and
blue. Therefore, images represented in the RGB model consist of three component images, one for each
primary color. When these are fed into an RGB monitor, the three images combine on the phosphor
screen to produce a composite color image. Thus an RGB color image can be viewed as three
monochrome intensity images.

(1)RGB Color Cube

The RGB model is based on a Cartesian Coordinate System. The RGB color cube is a color subspace
which is shown in fig. 1.29.
Fig. 1.29 The RGB Color Cube

Applications:

The applications of RGB color model include

Color monitors
Color video cameras etc.
Advantages

The RGB model has the following advantage.

Creating colors in this model is an easy process, and therefore it is an ideal tool for ‘image color
generation’. Changing to other models such as CMY is straightforward, and the model is suitable for hardware
implementation.

The RGB system matches well the fact that the human eye is strongly perceptive to red, green and blue primaries.

Disadvantages

The drawbacks of RGB color model are

Describing a color as a combination of three primary images does not match the way humans describe color.

This model is not suitable for describing colors in a way which is practical for human
interpretation.

For example, the color of a flower is not represented by the percentage of each of the primaries present
in that flower. Therefore, it is less suitable for color description.

The HSI (Hue, Saturation, Intensity) Color model


In the HSI color model, an image is described by its Hue, Saturation and Intensity which is similar
to the way of human interpretation.

Hue - a color attribute that describes a pure color.

Saturation - a measure of the degree to which a pure color is diluted by white light.

Intensity - a measurable and interpretable descriptor of monochromatic images, which is
also called the ‘gray level’.

Hue and saturation together carry the color information in an
image. The HSI model separates the intensity component from this color-carrying information.
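As an illustration of how the HSI model separates intensity from the color-carrying information, here is a Python/NumPy sketch of one commonly used set of RGB-to-HSI conversion formulas, assuming the RGB image is a floating-point array scaled to [0, 1] (the function name and the small epsilon guard are illustrative):

import numpy as np

def rgb_to_hsi(rgb):
    # rgb: array of shape (height, width, 3) with values in [0, 1]
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    eps = 1e-8                                    # guards against division by zero
    intensity = (r + g + b) / 3.0
    saturation = 1.0 - 3.0 * np.minimum(np.minimum(r, g), b) / (r + g + b + eps)
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    hue = np.where(b > g, 360.0 - theta, theta)   # hue in degrees, range [0, 360)
    return np.stack([hue, saturation, intensity], axis=-1)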

MEDICAL IMAGE FORMAT:

Representation of medical images requires two major components, namely metadata and image data,
as shown in Fig. 1. Metadata provides the structure of and information about the image, and can be
added to an image automatically by the capturing device.
Metadata is stored at the beginning of an image file, or in a separate file, as a header containing technical,
descriptive and administrative metadata. Technical metadata includes the image dimensions, pixel depth, spatial
resolution, camera settings, and dots per inch. Descriptive metadata contains the image creator, keywords related to the
image, captions, titles and comments. Administrative metadata is added manually and holds information such as
licensing rights, restrictions on reuse, and contact information for the owner.

Image data consists of the raw intensity values that represent the pixels. Each pixel has its own memory space
to store its intensity, expressed in bits and called the depth per pixel. In a binary image, each pixel is stored in a single bit,
either zero or one. X-ray, CT and MRI produce grey-scale images with 8 or 16 bits per pixel. The number of gray levels
between black and white depends on the number of bits used to represent the pixel. PET and SPECT scanners give
colour images with 24 bits per pixel in a red-green-blue palette. The size of the image data is calculated from
the metadata as in Eqn. (1): size (in bytes) = number of rows x number of columns x bits per pixel (or voxel) / 8.
Fig. 1. Medical image file components

DICOM :

DICOM (Digital Imaging and Communications in Medicine) is a standard protocol for the
management and transmission of medical images and related data and is used in many healthcare
facilities. DICOM was originally developed by the National Electrical Manufacturers Association
(NEMA) and the American College of Radiology (ACR). It is a registered trademark of NEMA and is
governed by the DICOM Standards Committee, a collaboration of users across all medical imaging
specialties with an interest in the standardization of medical imaging information.

What is DICOM used for

DICOM is the international standard to communicate and manage medical images and data. Its mission
is to ensure the interoperability of systems used to produce, store, share, display, send, query, process, retrieve
and print medical images, as well as to manage related workflows. Vendors who manufacture imaging equipment
-- e.g., MRIs -- imaging information systems -- e.g., PACS -- and related equipment often observe DICOM
standards, according to NEMA.
These standards can apply to any field of medicine where medical imaging technology is predominantly
used, such as radiology, cardiology, oncology, obstetrics and dentistry. Medical imaging is typically a
noninvasive process of creating a visual representation of the interior of a patient, which is otherwise hidden
beneath the skin, muscle, and surrounding organ systems, for diagnostic purposes.
The term non-invasive in this context means instruments are not introduced into the patient's body -- in
most cases -- when performing a scan. Medical images are used for clinical analysis, diagnosis and treatment as
part of a patient's care plan. The information collected can be used to identify any anatomical and physiological
abnormalities, chart the progress of treatment, and provide clinicians with a database of normal patient scans for
later reference.
Imaging information systems, in compliance with DICOM, have largely eliminated the need for film-
based images and the physical storage of these items. Instead, these days, medical images, as well as related non-
image data, can be securely stored digitally, whether on premises or in the cloud.

Importance of DICOM:

With the introduction of advanced imaging technologies, such as CT scans, and the growing use of
computing in clinical work, ACR and NEMA saw a need for a standard method to transfer images and associated
information between different vendor devices, according to the International Organization for Standardization.
These devices produce a variety of digital image formats.
Today, DICOM is used worldwide to store, exchange and transmit medical images, enabling the
integration of medical imaging devices from multiple manufacturers. Patient data and related images are
exchanged and stored in a standardized format. Without a standards-based approach, it would be difficult to share
data between different imaging devices because they would need to interpret multiple image formats.
With DICOM, physicians have easier access to images and reports, allowing them to make a diagnosis,
potentially from anywhere in the world. In turn, patients obtain more efficient care.
Not all medical images follow a DICOM format, which has led to the development of cross-enterprise document
sharing, or XDS. An extension known as XDS-I is specific to imaging and allows the storage of multiple image
formats. Many medical imaging system vendors offer features that interpret DICOM and non-DICOM formats.

The DICOM standard:


The DICOM Standard is an ever-evolving outline of digital imaging management standards. DICOM
provides a multipart document that details the history, scope, goals and structure of the standard. This information
is available online in various file formats and is revised and republished regularly. As of this publishing, there are
21 separate parts covering everything from DICOM's overview, definitions, data structures and encoding, media
storage, media formats, security, and other important aspects of DICOM's standardization and protocols.
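In practice, DICOM files are normally read with an existing library rather than by parsing the format by hand. A minimal sketch using the pydicom library in Python (the file name is hypothetical, and only a few of the many available attributes are shown):

import pydicom

ds = pydicom.dcmread("ct_slice.dcm")     # hypothetical file name
print(ds.Modality, ds.Rows, ds.Columns)   # a few header (metadata) elements
pixels = ds.pixel_array                   # image data as a NumPy array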

Analyze 7.5 :
Analyze 7.5 is a file format for storing MRI data, developed by the Biomedical Imaging Resource (BIR)
at the Mayo Foundation in the late 1980s. Analyze 7.5 uses two files to represent the image data and metadata,
with the extensions “.img” and “.hdr”.
The image database is the system of files that the ANALYZE package uses to organize and access
image data on the disk. Facilities are provided for converting data from a number of sources for use with the
package. A description of the database format is provided to aid developers in porting images from other sources
for use with the ANALYZE system. An ANALYZE image database consists of at least two files:
• an image file
• a header file

The files have the same name, being distinguished by the extensions .img for the image file and .hdr
for the header file. Thus, for the image database heart, there are the UNIX files heart.img and heart.hdr. The
ANALYZE™ programs all refer to this pair of files as a single entity named heart.

Image File:

The format of the image file is very simple, containing usually uncompressed pixel data for the images
in one of several possible pixel formats.
The .img file contains the voxel raw data. The .hdr file contains information
about the .img file, such as the number of voxels in each dimension and the voxel size. The Analyze 7.5 header has a size of
348 bytes.

Header File:

The header file is represented here as a `C' structure which describes the dimensions and history of the
pixel data. The header structure consists of three substructures:

• header_key describes the header

• image_dimension describes image sizes

• data_history optional

The header file is constructed as a C structure containing three substructures: header key (40
bytes), image dimension (108 bytes), and data history (200 bytes).
Header key and image dimension are essential structures, while data history is optional. The Analyze
header is flexible and can be adapted with new user-defined data types.
The header key contains several elements, namely sizeof_header, extents, and regular. The element
sizeof_header indicates the size of the header in bytes. The element regular indicates that all images in the
volume are the same size. The image dimension substructure has several elements such as the X, Y, Z dimensions of the data, the spatial units
of measure for a voxel, the datatype, and the pixel dimensions in millimetres.

NIFTI :

NIFTI overcomes the drawbacks of Analyze 7.5 and was developed in the early 2000s by the National Institutes of
Health. The NIFTI file format is largely compatible with Analyze 7.5: data stored in NIFTI format can also use a
pair of files, “.img” and “.hdr”. Working with a pair of files is inconvenient and error-prone. To address
these issues, NIFTI allows the header and data to be stored as a single file with the “.nii” extension. When transmitting
a NIFTI file over a network, the deflate algorithm can be used to compress and decompress the file.
In the NIFTI file format, the first three dimensions store the three spatial coordinates x, y, z, and the fourth
dimension is reserved for the time point t. The NIFTI header has a size of 348 bytes in the case
of “.hdr/.img” and 352 bytes in the case of “.nii”. The NIFTI header parameters are very similar to
those of ANALYZE 7.5, as given in Table 3. Header information can be read through the niftiinfo()
function in MATLAB. The NIFTI header info of the Brain Imaging of Normal Subjects (BRAINS) dataset is
given in Table 4.
NIFTI was created for handling neuro-imaging but can be used in other fields as well. NIFTI has
several features: the raw data are saved as a 3D image, and the header contains two affine coordinate transforms that relate
voxel indices to spatial locations. NIFTI has the advantage of storing a whole 3D scan in a single file (or one
.hdr/.img pair) instead of handling multiple Analyze files. A NIFTI file can also store additional parameters such as
key acquisition parameters and the experimental design.
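Both ANALYZE 7.5 pairs (.hdr/.img) and NIFTI files (.nii) can be read with the nibabel library in Python; a minimal sketch (the file name is hypothetical):

import nibabel as nib

img = nib.load("brain_scan.nii")     # also accepts an ANALYZE or NIFTI .hdr/.img pair
print(img.shape)                      # e.g. (x, y, z) or (x, y, z, t)
print(img.header)                     # header fields: dimensions, voxel size, datatype, ...
data = img.get_fdata()                # image data as a floating-point NumPy array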

ARITHMETIC AND LOGICAL OPERATIONS ON IMAGES:


ARITHMETIC CODING:
Arithmetic coding is one of the variable-length coding methods used to reduce the
coding redundancies present in an image.

Concept:
Unlike other variable-length codes, arithmetic coding generates non-block codes. There is no
one-to-one correspondence between source and code symbols in which a code word exists for each
source symbol. Instead, a whole sequence of source symbols is assigned a single arithmetic code word.

Features:
 An interval of real numbers between 0 and 1 is defined by the code words for each
sequence of source symbols.

 When the number of symbols in the message increases, two changes can happen:

(i) The interval for representing the message becomes smaller according to the
probability of each symbol.

(ii) The number of bits for representing the interval becomes larger.

 This coding method satisfies the noiseless coding theorem.

These operations are applied on pixel-by-pixel basis. So, to add two images together, we add the value at
pixel (0 , 0) in image 1 to the value at pixel (0 , 0) in image 2 and store the result in a new image at pixel (0 , 0).
Then we move to the next pixel and repeat the process, continuing until all pixels have been visited. Clearly, this
can work properly only if the two images have identical dimensions. If they do not, then combination is still
possible, but a meaningful result can be obtained only in the area of overlap. If our images have dimensions of
w1*h1, and w2*h2 and we assume that their origins are aligned, then the new image will have dimensions w*h,
where:
w = min (w1, w2)
h = min (h1, h2)
Addition and Averaging:
If we add two 8-bit gray scale images, then pixels in the resulting image can have values in the range 0-510.
We should therefore choose a 16-bit representation for the output image or divide every pixel value by two. If we
do the latter, then we are computing an average of the two images. The main application of image averaging is
noise removal. Every image acquired by a real sensor is afflicted to some degree by random noise. However, the
level of noise represented in the image can be reduced, provided that the scene is static and unchanging, by
averaging multiple observations of that scene. This works because the noise distribution can be regarded as
approximately symmetrical with a mean of zero. As a result, positive perturbations of a pixel value are
as likely as negative perturbations of the same magnitude, and there will be a tendency for the perturbations to
cancel out when several noisy values are added. Addition can also be used to combine the information of two
images, as in image morphing in motion pictures.
Algorithm 1: image addition (a runnable Python/NumPy version; the two input arrays are assumed to be 8-bit images of identical size, and reading/writing the image files is left to the caller)

import numpy as np

def add_images(in_array1, in_array2):
    rows, cols = in_array1.shape
    out_array = np.zeros((rows, cols), dtype=np.uint8)
    for i in range(rows):
        for j in range(cols):
            s = int(in_array1[i, j]) + int(in_array2[i, j])
            out_array[i, j] = 255 if s > 255 else s   # clamp to the 8-bit maximum
    return out_array
Subtraction:
Subtracting two 8-bit grayscale images can produce values between -255 and +255. This necessitates the use of
16-bit signed integers in the output image, unless the sign is unimportant, in which case we can simply take the
modulus (absolute value) of the result and store it using 8-bit integers:
g(x, y) = |f1(x, y) - f2(x, y)|
The main application for image subtraction is in change detection (or motion detection). If we make two
observations of a scene and compute their difference using the above equation, then changes will be indicated by
pixels in the difference image which have non-zero values. Sensor noise, slight changes in illumination and
various other factors can result in small differences which are of no significance, so it
is usual to apply a threshold to the difference image. Differences below this threshold are set to zero; differences
above the threshold can, if desired, be set to the maximum pixel value. Subtraction can also be used in medical
imaging to remove static background information.
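A minimal Python/NumPy sketch of thresholded differencing for change detection, assuming two 8-bit frames of the same size (the function name and the threshold value are illustrative):

import numpy as np

def change_mask(frame1, frame2, threshold=30):
    # absolute difference of the two frames; small differences are treated as noise
    diff = np.abs(frame1.astype(np.int16) - frame2.astype(np.int16))
    # differences below the threshold are set to 0, those above to the maximum value
    return np.where(diff > threshold, 255, 0).astype(np.uint8)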
Multiplication and Division:

Multiplication and division can be used to adjust brightness of an image. Multiplication of pixel values by
a number greater than one will brighten the image, and division by a factor greater than one will darken the image.
Brightness adjustment is often used as a preprocessing step in image enhancement. One of the principal uses of
image multiplication (or division) is to correct grey-level shading resulting from non-uniformities in illumination
or in the sensor used to acquire the image.

LOGICAL OPERATION:

Logical operations apply only to binary images, whereas arithmetic operations apply to multi-valued
pixels. Logical operations are basic tools in binary image processing, where they are used for tasks such as
masking, feature detection, and shape analysis. Logical operations on entire image are performed pixel by pixel.
Because the AND operation of two binary variables is 1 only when both variables are 1, the result at any location
in a resulting AND image is 1 only if the corresponding pixels in the two input images are 1. As logical operations
involve only one pixel location at a time, they can be done in place, as in the case of arithmetic operations. The
XOR (exclusive OR) operation yields a 1 when one or the other pixel (but not both) is 1, and it yields a 0 otherwise.

The XOR operation thus differs from the OR operation, which is 1 when one or the other pixel is 1, or when both pixels are 1.
Logical AND and OR operations are useful for masking and compositing images. For example, if we
compute the AND of a binary image with some other image, then pixels for which the corresponding value in the
binary image is 1 will be preserved, but pixels for which the corresponding binary value is 0 will be set to 0
(erased). Thus the binary image acts as a mask that removes information from certain parts of the image. On the
other hand, if we compute the OR of a binary image with some other image, the pixels for which the
corresponding value in the binary image is 0 will be preserved, but pixels for which the corresponding binary
value is 1 will be set to 1 (cleared to white).
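A minimal Python/NumPy sketch of AND-style masking and the basic logical operations, assuming the mask is a binary array of 0s and 1s with the same size as the image (the function names are illustrative):

import numpy as np

def mask_image(image, binary_mask):
    # keep pixels where the mask is 1, set pixels to 0 (erase) where the mask is 0
    return image * binary_mask

def logical_ops(a, b):
    # pixel-by-pixel AND, OR and XOR of two binary images
    return np.logical_and(a, b), np.logical_or(a, b), np.logical_xor(a, b)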

IMAGE QUALITY AND SIGNAL TO NOISE RATIO:

Image quality can refer to the level of accuracy with which different imaging systems capture, process,
store, compress, transmit and display the signals that form an image. Another definition refers to image quality
as "the weighted combination of all of the visually significant attributes of an image". The difference between
the two definitions is that one focuses on the characteristics of signal processing in different imaging systems
and the latter on the perceptual assessments that make an image pleasant for human viewers. Image quality
should not be mistaken with image fidelity. Image fidelity refers to the ability of a process to render a given
copy in a perceptually similar way to the original (without distortion or information loss), i.e., through
a digitization or conversion process from analog media to digital image.

IMAGE QUALITY ATTRIBUTES:


Sharpness determines the amount of detail an image can convey. System sharpness is affected by the lens
(design and manufacturing quality, focal length, aperture, and distance from the image center) and sensor (pixel
count and anti-aliasing filter). In the field, sharpness is affected by camera shake (a good tripod can be helpful),
focus accuracy, and atmospheric disturbances (thermal effects and aerosols). Lost sharpness can be restored by
sharpening, but sharpening has limits. Oversharpening can degrade image quality by causing "halos" to appear
near contrast boundaries. Images from many compact digital cameras are sometimes oversharpened to
compensate for lower image quality.

Noise is a random variation of image density, visible as grain in film and pixel level variations in digital images.
It arises from the effects of basic physics— the photon nature of light and the thermal energy of heat— inside
image sensors. Typical noise reduction (NR) software reduces the visibility of noise by smoothing the image,
excluding areas near contrast boundaries. This technique works well, but it can obscure fine, low contrast detail.

Dynamic range (or exposure range) is the range of light levels a camera can capture, usually measured in f-
stops, EV (exposure value), or zones (all factors of two in exposure). It is closely related to noise: high noise
implies low dynamic range.

Tone reproduction is the relationship between scene luminance and the reproduced image brightness.

Contrast, also known as gamma, is the slope of the tone reproduction curve in a log-log space. High contrast
usually involves loss of dynamic range — loss of detail, or clipping, in highlights or shadows.

Color accuracy is an important but ambiguous image quality factor. Many viewers prefer enhanced color
saturation; the most accurate color isn't necessarily the most pleasing. Nevertheless, it is important to measure a
camera's color response: its color shifts, saturation, and the effectiveness of its white balance algorithms.

Distortion is an aberration that causes straight lines to curve. It can be troublesome for architectural
photography and metrology (photographic applications involving measurement). Distortion tends to be
noticeable in low cost cameras, including cell phones, and low cost DSLR lenses. It is usually very easy to see
in wide angle photos and can now be corrected in software.

Vignetting, or light falloff, darkens images near the corners. It can be significant with wide angle lenses.

Exposure accuracy can be an issue with fully automatic cameras and with video cameras where there is little or
no opportunity for post-exposure tonal adjustment. Some even have exposure memory: exposure may change
after very bright or dark objects appear in a scene.

Lateral chromatic aberration (LCA), also called "color fringing", including purple fringing, is a lens
aberration that causes colors to focus at different distances from the image center. It is most visible near corners
of images. LCA is worst with asymmetrical lenses, including ultrawides, true telephotos and zooms. It is
strongly affected by demosaicing.

Lens flare, including "veiling glare" is stray light in lenses and optical systems caused by reflections between
lens elements and the inside barrel of the lens. It can cause image fogging (loss of shadow detail and color) as
well as "ghost" images that can occur in the presence of bright light sources in or near the field of view.

Color moiré is artificial color banding that can appear in images with repetitive patterns of high spatial
frequencies, like fabrics or picket fences. It is affected by lens sharpness, the anti-aliasing (lowpass) filter
(which softens the image), and demosaicing software. It tends to be worst with the sharpest lenses.

Artifacts – software (especially operations performed during RAW conversion) can cause significant visual
artifacts, including data compression and transmission losses (e.g. Low quality JPEG), over sharpening "halos"
and loss of fine, low-contrast detail.

Signal-to-noise ratio (SNR) is used in imaging to characterize image quality. The sensitivity of
a (digital or film) imaging system is typically described in the terms of the signal level that yields a threshold
level of SNR. Industry standards define sensitivity in terms of the ISO film speed equivalent, using SNR
thresholds (at average scene luminance) of 40:1 for "excellent" image quality and 10:1 for "acceptable" image
quality.

SNR is sometimes quantified in decibels (dB) of signal power relative to noise power, though in the
imaging field the concept of "power" is sometimes taken to be the power of a voltage signal proportional to
optical power; so a 20 dB SNR may mean either 10:1 or 100:1 optical power, depending on which definition is
in use.

DEFINITION OF SNR:

Traditionally, SNR is defined to be the ratio of the average signal value to the standard deviation of the signal,

SNR = μ / σ,

when the signal is an optical intensity, or as the square of this value if the signal and noise are viewed as
amplitudes (field quantities).
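Using that definition, SNR can be estimated from a nominally uniform image region, where the variation is assumed to be due to noise. A minimal Python/NumPy sketch (the function names are illustrative):

import numpy as np

def snr(region):
    # ratio of the average signal value to the standard deviation of the signal
    return np.mean(region) / np.std(region)

def snr_db(region):
    # the same ratio expressed in decibels (20 log10 for amplitude-like quantities)
    return 20.0 * np.log10(snr(region))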

TRANSFORMATION:

Transformation is a function. A function that maps one set to another set after performing some operations.

Digital Image Processing system


We have already seen in the introductory material that in digital image processing we develop a system
whose input is an image and whose output is also an image. The system performs some
processing on the input image and gives a processed image as its output, as shown below.

The function applied inside this digital system that processes an image and converts it into the output is called the
transformation function. It expresses the transformation, or relation, by which image1 is converted to image2.

Image transformation:

Consider this equation

G(x,y) = T{ f(x,y) }

In this equation,

f(x,y) = the input image on which the transformation function has to be applied.

g(x,y) = the output image or processed image.

T is the transformation function.

This relation between input image and the processed output image can also be represented as.

s = T (r)

where r is the pixel value or gray-level intensity of f(x,y) at any point, and s is the pixel value or
gray-level intensity of g(x,y) at the corresponding point. The basic gray-level transformations have been
discussed in the tutorial on basic gray-level transformations. Now we are going to discuss some of the very basic
transformation functions.

Examples

Consider this transformation function.


Let the input gray level r run from 0 to 255, and take the point p to be 127. Consider the output to be a one-bpp
(1 bit per pixel) image, so there are only two output levels of intensity, 0 and 1. The transformation shown by the
graph can then be explained as follows: all pixel intensity values below 127 (point p) become 0, i.e. black, and all
pixel intensity values greater than 127 become 1, i.e. white. At the exact point 127 there is
a sudden transition, so we cannot say whether the value at that exact point is 0 or 1.

Mathematically this transformation function can be denoted as

s = 0 for r < 127, and s = 1 for r > 127.

Consider another transformation like this

Now if you look at this particular graph, you will see a straight transition line between the input image and
the output image. It shows that for each pixel or intensity value of the input image, there is the same intensity value in the
output image. That means the output image is an exact replica of the input image. It can be mathematically
represented as:

g(x,y) = f(x,y)
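Both transformations above are point operations applied to every pixel. A minimal Python/NumPy sketch, using the threshold p = 127 from the example (the function names are illustrative):

import numpy as np

def threshold_transform(image, p=127):
    # s = 0 for r < p and s = 1 for r > p; the output is scaled to 0 / 255 for display
    return np.where(image > p, 255, 0).astype(np.uint8)

def identity_transform(image):
    # s = r : the output image is an exact replica of the input image
    return image.copy()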

INTERPRET THE BASICS OF IMAGE MODELS:

All the processing methods of digital images can be broadly divided into two categories. They are,

(1) Methods whose input and output are images


(2) Methods whose inputs are images, but whose outputs are attributes i.e. features extracted from those
images.
Among the above shown modules,

(1), (2), (3), (4), (5), (6) - methods whose outputs are images

(7), (8), (9), (10) - methods whose outputs are image attributes

Knowledge Base

 This indicates the knowledge about a problem domain.

 The knowledge base may either be simple such as the details of image regions or it may be
complex such as an image database containing high-resolution satellite images for change-
detection applications.
 It guides the operation of each processing module in Fig. 1.1.

 It also controls the interaction between processing modules.

(1) Image Acquisition

- Image acquisition is the process of capturing or generating digital images using


‘imaging sensors’.

- It can be considered as a simple process in which an image in digital form is given.

- Usually, this stage involves ‘preprocessing’ such as scaling.

(2) Image Enhancement

- Image enhancement is the process of manipulating an image so that the result is more
suitable than the original image for a specific application.

- There are a variety of enhancement techniques that use so many different image
processing approaches.

- These enhancement methods are ‘subjective’ and hence problem oriented.

(3) Image Restoration

- Image restoration is also the process of improving the appearance of an image.


- Restoration techniques are ‘objective’ i.e they are based on mathematical or probabilistic
models of image degradation.

(4) Color Image Processing

- Color is one of the important features that can be extracted from an image.

- Color image processing techniques process an image considering its color as one of the
important attributes, in addition to other attributes.

(5) Wavelets And Multiresolution Processing


- Wavelets are used to represent images in various degrees of resolution.
- These wavelets are mainly used for
 Image data compression and
 Pyramidal representation – this is the process of subdividing images
successively into smaller regions.
(6) Image Compression
- Image compression is the process of reducing the storage required to save an image or it is
the process of reducing the bandwidth required to transmit an image.

- Some compression standards are also defined.

(7) Morphological Processing


- Morphological processing deals with the tools for extracting image components.
- These components will be used in the representation and description of shape.
(8) Image Segmentation
- Image segmentation is the process of partitioning or dividing an image into its constituent
parts or objects.
- There are a number of algorithms available for segmentation procedures.
- Segmentation is usually done to recognize the objects from an image. If the segmentation is
more accurate, there will be successful recognition.
- The output of this process is raw pixel data.
(9) Image Representation And Description
- Image representation is a process that is used to convert the output of segmentation process
into a form suitable for computer processing.
- Two types of representations are
 Boundary Representation – focuses on external shape characteristics such as
corners and inflections.
 Regional Representation – focuses on internal properties such as texture or skeletal
shape.
- Image description is the process of extracting the attributes from an image that are used to
give some quantitative information of interest.
- Description can also be called the process of ‘Feature selection’.

(10) Image Recognition

- Image recognition is a process that assigns a label or name to an object identified


from an image, based on its descriptors.
Output

 The output of processing the image can be obtained from any stage of the modules shown in fig.
1.1.

 The modules required for an application depend entirely on the problem to be solved.
UNIT II ENHANCEMENT TECHNIQUES 9+3

Gray level transformation- Log transformation, Power law transformation, Piecewise linear transformation.
Histogram processing- Histogram equalization, Histogram Matching. Spatial domain Filtering-Smoothing filters,
sharpening filters. Frequency domain filtering- Smoothing filters, Sharpening filters- Homomorphic filtering -
Medical image enhancement using Hybrid filters- Performance measures for enhancement techniques.
Experiment with various filtering techniques for noise reduction and enhancement in medical images using
Matlab.

Enhancement by Point Processing

The principal objective of enhancement is to process an image so that the result is more suitable than
the original image for a specific application. Image enhancement approaches fall into two broad categories:
 Spatial domain methods
 Frequency domain methods

The term spatial domain refers to the image plane itself, and approaches in this category are based on direct
manipulation of the pixels in an image:
g(x,y) = T[f(x,y)]

Spatial domain processes are denoted by this expression, where f(x,y) is the input image, T is an operator on f defined
over some neighborhood of (x,y), and g(x,y) is the processed image. The neighborhood of a point (x,y) is usually
a square or rectangular sub-image area centered at (x,y).

The center of the sub-image is moved from pixel to pixel, starting at the top left corner. The operator T is
applied at each location (x,y) to find the output g at that location. The process uses only the pixels in the
area of the image spanned by the neighborhood. The simplest form of the transformation occurs when the
neighborhood is of size 1x1. In this case g depends only on the value of f at (x,y) and T becomes a gray-
level transformation function of the form

s = T(r)
r - denotes the gray level of f(x,y)
s - denotes the gray level of g(x,y) at any point (x,y)

Because enhancement at any point in an image depends only on the gray level at that point, techniques
in this category are referred to as point processing. There are basically three kinds of functions used in gray-
level transformation: linear (negative and identity), logarithmic (log and inverse-log) and power-law (nth power and nth root).
Point Processing:
Contrast stretching - It produces an image of higher contrast than the original. The operation is
performed by darkening the levels below m and brightening the levels above m in the original image.

In this technique the values of r below m are compressed by the transformation function into a narrow
range of s towards black. The opposite effect takes place for the values of r above m.
Thresholding function: It is a limiting case where T(r) produces a two-level binary image.
The values below m are transformed to black and those above m are transformed to white.

Basic Gray Level Transformation:


These are the simplest image enhancement techniques.
Image Negative: The negative of an image with gray levels in the range [0, L-1] is obtained by
using the negative transformation. The expression of the transformation is
s = L - 1 - r

Reversing the intensity levels of an image in this manner produces the equivalent of a photographic
negative. This type of processing is particularly suited for enhancing white or gray details embedded in dark
regions of an image, especially when the black areas are dominant in size.
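
A minimal MATLAB sketch of the negative transformation for an 8-bit image is given below; the file name 'mri.png' is only a placeholder for any grayscale image.

% Minimal sketch: image negative of an 8-bit grayscale image (L = 256).
f = imread('mri.png');      % placeholder file name, any grayscale image
g = 255 - f;                % s = L - 1 - r applied to every pixel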
Log transformations:
The general form of the log transformation is

s = c log(1 + r)
where c is a constant and r ≥ 0.

This transformation maps a narrow range of low gray level values in the input image into a wider range of output
gray levels; the opposite is true for higher values of the input levels. We use this transformation to
expand the values of dark pixels in an image while compressing the higher-level values. The opposite is true
for the inverse log transformation. The log transformation has the important characteristic that it
compresses the dynamic range of images with large variations in pixel values.
E.g., display of the Fourier spectrum.
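
A minimal MATLAB sketch of the log transformation follows; choosing c so that the maximum input maps to 255 is an illustrative assumption, and the file name is a placeholder.

% Minimal sketch: log transformation, e.g. for displaying a Fourier spectrum.
r = double(imread('mri.png'));        % placeholder file name
c = 255 / log(1 + max(r(:)));         % scale the output to the 8-bit range
s = c * log(1 + r);                   % s = c log(1 + r)
g = uint8(s);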

Power Law Transformation:

The power law transformation has the basic form

s = c r^γ

where c and γ are positive constants. Power law curves with fractional values of γ map a narrow range of dark input values into a wider
range of output values, with the opposite being true for higher values of input gray levels. We may
obtain a family of curves by varying the value of γ.
A variety of devices used for image capture, printing and display respond according to a power law. The
process used to correct this power law response phenomenon is called gamma correction. For example, CRT devices
have an intensity-to-voltage response that is a power function.
Gamma correction is important if displaying an image accurately on a computer screen is of concern.
Images that are not corrected properly can look either bleached out or too dark. Color display also uses this
concept of gamma correction. It is becoming more popular due to the use of images over the internet. It is important
in general purpose contrast manipulation. A value of γ > 1 darkens an image, while γ < 1 brightens it.
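
A minimal MATLAB sketch of the power-law transformation is shown below; the gamma value and file name are illustrative assumptions.

% Minimal sketch: power-law (gamma) transformation s = c * r^gamma.
r   = double(imread('mri.png')) / 255;  % placeholder file name; normalize to [0,1]
gam = 0.5;                              % illustrative value; gam < 1 brightens
s   = r .^ gam;                         % c = 1
g   = uint8(255 * s);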

Piecewise linear transformation functions


The principal advantage of piecewise linear functions is that they can be arbitrarily
complex. However, their specification requires considerably more user input.

 Contrast Stretching
It is the simplest piecewise linear transformation function. Low contrast images may result
from various causes such as poor illumination, problems in the imaging sensor or a wrong
setting of the lens aperture during image acquisition. The idea behind contrast stretching is to increase the
dynamic range of gray levels in the image being processed.
The locations of the points (r1, s1) and (r2, s2) control the shape of the curve, as shown in the sketch after this list.
a) If r1 = s1 and r2 = s2, the transformation is a linear function that produces no change in
gray levels.
b) If r1 = r2, s1 = 0 and s2 = L-1, the transformation becomes a thresholding
function that creates a binary image.
c) Intermediate values of (r1, s1) and (r2, s2) produce various degrees of spread in the
gray values of the output image, thus affecting its contrast. Generally r1 ≤ r2 and s1 ≤ s2 so that
the function is single valued and monotonically increasing.
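
A minimal MATLAB sketch of this piecewise mapping is given below; the control-point values are illustrative only and the file name is a placeholder.

% Minimal sketch: piecewise linear contrast stretching with control points
% (r1,s1) and (r2,s2).
r  = double(imread('mri.png'));       % placeholder file name, 8-bit image
r1 = 70;  s1 = 10;  r2 = 180;  s2 = 245;
g  = zeros(size(r));
lo  = r <  r1;
md  = r >= r1 & r <= r2;
hi  = r >  r2;
g(lo) = (s1 / r1) .* r(lo);
g(md) = s1 + (s2 - s1) / (r2 - r1) .* (r(md) - r1);
g(hi) = s2 + (255 - s2) / (255 - r2) .* (r(hi) - r2);
g = uint8(g);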

 Gray Level Slicing


Highlighting a specific range of gray levels in an image is often desirable, for example when enhancing
features such as masses of water in satellite images or flaws in X-ray images.
There are two ways of doing this:
(1) One method is to display a high value for all gray levels in the range of interest and
a low value for all other gray levels.

(2) The second method is to brighten the desired range of gray levels but preserve
the background and gray level tonalities in the image.

 Bit Plane Slicing


Sometimes it is important to highlight the contribution made to the total image appearance by
specific bits. Suppose that each pixel is represented by 8 bits. Imagine that an image is composed of
eight 1-bit planes ranging from bit plane 0 for the least significant bit to bit plane 7 for the most
significant bit. In terms of 8-bit bytes, plane 0 contains all the lowest order bits in the image and plane 7
contains all the highest order bits. The high order bits contain the majority of the visually significant data, while the
lower order bits contribute the more subtle details in the image. Separating a digital image into its bit planes is useful
for analysing the relative importance played by each bit of the image. It helps in determining the
adequacy of the number of bits used to quantize each pixel. It is also useful for image compression.
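
A minimal MATLAB sketch of bit plane slicing is shown below; the file name is a placeholder.

% Minimal sketch: extracting bit planes of an 8-bit image with bitget.
f = imread('mri.png');                % placeholder file name, uint8 image
plane7 = bitget(f, 8);                % most significant bit plane (values 0 or 1)
plane0 = bitget(f, 1);                % least significant bit plane
g = 255 * uint8(plane7);              % scale for display as a binary image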
Histogram Processing:
The histogram of a digital image with gray levels in the range [0, L-1] is a discrete function of
the form H(rk) = nk,
where rk is the kth gray level and nk is the number of pixels in the image having the gray level rk.
A normalized histogram is given by the equation

p (rk) = nk/n for k=0,1,2,…..,L-1

P(rk) gives an estimate of the probability of occurrence of gray level rk. The sum of all
components of a normalized histogram is equal to 1. The histogram plots are simple plots of

H(rk) = nk versus rk.

In a dark image the components of the histogram are concentrated on the low (dark) side of the gray scale. In
the case of a bright image the histogram components are biased towards the high side of the gray scale. The histogram
of a low contrast image will be narrow and will be centred towards the middle of the gray scale.
The components of the histogram of a high contrast image cover a broad range of the gray scale. The net effect
of this will be an image that shows a great deal of gray level detail and has a high dynamic range.

Histogram Equalization
Histogram equalization is a common technique for enhancing the appearance of images.
Suppose we have an image which is predominantly dark. Then its histogram would be skewed towards
the lower end of the grey scale and all the image detail would be compressed into the dark end of the
histogram. If we could ‘stretch out’ the grey levels at the dark end to produce a more uniformly
distributed histogram then the image would become much clearer.
Let r represent the gray levels of the image to be enhanced, treated as a continuous variable.
The range of r is [0, 1] with r = 0 representing black and r = 1 representing white. The
transformation function is of the form

s = T(r), where 0 ≤ r ≤ 1

It produces a level s for every pixel value r in the original image. The transformation function is
assumed to fulfil two conditions:
 (a) T(r) is single valued and monotonically increasing in the interval 0 ≤ r ≤ 1, and
(b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.
 The transformation function should be single valued so that the inverse
transformation exists. The monotonically increasing condition preserves the
increasing order from black to white in the output image.
 The second condition guarantees that the output gray levels will be in the same range as
the input levels. The gray levels of the image may be viewed as random variables in
the interval [0, 1]. The most fundamental descriptor of a random variable is its
probability density function (PDF); pr(r) and ps(s) denote the probability density
functions of the random variables r and s respectively. A basic result from elementary
probability theory states that if pr(r) and T(r) are known and T-1(s) satisfies condition
(a), then the probability density function ps(s) of the transformed variable s can be obtained from pr(r) and T(r).

For discrete values we deal with probabilities and summations instead of probability density functions
and integrals.
The transformation function is

sk = T(rk) = Σ nj/n = Σ pr(rj), where the sums run from j = 0 to k, for k = 0, 1, 2, ..., L-1.

Thus, an output image is obtained by mapping each pixel with level rk in the input image into a
corresponding pixel with level sk. Equalization automatically determines a transformation
function that seeks to produce an output image that has a uniform histogram. It is a good approach
when automatic enhancement is needed.
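
A minimal MATLAB sketch of discrete histogram equalization is given below; it builds the mapping directly from the cumulative normalized histogram (the Image Processing Toolbox function histeq performs the same task). The file name is a placeholder.

% Minimal sketch: histogram equalization of a uint8 image.
f   = imread('mri.png');                  % placeholder file name
n   = numel(f);
h   = histc(double(f(:)), 0:255);         % histogram counts n_k
p   = h / n;                              % p(r_k) = n_k / n
cdf = cumsum(p);                          % T(r_k) = sum of p(r_j), j <= k
map = uint8(255 * cdf);                   % output levels s_k in [0, 255]
g   = map(double(f) + 1);                 % apply the mapping to every pixel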
Enhancement Using Arithmetic/Logic Operations
 These operations are performed on a pixel-by-pixel basis between two or more images,
except the NOT operation, which is performed on a single image. Whether the actual
mechanism of implementation is sequential, parallel or simultaneous depends on the
hardware and/or software used.

 Logic operations are also generally performed on a pixel-by-pixel basis. The AND, OR
and NOT logical operators are functionally complete, because all other operators can
be implemented by using them. When applying the operations to gray scale
images, pixel values are processed as strings of binary numbers. The NOT logic
operation performs the same function as the negative transformation.

 Image masking is also referred to as region of interest (ROI) processing. This is done
to highlight a particular area and to differentiate it from the rest of the image. Of
the four arithmetic operations, subtraction and addition are the most useful for image
enhancement.

Image Subtraction
The difference between two images f(x,y) and h(x,y) is expressed as

g(x,y)=f(x,y)-h(x,y)
It is obtained by computing the difference between all pairs of corresponding pixels from f
and h.
 The key usefulness of subtraction is the enhancement of differences between images.
This concept is related to the gray scale transformation discussed earlier known as
bit plane slicing: the higher order bit planes of an image carry a significant amount of
visually relevant detail, while the lower planes contribute the fine details.
 If we subtract the four least significant bit planes from the image, the result will be
nearly identical, but there will be a slight drop in the overall contrast due to less
variability in the gray level values of the image.
 The use of image subtraction is seen in the medical imaging area known as mask mode
radiography. The mask h(x,y) is an X-ray image of a region of a patient’s body. This
image is captured using an intensified TV camera located opposite the X-ray
machine. A contrast medium is then injected into the patient’s bloodstream and a
series of images is taken of the same region as h(x,y). The mask is then subtracted from
the series of incoming images. This subtraction highlights the areas that
differ between f(x,y) and h(x,y), and this difference appears as enhanced detail in
the output image.
 This produces a movie showing how the contrast medium propagates through the various
arteries of the area being viewed. Most images in use today are 8-bit images, so the
values of the image lie in the range 0 to 255. The values in the difference image can lie
from -255 to 255. For this reason we have to perform some sort of scaling to display the
results.
 There are two methods to scale a difference image (a sketch of the second is given after this list):
(i) Add 255 to every pixel and then divide by 2. This guarantees that the
pixel values will be in the range 0 to 255, but it does not guarantee that they will cover
the entire 8-bit range. It is a simple and fast method to implement, but it may not
utilize the entire gray scale range to display the results.
(ii) Another approach is to
(a) Obtain the minimum value of the difference image.
(b) Add the negative of the minimum value to every pixel in the
difference image (this gives a modified image whose minimum value
is 0).
(c) Scale the difference image by multiplying each pixel
by the quantity 255/max, where max is the maximum value of the modified image.
 This approach is more complicated and slower to implement, but it utilizes the full gray scale range. Image subtraction is
also used in segmentation applications.
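
A minimal MATLAB sketch of the second scaling approach is shown below; f and h are assumed to be same-size images that actually differ.

% Minimal sketch: difference image scaled to the full 8-bit range (approach (ii)).
d = double(f) - double(h);            % values may lie in [-255, 255]
d = d - min(d(:));                    % steps (a)-(b): minimum becomes 0
g = uint8(255 * d / max(d(:)));       % step (c): multiply by 255/max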

Image Averaging
Consider a noisy image g(x,y) formed by the addition of noise η(x,y) to the original image
f(x,y):

g(x,y) = f(x,y) + η(x,y)

where f(x,y) is the original image and η(x,y) is the additive noise.

One simple way to reduce this granular noise is to take several identical pictures and average
them, thus smoothing out the randomness.
Assume that at every coordinate (x,y) the noise is uncorrelated and has zero
average value. The objective of image averaging is to reduce the noise content by averaging a set of noisy
images {gi(x,y)}. If an image ḡ(x,y) is formed by averaging K different noisy images,

ḡ(x,y) = (1/K) Σ gi(x,y),   i = 1, 2, ..., K

then E{ḡ(x,y)} = f(x,y), which means that ḡ(x,y) approaches f(x,y) as the number of noisy images used in the
averaging process increases. Image averaging is important in various applications such as the field
of astronomy, where imaging is done at very low light levels.
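
A minimal MATLAB sketch of image averaging is shown below; the noisy realizations are simulated with additive zero-mean Gaussian noise, and the file name and noise level are illustrative assumptions.

% Minimal sketch: averaging K noisy realizations of the same image.
f   = double(imread('mri.png'));      % placeholder file name
K   = 16;
acc = zeros(size(f));
for i = 1:K
    gi  = f + 20 * randn(size(f));    % one noisy realization g_i(x,y)
    acc = acc + gi;
end
gbar = acc / K;                       % noise variance is reduced by a factor of K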

Basics of Spatial Filtering


Spatial filtering is an example of a neighborhood operation; the operations are done on the
values of the image pixels in the neighborhood and the corresponding values of a sub image that has the
same dimensions as the neighborhood. This sub image is called a filter, mask, kernel, template or
window; the values in the filter sub image are referred to as coefficients rather than pixels. Spatial
filtering operations are performed directly on the pixel values (amplitude/gray scale) of the image. The
process consists of moving the filter mask from point to point in the image. At each point (x,y) the
response is calculated using a predefined relationship.

For linear spatial filtering the response is given by a sum of products of the filter coefficient and the
corresponding image pixels in the area spanned by the filter mask.

The result R of linear filtering with the filter mask at point (x,y) in the image is
the sum of products of the mask coefficients with the corresponding pixels directly under the mask. The
coefficient w(0,0) coincides with the image value f(x,y), indicating that the mask is centered at (x,y) when the
computation of the sum of products takes place.
For a mask of size m x n we assume m = 2a+1 and n = 2b+1, where a and b are non-negative integers. This
means that all masks are of odd size.
In general, linear filtering of an image f of size M x N with a filter mask of size m x n is given by
the expression

g(x,y) = Σ Σ w(s,t) f(x+s, y+t), with the sums running over s = -a, ..., a and t = -b, ..., b,

where a = (m-1)/2 and b = (n-1)/2.


To generate a complete filtered image this equation must be applied for x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ...,
N-1. Thus the mask processes all the pixels in the image. The process of linear filtering is similar to the
frequency domain concept called convolution. For this reason, linear spatial filtering is often referred to as
convolving a mask with an image. Filter masks are sometimes called convolution masks.

R = w1 z1 + w2 z2 + .... + wmn zmn


where the w’s are mask coefficients, the z’s are the values of the image gray levels corresponding to
those coefficients, and mn is the total number of coefficients in the mask.
An important point in implementing neighborhood operations for spatial filtering is the issue of what
happens when the center of the filter approaches the border of the image. There are three common options (see the sketch after this list):
i) Limit the excursion of the center of the mask so that it stays at a distance of at least (n-1)/2
pixels from the border. The resulting filtered image will be smaller than the original, but all of its
pixels will be processed with the full mask.
ii) Filter all pixels, but only with the section of the mask that is fully contained in the image. This
will create bands of pixels near the border that are processed with a partial mask.
iii) Pad the image by adding rows and columns of 0s, or pad by replicating
rows and columns. The padding is removed at the end of the process.
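
A minimal MATLAB sketch of linear spatial filtering is shown below; it uses a 3x3 averaging mask and zero padding (option iii), and the file name is a placeholder.

% Minimal sketch: linear spatial filtering with a 3x3 averaging mask.
f = double(imread('mri.png'));        % placeholder file name
w = ones(3) / 9;                      % 3x3 box (averaging) mask
g = conv2(f, w, 'same');              % 'same' keeps the original image size;
                                      % for a symmetric mask, convolution and
                                      % correlation give the same result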

Smoothing Spatial Filters


These filters are used for blurring and noise reduction. Blurring is used in pre-processing steps
such as removal of small details from an image prior to object extraction and bridging of small gaps in
lines or curves.
Smoothing Linear Filters
The output of a smoothing linear spatial filter is simply the average of the pixels contained in the
neighborhood of the filter mask. These filters are also called averaging filters or low pass filters. The
operation is performed by replacing the value of every pixel in the image by the average of the gray
levels in the neighborhood defined by the filter mask. This process reduces sharp transitions in gray
levels in the image.
A major application of smoothing is noise reduction, but because edges are also characterized by
sharp transitions, smoothing filters have the undesirable side effect of blurring edges.
Smoothing also reduces false contouring, an effect caused by using an insufficient number of
gray levels in the image. Irrelevant details, i.e. details that are not of interest, can also be removed
by these kinds of filters. A spatial averaging filter in which all coefficients are equal is sometimes
referred to as a “box filter”.
A weighted average filter is one in which the pixels are multiplied by different
coefficients.

The general implementation for filtering an M x N image with a weighted averaging filter of size
m x n is given by

g(x,y) = [Σ Σ w(s,t) f(x+s, y+t)] / [Σ Σ w(s,t)], with all sums over s = -a, ..., a and t = -b, ..., b.

Order Statistics Filter


These are nonlinear spatial filters whose response is based on ordering the pixels contained in
the image area encompassed by the filter and then replacing the value of the center pixel with the value
determined by the ranking result.
The best example of this category is the median filter. In this filter the value of the center pixel is
replaced by the median of the gray levels in the neighborhood of that pixel. Median filters are quite
popular because, for certain types of random noise, they provide excellent noise-reduction
capabilities, with considerably less blurring than linear
smoothing filters.
These filters are particularly effective in the case of impulse or salt-and-pepper noise, which is called
so because of its appearance as white and black dots superimposed on an image. The median ξ
of a set of values is such that half the values in the set are less than or equal to
ξ and half are greater than or equal to ξ. In order to perform median filtering at a point in an
image, we first sort the values of the pixel in question and its neighbors, determine
their median and assign this value to that pixel.
We now introduce some additional order-statistics filters. Order-statistics filters are spatial filters
whose response is based on ordering (ranking) the pixels contained in the image area
encompassed by the filter. The response of the filter at any point is determined by the
ranking result.

Median filter
The best-known order-statistics filter is the median filter, which, as its name implies,
replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel:

the filtered value at (x,y) = median{ g(s,t) : (s,t) in Sxy }, where Sxy is the neighborhood centered at (x,y).

The original value of the pixel is included in the computation of the median. Median filters are quite
popular because, for certain types of random noise, they provide excellent noise-reduction capabilities,
with considerably less blurring than linear smoothing filters of similar size. Median filters are
particularly effective in the presence of both bipolar and unipolar impulse noise. In fact, the median filter
yields excellent results for images corrupted by this type of noise.
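
A minimal MATLAB sketch of 3x3 median filtering using only basic operations is given below (the Image Processing Toolbox function medfilt2 performs the same task); the file name is a placeholder.

% Minimal sketch: 3x3 median filtering with zero padding at the border.
f = double(imread('mri.png'));        % placeholder file name
[M, N] = size(f);
fp = zeros(M + 2, N + 2);             % zero padding around the border
fp(2:M+1, 2:N+1) = f;
g = zeros(M, N);
for x = 1:M
    for y = 1:N
        win = fp(x:x+2, y:y+2);       % 3x3 neighborhood centered at (x, y)
        g(x, y) = median(win(:));     % replace the center pixel by the median
    end
end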

Max and min filters


Although the median filter is by far the order-statistics filter most used in image processing, it is by no
means the only one. The median represents the 50th percentile of a ranked set of numbers, but the
reader will recall from basic statistics that ranking lends itself to many other possibilities. For example,
using the 100th percentile results in the so-called max filter, which replaces the value at (x,y) by the
maximum of {g(s,t)} over the neighborhood Sxy.
This filter is useful for finding the brightest points in an image. Also, because pepper noise has very low
values, it is reduced by this filter as a result of the max selection process in the sub image area Sxy. The 0th
percentile filter is the min filter, which is useful for finding the darkest points and for reducing salt noise.
Sharpening Spatial Filters
 The principal objective of sharpening is to highlight fine details in an image or to
enhance details that have been blurred, either in error or as a natural effect of a particular
method of image acquisition.
 The applications of image sharpening range from electronic printing and medical
imaging to industrial inspection and autonomous guidance in military systems.
 As smoothing can be achieved by integration, sharpening can be achieved by spatial
differentiation. The strength of the response of a derivative operator is proportional to the
degree of discontinuity of the image at the point at which the operator is applied. Thus
image differentiation enhances edges and other discontinuities and deemphasizes
areas with slowly varying grey levels.
 It is common practice to approximate the magnitude of the gradient by using absolute
values instead of squares and square roots.
A basic definition of the first-order derivative of a one-dimensional function f(x) is the difference

∂f/∂x = f(x+1) - f(x)

The second-order derivative of a one-dimensional function f(x) is

∂²f/∂x² = f(x+1) + f(x-1) - 2f(x)

Development of the Laplacian method


The two dimensional Laplacian operator for continuous functions is

∇²f = ∂²f/∂x² + ∂²f/∂y²

The Laplacian highlights gray-level discontinuities in an image and deemphasizes regions of slowly
varying gray levels. This tends to produce images with grayish edge lines on a dark, featureless background. The background texture can be recovered
by adding the original and Laplacian images.
• To sharpen an image, the Laplacian of the image is combined with the original image:

g(x,y) = f(x,y) - ∇²f(x,y)   if the center coefficient of the Laplacian mask is negative, and

g(x,y) = f(x,y) + ∇²f(x,y)   if the center coefficient of the Laplacian mask is positive.
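
A minimal MATLAB sketch of Laplacian sharpening is shown below; the mask and file name are illustrative assumptions.

% Minimal sketch: Laplacian sharpening with a mask whose center coefficient
% is negative, so the Laplacian image is subtracted from the original.
f   = double(imread('mri.png'));      % placeholder file name
lap = [0 1 0; 1 -4 1; 0 1 0];         % discrete Laplacian mask
L   = conv2(f, lap, 'same');
g   = uint8(min(max(f - L, 0), 255)); % g = f - Laplacian, clipped to [0, 255]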

2.7.1 Unsharp Masking and High Boost Filtering


Unsharp masking means subtracting a blurred version of an image from the image itself:

fs(x,y) = f(x,y) - fb(x,y)

where fs(x,y) denotes the sharpened image obtained by unsharp masking and fb(x,y) is a blurred version of f(x,y).

A slight generalization of unsharp masking is called high boost filtering. A high-boost filtered
image is defined at any point (x,y) as

fhb(x,y) = A f(x,y) - fb(x,y),   with A ≥ 1
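
A minimal MATLAB sketch of unsharp masking and high-boost filtering follows; the 5x5 averaging mask, the value of A and the file name are illustrative assumptions.

% Minimal sketch: unsharp masking and high-boost filtering.
f   = double(imread('mri.png'));      % placeholder file name
fb  = conv2(f, ones(5) / 25, 'same'); % blurred version of f
fs  = f - fb;                         % unsharp mask
A   = 1.5;
fhb = A * f - fb;                     % high-boost filtered image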
Basics of Filtering in the Frequency Domain
Basic steps of filtering in the frequency domain:
i) Multiply the input image by (-1)^(x+y) to centre the transform
ii) Compute F(u,v), the Fourier transform of the image
iii) Multiply F(u,v) by a filter function H(u,v)
iv) Compute the inverse DFT of the result of (iii)
v) Obtain the real part of the result of (iv)
vi) Multiply the result in (v) by (-1)^(x+y)

H(u,v) is called a filter because it suppresses certain frequencies in the image while leaving others
unchanged.
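
A minimal MATLAB sketch of these steps follows; fftshift is used to center the spectrum instead of multiplying by (-1)^(x+y), which gives the same result, and the file name and placeholder filter are assumptions.

% Minimal sketch of frequency domain filtering.
f = double(imread('mri.png'));        % placeholder file name
F = fftshift(fft2(f));                % steps (i)-(ii): centered transform
H = ones(size(F));                    % placeholder filter H(u,v)
G = H .* F;                           % step (iii): apply the filter
g = real(ifft2(ifftshift(G)));        % steps (iv)-(vi): back to the spatial domain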

SMOOTHING FREQUENCY DOMAIN FILTERS (LOW PASS FILTERING):


Edges and other sharp transitions in the gray levels of an image contribute significantly
to the high frequency content of its Fourier transform. Hence smoothing is achieved in the
frequency domain by attenuating a specified range of high frequency components in the transform of a
given image.

Basic model of filtering in the frequency domain is

G(u, v) = H(u,v) F(u,v)

where F(u,v) is the Fourier transform of the image to be smoothed and
H(u,v) is a filter transfer function.
The objective is to find a filter function H(u,v) that yields G(u,v) by attenuating the
high frequency components of F(u,v).
There are three types of low pass filters
1. Ideal
2. Butterworth
3. Gaussian

Ideal Low Pass Filter


It is the simplest of the three filters. It cuts off all high frequency components of the Fourier
transform that are at a distance greater than a specified distance D0 from the origin of the transform. It
is called a two-dimensional ideal low pass filter (ILPF) and has the transfer function

H(u,v) = 1  if D(u,v) ≤ D0
H(u,v) = 0  if D(u,v) > D0

where D(u,v) is the distance from the point (u,v) to the center of the frequency rectangle.

Butterworth Low Pass Filter


It has a parameter called the filter order. For high values of the filter order it approaches the form
of the ideal filter, whereas for low filter order values it approaches a Gaussian filter. It may be viewed as a
transition between the two extremes. The transfer function of a Butterworth low pass filter (BLPF) of order
n with cut-off frequency at distance D0 from the origin is defined as

H(u,v) = 1 / (1 + [D(u,v)/D0]^(2n))

A commonly used value is n = 2. Unlike the ILPF, the BLPF does not have a sharp discontinuity that establishes a clear
cut-off between passed and filtered frequencies; defining a cut-off frequency is therefore a main concern with these filters.
This filter gives a smooth transition in blurring as a function of increasing cut-off frequency. A Butterworth filter
of order 1 has no ringing; ringing increases as a function of filter order (higher orders lead to negative values in the corresponding spatial filter).

Gaussian Low Pass Filter


The transfer function of a Gaussian low pass filter is

H(u,v) = e^(-D²(u,v) / 2σ²)

where D(u,v) is the distance of the point (u,v) from the center of the transform and σ = D0, the specified cut-off
frequency.
The filter has the important characteristic that its inverse Fourier transform is also Gaussian.
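
A minimal MATLAB sketch of the three low-pass transfer functions on a centered frequency grid is given below; D0, the order n and the file name are illustrative assumptions.

% Minimal sketch: ideal, Butterworth and Gaussian low-pass transfer functions.
f = double(imread('mri.png'));        % placeholder file name
[M, N]  = size(f);
[V, U]  = meshgrid(-floor(N/2):ceil(N/2)-1, -floor(M/2):ceil(M/2)-1);
D       = sqrt(U.^2 + V.^2);          % distance from the center of the rectangle
D0 = 40;  n = 2;
H_ideal  = double(D <= D0);
H_butter = 1 ./ (1 + (D / D0).^(2 * n));
H_gauss  = exp(-(D.^2) / (2 * D0^2));
G = H_gauss .* fftshift(fft2(f));     % apply one of them to the centered F(u,v)
g = real(ifft2(ifftshift(G)));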
Sharpening Frequency Domain Filters: High Pass Filtering
Image sharpening can be achieved by a high pass filtering process, which attenuates the low
frequency components without disturbing the high-frequency information. These filters are radially symmetric
and completely specified by a cross section.

If we have the transfer function of a low pass filter, the corresponding high pass filter can be
obtained using the equation

Hhp(u,v) = 1 - Hlp(u,v)

Ideal High Pass Filter


This filter is the opposite of the ideal low pass filter and has a transfer function of the form

H(u,v) = 0  if D(u,v) ≤ D0
H(u,v) = 1  if D(u,v) > D0

Butterworth High Pass Filter


The transfer function of a Butterworth high pass filter of order n is given by the equation

H(u,v) = 1 / (1 + [D0/D(u,v)]^(2n))

Gaussian High Pass Filter


The transfer function of a Gaussian high pass filter is given by the equation

H(u,v) = 1 - e^(-D²(u,v) / 2D0²)
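
A minimal MATLAB sketch follows, reusing the low-pass transfer functions built in the earlier sketch; each high-pass filter is obtained as one minus its low-pass counterpart.

% Minimal sketch: high-pass transfer functions from their low-pass counterparts.
Hhp_ideal  = 1 - H_ideal;
Hhp_butter = 1 - H_butter;   % algebraically equal to 1 ./ (1 + (D0 ./ D).^(2*n))
Hhp_gauss  = 1 - H_gauss;    % equals 1 - exp(-D.^2 / (2 * D0^2))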
Discrete Fourier Transform and the Frequency Domain
Any function that periodically repeats itself can be expressed as a sum of sines and cosines of
different frequencies, each multiplied by a different coefficient; this sum is called a Fourier series. Even
functions that are non-periodic, but whose area under the curve is finite, can be represented in
such a form; this representation is called the Fourier transform.
A function represented in either of these forms can be completely reconstructed via an inverse
process with no loss of information.

1-D Fourier Transformation and its Inverse


If f(x) is a continuous function of a single variable x, then its Fourier transform F(u) is
given as

F(u) = ∫ f(x) e^(-j2πux) dx, with the integral taken from -∞ to ∞      ..... (a)

and the reverse process to recover f(x) from F(u) is

f(x) = ∫ F(u) e^(j2πux) du, with the integral taken from -∞ to ∞       ..... (b)

Equations (a) and (b) comprise the Fourier transform pair.

The Fourier transform of a discrete function of one variable f(x), x = 0, 1, 2, ..., M-1, is given by

F(u) = (1/M) Σ f(x) e^(-j2πux/M),   u = 0, 1, 2, ..., M-1              ..... (e)

and to obtain f(x) from F(u),

f(x) = Σ F(u) e^(j2πux/M),   x = 0, 1, 2, ..., M-1                     ..... (f)

where each sum runs over the M samples. Each of the M terms of F(u) is called a frequency component of the transform.

The above two equations (e) and (f) comprise a discrete Fourier transform pair. According to
Euler’s formula,
e^(jθ) = cos θ + j sin θ. Substituting this
into equation (e),

F(u) = (1/M) Σ f(x) [cos(2πux/M) - j sin(2πux/M)]   for u = 0, 1, 2, ..., M-1

The Fourier transform separates a function into various components based on frequency.
These components are complex quantities, so F(u) can be expressed in polar coordinates as

F(u) = |F(u)| e^(jφ(u))

where |F(u)| is the magnitude (the Fourier spectrum) and φ(u) is the phase angle.

2-D Fourier Transformation and its Inverse


The Fourier transform of a two dimensional function f(x,y) (an
image) of size M x N is given by

F(u,v) = (1/MN) Σ Σ f(x,y) e^(-j2π(ux/M + vy/N)),   u = 0, 1, ..., M-1,  v = 0, 1, ..., N-1

The inverse Fourier transform is given by the equation

f(x,y) = Σ Σ F(u,v) e^(j2π(ux/M + vy/N)),   x = 0, 1, ..., M-1,  y = 0, 1, ..., N-1

where (u,v) are the frequency variables.

Preprocessing is done to shift the origin of F(u,v) to the frequency coordinate (M/2, N/2), which is
the center of the M x N area occupied by the 2D-DFT. This area is known as the frequency rectangle.
It extends from u = 0 to M-1 and v = 0 to N-1. For this, we multiply the input image by (-1)^(x+y) prior to
computing the transform:

Ƒ{f(x,y) (-1)^(x+y)} = F(u - M/2, v - N/2)

Ƒ(.) denotes the Fourier transform of its argument. The value of the transform at (u,v) = (0,0) is

F(0, 0) = (1/MN) Σ Σ f(x,y)

which is the average gray level of the image.
Discrete Fourier Transform and its Properties
In the two-variable case the discrete Fourier transform pair is the pair of equations given above.
When images are sampled in a square array, i.e. M = N, we can write

F(u,v) = (1/N) Σ Σ f(x,y) e^(-j2π(ux + vy)/N)    and    f(x,y) = (1/N) Σ Σ F(u,v) e^(j2π(ux + vy)/N)
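
A minimal MATLAB sketch checking these conventions with fft2 is given below; the file name is a placeholder.

% Minimal sketch: with the 1/MN factor on the forward transform,
% F(0,0) equals the average gray level of the image.
f = double(imread('mri.png'));        % placeholder file name
[M, N] = size(f);
F = fft2(f) / (M * N);                % forward transform with the 1/MN factor
avg_gray = real(F(1, 1));             % same value as mean(f(:))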
