DIP Module 1
Module - 1
DIGITAL IMAGE FUNDAMENTALS: What is Digital Image Processing?
Origins of Digital Image Processing, Fundamental Steps in Digital Image
Processing, Components of an Image processing system, Elements of Visual
Perception. Image Sensing and Acquisition, Image Sampling and Quantization, Some
Basic Relationships between Pixels
Module - 2
IMAGE TRANSFORMS: Introductions, Two-Dimensional Orthogonal &
Unitary Transforms, Properties of Unitary Transforms, Two Dimensional
Discrete Fourier Transform. Discrete Cosine Transform, Haar Transform
Module-3
SPATIAL DOMAIN: Some Basic Intensity Transformation Functions,
Histogram Processing, Fundamentals of Spatial Filtering, Smoothing Spatial
Filters, Sharpening Spatial Filters
Module-4
FREQUENCY DOMAIN: Basics of Filtering in the Frequency Domain, Image
Smoothing and Image Sharpening Using Frequency Domain Filters, Color
Fundamentals, Color Models, Pseudo Color Image Processing.
Module-5
RESTORATION: Model of Image Degradation/Restoration Process, Noise
Models, Restoration in the Presence of Noise Only - Spatial Filtering, Periodic
Noise Reduction by Frequency Domain Filtering, Linear Position-Invariant
Degradations, Inverse Filtering, Minimum Mean Square Error (Wiener) Filtering
TEXT BOOK:
1. “Digital Image Processing”, Rafael C. Gonzalez and Richard E.
Woods, Pearson Education, 2001, 2nd edition.
REFERENCE BOOKS:
1. “Fundamentals of Digital Image Processing”, Anil K. Jain, Pearson Education, 2001.
2. “Digital Image Processing and Analysis”, B. Chanda and D. Dutta
Majumdar, PHI, 2003
Module-1
Introduction
An image may be defined as a two-dimensional function, f(x, y), where x and y are
spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is
called the intensity or gray level of the image at that point. When x, y, and the amplitude
values of f are all finite, discrete quantities, we call the image a digital image. The field
of digital image processing refers to processing digital images by means of a digital
computer. Note that a digital image is composed of a finite number of elements, each of
which has a particular location and value. These elements are referred to as picture
elements, image elements, pels, and pixels. Pixel is the term most widely used to denote
the elements of a digital image.
It is helpful to divide the material covered in the following chapters into the two broad
categories defined in Section 1.1: methods whose input and output are images, and
methods whose inputs may be images, but whose outputs are attributes extracted from
those images. The diagram does not imply that every process is applied to an image.
Rather, the intention is to convey an idea of all the methodologies that can be applied to
images for different purposes and possibly with different objectives.
Image acquisition is the first process. Acquisition could be as simple as being given an
image that is already in digital form. Generally, the image acquisition stage involves
preprocessing, such as scaling.
Image enhancement is among the simplest and most appealing areas of digital image
processing. Basically, the idea behind enhancement techniques is to bring out detail that
is obscured, or simply to highlight certain features of interest in an image. A familiar
example of enhancement is when we increase the contrast of an image because “it looks better.”
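As an illustration, the following is a minimal sketch of one simple enhancement operation, a linear contrast stretch, assuming the image is an 8-bit grayscale NumPy array. It is not the specific technique discussed in the text, just an example of bringing out detail by expanding the range of gray levels.

import numpy as np

def stretch_contrast(img):
    """Linearly rescale an 8-bit grayscale image so its gray levels
    span the full 0-255 range (a simple contrast enhancement)."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:                      # flat image: nothing to stretch
        return img.astype(np.uint8)
    stretched = (img - lo) / (hi - lo) * 255.0
    return stretched.astype(np.uint8)

# Example: a low-contrast image whose values lie between 100 and 150
rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 151, size=(4, 4)).astype(np.uint8)
print(stretch_contrast(low_contrast))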
Image restoration is an area that also deals with improving the appearance of an image.
However, unlike enhancement, which is subjective, image restoration is objective, in the
sense that restoration techniques tend to be based on mathematical or probabilistic
models of image degradation. Enhancement, on the other hand, is based on human
subjective preferences regarding what constitutes a “good” enhancement result.
Color image processing is an area that has been gaining in importance because of the
significant increase in the use of digital images over the Internet. This area covers fundamental
concepts in color models and basic color processing in a digital domain. Color is used also in later
chapters as the basis for extracting features of interest in an image.
Wavelets are the foundation for representing images in various degrees of resolution. In
particular, this material is used in this book for image data compression and for
pyramidal representation, in which images are subdivided successively into smaller
regions.
Compression, as the name implies, deals with techniques for reducing the storage
required to save an image, or the bandwidth required to transmit it. Although storage
technology has improved significantly over the past decade, the same cannot be said for
transmission capacity. This is true particularly in uses of the Internet, which are
characterized by significant pictorial content. Image compression is familiar (perhaps
inadvertently) to most users of computers in the form of image file extensions, such as
the jpg file extension used in the JPEG (Joint Photographic Experts Group) image
compression standard.
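As a rough illustration of this storage trade-off, the sketch below saves the same synthetic image as a JPEG at two quality settings and compares the resulting sizes in bytes. It assumes the Pillow library and a made-up gradient image, neither of which is mentioned in the text.

import io
import numpy as np
from PIL import Image  # Pillow, assumed available; not specified in the text

# A synthetic 8-bit grayscale gradient "image" just for illustration
img = Image.fromarray(
    (np.outer(np.arange(256), np.ones(256)) % 256).astype(np.uint8)
)

# Save the same image as JPEG at two quality settings and compare sizes
for quality in (90, 30):
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    print(f"quality={quality}: {buf.tell()} bytes")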
Morphological processing deals with tools for extracting image components that are
useful in the representation and description of shape. The material in this chapter begins a
transition from processes that output images to processes that output image attributes.
Segmentation procedures partition an image into its constituent parts or objects. In
general, autonomous segmentation is one of the most difficult tasks in digital image
processing. A rugged segmentation procedure brings the process a long way toward
successful solution of imaging problems that require objects to be identified individually.
On the other hand, weak or erratic segmentation algorithms almost always guarantee
eventual failure. In general, the more accurate the segmentation, the more likely
recognition is to succeed.
Representation and description almost always follow the output of a segmentation stage,
which usually is raw pixel data, constituting either the boundary of a region (i.e., the set
of pixels separating one image region from another) or all the points in the region itself.
In either case, converting the data to a form suitable for computer processing is
necessary. The first decision that must be made is whether the data should be represented
as a boundary or as a complete region. Boundary representation is appropriate when the
focus is on external shape characteristics, such as corners and inflections. Regional
representation is appropriate when the focus is on internal properties, such as texture or
skeletal shape. In some applications, these representations complement each other.
Choosing a representation is only part of the solution for transforming raw data into a
form suitable for subsequent computer processing. A method must also be specified for
describing the data so that features of interest are highlighted. Description, also called
feature selection, deals with extracting attributes that result in some quantitative
information of interest or are basic for differentiating one class of objects from another.
Recognition is the process that assigns a label (e.g., “vehicle”) to an object based on its
descriptors. As detailed in Section 1.1, we conclude our coverage of digital image
processing with the development of methods for recognition of individual objects. So far
we have said nothing about the need for prior knowledge or about the interaction between
the knowledge base and the processing modules. Knowledge about a problem domain is coded into an image
processing system in the form of a knowledge database. This knowledge may be as
simple as detailing regions of an image where the information of interest is known to be
located, thus limiting the search that has to be conducted in seeking that information. The
knowledge base also can be quite complex, such as an interrelated list of all major
possible defects in a materials inspection problem or an image database containing high-
resolution satellite images of a region in connection with change-detection applications.
In addition to guiding the operation of each processing module, the knowledge base also
controls the interaction between modules. This distinction is made in Fig. 1.23 by the use
of double-headed arrows between the processing modules and the knowledge base, as
opposed to single-headed arrows linking the processing modules. Although we do not
discuss image display explicitly at this point, it is important to keep in mind that viewing
the results of image processing can take place at the output of any stage.
Although large-scale image processing systems still are being sold for massive imaging
applications, such as processing of satellite images, the trend continues toward
miniaturizing and blending of general-purpose small computers with specialized image
processing hardware.
The function of each component is discussed in the following paragraphs, starting with
image sensing. With reference to sensing, two elements are required to acquire digital
images. The first is a physical device that is sensitive to the energy radiated by the object
we wish to image. The second, called a digitizer, is a device for converting the output of
the physical sensing device into digital form. For instance, in a digital video camera, the
sensors produce an electrical output proportional to light intensity. The digitizer converts
these outputs to digital data.
Specialized image processing hardware usually consists of the digitizer just mentioned,
plus hardware that performs other primitive operations, such as an arithmetic logic unit
(ALU), which performs arithmetic and logical operations in parallel on entire images.
One example of how an ALU is used is in averaging images as quickly as they are
digitized, for the purpose of noise reduction. This type of hardware sometimes is called a
front-end subsystem, and its most
distinguishing characteristic is speed. In other words, this unit performs functions that
require fast data throughputs (e.g., digitizing and averaging video images at 30 frames/s)
that the typical main computer cannot handle.
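A minimal software sketch of the same idea, frame averaging for noise reduction, is given below. It assumes the noisy frames are already available as NumPy arrays; the front-end hardware described above would perform this in parallel as the frames are digitized, and the scene and noise values here are only illustrative.

import numpy as np

def average_frames(frames):
    """Average a sequence of noisy frames of the same scene.
    Averaging N frames reduces zero-mean noise variance by a factor of N."""
    stack = np.stack(frames).astype(np.float64)
    return stack.mean(axis=0)

# Example: 30 frames of a constant scene corrupted by Gaussian noise
rng = np.random.default_rng(1)
clean = np.full((64, 64), 120.0)
frames = [clean + rng.normal(0, 20, clean.shape) for _ in range(30)]
avg = average_frames(frames)
print("noise std, single frame:", np.std(frames[0] - clean).round(2))
print("noise std, 30-frame avg:", np.std(avg - clean).round(2))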
The computer in an image processing system is a general-purpose computer and can
range from a PC to a supercomputer. In dedicated applications, sometimes specially
designed computers are used to achieve a required level of performance, but our interest
here is on general-purpose image processing systems. In these systems, almost any well-
equipped PC-type machine is suitable for offline image processing tasks.
Software for image processing consists of specialized modules that perform specific
tasks. A well-designed package also includes the capability for the user to write code that,
as a minimum, utilizes the specialized modules. More sophisticated software packages
allow the integration of those modules and general- purpose software commands from at
least one computer language.
Image displays in use today are mainly color (preferably flat screen) TV monitors.
Monitors are driven by the outputs of image and graphics display cards that are an
integral part of the computer system. Seldom are there requirements for image display
applications that cannot be met by display cards available commercially as part of the
computer system. In some cases, it is necessary to have stereo displays, and these are
implemented in the form of headgear containing two small displays embedded in goggles
worn by the user.
Hardcopy devices for recording images include laser printers, film cameras, heat-
sensitive devices, inkjet units, and digital units, such as optical and CD-ROM disks. Film
provides the highest possible resolution, but paper is the obvious medium of choice for
written material. For presentations, images are displayed on film transparencies or in a
digital medium if image projection equipment is used. The latter approach is gaining
acceptance as the standard for image presentations.
Networking is almost a default function in any computer system in use today. Because of
the large amount of data inherent in image processing applications, the key consideration
in image transmission is bandwidth. In dedicated networks, this typically is not a
problem, but communications with remote sites via the Internet are not always as
efficient. Fortunately, this situation is improving quickly as a result of optical fiber and
other broadband technologies.
Recommended Questions
1. What is digital image processing? Explain the fundamental steps in digital image
processing.
2. Briefly explain the components of an image processing system.
3. How is an image formed in the eye? Explain with examples why perceived brightness is
not a simple function of intensity.
4. Explain the importance of brightness adaptation and discrimination in image
processing.
5. Define spatial and gray level resolution. Briefly discuss the effects resulting from a
reduction in number of pixels and gray levels.
6. What are the elements of visual perception?
• The following figure shows the anatomy of the human eye in cross section
– The eye is enclosed by three membranes: the cornea and sclera outer layer, the choroid, and the retina. The average diameter of the eye is about 20 mm.
– Cornea: tough, transparent tissue.
– Sclera: opaque membrane.
– Choroid: contains the iris and the ciliary body.
• There are two types of receptors in the retina
– The rods are long slender receptors
– The cones are generally shorter and thicker in structure; there are about 7-8 million of them, they respond at higher (bright) levels of illumination, and they are sensitive to color.
• The rods and cones are not distributed evenly around the retina.
• Rods and cones operate differently
– Rods are more sensitive to light than cones; there are about 75 to 150 million of them.
– At low levels of illumination the rods provide a visual response called scotopic vision(to see dim light at
night)
– Cones respond to higher levels of illumination; their response is called photopic vision
– The pupil varies in diameter from about 2 to 8 mm; the iris contracts or expands to control the amount of
light entering the eye. The front of the iris contains the visible pigment of the eye, and the back contains a black pigment.
– The lens contains 60 to 70% water and about 6% fat, along with protein. Both ultraviolet and infrared
light are absorbed by the lens; excessive amounts can damage it.
• Over a wide range of intensities, it is found that the ratio dI/I, called the
Weber fraction, is nearly constant at a value of about 0.02.
• The response of the cones and rods to light is nonlinear. In fact many image
processing systems assume that the eye's response is logarithmic instead of
linear with respect to intensity.
• To test the hypothesis that the response of the cones and rods are
logarithmic, we examine the following two cases:
• Another way to see this is the following, note that the differential of the
logarithm of intensity is d(log(I)) = dI/I. Figure 2.3-1 shows the plot of dI/I for
the intensity response of the human visual system.
• Since this plot is nearly constant over the middle of the intensity range, we again conclude
that the intensity response of cones and rods can be modeled as a logarithmic
response.
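A small numerical sketch of the relation d(log I) = dI/I is given below: intensities that grow by a constant Weber fraction (about 0.02, as stated above) are equally spaced on a logarithmic scale. The starting intensity and step count are only illustrative.

import numpy as np

# Illustration of d(log I) = dI / I:
# intensities spaced by a constant ratio (constant Weber fraction dI/I)
# become equally spaced once the logarithm is taken.
weber = 0.02                              # approximate Weber fraction from the text
I = 1.0 * (1 + weber) ** np.arange(10)    # each step increases I by 2%
log_I = np.log(I)

print("dI/I  :", np.round(np.diff(I) / I[:-1], 4))   # constant ~0.02
print("d logI:", np.round(np.diff(log_I), 4))        # also constant ~0.0198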
The types of images in which we are interested are generated by the combination of an
“illumination” source and the reflection or absorption of energy from that source by the
elements of the “scene” being imaged. We enclose illumination and scene in quotes to
emphasize the fact that they are considerably more general than the familiar situation in
which a visible light source illuminates a common everyday 3-D (three-dimensional)
scene. For example, the illumination may originate from a source of electromagnetic
energy such as radar, infrared, or X-ray energy. But, as noted earlier, it could originate
from less traditional sources, such as ultrasound or even a computer-generated
illumination pattern. Similarly, the scene elements could be familiar objects, but they
can just as easily be molecules, buried rock formations, or a human brain. We could
even image a source, such as acquiring images of the sun.
The output voltage waveform is the response of the sensor(s), and a digital quantity is
obtained from each sensor by digitizing its response. In this section, we look at the
principal modalities for image sensing and generation.
The components of a single sensor. Perhaps the most familiar sensor of this type is the
photodiode, which is constructed of silicon materials and whose output voltage
waveform is proportional to light. The use of a filter in front of a sensor improves
selectivity. For example, a green (pass) filter in front of a light sensor favors light in the
green band of the color spectrum. As a consequence, the sensor output will be stronger
for green light than for other components in the visible spectrum. In order to generate a
2-D image using a single sensor, there has to be relative displacements in both the x- and
y-directions between the sensor and the area to be imaged. Figure 2.13 shows an
arrangement used in high-precision scanning, where a film negative is mounted onto a
drum whose mechanical rotation provides displacement in one dimension. The single
sensor is mounted on a lead screw that provides motion in the perpendicular direction.
Since mechanical motion can be controlled with high precision, this method is an
inexpensive (but slow) way to obtain high-resolution images. Other similar mechanical
arrangements use a flat bed, with the sensor moving in two linear directions. These types
of mechanical digitizers sometimes are referred to as microdensitometers.
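The sketch below simulates this single-sensor arrangement in software, assuming the scene is modeled as a continuous function of (x, y); the two nested loops play the role of the drum rotation and lead-screw motion, recording one sensor reading per position. The scene function and image size are hypothetical.

import numpy as np

def scan_scene(scene, rows, cols):
    """Simulate single-sensor acquisition: the sensor visits one (x, y)
    position at a time (drum rotation plus lead-screw motion) and records
    one reading per position, building the image row by row."""
    image = np.zeros((rows, cols))
    for r in range(rows):           # displacement in one direction (drum)
        for c in range(cols):       # displacement in the other (lead screw)
            x, y = c / cols, r / rows
            image[r, c] = scene(x, y)
    return image

# A toy continuous "scene": brightness varies smoothly across the field
scene = lambda x, y: 255 * (np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y) + 1) / 2
img = scan_scene(scene, rows=8, cols=8)
print(img.round(0))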
Sensor strips mounted in a ring configuration are used in medical and industrial imaging
to obtain cross-sectional (“slice”) images of 3-D objects.
The individual sensors arranged in the form of a 2-D array. Numerous electromagnetic
and some ultrasonic sensing devices frequently are arranged in an array format. This is
also the predominant arrangement found in digital cameras. A typical sensor for these
cameras is a CCD array, which can be manufactured with a broad range of sensing
properties and can be packaged in large, rugged arrays of elements. CCD sensors are
used widely in digital cameras and other light sensing instruments. The response of each
sensor is proportional to the integral of the light energy projected onto the surface of the
sensor, a property that is used in astronomical and other applications requiring low noise
images. Noise reduction is achieved by letting the sensor integrate the input light signal
over minutes or even hours. Since the sensor array is two-dimensional, its key advantage is that a complete
image can be obtained by focusing the energy pattern onto the surface of the array.
Motion obviously is not necessary, as it is with the single-sensor and sensor-strip arrangements described above.
This figure shows the energy from an illumination source being reflected from a scene
element, but, as mentioned at the beginning of this section, the energy also could be
transmitted through the scene elements. The first function performed by the imaging
system is to collect the incoming energy and focus it onto an image plane. If the
illumination is light, the front end of the imaging system is a lens, which projects the
viewed scene onto the lens focal plane. The sensor array, which is coincident with the
focal plane, produces outputs proportional to the integral of the light received at each
sensor. Digital and analog circuitry sweep these outputs and convert them to a video
signal, which is then digitized by another section of the imaging system.
To create a digital image, we need to convert the continuous sensed data into digital
form. This involves two processes: sampling and quantization. A continuous image, f(x,
y), that we want to convert to digital form. An image may be continuous with respect to
the x- and y-coordinates, and also in amplitude. To convert it to digital form, we have to
sample the function in both coordinates and in amplitude. Digitizing the coordinate
values is called sampling. Digitizing the amplitude values is called quantization.
The one-dimensional function shown in Fig. 2.16(b) is a plot of amplitude (gray level)
values of the continuous image along the line segment AB. The random variations are
due to image noise. To sample this function, we take equally spaced samples along line
AB, The location of each sample is given by a vertical tick mark in the bottom part of
the figure. The samples are shown as small white squares superimposed on the function.
The set of these discrete locations gives the sampled function. However, the values of
the samples still span (vertically) a continuous range of gray-level values. In order to
form a digital function, the gray-level values also must be converted (quantized) into
discrete quantities. The gray-level scale on the right side is divided into eight discrete levels,
ranging from black to white. The vertical tick marks indicate the specific value assigned
to each of the eight gray levels. The continuous gray levels are quantized simply by
assigning one of the eight discrete gray levels to each sample. The assignment is made
depending on the vertical proximity of a sample to a vertical tick mark. The digital
samples resulting from both sampling and quantization.
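The sketch below mimics these two steps on a made-up one-dimensional intensity profile (a stand-in for the gray levels along line AB): equally spaced sampling, followed by quantization to eight gray levels by assigning each sample to the nearest level. The profile, noise, and number of samples are only illustrative.

import numpy as np

rng = np.random.default_rng(2)

# A stand-in for the continuous gray-level profile along a line AB,
# with a little random variation to mimic image noise.
t = np.linspace(0.0, 1.0, 16)                          # sampling: 16 equally spaced points
samples = 120 + 100 * np.sin(2 * np.pi * t) + rng.normal(0, 5, t.size)

# Quantization: map each (still continuous) sample to one of 8 discrete
# gray levels by choosing the nearest level between black (min) and white (max).
levels = np.linspace(samples.min(), samples.max(), 8)
quantized = levels[np.argmin(np.abs(samples[:, None] - levels[None, :]), axis=1)]

print(np.round(samples, 1))     # sampled but not yet quantized
print(np.round(quantized, 1))   # sampled and quantized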
Neighbors of a Pixel
A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose
coordinates are given by
(x+1, y), (x-1, y), (x, y+1), (x, y-1)
This set of pixels, called the 4-neighbors of p, is denoted by N4(p). Each pixel is a unit
distance from (x, y), and some of the neighbors of p lie outside the digital image if (x, y)
is on the border of the image.
The four diagonal neighbors of p have coordinates
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
This set is denoted by ND(p). Together with the 4-neighbors, these points form the 8-neighbors of p, denoted by N8(p).
(a) 4-adjacency. Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
(b) 8-adjacency. Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
(c) m-adjacency (mixed adjacency). Two pixels p and q with values from V are m-adjacent if
(i) q is in N4(p), or
(ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.
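A small sketch of the neighborhood and adjacency definitions above is given below, using hypothetical helper functions and a toy image stored as a dictionary of pixel values; m-adjacency is omitted for brevity.

def n4(x, y):
    """4-neighbors of pixel p = (x, y)."""
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(x, y):
    """Diagonal neighbors of pixel p = (x, y)."""
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(x, y):
    """8-neighbors: union of the 4-neighbors and the diagonal neighbors."""
    return n4(x, y) | nd(x, y)

def adjacent4(img, p, q, V):
    """p and q are 4-adjacent if both values are in V and q is in N4(p)."""
    return img[p] in V and img[q] in V and q in n4(*p)

def adjacent8(img, p, q, V):
    """p and q are 8-adjacent if both values are in V and q is in N8(p)."""
    return img[p] in V and img[q] in V and q in n8(*p)

# Example on a tiny binary image, with V = {1}
img = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 1}
print(adjacent4(img, (0, 0), (1, 1), {1}))   # False: (1, 1) is only a diagonal neighbor
print(adjacent8(img, (0, 0), (1, 1), {1}))   # True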