EC9580: COMPUTER VISION
FEATURES & FILTERS
Ms. Sujanthika M
Lecturer
Department of Computer Engineering
sujanthika@eng.jfn.ac.lk
REFERENCES
1. Reinhard Klette, Concise Computer Vision: An Introduction Into
Theory and Algorithms, 1st edition, Springer.
2. Richard Szeliski, Computer Vision: Algorithms and Applications,
2nd edition, Springer.
CHAPTER OVERVIEW
▪ Lecture – 03 hours
▪ Lab – 03 hours
▪ Assignment – No Assignments
FEATURES & FILTERS
▪ Scale-invariant feature transform (SIFT)
▪ Histogram of oriented gradients (HOG)
▪ 2D discrete cosine transform (2D-DCT)
▪ Gabor Filters
▪ Linear filters
▪ Texture Analysis
FEATURES
▪ Features are detectable elements of an image, such as edges, textures, and objects, that are essential for image recognition and analysis
▪ Types of features:
o Low-level features: Basic attributes like edges, corners, and textures, which can be
extracted without any prior knowledge of the image’s content
o High-level features: More abstract and semantic representations, often related to object
recognition, where the feature set provides higher-level information about objects
WHAT ARE GOOD FEATURES?
▪ Repeatability: The ability to detect the same feature under different conditions
▪ Distinctiveness: Features should be unique and identifiable, allowing the
differentiation between various parts of an image or between different images.
▪ Efficiency: Features should be quick and efficient to detect, ensuring real-time
performance in applications like video processing.
▪ Invariance: Good features should be invariant to changes in scale, rotation,
illumination, and sometimes even minor affine transformations.
IMAGE FEATURES
GRADIENTS & EDGES
▪ A gradient in an image refers to the rate of change in pixel intensity (brightness)
over a spatial region.
▪ Gradients highlight areas where there is a significant change in intensity, such as
from dark to light, making them effective for identifying transitions in an image.
▪ Edges are defined as boundaries within an image where there is a rapid change in
intensity, marking the transition between different regions.
▪ They typically represent object boundaries, surface markings, or textures.
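For intuition, a minimal sketch of computing image gradients (assuming OpenCV and NumPy are available; the file name is a placeholder, not part of these notes): the Sobel operators approximate the horizontal and vertical intensity derivatives, which are combined into a gradient magnitude and direction at every pixel.

```python
import cv2
import numpy as np

# "scene.png" is a placeholder file name; any grayscale image works.
img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE).astype(np.float64)

# First-order derivatives of intensity along x and y (Sobel approximation).
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

# Magnitude is large where intensity changes rapidly (edge-like regions);
# the direction indicates which way the intensity increases.
magnitude = np.hypot(gx, gy)
direction = np.arctan2(gy, gx)
```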
CORNERS
▪ A corner is a point in an image where two or more edges meet, resulting in a
unique and distinctive location with significant intensity change in multiple
directions.
▪ Corners are highly distinctive features and often remain stable across
transformations like rotation and scaling, making them ideal for feature matching
and tracking.
LINES & CURVES
▪ Lines and curves represent linear or smooth transitions across pixels, useful for
structural analysis of objects in an image.
▪ Lines can provide structural information about objects, representing object
boundaries, shapes, or alignment within the scene.
▪ Curves can describe complex shapes more accurately than straight lines, making
them useful for detecting and analyzing organic shapes.
TEXTURES
▪ Texture refers to the visual patterns or repetitive structures within a region that
represent the surface quality of an object, such as roughness, smoothness, or
granularity
▪ Texture provides critical information about the surface properties of objects, aiding
in classification and segmentation where color or intensity alone is insufficient
BASIC EDGE DETECTION
OBSERVATIONS ON EDGE DETECTION
1. First-order derivatives generally produce thicker edges
2. Second-order derivatives have a stronger response to fine details, such as thin
lines, isolated points, and noise
3. Second-order derivatives produce a double-edge response at ramp and step
transitions in intensity
4. The sign of the second-order derivative can be used to determine whether a
transition into an edge is from light to dark or dark to light
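A small 1-D example (NumPy only; the signal values are chosen purely for illustration) makes these observations concrete: across a ramp edge the first difference is non-zero over the whole ramp (a thick edge), while the second difference responds only at the two ends of the ramp with opposite signs (a double edge), and its sign indicates whether the transition is dark-to-light or light-to-dark.

```python
import numpy as np

# A dark region, a ramp transition, then a bright region (illustrative values).
row = np.array([10, 10, 10, 20, 30, 40, 50, 50, 50], dtype=float)

first = np.diff(row)        # first-order derivative (forward difference)
second = np.diff(row, n=2)  # second-order derivative

print("signal :", row)
print("1st    :", first)    # non-zero over the whole ramp -> thicker edge
print("2nd    :", second)   # non-zero only at ramp onset/end, opposite signs -> double edge
```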
FIRST ORDER DERIVATIVE BASED EDGE
DETECTION
FIRST ORDER DERIVATIVE
▪ General strategy:
o Determine the image gradient
o Mark points where the gradient magnitude is particularly large with respect to its neighbors (a minimal sketch follows)
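The sketch below illustrates this strategy (NumPy only; the threshold value and the simple 4-neighbor comparison are illustrative choices, not a full Canny-style detector):

```python
import numpy as np

def first_order_edges(img, thresh=30.0):
    """Mark pixels whose gradient magnitude is large and a local maximum
    among the four vertical/horizontal neighbors (illustrative only)."""
    gy, gx = np.gradient(img.astype(float))   # central differences
    mag = np.hypot(gx, gy)

    # Compare each interior pixel's magnitude with its 4 neighbors.
    center = mag[1:-1, 1:-1]
    neighbors_max = np.maximum.reduce([
        mag[:-2, 1:-1], mag[2:, 1:-1], mag[1:-1, :-2], mag[1:-1, 2:]
    ])

    edges = np.zeros_like(mag, dtype=bool)
    edges[1:-1, 1:-1] = (center >= neighbors_max) & (center > thresh)
    return edges
```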
SCALE INVARIANT FEATURE TRANSFORM
(SIFT)
▪ SIFT is an algorithm for detecting and describing local features; it supports tasks such as object recognition, image stitching, 3D reconstruction, and image matching
▪ The main goal is to identify distinctive, repeatable, and invariant keypoints in
images, along with descriptors that characterize the neighborhood around each
keypoint.
PROPERTIES OF SIFT
▪ Scale Invariance: SIFT identifies keypoints and descriptors that remain stable
when the image is scaled.
▪ Rotation Invariance: SIFT descriptors are designed to handle image rotations.
▪ Affine Invariance (Partial): SIFT is somewhat robust to small affine
transformations, such as slight rotations or perspective distortions.
▪ Illumination Invariance (Partial): SIFT is moderately robust to changes in
lighting, due to the way gradients are calculated.
STEPS IN SIFT ALGORITHM
Step 1: Scale-Space Extrema Detection
▪ To achieve scale invariance, SIFT identifies key points across multiple scales. This
is done by creating a scale space, which involves progressively blurring the image
at different scales
1. Gaussian Scale Space:
▪ The scale space L(x,y,σ) is generated by convolving the image I(x,y) with a
Gaussian filter G(x,y,σ):
L(x,y,σ)=G(x,y,σ)∗I(x,y)
where the Gaussian function is G(x,y,σ) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
2. Difference of Gaussians (DoG)
▪ SIFT approximates the Laplacian of Gaussian (LoG) with the Difference of Gaussians (DoG), which is much cheaper to compute.
▪ DoG images are generated by subtracting consecutive scales:
D(x,y,σ)=L(x,y,kσ)−L(x,y,σ)
where k is a constant factor between scales.
3. Extrema Detection
▪ Each pixel in a DoG image is compared with its 26 neighbors: 8 in the same scale and 9 in each of the two adjacent scales
▪ Pixels that are a maximum or minimum of this neighborhood are considered potential keypoints.
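A minimal sketch of this whole step, assuming NumPy and OpenCV are available; σ₀ = 1.6, k = √2, and the number of scales are illustrative choices, and the brute-force extrema loop is written for clarity, not efficiency:

```python
import cv2
import numpy as np

def scale_space_extrema(img, sigma0=1.6, k=2 ** 0.5, num_scales=5):
    """Build L(x,y,sigma), form DoG images, and return 26-neighbor extrema."""
    img = img.astype(np.float32)

    # 1. Gaussian scale space: progressively blurred copies of the image.
    scales = [cv2.GaussianBlur(img, (0, 0), sigmaX=sigma0 * k ** i)
              for i in range(num_scales)]

    # 2. Difference of Gaussians: subtract consecutive scales.
    dog = np.stack([scales[i + 1] - scales[i] for i in range(num_scales - 1)])

    # 3. Extrema detection: keep pixels that are the max or min of the
    #    3x3x3 block spanning their own scale and the two adjacent scales.
    keypoints = []
    for s in range(1, dog.shape[0] - 1):
        for y in range(1, dog.shape[1] - 1):
            for x in range(1, dog.shape[2] - 1):
                block = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                c = dog[s, y, x]
                if c == block.max() or c == block.min():
                    keypoints.append((s, y, x))
    return dog, keypoints
```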
Step 2: Keypoint Localization
After detecting potential keypoints, we refine their locations for greater accuracy.
1. Taylor Series Expansion
▪ To localize keypoints to sub-pixel accuracy, a 3D quadratic function is fitted by taking a Taylor expansion of D(x,y,σ) around the sampled extremum
▪ Setting its derivative to zero gives the location offset x̂ = −(∂²D/∂x²)⁻¹ (∂D/∂x), which is added to the sampled position to obtain the refined location (x,y,σ).
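A minimal sketch of this refinement at one candidate, using finite-difference derivatives of the DoG stack from the previous sketch (it assumes the Hessian is invertible; re-localization when an offset exceeds 0.5 is omitted):

```python
import numpy as np

def refine_keypoint(dog, s, y, x):
    """Solve x_hat = -(d2D/dx2)^-1 (dD/dx) with central differences."""
    D = dog
    # Gradient (dD/ds, dD/dy, dD/dx) at the sample point.
    g = 0.5 * np.array([D[s+1, y, x] - D[s-1, y, x],
                        D[s, y+1, x] - D[s, y-1, x],
                        D[s, y, x+1] - D[s, y, x-1]])
    # Hessian by central second differences.
    dss = D[s+1, y, x] - 2*D[s, y, x] + D[s-1, y, x]
    dyy = D[s, y+1, x] - 2*D[s, y, x] + D[s, y-1, x]
    dxx = D[s, y, x+1] - 2*D[s, y, x] + D[s, y, x-1]
    dsy = 0.25 * (D[s+1, y+1, x] - D[s+1, y-1, x] - D[s-1, y+1, x] + D[s-1, y-1, x])
    dsx = 0.25 * (D[s+1, y, x+1] - D[s+1, y, x-1] - D[s-1, y, x+1] + D[s-1, y, x-1])
    dyx = 0.25 * (D[s, y+1, x+1] - D[s, y+1, x-1] - D[s, y-1, x+1] + D[s, y-1, x-1])
    H = np.array([[dss, dsy, dsx],
                  [dsy, dyy, dyx],
                  [dsx, dyx, dxx]])

    x_hat = -np.linalg.solve(H, g)            # offset in (scale, y, x)
    contrast = D[s, y, x] + 0.5 * g.dot(x_hat)  # interpolated |D| for thresholding
    return x_hat, contrast
```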
Step 2: Keypoint Localization
2. Thresholding
▪ Keypoints with low contrast (i.e., low ∣D(x,y,σ)∣) are removed, as they are likely due to
noise
▪ Edge Response Elimination: Keypoints that lie along edges are unstable. This is handled by examining the 2×2 Hessian matrix of D at each keypoint:
H = [ Dxx  Dxy ; Dxy  Dyy ]
If the ratio of the eigenvalues of H is too large (equivalently, if Tr(H)²/Det(H) exceeds a threshold), the keypoint is rejected
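A minimal sketch of both rejection tests on a DoG image (the contrast threshold of 0.03, which assumes intensities scaled to [0, 1], and the ratio r = 10 are common illustrative choices, not fixed by these notes):

```python
import numpy as np

def accept_keypoint(dog_img, y, x, contrast_thresh=0.03, r=10.0):
    """Reject low-contrast keypoints and edge-like keypoints (Hessian ratio test)."""
    D = dog_img
    if abs(D[y, x]) < contrast_thresh:   # low contrast -> likely noise
        return False

    # 2x2 spatial Hessian from central differences.
    dxx = D[y, x+1] - 2*D[y, x] + D[y, x-1]
    dyy = D[y+1, x] - 2*D[y, x] + D[y-1, x]
    dxy = 0.25 * (D[y+1, x+1] - D[y+1, x-1] - D[y-1, x+1] + D[y-1, x-1])

    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:                          # eigenvalues of opposite sign: reject
        return False
    # Large Tr(H)^2 / Det(H) means one eigenvalue dominates -> edge -> reject.
    return (tr * tr) / det < ((r + 1) ** 2) / r
```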
Step 3: Orientation Assignment
Each keypoint is assigned a consistent orientation based on local image gradients to
achieve rotation invariance.
1. Gradient Magnitude and Orientation:
▪ For each keypoint, compute the gradient magnitude m(x,y) and orientation θ(x,y) in a neighborhood around it:
m(x,y) = sqrt( (L(x+1,y) − L(x−1,y))² + (L(x,y+1) − L(x,y−1))² )
θ(x,y) = tan⁻¹( (L(x,y+1) − L(x,y−1)) / (L(x+1,y) − L(x−1,y)) )
Step 3: Orientation Assignment
2. Orientation Histogram:
▪ Construct a histogram of gradient orientations within the neighborhood
▪ The dominant orientation is assigned to the keypoint
▪ If there are additional peaks within 80% of the main peak, additional keypoints are
created at the same location with different orientations
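A minimal sketch of orientation assignment around one keypoint (36-bin histogram over a square window; the Gaussian weighting of samples used in the original SIFT is omitted for brevity, and the keypoint is assumed to be at least `radius` pixels from the image border):

```python
import numpy as np

def dominant_orientations(L, y, x, radius=8, num_bins=36, peak_ratio=0.8):
    """Return the orientation(s), in radians, assigned to the keypoint at (y, x)."""
    patch = L[y - radius:y + radius + 1, x - radius:x + radius + 1].astype(float)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    theta = np.arctan2(gy, gx)                    # in (-pi, pi]

    # Magnitude-weighted histogram of gradient orientations.
    hist, edges = np.histogram(theta, bins=num_bins,
                               range=(-np.pi, np.pi), weights=mag)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Dominant peak, plus any peak within 80% of it -> extra orientations.
    main = hist.max()
    return [centers[i] for i in range(num_bins) if hist[i] >= peak_ratio * main]
```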
Step 4: Keypoint descriptor
Finally, SIFT generates a descriptor for each keypoint to enable robust matching
between images.
1. Descriptor structure
▪ Divide the local neighborhood into a 4x4 grid.
▪ For each grid cell, create an 8-bin orientation histogram, capturing gradient
orientations.
▪ This results in a 128-dimensional vector (4x4 grid cells with 8 orientation bins
each).
2. Normalization
▪ To achieve illumination invariance, normalize the descriptor vector to unit
length.
▪ Clipping the descriptor values and renormalizing helps reduce sensitivity to
lighting variations.
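A compact sketch of the 4×4 grid of 8-bin histograms and the normalize / clip / renormalize step (the rotation of the window to the keypoint orientation and the trilinear interpolation used in full SIFT are omitted; the keypoint is assumed to lie away from the border):

```python
import numpy as np

def sift_like_descriptor(L, y, x, cell=4, num_bins=8, clip=0.2):
    """Build a 128-D descriptor from a 16x16 window split into 4x4 cells."""
    half = 2 * cell                                    # 16x16 window
    patch = L[y - half:y + half, x - half:x + half].astype(float)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    theta = np.mod(np.arctan2(gy, gx), 2 * np.pi)      # [0, 2*pi)

    desc = []
    for cy in range(4):
        for cx in range(4):
            sl = (slice(cy * cell, (cy + 1) * cell),
                  slice(cx * cell, (cx + 1) * cell))
            hist, _ = np.histogram(theta[sl], bins=num_bins,
                                   range=(0, 2 * np.pi), weights=mag[sl])
            desc.extend(hist)
    desc = np.array(desc, dtype=float)                 # 4 * 4 * 8 = 128 values

    desc /= (np.linalg.norm(desc) + 1e-12)             # unit length
    desc = np.minimum(desc, clip)                      # clip large bins
    desc /= (np.linalg.norm(desc) + 1e-12)             # renormalize
    return desc
```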
MATCHING KEYPOINTS USING SIFT
DESCRIPTORS
Keypoints are matched between images by comparing their descriptors. Common
matching methods:
1. Euclidean Distance: Measure the Euclidean distance between descriptors.
Matches with smaller distances are considered better matches
2. Ratio Test: To reduce false matches, a ratio test compares the nearest
neighbor's distance with the second nearest neighbor's distance. If the ratio is
below a threshold (typically 0.75), the match is accepted
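A minimal matching sketch using OpenCV's built-in SIFT detector and a brute-force matcher with the ratio test (the file names are placeholders; cv2.SIFT_create requires a recent OpenCV build):

```python
import cv2

img1 = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # placeholder file names
img2 = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Euclidean (L2) distance between 128-D descriptors; two nearest neighbors each.
matcher = cv2.BFMatcher(cv2.NORM_L2)
pairs = matcher.knnMatch(des1, des2, k=2)

# Ratio test: accept only if the best match is clearly better than the second best.
good = [m for m, n in pairs if m.distance < 0.75 * n.distance]
print(f"{len(good)} matches passed the ratio test")
```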