EE5811
Topics in Computer Vision
Dr. LI Haoliang
Department of Electrical Engineering
Recap of Week 1
• A brief History of Computer Vision
• Different Applications
What is an image?
Digital Camera
We’ll focus on these in this course
Also image formation The Eye
Source: A. Efros
What is an image?
• A grid (matrix) of intensity values
255 255 255 255 255 255 255 255 255 255 255 255
255 255 255 255 255 255 255 255 255 255 255 255
255 255 255 20 0 255 255 255 255 255 255 255
255 255 255 75 75 75 255 255 255 255 255 255
255 255 75 95 95 75 255 255 255 255 255 255
255 255 96 127 145 175 255 255 255 255 255 255
255 255 127 145 175 175 175 255 255 255 255 255
255 255 127 145 200 200 175 175 95 255 255 255
255 255 127 145 200 200 175 175 95 47 255 255
255 255 127 145 145 175 127 127 95 47 255 255
255 255 74 127 127 127 95 95 95 47 255 255
255 255 255 74 74 74 74 74 74 255 255 255
255 255 255 255 255 255 255 255 255 255 255 255
255 255 255 255 255 255 255 255 255 255 255 255
(common to use one byte per value: 0 = black, 255 = white)
What is an image?
• We can think of a (grayscale) image as a function, f, from $\mathbb{R}^2$ to $\mathbb{R}$:
• f (x,y) gives the intensity at position (x,y)
[Figure: 3D view of f(x, y) plotted as a surface]
• A digital image is a discrete (sampled, quantized) version
of this function
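As a minimal sketch of "sampled, quantized", the snippet below samples a continuous function on an integer grid and rounds it to 256 gray levels. The helper name `sample_and_quantize` and the ramp function are hypothetical, chosen only for illustration, and the function is assumed to return values in [0, 1].

```python
import numpy as np

def sample_and_quantize(f, width, height, levels=256):
    """Sample f on an integer grid and quantize to `levels` gray levels."""
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    values = f(xs, ys)                      # assumed to lie in [0, 1]
    quantized = np.clip(np.round(values * (levels - 1)), 0, levels - 1)
    return quantized.astype(np.uint8)

# Hypothetical continuous image: a horizontal ramp from black to white.
ramp = lambda x, y: x / 11.0
image = sample_and_quantize(ramp, width=12, height=4)
print(image[0])   # one row of the sampled ramp
```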
Image transformations
• As with any function, we can apply operators to an
image
Example
g(x, y) = f(x, y) + 20 (brighten every pixel by 20)
g(x, y) = f(-x, y) (mirror the image about the vertical axis)
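The two example operators could be sketched in NumPy as follows. This assumes x indexes columns and that the image is stored as `uint8`, so the brightened values are clipped at 255 rather than left to wrap around; the tiny array is made up for illustration.

```python
import numpy as np

f = np.array([[10, 100, 250],
              [20, 120, 240]], dtype=np.uint8)

# g(x, y) = f(x, y) + 20: widen the type first so 250 + 20 saturates at 255
# instead of wrapping around in uint8 arithmetic.
brighter = np.clip(f.astype(np.int16) + 20, 0, 255).astype(np.uint8)

# g(x, y) = f(-x, y): mirror about the vertical axis (reverse the columns).
mirrored = f[:, ::-1]
```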
Characterizing image transformations
[i] does not mean transformation is applied at each pixel separately
Source: Deva Ramanan
Characterizing image transformations
• Properties of a “nice” functional transformation: additivity, scaling, and shift-invariance
Impulse response
• Delta function
Convolution
Example
Properties of Convolution
We can efficiently implement complex operations
Powerful way to think about ANY image transformation that
satisfies additivity, scaling, and shift-invariance.
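One way to see why these properties make complex operations efficient: because convolution is associative, two filters can be merged into one kernel before touching the image. The check below is a small 1D sketch with made-up kernels `H1` and `H2`.

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.random(50)
H1 = np.array([1.0, 2.0, 1.0]) / 4.0     # small smoothing kernel
H2 = np.array([1.0, 0.0, -1.0])          # simple derivative kernel

two_passes = np.convolve(np.convolve(F, H1), H2)   # (F * H1) * H2
one_pass = np.convolve(F, np.convolve(H1, H2))     # F * (H1 * H2)
swapped = np.convolve(H1, F)                       # H1 * F (commutativity)

print(np.allclose(two_passes, one_pass))
print(np.allclose(swapped, np.convolve(F, H1)))
```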
Size
• Given F of length N and H of length M, what is the size of G = F * H?
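A quick way to explore the question is NumPy's `np.convolve`, which exposes the three usual conventions. With every partial overlap kept ("full"), the output has N + M - 1 samples; the other two modes crop that result. The values of N and M below are arbitrary.

```python
import numpy as np

N, M = 10, 3
F = np.arange(N, dtype=float)
H = np.ones(M) / M

full = np.convolve(F, H, mode="full")    # every partial overlap: N + M - 1
same = np.convolve(F, H, mode="same")    # cropped to max(N, M)
valid = np.convolve(F, H, mode="valid")  # full overlap only: N - M + 1 (N >= M)

print(len(full), len(same), len(valid))
```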
(Cross) Correlation
The commutative property does not hold
Convolution vs. Correlation
Cross-correlation
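Let $F$ be the image, $H$ be the kernel (of size $(2k+1) \times (2k+1)$), and $G$ be the output image:
$$G[i,j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H[u,v]\, F[i+u,\, j+v]$$
This is called a cross-correlation operation: $G = H \otimes F$
• Can think of it as a “dot product” between the local neighborhood and the kernel for each pixel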
Convolution
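• Same as cross-correlation, except that the kernel is “flipped” (horizontally and vertically). For image $F$, kernel $H$ of size $(2k+1) \times (2k+1)$, and output $G$:
$$G[i,j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H[u,v]\, F[i-u,\, j-v]$$
This is called a convolution operation: $G = H * F$
• Convolution is commutative and associative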
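The difference between the two operations can be sketched with a direct (slow) NumPy implementation. The function names are hypothetical, zero padding is assumed at the border, and the kernel is assumed to be square with odd size. Filtering a unit impulse shows the flip: convolution reproduces the kernel, cross-correlation reproduces its flipped copy.

```python
import numpy as np

def cross_correlate(F, H):
    """G[i, j] = sum_{u, v} H[u, v] * F[i + u, j + v], with zero padding."""
    k = H.shape[0] // 2                       # kernel is (2k+1) x (2k+1)
    padded = np.pad(F.astype(float), k)       # zeros outside the image
    G = np.zeros(F.shape, dtype=float)
    for i in range(F.shape[0]):
        for j in range(F.shape[1]):
            window = padded[i:i + 2 * k + 1, j:j + 2 * k + 1]
            G[i, j] = np.sum(window * H)      # dot product with the kernel
    return G

def convolve(F, H):
    """Convolution = cross-correlation with the kernel flipped both ways."""
    return cross_correlate(F, H[::-1, ::-1])

F = np.zeros((5, 5))
F[2, 2] = 1.0                                 # unit impulse
H = np.arange(1.0, 10.0).reshape(3, 3)        # asymmetric kernel 1..9

print(convolve(F, H)[1:4, 1:4])               # the kernel itself
print(cross_correlate(F, H)[1:4, 1:4])        # the kernel, flipped
```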
Linear filtering
• Cross-correlation, convolution
• Replace each pixel by a linear combination (a weighted sum)
of its neighbors
• The prescription for the linear combination is called
the “kernel” (or “mask”, “filter”)
Local image data:
10 5 3
4 6 1
1 1 8
Kernel:
0 0 0
0 0.5 0
0 1 0.5
Modified image data (value written at the center pixel): 8
Source: L. Zhang
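The arithmetic behind this example can be checked directly. The sketch below reads the 3 x 3 neighborhood and kernel from the slide and applies the weighted sum as a cross-correlation (no kernel flip), which is an assumption about how the figure was computed.

```python
import numpy as np

local = np.array([[10, 5, 3],
                  [4, 6, 1],
                  [1, 1, 8]], dtype=float)
kernel = np.array([[0, 0,   0],
                   [0, 0.5, 0],
                   [0, 1,   0.5]])

# Weighted sum of the neighborhood: 0.5*6 + 1*1 + 0.5*8
new_value = np.sum(local * kernel)
print(new_value)
```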
Filters
• Filtering
• Form a new image whose pixels are a combination of the
original pixels
• Why?
• To get useful information from images
• E.g., extract edges or contours (to understand shape)
• To enhance the image
• E.g., to remove noise
• E.g., to sharpen or to “enhance image”
Canonical Image Processing problems
• Image Restoration
• denoising
• deblurring
• Image Compression
• JPEG, JPEG2000, MPEG, ...
• Computing Field Properties
• optical flow
• disparity
• Locating Structural Features
• corners
• edges
Question: Noise reduction
• Given a camera and a still scene, how can you
reduce noise?
Take lots of images and average them!
What’s the next best thing?
Source: S. Seitz
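A small simulation can illustrate why averaging helps. It assumes independent zero-mean Gaussian noise in each frame, a made-up constant scene, and 100 frames; under those assumptions the noise standard deviation drops by roughly the square root of the number of frames.

```python
import numpy as np

rng = np.random.default_rng(0)
scene = np.full((64, 64), 100.0)               # hypothetical still scene
sigma = 10.0                                   # assumed noise std per frame
frames = scene + rng.normal(0.0, sigma, size=(100, 64, 64))

average = frames.mean(axis=0)
noise_single = np.std(frames[0] - scene)       # close to sigma
noise_average = np.std(average - scene)        # close to sigma / sqrt(100)

print(noise_single, noise_average)
```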
Image filtering
• Modify the pixels in an image based on some function of
a local neighborhood of each pixel
Local image data:
10 5 3
4 5 1
1 1 7
→ Convolution →
Modified image data (value written at the center pixel): 7
Source: L. Zhang
Convolution
Adapted from F. Durand
Border effects
Border padding
Mean filtering
Kernel H (3 x 3 mean filter): 1/9 x
1 1 1
1 1 1
1 1 1

Image F (10 x 10):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

Output G = F * H (8 x 8, positions where the kernel fits entirely inside F):
0 10 20 30 30 30 20 10
0 20 40 60 60 60 40 20
0 30 60 90 90 90 60 30
0 30 50 80 80 90 60 30
0 30 50 80 80 90 60 30
0 20 30 50 50 60 40 20
10 20 30 30 30 30 20 10
10 10 10 0 0 0 0 0
Mean filtering/Moving average
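The moving average on the 10 x 10 example image can be reproduced with a few lines of NumPy. This is a sketch that assumes a 3 x 3 window with weights 1/9 and keeps only positions where the window fits inside the image (an 8 x 8 output); the function name is hypothetical.

```python
import numpy as np

F = np.zeros((10, 10))
F[2:7, 3:8] = 90      # bright square
F[5, 4] = 0           # one dark pixel inside the square
F[8, 2] = 90          # one isolated bright pixel

def mean_filter_valid(image, size=3):
    """Average each size x size window that lies fully inside the image."""
    out_h = image.shape[0] - size + 1
    out_w = image.shape[1] - size + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = image[i:i + size, j:j + size].mean()
    return out

G = mean_filter_valid(F)
print(np.round(G).astype(int))
```

Note how the isolated bright pixel and the dark hole are both spread over a 3 x 3 area at reduced contrast.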
Linear filters: examples
Original *
0 0 0
0 1 0
0 0 0
= Identical image
Source: D. Lowe
Linear filters: examples
Original *
0 0 0
1 0 0
0 0 0
= Shifted left by 1 pixel
Source: D. Lowe
Linear filters: examples
Original * 1/9 x
1 1 1
1 1 1
1 1 1
= Blur (with a mean filter)
Source: D. Lowe
Linear filters: examples
Original * (A - B) = Sharpened image, where
A =
0 0 0
0 2 0
0 0 0
B = 1/9 x
1 1 1
1 1 1
1 1 1
Sharpening filter (accentuates edges)
Source: D. Lowe
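The four example filters can be tried on a single image row. The sketch below is a 1D analogue of the 3 x 3 kernels, using `np.convolve` (true convolution, output cropped to the input length); the sample row is made up, and the box filter is assumed to be normalized so its weights sum to 1.

```python
import numpy as np

row = np.array([0., 0., 10., 50., 90., 90., 90., 0., 0.])

identity = np.array([0., 1., 0.])
shift    = np.array([1., 0., 0.])
box      = np.ones(3) / 3.0
sharpen  = 2.0 * identity - box        # "2 * impulse minus mean"

same      = np.convolve(row, identity, mode="same")   # unchanged
shifted   = np.convolve(row, shift, mode="same")      # moved left by one sample
blurred   = np.convolve(row, box, mode="same")        # moving average
sharpened = np.convolve(row, sharpen, mode="same")    # edges exaggerated

print(shifted)
print(np.round(sharpened, 1))
```

Because convolution is linear, the sharpened row equals `2 * row - blurred`, which is the "original plus detail" view of sharpening.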
Sharpening
Source: D. Lowe
Smoothing with box filter revisited
Source: D. Forsyth
Gaussian Kernel
0.003 0.013 0.022 0.013 0.003
0.013 0.059 0.097 0.059 0.013
0.022 0.097 0.159 0.097 0.022
0.013 0.059 0.097 0.059 0.013
0.003 0.013 0.022 0.013 0.003
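5 x 5, σ = 1
$$G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$$
• The constant factor in front makes the volume sum to 1 (it can be ignored, as we should re-normalize the weights to sum to 1 in any case)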
Source: C. Rasmussen
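The 5 x 5 table can be regenerated by sampling the standard 2D Gaussian at integer offsets. The sketch assumes the usual formula with the 1/(2πσ²) factor; the function name and the `normalize` flag are hypothetical. Rounded to three decimals, the raw samples match the table, and the optional re-normalization makes the discrete weights sum to exactly 1.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0, normalize=False):
    """Sample a 2D Gaussian on a size x size grid centered at the origin."""
    half = size // 2
    coords = np.arange(-half, half + 1)
    x, y = np.meshgrid(coords, coords)
    kernel = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    if normalize:
        kernel = kernel / kernel.sum()     # make the discrete weights sum to 1
    return kernel

raw = gaussian_kernel()
normalized = gaussian_kernel(normalize=True)
print(np.round(raw, 3))
```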
Gaussian Kernel
Source: C. Rasmussen
Gaussian filters
σ = 1 pixel, σ = 5 pixels, σ = 10 pixels, σ = 30 pixels
Mean vs. Gaussian filtering
Gaussian filter
• Removes “high-frequency” components from the
image (low-pass filter)
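• Convolution with self is another Gaussian: convolving twice with a Gaussian of width σ is the same as convolving once with a Gaussian of width σ√2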
Source: K. Grauman
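The self-convolution property can be checked numerically in 1D. The sketch assumes sampled, normalized Gaussians truncated at six standard deviations; the helper names are hypothetical. Variances add under convolution, so the result has width σ√2.

```python
import numpy as np

def gaussian_1d(sigma, radius):
    """Sampled Gaussian on [-radius, radius], normalized to sum to 1."""
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

def variance(kernel):
    """Variance of a kernel treated as a distribution centered at zero."""
    x = np.arange(len(kernel)) - (len(kernel) - 1) / 2.0
    return np.sum(kernel * x**2)

sigma = 2.0
g = gaussian_1d(sigma, radius=12)
gg = np.convolve(g, g)                      # Gaussian convolved with itself
target = gaussian_1d(sigma * np.sqrt(2.0), radius=24)

print(variance(g), variance(gg))            # the second is twice the first
print(np.max(np.abs(gg - target)))          # difference from the wider Gaussian
```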
Sharpening revisited
• What does blurring take away?
original – smoothed (5x5) = detail
Let’s add it back:
original + α · detail = sharpened
Source: S. Lazebnik
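The "add the detail back" idea can be sketched on a 1D step edge. This is an analogue of the slide, assuming a box blur of width 5 (rather than the 5 x 5 smoothing shown) with replicated borders and α = 1; the helper name is hypothetical.

```python
import numpy as np

def box_blur_1d(signal, size=5):
    """Moving average of odd width `size`, replicating the border samples."""
    padded = np.pad(signal, size // 2, mode="edge")
    return np.convolve(padded, np.ones(size) / size, mode="valid")

original = np.array([10.0] * 8 + [200.0] * 8)   # a step edge
smoothed = box_blur_1d(original)
detail = original - smoothed                    # what blurring took away
alpha = 1.0                                     # assumed sharpening strength
sharpened = original + alpha * detail

print(np.round(sharpened, 1))
```

Flat regions are unchanged, while values next to the edge overshoot on both sides, which is what makes the edge look crisper.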
Filters: Thresholding
Image Scaling
This image is too big to fit on the
screen. How can we generate a
half-sized version?
Source: S. Seitz
Image sub-sampling
Throw away every other row and column to create a smaller image; this is called image sub-sampling
[Figure: the image sub-sampled to 1/4 and 1/8 size]
Source: S. Seitz
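In NumPy, throwing away every other row and column is a strided slice. The 8 x 8 test array below is made up for illustration.

```python
import numpy as np

image = np.arange(64, dtype=np.uint8).reshape(8, 8)
half = image[::2, ::2]       # keep rows and columns 0, 2, 4, 6
quarter = half[::2, ::2]     # repeat to halve the size again

print(half.shape, quarter.shape)
```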
Image sub-sampling
1/2 1/4 (2x zoom) 1/8 (4x zoom)
Why does this look so bad?
Source: S. Seitz
Aliasing
• Occurs when your sampling rate is not high enough to
capture the amount of detail in your image
• Can give you the wrong signal/image—an alias
• To do sampling right, need to understand the structure of
your signal/image
• To avoid aliasing:
• sampling rate ≥ 2 * max frequency in the image
• said another way: ≥ two samples per cycle
• This minimum sampling rate is called the Nyquist rate
Source: L. Zhang
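A 1D sketch of an alias, with made-up numbers: a 9 Hz cosine sampled at 10 samples per second (below its Nyquist rate of 18) produces exactly the same samples as a 1 Hz cosine.

```python
import numpy as np

fs = 10.0                            # sampling rate (samples per second)
n = np.arange(20)
t = n / fs

high = np.cos(2 * np.pi * 9.0 * t)   # 9 Hz: above fs / 2 = 5 Hz
alias = np.cos(2 * np.pi * 1.0 * t)  # 1 Hz: what the samples look like

print(np.allclose(high, alias))
```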
Nyquist limit – 2D example
Good sampling
Bad sampling
Aliasing
• When downsampling by a factor of two
• Original image has frequencies that are too high
• How can we fix this?
Gaussian pre-filtering
[Figure: Gaussian 1/2, G 1/4, G 1/8 (each level blurred with a Gaussian before subsampling)]
• Solution: filter the image, then subsample
Source: S. Seitz
Subsampling with Gaussian pre-filtering
Gaussian 1/2 G 1/4 G 1/8
• Solution: filter the image, then subsample
Source: S. Seitz
Compare with...
1/2 1/4 (2x zoom) 1/8 (4x zoom)
Source: S. Seitz
Gaussian pre-filtering
• Solution: filter the image, then subsample
F0 → blur (F0 * H) → subsample → F1 → blur (F1 * H) → subsample → F2 → …
Gaussian pyramid
• The levels F0, F1, F2, … produced by repeated blur and subsample form the Gaussian pyramid
Gaussian pyramids
[Burt and Adelson, 1983]
• In computer graphics, a mip map [Williams, 1983]
• A precursor to wavelet transform
Gaussian Pyramids have all sorts of applications in computer vision
Source: S. Seitz
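A blur-then-subsample pyramid can be sketched in plain NumPy. This version assumes the 5-tap binomial kernel [1, 4, 6, 4, 1] / 16 as a cheap Gaussian approximation, applied separably with replicated borders; the function names are hypothetical and the input is random data standing in for an image.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0   # binomial approx. of a Gaussian

def blur(image):
    """Separable blur: filter rows, then columns, replicating the borders."""
    padded = np.pad(image.astype(float), 2, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, KERNEL, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, KERNEL, mode="valid")

def gaussian_pyramid(image, levels):
    """Return [F0, F1, ...], each level blurred and subsampled by 2."""
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])    # blur, then subsample
    return pyramid

image = np.random.default_rng(0).random((64, 64))
pyramid = gaussian_pyramid(image, levels=4)
print([level.shape for level in pyramid])
```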
Upsampling
• This image is too small for this screen:
• How can we make it 10 times as big?
• Simplest approach: repeat each row and column 10 times (“nearest neighbor interpolation”)
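Repeating each row and column is a one-liner with `np.repeat`. The 2 x 2 image below is a made-up example.

```python
import numpy as np

small = np.array([[1, 2],
                  [3, 4]], dtype=np.uint8)
factor = 10

# Repeat every row `factor` times, then every column `factor` times.
big = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
print(big.shape)
```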
Image interpolation
[Figure: samples of a 1D signal at x = 1, 2, 3, 4, 5; sample spacing d = 1 in this example]
Recall how a digital image is formed
• It is a discrete point-sampling of a continuous function
• If we could somehow reconstruct the original function, any new
image could be generated, at any resolution and scale
Adapted from: S. Seitz
Image interpolation
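[Figure: samples at x = 1, 2, 3, 4, 5 with a new value wanted at x = 2.5; d = 1 in this example]
• What if we don’t know $f$?
• Guess an approximation: $\tilde{f}$
• Can be done in a principled way: filtering
• Convert $F$ to a continuous function: $f_F(x) = F(x/d)$ when $x/d$ is an integer, 0 otherwise
• Reconstruct by convolution with a reconstruction filter, $h$: $\tilde{f} = h * f_F$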
Adapted from: S. Seitz
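Reconstruction by filtering can be sketched as a weighted sum of the samples, with the weights given by the reconstruction filter centered on the query position. The sample values below are hypothetical (the slide only shows positions 1 to 5), and the helper names are made up; the hat filter gives linear interpolation and the box filter gives nearest-neighbor interpolation.

```python
import numpy as np

def hat(t):
    """Triangle (hat) reconstruction filter: performs linear interpolation."""
    return np.maximum(0.0, 1.0 - np.abs(t))

def box(t):
    """Box reconstruction filter: performs nearest-neighbor interpolation."""
    return np.where((t >= -0.5) & (t < 0.5), 1.0, 0.0)

def reconstruct(samples, positions, x, h, d=1.0):
    """Evaluate sum_n F[n] * h((x - x_n) / d) at a real-valued x."""
    return float(np.sum(samples * h((x - positions) / d)))

positions = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
F = np.array([1.0, 3.0, 2.0, 5.0, 4.0])      # hypothetical sample values
linear = reconstruct(F, positions, 2.5, hat)  # halfway between F(2) and F(3)
nearest = reconstruct(F, positions, 2.4, box) # snaps to the sample at x = 2

print(linear, nearest)
```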
Image interpolation
“Ideal” reconstruction
Nearest-neighbor
interpolation
Linear interpolation
Gaussian reconstruction
Source: B. Curless
Image interpolation
• What does the 2D version of this hat function look like?
[Figure: the 1D hat function performs linear interpolation; its 2D version (a tent function) performs bilinear interpolation]
Better filters give better resampled images
• Bicubic is common choice
Cubic reconstruction filter
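Bilinear interpolation at a real-valued position can be sketched as two linear interpolations along x followed by one along y. The function name is hypothetical, x is assumed to index columns and y rows, and the query point is assumed to lie inside the image.

```python
import numpy as np

def bilinear_sample(image, x, y):
    """Interpolate image at real-valued (x, y); x indexes columns, y rows."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, image.shape[1] - 1)        # clamp at the right edge
    y1 = min(y0 + 1, image.shape[0] - 1)        # clamp at the bottom edge
    a, b = x - x0, y - y0                       # fractional offsets in [0, 1)
    top = (1 - a) * image[y0, x0] + a * image[y0, x1]
    bottom = (1 - a) * image[y1, x0] + a * image[y1, x1]
    return (1 - b) * top + b * bottom

image = np.array([[0.0, 10.0],
                  [20.0, 30.0]])
center = bilinear_sample(image, 0.5, 0.5)       # average of the 4 neighbors
print(center)
```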
Image interpolation
Original image, enlarged x 10:
Nearest-neighbor interpolation Bilinear interpolation Bicubic interpolation
Potential project: Seam carving
https://en.wikipedia.org/wiki/Seam_carving
Image interpolation
• Resizing (resampling)
• Remapping (geometric transformation, rotation, ...)
• Inpainting (restoration of holes)
• Morphing, nonlinear transformations
Coding Exercise
• Use Python to implement image filtering with the provided image as input (cv2.filter2D)
• Mean (box) filter kernel: 1/9 x
1 1 1
1 1 1
1 1 1
• Sharpening kernel:
0 0 0
0 2 0
0 0 0
minus 1/9 x
1 1 1
1 1 1
1 1 1
• You can also try other filters.
Coding Exercise
• Image Interpolation
• Nearest Neighbor Interpolation: selects the value of the
nearest point and does not consider the values of
neighboring points at all, yielding a piecewise-constant
interpolant.
• Bilinear Interpolation: uses values of only the 4 nearest
pixels, located in diagonal directions from a given pixel
• Bicubic Interpolation: considers 16 pixels (4×4).
cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) → dst
Coding Exercise
https://docs.opencv.org/3.4/dc/dff/tutorial_py_pyramids.html
Question
• Suppose that you filter an image f(x,y) with a
spatial filter mask w(x,y) using convolution, where
the mask is smaller than the image in both spatial
directions.
• Show the important property that, if the coefficients of
the mask sum to zero, then the sum of all the elements
in the resulting filtered image will be zero also (you may
assume that the border of the image has been padded
with the appropriate number of zeros).
• Would the result be the same if the filtering is
implemented using correlation?
Solution
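One possible argument, sketched under the question's own assumption that the image is zero-padded so that every shifted copy of f lies entirely inside the range being summed. Here g denotes the filtered image and the sums over (x, y) run over the whole padded output.

```latex
\begin{aligned}
\sum_{x,y} g(x,y)
  &= \sum_{x,y} \sum_{s,t} w(s,t)\, f(x-s,\, y-t) \\
  &= \sum_{s,t} w(s,t) \sum_{x,y} f(x-s,\, y-t) \\
  &= \Big(\sum_{s,t} w(s,t)\Big) \Big(\sum_{x,y} f(x,y)\Big) = 0 .
\end{aligned}
```

The inner sum does not depend on (s, t): shifting f inside the padded range does not change the total of its values, so it factors out, and the remaining factor is the sum of the mask coefficients, which is zero. For correlation, f(x - s, y - t) is replaced by f(x + s, y + t); the same factorization applies, so the result is the same.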