Stitching and Blending
Kari Pulli
VP Computational Imaging
Light
First project
Build your own (basic) programs
panorama
HDR (really, exposure fusion)
The key components
register images so their features align
determine overlap
blend
Scalado Rewind
We need to match (align) images
Detect feature points in both images
Find corresponding pairs
Use these pairs to align images
Matching with Features
Problem 1:
Detect the same point independently in both images
no chance to match!
We need a repeatable detector
Matching with Features
Problem 2:
For each point correctly recognize the corresponding one
We need a reliable and distinctive descriptor
Harris Detector: Basic Idea
“flat” region: “edge”: “corner”:
no change in all no change along the significant change in all
directions edge direction directions
Harris Detector: Mathematics
Window-averaged change of intensity for the shift [u,v]:
2
E (u, v) = ∑ w( x, y) [ I ( x + u, y + v) − I ( x, y) ]
x, y
Window Shifted Intensity
function intensity
Window function w(x,y) = or
1 in window, 0 outside Gaussian
Harris Detector: Mathematics
Expanding E(u,v) in a 2nd order Taylor series expansion,
we have, for small shifts [u,v], a bilinear approximation:
⎡u ⎤
E (u, v) ≅ [u, v ] M ⎢ ⎥
⎣ v ⎦
where M is a 2×2 matrix computed from image derivatives:
⎡ I x2 I x I y ⎤
M = ∑ w( x, y) ⎢ 2 ⎥
I x = I (x, y) − I (x −1, y)
x, y ⎢⎣ I x I y I y ⎥⎦
Eigenvalues λ1, λ2 of M at different locations
λ1 and λ2 are large
Eigenvalues λ1, λ2 of M at different locations
large λ1, small λ2
Eigenvalues λ1, λ2 of M at different locations
small λ1, small λ2
Harris Detector: Mathematics
⎡ I x2 I x I y ⎤
Measure of corner response: M = ∑ w( x, y) ⎢ 2 ⎥
x, y ⎢⎣ I x I y I y ⎥⎦
det M = λ1λ2
trace M = λ1 + λ2
2
R = det M − k ( trace M )
(k – empirical constant, k = 0.04 - 0.06)
Harris Detector: Mathematics
λ2 “Edge” “Corner”
• R depends only on R<0
eigenvalues of M
• R is large for a corner R>0
• R is negative with large
magnitude for an edge
• |R| is small for a flat region
“Flat” “Edge”
|R| small R<0
λ1
Harris Detector: Workflow
Harris Detector: Workflow
Compute corner response R
Harris Detector: Workflow
Find points with large corner response: R > threshold
Harris Detector: Workflow
Take only the points of local maxima of R
Harris Detector: Workflow
Not invariant to image scale!
All points will be Corner !
classified as edges
FAST Corners
Look for a contiguous arc of N pixels
all much darker (or brighter) than the central pixel p
It actually is fast
And repeatable!
AUR = Area Under Repeatability curve
Point Descriptors
We know how to detect points
Next question:
How to match them?
?
Point descriptor should be:
1. Invariant
2. Distinctive
SIFT – Scale Invariant Feature Transform
SIFT – Scale Invariant Feature Transform
Descriptor overview:
Determine scale (by maximizing DoG in scale and in space),
local orientation as the dominant gradient direction
Use this scale and orientation to make all further computations
invariant to scale and rotation
Compute gradient orientation histograms of several small
windows (128 values for each point)
Normalize the descriptor to make it invariant to intensity change
D. Lowe. “Distinctive Image Features from Scale-Invariant Keypoints” IJCV 2004
Match orientations
Determine the local orientation
Within the image patch
estimate dominant gradient direction
collect a histogram of gradient orientations
find the peak, rotate the patch so it becomes vertical
0 2π
D. Lowe. “Distinctive Image Features from Scale-Invariant Keypoints” IJCV 2004
Determine the scale
We define the characteristic scale as the scale that
produces peak of Laplacian response
characteristic scale
T. Lindeberg (1998). Feature detection with automatic scale selection.
International Journal of Computer Vision 30 (2): pp 77--116.
Blob detection in 2D
Laplacian of Gaussian: Circularly symmetric operator for
blob detection in 2D
2 2
0
1 -4
1 0
1
2 ⎛ ∂ g ∂ g ⎞
2
0 1 0 ∇ norm g = σ ⎜⎜ 2 + 2 ⎟⎟
∂2 g
∂x 2
= g(x −1, y) − 2g(x, y) + g(x +1, y) ⎝ ∂x ∂y ⎠
Difference of Gaussians (DoG)
Laplacian of Gaussian can be approximated by the
difference between two different Gaussians
Create a pack of DoGs
Fast computation, process scale space an octave at a time
Determine the scale
Find a local maximum in space and scale
D. Lowe. “Distinctive Image Features from Scale-Invariant Keypoints” IJCV 2004
ORB (Oriented FAST and Rotated BRIEF)
Use FAST-9
use Harris measure to order them
✓P P ◆
Find orientation xI(x, y) yI(x, y)
P , P
calculate weighted new center I(x, y) I(x, y)
reorient image so that gradients vary vertically
BRIEF
Binary Robust Independent Elementary Features
choose pixels to compare, result creates 0 or 1
combine to a binary vector, compare using
Hamming distance (XOR + pop count)
Rotated BRIEF
train a good set of pixels to compare
rBRIEF vs. SIFT
Aligning images: Translation?
left on top right on top
Translations are not enough to align the images
Which transform to use?
Translation Affine Perspective
2 unknowns 6 unknowns 8 unknowns
Homography
Projective mapping between any two PPs
with the same center of projection
rectangle maps to almost arbitrary quadrilateral
parallel lines do not remain parallel
but must preserve straight lines
is called a "wx' % " h11 h12 h13 % " x %
Homography $ ' $ '$ ' PP2
$wy' ' = $h21 h22 h23 ' $ y '
$# w '& $#h31 h32 h33 '& $# 1'&
p’ H p
To apply a homography H PP1
compute p’ = Hp (regular matric multiply)
€ homogeneous to image
convert p’ from
coordinates [x’, y’] (divide by w)
Homography from mapping quads
Fundamentals of Texture Mapping and Image Warping
Paul Heckbert, M.Sc. thesis, U.C. Berkeley, June 1989, 86 pp.
http://www.cs.cmu.edu/~ph/texfund/texfund.pdf
Homography from n point pairs (x,y ; x’,y’)
Multiply out "wx' % " h11 h12 h13 % " x %
wx’ = h11 x + h12 y + h13
$ ' $ '$ '
wy’ = h21 x + h22 y + h23
$wy' ' = $h21 h22 h23 ' $ y '
w = h31 x + h32 y + h33
$# w '& $#h31 h32 h33 '& $# 1'&
Get rid of w p’ H p
(h31 x + h32 y + h33)x’ – (h11 x + h12 y + h13) = 0
h11
(h31 x + h32 y + h33)y’ – (h21 x + h22 y + h23) = 0
h12
Create a new system Ah = 0 € h13
Each point constraint gives two rows of A h21
[-x -y -1 0 0 0 xx’ yx’ x’] h = h22
[ 0 0 0 -x -y -1 xy’ yy’ y’] h23
Solve with singular value decomposition of A = USVT h31
solution is in the nullspace of A h32
the last column of V (= last row of VT) h33
Example
common
picture
plane of
mosaic
image
perspective reprojection Pics: Marc Levoy
What to do with outliers?
Least squares OK when error has Gaussian distribution
But it breaks with outliers
data points that are not drawn from the same distribution
Mis-matched points are outliers to the Gaussian error
distribution
severely disturbs the Homography
Line fitting using
regression is
biased by outliers
RANSAC
RANdom SAmple Consensus
1. Randomly choose a subset of data points to fit model (a sample)
2. Points within some distance threshold t of model are a consensus set
Size of consensus set is model’s support
3. Repeat for N samples; model with the biggest support is the most robust fit
Points within distance t of best model are inliers
Fit final model to all inliers
Two samples
and their supports
for line-fitting
Hybrid multi-resolution registration
I.B. Image Based
Initial guess
F.B. Feature Based
I.B.
F.B.
F.B.
F.B.
Registration parameters
K. Pulli, M. Tico, Y. Xiong, X. Wang, C-K. Liang, “Panoramic Imaging System for Camera Phones”, ICCE 2010
Feature-based registration
Previous estimate Update search range
invalid
Feature Detection Feature Matching Validity
RANSAC
(Harris corners) (spherical coordinates) check
valid
Apply the previous
registration estimate New estimate
Convert to spherical Convert from spherical
coordinates coordinates
+ +
+ +
Best block cross-
correlation match
Progression of multi-resolution registration
Actual
size
Applied
to hi-res
Image blending
Directly averaging the overlapped pixels results in ghosting artifacts
Moving objects, errors in registration, parallax, etc.
Photo by Chia-Kai Liang
Alpha Blending / Feathering
Alpha Blending / Feathering
+
1 1
0 0
Iblend = αIleft + (1-α)Iright
=
Solution for ghosting: Image labeling
Assign one input image to each output pixel
Optimal assignment can be found by graph cut [Agarwala et al. 2004]
Faster solution with block
dynamic programming
Input texture
B1 B2 B1 B2 B1 B2
Random placement Neighboring blocks Minimal error
of blocks constrained by overlap boundary cut
Minimal error boundary with DP
overlapping blocks vertical boundary
2
_ =
overlap error min. error boundary
New artifacts
Inconsistency between pixels from different input images
Different exposure/white balance settings
Photometric distortions (e.g., vignetting)
Solution: Poisson blending
Copy the gradient field from the input image
Reconstruct the final image by solving a Poisson equation
Combined gradient field
Problems with direct cloning
P. Pérez, M. Gangnet, A. Blake. Poisson image editing. SIGGRAPH 2003
http://www.irisa.fr/vista/Papers/2003_siggraph_perez.pdf
Membrane interpolation
Solution: clone gradient, integrate colors
Copy the details
Seamlessly paste onto
Just add a linear function so that the boundary condition is respected
Gradients didn’t change much,
and function is continuous
SIGGRAPH 2009
Smooth interpolation over a triangulation
Alpha blending
After labeling
Poisson blending
Pyramid Blending
The Laplacian pyramid
Gaussian Pyramid Laplacian Pyramid
Gn expa
Ln = Gn
nd
G2
- = L2
G1 L1
- =
G0 L0
- =
Laplacian Pyramid: Blending
1. Build Laplacian pyramids LA and LB from images A and B
2. Build a Gaussian pyramid GM from selection mask M
3. Form a combined pyramid LS
from LA and LB
using nodes of GR as weights:
• LS = GM * LA + (1-GM) * LB
4. Collapse the LS pyramid
to get the final blended image
Laplacian
level
4
Laplacian
level
2
Laplacian
level
0
left pyramid right pyramid blended pyramid
Pyramid Blending
Multi-resolution fusion
Simplification: Two-band Blending
Brown & Lowe, 2003
Only use two bands: high freq. and low freq.
Blends low freq. smoothly
Blend high freq. with no smoothing: use binary alpha
2-band Blending
Low frequency (λ > 2 pixels)
High frequency (λ < 2 pixels)
Linear Blending
2-band Blending
Additional reading
Image Alignment and Stitching: A tutorial
Richard Szeliski
Foundations and Trends in Computer Graphics and Vision
Computer Vision: Algorithms and Applications
Richard Szeliski
http://szeliski.org/Book/
Chapters 4, 6, 9