MAN-522: COMPUTER VISION
SET-2
Projections and
Camera Calibration
Image formation
• How are objects in the world captured in
an image?
Physical parameters of
image formation
• Geometric
– Type of projection
– Camera pose
• Optical
– Sensor’s lens type
– focal length, field of view, aperture
• Photometric
– Type, direction, intensity of light reaching sensor
– Surfaces’ reflectance properties
Image formation
• Let’s design a camera
– Idea 1: put a piece of film in front of an object
– Do we get a reasonable image?
Slide by Steve Seitz
Pinhole camera
• Add a barrier to block off most of the rays
– This reduces blurring
– The opening is known as the aperture
– How does this transform the image?
Slide by Steve Seitz
Pinhole camera
• The pinhole camera is a simple model that approximates the
imaging process as perspective projection.
[Figure: pinhole camera showing the image plane, the pinhole, and the virtual image]
If we treat pinhole as a point, only one ray
from any given point can enter the camera.
Fig from Forsyth and Ponce
Camera obscura
In Latin, means
‘dark room’
"Reinerus Gemma-Frisius, observed an eclipse of the sun at Louvain on January
24, 1544, and later he used this illustration of the event in his book De Radio
Astronomica et Geometrica, 1545. It is thought to be the first published illustration of
a camera obscura..."
Hammond, John H., The Camera Obscura, A Chronicle
http://www.acmi.net.au/AIC/CAMERA_OBSCURA.html
Camera obscura
Jetty at Margate, England, 1898.
Around 1870s
An attraction in the late 19th century
http://brightbytes.com/cosite/collection2.html
Adapted from R. Duraiswami
Camera obscura at home
Sketch from http://www.funsci.com/fun3_en/sky/sky.htm
http://blog.makezine.com/archive/2006/02/how_to_room_sized_camera_obscu.html
Perspective effects
• Far away objects appear smaller
Forsyth and Ponce
Perspective effects
• Parallel lines in the scene intersect in the image
• Converge in image on horizon line
[Figure: scene, pinhole, and (virtual) image plane]
Projection properties
• Many-to-one: any points along same ray map to
same point in image
• Points → points
• Lines → lines (collinearity preserved)
• Distances and angles are not preserved
• Degenerate cases:
– Line through focal point projects to a point.
– Plane through focal point projects to line
– Plane perpendicular to image plane projects to part of
the image.
Perspective and art
• Use of correct perspective projection indicated in
1st century B.C. frescoes
• Skill resurfaces in Renaissance: artists develop
systematic methods to determine perspective
projection (around 1480-1515)
Raphael; Dürer, 1525
Perspective projection equations
• 3d world mapped to 2d projection in image plane
[Figure: camera frame with image plane, focal length $f'$, and optical axis; scene / world points]
Scene point $(x, y, z)$ → image coordinates $(x', y') = \left(f' \frac{x}{z},\ f' \frac{y}{z}\right)$
Forsyth and Ponce
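The projection equations, and the properties listed earlier (many-to-one mapping, far-away objects appearing smaller), can be checked numerically. The following is a minimal sketch, assuming points given in the camera frame with nonzero depth; the names `project` and `image_height` and the sample values are illustrative and do not come from the slides.

```python
def project(point, f_prime=1.0):
    """Pinhole projection of a camera-frame point (x, y, z), z != 0."""
    x, y, z = point
    return (f_prime * x / z, f_prime * y / z)

# Many-to-one: any point k * (x, y, z) on the same ray has the same image.
p_near = (1.0, 2.0, 4.0)
p_far = tuple(3.0 * c for c in p_near)
same_ray_images = (project(p_near), project(p_far))

def image_height(depth, f_prime=1.0):
    """Image height of a unit-height vertical segment placed at a given depth."""
    top = project((0.0, 1.0, depth), f_prime)
    bottom = project((0.0, 0.0, depth), f_prime)
    return top[1] - bottom[1]

# Far-away objects appear smaller: same segment at depth 2 and at depth 10.
heights = (image_height(2.0), image_height(10.0))

print(same_ray_images)
print(heights)
```

Both points on the ray land on the same image point, and the segment at the larger depth has the smaller image height.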
Homogeneous coordinates
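Is this a linear transformation?
• No: division by z is nonlinear
Trick: add one more coordinate:
$(x, y) \Rightarrow [x,\ y,\ 1]^T$ (homogeneous image coordinates)
$(x, y, z) \Rightarrow [x,\ y,\ z,\ 1]^T$ (homogeneous scene coordinates)
Converting from homogeneous coordinates:
$[x,\ y,\ w]^T \Rightarrow (x/w,\ y/w)$ and $[x,\ y,\ z,\ w]^T \Rightarrow (x/w,\ y/w,\ z/w)$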
Slide by Steve Seitz
Perspective Projection Matrix
• Projection is a matrix multiplication using
homogeneous coordinates:
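$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/f' & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x \\ y \\ z/f' \end{bmatrix} \Rightarrow \left(f' \frac{x}{z},\ f' \frac{y}{z}\right)$$
Divide by the third coordinate to convert back to non-homogeneous coordinates.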
Complete mapping from world points to image pixel
positions?
Slide by Steve Seitz
Perspective projection & calibration
• Perspective equations so far in terms of
camera’s reference frame….
• Camera’s intrinsic and extrinsic parameters
needed to calibrate geometry.
Camera
frame
Perspective projection & calibration
[Figure: world frame and camera frame]
• Extrinsic: camera frame ↔ world frame
• Intrinsic: image coordinates relative to camera ↔ pixel coordinates
2D point (3x1) = camera-to-pixel coord. trans. matrix (3x3) × perspective projection matrix (3x4) × world-to-camera coord. trans. matrix (4x4) × 3D point (4x1)
Weak perspective
• Approximation: treat magnification as constant
• Assumes scene depth << average distance to
camera
[Figure: image plane and world points]
Orthographic projection
• Given camera at constant distance from scene
• World points projected along rays parallel to
optical axis
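To see when the constant-magnification approximation holds, the two models can be compared on a shallow, distant scene and on a deep, nearby one. This is a sketch under stated assumptions: the reference depth `z_ref` is taken to be the mean scene depth (the slides do not say how it is chosen), and all function names and sample points are illustrative.

```python
def perspective(point, f=1.0):
    """Full perspective: each point is scaled by its own depth z."""
    x, y, z = point
    return (f * x / z, f * y / z)

def weak_perspective(point, z_ref, f=1.0):
    """Weak perspective: one constant magnification f / z_ref for all points."""
    x, y, _ = point
    m = f / z_ref
    return (m * x, m * y)

def orthographic(point):
    """Orthographic: project along rays parallel to the optical axis (drop z)."""
    x, y, _ = point
    return (x, y)

def max_weak_perspective_error(points, f=1.0):
    """Largest coordinate difference between the two models over a scene."""
    z_ref = sum(p[2] for p in points) / len(points)  # assumption: mean depth
    worst = 0.0
    for p in points:
        a = perspective(p, f)
        b = weak_perspective(p, z_ref, f)
        worst = max(worst, abs(a[0] - b[0]), abs(a[1] - b[1]))
    return worst

# Scene depth (1 unit) << distance (about 100 units): approximation is good.
shallow_far = [(1.0, 1.0, 100.0), (1.0, 1.0, 101.0)]
# Scene depth (4 units) comparable to distance: approximation is poor.
deep_near = [(1.0, 1.0, 2.0), (1.0, 1.0, 6.0)]

print(max_weak_perspective_error(shallow_far))
print(max_weak_perspective_error(deep_near))
```

The error is tiny for the shallow, distant scene and large for the deep, nearby one, which matches the stated assumption behind weak perspective.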
Pinhole size / aperture
How does the size of the aperture affect the
image we’d get?
Larger
Smaller
Adding a lens
focal point
• A lens focuses light onto the film
– Rays passing through the center are not deviated
– All parallel rays converge to one point on a plane
located at the focal length f
Slide by Steve Seitz
Pinhole vs. lens
Cameras with lenses
focal point
optical center
(Center Of Projection)
• A lens focuses parallel rays onto a single focal
point
• Gather more light, while keeping focus; make
pinhole perspective projection practical
Camera Parameters
Imaging Geometry
[Figure: object of interest in the world coordinate system (U, V, W)]
Imaging Geometry
Camera coordinate system (X, Y, Z)
• Z is the optic axis
• The image plane is located f units out along the optic axis
• f is called the focal length
Imaging Geometry
[Figure: world axes (U, V, W), camera axes (X, Y, Z), and film axes (x, y)]
Forward projection onto the image plane:
3D (X, Y, Z) is projected to 2D (x, y)
Imaging Geometry
[Figure: film axes (x, y) together with pixel axes (u, v)]
Our image gets digitized into pixel coordinates (u, v)
Imaging Geometry
[Figure: world coordinates (U, V, W), camera coordinates (X, Y, Z), image (film) coordinates (x, y), and pixel coordinates (u, v)]
Forward Projection
World coords (U, V, W) → Camera coords (X, Y, Z) → Film coords (x, y) → Pixel coords (u, v)
We want a mathematical model to describe
how 3D World points get projected into 2D
Pixel coordinates.
Our goal: describe this sequence of
transformations by a big matrix equation!
Backward Projection
World coords (U, V, W) ← Camera coords (X, Y, Z) ← Film coords (x, y) ← Pixel coords (u, v)
Note, much of vision concerns trying to
derive backward projection equations to
recover 3D scene structure from images
(via stereo or motion)
But first, we have to understand forward projection…
Forward Projection
World coords (U, V, W) → Camera coords (X, Y, Z) → Film coords (x, y) → Pixel coords (u, v)
Camera → Film: 3D-to-2D projection
• perspective projection
We will start here in the middle, since we’ve already
talked about this when discussing stereo.
Basic Perspective Projection
Scene point: P = (X, Y, Z)
Image point: p = (x, y, f)
[Figure: scene point P and image point p on the image plane, at distance f from the origin O along the Z axis]
Perspective projection equations:
$$x = f \frac{X}{Z}, \qquad y = f \frac{Y}{Z}$$
Derived via the similar triangles rule:
$$\frac{x}{f} = \frac{X}{Z}, \qquad \frac{y}{f} = \frac{Y}{Z}$$
So how do we represent this as a matrix equation?
We need to introduce homogeneous coordinates.
O.Camps, PSU
Homogeneous Coordinates
Represent a 2D point (x,y) by a 3D point (x’,y’,z’) by
adding a “fictitious” third coordinate.
By convention, we specify that given (x’,y’,z’) we can
recover the 2D point (x,y) as
$$x = \frac{x'}{z'}, \qquad y = \frac{y'}{z'}$$
Note: (x,y) = (x,y,1) = (2x, 2y, 2) = (k x, ky, k)
for any nonzero k (can be negative as well as positive)
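The scale invariance stated in the note can be illustrated with a small conversion routine. This is a minimal sketch; the helper names `to_homogeneous`, `from_homogeneous`, and `scaled` are hypothetical and chosen only for this example.

```python
def to_homogeneous(point):
    """Append a fictitious last coordinate equal to 1."""
    return tuple(point) + (1.0,)

def from_homogeneous(h):
    """Divide by the last coordinate to recover the ordinary point."""
    *coords, last = h
    if last == 0:
        raise ValueError("last homogeneous coordinate must be nonzero")
    return tuple(c / last for c in coords)

def scaled(h, k):
    """Multiply every homogeneous coordinate by the same factor k."""
    return tuple(k * c for c in h)

p = (3.0, 4.0)
h = to_homogeneous(p)

# (x, y, 1), (2x, 2y, 2), and (-5x, -5y, -5) all describe the same 2D point.
recovered = [from_homogeneous(scaled(h, k)) for k in (1.0, 2.0, -5.0)]
print(recovered)
```

Every scaled copy converts back to the same 2D point, including the copy with a negative scale factor.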
Perspective Matrix Equation
(in Camera Coordinates)
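$$x = f \frac{X}{Z}, \qquad y = f \frac{Y}{Z}$$
$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$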
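The matrix above places f in the first two rows, while the earlier projection matrix placed 1/f' in the third row. A short sketch, assuming f and f' denote the same focal length, shows that the two matrices differ only by an overall scale and therefore give the same image point after dividing by the third coordinate. The helper names and the sample point are illustrative.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dehomogenize(h):
    """Divide by the last coordinate and drop it."""
    return tuple(c / h[-1] for c in h[:-1])

f = 2.0  # assumed focal length for the example

# Form used in camera coordinates: f on the diagonal.
M_f = [[f, 0, 0, 0],
       [0, f, 0, 0],
       [0, 0, 1, 0]]

# Earlier form: 1/f in the third row.
M_inv_f = [[1, 0, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 1 / f, 0]]

P = [3.0, 6.0, 4.0, 1.0]  # homogeneous camera-frame point (X, Y, Z, 1)

p1 = dehomogenize(matvec(M_f, P))
p2 = dehomogenize(matvec(M_inv_f, P))
print(p1, p2)  # both equal (f*X/Z, f*Y/Z)
```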
Forward Projection
World coords (U, V, W) → Camera coords (X, Y, Z) → Film coords (x, y) → Pixel coords (u, v)
World → Camera: rigid transformation (rotation + translation)
between world and camera coordinate systems
World to Camera Transformation
[Figure: one point, labeled $P_C$ in the camera frame (X, Y, Z) and $P_W$ in the world frame (U, V, W)]
Avoid confusion: Pw and Pc are not two different
points. They are the same physical point, described
in two different coordinate systems.
World to Camera Transformation
[Figure: camera frame (X, Y, Z) at position C in the world frame (U, V, W), related by rotation R]
• Translate by $-C$ (align origins)
• Rotate to align axes
$$P_C = R\,(P_W - C)$$
Matrix Form, Homogeneous Coords
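$$P_C = R\,(P_W - C)$$
$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & 0 \\ r_{21} & r_{22} & r_{23} & 0 \\ r_{31} & r_{32} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & -c_x \\ 0 & 1 & 0 & -c_y \\ 0 & 0 & 1 & -c_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}$$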
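The homogeneous matrix form can be checked against the direct formula $P_C = R\,(P_W - C)$. The following is a minimal sketch with assumed example values (a 90-degree rotation about the Z axis and an arbitrary camera center); the helper names are illustrative.

```python
def matmul(A, B):
    """Matrix product of A and B, both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(M, v):
    """Multiply matrix M by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Assumed example values (not from the slides).
R = [[0, -1, 0],
     [1, 0, 0],
     [0, 0, 1]]
C = [1, 2, 3]      # camera center in world coordinates
P_W = [4, 6, 3]    # a world point

# Direct form: translate by -C, then rotate.
direct = matvec(R, [p - c for p, c in zip(P_W, C)])

# Homogeneous form: 4x4 rotation times 4x4 translation by -C.
R4 = [row + [0] for row in R] + [[0, 0, 0, 1]]
T4 = [[1, 0, 0, -C[0]],
      [0, 1, 0, -C[1]],
      [0, 0, 1, -C[2]],
      [0, 0, 0, 1]]
M_ext = matmul(R4, T4)
homogeneous = matvec(M_ext, P_W + [1])

print(direct)       # camera coordinates (X, Y, Z)
print(homogeneous)  # same coordinates followed by 1
```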
Example: Simple Stereo System
[Figure: world axes X, Y, Z with scene point (X,Y,Z); left camera located at (0,0,0) and right camera located at (Tx,0,0), each with camera axes x, y, z]
Left camera located at world origin (0,0,0)
and camera axes aligned with world coord axes.
Simple Stereo, Left Camera
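$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix} = \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}$$
• First matrix (rotation): camera axes aligned with world axes
• Second matrix (translation): camera located at world position (0,0,0)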
Simple Stereo Projection Equations
Left camera
Example: Simple Stereo System
[Figure: world axes X, Y, Z with scene point (X,Y,Z); left camera located at (0,0,0) and right camera located at (Tx,0,0), each with camera axes x, y, z]
Right camera located at world location (Tx,0,0)
and camera axes aligned with world coord axes.
Simple Stereo, Right Camera
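$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & -T_x \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix} = \begin{bmatrix} U - T_x \\ V \\ W \\ 1 \end{bmatrix}$$
• First matrix (rotation): camera axes aligned with world axes
• Second matrix (translation): camera located at world position (Tx,0,0)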
Simple Stereo Projection Equations
Left camera
Right camera
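Combining the two world-to-camera transformations with the perspective projection equations gives the image of one world point in each camera. This sketch assumes aligned axes (R is the identity) as in the example; the function names and numeric values are illustrative, and the closing disparity check is an added observation that follows from the equations rather than something stated on the slide.

```python
def world_to_camera_aligned(P_w, C):
    """World to camera coordinates when R is the identity: P_C = P_W - C."""
    return tuple(p - c for p, c in zip(P_w, C))

def project(P_c, f):
    """Perspective projection: x = f X / Z, y = f Y / Z."""
    X, Y, Z = P_c
    return (f * X / Z, f * Y / Z)

f = 1.0                  # assumed focal length
Tx = 0.5                 # assumed baseline: right camera at (Tx, 0, 0)
P_w = (2.0, 1.0, 4.0)    # assumed world point

left = project(world_to_camera_aligned(P_w, (0.0, 0.0, 0.0)), f)
right = project(world_to_camera_aligned(P_w, (Tx, 0.0, 0.0)), f)

# The two images differ only in x, by f * Tx / Z.
disparity = left[0] - right[0]
print(left, right, disparity)
```

Only the x coordinate changes between the two views, and the shift shrinks as the depth of the point grows.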
Bob’s sure-fire way(s) to
figure out the rotation
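$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & 0 \\ r_{21} & r_{22} & r_{23} & 0 \\ r_{31} & r_{32} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & -c_x \\ 0 & 1 & 0 & -c_y \\ 0 & 0 & 1 & -c_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}$$
(forget about the translation matrix while thinking about rotations)
$$P_C = R\,P_W$$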
This equation says how vectors in the world coordinate
system (including the coordinate axes) get transformed
into the camera coordinate system.
Figuring out Rotations
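$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & 0 \\ r_{21} & r_{22} & r_{23} & 0 \\ r_{31} & r_{32} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}, \qquad P_C = R\,P_W$$
What if world x axis (1,0,0) corresponds to camera axis (a,b,c)?
$$\begin{bmatrix} a \\ b \\ c \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & 0 \\ r_{21} & r_{22} & r_{23} & 0 \\ r_{31} & r_{32} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \quad\Rightarrow\quad \begin{bmatrix} a \\ b \\ c \\ 1 \end{bmatrix} = \begin{bmatrix} a & r_{12} & r_{13} & 0 \\ b & r_{22} & r_{23} & 0 \\ c & r_{32} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}$$
We can immediately write down the first column of R!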
Figuring out Rotations
and likewise with world Y axis and world Z axis...
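$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & 0 \\ r_{21} & r_{22} & r_{23} & 0 \\ r_{31} & r_{32} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}$$
Right-hand vector: axis in world coords; left-hand vector: same axis in camera coords
• Column 1 of R: world X axis (1,0,0) in camera coords
• Column 2 of R: world Y axis (0,1,0) in camera coords
• Column 3 of R: world Z axis (0,0,1) in camera coords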
Figuring out Rotations
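Alternative approach: sometimes it is easier to specify
what the camera X, Y, or Z axis is in world coordinates. Then
rearrange the equation as follows.
$$P_C = R\,P_W \quad\Rightarrow\quad R^{-1} P_C = P_W \quad\Rightarrow\quad R^T P_C = P_W$$
$$\begin{bmatrix} r_{11} & r_{21} & r_{31} & 0 \\ r_{12} & r_{22} & r_{32} & 0 \\ r_{13} & r_{23} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}$$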
Figuring out Rotations
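$$\begin{bmatrix} r_{11} & r_{21} & r_{31} & 0 \\ r_{12} & r_{22} & r_{32} & 0 \\ r_{13} & r_{23} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}, \qquad R^T P_C = P_W$$
What if camera X axis (1,0,0) corresponds to world axis (a,b,c)?
$$\begin{bmatrix} r_{11} & r_{21} & r_{31} & 0 \\ r_{12} & r_{22} & r_{32} & 0 \\ r_{13} & r_{23} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \\ 1 \end{bmatrix} \quad\Rightarrow\quad \begin{bmatrix} a & r_{21} & r_{31} & 0 \\ b & r_{22} & r_{32} & 0 \\ c & r_{23} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \\ 1 \end{bmatrix}$$
We can immediately write down the first column of $R^T$
(which is the first row of R).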
Figuring out Rotations
and likewise with camera Y axis and camera Z axis...
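$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & 0 \\ r_{21} & r_{22} & r_{23} & 0 \\ r_{31} & r_{32} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}$$
Right-hand vector: axis in world coords; left-hand vector: same axis in camera coords
• Row 1 of R: camera X axis (1,0,0) in world coords
• Row 2 of R: camera Y axis (0,1,0) in world coords
• Row 3 of R: camera Z axis (0,0,1) in world coords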
Example
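[Figure: x, y, z axes for the two examples]
$$R_{train} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \qquad R_{fly} = \begin{bmatrix} 0 & -1 & 0 \\ 0 & 0 & 1 \\ -1 & 0 & 0 \end{bmatrix}$$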
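A matrix written down column by column (or row by row) in this way should be orthonormal with determinant +1. The sketch below checks that for the two example matrices and reads the axes back out of them; the helper names are hypothetical and the check itself is a standard property of rotation matrices rather than a step shown on the slides.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_rotation(R, tol=1e-9):
    """True if R^T R is the identity and det(R) is +1."""
    RtR = matmul(transpose(R), R)
    orthonormal = all(abs(RtR[i][j] - (1 if i == j else 0)) <= tol
                      for i in range(3) for j in range(3))
    return orthonormal and abs(det3(R) - 1) <= tol

def world_axes_in_camera(R):
    """Columns of R: world X, Y, Z axes expressed in camera coords."""
    return transpose(R)

def camera_axes_in_world(R):
    """Rows of R: camera X, Y, Z axes expressed in world coords."""
    return [list(row) for row in R]

R_train = [[0, 0, 1], [0, -1, 0], [1, 0, 0]]
R_fly = [[0, -1, 0], [0, 0, 1], [-1, 0, 0]]

print(is_rotation(R_train), is_rotation(R_fly))
print(world_axes_in_camera(R_fly)[0])   # world X axis in camera coords
print(camera_axes_in_world(R_fly)[0])   # camera X axis in world coords
```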
Note: External Parameters
also often written as R,T
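$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & 0 \\ r_{21} & r_{22} & r_{23} & 0 \\ r_{31} & r_{32} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & -c_x \\ 0 & 1 & 0 & -c_y \\ 0 & 0 & 1 & -c_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}$$
$$R\,(P_W - C) = R\,P_W - R\,C = R\,P_W + T$$
$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}$$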
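The relation between the two conventions is $T = -R\,C$, which follows from the chain of equalities above. A minimal sketch with assumed example values confirms that both forms give the same camera coordinates; the variable names are illustrative.

```python
def matvec(M, v):
    """Multiply matrix M by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Assumed example values (not from the slides).
R = [[0, -1, 0],
     [1, 0, 0],
     [0, 0, 1]]
C = [1, 2, 3]
P_W = [4, 6, 3]

# T = -R C
T = [-x for x in matvec(R, C)]

# Convention 1: P_C = R (P_W - C)
via_C = matvec(R, [p - c for p, c in zip(P_W, C)])

# Convention 2: P_C = R P_W + T
via_T = [a + t for a, t in zip(matvec(R, P_W), T)]

print(T, via_C, via_T)
```

Note that T is the world origin expressed in camera coordinates, while C is the camera center expressed in world coordinates, so the two vectors are generally not simple negatives of each other.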
Summary
World coords (U, V, W) → Camera coords (X, Y, Z) → Film coords (x, y) → Pixel coords (u, v)
We now know how to transform 3D world
coordinate points into camera coords, and
then do perspective projection to get 2D points
in the film plane.
Next time: pixel coordinates
Recall: Imaging Geometry
[Figure: world coordinates (U, V, W), camera coordinates (X, Y, Z) with optic axis Z and focal length f, image (film) coordinates (x, y), and pixel coordinates (u, v)]
Forward Projection
World coords (U, V, W) → Camera coords (X, Y, Z) → Film coords (x, y) → Pixel coords (u, v)
We want a mathematical model to describe
how 3D World points get projected into 2D
Pixel coordinates.
Our goal: describe this sequence of
transformations by a big matrix equation!
Intrinsic Camera Parameters
World coords (U, V, W) → Camera coords (X, Y, Z) → Film coords (x, y) → Pixel coords (u, v)
Film → Pixel: affine transformation
Intrinsic parameters
• Describes coordinate transformation
between film coordinates (projected image)
and pixel array
• Film cameras: scanning/digitization
• CCD cameras: grid of photosensors
still in T&V section 2.4
Intrinsic parameters (offsets)
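[Figure: film plane (projected image) with axes x, y; pixel array with axes u (col), v (row); offsets ox, oy between the two origins (0,0)]
$$u = f \frac{X}{Z} + o_x, \qquad v = f \frac{Y}{Z} + o_y$$
$o_x$ and $o_y$ are called the image center or principal point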
Intrinsic parameters
sometimes one or more coordinate axes are flipped (e.g. T&V section 2.4)
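[Figure: film plane with axes x, y; pixel array with axes u (col), v (row); offsets ox, oy between the two origins (0,0)]
$$u = \pm f \frac{X}{Z} + o_x, \qquad v = \pm f \frac{Y}{Z} + o_y$$
with a minus sign on the term of each flipped axis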
Intrinsic parameters (scales)
sampling determines how many rows/cols in the image
[Figure: film, scanning resolution; CCD, analog resample; pixel array of C cols x R rows]
Effective scales: $s_x$ and $s_y$
$$u = \frac{1}{s_x} f \frac{X}{Z} + o_x, \qquad v = \frac{1}{s_y} f \frac{Y}{Z} + o_y$$
Note, since we have different scale factors in x and y,
we don’t necessarily have square pixels!
Aspect ratio is sy / sx
O.Camps, PSU
Perspective projection matrix
Adding the intrinsic parameters into the
perspective projection matrix:
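$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} f/s_x & 0 & o_x & 0 \\ 0 & f/s_y & o_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
To verify:
$$u = \frac{x'}{z'} = \frac{1}{s_x} f \frac{X}{Z} + o_x, \qquad v = \frac{y'}{z'} = \frac{1}{s_y} f \frac{Y}{Z} + o_y$$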
O.Camps, PSU
Note:
Sometimes, the image and the camera coordinate systems
have opposite orientations: [the book does it this way]
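$$f \frac{X}{Z} = -(u - o_x)\,s_x, \qquad f \frac{Y}{Z} = -(v - o_y)\,s_y$$
$$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} -f/s_x & 0 & o_x & 0 \\ 0 & -f/s_y & o_y & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$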
Note 2
In general, I like to think of the conversion as
a separate 2D affine transformation from film
coords (x,y) to pixel coordinates (u,v):
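$$\begin{bmatrix} u' \\ v' \\ w' \end{bmatrix} = \underbrace{\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ 0 & 0 & 1 \end{bmatrix}}_{M_{aff}} \underbrace{\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}}_{M_{proj}} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
$$u = M_{int}\,P_C = M_{aff}\,M_{proj}\,P_C$$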
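One concrete choice of the affine matrix reproduces the scale-and-offset model from the earlier slides. The sketch below assumes $a_{11} = 1/s_x$, $a_{22} = 1/s_y$, $a_{13} = o_x$, $a_{23} = o_y$, and zero for the remaining entries; that choice, the helper names, and the numeric values are assumptions for illustration, since the slide leaves the $a_{ij}$ general.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Assumed example intrinsics.
f, s_x, s_y, o_x, o_y = 2.0, 0.5, 0.25, 320, 240

# Film -> pixel: scale by 1/s and shift by the principal point.
M_aff = [[1 / s_x, 0, o_x],
         [0, 1 / s_y, o_y],
         [0, 0, 1]]

# Camera -> film: perspective projection.
M_proj = [[f, 0, 0, 0],
          [0, f, 0, 0],
          [0, 0, 1, 0]]

M_int = matmul(M_aff, M_proj)

# The product equals the combined intrinsic matrix with f/s_x and f/s_y.
expected = [[f / s_x, 0, o_x, 0],
            [0, f / s_y, o_y, 0],
            [0, 0, 1, 0]]

P_C = [1.0, 1.0, 2.0, 1.0]           # homogeneous camera-frame point
u_h, v_h, w_h = matvec(M_int, P_C)
u, v = u_h / w_h, v_h / w_h          # divide by the third coordinate
print(M_int == expected, (u, v))
```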
Huh?
Did he just say it was “a fine” transformation?
No, it was “affine” transformation, a type of
2D to 2D mapping defined by 6 parameters.
More on this in a moment...
Summary : Forward Projection
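World coords (U, V, W) → [$M_{ext}$] → Camera coords (X, Y, Z) → [$M_{proj}$] → Film coords (x, y) → [$M_{aff}$] → Pixel coords (u, v)
World coords (U, V, W) → [$M_{ext}$] → Camera coords (X, Y, Z) → [$M_{int}$] → Pixel coords (u, v)
World coords (U, V, W) → [$M$] → Pixel coords (u, v)
$$M = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{bmatrix}$$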
Summary: Projection Equation
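$$\begin{bmatrix} u' \\ v' \\ w' \end{bmatrix} = \underbrace{\underbrace{M_{aff}\,M_{proj}}_{M_{int}}\,M_{ext}}_{M} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}$$
• $M_{aff}$: film plane to pixels
• $M_{proj}$: perspective projection
• $M_{ext}$: world to camera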
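The whole chain can be exercised end to end by composing the three matrices into one 3x4 matrix M and comparing it with a step-by-step computation. This is a sketch under stated assumptions: the rotation, camera center, intrinsics, and the specific affine matrix (scales and offsets only) are example values chosen for illustration, not values from the slides.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Assumed extrinsics: rotation R and camera center C in world coordinates.
R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
C = [1, 2, -5]
R4 = [row + [0] for row in R] + [[0, 0, 0, 1]]
T4 = [[1, 0, 0, -C[0]], [0, 1, 0, -C[1]], [0, 0, 1, -C[2]], [0, 0, 0, 1]]
M_ext = matmul(R4, T4)

# Assumed intrinsics.
f, s_x, s_y, o_x, o_y = 2.0, 0.5, 0.25, 320, 240
M_proj = [[f, 0, 0, 0], [0, f, 0, 0], [0, 0, 1, 0]]
M_aff = [[1 / s_x, 0, o_x], [0, 1 / s_y, o_y], [0, 0, 1]]

# Single 3x4 matrix for the whole forward projection.
M = matmul(M_aff, matmul(M_proj, M_ext))

P_W = [4, 6, 3, 1]                    # homogeneous world point
u_h, v_h, w_h = matvec(M, P_W)
u, v = u_h / w_h, v_h / w_h

# Step-by-step check: world -> camera -> film -> pixel.
X, Y, Z, _ = matvec(M_ext, P_W)
x, y = f * X / Z, f * Y / Z
u_step, v_step = x / s_x + o_x, y / s_y + o_y

print((u, v), (u_step, v_step))
```

Both routes give the same pixel coordinates, which is the point of collecting the sequence of transformations into one matrix equation.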
Intro to Image Mappings
Image Mappings
Overview
from R.Szeliski
Geometric Image Mappings
[Figure: image → geometric transformation → transformed image, taking (x,y) to (x’,y’)]
x’ = f(x, y, {parameters})
y’ = g(x, y, {parameters})
Linear Transformations
(Can be written as matrices)
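[Figure: image → geometric transformation → transformed image, taking (x,y) to (x’,y’)]
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = M(\text{params}) \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$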
Translation
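[Figure: unit square translated by (tx, ty)]
equations: $x' = x + t_x, \quad y' = y + t_y$
matrix form: $\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
Scale
[Figure: unit square scaled by S]
equations: $x' = S\,x, \quad y' = S\,y$
matrix form: $\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} S & 0 & 0 \\ 0 & S & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
Rotation
[Figure: unit square rotated by angle $\theta$ about the origin]
equations: $x' = x\cos\theta - y\sin\theta, \quad y' = x\sin\theta + y\cos\theta$
matrix form: $\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
Euclidean (Rigid)
[Figure: unit square rotated by angle $\theta$ and translated by (tx, ty)]
equations: $x' = x\cos\theta - y\sin\theta + t_x, \quad y' = x\sin\theta + y\cos\theta + t_y$
matrix form: $\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & t_x \\ \sin\theta & \cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$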
Partitioned Matrices
http://planetmath.org/encyclopedia/PartitionedMatrix.html
Partitioned Matrices
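matrix form, with block sizes: $\begin{bmatrix} 2{\times}1 \\ 1{\times}1 \end{bmatrix} = \begin{bmatrix} 2{\times}2 & 2{\times}1 \\ 1{\times}2 & 1{\times}1 \end{bmatrix} \begin{bmatrix} 2{\times}1 \\ 1{\times}1 \end{bmatrix}$
equation form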
Another Example (from last time)
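$$\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} U \\ V \\ W \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} P_C \\ 1 \end{bmatrix} = \begin{bmatrix} R & T \\ 0 & 1 \end{bmatrix} \begin{bmatrix} P_W \\ 1 \end{bmatrix}, \qquad \text{block sizes: } \begin{bmatrix} 3{\times}1 \\ 1{\times}1 \end{bmatrix} = \begin{bmatrix} 3{\times}3 & 3{\times}1 \\ 1{\times}3 & 1{\times}1 \end{bmatrix} \begin{bmatrix} 3{\times}1 \\ 1{\times}1 \end{bmatrix}$$
$$P_C = R\,P_W + T$$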
Similarity (scaled Euclidean)
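[Figure: unit square scaled by S, rotated by angle $\theta$, and translated by (tx, ty)]
equations: $x' = S\,(x\cos\theta - y\sin\theta) + t_x, \quad y' = S\,(x\sin\theta + y\cos\theta) + t_y$
matrix form: $\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} S\cos\theta & -S\sin\theta & t_x \\ S\sin\theta & S\cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
Affine
[Figure: unit square mapped to a parallelogram]
equations: $x' = a_{11} x + a_{12} y + a_{13}, \quad y' = a_{21} x + a_{22} y + a_{23}$
matrix form: $\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
Projective
[Figure: unit square mapped to a general quadrilateral]
equations: $x' = \dfrac{h_{11} x + h_{12} y + h_{13}}{h_{31} x + h_{32} y + h_{33}}, \quad y' = \dfrac{h_{21} x + h_{22} y + h_{23}}{h_{31} x + h_{32} y + h_{33}}$
matrix form: $\begin{bmatrix} w\,x' \\ w\,y' \\ w \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
Note! The result must be divided by its third coordinate $w$.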
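The family of 2D mappings above can be written as 3x3 matrices acting on homogeneous points. The following sketch assumes the usual conventions (counterclockwise rotation by theta, a uniform scale s, and division by the third coordinate after every multiplication); the function names and the sample projective matrix are hypothetical and serve only as an illustration.

```python
import math

def apply(M, point):
    """Apply a 3x3 matrix to a 2D point using homogeneous coordinates."""
    x, y = point
    xh, yh, w = (row[0] * x + row[1] * y + row[2] for row in M)
    return (xh / w, yh / w)  # w is 1 except in the projective case

def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def scale(s):
    return [[s, 0, 0], [0, s, 0], [0, 0, 1]]

def similarity(s, theta, tx, ty):
    """Scaled rotation plus translation: 4 parameters."""
    c, sn = math.cos(theta), math.sin(theta)
    return [[s * c, -s * sn, tx], [s * sn, s * c, ty], [0, 0, 1]]

def rotation(theta):
    return similarity(1.0, theta, 0.0, 0.0)

def euclidean(theta, tx, ty):
    """Rotation plus translation: 3 parameters."""
    return similarity(1.0, theta, tx, ty)

def affine(a11, a12, a13, a21, a22, a23):
    """General affine map: 6 parameters, last row fixed to (0, 0, 1)."""
    return [[a11, a12, a13], [a21, a22, a23], [0, 0, 1]]

# Projective example: the last row is no longer (0, 0, 1), so w != 1.
H = [[1, 0, 0], [0, 1, 0], [0.5, 0, 1]]

print(apply(translation(2, 3), (1, 1)))
print(apply(euclidean(math.pi / 2, 2, 3), (1, 0)))
print(apply(H, (2, 2)))
```

Each class contains the previous ones as special cases, which is why `rotation` and `euclidean` here are written as restricted calls to `similarity`.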
Summary of 2D Transformations
• Euclidean
• Similarity
• Affine
• Projective
from R.Szeliski