Intro to Computer Vision Course

This document provides an overview of an introductory computer vision course. It discusses the instructor, course topics including low-level vision, geometry, recognition, and light/color. It also outlines course requirements, projects, and grading.

CS5670: Intro to Computer Vision

Instructor: Noah Snavely


Instructor
• Noah Snavely (snavely@cs.cornell.edu)

• Research interests:
– Computer vision and graphics
– 3D reconstruction and visualization of Internet
photo collections
– Deep learning for computer graphics
– Virtual reality video
Today
1. What is computer vision?

2. Course overview

3. Image filtering
Today
• Readings
– Szeliski, Chapter 1 (Introduction)
Every image tells a story
• Goal of computer vision:
perceive the “story”
behind the picture
• Compute properties of
the world
– 3D shape
– Names of people or
objects
– What happened?
The goal of computer vision
Can the computer match human
perception?
• Yes and no (mainly no)
– computers can be better at
“easy” things
– humans are much better at
“hard” things

• But huge progress has been made
– Accelerating in the last 4
years due to deep learning
– What is considered “hard”
keeps changing
Human perception has its
shortcomings

Sinha and Poggio, Nature, 1996


But humans can tell a lot about a
scene from a little information…

Source: “80 million tiny images” by Torralba, et al.


The goal of computer vision
• Compute the 3D shape of the world
The goal of computer vision
• Recognize objects and people

Terminator 2, 1991
slide credit: Fei-Fei, Fergus & Torralba
Labeled image regions: sky, building, flag, face, banner, wall, street lamp, bus, bus, cars
slide credit: Fei-Fei, Fergus & Torralba


The goal of computer vision
• “Enhance” images
The goal of computer vision
• Forensics

Source: Nayar and Nishino, “Eyes for Relighting”


The goal of computer vision
• Improve photos (“Computational Photography”)

Low-light photography (credit: Hasinoff et al., SIGGRAPH ASIA 2016)

Super-resolution (source: 2d3)

Inpainting / image completion (image credit: Hays and Efros)


Why study computer vision?
• Billions of images/videos captured per day

• Huge number of useful applications


• The next slides show the current state of the art
Optical character recognition (OCR)
• If you have a scanner, it probably came with OCR software

Digit recognition, AT&T labs
http://www.research.att.com/~yann/

License plate readers
http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

Sudoku grabber
http://sudokugrab.blogspot.com/

Source: S. Seitz
Automatic check processing
Face detection

• Nearly all cameras detect faces in real time
– (Why?)
Face recognition

Who is she?
Source: S. Seitz


Vision-based biometrics

“How the Afghan Girl was Identified by Her Iris Patterns” Read the story

Source: S. Seitz
Leaf Recognition
Bird Identification

Merlin Bird ID (based on Cornell Tech technology!)


Special effects: camera tracking

Boujou, 2d3
Special effects: shape capture

The Matrix movies, ESC Entertainment, XYZRGB, NRC


Source: S. Seitz
Special effects: motion capture

Pirates of the Caribbean, Industrial Light and Magic
Source: S. Seitz


3D face tracking w/ consumer cameras

Snapchat Lenses

Face2Face system (Thies et al.)


Sports

Sportvision first down line


Nice explanation on www.howstuffworks.com

Source: S. Seitz
Vision-based interaction (and games)

Assistive technologies

Nintendo Wii has camera-based IR tracking built in. See Lee’s work at
CMU on clever tricks for using it to create a multi-touch display!
Kinect
Smart cars

• Mobileye
• Tesla Autopilot
• Safety features in many high-end cars
Self-driving cars

Google Waymo
Robotics

NASA’s Mars Curiosity Rover
https://en.wikipedia.org/wiki/Curiosity_(rover)

Amazon Picking Challenge
http://www.robocup2016.org/en/events/amazon-picking-challenge/

Amazon Prime Air


Medical imaging

Image guided surgery


3D imaging
Grimson et al., MIT
MRI, CT

Source: S. Seitz
Virtual & Augmented Reality

6DoF head tracking
Hand & body tracking
3D scene understanding
3D-360 video capture


My own work
• Automatic 3D reconstruction from Internet
photo collections
“Statue of Liberty” “Half Dome, Yosemite” “Colosseum, Rome”

Flickr photos

3D model
Photosynth
City-scale reconstruction

Reconstruction of Dubrovnik, Croatia, from ~40,000 images


Current state of the art
• You just saw examples of current systems.
– Most of these are less than 5 years old

• This is a very active research area, and rapidly changing
– Many new apps in the next 5 years

• To learn more about vision applications and companies
– David Lowe maintains an excellent overview of vision companies
• http://www.cs.ubc.ca/spider/lowe/vision.html
Why is computer vision difficult?

Viewpoint variation

Scale
Illumination
Why is computer vision difficult?

Motion (Source: S. Lazebnik)


Intra-class variation

Background clutter
Occlusion


Challenges: local ambiguity

slide credit: Fei-Fei, Fergus & Torralba


But there are lots of cues we can exploit…

Source: S. Lazebnik
Bottom line
• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a
particular 2D picture

– We often need to use prior knowledge about the
structure of the world
Image source: F. Durand
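
The ambiguity described above can be illustrated with a minimal sketch of perspective projection. This assumes an idealized pinhole camera at the origin looking down the +Z axis; the function name `project` and the `focal_length` parameter are illustrative, not taken from the course materials. Two different 3D points on the same ray through the camera center land on the same 2D image point, so depth cannot be recovered from one picture alone.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) onto the image plane of an ideal pinhole camera.

    Assumes the camera sits at the origin and looks along +Z (an illustrative setup).
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


# Two different scene points on the same viewing ray: the second is twice as far away.
near_point = (1.0, 2.0, 4.0)
far_point = (2.0, 4.0, 8.0)

near_pixel = project(near_point)
far_pixel = project(far_point)

# Both points project to the same image location, so the image alone is ambiguous.
print(near_pixel, far_pixel, near_pixel == far_pixel)
```

Prior knowledge (for example, the expected size of an object) is one way to choose among the many scenes consistent with the same image.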
CS5670: Introduction to Computer
VIsion
Teaching Assistant
• Zhengqi Li
(zl548@cornell.edu)

• Office hours:
When: TuTh 3:30 – 5pm
Where: Bear Hug
(starting next week)
Important notes
• Textbook:
Rick Szeliski, Computer Vision: Algorithms and
Applications
online at: http://szeliski.org/Book/

• Course webpage:
http://www.cs.cornell.edu/courses/cs5670/2017sp/

• Announcements/grades via Piazza/CMS


https://piazza.com/class#fall2013/cs46705670
https://cms.csuglab.cornell.edu/
Course requirements
• Prerequisites—these are essential!
– Data structures
– A good working knowledge of Python programming
– Linear algebra
– Vector calculus

• Course does not assume prior imaging experience


– computer vision, image processing, graphics, etc.
Course overview (tentative)
1. Low-level vision
– image processing, edge detection,
feature detection, cameras, image
formation
2. Geometry and algorithms
– projective geometry, stereo,
structure from motion, Markov
random fields
3. Recognition
– face detection / recognition,
category recognition, segmentation
4. Light, color, and reflectance
1. Low-level vision
• Basic image processing and image formation

Filtering, edge detection (figure: image * kernel = filtered image)
Feature extraction
Image formation
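
As a rough illustration of the filtering and edge detection topics listed above, here is a minimal pure-Python sketch. It uses cross-correlation over the "valid" region only (no padding), which matches convolution for symmetric kernels such as the box filter. The helper name `correlate2d_valid`, the test image, and both kernels are assumptions chosen for illustration, not the course's own code.

```python
def correlate2d_valid(image, kernel):
    """Slide the kernel over the image and sum the elementwise products.

    Only positions where the kernel fits entirely inside the image are computed.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            total = 0.0
            for j in range(kh):
                for i in range(kw):
                    total += kernel[j][i] * image[y + j][x + i]
            row.append(total)
        out.append(row)
    return out


# A tiny image with a vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 0, 10, 10, 10] for _ in range(3)]

box = [[1.0 / 9.0] * 3 for _ in range(3)]   # 3x3 mean filter (smoothing)
gradient = [[-1, 0, 1]]                      # horizontal difference (edge response)

smoothed = correlate2d_valid(step, box)
edges = correlate2d_valid(step, gradient)

print(smoothed)  # the step becomes a gradual ramp
print(edges)     # nonzero only around the step
```

The smoothing kernel spreads the step over several pixels, while the difference kernel responds only where intensity changes.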


Project: Hybrid images from image
pyramids

Gaussian pyramid levels: Gaussian 1/2, G 1/4, G 1/8
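
The pyramid levels named above (1/2, 1/4, 1/8 resolution) can be sketched as repeated blur-then-subsample steps. This is a minimal pure-Python sketch under stated assumptions: it approximates the Gaussian with the small separable kernel [1, 2, 1] / 4, replicates border pixels, and uses hypothetical helper names. The actual project may specify a different kernel or border handling.

```python
def blur_1d(values):
    """Blur a sequence with the kernel [1, 2, 1] / 4, replicating the border values."""
    n = len(values)
    out = []
    for i in range(n):
        left = values[max(i - 1, 0)]
        right = values[min(i + 1, n - 1)]
        out.append((left + 2 * values[i] + right) / 4.0)
    return out


def blur_2d(img):
    """Apply the 1D blur along rows, then along columns (separable filtering)."""
    rows = [blur_1d(r) for r in img]
    cols = [blur_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def downsample(img):
    """Keep every second pixel in each direction."""
    return [row[::2] for row in img[::2]]


def gaussian_pyramid(img, levels):
    """Return [original, 1/2, 1/4, ...] with `levels` reduced copies."""
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(downsample(blur_2d(pyramid[-1])))
    return pyramid


# An 8x8 checkerboard test image (illustrative data only).
image = [[float((x + y) % 2) * 255 for x in range(8)] for y in range(8)]
pyr = gaussian_pyramid(image, 3)
sizes = [(len(level), len(level[0])) for level in pyr]
print(sizes)
```

Blurring before subsampling reduces aliasing; skipping the blur would keep only one color of the checkerboard at the next level.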
Project: Feature detection and matching
2. Geometry

Projective geometry
Stereo

Multi-view stereo
Structure from motion


Project: Creating panoramas
Project: Photometric Stereo
3. Recognition

Face detection and recognition


Single instance recognition

Category recognition
Sources: D. Lowe, L. Fei-Fei
Project: Deep Learning for Recognition
4. Light, color, and reflectance

Light & Color
Reflectance


Grading
• Occasional quizzes (at the beginning of class)
• One prelim, one final exam
– (considering final project instead of exam)

• Rough grade breakdown:
– Quizzes + class evaluation: ~5%
– Midterm: 15-20%
– Programming projects: 40-50%
– Final exam: 15-20%
Late policy

• Three free “slip days” will be available for the
semester

• Late projects will be penalized by 5% for the first
late day and 10% for each additional late day,
and no extra credit will be awarded.
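
The penalty arithmetic above can be worked through with a small sketch. The slides do not say how slip days and penalties interact, so this assumes that remaining slip days are consumed first, that the percentages are points deducted from the project score, and that the deduction is capped at 100%. The function name `late_penalty` is hypothetical.

```python
def late_penalty(days_late, slip_days_left=3):
    """Return (penalty_percent, slip_days_remaining) for one project.

    Assumptions (not stated in the slides): slip days are used before any
    penalty applies, and the total penalty is capped at 100 percent.
    """
    if days_late < 0 or slip_days_left < 0:
        raise ValueError("days_late and slip_days_left must be non-negative")
    slip_used = min(days_late, slip_days_left)
    penalized_days = days_late - slip_used
    if penalized_days == 0:
        penalty = 0
    else:
        # 5% for the first penalized day, 10% for each day after that.
        penalty = 5 + 10 * (penalized_days - 1)
    return min(penalty, 100), slip_days_left - slip_used


print(late_penalty(2))      # covered entirely by slip days
print(late_penalty(4))      # three slip days, then one penalized day
print(late_penalty(3, 0))   # no slip days left: 5 + 10 + 10
```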
Academic Integrity
• Assignments will be done solo or in pairs (we’ll
let you know for each project)

• Please do not leave any code public on GitHub
(or the like) at the end of the semester!

• Please see the Cornell Code of Academic
Integrity (http://cuinfo.cornell.edu/aic.cfm)
Questions?
