Research Activities in Digital
Photogrammetry at The Ohio State University
A Collection of Papers Presented
at the XVII Congress of ISPRS
Toni Schenk
Editor
Report No. 418
Department of Geodetic Science and Surveying
The Ohio State University
Columbus, Ohio 43210-1247
July 1992

Foreword
Most of the research efforts in photogrammetry are now directed toward digital photogrammetry. The arrival of digital photogrammetric workstations is a clear demonstration of the considerable success that has been achieved in this rapidly developing subfield of photogrammetry.
At The Ohio State University we embarked on research in digital photogrammetry six years ago. From a rather small group with no special equipment we have grown: two faculty, several postdoctoral researchers and fifteen PhD students are now actively involved in digital photogrammetry research projects. Our laboratories are equipped with a high-performance softcopy workstation (Intergraph ImageStation), several UNIX workstations, image processing systems, digital cameras and scanners, all networked together.
It is with great pleasure that I serve as editor of this report. I have gently persuaded the majority of my advisees to submit a paper to the ISPRS Congress 1992. This report is a collection of those contributions. The research in my group is primarily focused on automating photogrammetric processes. Specifically, we are working on surface reconstruction, feature extraction and recognition, and on automated aerotriangulation.
By surface reconstruction I refer not only to automatic DEM collection but also to segmentation processes whose goal is to group the surface into breaklines and smooth patches to support the subsequent processes of object recognition. Several papers contribute toward that goal.
The first two contributions are my invited papers for the ISPRS Congress 1992. They may serve as a framework within which all the other contributions fit. The first paper summarizes the most important concepts and issues of computer vision and relates them to digital photogrammetry. The second paper builds on this overview and focuses on conceptual and algorithmic aspects. There is some repetition because the presentations will not address the same audience.
Zong's paper is concerned with matching edges, in our case zero-crossings. She continued work originally initiated by Jin-Cheng Li. The idea is to find corresponding edges by positioning the template on an edge in one image and by finding the corresponding edge by cross-correlation. The matching results are checked for continuity as it is unlikely that discontinuities occur along edges.
Matched edges are irregularly distributed in object space. Thus, the problem of interpolating the surface arises. Al-Tahir investigates surface fitting methods. The thin plate method with weak continuity constraints is of particular interest because it allows detecting breaklines. They are compared with the position of edges which are also potential breaklines. It now becomes possible to verify the hypothesis about breaklines and to use this information on the next level of matching.
Wang analyzes the interpolated surface for objects of a certain vertical dimension, called humps. The surface is segmented into regions of similar elevations followed by comparing the shapes of their boundaries. The boundaries are grouped and classified into near horizontal and vertical edges. Hump detection is important for reconstructing surfaces in large-scale urban areas.
One of the reasons for the astounding capability of the human visual system to reconstruct surfaces is its ability to integrate several depth cues, e.g., apparent size, perspective, motion and texture. Lee's paper is concerned with segmenting the image by analyzing texture. Surface orientation and texture are very closely related.
Most surface reconstruction methods adopt a hierarchical approach, for example by constructing image pyramids. Stefanidis examines the hierarchical approach with regard to scale space theory. He explores the relationship between images and surfaces since both can be represented in scale space.
The goal of our OSU surface reconstruction system is to segment the surface into smooth patches and breaklines and to represent them by a symbolic description. This step is important for the subsequent task of object recognition. Krupnik groups matched edges in the object space into straight lines and regular curves. He compares different methods that allow 3-D segmentation.
A fundamental task that occurs at all levels of the computer vision paradigm is comparing shapes. Fourier descriptors have long been used for that purpose. However, there is no real quantitative criterion for measuring the similarity of two objects. Tseng employs an innovative approach by embedding shape invariants in a least-squares adjustment procedure that provides not only a superior measure for the goodness of the match but also the transformation parameters between the two shapes.
Late vision processes, such as object recognition and image understanding, are application dependent (or goal-driven, if you prefer) and must incorporate domain-specific knowledge. Al-Garni's work on interpreting landforms with the help of a knowledge-based system is an important contribution to our research since we will have the surface reconstruction system under the control of a knowledge-based system.
This report contains two contributions in the area of automated aerotriangulation, a subject of considerable research interest. Agouris' paper addresses the problem of matching multiple image patches simultaneously. This important step corresponds to the classical procedure of transferring and measuring points. Considering the notorious problem with point transfer, one can expect a significant increase in reliability from multiple image matching. My paper describes general mathematical models which are suitable for matching multiple image patches.
The remaining two papers from Toth deal with analytical plotters and their digital counterparts, softcopy workstations. Both workstation types play an important role in our research. So does Charles Toth, who developed software systems for analytical plotters that are invaluable not only for research but for student laboratories as well. His second paper describes our research efforts on the softcopy workstation to keep the measuring mark (cursor) automatically on the ground. Thus, the operator is relieved from setting the cursor precisely on the ground.
Finally, I want to thank the authors for their contributions. However, this report would not have been possible without the help of Peggy Agouris who spent many night shifts to put everything together. I wish to thank Irene Tesfai who diligently read all the papers. Her comments are appreciated by every writer, none with English as mother tongue. Funding for most of the research reported here was provided in part by the NASA Center for the Commercial Development of Space Component of the Center for Mapping at The Ohio State University.
 
 
 
 
 
 
 
Toni Schenk

TABLE OF CONTENTS
Foreword
1. Machine Vision and Close-Range Photogrammetry
    Toni Schenk
2. Algorithms and Software Concepts for Digital Photogrammetric Workstations
    Toni Schenk
3. Resampling Digital Imagery to Epipolar Geometry
    Woosug Cho, Toni Schenk & Mustafa Madani
4. Aerial Image Matching Based on Zero-Crossings
    Jia Zong, Jin-Cheng Li & Toni Schenk
5. On the Interpolation Problem of Automated Surface Reconstruction
    Raid Al-Tahir & Toni Schenk
6. 3D Urban Area Surface Analysis
    Zhong Wang & Toni Schenk
7. Image Segmentation from Texture Measurement
    Dong-Cheon Lee & Toni Schenk
8. On the Application of Scale Space Techniques in Digital Photogrammetry
    Anthony Stefanidis & Toni Schenk
9. Segmentation of Edges in 3-D Object Space
    Amnon Krupnik & Toni Schenk
10. A Least-Squares Approach to Matching Lines with Fourier Descriptors
    Yi-Hsing Tseng & Toni Schenk
11. Control Strategies for an Expert System to Interpret Landforms
    Abdallah Al-Garni & Toni Schenk
12. Multiple Image Matching
    Peggy Agouris & Toni Schenk
13. Reconstructing Small Surface Patches from Multiple Images
    Toni Schenk & Charles Toth
14. On Matching Image Patches Under Various Geometrical Constraints
    Charles Toth & Toni Schenk
15. A GIS Workstation-Based Analytical Plotter
    Charles Toth & Toni Schenk
 
MACHINE VISION AND CLOSE-RANGE PHOTOGRAMMETRY
Toni Schenk
Department of Geodetic Science and Surveying
The Ohio State University, Columbus, Ohio 43210-1247
USA
ABSTRACT
 
This paper provides an overview of concepts and methods of machine vision as it may pertain to close-range photogrammetry. The ultimate goal of a machine vision system is to recognize objects from one or several 2-D images. This cannot be achieved in one giant step. Intermediate processes and representations are necessary. Usually, the first goal is to reconstruct the 3-D surface of the object space, with emphasis placed on a symbolic description in which surface properties are made explicit. The surface information aids the subsequent object recognition task. The paper concludes with suggestions on how some of the concepts developed in machine vision can (and should!) be employed in digital close-range applications.
 
 
1 INTRODUCTION
Since time immemorial, mankind has been fascinated by the idea of creating a machine that would somehow exhibit mental capabilities. The robot is a typical example of such dreams. With the attempt of endowing computers with information processing capabilities similar to those of humans, researchers in artificial intelligence pursue this dream in modern times. Ever since computers became available, researchers tried to mimic the mental faculty of seeing. The endeavour of machine vision seemed to achieve quick success. Expectations were pushed far beyond what could be delivered and disillusion followed. The problem has been tremendously underestimated, like many other problems tackled by artificial intelligence. We see and interpret scenes without conscious effort; however, this does not mean that the task is easy.
Clearly, the lack of a detailed understanding of vision is the reason why it is so difficult to make computers understand and analyze images. It seems only natural that someone who attempts to solve a vision task should have a good understanding of the human visual system. Admittedly, this view is not shared by every vision researcher.
 
 
As the name suggests, digital photogrammetry deals with digital imagery. Great strides have been made during the last ten years due to the availability of new hardware and software, such as image processing workstations, parallel processing, and increased storage capacity. This in turn spurred much interest in research and development. The arrival of digital photogrammetric workstations is a clear demonstration of the progress achieved.
The goal of digital photogrammetry is to capture images and to store, manipulate and process them automatically. In that regard, digital photogrammetry and machine vision have the same goals. The purpose of this paper is to present the major concepts, methods, solutions and issues of machine vision. This may be a risky enterprise, considering the glut of publications in that field, and the high probability that the machine vision research community would not unanimously agree on what the concepts and issues are.
 
 
We begin with a summary of human vision for it is the measure beyond all bounds. Most of the material presented is based on recent research results. We conclude the section about human vision with Marr's theory about vision because it is the most advanced approach to date. It has been widely accepted by visual psychologists and the machine vision research community.
The exposition of machine vision begins with the paradigm, followed by the most important concepts, methods, and critical issues. This paves the way for comparing digital close-range photogrammetry and machine vision. We elaborate on a few but very important aspects which the two disciplines share and point out where they differ. It is hoped the concluding remarks stimulate discussions on how digital photogrammetry and machine vision can benefit from each other, more than they do now.
2 HUMAN VISION
For an animal or person to respond properly to a changing environment it must detect objects, events and structures. This ability, called perception, requires that a living organism be sensitive to afferent stimuli which carry important information about the environment. Most animals have some visual perception abilities. For people, vision is the most important sense. By the same token, it is by far the most impressive and complicated sense.
We see and analyze our environment continuously, nearly in real-time. That we do this without conscious effort does not imply that we know how we analyze and understand scenes, however. In fact, the lack of a detailed understanding of vision is the reason why it is so difficult to program a computer to analyze and understand images. It seems only natural then that someone who attempts to solve part of this task should have a basic understanding of human vision. Consider the following summary as an exciting journey through the fascinating world of vision. Most of the material presented in the next subsection is from Hubel (1988).
2.1 Neurophysiology of Human Vision
Neurophysiology is concerned with the processes that are performed by specialized tissues and cells of the nervous system (Uttal, 1975). Visual information is processed in various stages at centers of specialized nerve cells, from the retina to the primary visual cortex. The processing centers are connected by the visual pathway which can be thought of as a serial link (see Fig. 1).
 
 
 
 
Fig. 1: Visual pathway; each structure consists of millions of cells. Information is sent to one or several higher order structures. (Figure adapted from Hubel, 1988)
Light is focused on the retina to form an image. Approximately 125 million light sensitive photoreceptors (rods and cones) are unevenly distributed over the entire posterior portion of the eyeball. The retina consists of three layers: photoreceptors, middle layer, and ganglion cells whose axons are bundled together to form the optic nerve. Oddly, light passes through two layers before it reaches the photoreceptors, except for the site of acute vision, the fovea, a region smaller than a millimeter in diameter.
It is tempting to compare the eye with a camera. The analogy must be met with caution, however. First, the quality of the retinal image is far inferior to that of any cheap Instamatic camera. Aberrations of lens and cornea are responsible for considerable distortions. The curvature of the retina causes straight lines in object space to appear curved, disturbing the metrical relationship between image and object space. Moreover, the constant movements of the eye result in a blurred image. While the purpose of the camera is to render a static snapshot of the world, the eye's and brain's purpose is to extract useful information to guide a person's response to an ever changing environment.
How do the ganglion cells respond to incident light and what is reported back to the next processing centers? First, we note that there are far fewer ganglion cells than photoreceptors (the ratio is approximately 1:125). This is a first indication that the retinal image is processed by the cells of the middle layer and the ganglion cells. It also implies that one ganglion cell receives impulses from several photoreceptors.
The receptive field of a ganglion cell refers to those receptors which are "connected" to it. The circular center of a receptive field is surrounded by a ring-shaped region. An on-center ganglion cell reacts most (increases its firing rate) if the center of its receptive field is stimulated, for example by shining a spot of light on the receptors that form the center. The ganglion cell stops firing if the surround region of its receptive field is stimulated, but reacts with a burst of impulses when the stimulus is turned off. Off-center cells exhibit the opposite behavior. For example, if their centers are stimulated, firing is suppressed. Both on-center and off-center cells do not respond if their entire receptive field is evenly illuminated.
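The center-surround behavior described above can be imitated numerically. The following is only a minimal sketch, assuming that the receptive field is modelled as the difference of two Gaussians; the function names, the kernel size and the two standard deviations are hypothetical choices made for illustration and are not taken from the physiological literature cited here.

```python
import numpy as np

def center_surround_kernel(size=15, sigma_center=1.0, sigma_surround=3.0):
    """Hypothetical on-center receptive field: narrow minus wide Gaussian."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x**2 + y**2
    center = np.exp(-r2 / (2.0 * sigma_center**2))
    surround = np.exp(-r2 / (2.0 * sigma_surround**2))
    # Each lobe is normalised to unit sum, so even illumination cancels out.
    return center / center.sum() - surround / surround.sum(), r2

def response(kernel, stimulus):
    """Weighted sum of the light falling on the receptive field."""
    return float((kernel * stimulus).sum())

on_center, r2 = center_surround_kernel()
off_center = -on_center                      # opposite behavior

uniform = np.ones_like(on_center)            # entire field evenly illuminated
spot = (r2 <= 2).astype(float)               # small spot of light on the center
ring = (r2 >= 16).astype(float)              # light on the surround only

for name, stim in [("uniform", uniform), ("spot", spot), ("ring", ring)]:
    print(name, response(on_center, stim), response(off_center, stim))
```

In this toy model the spot excites the on-center kernel, the ring inhibits it, and even illumination produces a response that vanishes up to rounding error, which mirrors the qualitative description in the text.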
We conclude that ganglion cells respond to brightness differences within their receptive fields, that is, to local intensity differences. Receptive fields differ in size. As one would expect, the size is smallest in the fovea and progressively increases further out in the visual field. The light intensity changes, transmitted by the optic nerve, are detected by biological filters of the retina. Campbell and Robson (1968) showed that cells are sensitive to different spatial frequencies, a strong indication that the visual input is processed in multiple independent channels.
In the interest of brevity, we skip the next processing stage, the lateral geniculate body, and shift our attention to the primary visual (striate) cortex, a complex substructure of the cerebral cortex. The visual cortex is topographically organized: an area of about two millimeters square has all the functionality. These areas, self-contained modules of the striate cortex, map out a portion of the visual field. Consequently, if one such area is damaged, the corresponding part of the retinal image is not processed further and the result is local blindness. Neighboring modules do not compensate for the loss. However, the perceptual process of "filling in" completes the missing information by interpolating it from the surrounding area.
 
The specialization of cells in the cortex increases. So does the complexity of their receptive fields. Unlike cells of earlier levels, cortical cells have no circular symmetrical receptive fields, and they respond quite differently too. A simple cell, for example, responds best if a slit of light crosses its receptive field at a specific angle. Changing the orientation and position only slightly evokes no response. Other simple cells respond more strongly if one half of the receptive field is stimulated.
 
The most commonly found cells in the striate cortex are the complex cells. Like simple cells they respond to properly oriented stimuli. However, the cell's firing rate fades out rather quickly unless the stimulus is moved. So, complex cells are movement sensitive; they respond with a barrage of impulses if a properly oriented slit is swept across their receptive fields. Some complex cells are also direction sensitive. That is, it matters in which direction the slit is moved. That a large population of cells is highly sensitive to movements makes a lot of sense, at least from an evolutionary point of view. After all, to react properly and timely to the environment, moving objects should be discovered promptly.
End-stopped cells are further specialized in that they are sensitive to the length of the stimulus. They respond much more strongly if the slit of light ends or changes direction within their receptive field. Thus, they respond best to corners and curvature.
 
So far information from the two eyes was treated separately, even though one cortical hemisphere receives information from both eyes. As photogrammetrists we are professionally interested in stereopsis. The corpus callosum is the site of stereovision. Here, binocular cells are found that respond to depth. Some of these cells only fire if the stimulus is roughly as far away as the distance on which the two eyes are focused (zero parallax). Other cells evoke a brisk barrage of impulses if the stimulus is nearer or farther away than the fixation point. Another characteristic feature of these disparity-tuned cells is that they are also orientation and movement sensitive. As one would expect, they do not respond at all if only one eye is stimulated. Though disparity-tuned cells undoubtedly contribute to stereovision they are just a partial explanation of how we perceive depth. One should bear in mind that stereopsis is only one of several depth cues.
Let us interrupt our journey through the visual system for a moment and recapitulate. What reaches the brain is not an image, but information about changes in the scene, e.g. light intensity differences, their orientation, and movement. The specialization of cells and the complexity of their receptive fields increase. How far will this specialization go? After cells were discovered in the visual area of a monkey that responded to the shape of paws, the notion of a grandmother cell arose. Is there a cell that would respond to grandmother's face?
 
 
2.2 Visual Perception
Visual perception is the ability of humans to organize and interpret visual sensory information. The psychology of human visual perception was dominated in the late 19th century by associationism. It was thought that perception could be explained by associating simple sensations. This was precisely what the Gestalt psychologists attacked most, for their basic tenet was that "the whole is more than the sum of its parts". They argued that the form and structure of sensations and their interrelationships should be taken into account. The Gestaltists thought that this synergism is accomplished by magnetic force fields between brain events. Gestalt psychology has fallen into disrepute, mainly because no evidence was found for the force fields in the brain.
Cognitive psychology adopts a more information theoretical approach where computer models of perceptual processes are legitimate goals for establishing psychological theories. This, together with a more quantitative approach in research, paves the way for "computational perception", results that can be converted to algorithms.
 
 
Perceptual organization
The neurophysiological approach to vision left us with the image decomposed into simple local features, such as edges, corners and some depth information. Such low-level descriptions must be organized into larger perceptual structures. Perceptual organization is the first process of perception (Hock, 1978). It detects groupings and structures in images which in turn are believed to be the input for object recognition and image understanding.
 
The following are examples of a set of criteria for grouping the image and finding associations. Most of these principles have been advocated by the Gestalt psychologists and are known as the Gestalt laws of organization. Proximity groups local features together which are close together. Depth is a very strong cue for proximity. Things with similar disparity values are grouped together and perceived as belonging to the same surface. Similarity groups similar features together. Similarity can override proximity. Common fate groups things together which appear to move together. It can be demonstrated by generating randomly distributed dots and superimposing a copy with a slight shift or rotation. The shift or rotation is clearly perceived. Another Gestalt law is good continuation which emphasizes smooth continuity over abrupt changes. Closure emphasizes a preference for closed figures, and symmetry groups symmetrical features together. Figure-ground separation is quite a strong perceptual organization process.
 
In reality, grouping processes work concurrently on the same image. Two (or more) processes yielding the same interpretation result in a more salient perception. McCafferty and Fryer (1987) showed that a very strong and stable perception results from combining stereo with figure-ground separation.
 
Other perceptual processes
Here, we mention some other powerful perceptual processes which could be used in computational vision.
Filling in or completion is responsible for us not perceiving the world as a patchwork of edges and blobs (as might be concluded from the neurophysiological discussion about vision). A very illustrative example is the blind spot. Close one eye and fix a point with the open eye. Move a pencil with one hand so that it crosses the visual field. When the pencil is imaged at the blind spot, it disappears, as expected. However, you are not left with a black spot; rather the hole in the retinal image is covered (filled in) by the surrounding background. Filling in appears to belong to a more general perceptual process called surface interpolation (Ramachandran, 1992).
Fig. 2(a): Example for virtual lines. Fig. 2(b) demonstrates the phenomenon of illusory contours. The figure is perceived as a square and not as four partial circles.
 
Virtual lines are imaginary lines, linking nearby tokens. Fig. 2(a) is an example. A similar phenomenon is that of illusory contours, investigated by Kanizsa (1979). In Fig. 2(b) we perceive the structure of a square. The four corners are lying on circles. Another (unlikely) interpretation of this figure is four partial circles.
Texture is a very important but not well understood perceptual process. Texture is strongly related to surfaces. Slowly changing texture patterns give a strong perception for surface normals. Julesz studied texture segmentation intensively. He concludes that textured regions cannot be segregated if their first and second-order statistics are identical. In Julesz and Bergen (1989) the notion of textons is introduced. The authors claim that they play a complementary role in human texture segregation.
2.3 Marr's Theory about Vision
The physiological approach to vision answered the question: what happens where? How something happens cannot be fully explained unless the cells' behavior can be described by a complete wiring diagram. For answering the question why single cells respond the way they do, a broader view must be adopted. As Marr put it:

    trying to understand perception by studying only neurons is
    like trying to understand bird flight by studying only feathers:
    It just cannot be done. In order to understand bird flight, we
    have to understand aerodynamics; only then do the structure of
    feathers and the different shapes of birds' wings make sense.
    (Marr, 1982, p. 27)
Marr's theory about vision has strong information processing underpinnings. He argues for understanding an information process, here vision, at three different levels:
computational theory specifies what the visual system must do. It answers the question about the purpose of the computation and the strategy for solutions.
representation and algorithm investigates the representation of input and output and the algorithm that transforms one into the other.
hardware implementation answers the question how the representation and the algorithm can be physically implemented by neurons.
 
 
The tenet of Marr's theory is that the shapes and positions of things can be made explicit from images without knowing what the things are and what role they play. However, this cannot be accomplished in one step, but rather in a sequence of representations designed to facilitate the subsequent construction of physical properties of objects. The three main steps are briefly discussed.
 
Primal sketch
The purpose of the primal sketch is to make intensity changes in the image explicit. Intensity changes, or edges for short, are an important physical property of objects. In the real world edges occur over a wide range of spatial extents. A sharp edge, for example, is manifest within a small area, comprising a few pixels only. On the other hand, a fuzzy edge can only be detected by looking at a much larger area. Marr and Hildreth (1980) propose a sequence of LoG operators to detect edges at various scales. The LoG operator (Laplacian of a Gaussian) is obtained by taking the second derivative of a Gaussian filter. The Laplacian (∇²) is particularly suited because it is direction independent. By varying the standard deviation σ of the Gaussian, the desired sequence, also called multi channel implementation, is obtained. Obviously, the parameter σ determines the spatial extent within which an edge is detected. Edges are identical with the zero-crossing contours that result from intersecting the convolution surface with a plane whose convolution value is zero. Thus, a sharp edge is obtained by convolving the image with a small σ (fine channel), and fuzzy edges result from coarser channels.
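To make the multi channel idea concrete, the following sketch applies the principle to a one-dimensional intensity profile instead of an image. It is a simplified illustration under stated assumptions, not the Marr and Hildreth implementation: in 1-D the LoG reduces to the second derivative of a Gaussian, and the function names, the kernel truncation at 4σ and the threshold `eps` are hypothetical choices.

```python
import numpy as np

def log_kernel_1d(sigma):
    """Second derivative of a Gaussian, sampled and truncated at 4 sigma."""
    half = int(np.ceil(4 * sigma))
    x = np.arange(-half, half + 1, dtype=float)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    k = (x**2 / sigma**4 - 1.0 / sigma**2) * g
    return k - k.mean()          # zero sum: constant regions give no response

def zero_crossings(profile, sigma, eps=1e-9):
    """Indices i where the filtered profile changes sign between i and i+1."""
    k = log_kernel_1d(sigma)
    half = len(k) // 2
    padded = np.pad(profile, half, mode="edge")
    resp = np.convolve(padded, k, mode="valid")
    sign_change = resp[:-1] * resp[1:] < 0
    # Ignore sign flips that involve rounding noise around zero.
    significant = np.minimum(np.abs(resp[:-1]), np.abs(resp[1:])) > eps
    return np.nonzero(sign_change & significant)[0]

n = np.arange(100)
sharp_edge = (n >= 50).astype(float)                 # step between 49 and 50
fuzzy_edge = np.clip((n - 39.5) / 20.0, 0.0, 1.0)    # ramp over 20 samples

for sigma in (1.0, 2.0, 4.0):                        # fine to coarse channel
    print(sigma, zero_crossings(sharp_edge, sigma),
          zero_crossings(fuzzy_edge, sigma))
```

In this toy setting the sharp edge produces a zero-crossing in every channel, whereas the fuzzy edge is only picked up once σ is comparable to the width of the ramp, which is the behavior the paragraph above describes.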
There is much evidence that the human visual system performs the same operations. Cells exist in the cortex that respond to different spatial frequencies. Spatial information is processed in each part of the visual field by five independent channels (Wilson and Bergen, 1978). Actually, the LoG operator is approximated by the difference of two Gaussians of slightly different σ. The two coarser channels have transient properties, responding to fluctuating patterns, while the finer channels respond to stationary objects. The finest channel is related to acute vision.
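Why a difference of two Gaussians approximates the LoG can be seen from a short derivation. The notation below (a normalized two-dimensional Gaussian $G_\sigma$ and a small increment $\Delta\sigma$) is introduced here for illustration and does not appear in the original text.

```latex
G_\sigma(x,y) = \frac{1}{2\pi\sigma^2}
                \exp\!\left(-\frac{x^2+y^2}{2\sigma^2}\right),
\qquad
\nabla^2 G_\sigma = \left(\frac{x^2+y^2}{\sigma^4} - \frac{2}{\sigma^2}\right) G_\sigma ,

\frac{\partial G_\sigma}{\partial \sigma}
   = \left(\frac{x^2+y^2}{\sigma^3} - \frac{2}{\sigma}\right) G_\sigma
   = \sigma \, \nabla^2 G_\sigma ,

G_{\sigma+\Delta\sigma} - G_\sigma
   \approx \Delta\sigma \, \frac{\partial G_\sigma}{\partial \sigma}
   = \sigma \, \Delta\sigma \, \nabla^2 G_\sigma .
```

To first order in $\Delta\sigma$, the difference of two Gaussians is therefore a scaled copy of the LoG, and the zero-crossings of both filters coincide in this approximation.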
The primal sketch is more than just an agglomeration of zero-crossings. Perceptual processes operate on the image as well as on the edges, resulting in a curvilinear organization, virtual lines and groupings. Zero-crossings from different channels are combined, governed by the rule that edges in different channels are localized in space.

2.5-D sketch
Its purpose is to make explicit the orientation and depth of visible surfaces and their discontinuities. The name of this sketch derives from the assumption that it captures a great deal about the relative depths and surface orientations, and local changes and discontinuities, but some aspects are more accurately represented than others.

    Very locally we can easily say from motion or stereopsis
    information whether one point is in front of another. But if we
    try to compare the distances to two surfaces that lie in
    different parts of the visual field, we do very poorly and can do
    this much less accurately than we can compare their surface
    orientations. (Marr, 1982, p. 282)

The 2.5-D sketch is built up from the primal sketch, augmented with information from stereopsis, texture, analysis of motion, and shading. The surface orientation is much more accurate than depth. Only local changes in depth have a comparable accuracy. Discontinuities in depth may arise from stereopsis and occlusion. Occlusion may be specified by the presence of occluded edges in the primal sketch, or by analyzing motion patterns.
The 2.5-D sketch is represented as a set of primitives, depicted as "needles". The length of each needle describes the degree of tilt of that part of the surface, while the orientation of a needle reflects the direction of slant. The distance from the viewer is represented by a scalar quantity.
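A needle representation of this kind can be sketched for a gridded depth map. The code below is an illustration under assumptions: the needle is taken to be the image-plane projection of the unit surface normal, so its length is the sine of the angle between the normal and the viewing direction; the function name and the grid spacing parameter are hypothetical.

```python
import numpy as np

def needle_map(depth, spacing=1.0):
    """Needle length, needle orientation and depth for every grid cell."""
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(depth, spacing)
    # The normal of the surface z = f(x, y) is proportional to (-dz/dx, -dz/dy, 1).
    norm = np.sqrt(dz_dx**2 + dz_dy**2 + 1.0)
    nx, ny = -dz_dx / norm, -dz_dy / norm
    length = np.hypot(nx, ny)           # 0 where the surface faces the viewer
    orientation = np.arctan2(ny, nx)    # direction of the needle in the image
    return length, orientation, depth   # depth is kept as a scalar per cell

# A flat patch and a patch inclined by 45 degrees along the x direction.
yy, xx = np.mgrid[0:5, 0:5].astype(float)
flat_length, _, _ = needle_map(np.zeros((5, 5)))
inclined_length, _, _ = needle_map(xx)
```

Needles shrink to points where the surface faces the viewer and grow as the surface turns away, so a plot of the needles conveys surface orientation at a glance.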
Interpolation procedures are invoked in areas of insufficient information. In areas of low contrast, no edges are present and therefore no depth information. The missing depth information is interpolated from surrounding areas where contrast is present. Another example of an interpolation process is illusory contours (see Fig. 2(b)).
The 2.5-D sketch is the end product of early vision processes, solely derived from images, without support from late vision or knowledge of the scene. The early vision processes are modular; they work in parallel and independently of one another. The segmentation problem is implicitly solved by making explicit the discontinuities between different surfaces.
3-D Model representation
The purpose of this last step is to describe shapes and their spatial organization in an object-centered coordinate system. Marr and Nishihara (1978) suggest a modular organization of shape descriptions in a coordinate frame which is determined by the shape itself (canonical coordinate frame). The modular organization allows a description that is independent of the degree of detail with which an object is described.
The theory is restricted to a set of generalized
cones. A generalized cone is obtained by
moving a cross section of constant shape but
variable size along an axis. A vase is a good
example of a generalized cone. An object may
consist of several generalized cones, each with
its own axis. All axes of one object form the
component axes of that object.
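The definition of a generalized cone lends itself to a small sketch. It assumes a circular cross section and a straight axis along z, which is only one special case of the definition above. The function names and the particular vase profile are hypothetical.

```python
import math

def generalized_cone(radius, length, n_axis=5, n_around=8):
    """Sample surface points of a generalized cone.

    Assumptions: circular cross section (constant shape), straight axis along z.
    radius: function mapping the axis parameter t in [0, 1] to the cross-section size.
    """
    points = []
    for i in range(n_axis):
        t = i / (n_axis - 1)   # position along the axis
        r = radius(t)          # variable size of the cross section
        z = t * length
        for k in range(n_around):
            phi = 2.0 * math.pi * k / n_around
            points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points

def vase_radius(t):
    """Hypothetical vase profile: bulging in the lower half, narrowing above."""
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * t)

vase = generalized_cone(vase_radius, length=4.0)
```

An object made of several generalized cones would hold one such description per component, each with its own axis placed relative to the object's coordinate frame.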
A library of 3-D model descriptions at different
levels of specificity is generated for objects that
may possibly appear in a scene. The same
3-D model description must be derived from the
image. Object recognition then entails
comparing these descriptions with the library.
Occluding contours of an image provide strong
clues for finding the axes of generalized cones.
Occluding contours are the silhouettes of
objects. Even though most silhouettes are
ambiguous, humans interpret them in a particular
way. Marr hypothesizes that additional
information is used to constrain the perception of
3-D shapes to silhouettes as we see them. These
constraints are general and do not require
prior knowledge of the scene.
 
 
 
 
 
3 MACHINE VISION
3.1 Introduction
From time immemorial people dreamed of
creating machines that would exhibit mental
abilities. With the invention of computers,
researchers in the field of artificial intelligence
(AI) pursue this dream to endow computers
with information processing capabilities
similar to those of humans. Richie (1988) defines
AI as "the study of how to make computers
do things at which, at the moment, people are
better". Vision is not only our most impressive
sense but also the most intensively studied
sense in AI.
By and large, machine vision pursues the same
goal as human vision: generate descriptions
about the scene from images. The descriptions
must be explicit and meaningful so as to allow
other system components to carry out a task.
In that respect, machine vision is part of an
entire system that interacts with the
environment, say a robot. Consequently, tasks such as
decision making, planning, and executing decisions
are not part of machine vision. By the way, the
terms computer vision and machine vision are
used interchangeably.
Machine vision is a relatively new and rapidly
changing field. Many of the essential concepts
have only evolved during the last ten years.
The purpose of this chapter is to elucidate the
most important concepts and to elaborate on
the major issues. Even though machine
vision is now a field in its own right, it is related
to other areas, such as psychology, computer
graphics, pattern recognition and image
processing. In fact, significant progress has been
made, and will be made, when an
interdisciplinary approach is adopted. Take Marr's
theory of vision as an example. It is actually the
combination of research results in
neurophysiology, psychophysics, perception, computer
science and signal processing.
Even though our knowledge of the human
visual system is only fragmentary, we know that
it is very complex. Machine vision, therefore,
is a nontrivial task. Not surprisingly then, no
general purpose vision system exists today, and
none will exist in the foreseeable future. The
lack of rapid success, as enthusiastically
predicted thirty years ago, led some AI researchers
to a rather pessimistic assessment. In their
view, machine vision is so ill-defined and
underconstrained that no general solution exists.
As Barrow put it:
 
 
Despite considerable progress in
recent years, our understanding
of the principles underlying visual
perception remains primitive.
Attempts to construct computer
models for the interpretation of
arbitrary scenes have resulted in such
poor performance, limited range
of abilities, and inflexibility that,
were it not for the human existence
proof, we might have been tempted
long ago to conclude that
high-performance, general-purpose
vision is impossible. (Barrow, 1978)
 
Nevertheless, progress has been made, mainly
in industrial applications, where the
environment, such as lighting conditions, can be better
controlled.
3.2 Machine Vision Paradigm
Marr's theory of vision gave rise to the most
advanced and widely accepted paradigm of
machine vision. Fig. depicts the building blocks.
Usually, at the outset is a raw image. We also
include image formation, a point forcefully
advocated by Horn (see Horn, 1986) and now
accepted by many vision researchers. After all,
machine vision may be viewed as the inverse
process of image formation. Thus it only makes
sense to obtain a thorough understanding of
image formation.
The primal sketch is the result of edge
detection. Edges are likely to have been caused
by structures in the scene, such as object
boundaries, markings and surface
discontinuities. The unorganized edge fragments, bars
and blobs are grouped into higher-level tokens,
which are now processed by the independent
modules stereopsis, shading, motion, texture to
yield the 2.5-D sketch.
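The edge detection step can be illustrated in one dimension. The paragraph above does not name an edge operator. The sketch assumes edges are located at zero-crossings of the second derivative, the approach associated with Marr and Hildreth, and it omits the Gaussian smoothing that would normally precede it. The function name is hypothetical.

```python
def zero_crossing_edges(signal):
    """Return indices i where the second difference changes sign between i and i + 1.

    Assumption: a 1-D intensity profile without noise, so no smoothing is applied.
    """
    n = len(signal)
    d2 = [0.0] * n
    for i in range(1, n - 1):
        # Discrete second derivative of the intensity profile.
        d2[i] = signal[i - 1] - 2.0 * signal[i] + signal[i + 1]
    # A strict sign change between neighbours marks an edge between them.
    return [i for i in range(n - 1) if d2[i] * d2[i + 1] < 0.0]

# An ideal step edge between samples 2 and 3.
edges = zero_crossing_edges([0.0, 0.0, 0.0, 10.0, 10.0, 10.0])
```

The resulting edge positions are still unorganized fragments. Grouping them into higher-level tokens is a separate step that this sketch does not attempt.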
The 2.5-D sketch contains fewer data than the
raw image, but more important, it is more
explicit. An edge could be an object boundary
or a shadow; a single pixel can be everything.
Depth and 3-D shape information is
particularly important. Shape and depth information
is obtained independently from stereo,
shading, motion and texture processes, also called
shape-from-X processes. Note that the 2.5-D
sketch is purely obtained from the raw images.
It is the result of bottom-up processes, also
referred to as early vision.