0% found this document useful (0 votes)
79 views8 pages

Sketch Based Interfaces: Early Processing For Sketch Understanding

Uploaded by

rajput0302
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
79 views8 pages

Sketch Based Interfaces: Early Processing For Sketch Understanding

Uploaded by

rajput0302
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

Sketch Based Interfaces: Early Processing for Sketch

Understanding

Tevfik Metin Sezgin Thomas Stahovich Randall Davis


MIT AI Laboratory CMU Deptarment of MIT AI Laboratory
Massachusetts Institute of Mechanical Engineering Massachusetts Institute of
Technology Pittsburgh, PA 15213 Technology
Cambridge MA 02139, USA Cambridge MA 02139, USA
stahov@andrew.cmu.edu
mtsezgin@ai.mit.edu davis@ai.mit.edu

ABSTRACT up having to sacrifice the utility of freehand sketching for


Freehand sketching is a natural and crucial part of every- the capabilities provided by the tools. When they move to
day human interaction, yet is almost totally unsupported a computer for detailed design, designers usually leave the
by current user interfaces. We are working to combine the sketch behind and the effort put into defining the rough ge-
flexibility and ease of use of paper and pencil with the pro- ometry on paper is largely lost.
cessing power of a computer, to produce a user interface We are working to provide a system where users can sketch
for design that feels as natural as paper, yet is considerably naturally and have the sketches understood. By “under-
smarter. One of the most basic steps in accomplishing this is stood” we mean that sketches can be used to convey to the
converting the original digitized pen strokes in a sketch into system the same sorts of information about structure and
the intended geometric objects. In this paper we describe behavior as they communicate to a human engineer.
an implemented system that combines multiple sources of Such a system would allow users to interact with the com-
knowledge to provide robust early processing for freehand puter without having to deal with icons, menus and tool se-
sketching. lection, and would exploit direct manipulation (e.g., specify-
ing curves by sketching them directly, rather than by spec-
ifying end points and control points). We also want users
Categories and Subject Descriptors to be able to draw in an unrestricted fashion. It should,
D.2.2 [Design Tools and Techniques]: User interfaces; for example, be possible to draw a rectangle clockwise or
H.5.2 [User Interfaces]: Input Devices and strategies; J.6 counterclockwise, or with multiple strokes. Even more gen-
[Computer Aided Engineering]: CAD erally, the system, like people, should respond to how an
object looks (e.g., like a rectangle), not how it was drawn.
General Terms This will, we believe, produce a sketching interface that feels
much more natural, unlike Graffiti and other gesture-based
Design, Human Factors systems (e.g., [9], [14]), where pre-specified motions (e.g.,
an L-shaped stroke or a clockwise rectangular stroke) are
Keywords required to specify a rectangular shape.
The work reported here is part of our larger effort aimed at
freehand sketching, natural interaction, multiple sources of
knowledge providing natural interaction with software, and with design
tools in particular. That larger effort seeks to enable user
to interact with automated tools in much the same man-
1. INTRODUCTION ner as they interact with each other: by informal, messy
Freehand sketching is a familiar, efficient, and natural way sketches, verbal descriptions, and gestures. Our overall sys-
of expressing certain kinds of ideas, particularly in the early tem uses a blackboard-style architecture [6], combining mul-
phases of design. Yet this archetypal behavior is largely un- tiple sources of knowledge to produce a hierarchy of succes-
supported by user interfaces in general and by design soft- sively more abstract interpretations of a sketch.
ware in particular, which has for the most part aimed at Our focus in this paper is on the very first step in the
providing services in the later phases of design. As a re- sketch understanding part of that larger undertaking: inter-
sult designers either forgo tool use at the early stage or end preting the pixels produced by the user’s strokes and pro-
ducing low level geometric descriptions such as lines, ovals,
rectangles, arbitrary polylines, curves and their combina-
tions. Conversion from pixels to geometric objects is the
Permission to make digital or hard copies of all or part of this work for first step in interpreting the input sketch. It provides a
personal or classroom use is granted without fee provided that copies are more compact representation and sets the stage for further,
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to more abstract interpretation (e.g., interpreting a jagged line
republish, to post on servers or to redistribute to lists, requires prior specific as a symbol for a spring).
permission and/or a fee.
PUI 2001 Orlando FL, USA
Copyright 2001 ACM X-XXXXX-XX-X/XX/XX ...$5.00.
2. THE SKETCH UNDERSTANDING TASK

1
Sketch understanding overlaps in significant ways with the
extensive body of work on document image analysis gener-
ally (e.g., [2]) and graphics recognition in particular (e.g.,
[16]), where the task is to go from a scanned image of, say,
an engineering drawing, to a symbolic description of that
drawing.
Differences arise because sketching is a realtime, interac-
tive process, and we want to deal with freehand sketches,
not the precise diagrams found in engineering drawings. As
a result we are not analyzing careful, finished drawings, but
are instead attempting to respond in real time to noisy, in- Figure 1: The stroke on the left contains both
complete sketches. The noise is different as well: noise in curves and straight line segments. The points we
a freehand sketch is typically not the small-magnitude ran- want to detect in the vertex detection phase are in-
domly distributed variation common in scanned documents. dicated with large dots in the figure on the right.
There is also an additional source of very useful information The beginning and the end points of the stroke are
in an interactive sketch: as we show below, the timing of indicated with smaller dots.
pen motions can be very informative.
Sketch understanding is a difficult task in general as sug-
gested by reports in previous systems (e.g., [9]) of a recogni- 3.1 Stroke Approximation
tion rate of 63%, even for a sharply restricted domain where Stroke processing consists of detecting vertices at the end-
the objects to be recognized are limited to rectangles, circles, points of linear segments of the stroke, then detecting and
lines, and squiggly lines (used to indicate text). characterizing curved segments of the stroke.
Our domain–mechanical engineering design–presents the
additional difficulty that there is no fixed set of shapes to 3.1.1 Vertex detection
be recognized. While there are a number of traditional sym-
We use the sketch in Fig. 1 as a motivating example of
bols with somewhat predictable geometries (e.g., symbols for
what should be done in the vertex detection phase. Points
springs, pin joints, etc.), the system must also be able to deal
marked in Fig. 1 indicate the corners of the stroke, where
with bodies of arbitrary shape that include both straight
the local curvature is high.
lines and curves. As consequence, accurate early processing
Note that the vertices are marked only at what we would
of the basic geometry–finding corners, fitting both lines and
intuitively call the corners of the stroke (i.e., endpoints of
curves–becomes particularly important.
linear segments). There are, by design, no vertices marked
on curved portions of the stroke because we want to handle
3. SYSTEM DESCRIPTION these separately, modeling them with curves (as described
Sketches can be created in our system using any of a vari- below). This is unlike the well studied problem of piecewise
ety of devices that provide the experience of freehand draw- linear approximation [13].
ing while capturing pen movement. We have used tradi-
tional digitizing tablets, a Wacom tablet that has an LCD-
display drawing surface (so the drawing appears under the
stylus), and a Mimio whiteboard system. In each case the
pen motions appear to the system as mouse movements,
with position sampled at rates between 30 and 150 points/sec,
depending on the device and software in use.
In the description below, by a single stroke we mean the
set of points produced by the drawing implement between
the time it contacts the surface (mouse-down) and the time
it breaks contact (mouse-up). This single path may be com-
posed of multiple connected straight and curved segments Figure 2: Stroke representing a square.
(see, Fig. 1).
Our approach to early processing consists of three phases Our approach takes advantage of the interactive nature of
approximation, beautification, and basic recognition. Ap- sketching, combining information from both stroke direction
proximation fits the most basic geometric primitives–lines and speed data. Consider as an example the square in Fig. 2;
and curves–to a given set of pixels. The overall goal is to Fig. 3 shows the direction, curvature (change in direction
approximate the stroke with a more compact and abstract with respect to arc length) and speed data for this stroke.
description, while both minimizing error and avoiding over- We locate vertices by looking for points along the stroke that
fitting. Beautification modifies the output of the approxi- are minima of speed (the pen slows at corners) or maxima
mation layer, primarily to make it visually more appealing of the absolute value of curvature.1
without changing its meaning, and secondarily to aid the While extrema in curvature and speed typically corre-
third phase, basic recognition. Basic recognition produces spond to vertices, we cannot rely on them blindly because
interpretations of the strokes, as for example, interpreting a noise in the data introduces many false positives. To deal
sequence of four lines as a rectangle or square. (Subsequent with this we use average based filtering.
recognition, at the level of mechanical components, such as
springs, and pin joints is accomplished by another of our 1
From here on for ease of description we use curvature to
systems [1]). mean the absolute value of the curvature data.

2
3 0.7 1

0.9
2 0.6

0.8

1 0.5
0.7

0.6
0 0.4

0.5

−1 0.3
0.4

0.3
−2 0.2

0.2

−3 0.1
0.1

−4 0 0
0 50 100 150 200 250 0 50 100 150 200 250 0 50 100 150 200 250

Figure 3: Direction, curvature and speed graphs for


the stroke in Fig. 2 Figure 6: At left the original sketch of a piece of
metal; at right the fit generated using only curvature
data.

Average based filtering


We want to find extrema corresponding to vertices while
avoiding those due to noise. To increase our chances at do-
ing this, we look for extrema in those portions of the curva-
ture and speed data that lie beyond a threshold. Intuitively,
we are looking for maxima of curvature only where the cur-
vature is already high and minima of speed only where the
speed is already low. This will help to avoid selecting false
positives of the sort that would occur say, when there is
a brief slowdown in an otherwise fast section of a straight
Figure 7: At left the speed graph for the piece; at
stroke.
right the fit based on only speed data.
To avoid the problems posed by choosing a fixed threshold,
we set the threshold based on the mean of each data set.2
We use these thresholds to separate the data into regions
where it is above/below the threshold and select the global
Application to curvature data
extrema in each region that lies above the threshold.
Fig. 4 shows the curvature graph partitioned into regions
0.7
of high and low curvature. Note that this reduces but doesn’t
0.6 eliminate the problem of false positives introduced by noise
in the stroke. We deal with the false positives using the
hybrid fit generation scheme described below.3
0.5

0.4
While average based filtering performs better than simply
0.3
comparing the curvature data against a hard coded thresh-
old, it is still clearly not free of empirical constants. As we
0.2
explain when considering future work, scale space provides
0.1
a better approach for dealing with noisy data without hav-
ing to make a priori assumptions about the scale of relevant
0
0 50 100 150 200 250 features.

Figure 4: Curvature graph for the square in Fig. 2 Application to speed change
with the threshold dividing it into regions. Our experience is that curvature data alone rarely provides
sufficient reliability. Noise is one problem, but variety in
angle changes is another. Fig. 6 illustrates how curvature
1
fit alone misses a vertex (at the upper right) because the
0.9
curvature around that point was too small to be detected
0.8
in the context of the other, larger curvatures. We solve this
0.7
problem by incorporating speed data into our decision as an
0.6
independent source of guidance.
0.5
Just as we did for the curvature data, we reduce the num-
0.4
ber of false extrema by average based filtering, then look for
0.3
speed minima. The intuition here is simply that pen speed
0.2
drops when going around a corner in the sketch. Fig. 7
0.1
shows (at left) the speed data for the sketch in Fig. 6, along
0
with the polygon drawn from the speed-detected vertices (at
0 50 100 150 200 250
right).
Figure 5: Speed graph for the stroke in Fig. 2 with 3
An alternative approach is to detect consecutive almost-
the threshold dividing it into regions. collinear edges (using some empirical threshold for collinear-
ity) and combine them into one edge, removing the vertex
2
The exact threshold has been determined empirically; for in between. Our hybrid fit scheme deals with the problem
curvature data the threshold is the mean, while for the speed without the need to decide what value to use for “almost-
the threshold is 90% of the mean. collinear.”

3
Using speed data alone has its shortcomings as well. Poly- computed as the average of the sum of the squares of the
lines formed from a combination of very short and long line distances to the fit from each point in the stroke S:
segments can be problematic: the maximum speed reached 1 
along the short line segments may not be high enough to εi = ODSQ(s, Hi )
|S| s∈S
indicate the pen has started traversing another edge, with
the result that the entire short segment is interpreted as the Here ODSQ stands for orthogonal distance squared, i.e., the
corner. This problem arises frequently when drawing thin square of the distance from the stroke point to the relevant
rectangles, common in mechanical devices. Fig. 8 illustrates line segment of the polyline defined by Hi . We compute the
this phenomena. In this figure, the speed fit misses the up- error for Hi and for Hi ; the higher scoring of these two (i.e.,
per left corner of the rectangle because the pen failed to the one with smaller least squares error) becomes Hi+1 , the
gain enough speed between the endpoints of the short verti- next fit in the succession. This process continues until all
cal segment. The curvature fit, by contrast, detects all cor- points in the speed and curvature fits have been used. The
ners, along with some other vertices that are artifacts due to result is a set of hybrid fits.
hand dynamics during freehand sketching. This illustrates In selecting the best of the hybrid fits the problem is as
the utility of having both fits available. usual trading off more vertices in the fit against lower error.
Here our approach is simple: We set an error upper bound
and designate as our final fit Hf , the Hi with the fewest
vertices that also has an error below the threshold.

3.1.2 Handling curves


(a) Input, 63 (b) Using (c) Using The approach described thus far yields a good approxima-
points speed data, curvature tion to strokes that consists solely of line segments, but as
4 vertices data, 7
vertices noted our input may include curves as well, hence we require
a means of detecting and approximating them.
The polyline approximation Hf generated in the process
Figure 8: Average based filtering using speed data described above provides a natural foundation for detecting
misses a vertex. The curvature fit detects the missed areas of curvature: we compare the Euclidean distance l1
point (along with vertices corresponding to the ar- between each pair of consecutive vertices in Hf to the accu-
tifact along the left edge of the rectangle). mulated arc length l2 between those vertices in the input S.
The ratio l2 /l1 is very close to 1 in the linear regions of S,
We use information from both sources, generating hybrid and significantly higher than 1 in curved regions.
fits by combining the set of candidate vertices derived from We approximate curved regions with Bézier curves, de-
curvature data Fd with the candidate set from speed data fined by two end points and two control points. Let u = Si ,
Fs , taking into account the system’s certainty that each can- v = Sj , i < j be the end points of the part of S to be ap-
didate is a real vertex. proximated with a curve. We compute the control points
as:
Generating hybrid fits
Hybrid fit generation occurs in three stages: computing ver-
c1 = kt̂1 + v
tex certainties, generating a set of hybrid fits, and selecting
the best fit.
Our certainty metric for a curvature candidate vertex vi c2 = kt̂2 + u
is the scaled magnitude of the curvature in a local neighbor-
1 
hood around the point, computed as |di−k − di+k |/l. Here k= |Sk − Sk+1 |
l is the curve length between points Si−k , Si+k and k is a 3
i≤k<j
small integer defining the neighborhood size around vi . The
certainty metric for a speed fit candidate vertex vi is a mea- where t̂1 and t̂2 are the unit length tangent vectors pointing
sure of the pen slowdown at the point, 1 − vi /vmax , where inwards at the curve segment to be approximated. The 1/3
vmax is the maximum pen speed in the stroke. The certainty factor in k controls how much we scale tˆ1 and tˆ2 in order to
values are normalized to [0, 1]. reach the control points; the summation is simply the length
While both of these metrics are designed to produce val- of the chord between Si and Sj .4
ues in [0, 1], they have different scales. As the metrics are As in fitting polylines, we want to use least squares to
used only for ordering within each set, they need not be evaluate the goodness of a fit, but computing orthogonal dis-
numerically comparable across sets. Candidate vertices are tances from each Si in the input stroke to the Bézier curve
sorted by certainty within each fit. segments would require solving a fifth degree polynomial.
The initial hybrid fit H0 is the intersection of Fd and Fs . A (Bézier curves are described by third degree polynomials,
succession of additional fits is then generated by appending hence computing the minimum distance from an arbitrary
to Hi the highest scoring curvature and speed candidates point to the curve involves minimizing a sixth degree poly-
not already in Hi . nomial, equivalent to solving a fifth degree polynomial.) A
To do this, on each cycle we create two new fits: Hi = numerical solution is both computationally expensive and
Hi +vs (i.e., Hi augmented with the best remaining speed fit heavily dependent on the goodness of the initial guesses for
candidate) and Hi = Hi + vd (i.e., Hi augmented with the 4
The 1/3 constant was determined empirically, but works
best remaining curvature candidate). We use least squares very well for freehand sketches. As we discovered subse-
error as a metric of the goodness of a fit: the error εi is quently, the same constant was independently chosen in [15].

4
Figure 10: At left the original sketch of a piece of
metal revisited, and the final beautified output at
right.

sured against the original sketch, and look right to the user.
Fig. 10 shows the original stroke for the metal piece we had
before, and the output of the beautifier. Some examples of
beautification are also present in Fig. 13.

3.3 Basic Object Recognition


The final step in our processing is recognition of the most
basic objects that can be built from the line segments and
curve segments produced thus far, i.e., simple geometric ob-
jects (ovals, circles, rectangles, squares).
Figure 9: Examples of arbitrary stroke approxima- Recognition of these objects is done with hand-tailored
tion. Boundaries of Bézier curves are indicated with templates that examine various simple properties. A rectan-
crosses, detected vertices are indicated with dots. gle, for example, is recognized as a polyline with 4 segments
all of whose vertices are within a specified distance of the
roots [12], hence we resort to an approximation. We dis- center of the figure’s bounding box; a stroke will be recog-
cretize the Bézier curve using a piecewise linear curve and nized as an oval if it has a small least squares error when
compute the error for that curve. This error computation is compared to an oval whose axes are given by the bounding
O(n) because we impose a finite upper bound on the number box of the stroke.
of segments used in the piecewise approximation.
If the error for the Bézier approximation is higher than 3.4 Evaluation
our maximum error tolerance, the curve is recursively sub- We have conducted a user study to measure the degree to
divided in the middle, where middle is defined as the data which the system is perceived as easy to use, natural and
point in the original stroke whose index is midway between efficient. Study participants were asked to create a set of
the indices of the two endpoints of the original Bézier curve. shapes using our system and Xfig, a Unix tool for creating
New control points are computed for each half of the curve, diagrams. Xfig is a useful point of comparison because it
and the process continues until the desired precision is achieved. is representative of the kinds of tools that are available for
Examples of the capability of our approach is shown in drawing diagrams using explicit indication of shape (i.e.,
Fig. 9, a hastily-sketched mixture of lines and curves. Note the user indicates explicitly which parts of the sketch are
that all of the curved segments have been modeled curves, supposed to be straight lines, which curves, etc.) As in other
rather than the piecewise linear approximations that have such tools, XFig has a menu and toolbar interface; the user
been widely used previously. selects a tool (e.g., for drawing polygons), then creates the
shapes piece by piece.
Thirteen subjects participated in our study, including com-
3.2 Beautification puter science graduate students, computer programmers and
Beautification refers to the (currently minor) adjustments an architecture student. Subjects were given sufficient time
made to the approximation layer’s output, primarily to make to get familiar with each system and then asked to draw a
it look as intended. We adjust the slopes of the line seg- set of 10 shapes (examples given in Fig 11). All of the sub-
ments in order to ensure the lines that were apparently jects reported our system being easier to use, efficient and
meant to have the same slope end up being parallel. This more natural feeling. The subjects were also asked which
is accomplished by looking for clusters of slopes in the fi- system they would prefer when drawing these sort of infor-
nal fit produced by the approximation phase, using a simple mal shapes on a computer. All but one subject preferred
sliding-window histogram. Each line in a detected cluster is our system; the sole dissenter preferred a tablet surface that
then rotated around its midpoint to make its slope be the had the texture and feel of paper.
weighted average of the slopes in that cluster. The (new) Overall users praised our system because it let them draw
endpoints of these line segments are determined by the in- shapes containing curves and lines directly and without hav-
tersections of each consecutive pair of lines. This process ing to switch back and forth between tools. We have also
(like any neatening of the drawing) may result in vertices observed that with our system, users found it much easier
being moved; we chose to rotate the edges about their mid- to draw shapes corresponding to the gestures they routinely
points because this produces vertex locations that are close draw freehand, such as a star.
to those detected, have small least square errors when mea- While the central point of this comparison was to deter-

5
Figure 12: An overtraced oval and a line along with
and the system’s output.

• It should be possible to draw arbitrary shapes with a


single stroke, (i.e., without requiring the user to draw
objects in pieces).

• The system should do automatic feature point detec-


tion. The user should not have to specify vertex posi-
tions by hand.

• The system should not have sketching modes for draw-


ing different geometric object classes (i.e., modes for
drawing circles, polylines, curves etc.).
Figure 11: Examples of the shapes used in the user • The sketching system should feel natural to the user.
study.
The Phoenix sketching system [15] had some of the same
motivation as our work, but a more limited focus on inter-
mine how natural it felt to use each system, we also evalu- active curve specification. While the system provided some
ated our system’s ability to produce a correct interpretation support for vertex detection, its focus on curves led it to
of each shape (i.e., interpret strokes appropriately as lines use Gaussian filters to smooth the data. While effective for
or curves). Overall the system’s identification of the vertices curves, Gaussians tend to treat vertices as noise to be re-
and approximation of the shapes with lines and curves was duced, obscuring vertex location. As a result the user was
correct 96% of the time on the ten figures. often required to specify the vertices manually.
In addition to the user studies we have conducted, we Work in [5] describes a system for sketching with con-
wrote a higher level recognizer for evaluation purposes. The straints that supports geometric recognition for simple strokes
higher level recognizer takes the geometric descriptions gen- (as well as a constraint maintenance system and extrusion
erated by the basic object recognition module of our system for generating solid geometries). The set of primitives is
and combines them into domain specific objects. more limited than ours: each stroke is interpreted as a line,
Fig. 13 shows the original input and the program’s anal- arc or as a Bézier curve. More complex shapes can be formed
ysis for a variety of simple but realistic mechanical devices by combinations of these primitives, but only if the user lifts
drawn as freehand sketches. The last two of them are differ- the pen at the end of each primitive stroke, reducing the
ent sketches for a part of the direction reversing mechanism feeling of natural sketching.
for a tape player. Recognized domain specific components The work in [3] describes a system for generating realtime
include gears (indicated by a circle with a cross), springs (in- spline curves from interactively sketched data. They focus
dicated by wavy lines), and the standard fixed-frame symbol on using knot removal techniques to approximate strokes
(a collection of short parallel lines). Components that are known to be composed only of curves, and do not handle sin-
recognized are replaced with standard icons scaled to fit the gle strokes that contain both lines and curves. They do not
sketch. support corner detection, instead requiring the user to spec-
An informal comparison of the raw sketch and the sys- ify corners and discontinuities by lifting the mouse button,
tem’s approximations shows whether the system has selected or equivalently by lifting the pen. We believe our approach
vertices where they were drawn, fit lines and curves accu- of automatically detecting the feature points provides a more
rately, and successfully recognized basic geometric objects. natural and convenient sketching interface.
While informal, this is an appropriate evaluation because Zeleznik [7] describes a mode-based stroke approxima-
the program’s goal is to produce an analysis of the strokes tion system that uses simple rules for detecting the drawing
that “looks like” what was sketched. mode. The user has to draw objects in pieces, reducing
We have also begun to deal with overtracing, one of the the sense of natural sketching. Switching modes is done by
(many) things that distinguishes freehand sketches from care- pressing modifier buttons in the pen or in the keyboard.
ful diagrams. Fig. 12 illustrates one example of the limited In this system, a click of the mouse followed by immediate
ability we have thus far embodied in the program. dragging signals that the user is drawing a line. A click fol-
lowed by a pause and then dragging of the mouse tells the
system to enter the freehand curve mode. Our system allows
4. RELATED WORK drawing arbitrary shapes without any restriction on how the
In general, systems supporting freehand sketching lack user draws them. There is enough information provided by
one or more of the properties that we believe a sketching the freehand drawing to differentiate geometric shapes such
system should have: as curves, polylines, circles and lines from one another, so

6
we believe requiring the user to draw things in a particu- and toward earlier use of automated tools in the design cycle
lar fashion is unnecessary and reduces the natural feeling of in particular.
sketching. Our goal is to make computers understand what
the user is doing rather than requiring the user to sketch in
a way that the computer can understand.
Among the large body of work on beautification, Igarashi 7. REFERENCES
et al. [8] describes a system combining beautification with [1] C. Alvarado. A natural sketching environment:
constraint satisfaction, focusing on exploiting features such Bringing the computer into early stages of mechanical
as parallelism, perpendicularity, congruence and symmetry. design. Master’s thesis, Massachusetts Institute of
The system infers geometric constraints by comparing the Technology, 2000.
input stroke with previous ones. Because sketches are inher-
[2] H. S. Baird, H. Bunke, and K. Yamamoto. Structured
ently ambiguous, their system generates multiple interpreta-
document image analysis. Springer-Verlag, Heidelberg,
tions corresponding to different ways of beautifying the in-
1992.
put, and the most plausible interpretation is chosen among
these interpretations. The system is interactive, requiring [3] M. Banks and E. Cohen. Realtime spline curves from
the user to do the selection, and doesn’t support curves. It interactively sketched data. In SIGGRAPH,
is, nevertheless, more effective then ours at beautification, Symposium on 3D Graphics, pages 99–107, 1990.
but beautification is not the main focus of our work and is [4] A. Bentsson and J. Eklundh. Shape representation by
present for the purposes of completeness. multiscale contour approximation. IEEE PAMI 13, p.
The works in [15] and [3] describe methods for generating 85–93, 1992., 1992.
very accurate approximations to strokes known to be curves [5] L. Eggli. Sketching with constraints. Master’s thesis,
with precision several orders of magnitude below the pixel University of Utah, 1994.
resolution. The Bézier approximations we generate are less [6] L. D. Erman, F. Hayes-Roth, V. R. Lesser, and D. R.
precise but are sufficient for approximating free-hand curves. Reddy. The hearsay-ii speech understanding system:
We believe techniques in [15] and [3] are excessively pre- Integrating knowledge to resolve uncertainty.
cise for free-hand curves, and the real challenge is detecting Computing Surveys, 12:213–253, 1980. Reprinted in:
curved regions in a stroke rather than approximating those Readings in Artificial Intelligence, Bonnie L. Webber
regions down to the numerical machine precision. and Nils J. Nilssen (eds.)(1981), pp 349-389. Morgan
Kaufman Pub. Inc., Los Altos, CA.
[7] A. Forsberg, K. Herndon, and R. Zeleznik. Sketch: An
5. FUTURE WORK interface for sketching 3d scenes. In Proceedings of
We are working to link this early processing to other work SIGGRAPH’96, pages 163–170, 1996.
in our group that has focused on recognition [1] of higher [8] T. Igarashi, S. Matsuoka, S. Kawachiya, and
level mechanical objects. This will provide the opportunity H. Tanaka. Interactive beautification: A technique for
to add model-based processing of the stroke, in which early rapid geometric design. In UIST ’97, pages 105–114,
operations like vertex localization may be usefully guided by 1997.
knowledge of the current best recognition hypothesis.
[9] J. A. Landay and B. A. Myers. Sketching interfaces:
In addition, incorporating ideas from scale space theory
Toward more human interface design. IEEE
looks like a promising way of detecting different scales in-
Computer, vol. 34, no. 3, March 2001, pp. 56-64.
herent in the data and avoiding a priori judgments about
[10] T. Lindeberg. Edge detection and ridge detection with
the size of relevant features. In the pattern recognition com-
automatic scale selection. ISRN
munity [4], [11] and [10] apply some of the ideas from scale
KTH/NA/P–96/06–SE, 1996., 1996.
space theory to similar problems. We are currently working
on ways of applying these techniques to speed and curvature [11] A. Rattarangsi and R. T. Chin. Scale-based detection
data. We believe this may allow us to deal more effectively of corners of planar curves. IEEE Transactionsos
with sketches that contain relevant details at a variety of Pattern Analysis and Machine Intelligence,
scales. There is no guaranteed way of deciding which scales 14(4):430–339, Apr. 1992.
are important at the geometric level, so using constraints [12] N. Redding. Implicit polynomials, orthogonal distance
and/or information provided by the domain of application regression, and closest point on a curve. IEEE
may help in scale selection. Transactions on Pattern Analysis and Machine
Humans naturally seem to slow down when they draw Intelligence, pages 191–199, 2000.
things carefully as opposed to casually, so another inter- [13] R. Rosin. Techniques for assessing polygonal
esting research direction would be to explore the degree to approximations of curves. 7th British Machine Vision
which one can use the time it takes to draw a stroke as an Conf., Edinburgh, 1996.
indication of how careful and precise the user meant to be. [14] D. Rubine. Specifying gestures by example. Computer
Graphics, 25(4):329–337, 1991.
[15] P. Schneider. Phoenix: An interactive curve design
6. CONCLUSION system based on the automatic fitting of
We have built a system capable of using multiple sources hand-sketched curves. Master’s thesis, University of
of information to produce good approximations of freehand Washington, 1988.
sketches. Users can sketch on an input device as if drawing [16] K. Tombre. Analysis of engineering drawings. In
on paper and have the computer detect the low level geome- GREC 2nd international workshop, pages 257–264,
try, enabling a more natural interaction with the computer, 1997.
as a first step toward more natural user interfaces generally,

7
Figure 13: Performance examples: The first two pair are sketches of a marble dispenser mechanism and a
toggle switch. The last two are sketches of the direction reversing mechanism in a tape player.

You might also like