0% found this document useful (0 votes)
104 views8 pages

N-D Image Segmentation Techniques

This document describes a new technique for interactive segmentation of N-dimensional images. The technique allows a user to mark pixels as "object" or "background" to provide hard constraints. A cost function is defined incorporating both boundary and region properties of segments. Graph cuts are used to find a globally optimal segmentation that best balances these properties while satisfying the constraints. This globally optimal solution provides advantages over other techniques that only find local optima or can only segment 2D images.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
104 views8 pages

N-D Image Segmentation Techniques

This document describes a new technique for interactive segmentation of N-dimensional images. The technique allows a user to mark pixels as "object" or "background" to provide hard constraints. A cost function is defined incorporating both boundary and region properties of segments. Graph cuts are used to find a globally optimal segmentation that best balances these properties while satisfying the constraints. This globally optimal solution provides advantages over other techniques that only find local optima or can only segment 2D images.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

Interactive Graph Cuts

for Optimal Boundary & Region Segmentation of Objects in N-D Images

Yuri Y. Boykov Marie-Pierre Jolly


Imaging and Visualization Department
Siemens Corporate Research
755 College Road East, Princeton, NJ 08540, USA
yuri @ scr.siemens.com

Abstract mentation. A globally optimal segmentation can be very


efficiently recomputed when the user adds or removes any
In this paper we describe a new technique for general hard constraints (seeds). This allows the user to get any
purpose interactive segmentation of N-dimensional images. desired segmentation results quickly via very intuitive in-
The user marks certain pixels as “object” or “background” teractions. Our method applies to N-D images (volumes).
to provide hard constraints for segmentation. Additional One of the main advantages of our interactive segmen-
soji constraints incorporate both boundary and region in- tation method is that it provides a globally optimal solution
formation. Graph cuts are used to find the globally optimal for an N-dimensional segmentation when the cost function
segmentation of the N-dimensional image. The obtained so- is clearly defined. Some earlier techniques (snakes [ 14, 41,
lution gives the best balance of boundary and region prop- deformable templates [21], shortest path [ 151, ratio regions
erties among all segmentations satishing the constraints. [5], and other [ 131) can do that only in 2D applications when
The topology o$our segmentation is unrestricted and both a segmentation boundary is a 1D curve. Many techniques
“object” and “background” segments may consist of sev- either don’t have a clear cost function at all (region growing,
eral isolatedparts. Some experimental results are presented split and merge, see Chapter 10 in [ 1 I]) or compute only an
in the context ofphotohideo editing and medical image seg- approximate solution (e.g. a local minimum) that can be
mentation. We also demonstrate an interesting Gestalt ex- arbitrarily far from the global optimum (region competition
ample. A fast implementation of our segmentation method [22], level set methods [ 161, normalized cuts [IS]). Global
is possible via a new mar-$ow algorithm in [2]. properties of such segmentations may be difficult to analyze
or,predict. Imperfections in the result might come from de-
ficiencies at the minimization stage. In contrast, imperfec-
1. Introduction tions of a globally optimal solution are directly related to
the definition of the cost function. Thus, the segmentation
Interactive segmentation is becoming more and more can be controlled more reliably.
popular to alleviate the problems inherent to fully automatic It is also important that the cost function that we use
segmentation which seems to never be perfect. Our goal is as a soft constraint for segmentation is general enough to
a general purpose interactive segmentation technique that include both region and boundary properties of segments.
divides an image into two segments: “object” and “back- In fact, our cost function is similar to one in [9] obtained
ground”. A user imposes certain hard constraints for seg- in a context of MAP-MRF estimation. Consider an arbi-
mentation by indicating certain pixels (seeds) that abso- trary set of data elements P and some neighborhood sys-
lutely have to be part of the object and certain pixels that tem represented by a set N of all unordered’ pairs { p i q }
have to be part of the background. Intuitively, these hard
‘For simplicity, we present the case o f unordered pairs of neighbors
constraints provide clues on what the user intends to seg- { p , q } . If pairs of neighbors are ordered then we have two distinct pairs
ment. The rest of the image is segmented automatically by ( p , q ) and ( q , p ) . This would allow different (directed) discontinuity
computing a global optimum among all segmentations sat- penalties for two cases when p E “object”, q E “background’ and when
p E “background”, q E “object”. To set such directed costs one should
isfying the hard constraints. The cost function is defined in have prior information on properties of the boundary from object to back-
terms of boundary and region properties of the segments. ground. Algorithmically, generalization from unordered to ordered neigh-
These properties can be viewed as soft constraints for seg- bors means switching from undirected (presented here) to directed graphs.

105
0-7695-1143-0/01 $10.00 0 2001 IEEE
of neighboring elements in P. For example, P can con- desired boundary. Moreover, this approach can not be easily
tain pixels (or voxels) in a 2D (or 3D) grid and can generalized to 3D data.
contain all unordered pairs of neighboring pixels (voxels)
under a standard 8- (or 26-) neighborhood system. Let We consider hard constraints that indicate segmentation
regions rather than the boundary. We assume that some pix-
A = (AI ,... , A p , . . . , A l p , ) be a binary vector whose
components A, specify assignments to pixels p in P . Each els were marked as internal and some as external for the
A, can be either “obj” or “bkg” (abbreviations of “object” given object of interest. The subsets of marked pixels will
and “background”). Vector A defines a segmentation. Then, be referred to as “object” and “background” seeds. The seg-
the soft constraints that we impose on boundary and region mentation boundary can be anywhere but it has to separate
properties of A are described by the cost function E ( A ) : the object seeds from the background seeds. Note that the
seeds can be loosely positioned inside the object and back-
E(A) = A.R(A)+B(A) (1) ground regions. Our segmentation technique is quite stable
and normally produces the same results regardless of par-
where ticular seed positioning within the same image object.

Obviously, the hard constraints by themself are not


enough to obtain a good segmentation. A segmentation
method decides how to segment unmarked pixels. [ 101 and
[17] use the same type of hard constraints as us but they
do not employ a clear cost function and segment unmarked
and pixels based on variations of “region growing”. Since the
1 ifA, # A q properties of segmentation boundary are not optimized, the
d(A,,A,) = 0 otherwise. results are prone to “leaking” where the boundary between
The coefficient A 2 0 in (1) specifies a relative impor- objects is blurry. In contrast, we combine the hard con-
tance of the region properties term R(A) versus the bound- straints as above with energy (1) that incorporates region
ary properties term B(A). The regional term R ( A ) as- and boundary properties of segments.
sumes that the the individual penalties for assigning pixel p In this paper we show how to generalize the algorithm
to “object” and “background”, correspondingly R, (“obj”) in [9] to combine soft constraints encoded in (1) with user
and R,(“bkg”), are given. For example, R p ( . )may reflect defined hard constraints. We also show how the optimal
on how the intensity of pixel p fits into a known intensity segmentations can be efficiently recomputed if the hard con-
model (e.g. histogram) of the object and background. straints are added or changed. Note that initial segmentation
The term B ( A ) comprises the “boundary” properties of may not be perfect. After reviewing it, the user can specify
segmentation A. Coefficient B{p,ql2 0 should be inter- additional object and background seeds depending on the
preted as a penalty for a discontinuity between p and q. Nor- observed results. To incorporate these new seeds the algo-
mally, B { p , q is
} large when pixels p and q are similar (e.g. rithm can efficiently adjust the current segmentation with-
in their intensity) and B { p , q is
} close to zero when the two out recomputing the whole solution from scratch.
are very different. The penalty B{,,,} can also decrease as a
function of distance between p and q. Costs B{p,qlmay be Our technique is based on powerful graph cut algorithms
based on local intensity gradient, Laplacian zero-crossing, from combinatorial optimization [7, 81. Our implementa-
gradient direction, and other criteria (e.g. [ 151). tion uses a new version of “max-flow” algorithm reported
The algorithm in [9] computes a globally optimal binary in 121. In the next section we explain the terminology for
segmentation minimizing the energy similar to (1) when no graph cuts and provide some background information on
hard constraints are placed. In contrast, our goal is to com- previous computer vision techniques relying on graph cuts.
pute the global minimum of (1) among all segmentations The details of our segmentation method and its correctness
that satisfy additional hard constraints imposed by a user. are shown in Section 3. Section 4 provides a number of
There are different types of hard constraints that were examples where we apply our technique to photo/video-
proposed in the past for interactive segmentation methods. editing and to segmentation of medical images/volumes.
For example, for intelligent scissors [ 151 and live wire [6], We demonstrate that with a few simple and intuitive ma-
the user has to indicate certain pixels where the segmenta- nipulations a user can always segment an object of interest
tion boundary should pass. The segmentation boundary is as precisely as (s)he wants. Note that some additional ex-
then computed as the “shortest path” between the marked perimental results can be found in our preliminary work [ 11
pixels according to some energy function based on image which can be seen as a restricted case where only bound-
gradient. One difficulty with such hard constraints is that ary term is present in energy (1). Information on possible
the user inputs have to be very accurately positioned at the extensions and future work is given in Section 5.

106
the nodes in the graph. As illustrated in Figure 1 (c-d), this
partitioning corresponds to a segmentation of an underlying
image or volume. A minimum cost cut generates a segmen-
tation that is optimal in terms of properties that are built into
the edge weights.
(a) Image with seeds. (d) Segmentation results. Our technique is based on a well-known combinatorial
optimization fact that a globally minimum cut of a graph
U -h with two terminals can be computed efficiently in low-order
polynomial time [7, 8, 21. In Section 3 we show how to set
a two terminal graph so that the minimum cut would give
a segmentation that minimizes (1) among all segmentations
satisfying the given hard constraints.
* It should be noted that graph cuts were used for image
segmentation before. In [20] the image is optimally divided
into K parts to minimize the maximum cut between the seg-
ments. In this formulation, however, the segmentation is
strongly biased to very small segments. Shi and Malik [ 181
(b) Graph. (c) cut. try to solve this problem by normalizing the cost of a cut.
The resulting-optimization
. problem is NP-hard and they use
an approximation technique. In [9, 3, 12, 191 graph cuts are
Figure 1. A simple 2D segmentation example applied to minimize certain energy functions used in im-
for a 3 x 3 image. The seeds are 0 = {w} and age restoration, stereo, 3D object reconstruction, and other
B = { p } . The cost of each edge is reflected problems in computer vision.
by the edge’s thickness. The regional term A fast implementation of theoretically polynomial graph
(2) and hard constraints (4’5) define the costs cut algorithms can be an issue. The most straight-forward
of t-links. The boundary term (3) defines the implementations of the standard graph cut algorithms, e.g.
costs of n-links. Inexpensive edges are at- max-flow [7] or push-relabel [SI, can be slow. The experi-
tractive choices for the minimum cost cut. ments in [ 2 ] compare several well-known “tuned’ versions
of these standard algorithms in the context of graph based
methods in vision. The same paper also describes a new ver-
2. Graph Cuts and Computer Vision sion of the max-flow algorithm that (on typical in vision ex-
amples) significantly outperformed the standard techniques.
Our implementation of the interactive segmentation method
First, we describe the basic terminology that pertains to
of this paper uses the new graph cut algorithm from [ 2 ] .
graph cuts in the context of our segmentation method. An
undirected graph G = ( V , E ) is defined as a set of nodes
(vertices V ) and a set of undirected edges ( E ) that connect 3. Segmentation Technique
these nodes. An example of a graph that we use in this
paper is shown in Figure I@). Each edge e E E in the In this section we provide algorithmic details about our
graph is assigned a nonnegative weight (cost) U,.There are segmentation technique. Assume that 0 and B denote
also two special nodes called terminals. A cut is a subset the subsets of pixels marked as “object” and “background”
of edges C c & such that the terminals become separated seeds. Naturally, the subsets (3 C P and B C P are such
on the induced graph G(C) = (V,&\C). It is normal in that 0fl G = 0. Remember that our goal is to compute the
combinatorial optimization to define the cost of a cut as the global minimum of ( I ) among all segmentations .A satisfy-
sum of the costs of the edges that i t severs ing hard constraints

V p E 0, -Ap = “obj” (4)


V p E 6, *Ap = “bkg”. (5)

Graph cut formalism is well suited for segmentation The general work flow is shown in Figure I . Given an
of iningcs. In fact. i t is completely appropriate for N- image (Figure I(a)) we create a graph with two terminals
dinicnsionul volumcs. The nodes o f the graph can represent (Figure I(b)). The edge weights reflect the parameters in
pixcls (or voxels) and the cdges can represent any neigh- the regional ( 2 ) and the boundary (3) terms of the cost func-
borhood relationship bctwccn the pixels. A cut partitions tion. as well as the known positions of seeds in the image.

107
The next step is to compute the globally optimal minimum (see Section 2) assuming that the edge weights specified in
cut (Figure l(c)) separating two terminals. This cut gives the table above are non-negative2.
a segmentation (Figure l(d)) of the original image. In the Below we state exactly how the minimum cut C defines
simplistic example of Figure 1 the image is divided into ex- a segmentation A and prove this segmentation is optimal.
actly one “object” and one “background” regions. In gen- We need one technical lemma. Assume that 3denotes a set
eral, our segmentation method generates binary segmenta- of allfeasible cuts C on graph G such that
tion with arbitrary topological properties. Examples in Sec-
tion 4 illustrate that object and background segments may 0 C severs exactly one t-link at each p
consist of several isolated connected blobs in the image.
Below we describe the details of the graph and prove 0 { p , q } E C iff p , q are t-linked to different terminals
that the obtained segmentation is optimal. To segment a
given image we create a graph 6 = ( V ,E ) with nodes cor- 0 if p E 0then { p , T } E C
responding to pixels p E P of the image. There are two
additional nodes: an “object” terminal (a source S ) and a 0 if p E B then { p , S } E C .
“background” terminal (a sink T ) .Therefore,
Lemma 1 The minimum cut on G is feasible, i.e. C E 3.
V=PU{S,T}.
PROOF: C severs et least one t-link at each.pixel since it
The set of edges E consists of two types of undirected edges: is a cut that separates the terminals. On the other hand, it
n-links (neighborhood links) and t-links (terminal links). can not sever both t-links. In such a case it would not be
Each pixel p has two t-links { p , S} and { p , T } connecting minimal since one of the t-links could be returned. Simi-
it to each terminal. Each pair of neighboring pixels { p , q } larly, a minimum cut should sever an n-link { p , q } if p and
in N is connected by an n-link. Without introducing any q are connected to the opposite terminals just because any
ambiguity, an n-link connecting a pair of neighbors p and q cut must separate the termi$s. If p and q are connected
will be denoted by { p , q } . Therefore, to the same terminal then C should not sever unnecessary
n-link { p , q}^due to its minimality. The last two properties
=N U { { P , SI, { P , V ) . are true for C because the constant K is larger than the sum
PEP of all n-links costs for any given pixel p . For example, if
The following table gives weights of edges in E p E 0 and d severs { p , S } (costs K ) then we would con-
struct a smaller cost cut by restoring { p , S } and severing all
edge 1 weight (cost) n-links from p (costs less then K ) as well as the opposite
t-link { p , T } (zero cost). I

For any feasible cut C E 3 we can define a unique cor-


responding segmentation A ( C ) such that

if { p , T } E C

I
AdC) = { “obj”,
“bkg”, if { p , S } E C. (6)

X . R,(“obj”) The definition above is coherent since any feasible cut sev-
ers exactly one of the two t-liyks at each pixel p . The lemma
showed that a minimum cut C is feasible. Thus, we can de-
fine a corresponding segmentation A = A ( C ) . The next
theorem completes the description of our algorithm.

Theorem 1 The segmentation A = A(C) defined by the


minimum cut C as in ( 6 ) minimizes ( I ) among all segmen-
tations satisfying constraints (4,5).

The graph G is now completely defined. We draw the *In fact, our technique does require that the penalties B { p , q }be non-
segmentation boundary between the object and the back- negative. The restriction that R P ( . )should be non-negative is easy to
avoid. For any pixel p we can always add a large enough positive con-
ground by finding the minimum cost cut on the graph G.
stant to both R,(“obj”) and R,(“bkg”) to make sure that they become
The minimum cost cut C on G can be computed exactly in non-negative. This operation can not change the minimum A of E ( A ) in
polynomial time via algorithms for two terminal graph cuts (1) since it is equivalent to adding a constant to the whole energy function.

108
PROOF: Using the table of edge weights, definition of fea- new cost
sible cuts 3,and equation (6) one can show that a cost of
any C E 3 is

IC1 = X.RP(AP(C))+
PBOUB
These new costs are consistent with the edge weight table
B{p,q} . &4P(C)7 AP(C)) for pixels in U since the extra constant cp at both t-links of
{p,q)EN a pixel does not change the optimal cut4. Then, a maximum
= E ( A ( C ) )- XR,(“obj”) - XRp(“bkg”). flow (minimum cut) on a new graph can be efficiently ob-
PE0 P€B tained starting from the previous flow without recomputing
the whole solution from scratch.
Therefore, IC1 = E ( A ( C ) )- const(C).Note that for any Note that the same trick can be done to adjust the seg-
C E 3 assignment A ( C )satisfies constraints (4,5). In fact, mentation when a new “background” seed is added or when
equation (6) gives a one-to-one correspondence between the a seed is deleted. One has to figure the right amounts that
set of all feasible cuts in 3 and the set 31: of all assignments have to be added to the costs of two t-links at the corre-
A that satisfy hard constraints (43). Then, sponding pixel. The new costs should be consistent with
E(A) = ~Cl+const = minl~l+const = the edge weight table plus or minus the same constant.
C€3
min E ( A ( C ) ) = minE(A)
CE3 AE31 4. Examples
and the theorem is proved.
We demonstrate our general-purpose segmentation
To conclude this section we would like to show that our method in several examples including photohideo editing
algorithm can efficiently adjust the segmentation to incor- and medical data processing. We show original data and
porate any additional seeds that the user might interactively segments generated by our technique for a given set of
add. To be specific, assume that a max-flow algorithm is seeds. Our actual interface allows a user to enter seeds via
used to determine the minimum cut on G. The max-flow mouse operated brush of red (for object) or blue (for back-
algorithm gradually increases the flow sent from the source ground) color. Due to limitations of this B&W publication
S to the sink T along the edges in G given their capacities we show seeds as strokes of white (object) or black (back-
(weights). Upon termination the maximum flow saturates3 ground) brush. In addition, these strokes are marked by the
the graph. The saturated edges correspond to the minimum letters “0” and “B”. For the purpose of clarity, we employ
cost cut on giving us an optimal segmentation. different methods for the presentation of segmentation re-
Assume now that an optimal segmentation is already sults in our examples below.
computed for some initial set of seeds. A user adds a new Our current implementation actually makes a double use
“object” seed to pixel p that was not previously assigned of the seeds entered by a user. First of all, they provide
any seed. We need to change the costs for two t-links at p the hard constraints for the segmentation process as dis-
I t-link I initial cost I new cost 1 cussed in Section 3. In addition, we use intensities of pix-
els (voxels) marked as seeds to get histograms for “ob-
ject” and “background” intensity distributions: Pr(1)Q)
I I I
I and Pr(llB)s. Then, we use these histograms to set the
regional penalties R p ( . )as negative log-likelihoods:

and then compute the maximum Bow (minimum cut) on the Rp(“obj”) = - lnPr(1,JQ)
new graph. In fact, we can start from the flow found at R,(“bkg”) = - lnPr(1,IB).
the end of initial computation. The only problem is that re-
assignment of edge weights as above reduces capacities of Such penalties are motivated by the underlying MAP-MRF
some edges. If there is a flow through such an edge then formulation [9, 31.
we may break the flow consistency. Increasing an edge ca- To set the boundary penalties we use an ad-hoc function
pacity, on the other hand, is never a problem. Then, we can
solve the problem as follows.
To accommodate the new “object” seed at pixel p we
increase the t-links weights according to the table 4Note that the minimum cut severs exactly one of two t-links at pixel p .
’See [7] or (21 for details. SAltematively, the histograms can be learned from historic data.

109
B

(a) Original image (b) Result for X = 7-43 B


B

B
(a) Original B&W photo (b) Segmentation results

(c) Result for X = 0 (d) Result for X = 60


(c) Details of segmentation (d) Details of segmentation
with regional term without regional term
Figure 2. Synthetic Gestalt example. The seg-
mentation results in (b-d) are shown for vari-
ous levels X of relative importance of “region” Figure 3. Segmentation of a photograph.
versus “boundary” in (1). Note that the result
in (b) corresponds to a wide range of A.
4.2. Photo and Video Editing

In Figure 3 we segmented a bell with a group of people


This function penalizes a lot for discontinuities between from a photograph. The user can start with a few “object”
pixels of similar intensities when 11, - IqI < 0.However, and “background” seeds loosely positioned inside and, cor-
if pixels are very different, 11, - IQI > CJ,then the penalty is respondingly, outside the object(s) of interest. By reviewing
small. Intuitively, this function corresponds to the distribu- the results of initial segmentation the user will see what ar-
tion of noise among neighboring pixels of an image. Thus, eas are segmented incorrectly. Then (s)he can put additional
0 can be estimated as “camera noise”.
seeds6 into the troubled places and efficiently recompute the
Note that we use an 8-neighborhood system in 2D ex- optimal segmentation as explained at the end of Section 3.
amples and 26-neighborhood system in 3D examples. All This process of adding seeds gives more clues to the algo-
running times are given for 333MHz Pentium 111. Our im- rithm and may be continued until the user likes the results.
plementation uses a new “max-flow” algorithm from [ 2 ] .
Naturally, the hope is that the method can quickly iden-
4.1. Gestalt Example tify the right object. The user would not like to keep adding
new seeds until the whole image is covered by these seeds.
Figure 2 illustrates some interesting properties of our This is no better than a manual segmentation. The perfor-
cost function (1). Isolated dark blocks that stay relatively mance of an algorithm can be judged by the efforts required
close to each other in (a) are segmented as one object in (b). from the user. Thus, the results of our segmentation are
This may correspond to the psychological perception of the shown with seeds that we entered to get this segmentation.
original image (a). Only an optimal solution that combines Our segmentation algorithm runs in less than a second
both boundary and region properties can generate such re- for most 2D images (up to 512 x 512) with a fixed set of
sult. Our segmentation does not leak through the numer- seeds. When additional seeds are entered the solution is
ous “holes” as a simple region growing would. As shown recomputed in the blink of an eye. Thus, the speed eval-
in (c), the segmentation based on boundary properties only uation of our method in 2D is mainly concerned with the
“shrinks” the object. The opposite extreme when the region user efforts. The detailed segmentation in Figure 3(b) is
properties strongly dominate the boundary forces is shown obtained in approximately a minute. Note that in this exam-
in (d). In this case the method creates isolated segments for
‘Note that adding correcting “object” seeds to pixels that are already
individual blocks. The bright space between the blocks has segmented as “object” would not change the optimal segmentation. These
a better fit with the “background” histogram and the super- additional hard constraint are already satisfied by the current solution. The
strong regional term forces the “object” segment out. same argument applies to correcting “background” seeds.

110
Figure 5. Bone removal in a CT image. Bone
segments are marked by horizontal lines.

tained by placing seeds in 3 out of 21 frames. Each car was


segmented in an independent experiment. We do not show
Figure 4. Segmentation of a video sequence. seeds to avoid confusion.
The computation of the minimum cut is slower in 3D
cases. The initial segmentation might take from 2-3 sec-
onds on smaller volumes (200 x 200 x 10)to a few minutes
on bigger ones (512 x 512 x 50). Thus, efficient recom-
ple the algorithm created some isolated “background” seg- puting of an optimal solution when new seeds are added is
ments. In fact, the algorithm automatically decided which crucial. Most of the time the new seeds are incorporated
“background” seeds were grouped together and which were in a few seconds even for bigger volumes. Therefore, we
placed into isolated segments. The same is equally true for can still consider our method as “interactive”. The results
the “object” seeds. The segments can have any topology. in Figure 4 can be obtained in approximately 30 seconds
In many cases the regional term of energy (1) helps to including user interactions.
get the right results faster. In Figures 3(c,d) we show some
details of segmentation with and without the regional term 4.3. Medical Images and Volumes
(A = 0) given the same sets of seeds. In (d) the user would
spend more time by placing additional “background” seeds Figure 5 shows a segmentation that we obtained on a ab-
to correct imperfections. dominal CT image. Our goal was to remove bones from
The globally minimum two-terminal cut can be com- the image for better volume rendering. The right segmenta-
puted on any graph. Thus, our technique is valid for seg- tion was obtained after one “click” on one of the bones and
mentation of N-D data. In Figure 4 we segmented moving a couple of loose “strokes” in the background. The other
cars in a video sequence. The sequence of 2 1 video frames bones were correctly segmented out based on regional prop-
(255 x 189) was treated as a single 3D volume. The nec- erties. The boundary term helped to avoid incorrect “bone”
essary seeds were entered in a simple 3D interface where segments in the bright parts of liver and other tissues. The
we could browse through individual 2D slices (frames) of results are obtained in a few seconds. Without the regional
the volume. Initial seeds can be entered in just a few repre- term in the energy function (1) when X = 0 the user would
sentative frames. Note that the boundary, region, and seed have to add “object” seeds in all isolated bones.
information is automatically propagated between the slices In Figure 6 we demonstrate our segmentation method on
since we compute a globally optimum solution directly on 3D medical data. We segmented out cortex, medulla, and
the 3D data set. Thus, the whole sequence can be segmented collecting system of a kidney from a background. This was
based on seeds placed in just a few frames. For example, en- done in three steps. First, the kidney was separated from the
tering correcting seeds in one frame can fix imperfections background. Then we separated the cortex from the rest of
in many adjacent frames. The results in Figure 4 are ob- the kidney. In the last step the medulla was separated from

111
[4] L. D. Cohen. On active contour models and ballons. Com-
puter Vision, Graphics, and Image Processing: Image Un-
derstanding, 53(2):211-218, 1991.
[5] 1. J. Cox, S. B. Rao, and Y.Zhong. ”Ratio regions”: a tech-
nique for image segmentation. In Intemational Conference
on Pattem Recognition, volume 11, pages 557-564, 1996.
[6] A. X. Falclo, J. K. Udupa, S. Samarasekera, and S. Sharma.
User-steered image segmentation paradigms: Live wire and
live lane. Graphical Models and Image Processing, 60:233-
260, 1998.
[7] L. Ford and D. Fulkerson. Flows in Networks. Princeton
University Press, 1962.
[8] A. Goldberg and R. Tarjan. A new approach to the maximum
flow problem. Journal of the Association for Computing
Machinery, 35(4):921-940, October 1988.
[9] D. Greig, B. Porteous, and A. Seheult. Exact maximum a
posteriori estimation for binary images. Joumal of the Royal
Statistical Society, Series B , 51(2):271-279, 1989.
[IO] L. D. Griffin, A. C. F. Colchester, S. A. Roll, and C. S.
Studholme. Hierarchical segmentation satisfying con-
straints. In British Machine Vision Conference, pages 135-
144, 1994.
Figure 6. Kidney in a 3D MRI angio data. [l 11 R. M. Haralick and L. G. Shapiro. Computer and Robot
Vision. Addison-Wesley Publishing Company, 1992.
[12] H. Ishikawa and D. Geiger. Segmentation by grouping junc-
the collecting system. The results in Figure 6 are shown tions. In IEEE,Conference on Computer Vision and Pattern
without the seeds since the process involved three different Recognition, pages 125-131, 1998.
segmentations. The segmentation can be computed in 3-4 [13] I. H. Jermyn and H. Ishikawa. Globally optimal regions and
minutes including the efforts of a user. boundaries. In Intemational Conference on Computer vi-
sion, volume 11, pages 904-910, 1999.
[14] M. Kass, A. Witkin, and D. Terzolpoulos. Snakes: Active
5. Extensions contour models. Intemational Joumal of Computer Vision,
’ 2~321-331, 1988.

The process of putting seeds can be automated for cer- [15] E. N. Mortensen and W. A. Barrett. Interactive segmenta-
tion with intelligent scissors. Graphical Models and Image
tain applications. The seeds should be positioned with a low
Processing,60:349-384, 1998.
probability of “false alarm” while the probability of “right [ 161 S. Osher and 3. Sethian. Fronts propagating with curvature-
detect” is not required to be high. Even very simple recog- dependent speed: algorithms based on the Hamilton-Jacobi
nition techniques based on filters might be good enough if formulation. Journal of Computational Physics, 79: 12-49,
the corresponding threshold is set high. Such filters would 1988.
have difficulties near the boundaries of objects but they can [17] L. J. Reese. Intelligent paint: Region-based interactive im-
confidently place many seeds anywhere else. These seeds age segmentation. Master’s thesis, Brigham Young Univer-
give hard constraints. Based on additional soft constraints sity, 1999.
[18] J. Shi and J. Malik. Normalized cuts and image segmenta-
(1) the minimum cut can complete the segmentation where tion. In IEEE Conference on Computer Vision and Pattern
the recognition method failed. Recognition, pages 73 1-737, 1997.
[19] D. Snow, P. Viola, and R. Zabih. Exact voxel occupancy
References with graph cuts. In IEEE Conference on Computer Vision
and Pattem Recognition, volume 1, pages 345-352, 2000.
[20] Z. Wu and R. Leahy. An optimal graph theoretic approach to
[l] Y. Boykov and M.-P. Jolly. Interactive organ segmenta- data clustering: Theory and its application to image segmen-
tion using graph cuts. In Medical Image Computing and tation. IEEE Transactions on Pattem Analysis and Machine
Computer-Assisted Intervention, pages 276286, 2000. Intelligence, 15(11):1101-1113, November 1993.
[2] Y. Boykov and V. Kolmogorov. An experimental compar- [21] A. Yuille and P. Hallinan. Deformable templates. In
ison of min-cut/max-flow algorithms for energy minimiza- A. Blake and A. Yuille, editors, Active Vision, pages 20-38.
tion in vision. In 3rd. Intnl. Workshop on Energy Minimiza- MIT Press, 1992.
tion Methods in Computer Vision and Pattem Recognition [22] S. C. Zhu and A. Yuille. Region competition: Unifying
(EMMCVPR).Springer-Verlag, September 2001, to appear. snakes, region growing, and BayeslMDL for multiband im-
[3] Y. Boykov, 0. Veksler, and R. Zabih. Markov random fields age segmentation. IEEE Transactions on Pattem Analysis
with efficient approximations. In IEEE Conference on Com- and Machine Intelligence, 18(9):884-900, September 1996.
puter Vision and Pattern Recognition, pages 648-655, 1998.

112

You might also like