N-D Image Segmentation Techniques
N-D Image Segmentation Techniques
105
0-7695-1143-0/01 $10.00 0 2001 IEEE
of neighboring elements in P. For example, P can con- desired boundary. Moreover, this approach can not be easily
tain pixels (or voxels) in a 2D (or 3D) grid and can generalized to 3D data.
contain all unordered pairs of neighboring pixels (voxels)
under a standard 8- (or 26-) neighborhood system. Let We consider hard constraints that indicate segmentation
regions rather than the boundary. We assume that some pix-
A = (AI ,... , A p , . . . , A l p , ) be a binary vector whose
components A, specify assignments to pixels p in P . Each els were marked as internal and some as external for the
A, can be either “obj” or “bkg” (abbreviations of “object” given object of interest. The subsets of marked pixels will
and “background”). Vector A defines a segmentation. Then, be referred to as “object” and “background” seeds. The seg-
the soft constraints that we impose on boundary and region mentation boundary can be anywhere but it has to separate
properties of A are described by the cost function E ( A ) : the object seeds from the background seeds. Note that the
seeds can be loosely positioned inside the object and back-
E(A) = A.R(A)+B(A) (1) ground regions. Our segmentation technique is quite stable
and normally produces the same results regardless of par-
where ticular seed positioning within the same image object.
106
the nodes in the graph. As illustrated in Figure 1 (c-d), this
partitioning corresponds to a segmentation of an underlying
image or volume. A minimum cost cut generates a segmen-
tation that is optimal in terms of properties that are built into
the edge weights.
(a) Image with seeds. (d) Segmentation results. Our technique is based on a well-known combinatorial
optimization fact that a globally minimum cut of a graph
U -h with two terminals can be computed efficiently in low-order
polynomial time [7, 8, 21. In Section 3 we show how to set
a two terminal graph so that the minimum cut would give
a segmentation that minimizes (1) among all segmentations
satisfying the given hard constraints.
* It should be noted that graph cuts were used for image
segmentation before. In [20] the image is optimally divided
into K parts to minimize the maximum cut between the seg-
ments. In this formulation, however, the segmentation is
strongly biased to very small segments. Shi and Malik [ 181
(b) Graph. (c) cut. try to solve this problem by normalizing the cost of a cut.
The resulting-optimization
. problem is NP-hard and they use
an approximation technique. In [9, 3, 12, 191 graph cuts are
Figure 1. A simple 2D segmentation example applied to minimize certain energy functions used in im-
for a 3 x 3 image. The seeds are 0 = {w} and age restoration, stereo, 3D object reconstruction, and other
B = { p } . The cost of each edge is reflected problems in computer vision.
by the edge’s thickness. The regional term A fast implementation of theoretically polynomial graph
(2) and hard constraints (4’5) define the costs cut algorithms can be an issue. The most straight-forward
of t-links. The boundary term (3) defines the implementations of the standard graph cut algorithms, e.g.
costs of n-links. Inexpensive edges are at- max-flow [7] or push-relabel [SI, can be slow. The experi-
tractive choices for the minimum cost cut. ments in [ 2 ] compare several well-known “tuned’ versions
of these standard algorithms in the context of graph based
methods in vision. The same paper also describes a new ver-
2. Graph Cuts and Computer Vision sion of the max-flow algorithm that (on typical in vision ex-
amples) significantly outperformed the standard techniques.
Our implementation of the interactive segmentation method
First, we describe the basic terminology that pertains to
of this paper uses the new graph cut algorithm from [ 2 ] .
graph cuts in the context of our segmentation method. An
undirected graph G = ( V , E ) is defined as a set of nodes
(vertices V ) and a set of undirected edges ( E ) that connect 3. Segmentation Technique
these nodes. An example of a graph that we use in this
paper is shown in Figure I@). Each edge e E E in the In this section we provide algorithmic details about our
graph is assigned a nonnegative weight (cost) U,.There are segmentation technique. Assume that 0 and B denote
also two special nodes called terminals. A cut is a subset the subsets of pixels marked as “object” and “background”
of edges C c & such that the terminals become separated seeds. Naturally, the subsets (3 C P and B C P are such
on the induced graph G(C) = (V,&\C). It is normal in that 0fl G = 0. Remember that our goal is to compute the
combinatorial optimization to define the cost of a cut as the global minimum of ( I ) among all segmentations .A satisfy-
sum of the costs of the edges that i t severs ing hard constraints
Graph cut formalism is well suited for segmentation The general work flow is shown in Figure I . Given an
of iningcs. In fact. i t is completely appropriate for N- image (Figure I(a)) we create a graph with two terminals
dinicnsionul volumcs. The nodes o f the graph can represent (Figure I(b)). The edge weights reflect the parameters in
pixcls (or voxels) and the cdges can represent any neigh- the regional ( 2 ) and the boundary (3) terms of the cost func-
borhood relationship bctwccn the pixels. A cut partitions tion. as well as the known positions of seeds in the image.
107
The next step is to compute the globally optimal minimum (see Section 2) assuming that the edge weights specified in
cut (Figure l(c)) separating two terminals. This cut gives the table above are non-negative2.
a segmentation (Figure l(d)) of the original image. In the Below we state exactly how the minimum cut C defines
simplistic example of Figure 1 the image is divided into ex- a segmentation A and prove this segmentation is optimal.
actly one “object” and one “background” regions. In gen- We need one technical lemma. Assume that 3denotes a set
eral, our segmentation method generates binary segmenta- of allfeasible cuts C on graph G such that
tion with arbitrary topological properties. Examples in Sec-
tion 4 illustrate that object and background segments may 0 C severs exactly one t-link at each p
consist of several isolated connected blobs in the image.
Below we describe the details of the graph and prove 0 { p , q } E C iff p , q are t-linked to different terminals
that the obtained segmentation is optimal. To segment a
given image we create a graph 6 = ( V ,E ) with nodes cor- 0 if p E 0then { p , T } E C
responding to pixels p E P of the image. There are two
additional nodes: an “object” terminal (a source S ) and a 0 if p E B then { p , S } E C .
“background” terminal (a sink T ) .Therefore,
Lemma 1 The minimum cut on G is feasible, i.e. C E 3.
V=PU{S,T}.
PROOF: C severs et least one t-link at each.pixel since it
The set of edges E consists of two types of undirected edges: is a cut that separates the terminals. On the other hand, it
n-links (neighborhood links) and t-links (terminal links). can not sever both t-links. In such a case it would not be
Each pixel p has two t-links { p , S} and { p , T } connecting minimal since one of the t-links could be returned. Simi-
it to each terminal. Each pair of neighboring pixels { p , q } larly, a minimum cut should sever an n-link { p , q } if p and
in N is connected by an n-link. Without introducing any q are connected to the opposite terminals just because any
ambiguity, an n-link connecting a pair of neighbors p and q cut must separate the termi$s. If p and q are connected
will be denoted by { p , q } . Therefore, to the same terminal then C should not sever unnecessary
n-link { p , q}^due to its minimality. The last two properties
=N U { { P , SI, { P , V ) . are true for C because the constant K is larger than the sum
PEP of all n-links costs for any given pixel p . For example, if
The following table gives weights of edges in E p E 0 and d severs { p , S } (costs K ) then we would con-
struct a smaller cost cut by restoring { p , S } and severing all
edge 1 weight (cost) n-links from p (costs less then K ) as well as the opposite
t-link { p , T } (zero cost). I
if { p , T } E C
I
AdC) = { “obj”,
“bkg”, if { p , S } E C. (6)
X . R,(“obj”) The definition above is coherent since any feasible cut sev-
ers exactly one of the two t-liyks at each pixel p . The lemma
showed that a minimum cut C is feasible. Thus, we can de-
fine a corresponding segmentation A = A ( C ) . The next
theorem completes the description of our algorithm.
The graph G is now completely defined. We draw the *In fact, our technique does require that the penalties B { p , q }be non-
segmentation boundary between the object and the back- negative. The restriction that R P ( . )should be non-negative is easy to
avoid. For any pixel p we can always add a large enough positive con-
ground by finding the minimum cost cut on the graph G.
stant to both R,(“obj”) and R,(“bkg”) to make sure that they become
The minimum cost cut C on G can be computed exactly in non-negative. This operation can not change the minimum A of E ( A ) in
polynomial time via algorithms for two terminal graph cuts (1) since it is equivalent to adding a constant to the whole energy function.
108
PROOF: Using the table of edge weights, definition of fea- new cost
sible cuts 3,and equation (6) one can show that a cost of
any C E 3 is
IC1 = X.RP(AP(C))+
PBOUB
These new costs are consistent with the edge weight table
B{p,q} . &4P(C)7 AP(C)) for pixels in U since the extra constant cp at both t-links of
{p,q)EN a pixel does not change the optimal cut4. Then, a maximum
= E ( A ( C ) )- XR,(“obj”) - XRp(“bkg”). flow (minimum cut) on a new graph can be efficiently ob-
PE0 P€B tained starting from the previous flow without recomputing
the whole solution from scratch.
Therefore, IC1 = E ( A ( C ) )- const(C).Note that for any Note that the same trick can be done to adjust the seg-
C E 3 assignment A ( C )satisfies constraints (4,5). In fact, mentation when a new “background” seed is added or when
equation (6) gives a one-to-one correspondence between the a seed is deleted. One has to figure the right amounts that
set of all feasible cuts in 3 and the set 31: of all assignments have to be added to the costs of two t-links at the corre-
A that satisfy hard constraints (43). Then, sponding pixel. The new costs should be consistent with
E(A) = ~Cl+const = minl~l+const = the edge weight table plus or minus the same constant.
C€3
min E ( A ( C ) ) = minE(A)
CE3 AE31 4. Examples
and the theorem is proved.
We demonstrate our general-purpose segmentation
To conclude this section we would like to show that our method in several examples including photohideo editing
algorithm can efficiently adjust the segmentation to incor- and medical data processing. We show original data and
porate any additional seeds that the user might interactively segments generated by our technique for a given set of
add. To be specific, assume that a max-flow algorithm is seeds. Our actual interface allows a user to enter seeds via
used to determine the minimum cut on G. The max-flow mouse operated brush of red (for object) or blue (for back-
algorithm gradually increases the flow sent from the source ground) color. Due to limitations of this B&W publication
S to the sink T along the edges in G given their capacities we show seeds as strokes of white (object) or black (back-
(weights). Upon termination the maximum flow saturates3 ground) brush. In addition, these strokes are marked by the
the graph. The saturated edges correspond to the minimum letters “0” and “B”. For the purpose of clarity, we employ
cost cut on giving us an optimal segmentation. different methods for the presentation of segmentation re-
Assume now that an optimal segmentation is already sults in our examples below.
computed for some initial set of seeds. A user adds a new Our current implementation actually makes a double use
“object” seed to pixel p that was not previously assigned of the seeds entered by a user. First of all, they provide
any seed. We need to change the costs for two t-links at p the hard constraints for the segmentation process as dis-
I t-link I initial cost I new cost 1 cussed in Section 3. In addition, we use intensities of pix-
els (voxels) marked as seeds to get histograms for “ob-
ject” and “background” intensity distributions: Pr(1)Q)
I I I
I and Pr(llB)s. Then, we use these histograms to set the
regional penalties R p ( . )as negative log-likelihoods:
and then compute the maximum Bow (minimum cut) on the Rp(“obj”) = - lnPr(1,JQ)
new graph. In fact, we can start from the flow found at R,(“bkg”) = - lnPr(1,IB).
the end of initial computation. The only problem is that re-
assignment of edge weights as above reduces capacities of Such penalties are motivated by the underlying MAP-MRF
some edges. If there is a flow through such an edge then formulation [9, 31.
we may break the flow consistency. Increasing an edge ca- To set the boundary penalties we use an ad-hoc function
pacity, on the other hand, is never a problem. Then, we can
solve the problem as follows.
To accommodate the new “object” seed at pixel p we
increase the t-links weights according to the table 4Note that the minimum cut severs exactly one of two t-links at pixel p .
’See [7] or (21 for details. SAltematively, the histograms can be learned from historic data.
109
B
B
(a) Original B&W photo (b) Segmentation results
110
Figure 5. Bone removal in a CT image. Bone
segments are marked by horizontal lines.
111
[4] L. D. Cohen. On active contour models and ballons. Com-
puter Vision, Graphics, and Image Processing: Image Un-
derstanding, 53(2):211-218, 1991.
[5] 1. J. Cox, S. B. Rao, and Y.Zhong. ”Ratio regions”: a tech-
nique for image segmentation. In Intemational Conference
on Pattem Recognition, volume 11, pages 557-564, 1996.
[6] A. X. Falclo, J. K. Udupa, S. Samarasekera, and S. Sharma.
User-steered image segmentation paradigms: Live wire and
live lane. Graphical Models and Image Processing, 60:233-
260, 1998.
[7] L. Ford and D. Fulkerson. Flows in Networks. Princeton
University Press, 1962.
[8] A. Goldberg and R. Tarjan. A new approach to the maximum
flow problem. Journal of the Association for Computing
Machinery, 35(4):921-940, October 1988.
[9] D. Greig, B. Porteous, and A. Seheult. Exact maximum a
posteriori estimation for binary images. Joumal of the Royal
Statistical Society, Series B , 51(2):271-279, 1989.
[IO] L. D. Griffin, A. C. F. Colchester, S. A. Roll, and C. S.
Studholme. Hierarchical segmentation satisfying con-
straints. In British Machine Vision Conference, pages 135-
144, 1994.
Figure 6. Kidney in a 3D MRI angio data. [l 11 R. M. Haralick and L. G. Shapiro. Computer and Robot
Vision. Addison-Wesley Publishing Company, 1992.
[12] H. Ishikawa and D. Geiger. Segmentation by grouping junc-
the collecting system. The results in Figure 6 are shown tions. In IEEE,Conference on Computer Vision and Pattern
without the seeds since the process involved three different Recognition, pages 125-131, 1998.
segmentations. The segmentation can be computed in 3-4 [13] I. H. Jermyn and H. Ishikawa. Globally optimal regions and
minutes including the efforts of a user. boundaries. In Intemational Conference on Computer vi-
sion, volume 11, pages 904-910, 1999.
[14] M. Kass, A. Witkin, and D. Terzolpoulos. Snakes: Active
5. Extensions contour models. Intemational Joumal of Computer Vision,
’ 2~321-331, 1988.
The process of putting seeds can be automated for cer- [15] E. N. Mortensen and W. A. Barrett. Interactive segmenta-
tion with intelligent scissors. Graphical Models and Image
tain applications. The seeds should be positioned with a low
Processing,60:349-384, 1998.
probability of “false alarm” while the probability of “right [ 161 S. Osher and 3. Sethian. Fronts propagating with curvature-
detect” is not required to be high. Even very simple recog- dependent speed: algorithms based on the Hamilton-Jacobi
nition techniques based on filters might be good enough if formulation. Journal of Computational Physics, 79: 12-49,
the corresponding threshold is set high. Such filters would 1988.
have difficulties near the boundaries of objects but they can [17] L. J. Reese. Intelligent paint: Region-based interactive im-
confidently place many seeds anywhere else. These seeds age segmentation. Master’s thesis, Brigham Young Univer-
give hard constraints. Based on additional soft constraints sity, 1999.
[18] J. Shi and J. Malik. Normalized cuts and image segmenta-
(1) the minimum cut can complete the segmentation where tion. In IEEE Conference on Computer Vision and Pattern
the recognition method failed. Recognition, pages 73 1-737, 1997.
[19] D. Snow, P. Viola, and R. Zabih. Exact voxel occupancy
References with graph cuts. In IEEE Conference on Computer Vision
and Pattem Recognition, volume 1, pages 345-352, 2000.
[20] Z. Wu and R. Leahy. An optimal graph theoretic approach to
[l] Y. Boykov and M.-P. Jolly. Interactive organ segmenta- data clustering: Theory and its application to image segmen-
tion using graph cuts. In Medical Image Computing and tation. IEEE Transactions on Pattem Analysis and Machine
Computer-Assisted Intervention, pages 276286, 2000. Intelligence, 15(11):1101-1113, November 1993.
[2] Y. Boykov and V. Kolmogorov. An experimental compar- [21] A. Yuille and P. Hallinan. Deformable templates. In
ison of min-cut/max-flow algorithms for energy minimiza- A. Blake and A. Yuille, editors, Active Vision, pages 20-38.
tion in vision. In 3rd. Intnl. Workshop on Energy Minimiza- MIT Press, 1992.
tion Methods in Computer Vision and Pattem Recognition [22] S. C. Zhu and A. Yuille. Region competition: Unifying
(EMMCVPR).Springer-Verlag, September 2001, to appear. snakes, region growing, and BayeslMDL for multiband im-
[3] Y. Boykov, 0. Veksler, and R. Zabih. Markov random fields age segmentation. IEEE Transactions on Pattem Analysis
with efficient approximations. In IEEE Conference on Com- and Machine Intelligence, 18(9):884-900, September 1996.
puter Vision and Pattern Recognition, pages 648-655, 1998.
112