ELEN E4830 Digital Image Processing
Homework 6 Solution
Chuxiang Li
cxli@ee.columbia.edu
Department of Electrical Engineering, Columbia University
April 10, 2006
1 Edge Detection
1.1 Sobel Operator
The gradient of a 2-D image f (x, y) at location (x, y) is defined as the vector
· ¸T
T ∂f ∂f
∇f , [Gx , Gy ] = , , (1)
∂x ∂y
which corresponds to the magnitude and direction components respectively
q
mag(∇f ) = ∇f = G2x + G2y , (2)
µ ¶
−1 Gy
and ang(∇f ) = α(x, y) = tan . (3)
Gx
Fig. 1 shows the masks employed in Sobel operator. The gradient components corre-
sponding to the masks can be written respectively as
Gx = (z7 + 2z8 + z9 ) − (z1 + 2z2 + z3 ),
and Gy = (z3 + 2z6 + z9 ) − (z1 + 2z4 + z7 ). (4)
1
Moreover, to reduce the computational costs involved by square and root calculation, the
magnitude of the vector ∇f in (2) can be approximated by
∇f ≈ |Gx | + |Gy |. (5)
z1 z2 z3
z4 z5 z6
z7 z8 z9
-1 -2 -1 -1 0 1
0 0 0 -2 0 2
1 2 1 -1 0 1
Figure 1: Masks employed in Sobel operator.
Using (4), we can calculate the gradients of the binary image including a centered rect-
angular with a size of m × n pixels. In particular, we only need to treat the pixels at some
specific positions as follows:
• the top edge of the rectangular
• the bottom edge of the rectangular
• the left edge of the rectangular
• the right edge of the rectangular
• the four corners
2
Fig. 2 shows the approximated magnitudes of the pixels in the image computed using (5);
Fig. 3 shows the histogram of the edge directions computed using (3).
¡1
Figure 2: Magnitudes of the gradients obtained via Sobel operator.
1.2 Laplacian Operator
The Laplacian of a 2-D image f (x, y) is a second-order derivative defined as
∂ 2f ∂ 2f
∇2 f = + . (6)
∂x2 ∂y 2
For a 3 × 3 region, one practical approach of calculating ∇2 f is shown in Fig. 4, which
corresponds to
∇2 f = 4z5 − (z2 + z4 + z6 + z8 ). (7)
Using (7), we can calculate all relevant pixel values shown in Fig. 5.
3
Figure 3: Histogram of angles obtained via Sobel operator.
z1 z2 z3 0 -1 0
z4 z5 z6 -1 4 -1
z7 z8 z9 0 -1 0
Figure 4: Masks employed in Laplacian operator.
4
Figure 5: Results of Laplacian operator.
2 Application of Hough Transform
This problem is a natural for the Hough transform, which is set up as follows: The θ axis
is divided into six subdivisions, corresponding to the six specified directions and their error
bands. For example (since the angle directions specified in the problem statement are with
respect to the horizontal) the first band for angle θ extends from −30◦ to −20◦ , corresponding
√ √
to the −25◦ direction and its ±5◦ band. The ρ axis extends from ρ = − D to ρ = + D,
where D is the largest distance between opposite corners of the image, properly calibrated
to fit the particular imaging set up used. The subdivisions in the ρ axis are chosen finely
enough to resolve the minimum expected distance between tracks that may be parallel, but
have different origins, thus satisfying the last condition of the problem statement.
Set up in this way, the Hough transform can be used as a “filter” to categorize all points
in a given image into groups of points in the six specified directions. Each group is then
processed further to determine if its points satisfy the criteria for a valid track:
5
• each group must have at least 100 points;
• it cannot have more than three gaps, each of which cannot be more than 10 pixels long
(see the Remark below for the estimation of gaps of a given length).
Remark: Estimation of Gaps of a Given Length
The key to solving this problem is to find all end points of line segments in the image.
End points are those points on a line which have only one 8-neighbor valued 1. Once all
end points have been found, the D8 distance between all pairs of such end points gives the
lengths of the various gaps. We choose the smallest distance between end points of every
pair of segments and any such distance less than or equal to L satisfies the statement of
the problem. This is a rudimentary solution, and numerous embellishments can be added
to build intelligence into the process. For example, it is possible for end points of different,
but closely adjacent, lines to be less than L pixels apart, and heuristic tests that attempt to
sort out things like this are quite useful. Although the problem statement does not call for
any such tests, they are normally needed in practice and it is worthwhile to bring this up in
class if this particular problem is assigned as a homework assignment.
3 Edge Linking
The image is first converted to the grayscale image. Then the gradient components in vertical
and horizontal direction can be computed, and the gradient vector can be further transformed
into the magnitude and angle components. Note that the angle component here is quantized
into eight levels [0, 45, 90, 135, 180, 225, 270, 315], representing the [west, southwest, south,
southeast, east, northeast, north, northwest] directions, respectively. It is also worth noting
that the gradient direction at each point is perpendicular to the corresponding link direction.
The basic criterion is quite simple as demonstrated in the lecture notes. In particular,
6
for each object, we first find one point locating on its contour, and this can be achieved by
using Matlab functions “edge()” and “imtool()”. Such points on the objects are treated as
the starting points of the edge search and linking. The search process can be summarized
as follows.
• Neighbor selection:
The pixels within 45◦ of the current direction are treated as possible candidates;
• Next pixel selection:
Within these candidates, each pixel, whose gradient direction falls in 45◦ of the previous
pixel and gradient magnitude is greater than a lower bound, is selected as the next
pixel.
• Stop criterion: The recursive search keeps going until the starting point is reached once
again.
• Path evaluation: Each path has a normalized sum of the gradient magnitudes of all
pixels; compare such values of different paths and find the best one which corresponds
to the “optimum” path.
Fig. 6 shows the original test image, and Fig. 7 shows the edge detection and linking
results obtained via the algorithm described above applied on the test image.
Remark 1: The above procedure may result in too many paths as candidates, and thus, a
very high complexity. To reduce the complexity and searching time, we may only search
certain number of paths, from which the best one is picked up as the “optimum” path.
Remark 2: For most of the objects in the test image, the above scheme achieves good
results; however, the results for the pen cap is not fully perfect though its main contour can
be detected and linked.
7
Figure 6: Test image: “rubberbandcap”.
Figure 7: Edge detection and linking results: “rubberbandcap”.