
remote sensing

Article
Accuracy Assessment Measures for Object Extraction
from Remote Sensing Images
Liping Cai 1,2, Wenzhong Shi 3,*, Zelang Miao 4,5 and Ming Hao 6
1 School of Geography and Tourism, Qufu Normal University, Rizhao 276826, China; cumtcailp@126.com
2 Key Laboratory of Coastal Zone Exploitation and Protection, Ministry of Land and Resource,
Nanjing 210024, China
3 Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University,
Hong Kong, China
4 School of Geosciences and Info-Physics, Central South University, Changsha 410012, China;
zelang.miao@csu.edu.cn
5 Key Laboratory of Metallogenic Prediction of Nonferrous Metals and Geological Environment Monitoring,
Central South University, Changsha 410012, China
6 School of Environment Science and Spatial Informatics, China University of Mining and Technology,
Xuzhou 221116, China; haomingcumt@gmail.com
* Correspondence: lswzshi@polyu.edu.hk

Received: 18 November 2017; Accepted: 6 February 2018; Published: 15 February 2018

Abstract: Object extraction from remote sensing images is critical for a wide range of applications,
and object-oriented accuracy assessment plays a vital role in guaranteeing its quality. To evaluate
object extraction accuracy, this paper presents several novel accuracy measures that differ from the
norm. First, area-based and object number-based accuracy assessment measures are given based on
a confusion matrix. Second, different accuracy assessment measures are provided by combining the
similarities of multiple features. Third, to improve the reliability of the object extraction accuracy
assessment results, two accuracy assessment measures based on object detail differences are designed.
In contrast to existing measures, the presented method synergizes the feature similarity and distance
difference, which considerably improves the reliability of object extraction evaluation. Encouraging
results on two QuickBird images indicate the potential for further use of the presented algorithm.

Keywords: object-based image analysis; accuracy assessment; feature similarity; distance difference

1. Introduction
High spatial resolution satellite images are easily available thanks to advancements in modern
sensor technology and have led to many applications in various fields, such as agriculture,
forestry, and environmental protection [1–3]. Compared to medium/low resolution satellite images,
high resolution satellite images contain richer information and clearer boundaries, making them
attractive for object extraction [4–6]. The concept of the object, a group of pixels that share similar
properties, was originally proposed in the 1970s [7], triggering a considerable amount of research in
object-based image analysis (OBIA). Since its introduction, researchers have wondered how they may
assess the results of OBIA.
Noise is inherent in satellite images, and thus, the accuracy of object extraction needs to be
examined. This issue has received considerable critical attention [8–17]. Examples include the error
matrix and confusion matrix, which are two typical methods for accuracy assessment. Despite their
popularity, these methods ignore object features, making them unsuitable for OBIA accuracy evaluation.
A direct solution [18] is to compute error and confusion matrices on each object rather than at the pixel
level. Although this simple solution can mitigate the shortcomings of pixel-level error and confusion
matrices to a certain extent, it still misses object detail, leading to unreliable evaluation
results. To enhance the reliability of this accuracy evaluation, a series of per-object accuracy assessment
measures based on object features have been designed [8,19]. Among them, the similarity between
evaluation and reference objects (e.g., area, size, shape, and location [19–24]) is commonly used. As the
similarity measure can judge the correctness of extracted objects, it is possible to obtain numbers
of correct, wrong, and missing objects for accuracy evaluation. Statistical values of object similarity
(e.g., the weighted average value) are also commonly used measures to directly assess object extraction
accuracy [8]. This method exploits the degree of overlap and position difference between evaluation
and reference objects for evaluating object extraction accuracy [19,25–28]. Although these methods
have resulted in significant improvements compared to pixel-wise accuracy measures, the effects of
geometric information and difference in detail have not been comprehensively examined. Meanwhile,
most existing evaluation methods were designed based on geometrical errors associated with objects.
The influence of thematic error on object extraction accuracy evaluation, however, lacks a systematic
understanding [29,30]. Thus, there is still significant room to improve the generalization ability of
existing object-level accuracy measures.
This study investigates the use of four designed accuracy measures for object extraction evaluation.
The proposed method systematically studies the influence of object characteristics on extraction
accuracy, the aim being to present reliable accuracy measures for object extraction. The remainder of
this paper is organized as follows. Section 2 introduces the methodology. The experimental results
and discussion are given in Sections 3 and 4, respectively. Finally, Section 5 concludes the paper.

2. Methodology
This study assumes that remote sensing images have been pre-processed and have undergone
radiometric calibration and geometric correction. The objects are extracted from pre-processed images,
and some of the objects are selected as evaluation samples. A unique reference object is matched for
each evaluation object. The accuracy evaluation of object extraction is based on object matching.

2.1. Object Matching


Object extraction accuracy is evaluated by comparing the difference between the evaluated object
and its reference data, and thus it is fundamental to match the reference and evaluated objects. To this
end, this paper matches objects using the maximum overlap area algorithm due to its computation
efficiency. The central idea of the maximum overlap area method is to compute the coincidence degree
Oij between two objects.
$$O_{ij} = \frac{1}{2}\left(\frac{A_{C,i} \cap A_{R,j}}{A_{C,i}} + \frac{A_{C,i} \cap A_{R,j}}{A_{R,j}}\right) \qquad (1)$$

where AC,i denotes the area of the ith evaluated object, AR,j is the area of the jth reference object, and
AC,i ∩ AR,j represents the intersection area. For an evaluated object and candidate reference objects,
each coincidence degree will be computed. Two objects will be judged as being a matching pair if their
coincidence degree is a maximum amongst all pairs.
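To make the matching step concrete, the sketch below pairs each evaluated object with the reference object that maximizes the coincidence degree of Equation (1). It is a minimal illustration rather than the authors' implementation; the objects are assumed to be available as shapely polygons, and the function names are hypothetical.

```python
# Minimal sketch of maximum-overlap-area matching (Equation (1)).
# Assumes objects are shapely polygons; not the authors' implementation.
from shapely.geometry import Polygon

def coincidence_degree(eval_poly: Polygon, ref_poly: Polygon) -> float:
    """O_ij = 0.5 * (A_int / A_C,i + A_int / A_R,j), A_int being the intersection area."""
    inter = eval_poly.intersection(ref_poly).area
    if inter == 0.0:
        return 0.0
    return 0.5 * (inter / eval_poly.area + inter / ref_poly.area)

def match_objects(eval_polys, ref_polys):
    """For each evaluated object, return the index of the reference object
    with the largest coincidence degree (None if nothing overlaps)."""
    matches = []
    for ep in eval_polys:
        degrees = [coincidence_degree(ep, rp) for rp in ref_polys]
        best = max(range(len(ref_polys)), key=lambda j: degrees[j])
        matches.append(best if degrees[best] > 0.0 else None)
    return matches
```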

2.2. Area-Based Accuracy Measures

Three area-based accuracy measures (i.e., correctness, completeness, and quality) are designed for
OBIA evaluation. The purpose of area-based accuracy measures is to obtain stable accuracy measurements.
Correctness PAC is defined as the ratio of the correctly extracted area to the whole extracted area.

$$P_{AC} = \frac{A_C}{A_{DC}} \qquad (2)$$

where ADC is the area of the extracted object, and AC is the correct part of ADC. The range of
correctness is from 0 to 1. If all the evaluated objects have their own fully corresponding reference
objects, then PAC = 1. If there is no evaluated object from the same thematic class overlapping the
reference object, then PAC = 0.

The ratio of the correctly extracted area AC to the reference area ARC is called the completeness PAR.

$$P_{AR} = \frac{A_C}{A_{RC}} \qquad (3)$$

The range of completeness is 0 to 1. If all reference objects have their own fully corresponding
evaluated objects, then PAR = 1. If there is no reference object from the same thematic class overlapping
the evaluated object, then PAR = 0.

Equations (2) and (3) show an interaction between correctness and completeness. For instance,
a large ADC leads to a small correctness value, while a small ARC results in a large completeness value.
To amend this issue, the quality PAL is designed to balance correctness and completeness.

$$P_{AL} = \frac{A_C}{A_{DC} + A_{RC} - A_C} \qquad (4)$$

The range of quality is 0 to 1. If the extraction results are exactly the same as the reference data,
then PAL = 1. If no evaluated object of the same thematic class overlaps with the reference object,
then PAL = 0.

Figure 1 presents two cases to illustrate the advantage of area-based accuracy measures compared
to the confusion matrix. The accuracy values of the two cases computed by the confusion matrix will be
significantly different, as the confusion matrix depends on the total pixel number. In contrast, the
evaluation results for the two cases using area-based accuracy measures are equivalent, because the latter
measurements rely only on the evaluation and reference objects and are independent of the total
pixel number.

Figure 1. Schematic diagram of the influence of study area on object-based image analysis evaluation:
(a) large study area; (b) small study area.
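As a concrete illustration (not the authors' implementation), the following minimal sketch computes the three area-based measures from the correct, extracted, and reference areas; the areas are assumed to be given in a common unit such as pixel counts.

```python
# Minimal sketch of the area-based measures in Equations (2)-(4).
# Inputs are plain areas (e.g., pixel counts); not the authors' code.

def correctness(a_correct: float, a_extracted: float) -> float:
    """P_AC = A_C / A_DC, Equation (2)."""
    return a_correct / a_extracted

def completeness(a_correct: float, a_reference: float) -> float:
    """P_AR = A_C / A_RC, Equation (3)."""
    return a_correct / a_reference

def quality(a_correct: float, a_extracted: float, a_reference: float) -> float:
    """P_AL = A_C / (A_DC + A_RC - A_C), Equation (4)."""
    return a_correct / (a_extracted + a_reference - a_correct)

# Example: 850 correctly extracted pixels, 1000 extracted, 900 in the reference.
print(correctness(850, 1000), completeness(850, 900), quality(850, 1000, 900))
```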

2.3. Number-Based Accuracy Measures

Three accuracy measures (i.e., correct, false, and missing rates), relying on counting the number of
objects with different properties, are presented for testing OBIA performance. Specifically, the correct
rate PC, the false rate PF, and the missing rate PM are defined as

$$P_C = \frac{N_C}{N_C + N_F}, \quad 0 \leq P_C \leq 1 \qquad (5)$$

$$P_F = \frac{N_F}{N_C + N_F}, \quad 0 \leq P_F \leq 1 \qquad (6)$$

$$P_M = \frac{N_M}{N_C + N_M}, \quad 0 \leq P_M \leq 1 \qquad (7)$$

where NC, NF, and NM represent the number of correct, false, and missed extracted objects, respectively.
If all evaluated objects are correct, then PC = 1 and PF = 0. If all evaluated objects are incorrect,
then PC = 0 and PF = 1. If all reference objects have their own correct evaluated objects, then PM = 0.
If no reference object corresponds correctly to an evaluated object, then PM = 1.

The purpose of Equations (5)–(7) is to examine whether each object is extracted correctly or falsely.
To this end, if the proportion of correct pixels to total pixels for an object is larger than a given threshold,
it is correctly extracted; otherwise, it is considered to be wrongly extracted.
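A small sketch of how these rates could be computed once the objects have been labelled correct, false, or missing by the matching step; the counts are illustrative inputs, and this is not the authors' implementation.

```python
# Minimal sketch of the number-based measures in Equations (5)-(7).
# The counts would come from object matching with a chosen threshold.

def number_based_rates(n_correct: int, n_false: int, n_missing: int):
    p_c = n_correct / (n_correct + n_false)    # correct rate, Equation (5)
    p_f = n_false / (n_correct + n_false)      # false rate, Equation (6)
    p_m = n_missing / (n_correct + n_missing)  # missing rate, Equation (7)
    return p_c, p_f, p_m

# Example: 63 correct, 4 false, and 8 missed objects.
print(number_based_rates(63, 4, 8))
```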

2.4. Feature Similarity-Based Accuracy Measures

The difference between object-based and pixel-based accuracy measures is the assessment unit.
Compared to the pixel, the object consists of many similar pixels, and thus has more features.
The object number-based accuracy measures consider feature difference, but omit the feature detail
difference and degree of difference between evaluated and reference objects. As shown in Figure 2a,
if the area coincidence rate was used as a criterion to judge similarity, two evaluated objects would
be judged correctly. However, two evaluated objects would be judged incorrectly if the maximum
deviation distance was used as the distinguishing criterion. Thus, the correctly extracted object number
cannot fully reflect the difference between two evaluated objects that have large geometrical differences.
The area-based measures can reflect differences between objects, but neglect object features. Although
object extraction results may have the same correct, false, and missing rates, different object features
derived from satellite images can generate different object qualities. Figure 2b shows that the overlap
areas between two evaluated objects and reference objects are the same, but their locations and
geometrical characteristics differ. This indicates that using object number or object area independently
cannot assess object extraction accurately. This issue can be solved to a certain extent by considering
more object features.

Figure 2. Schematic diagram of object area and number uncertainties: (a) feature difference; (b) feature
detail difference.

Besides object number and object area, object geometric features can also reflect object difference,
and thus can be used in complement to measure object extraction accuracy [31]. There are various
geometric measures, and this paper selects typical measurements (e.g., area, perimeter, and barycenter)
to design accuracy measures for OBIA.

The size difference reflects the basic similarity between two objects. Based on this observation,
an object-based accuracy assessment method using size and size similarity SM is defined as

$$S_M = \frac{\min(Size_C, Size_R)}{\max(Size_C, Size_R)} \qquad (8)$$

where SizeC denotes the size of the evaluated object and SizeR denotes the size of the reference object.
Standard geometric features, such as area, perimeter, and outer radius, can be used as assessment
indices. The range of size similarity is 0 to 1. If all evaluated objects have the same size as that of the
reference object, then SM = 1. If no evaluated object has the same size as that of the reference object,
then SM = 0.
Equation (8) ignores feature details of the object, which may lead to inaccurate evaluation results.
To tackle this issue, an improved size similarity SF is presented.

$$S_F = 1 - \frac{|f_C - f_R|}{\min(f_C, f_R)} \qquad (9)$$

where f C is the feature value of the evaluated object, f R is the feature value of the reference object,
and | f C − f R | is the feature difference between evaluated and reference objects. The features used in
Equation (9) include area, perimeter, and diameter. The range of improved size similarity is 0 to 1.
If all evaluated objects have the same size as that of the reference object, then SF = 1. When the ratio
between f C and f R exceeds 2 or is less than 0.5, then SF is set to 0.
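The two size measures translate directly into code. The sketch below is a minimal, hedged illustration (not the authors' implementation); the feature values are assumed to be positive scalars such as areas or perimeters.

```python
# Minimal sketch of the size similarity (Equation (8)) and the improved
# size similarity (Equation (9)); feature values are positive scalars.

def size_similarity(size_eval: float, size_ref: float) -> float:
    """S_M = min(Size_C, Size_R) / max(Size_C, Size_R)."""
    return min(size_eval, size_ref) / max(size_eval, size_ref)

def improved_size_similarity(f_eval: float, f_ref: float) -> float:
    """S_F = 1 - |f_C - f_R| / min(f_C, f_R); S_F is set to 0 when the
    ratio between f_C and f_R exceeds 2 or is less than 0.5."""
    ratio = f_eval / f_ref
    if ratio > 2.0 or ratio < 0.5:
        return 0.0
    return 1.0 - abs(f_eval - f_ref) / min(f_eval, f_ref)

# Example: evaluated object area 950 vs. reference area 1000.
print(size_similarity(950.0, 1000.0), improved_size_similarity(950.0, 1000.0))
```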
Equations (8) and (9) are relatively easy to implement, making them suitable for obtaining
assessment results in near-real time. However, these two measures completely ignore the location
difference that increases errors in the assessment results. To improve the measures in Equations (8) and
(9), Tversky's feature contrast model [32], which is based on a feature similarity description, is adopted.
This model measures SO , the similarity of two objects, using the following equation:

$$S_O = \frac{f(C \cap R)}{f(C \cap R) + \alpha f(C - R) + \beta f(R - C)} \qquad (10)$$

where f (C ∩ R) are common features of the evaluated object C and its reference object R, f (C − R)
denotes features that belong to the evaluated object C but not reference object R, and f ( R − C ) stands
for features that belong to the reference object R but not the evaluated object C, and α and β are weights
for f (C − R) and f ( R − C ), respectively.
Equation (10) can measure the similarity of objects at the class or individual scale. Features in
Equation (10) should be selected carefully, as some features (e.g., shape complexity, sphericity, and
circularity), are challenging to describe using f (C − R) and f ( R − C ). To improve the generalization
of the Tversky’s feature contrast model, this paper defines an improved matching similarity as follows:

$$S_O = \frac{f_A(C \cap R)}{f_A(C \cap R) + \alpha f_A(C - R) + \beta f_A(R - C)} \qquad (11)$$

where fA(C ∩ R) represents features of the intersection area of C and R, fA(C − R) denotes features of
the area of the evaluated object C after erasing the reference object R, and fA(R − C) denotes features
of the area of the reference object R after erasing the evaluated object C. The improved model considers
location differences and eases restrictions on feature selection. The range of SO is 0 to 1. If the extracted
and reference objects overlap completely, then SO = 1. If there is no overlap between the two objects, then SO = 0.
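The improved matching similarity is easy to evaluate with polygon set operations. Below is a minimal sketch using area as the feature fA and shapely for the intersection and difference operations; the weights and the example polygons are illustrative, and this is not the authors' implementation.

```python
# Minimal sketch of the improved matching similarity (Equation (11)),
# with area as the feature f_A; requires shapely. Illustrative only.
from shapely.geometry import Polygon

def matching_similarity(eval_poly: Polygon, ref_poly: Polygon,
                        alpha: float = 0.5, beta: float = 0.5) -> float:
    common = eval_poly.intersection(ref_poly).area    # f_A(C ∩ R)
    only_eval = eval_poly.difference(ref_poly).area   # f_A(C − R)
    only_ref = ref_poly.difference(eval_poly).area    # f_A(R − C)
    denom = common + alpha * only_eval + beta * only_ref
    return common / denom if denom > 0.0 else 0.0

# Example: two unit squares offset by 0.2 along x give a similarity of 0.8
# with equal weights.
c = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
r = Polygon([(0.2, 0), (1.2, 0), (1.2, 1), (0.2, 1)])
print(matching_similarity(c, r))
```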
Computing object similarity using a single feature will result in uncertain accuracy values.
A natural solution is to apply multiple features to calculate object similarity. To this end, the object
comprehensive similarity S0 is defined as

$$S_0 = \begin{cases} 0 & \text{if } T_C \neq T_R \\ \frac{1}{N}\sum_{i=1}^{N} u_i S_i & \text{otherwise} \end{cases} \qquad (12)$$

where TC and TR denote the classes of the evaluation and its matching reference objects, respectively,
N is the number of features, Si denotes the object similarity using the ith feature, and ui is the weight
of Si . The feature weight is determined according to the real scenario, and the determination basis can
be human subjectivity or feature applicability.
After computing the similarity of each evaluated object, the overall accuracy Soverall for object
extraction can be calculated by
$$S_{overall} = \sum_{j=1}^{M} w_j S_{0j} \qquad (13)$$

where S0j is the calculated similarity of the jth evaluated object, M is the number of evaluated objects,
and w j denotes the weight of jth evaluation object. The ratio of the area of the evaluated object to the
sum of the area of all objects can be used as the object weight.
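As a small illustration of how the per-feature similarities might be combined (Equation (12)) and then aggregated over all evaluated objects (Equation (13)), the sketch below folds the 1/N normalization of Equation (12) into feature weights that are assumed to sum to one (consistent with the combined values reported in Table 3, where area and perimeter are weighted 0.67 and 0.33); it is a hedged sketch, not the authors' implementation.

```python
# Minimal sketch of Equations (12) and (13), assuming normalized weights.

def comprehensive_similarity(class_eval, class_ref, feature_sims, feature_weights):
    """S_0: zero if the thematic classes disagree, otherwise the weighted
    combination of the per-feature similarities S_i (weights sum to one)."""
    if class_eval != class_ref:
        return 0.0
    return sum(u * s for u, s in zip(feature_weights, feature_sims))

def overall_similarity(object_sims, object_weights):
    """S_overall: weighted sum over objects, with w_j e.g. the ratio of an
    object's area to the total evaluated area."""
    return sum(w * s for w, s in zip(object_weights, object_sims))

# Example: one water object matched to a water reference object, with area
# and perimeter similarities 0.85 and 0.89 weighted 0.67 and 0.33.
s0 = comprehensive_similarity("water", "water", [0.85, 0.89], [0.67, 0.33])
print(s0, overall_similarity([s0], [1.0]))
```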

2.5. Distance-Based Accuracy Measures


The distance difference between two objects, which can be completely reflected by the boundary
distribution, is an essential aspect to evaluate the similarity between objects. Particularly, Pratt introduced
a figure of merit (FFOM ) model to evaluate the accuracy of image segmentation [33]. FFOM is defined
as follows:
$$F_{FOM} = \frac{1}{\max\{l_C, l_R\}} \sum_{i=1}^{l_C} \frac{1}{1 + d_i^2} \qquad (14)$$
where lC is the boundary pixel number of the evaluated object, lR is the boundary pixel number of its
matching reference object, and di is the distance of the ith boundary pixel from the evaluated object to
the corresponding pixel on the reference object’s boundary. Based on FFOM , the shape similarity BD is
defined as follows:
$$B_D = \frac{1}{\max\{l_C, l_R\}} \sum_{i=1}^{l_C} \frac{1}{1 + \frac{d_i}{\max(r_C, r_R)}} \qquad (15)$$

where rC and rR are the radii of the circumcircles for the evaluation and reference objects, respectively.
Generally, the boundary of the extracted object cannot be strictly the same as that of the reference
due to the error propagation during image interpretation. This phenomenon reduces the feasibility of
using Equation (15). To improve the flexibility of BD , a tolerance is set to judge if the two objects are
identical. The improved BD is defined as

$$B_D = \frac{1}{\max\{l_C, l_R\}} \sum_{i=1}^{l_C} f_i, \qquad f_i = \begin{cases} 0 & d_i \geq d_2 \\ \frac{1}{1 + \frac{d_i}{\max(r_C, r_R)}} & d_1 < d_i < d_2 \\ 1 & d_i \leq d_1 \end{cases} \qquad (16)$$
where d1 and d2 are two thresholds. The value of d1 represents the tolerance for random errors. If the
distance di is less than the threshold d1, the distance di can be tolerated. The value of d2 represents
the unacceptable value of the error. If the distance di is larger than the threshold d2 , there is no reference
object boundary pixel that matches the ith boundary pixel of the evaluation object. In combination
with application purposes, the values of d1 and d2 are determined by the size of the object and the
spatial resolution of the image. The range of shape similarity is from 0 to 1. If all distances from
boundary pixels of the objects to matching reference object boundary are within the tolerance range,
then BD = 1. If all boundary pixels of the objects are incorrectly extracted, then BD = 0.
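A compact way to read Equation (16) is as a tolerance-weighted average over boundary pixels. The sketch below is an illustrative implementation under that reading (not the authors' code); the per-pixel distances, boundary lengths, radii, and thresholds are assumed to be precomputed.

```python
# Minimal sketch of the tolerance-based shape similarity of Equation (16).
# distances: d_i for each boundary pixel of the evaluated object;
# l_ref: number of boundary pixels of the reference object;
# r_eval, r_ref: circumcircle radii; d1, d2: tolerance thresholds.

def shape_similarity(distances, l_ref, r_eval, r_ref, d1, d2):
    r_max = max(r_eval, r_ref)
    total = 0.0
    for d in distances:
        if d <= d1:            # within the random-error tolerance
            total += 1.0
        elif d < d2:           # attenuated contribution
            total += 1.0 / (1.0 + d / r_max)
        # d >= d2 contributes nothing
    return total / max(len(distances), l_ref)

# Example: four evaluated boundary pixels, a reference boundary of five pixels,
# circumcircle radii of 10 and 11, and thresholds of 1 and 5 pixels.
print(shape_similarity([0.5, 1.5, 3.0, 6.0], 5, 10.0, 11.0, 1.0, 5.0))
```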
Equation (16) compares all boundary pixels of the extracted and reference objects, which leads to a
precise evaluation result but low computational efficiency. This process can be simplified by choosing
boundary pixels at equal intervals. The simplified shape similarity BL is defined as follows:

$$B_L = \left(1 - \frac{\sum_{i=1}^{k} |l_C(\theta_i) - l_R(\theta_i)|}{k \min(r_C, r_R)}\right)\left(1 - \frac{d_{C-R}}{2\min(r_C, r_R)}\right) \qquad (17)$$
where k is the direction number, lC(θi) and lR(θi) denote the distances from the barycenter to the
boundary of the evaluation and reference objects along the direction θi, respectively, and θi = i·2π/k.
If the evaluation or reference object is a concave polygon, where there may be many boundary points
along the direction θi, lC(θi) or lR(θi) is replaced by the mean distance. dC−R is the distance between
the evaluated object barycenter and the reference object barycenter. The range of BL is from 0 to 1.
If all sampled boundary pixels of the objects overlap with the matching reference object boundary,
then BL = 1. If all evaluated objects have no matching reference objects, then BL = 0. Before calculating
the distance difference, the evaluated object is shifted to the gravity center of the reference object
(see Figure 3).

Figure 3. Illustration of Equation (17).

The number of sampled boundary pixels determines the assessment reliability as well as the
computational efficiency. If the requirement for reliability is high, the number of samples should be
appropriately increased. If the calculation speed needs to be increased, the number of samples should
be appropriately reduced.
Considering object classes, the object extraction accuracy Boverall of the entire evaluation area can
be calculated using Equation (18).

$$B_{overall} = \sum_{j=1}^{M} w_j B'_j, \qquad B'_j = \begin{cases} 0 & \text{if } T_{C,j} \neq T_{R,j} \\ B_j & \text{otherwise} \end{cases} \qquad (18)$$
where TC,j and TR,j denote the classes of the jth evaluated object and its matched reference object,
Bj is the shape similarity of the jth evaluated object with its matched reference object computed using
one of the above shape similarities, M is the number of evaluated objects, and wj denotes the weight of
the jth evaluation object. The proportion of the area of the evaluated object to the total area of objects
can be used as the object weight.
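To make the directional sampling of Equation (17) concrete, the following minimal sketch computes the simplified shape similarity from precomputed radial profiles; the clamping to [0, 1] is an added assumption (the measure is stated to lie in that range), and this is not the authors' implementation.

```python
# Minimal sketch of the simplified shape similarity of Equation (17).
# l_eval[i], l_ref[i]: barycenter-to-boundary distances of the evaluated and
# reference objects along direction theta_i = i*2*pi/k; d_cr: distance between
# the two barycenters; r_eval, r_ref: circumcircle radii. Illustrative only.

def simplified_shape_similarity(l_eval, l_ref, r_eval, r_ref, d_cr):
    k = len(l_eval)
    r_min = min(r_eval, r_ref)
    radial_term = 1.0 - sum(abs(c - r) for c, r in zip(l_eval, l_ref)) / (k * r_min)
    centre_term = 1.0 - d_cr / (2.0 * r_min)
    # Clamp to [0, 1]; an assumption, since Equation (17) is stated to range from 0 to 1.
    return max(0.0, radial_term) * max(0.0, centre_term)

# Example: eight sampled directions with nearly identical radial profiles and
# barycenters 0.4 units apart.
lc = [5.0, 4.8, 4.9, 5.1, 5.0, 4.7, 4.9, 5.0]
lr = [5.1, 4.9, 4.8, 5.0, 5.1, 4.8, 5.0, 4.9]
print(simplified_shape_similarity(lc, lr, 5.0, 5.2, 0.4))
```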

3. Experimental Results and Analysis


In this section, the performance of the presented method was validated on two object types
(i.e., water and building), as they are representative of natural and artificial scenarios in general.
Generally, the extraction performance of water is satisfactory because water has a distinct boundary
feature compared to its surrounding pixels. Compared to water extraction, the building extraction
results may contain more errors that lead to lower accuracy, due to the complex environment
surrounding buildings. On the other hand, the building boundary is regular, while the water boundary
is irregular. The experiments were conducted on a PC with an Intel Core2Quad processor at a clock
speed of 1.80 GHz. MATLAB® and ArcGIS® were utilized to produce experimental results.

3.1. Data Description

Two QuickBird images were selected to validate the proposed method. One image, with a spatial
resolution of 2.4 m per pixel and an area of 1200 × 1200 pixels, was acquired on 16 July 2009 over
Wuhan, China (see Figure 4a). The study area, located on the outskirts of the city, was mainly covered
by water, farmland, roads, and buildings. Another pan-sharpened image, with a spatial resolution of
0.61 m per pixel and an area of 400 × 400 pixels, was acquired on 2 May 2005 over Xuzhou, China
(see Figure 5a). The second study area, located near the city center, was mainly covered by buildings,
roads, water, bare land, and grassland. Figures 4b and 5b present two complete reference maps
produced via manual interpretation.

Figure 4. Water extraction results. Centre coordinates: 30°29′41″ N, 114°31′54″ E: (a) remote sensing
image; (b) reference data; (c) extraction results; and (d) comparisons of extraction results with
reference data.

Figure 5. Building extraction results. Centre coordinates: 34°10′49″ N, 117°09′16″ E: (a) remote sensing
image; (b) reference data; (c) extraction results; and (d) comparisons of extraction results with
reference data.

3.2. Object Extraction


Satellite images are processed to generate objects that will be subsequently used to verify the
performance of OBIA assessment measures. To this end, an improved watershed segmentation
method [34] was applied. The advantage of this method is that it integrates the spectral information,
texture feature and spatial relationships, which in turn makes it able to produce objects whose sizes
are closer to the true sizes. Once the segmentation results were obtained, object features, including the
geometric characteristic, modified normalized difference water index (WNDWI) and normalized
difference vegetation index (NDVI), were computed. Finally, the decision tree [35] using object features
as input was performed to classify the image into target and background classes. The classification
results are shown in Figures 4c and 5c, respectively.

3.3. Evaluation of Object Extraction Accuracy Using Different Measures


The accuracy assessment is carried out by comparing extracted results with reference data,
as shown in Figures 4d and 5d. In this paper, the accuracy assessment employs all objects rather than
a fixed number of test samples.
The object extraction accuracies of two classes are firstly evaluated by the area-based measurement,
and Table 1 reports the evaluation results. It can be seen that the water class performs better than
the building class. The reason is that water is easily separated from background due to the relatively
large spectral difference between water and the surrounding objects. However, both the material variation
of building roofs and the spectral similarity between buildings and their nearby objects decrease the
extraction performance. The different performances of the water and building classes are shown in Table 1,
indicating that the area-based accuracy measure can assess OBIA accuracy in a straightforward and
efficient manner.

Table 1. Object extraction evaluation in terms of the area-based measurement.

Class      Correctness (%)   Completeness (%)   Quality (%)
Water      93.55             89.09              83.94
Building   76.34             83.84              66.55

In the second experiment, an accuracy evaluation was conducted using the object number-based
accuracy index. To this end, numbers of total, correct, incorrect, and missing objects need to be
computed in advance. As in the first experiment, the object matching method is used to judge whether
objects are extracted correctly. Since automatic object extraction methods can rarely achieve perfectly
accurate results, to assess the precision characteristic based on the object number, different
object matching thresholds are set to judge if the object is extracted correctly. Table 2 reports the
evaluation results. The water class has a generally better performance than the building class, indicating
that the object number-based index can reflect the OBIA performance. Table 2 also shows that the
choice of threshold value has a profound impact on the number of correctly identified objects. If the
threshold value is high, the correct number is low; conversely, a low threshold value leads to a high
correct rate. Thus, the threshold can be taken as a guideline for users to measure the confidence level.
That is, if objects with high confidence are required, the threshold should be set to a large value.

Table 2. Object extraction evaluation using the object number-based measurement.

Class      Threshold   Correct Number   Correct Rate (%)   False Rate (%)   Missing Rate (%)
Water      0.90        38               56.72              43.28            46.48
           0.85        55               82.09              17.91            22.54
           0.80        63               94.03              5.97             11.27
Building   0.90        4                9.30               90.70            90.48
           0.85        21               48.84              51.16            50.00
           0.80        31               72.09              27.91            26.19

Geometric features can effectively reflect the characteristics of an object. This experiment validates
their potential in evaluating OBIA results. To this end, area and perimeter are selected to measure the
feature similarity between the extracted and reference objects. In this experiment, weights of area
and perimeter are set as 0.67 and 0.33, respectively, by trial and error. The size similarity is calculated
using Equations (8), (12), and (13), the improved size similarity is calculated using Equations (9), (12),
and (13), and the matching similarity is calculated using Equations (11)–(13), as shown in Table 3.
The evaluation results indicate similar trends in different similarity measures for the two experimental
areas. Similar to other assessment measures, all the similarity accuracies for the Xuzhou area are
lower than those for the Wuhan area. The size similarity and improved size similarity in terms of
area are both lower than those in terms of perimeter. However, matching similarity in terms of area
is higher than that in terms of perimeter. For both the area and perimeter as assessment measures,
the size similarity is always higher than the improved size similarity and the matching similarity
is the lowest. This difference stems from the size similarity and improved size similarity, which do
not consider the positional differences. The matching similarity considers the positional difference
between the evaluated object and the reference data, which better reflects the similarity of features.
The similarities of objects, calculated using different methods and features, differ, and the accuracy
(based on similarity) also contains great uncertainty. To obtain stable evaluation results, more features
should be considered to calculate the similarity.

Table 3. Assessing quality in terms of similarity.

Class      Index                Size Similarity (%)   Improved Size Similarity (%)   Matching Similarity (%)
Water      Area                 84.86                 82.57                          77.09
           Perimeter            89.12                 87.88                          58.92
           Area and perimeter   86.28                 84.34                          71.04
Building   Area                 79.19                 73.74                          66.21
           Perimeter            81.16                 78.55                          55.55
           Area and perimeter   79.85                 75.34                          62.66

The Euclidean distance between the gravity centers is used to calculate the distance between the
evaluated object and its matching reference object. Thresholds d1 and d2 are set to the width of a pixel
and five pixels, respectively. The shape similarity is calculated using Equations (16) and (18), and the
improved shape similarity is estimated using Equations (17) and (18), as shown in Table 4. In the two
experimental areas, the similarity based on boundary distance is slightly lower than that based on
boundary difference and barycenter distance. In the Wuhan area, where the precision of object extraction
is relatively high, the similarity based on boundary distance is very close to that based on boundary
difference and barycenter distance. Both similarities are calculated by
comparing the detailed differences in objects, which fully reflects the differences in objects and ensures
that the assessment measures are more stable. Although they require a tedious calculation process,
these two measures need to be considered when assessing the accuracy of object extraction requiring
high precision.

Table 4. Assessing quality in terms of distance.

Class      Shape Similarity (%)   Improved Shape Similarity (%)
Water      85.50                  85.90
Building   76.46                  77.99

A comprehensive comparison of the object extraction for the two experimental areas is generated
using Tables 1–4. According to the comprehensive comparison, we can conclude that the water
extraction result is generally better than the building extraction result. The superior results are due to
the greater spectral difference between water and its surrounding objects than that between buildings
and their surrounding objects, especially as the interior pixels of water have high homogeneity and
building structures are complex.

4. Discussion
This paper presents four kinds of accuracy measures based on different object characteristics,
namely, area-based, object number-based, feature similarity-based, and distance-based accuracy
measures. Since the study area is usually large, a suitable sampling method needs to be selected before
assessing the accuracy. The accuracy assessment results using object area are similar to those at the
pixel level. Accuracy assessment based on object number requires that the objects are first extracted
correctly. The criteria used to check if the objects are correctly or falsely identified have a profound
impact on the assessment results. Many characteristics can be used as the basis of assessment measures
for object feature similarity. The selection of the basis has a significant impact on the assessment results:
unreasonable feature selection will lead to unreliable assessment results (such as when only perimeter
or length is used, or equal weights are used for area and perimeter). Thus, feature selection and feature
weight are essential to determine the optimal basis.
The computation of difference-based accuracy measures is relatively complex. Selecting boundary
pixels at reasonable intervals can improve the computational efficiency while retaining the reliability
of the assessment results. Both area- and object number-based accuracy measures ignore the object
detail. By contrast, feature-based accuracy assessments do not directly judge an object to be correct or
otherwise; this aspect can better reflect the feature difference between the extraction object and the
reference object. The accuracy measures based on boundary/location difference consider details of the
object and reflect the object local feature difference, resulting in more accurate and confident results.
Each assessment measure has its own advantages and disadvantages. Thus, the measures should be
selected carefully according to the requirements. Specifically, if the accuracy needs to be computed
in near-real time, the area-based measure would be chosen. This is because the area-based accuracy
measure is straightforward and does not require an object matching process. Despite some errors,
considering the advantage of computational efficiency, area-based measures can be selected to obtain
faster assessment results with fewer requirements. If object extraction accuracy and its confidence
are required simultaneously, the object number-based accuracy measure is recommended, as the
threshold of coincidence degree is related to the confidence level: the larger the threshold, the higher
the confidence level. If object extraction accuracy is to be understood comprehensively, feature-based
and distance-based accuracy measurements are advisable as they fully consider detailed information
of the object, such as shape and size.

5. Conclusions
A series of factors influence the assessment of object extraction from remote sensing images,
which makes a complete and general accuracy index difficult to obtain. To tackle this issue, this paper
presents four novel assessment measures with different criteria. The designed measurements are
highly generalizable and provide users with practical means to evaluate object extraction results
according to their unique needs. The methods presented in this paper require static objects with clearly
defined edges. Further investigation and experimentation into dynamic objects (e.g., moving clouds,
cars, and ships) with fuzzy boundaries is strongly recommended. The accuracy for objects with
indeterminate boundaries can be assessed by two means: (1) assuming that the determinate boundaries
are assigned to an object with indeterminate boundaries; and (2) setting the tolerance for the uncertainty
of object boundaries (for example, the accuracy can be obtained by calculating the shape similarity
based on the tolerance of object boundary distance).

Acknowledgments: This work was supported partly by the National Natural Science Foundation of China
(41331175 and 41701500), the Shandong Social Science Planning Fund Program (17CGLJ27), a Project of Shandong
Province Higher Educational Science and Technology Program (J17KA064), and the Open Fund of Key Laboratory
of Coastal Zone Exploitation and Protection, Ministry of Land and Resource (2017CZEPK02).
Author Contributions: Liping Cai proposed the study, conducted the experiments, interpreted the results,
and wrote and revised the manuscript. Wenzhong Shi advised on the study design and the manuscript structure.
Zelang Miao advised on the manuscript structure and contributed to the manuscript writing and revision.
Ming Hao advised on the manuscript writing and revision.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Guanter, L.; Segl, K.; Kaufmann, H. Simulation of optical remote-sensing scenes with application to the
EnMAP hyperspectral mission. IEEE Trans. Geosci. Remote Sens. 2009, 47, 2340–2351. [CrossRef]
2. Bovensmann, H.; Buchwitz, M.; Burrows, J.P.; Reuter, M.; Krings, T.; Gerilowski, K.; Schneising, O.;
Heymann, J.; Tretner, A.; Erzinger, J. A remote sensing technique for global monitoring of power plant CO2
emissions from space and related applications. Atmos. Meas. Tech. 2010, 3, 781–811. [CrossRef]
3. Pajares, G. Overview and current status of remote sensing applications based on Unmanned Aerial Vehicles
(UAVs). Photogramm. Eng. Remote Sens. 2015, 81, 281–329. [CrossRef]
4. Yu, Q. Object-based detailed vegetation classification with airborne high spatial resolution remote sensing
imagery. Photogramm. Eng. Remote Sens. 2006, 72, 799–811. [CrossRef]
5. Blaschke, T. Object based image analysis for remote sensing. ISPRS J. Photogramm. Remote Sens. 2010, 65, 2–16.
[CrossRef]
6. Hussain, M.; Chen, D.; Cheng, A.; Wei, H.; Stanley, D. Change detection from remotely sensed images:
From pixel-based to object-based approaches. ISPRS J. Photogramm. Remote Sens. 2013, 80, 91–106. [CrossRef]
7. Kettig, R.L.; Landgrebe, D.A. Classification of multispectral image data by extraction and classification of
homogeneous objects. IEEE Trans. Geosci. Electron. 1976, 14, 19–26. [CrossRef]
8. Zhan, Q.; Molenaar, M.; Tempfli, K.; Shi, W. Quality assessment for geo-spatial objects derived from remotely
sensed data. Int. J. Remote Sens. 2005, 26, 2953–2974. [CrossRef]
9. Möller, M.; Lymburner, L.; Volk, M. The comparison index: A tool for assessing the accuracy of image
segmentation. Int. J. Appl. Earth Obs. Geoinf. 2007, 9, 311–321. [CrossRef]
10. Rutzinger, M.; Rottensteiner, F.; Pfeifer, N. A comparison of evaluation techniques for building extraction
from airborne laser scanning. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2009, 2, 11–20. [CrossRef]
11. Hofmann, P.; Blaschke, T.; Strobl, J. Quantifying the robustness of fuzzy rule sets in object-based image
analysis. Int. J. Remote Sens. 2011, 32, 7359–7381. [CrossRef]
12. Radoux, J.; Bogaert, P. Accounting for the area of polygon sampling units for the prediction of primary
accuracy assessment indices. Remote Sens. Environ. 2014, 142, 9–19. [CrossRef]
13. Styers, D.M.; Moskal, L.M.; Richardson, J.J.; Halabisky, M.A. Evaluation of the contribution of LiDAR data
and postclassification procedures to object-based classification accuracy. J. Appl. Remote Sens. 2014, 8, 083529.
[CrossRef]
14. Whiteside, T.G.; Maier, S.W.; Boggs, G.S. Area-based and location-based validation of classified image objects.
Int. J. Appl. Earth Obs. Geoinf. 2014, 28, 117–130. [CrossRef]
15. Shi, W.; Zhang, X.; Hao, M.; Shao, P.; Cai, L.; Lyu, X. Validation of land cover products using reliability
evaluation methods. Remote Sens. 2015, 7, 7846–7864. [CrossRef]
16. Yang, J.; He, Y.; Caspersen, J.; Jones, T. A discrepancy measure for segmentation evaluation from the
perspective of object recognition. ISPRS J. Photogramm. Remote Sens. 2015, 101, 186–192. [CrossRef]
17. Zhang, X.; Feng, X.; Xiao, P.; He, G.; Zhu, L. Segmentation quality evaluation using region-based precision
and recall measures for remote sensing images. ISPRS J. Photogramm. Remote Sens. 2015, 102, 73–84.
[CrossRef]
18. Maclean, M.G.; Congalton, R.G. Map accuracy assessment issues when using an object-oriented approach.
In Proceedings of the ASPRS 2012 Annual Conference, Sacramento, CA, USA, 19–23 March 2012; pp. 19–23.
19. Clinton, N.; Holt, A.; Scarborough, J.; Yan, L.; Gong, P. Accuracy assessment measures for object-based image
segmentation goodness. Photogramm. Eng. Remote Sens. 2010, 76, 289–299. [CrossRef]
20. Montaghi, A.; Larsen, R.; Greve, M.H. Accuracy assessment measures for image segmentation goodness of
the Land Parcel Identification System (LPIS) in Denmark. Remote Sens. Lett. 2013, 4, 946–955. [CrossRef]
21. Awrangjeb, M.; Fraser, C.S. An automatic and threshold-free performance evaluation system for building
extraction techniques from airborne LiDAR data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.
2014, 7, 4184–4198. [CrossRef]
22. Bonnet, S.; Gaulton, R.; Lehaire, F.; Lejeune, P. Canopy Gap Mapping from Airborne Laser Scanning:
An Assessment of the Positional and Geometrical Accuracy. Remote Sens. 2015, 7, 11267–11294. [CrossRef]
23. Shahzad, N.; Ahmad, S.R.; Ashraf, S. An assessment of pan-sharpening algorithms for mapping mangrove
ecosystems: A hybrid approach. Int. J. Remote Sens. 2017, 38, 1579–1599. [CrossRef]
24. Kuffer, M.; Pfeffer, K.; Sliuzas, R.; Baud, I. Extraction of slum areas from VHR imagery using GLCM variance.
IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 1830–1840. [CrossRef]
25. Möller, M.; Birger, J.; Gidudu, A.; Gläßer, C. A framework for the geometric accuracy assessment of classified
objects. Int. J. Remote Sens. 2013, 34, 8685–8698. [CrossRef]
26. Cheng, J.; Bo, Y.; Zhu, Y.; Ji, X. A novel method for assessing the segmentation quality of high-spatial
resolution remote-sensing images. Int. J. Remote Sens. 2014, 35, 3816–3839. [CrossRef]
27. Eisank, C.; Smith, M.; Hillier, J. Assessment of multiresolution segmentation for delimiting drumlins in
digital elevation models. Geomorphology 2014, 214, 452–464. [CrossRef] [PubMed]
28. Zhang, X.; Xiao, P.; Feng, X.; Feng, L.; Ye, N. Toward evaluating multiscale segmentations of high spatial
resolution remote sensing images. IEEE Trans. Geosci. Remote Sens. 2015, 53, 3694–3706. [CrossRef]
29. Radoux, J.; Bogaert, P. Good Practices for Object-Based Accuracy Assessment. Remote Sens. 2017, 9, 646.
[CrossRef]
30. Witharana, C.; Civco, D.L.; Meyer, T.H. Evaluation of data fusion and image segmentation in earth
observation based rapid mapping workflows. ISPRS J. Photogramm. Remote Sens. 2014, 87, 1–18. [CrossRef]
31. Lizarazo, I. Accuracy assessment of object-based image classification: Another STEP. Int. J. Remote Sens.
2014, 35, 6135–6156. [CrossRef]
32. Tversky, A. Features of similarity. Read. Cognit. Sci. 1977, 84, 290–302. [CrossRef]
33. Pratt, W. Introduction to Digital Image Processing; CRC Press: Boca Raton, FL, USA, 2013.
34. Cai, L.; Shi, W.; He, P.; Miao, Z.; Hao, M.; Zhang, H. Fusion of multiple features to produce a segmentation
algorithm for remote sensing images. Remote Sens. Lett. 2015, 6, 390–398. [CrossRef]
35. Friedl, M.A.; Brodley, C.E. Decision tree classification of land cover from remotely sensed data.
Remote Sens. Environ. 1997, 61, 399–409. [CrossRef]

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
