Abstract: To combine feature extraction operations with specific hyperspectral remote sensing information processing objectives, two aspects of feature extraction were explored. Based on clustering and decision tree algorithms, the spectral absorption index (SAI), continuum removal and derivative spectral analysis were employed to discover the characteristic spectral features of different targets, and decision trees for identifying a specific class and for discriminating different classes were generated. By combining a support vector machine (SVM) classifier with different feature extraction strategies, including principal component analysis (PCA), minimum noise fraction (MNF), grouping PCA and derivative spectral analysis, the performance of these feature extraction approaches in classification was evaluated. The results show that feature extraction by PCA and by derivative spectral analysis is effective for classifying OMIS (operational modular imaging spectrometer) images with SVM, and that SVM outperforms the traditional SAM and MLC classifiers on OMIS data.
Keywords: hyperspectral remote sensing; feature extraction; decision tree; SVM; OMIS
…116.254089′E. The OMIS sensor has 64 wavebands within the spectral interval from 0.46 μm to 1.1 μm. The experimental image consists of 512 rows and 512 columns. Fig. 1 is the RGB composite image with Band 36 (0.81 μm), Band 23 (0.68 μm) and Band 11 (0.56 μm) as the red, green and blue components respectively.

[Fig. 1 False color composite of the original hyperspectral remote sensing image; labeled cover types include water, cropland and bare soil]
3 Feature extraction for target identification based on data mining

This section focuses on spectral feature extraction for target identification based on data mining algorithms. For given spectra, acquired either from field measurements with a spectrometer or from the pixels of a hyperspectral remote sensing image, how to extract the significant or characteristic spectral features that can characterize the objects is still an important task for hyperspectral information processing and applications. Different methods of extracting characteristic spectral features have been researched in the past[8–9]. Data mining (DM) algorithms, including clustering, association rules and decision trees, can provide powerful new tools for intelligent hyperspectral information processing and feature discovery.
3.1 Clustering

Before candidate spectral features are derived, clustering is used to categorize the processed spectra into different groups for further processing. Hyperspectral image or spectral clustering aims to partition all pixels into different categories based on a similarity measure. Four spectral similarity indicators, namely spectral angle, spectral information divergence, distance and correlation coefficient, are commonly used in clustering. After comparing these four indexes, the spectral angle is adopted as the similarity measure.
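For illustration, the spectral angle between two spectra, and a greedy grouping of spectra by that angle, can be sketched as follows in Python. This is a minimal sketch rather than the exact procedure used in the paper; the angle threshold and the greedy assignment strategy are assumptions.

```python
import numpy as np

def spectral_angle(s1, s2):
    """Angle (radians) between two spectra; a small angle means a similar
    spectral shape. The measure ignores overall brightness differences."""
    cos = np.dot(s1, s2) / (np.linalg.norm(s1) * np.linalg.norm(s2))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def angle_clusters(spectra, threshold=0.1):
    """Greedy grouping: join the first cluster whose reference spectrum is
    within `threshold` radians, otherwise start a new cluster.
    The threshold value is illustrative only."""
    refs, labels = [], []
    for s in spectra:
        for k, ref in enumerate(refs):
            if spectral_angle(s, ref) < threshold:
                labels.append(k)
                break
        else:
            refs.append(s)
            labels.append(len(refs) - 1)
    return np.asarray(labels)
```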
…multiple components. It is difficult to extract characteristic spectral features from the original spectral curve directly. Pre-processing of spectral curves is therefore used to reinforce the characteristic reflectance and absorption features. Continuum removal is a very effective algorithm for spectral curve processing[10]. The continuum of a spectral curve is its upper convex hull, and the continuum-removed value is the ratio of the actual value to the corresponding value on the continuum. By using continuum removal, the reflectance and absorption features are reinforced. Comparing the spectral features extracted from the original data with those extracted from the continuum-removed data proves that continuum removal is effective for extracting significant spectral features.
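A minimal sketch of continuum removal, assuming the continuum is the upper convex hull of the (wavelength, reflectance) points, built with a monotone-chain scan and linearly interpolated between hull vertices:

```python
import numpy as np

def continuum_removed(wavelengths, reflectance):
    """Divide a reflectance spectrum by its continuum (upper convex hull).
    Returns values in (0, 1]; absorption troughs become pronounced dips."""
    wl = np.asarray(wavelengths, dtype=float)
    r = np.asarray(reflectance, dtype=float)

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); >= 0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    hull = []
    for p in zip(wl, r):  # bands are assumed sorted by wavelength
        # Keep only clockwise turns so the chain stays an upper hull.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    hx, hy = zip(*hull)
    continuum = np.interp(wl, hx, hy)
    return r / continuum
```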
3.3 Spectral absorption feature

A spectral curve picked from hyperspectral data can illustrate the reflectance or absorption features of ground objects, because most objects have typical spectral features that are highly correlated with their chemical components[11]. Spectral absorption features can be extracted from the spectral curve directly or from derivative spectra indirectly, and parameters including width, height, slope and symmetry can be further derived to calculate a comprehensive index, the SAI, for every spectral absorption location. The SAI is obtained by Eq.(1)[11]:

SAI = (W · Rs + (1 − W) · Re) / Rm    (1)

where W = (λe − λm)/(λe − λs); Rs and λs are the reflectance and wavelength of the absorption shoulder to the left of the absorption trough, Re and λe are the reflectance and wavelength of the shoulder to the right of the trough, and Rm and λm are the reflectance and wavelength of the absorption trough itself. The wavelength of the absorption trough band is called the spectral absorption position (SAP).
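Given band indices for the two shoulders and the trough of an absorption feature, Eq.(1) translates directly into code. The detection heuristic below (trough at the minimum of the continuum-removed curve, shoulders where the curve returns to the hull) is an assumption for illustration, reusing continuum_removed from the sketch above:

```python
import numpy as np

def sai(wl, refl, i_s, i_m, i_e):
    """Spectral absorption index of Eq.(1); i_s, i_m, i_e index the left
    shoulder, the absorption trough and the right shoulder."""
    w = (wl[i_e] - wl[i_m]) / (wl[i_e] - wl[i_s])
    return (w * refl[i_s] + (1.0 - w) * refl[i_e]) / refl[i_m]

def sap_and_sai(wl, refl):
    """Locate one absorption feature and return its (SAP, SAI)."""
    cr = continuum_removed(wl, refl)      # from the previous sketch
    i_m = int(np.argmin(cr))              # trough = deepest dip
    on_hull = cr > 1.0 - 1e-6             # bands touching the continuum
    left = np.nonzero(on_hull[:i_m])[0]
    right = np.nonzero(on_hull[i_m + 1:])[0] + i_m + 1
    i_s = int(left[-1]) if left.size else 0
    i_e = int(right[0]) if right.size else len(wl) - 1
    return wl[i_m], sai(wl, refl, i_s, i_m, i_e)
```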
Some researches on spectroscopy have shown that the SAI can essentially represent the variation characteristics of spectral absorption features. Based on our experiments, the SAI is suitable for identifying and discriminating objects. The SAP and SAI indexes that are suitable for constructing the decision tree can be derived by selecting the most distinct features. For example, features that are effective in describing different classes are listed in Table 1.

Table 1 Spectral features derived from the spectral curves of different objects

Objects    SAI      SAP (nm)
Crop I     0.5389   690
Crop II    0.4520   690

Note: Crop I is the spectrum from (X:282, Y:188), Crop II from (X:244, Y:150), Grassland from (X:314, Y:46), Built-up land from (X:65, Y:448) and Bare soil from (X:443, Y:352), where X is the column and Y is the row of the imagery.

[Fig. 3 Multi-class decision tree for target identification]

In Fig. 3, T0, T1, T2 and T3 are discrimination rules, and they may differ when various spectral curves are used. The following discrimination conditions are based on third-order derivative spectra:

T0: (SAP = 690±10 and SAI = 0.5±0.05);
T1: (SAP = 530±10 and SAI = 0.65±0.05);
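Each condition is a simple tolerance box on (SAP, SAI), so the rules can be chained as below. Which class each rule selects is defined by Fig. 3, which is not reproduced here, so the returned labels are placeholders:

```python
def matches(sap, sai_val, sap0, sai0, d_sap=10.0, d_sai=0.05):
    """True if (SAP, SAI) lies inside the rule's tolerance box,
    e.g. T0 is SAP = 690 +/- 10 nm and SAI = 0.5 +/- 0.05."""
    return abs(sap - sap0) <= d_sap and abs(sai_val - sai0) <= d_sai

def apply_rules(sap, sai_val):
    """Cascade of the discrimination rules quoted in the text."""
    if matches(sap, sai_val, 690.0, 0.50):
        return "T0"   # class assigned by the T0 branch of Fig. 3
    if matches(sap, sai_val, 530.0, 0.65):
        return "T1"   # class assigned by the T1 branch of Fig. 3
    return None       # T2/T3 conditions are not given in the text
```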
If we want to distinguish some objects from others, the characteristic features that can stand for each object should be extracted and employed. Spectra of different groups can be compared; common features are then ignored, because they are not effective in distinguishing different targets, while the features that can distinguish one class from the others are retained. These effective features are listed in Table 4.

Table 4 Valid features to distinguish grassland, built-up land and bare soil from crops

Valid features    Crop I   Crop II   Grassland   Built-up land   Bare soil
Wavelength (nm)   690      690       530         530             630
Width             130      130       150         70              70
Depth             0.5977   0.6455    0.1882      0.1812
Slope             1.2324   1.1125    –1.1651
Symmetry          0.3846   0.3846    0.2667      0.2667          0.2857
SAI               0.5389   0.4520    0.7417      0.7417          0.9029
By the above operations, the characteristic features for recognizing a specific class and for discriminating different classes can be extracted. Based on these features, the decision tree for identifying a specific class can be created and then adopted in further image processing and classification.

…these conditions may be combined to give Eq.(2)[15–17]:

yi(w · xi + b) ≥ 1,  i = 1, 2, …, N    (2)

The geometrical margin between the two classes is given by 2/‖w‖ and is named the margin. The concept of margin is central in the SVM approach, since it is a measure of the classifier's generalization capability: the larger the margin, the higher the expected generalization. Accordingly, the optimal hyperplane can be determined as the solution of the following convex quadratic programming problem:

min(w,b): (1/2)‖w‖²
subject to: yi[(w · xi) + b] ≥ 1,  i = 1, 2, …, N    (3)

This classical linearly constrained optimization problem can be translated (using a Lagrangian formulation) into the following dual problem:

max(α): Σ(i=1..N) αi − (1/2) Σ(i=1..N) Σ(j=1..N) αi αj yi yj (xi · xj)
subject to: Σ(i=1..N) αi yi = 0,  αi ≥ 0,  i = 1, 2, …, N    (4)
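In practice the margin maximization of Eqs.(2)–(4) is handled by an off-the-shelf solver. A sketch with scikit-learn, assuming the paper's σ is the width of an RBF kernel K(x, y) = exp(−‖x − y‖²/(2σ²)), so that scikit-learn's gamma corresponds to 1/(2σ²); the paper does not state this mapping explicitly:

```python
from sklearn.svm import SVC

def train_svm(X_train, y_train, C=32.0, sigma=0.2):
    """Soft-margin SVM with an RBF kernel; C and sigma follow the values
    reported in Tables 5 and 7. Multi-class handling is one-vs-one,
    scikit-learn's default for SVC."""
    gamma = 1.0 / (2.0 * sigma ** 2)  # assumed sigma-to-gamma mapping
    return SVC(C=C, kernel="rbf", gamma=gamma).fit(X_train, y_train)
```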
…Different feature extraction approaches, including PCA, MNF and grouping PCA, are tested.

Different combination schemes of principal components are tested. The classification accuracy using different numbers of components is shown in Table 5. At the beginning, accuracy increases with the number of components, but beyond a certain stage the classification accuracy decreases as more components are used. The possible reason is that the noise contained in the latter components is introduced when more components are used.

Table 5 Classification accuracy of SVM using different PCA components

Amount of components   Total accuracy (%)   Kappa   C    σ
1                      40.57                0.299   32   0.5
2                      65.52                0.540   32   0.2
3                      67.39                0.576   32   32
4                      67.63                0.581   32   0.2
5                      68.83                0.591   32   0.2
10                     66.92                0.567   32   0.1
15                     66.54                0.561   16   0.06
20                     66.35                0.559   8    0.05
30                     66.05                0.550   8    0.0333
40                     65.95                0.550   8    0.03
50                     65.56                0.541   4    0.02
60                     64.30                0.530   4    0.02
In order to compare the performance of the SVM classifier with the traditional spectral angle mapper (SAM) and maximum likelihood classifier (MLC) on OMIS data, the classification accuracies of SAM and MLC are listed in Table 6.

Table 6 Classification accuracy of SAM and MLC using PCA components

Amount of components   SAM total accuracy (%)   SAM Kappa   MLC total accuracy (%)   MLC Kappa
2                      17.35                    0.008       15.36                    0.009
3                      17.57                    0.010       16.46                    0.001
4                      17.44                    0.008       15.25                    0.013
5                      17.45                    0.008       16.56                    0.003
10                     17.60                    0.010       15.52                    0.009
15                     17.62                    0.010       15.16                    0.014
20                     17.61                    0.010       14.95                    0.014
30                     17.64                    0.011       15.76                    0.006
40                     17.66                    0.011       16.35                    0.001
50                     17.70                    0.011       16.42                    0.001
60                     17.69                    0.011       16.37                    0.001

This shows that SVM outperforms MLC and SAM for OMIS image classification; in particular, the classification accuracy of SVM is much higher than that of SAM and MLC when principal components are adopted. It also shows that the first 5 components can already yield high accuracy. Therefore, the SVM classifier using the first five principal components extracted from the original data is effective for OMIS hyperspectral image classification.
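The Table 5 experiment can be reproduced as a loop over component counts. A sketch reusing train_svm from the earlier block; the train/test split and parameter choices per count are assumptions:

```python
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, cohen_kappa_score

def pca_svm_experiment(X_train, y_train, X_test, y_test,
                       counts=(1, 2, 3, 4, 5, 10, 15, 20, 30, 40, 50, 60),
                       C=32.0, sigma=0.2):
    """Train an RBF SVM on the leading principal components and report
    overall accuracy and kappa for each component count, as in Table 5."""
    results = {}
    for n in counts:
        pca = PCA(n_components=n).fit(X_train)
        clf = train_svm(pca.transform(X_train), y_train, C=C, sigma=sigma)
        pred = clf.predict(pca.transform(X_test))
        results[n] = (accuracy_score(y_test, pred),
                      cohen_kappa_score(y_test, pred))
    return results
```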
When MNF is used for feature extraction and dimensionality reduction, the same component combinations as for PCA are used to test the three classifiers SVM, SAM and MLC. The classification accuracy indicators are shown in Table 7 and Table 8.

Table 7 Classification accuracy of SVM using MNF components

Amount of components   Total accuracy (%)   Kappa   C    σ
1                      43.56                0.316   32   0.5
2                      44.78                0.319   32   0.2
3                      44.89                0.330   32   32
4                      45.34                0.334   32   0.2
5                      45.47                0.350   32   0.2
10                     55.69                0.411   32   0.1
15                     56.53                0.424   16   0.06
20                     57.68                0.440   8    0.05
30                     56.21                0.414   8    0.03
40                     56.11                0.411   8    0.03
50                     55.98                0.409   4    0.02
60                     55.56                0.402   4    0.02

Table 8 Classification accuracy of SAM and MLC using MNF components

Amount of components   SAM total accuracy (%)   SAM Kappa   MLC total accuracy (%)   MLC Kappa
2                      15.71                    0.008       19.10                    0.033
3                      16.25                    0.002       21.26                    0.058
4                      17.39                    0.008       21.29                    0.058
5                      18.58                    0.020       18.99                    0.030
10                     18.97                    0.020       16.33                    0.002
15                     20.03                    0.029       14.79                    0.019
20                     18.41                    0.001       14.26                    0.026
30                     18.32                    0.003       15.07                    0.015
40                     18.22                    0.006       16.31                    0.001
50                     18.11                    0.007       16.28                    0.001
60                     17.49                    0.007       16.24                    0.001

It can be concluded that the accuracy of the SVM classifier increases with the number of MNF components, reaches its maximum when 20 components are employed, and then decreases as the number of components grows further. In contrast with PCA, MNF is less effective as a feature extraction method for OMIS image classification.
Apart from applying the PCA and MNF transformations to the entire dataset, grouping PCA is tested. In grouping PCA, PCA is applied separately to groups of similar bands of the original data, obtained by subspace partition, and the first component of each group is selected for classification. Correlation coefficients between adjacent bands are used as the criterion for the subspace partition.
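A sketch of this step: partition the bands wherever the correlation between adjacent bands drops below a threshold, then stack the first principal component of each group. The threshold value is an assumption; the paper does not give the exact partition criterion value:

```python
import numpy as np
from sklearn.decomposition import PCA

def partition_bands(X, threshold=0.9):
    """Split the band axis of X (pixels x bands) into contiguous groups
    wherever the correlation of adjacent bands falls below `threshold`."""
    edges = [0]
    for b in range(1, X.shape[1]):
        if np.corrcoef(X[:, b - 1], X[:, b])[0, 1] < threshold:
            edges.append(b)
    edges.append(X.shape[1])
    return list(zip(edges[:-1], edges[1:]))

def grouping_pca(X, groups):
    """Concatenate the first principal component of every band group."""
    return np.hstack([PCA(n_components=1).fit_transform(X[:, a:b])
                      for a, b in groups])
```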
Two grouping schemes are used. In the first scheme, five groups are created: bands 1–11, 12–22, 23–38, 39–53 and 54–64. In the second scheme, ten groups are generated: bands 1–7, 8–11, 12–19, 20–22, 23–34, 35–38, 39–47, 48–53, 54–60 and 61–64.

When grouping PCA is used, PCA is conducted on each group and the first component of that group is selected to generate the dataset for classification, so 5 components are extracted for the first scheme and 10 components for the second. Table 9 gives the classification accuracy of grouping PCA.

Table 9 Classification accuracy of SVM with grouping PCA

Grouping PCA         Total accuracy (%)   Kappa   C    σ
PC1 of five groups   65.96                0.558   32   0.2
PC1 of ten groups    66.16                0.561   32   0.1

As a comparison, the band with the maximum information content, measured by the maximum variance within each group, is selected to generate alternative datasets for classification. Table 10 gives the accuracy of grouping band selection. It can be seen that grouping PCA is also an effective means of feature extraction and classification, although its accuracy is slightly lower than that of PCA applied over all bands.

Table 10 Classification accuracy of SVM with grouping band selection

Grouping selection   Total accuracy (%)   Kappa   C    σ
Five groups          61.72                0.502   32   0.05
Ten groups           64.71                0.541   32   0.11
Derivative spectral analysis can enhance some intrinsic spectral features for target identification and classification. For the OMIS hyperspectral remote sensing imagery with 64 bands, 62 new derivative spectral spaces (images) are extracted based on the principle of first-order derivative spectral analysis. Comparing the classification accuracy of the 64-dimensional original data with that of the 62-dimensional first-order derivative spectral data shows that a classifier using derivative spectra as inputs performs better than one using the original data. In addition, when the mixed dataset of derivative spectra and original data is used for classification, the classification accuracy improves slightly further. Table 11 gives the classification accuracy of SVM using the original data, the derivative spectra and the mixed dataset.

Table 11 Classification accuracy of SVM, SAM and MLC using derivative spectra

                                                  Total accuracy (%)   Kappa   C    σ
SVM using derivative spectra (62-dimensional)     66.08                0.559   32   0.018
SVM using original data (64-dimensional)          64.18                0.525   32   0.011
SVM using mixed data (126-dimensional)            66.91                0.570   32   0.0079
SAM using derivative spectra (62-dimensional)     21.33                0.033
MLC using derivative spectra (62-dimensional)     19.25                0.025
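A sketch of the derivative feature extraction; central differences over the interior bands are assumed here because they turn 64 bands into the 62 derivative bands reported in Table 11 (the paper does not state the exact difference scheme):

```python
import numpy as np

def first_derivative_spectra(X, wavelengths):
    """First-order derivative spectra by central differences:
    dR/dlambda at the interior bands of X (pixels x bands), so a
    64-band cube yields 62 derivative bands."""
    wl = np.asarray(wavelengths, dtype=float)
    return (X[:, 2:] - X[:, :-2]) / (wl[2:] - wl[:-2])

# The 126-dimensional mixed dataset of Table 11 would then be:
# X_mixed = np.hstack([X, first_derivative_spectra(X, wl)])
```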
In order to indicate the performance of the different feature extraction and combination schemes, some classification results are shown in Fig. 4. These results further illustrate the effectiveness of the feature extraction methods.

[Fig. 4 Classification results of SVM using different feature combination schemes: (a) original data; (b) first five principal components; (c) first 20 MNF components; (d) grouping PCA (10 groups); (e) first-order derivative spectra; (f) mixed dataset of original data and first-order derivative spectra]
References
[1] Pu R, Gong P. Hyperspectral Remote Sensing and Its
Application. Beijing: Higher Education Press, 2000. (In
Chinese)