2009 Advanced Video and Signal Based Surveillance
A Pruning Approach Improving Face Identification Systems
        Anis CHAARI12, Sylvie LELANDAIS1                                                Mohamed BEN AHMED2
           1                                                                 2
           IBISC Laboratory, CNRS FRE 3190                                    RIADI Laboratory, National School of Computer
                    Evry University                                                  Science, Manouba University
                    Evry, FRANCE                                                        La Manouba, TUNISIA
    anis.chaari@ibisc.fr, s.lelandais@iut.univ-evry.fr                             mohamed.benahmed@riadi.rnu.tn
 Abstract—We propose, in this paper, a new biometric                      scheme of the database according to the similarity of the
 identification approach which aims to improve recognition                most discriminatory features we could extract from the facial
 performances in identification systems. We aim to split the              images. Therefore, our system is based essentially on two
 identity database into well separated partitions in order to             critical tasks namely the extraction of discriminating features
 simplify the identification task. In this paper we develop a face        and the partition of the database.
 identification system and we use the reference algorithms of                 Several studies in this context of databases partitioning
 Eigenfaces and Fisherfaces in order to extract different                 have been developed to identify individuals by their
 features describing each identity. These features, which                 fingerprints [3, 4, 5]. In fact, fingerprints possess a global
 describe faces, are generally optimized to establish the
                                                                          morphological structure that helps to discriminate and
 required identity in a classical identification process. In this
 work, we develop a novel criterion to extract features used to
                                                                          separate them into well defined classes. Global ridge and
 partition the identity database. We develop database                     furrow structure forms special patterns in the central region
 partitioning with clustering methods which split the gallery by          of the fingerprint. These patterns are learned and supervised
 bringing together identities which have similar features and             methods of classification are used to separate them into the
 separating dissimilar features in different bins. Pruning the            five well defined classes, namely, left loop, right loop, whorl,
 most dissimilar bins from the query identity features allows us          arch and tented arch. The identification of a query fingerprint
 to improve the identification performances. We report results            is performed in a subset of the gallery, which corresponds to
 from the XM2VTS database.                                                its membership class.
                                                                              Having other perspectives to identify the sex or the
     Keywords-Biometry; face identification; clustering; feature          ethnicity of a person, classification of facial images
 extraction; image database.                                              according mainly to their sex membership was extensively
                                                                          studied [6, 7, 8, 9, 10]. Although the classes are known a
                       I.     INTRODUCTION                                priori, data extracted from faces are generally implicit and
                                                                          any explicit pattern has been found to discriminate classes.
     We propose in this paper a new approach improving                    Thus the attributes extracted from facial images could not be
 identification performances in biometric recognition systems.            compatible with the learned distribution of classes and
 The classical process of identification consists in matching             therefore not sufficiently discriminating.
 features of the unknown identity with the equivalent features                Unlike these systems which require supervised
 of all users already stored in the database system (gallery).            classification methods, we address unsupervised learning
 The unknown person to be recognised or the probe (image                  methods to classify and partitioning facial images according
 containing his face) is identified like the user (enrolled               to the extracted features. Few studies have attempted to
 person in the database) having the features which resemble               cluster any biometric modality which hasn’t morphological
 the more to the probe features (1:N match). Within the                   features defining some categories unlike fingerprints. The
 framework of identification, we don’t have any a priori                  authors in [11, 12] have proposed an approach which
 information of the probe identity. Thus, identification                  performs clustering using hand geometry and signature
 systems have a significant decrease of performances                      features to reduce the search space upstream the exhaustive
 according especially to the potential number of system users             identification. In addition, such clustering approaches are not
 (size of the database) [1]. They become more complex and                 available for other biometry such as face or iris.
 very computing expensive especially with large biometric                     Unsupervised classification (or clustering) aims to
 databases [2].                                                           provide homogenous and well separated classes. Indeed, the
     The system we propose aims to reduce the complexity                  general rule of data clustering is to minimize the intra-class
 and to improve the performances of biometric identification.             and to maximize the inter-class data variance. We develop,
 To achieve this goal, we add, to the classic identification              in this paper, clustering with K-means algorithm in order to
 scheme, an offline partitioning phase of the gallery upstream            partition the face database and to reduce the search space to a
 the online searching phase. We propose a partitioning                    subset of the whole database.
978-0-7695-3718-4/09 $25.00 © 2009 IEEE                              85
DOI 10.1109/AVSS.2009.80
        Biometric data            Representation                 Partitioning
         2D faces, Iris               Features                    Clustering                  Identification of       Identification of
              …                      extraction                    methods                       partitions               persons
                                                                  Offline phase                Online phase
                                          Figure 1. The different steps of our identification system.
                                                                            of features from the same algorithm like Eigenfaces.
                    II.   CONCEPTUAL APPROACH                               However, in this case partitioning features are extracted
    We describe, in this section, our conceptual approach of                differently from identification features. In fact, partitioning
biometric identification systems. Figure 1 describes the                    features constitute the input of clustering algorithm, whereas
hierarchy of the different processes, which constitute our                  identification features will be matched with a similarity
identification system. The gray rectangles show the original                measure in the identification process. On the other hand, the
stages we conceive in addition to the conventional                          use of different algorithms (multi-algorithmic) that
identification steps. We introduce mainly a clustering based                represents as independently as possible the information
approach to divide the gallery into several subsets. This                   contained in facial images is needed between the
partitioning process aims to group the data (identities) that               classification and the identification processes. Our aim is to
have similar characteristics and separate dissimilar ones into              generate several and different features for each person in the
different bins. We propose, through this partitioning process               gallery. This allows to operate the partition of the gallery
of the gallery, to retain the closest clusters from the query               following some patterns different from those used for
image and to keep their representing identities for the final               identification. The dependence level of these patterns will
step of identification. Pruning the most dissimilar partitions              affect undoubtedly the final results of identification after the
with the query identity features lead to simplify the search in             pruning process.
the gallery. This simplification is induced by reducing the                     We develop in the next section the subspaces methods of
number of identity from the gallery. Finally, we operate the                Eigenfaces and Fisherfaces. Eigenfaces method tries to
identification process on the retained identities from the                  produce a new face space whose axes define the data
gallery simply by similarity measurement.                                   variation. On the other hand, Fisherfaces find axes which
    We develop hereafter the partitioning stage which aims                  better discriminate the different samples of each identity. We
to divide the database into several clusters. The extracted                 develop in this work these two different algorithms in order
vectors of features for the partitioning task are assimilated to            to combine them for partitioning and identification tasks in
multi-dimensional data points occupying the face space F.                   our approach framework.
Data clustering algorithms have the aim to identify the sparse
and the crowded places of this space and divide consequently
the distribution points into groups in such way to minimize
the intra class variance and maximize the inter class
variance. Clustering problem is then formalised as follows:
given the desired number of clusters K and a dataset of N
points, and a distance-based measurement function, it is a
question of finding a partition of the dataset that minimizes
the value of the measurement function [13]. Data points are
then assigned to clusters Ci so that each cluster must contain
at least one data point; and each data point may belong to
one and only one cluster as show (1):
                K
            * i =1
                     Ci = Φ and Ci ŀ Cj = Ø, i  j               (1)
    Following such binning, the biometric database will be
partitioned such that features (or templates) in each bin are
similar and correspond to a statistical class. We show in
figure 2 some representation of clusters by using Eigenfaces
features. We remark that faces in each bin are homogenous
and present some common patterns. These results
demonstrate the relevance of our clustering approach.
    We differentiate features which will be used for the
database partitioning and those which will be used for the
final identification process. We can develop these two kinds                       Figure 2. Some clusters from the XM2VTS partitioning.
                                                                       86
                 III.   FEATURE EXTRACTION                                 The choice of features for the identification task is
    A very significant point in a pattern recognition problem           obviously guided by those giving the best identification rate.
is to extract pertinent and discriminative features which will          On the other hand, for the database partitioning, we choose
constitute the data input to the classification module. We              smallest features (with minimal size) which mainly
distinguish, through our experiments, the main pertinent                minimize the TMR. Smallest features are preferred in order
features as well for identification task as for clustering task.        to reduce the dimensionality of the classification problem.
We report results in term of Identification Rate and the Top            We can deduce already a simplification of the gallery
Match Rank as show the figure 3.                                        through the ratio between the TMR and the database size.
    The Identification Rate (IR) is the most commonly used              Thus this criterion (TMR) is very important for our
evaluation result and criteria for identification systems. But          partitioning problem in order to prune a subset of the
in the case of error, it is useful to know the rank of the right        database.
response among a number of ordered scores. Then we trace                   Eigenfaces and Fisherfaces are reference face recognition
the Cumulative Match Score (CMS) curve which represents                 methods which build different face spaces. In order to
the probability that the right identity is among a set of               produce the feature vector, these two methods proceed
nearest scores.                                                         similarly by projecting the facial image on the new space. A
   The shape behavior of the cumulative match score curve               two dimensional image of size p by q pixels is generally
could have a considerable importance for many pruning                   represented as a vector x of size n (n = p x q) in a high
applications. For instance, we can cite the law enforcement             dimensional space. A sample of face vector x can be
applications where we analyze in posteriori the content of              expressed as linear combination in a basis Ɏ of the new face
surveillance systems. In these systems, the goal is to reduce           space F which has a lower dimension (m << n):
the potential identities of the gallery which could be
                                                                                                 n               m
matched to the query one in order to improve the decision
stage of identification. For these systems, we seek to retain
                                                                                          x = ¦ ai xi ≈ ¦ α i ϕ i                                (2)
                                                                                                i =1             i =1
the minimum number of identities from the gallery, in
which we find potentially the searched identity. To assess                 We discuss in the next sections the choice of Eigenfaces
this functional point, we search the rank from which we                 and Fisherfaces features {Į i}.
certainly have the right identity that we denote by Top
Matches Rank - TMR. This rank is determined as the x-axis               A. Eigenfaces Features
coordinate where the CMS curve reaches 100%                                 To determine the optimal value of g, we can study the
identification rate. The TMR may give an indication of the              eigenvalue spectrum Ȝi which correspond to the eigenvectors
maximal distance that has a probe feature with its                      iji. A natural algorithm determining g is to seek the threshold
corresponding one from the database. The choice of feature              from which the eigenvalues are very small.
vector that minimizes the TMR induces minimization of this                  Kirby and Sirovitch [14] have introduced the first and the
distance. Since clustering module separates people into                 most used criterion for eigenvectors selection that is the
partitions, based primarily on the distance between their               inertia of the dimension [15]. The ratio of inertia from the
features, we opt mainly on this criterion (TMR) to select               first j eigenvectors is:
features for the clustering task.
                                                                                                         ¦
                                                                                                             j
                                                                                                                  λi
                                                                                                           i =1                                  (3)
                                                                                                         ¦
                                                                                                           n
                                                                                                               λ
                                                                                                           i =1 i
                                                                        Ȝj is the eigenvalue associated to the jth eigenvector.
                                                                              λi
                                                                             ¦ λi
                                                                              i           F                Fɝ
    Figure 3. Eigenfaces Cumulative Match Score curve indicating
              Identification Rate and Top Match Rank.                                                g                                 i
                                                                           Figure 4. Typical aspect of eigenvalues sort by a decreasing order.
                                                                   87
    Swets et Weng [16] advocate the use of a ratio of 95% of             and g = k - 1. N is the number of image in the learning set
inertia, while Kirby [15] uses a rate of 90%. Kirby [15]                 and k is the number of identities in this set. Swets and Weng
introduced another criterion, defined as the ratio si = Ȝi / Ȝ1          presented a similar technique in [16], but based on the use
between the eigenvalue of iji axis and the largest eigenvalue             of M < N - k axis. They choose the value of the parameter M
Ȝ1. Only eigenvectors which si are higher than a threshold IJ             using the dimension inertia of 95%.
are retained (IJ = 1%). Both these methods provide roughly                   We have studied, many values of M < N - k according to
the same performances in terms of identification rate.                   the technique of Swets et Weng [16]. The choice of M is
    For our partitioning problem, we aim to minimize the                 achieved by using the dimension inertia. We tried different
TMR as explained above. We assess the different sizes of
                                                                         thresholds of inertia ranging from 10% to 90% in order to
face spaces following the XM2VTS protocol of Lausanne
                                                                         build the first step components. On the other hand, we chose
[17] with a gallery of 200 persons. Figure 5 shows the TMR
curve for the different subspaces inertia corresponding to the           the g value as the minimum of N - k and M. Table 1 shows
different feature sizes.                                                 the Identification Rates (IR) and the Top Match Rank (TMR)
    We see from Figure 5 that the TMR is minimal (around                 values corresponding to the chosen levels of inertia.
80) with subspaces inertia from 80 to 85% of the total face                  Sub spaces containing 30% of the total inertia of the
space. The corresponding sizes of features are ranging from              learning set of faces optimize the TMR, while sub spaces
44 to 66. In order to reduce the dimensionality of the                   with 65% of the total inertia optimize the identification rate.
classification problem, we choose rather the feature vector of           We retain features of these dimensions for the clustering and
size 44 to operate clustering. By obtaining almost a TMR of              identification process respectively.
80 identities out of 200 in the gallery, we carry out already a                       IV.     IDENTIFICATION OF PARTITIONS
simplification of 60% of the gallery.
                                                                             Given a facial image that we want to establish its
B. Fisherfaces Features                                                  identity, we compute its partitioning feature that we compare
    Fisherfaces method is based on solving the Linear                    to the center of each bin. The searched identity is potentially
Discriminant Analysis (LDA) using the following                          in the nearest bin. However, by choosing identities that
algorithm. The first step consists to perform PCA in the                 belong only to this partition, we increase the probability of
original face space by retaining M eigenvectors. The second              an error discard of the searched identity. This error called
one projects the within class scatter matrix Sw and the                  Research Error Rate (RER) is introduced in [19]. The RER
between class scatter matrix Sb on the reduced space. Finally            tends to compromise the next identification step and degrade
it is to rebuild a PCA on the reduced scatter matrices. The              its accuracy. Therefore, we make the following study to
deduced Fisherfaces space contains a number g of axes.                   choose the partition which minimize and cancel this error
                                                                         rate.
Thus the Fisherfaces technique is dependent on two
parameters M and g that Belhumeur et al. [18] choose to set
at M = N - k (maximum value of M such that Sw is full rank),
                               Figure 5. The TMR behaviour by increasing the subspaces inertia of Eigenfaces.
                                                                    88
                  TABLE I.         IR AND TMR PERFORMANCES OF FISHERFACES ALGORITHM IN FUNCTION OF M AND M PARAMETERS.
       Inertia       90% 85% 80% 75% 70% 65% 60% 55% 50% 40% 35% 30% 25% 20% 15% 10%
        M            400     338     284    238     200 165 136 111               90      56        42        31        22         15         9         5
        g            199     199     199    199     199 165 136 111               90      56        42        31        22         15         9         5
        IR           88,5 88,75 87,25 85,75 88,25 89,25 89 88,75 89                       87,5 85,5 82,5 76,5 66,25 55,5 34,5
        TMR          122     119     155    136     72     71      46     55      52      60        58        49        97         85     114       155
    By subdividing the database into several partitions, the                      The penetration rate R ( ∈ [0,1]) is the percentage of the
probe identity is potentially found in the closest bin.                       retained partition compared to the total size of the gallery.
However, the probability to find the query identity in the                    The simplification rate can be derived as 1 - R. By assuming
closest bin is relatively small and decrease by increasing the                that the built bins contain the same number of identities, an
number K of bins. To prevent missing the bin in which lies                    approximate value of the penetration rate is given as follows:
the query identity, we propose to perform research into the P
closest bins. The P closest bins constitute the subset on                                                          NB P       P
which we perform the last task of identification. If we retain                                            R≈              ≈                                       (4)
                                                                                                                    N         K
a large number P of bins, the simplification of the gallery is
marginal and the identification accuracy could hardly be
                                                                              where NB is the average number of identity features per bin,
improved. However, if we accept one or relatively few
                                                                              N is the database size and K is the number of bins.
classes as a subset of research from the gallery, the RER will
                                                                                  However, we don’t find usually the same number of
surely affect the identification performance. Thus, the
                                                                              identities per bin. For instance, for a 10 classes binning of
estimation of the number P of bins is quite difficult. In fact,
                                                                              our gallery which is composed of 200 individuals, the
the parameter P is related to the total number K of bins, to
                                                                              average number of data points per each class is 20. But these
the intrinsic quality of extracted features and to the clustering
                                                                              bins include really a number which varies from 3 to 35
method.
                                                                              identities. This inequality is due to the non-uniformity
    By analogy with the Cumulative Match Score curve used
                                                                              occupation of the face space where there are inevitably non
to assess identification performances, we delineate the
                                                                              uniform distributions with sparse and crowded places. Fig. 2
Binning Cumulative Match Score curve that we denote by
                                                                              shows an example of a first bin which contains 30 identities
BCMS. In fact, to evaluate our partitioning approach, we
                                                                              and a second bin representing only with three persons. Thus,
compute the classification rate that is the probability to find
                                                                              the effective penetration rate is the ratio of the number of
the query identity in the closest bin. But in case of error, it is
                                                                              identities from the retained partition and the size of the
useful to know the rank of the bin where the probe identity
                                                                              gallery. We report in the table below the effective
is. Fig. 6 illustrates a BCMS curve with a clustering into a
                                                                              penetration rate R from various clustering experiments on
number K of 10 bins. For a probe face, the P closest bins
                                                                              Eigenfaces features with different number K of bins.
constitute the retained partition of the gallery. Thus, we
                                                                                  We remark that raising the number of bins decreases the
control an RER equal to 0%.
                                                                              penetration rate until a number about 70 bins. After that, the
                                                                              penetration rate makes a climb. By considering the
                                                                              complexity of the clustering problem, we retain the 70 bins
                                                                              classification in order to simplify the search space. This
                                                                              procedure avoids any false detection and ensures the
                                                                              simplification of a substantial part of the database. The
                                                                              effectiveness of this procedure is confirmed by using a
                                                                              different validation sets and various learning strategies. All
                                                                              this experiences give the same results in terms of penetration
                                                                              and simplification rates.
                                                                              TABLE II.        THE KEPT NUMBER OF BINS AND THE PENETRATION RATE.
                                                                              K     10         20        30        40         50        70        90        100
                                                                              P     6          13        15        24         26        29        38        48
                                                                              R    142.5 156.7 136.3 133.7 129                          107       105       117
Figure 6. Binning Cumulative Match Score curve with a clustering on 10
                              classes.
                                                                         89
                V.     IDENTIFICATION RESULTS                                                            REFERENCES
   Combination of the Eigenfaces and Fisherfaces                          [1]    S. Z. Li, A. K. Jain, Handbook of face recognition, Springer, 2004.
representation algorithms does not improve the                            [2]    P.J. Phillips, P. Grother, R.J. Micheals, D.M. Blackburn, E. Tabassi,
                                                                                 and J.M. Bone, FRVT 2002: Evaluation report, March 2003.
identification rate. However, our approach doesn’t degrade
                                                                          [3]    A. K. Jain, S. Pankanti, "Fingerprint classification and matching".
the performance of the classical identification. To examine                      Handbook for Image and Video Processing, A. Bovik (ed.), Academic
the behavior of all responses returned by our system of                          Press, 2000.
identification, we plot the cumulative match scores curve.                [4]    X. Tan, B. Bhanu, Y. Lin, "Fingerprint identification: classification
Fig. 7 illustrates the CMS curve of our approach in red and                      vs. indexing", IEEE Conference on Advanced Video and Signal
that of classical identification in blue. We combined                            Based Surveillance, pp.151-156, 2003.
Eigenfaces and Fisherfaces methods alternately for                        [5]    J. Gu, J. Zhou, Analysis of singular points in fingerprints based on
                                                                                 topological structure and orientation field, International Conference
clustering and identification. Given that the best results for                   on Biometrics, 2007.
identification are provided by Fisherfaces, the best                      [6]    A. S. Tolba, Invariant gender identification, DIGITAL SIGNAL
combination shown in figure 7 is achieved by Eigenfaces                          PROCESSING, 11(3), pp. 222 – 240, 2001.
features (g=44) for clustering and Fisherfaces one (g=90)                 [7]    N.P. Costen, M. Brown, S. Akamatsu, Sparse models for gender
for identification.                                                              classification. IEEE Proceeding in Automatic Face and Gesture
                                                                                 Recognition, pp. 201- 206, May 2004.
                      VI.    CONCLUSIONS                                  [8]    X. Lu, H. Chen, A. K. Jain, Multimodal Facial Gender and Ethnicity
                                                                                 Identification, International Conference on Biometrics (ICB), pp.554-
    We have proposed in this paper a framework of                                561, 2006.
clustering and partitioning facial databases in order to                  [9]    Z. Yang, M. Li, H. Ai, “An Experimental Study on Automatic Face
improve biometric identification within these databases.                         Gender Classification,” Proc. 18th IEEE Int'l Conf. Pattern
Clustering was achieved on Eigenfaces features using the                         Recognition, vol. 3, pp. 1099-1102, Aug. 2006.
k-means clustering algorithm. The final step of identification            [10]   Z. Xu, L. Lu, P. Shi, A Hybrid Approach to Gender Classification
is performed through Fisherfaces features. We improve just                       from Face Images, the 19th International Conference on Pattern
                                                                                 Recognition, 2008.
a little bit results of identification. But the use of more
independent features from those used in the database                      [11]   S. Palla, S. Chikkerur, V. Govindaraju, Classification and indexing in
                                                                                 large biometric databases, Biometrics Consortium Conference,
clustering could improve more and more identification                            Crystal City, VA, September 2004.
performances. Our future work involves to develop other                   [12]   A. Mhatre , S. Palla , S. Chikkerur, V. Govindaraju, "Efficient Search
face representing and clustering methods. We plan also to                        and Retrieval in Biometric Databases", SPIE Defense and Security
extend the evaluation of our recognition algorithms with a                       Symposium, Vol - 5779, pages:265-273, March-2005.
large scale database gathered from various benchmark                      [13]   A. K. Jain, M. N. Murty, P. J. Flynn, “Data clustering: a review”,
datasets like FERET, FRGC, ORL and IV² databases.                                ACM Computing Surveys, 31(3) pp. 264-323, 1999.
                                                                          [14]   M. Kirby and L. Sirovich, Application of the karhunen-loeve
                                                                                 procedure for the characterization of human faces, IEEE Transactions
                                                                                 on Pattern Analysis and Machine Intelligence, 12(1):103--108, Jan.
                                                                                 1990.
                                                                          [15]   M. Kirby, Dimentionality reduction and pattern analysis : an
                                                                                 empirical Approach, Wiley, New York, 2000.
                                                                          [16]   D. L. Swets, J. Weng, Using discriminant eigenfeatures for image
                                                                                 retrieval, IEEE Trans. Pattern Analysis and Machine Intelligence, 18,
                                                                                 831– 836, 1996.
                                                                          [17]   K. Messer, J. Matas, J. Kittler, J. Luettin, G. Maître, XM2VTSBD:
                                                                                 the extended M2VTS database, Int. Conf. on Audio & Video-based
                                                                                 Biometric Authentication (AVBPA), pp. 72-77, 1999.
                                                                          [18]   P. Belhumeur, J. Hespanha, D. Kriegman, Eigenfaces vs. Fisherfaces:
                                                                                 recognition using class specific linear projection, IEEE Transactions
                                                                                 on Pattern Analysis and Machine Intelligence 19, pp. 711-720, July
                                                                                 1997.
                                                                          [19]   D. Maltoni, D. Maio, A.K. Jain, S. Prabhakar, "Handbook of
                                                                                 fingerprint recognition", Springer, 2005.
  Figure 7. Comparison of the Cumulative Match Score curves of our
          approach and the classical approach of identification.
                                                                     90