Selection and fusion of facial features for face recognition

Xiaolong Fan, Brijesh Verma *
School of Computing Sciences, Faculty of Business and Informatics, Central Queensland University, Rockhampton, QLD 4701, Australia
* Corresponding author. E-mail addresses: x.fan@cqu.edu.au (X. Fan), b.verma@cqu.edu.au (B. Verma).

Expert Systems with Applications 36 (2009) 7157–7169. doi:10.1016/j.eswa.2008.08.052. © 2008 Elsevier Ltd. All rights reserved.

Keywords: Face recognition; Neural networks; Evolutionary algorithms; Pattern recognition

Abstract

This paper proposes and investigates a facial feature selection and fusion technique for improving the classification accuracy of face recognition systems. The proposed technique is novel in terms of its feature selection and fusion processes. It incorporates neural networks and genetic algorithms for the selection and classification of facial features. The proposed technique is evaluated using both separate facial region features and combined features. The combined features outperform the separate facial region features in the experimental investigation. A comprehensive comparison with other existing face recognition techniques on the FERET benchmark database is included in this paper. The proposed technique produced 94% classification accuracy, which is a significant improvement and the best classification accuracy among the published results in the literature.

1. Introduction

1.1. Background

Face recognition is one of the most remarkable capabilities of human beings. It develops over the early years of childhood and is important for several aspects of our social life. Human beings can remember hundreds or even thousands of faces over a lifetime and can easily identify a familiar face under different perspective variations, such as illumination variations, age variations and pose variations. Face recognition, together with other abilities such as estimating the expression of the people with whom we interact, has played an important role in the course of evolution.

The problem of machine recognition of faces has been studied for more than 30 years. It has attracted research interest from several disciplines such as image processing, pattern recognition, computer vision, neural networks and computer graphics. Such interest has been motivated by the growth of Face Recognition Technology (FRT) applications in many areas, including face identification in law enforcement and forensics, user authentication for building access or automatic teller machines, indexing of, and searching for, faces in video databases, intelligent computer user interfaces, and so on. After the September 11, 2001, terrorist attacks, FRT has gained further interest due to its significant involvement in anti-terror activities. The numerous commercial and law enforcement applications of FRT pose a wide range of technical challenges and require an equally wide range of techniques from different disciplines.

A general statement of the problem of machine recognition of faces can be formulated as follows: given still or video images of a scene, identify or verify one or more persons in the scene using a stored database of faces.
Available collateral information such as race, gender, age, facial expression or speech may be used to narrow the search. The solution to the problem involves face detection, feature extraction from the face region, and face verification or recognition. Face detection refers to determining the exact position and size of a human face in a cluttered scene. Feature extraction refers to obtaining the features that can be fed into a face classification system. Face recognition refers to comparing an input face against models of faces stored in a database of known faces and indicating whether a match is found. Face verification refers to confirming or rejecting the claimed identity of the input face.

Although human beings seem to recognize faces in cluttered scenes with relative ease, machine recognition is much more difficult for a variety of reasons. Firstly, different faces may appear very similar; every face contains two eyes, two ears, one nose and one mouth, which makes discrimination an exacting task. Secondly, different views of the same face may appear quite different due to imaging constraints, such as changes in illumination and variability in facial expressions, and due to the presence of personal accessories such as glasses, beards and hats. Finally, when the face undergoes rotations out of the imaging plane, a large amount of detailed facial structure may be occluded. Therefore, in many implementations of face recognition algorithms to date, the face images are obtained in a constrained environment with controlled illumination, minimal occlusion of facial structures, an uncluttered background, and so on. Face recognition in an unconstrained environment is still a very challenging task.

1.2. Literature review

In the last decade, face recognition has become one of the most active research areas in pattern recognition. Most existing face recognition methods can be broadly classified into three categories: holistic feature-based matching methods, local feature-based matching methods and hybrid matching methods (Chellappa, Wilson, & Sirohey, 1995). In holistic feature-based matching methods, the whole face region is used as the raw input to the recognition system, as in the Principal Component Analysis (PCA) projection method (Turk & Pentland, 1991), the Fisherface method (Belhumeur, Hespanha, & Kriegman, 1997) and the Nearest Feature Line (NFL) method (Li & Lu, 1999). More recently, an Independent Gabor Features (IGF) method (Liu & Wechsler, 2003) and a kernel Associative Memory (kAM) models-based method (Zhang, Zhang, & Ge, 2004) were also applied to face recognition. In local feature-based matching methods, local features such as the eyes, nose and mouth are first extracted, and their locations and local statistics (geometric and/or appearance) are fed into a structural classifier. The geometrical features method (Brunelli & Poggio, 1992) and the Elastic Bunch Graph Matching (EBGM) method (Wiskott, Fellous, & Malsburg, 1997) belong to this category. In hybrid matching methods, both holistic and local features are used for recognition. A feature combination scheme for face recognition by fusion of global and local features is presented in Fang, Tan, and Wang (2002).
A fully automatic system for face recognition in databases with only a small number of samples is presented in Yan et al. (2004); global and local texture features are extracted and used in the recognition.

Genetic Algorithms (GAs) can be used to select an optimal feature set for pattern classification problems, and some researchers have used GAs for face recognition. In Bala, Huang, Vafaie, DeJong, and Wechsler (1995), a GA–ID3 (decision tree learning) method is proposed to find an optimal subset of discriminatory features for pattern classification. The GA was used to search for a possible optimal subset of the extracted features, and ID3 was used to produce a decision tree based on the subset selected by the GA. The GA–ID3 method was tested on the recognition of visual concepts in satellite and face images, and the results showed a significant improvement in classification performance and a good reduction in the dimension of the feature set. In Liu, Tang, Lu, and Ma (2004), a kernel scatter-difference-based discriminant analysis for face recognition is presented. In Sun and Yin (2005), a genetic algorithm was used to select features for 3D face recognition; the method tries to optimize the feature set by capturing good features which minimize the intra-class (within-class) distance and maximize the inter-class (between-class) distance. In Liu and Wechsler (2000), an evolutionary pursuit (EP) approach based on GAs was applied to face recognition; the idea in EP is to search for a face basis through the rotated axes defined in the PCA space. The overall classification rates obtained by the existing techniques are unsatisfactory; therefore there is a need for a better feature selection and fusion technique which could improve the overall classification accuracy of face recognition.

In this paper, a novel feature selection and fusion technique for face recognition is presented. A GA for feature selection and an Artificial Neural Network (ANN) for classification are incorporated into the proposed technique. The proposed technique has been tested on the separate feature set from each facial region and compared with the combined feature set. A large dataset extracted from the FERET benchmark database (Phillips, Wechsler, Huang, & Rauss, 1998) is used for testing. The main research questions are: (1) How can the most significant facial features be selected and combined to improve the overall classification rate of face recognition systems? (2) What is the best combination of these features for a specific classifier?

The original contributions of the research presented in this paper are as follows. (1) Identification of local facial regions using a distance threshold method based on the center coordinate information of each facial region; the facial features are extracted from each facial region. (2) A Genetic Algorithm (GA)-based approach for facial feature selection; the significant areas inside each facial region are located using this approach. (3) An Artificial Neural Network (ANN)-based approach for facial feature classification; the facial features selected by the GA are passed to the ANN for final classification, and the classification error is passed back to the GA to calculate the fitness of each individual. (4) A combined technique for face recognition; the proposed approach is tested on the separate feature set from each facial region and on the combined feature set. The FERET benchmark database is adopted to evaluate and compare the proposed approach.
A comprehensive comparison of the proposed technique with other existing face recognition approaches has been conducted.

2. Proposed technique

This section describes the proposed feature selection and fusion technique for face recognition. Section 2.1 provides an overview of the proposed methodology. Section 2.2 introduces the distance threshold method that is used to locate facial regions. The average grey level value features are discussed in Section 2.3. Section 2.4 describes the PCA features. The details of incorporating GAs and neural networks for feature selection and classification are discussed in Section 2.5.

2.1. Overview

The goal of the proposed technique is to select the most significant facial features effectively and to find the best combination of these features for the classifier. The proposed technique aims to locate the significant areas in facial regions from which the significant features are extracted. Facial regions refer to the separate regions of the face that each contain one local organ, such as the left eye region, right eye region, nose region and mouth region. These facial regions contain the most discriminant facial characteristics of human faces and are the basis for local feature-based feature extraction techniques. Even within these discriminant facial regions, some areas may be more important than others for a recognition task. By locating the most significant areas in the facial regions, the proposed approach removes "noise" information caused by the non-significant areas of each facial region. It may also remove part of the variation information caused by changes in facial expression, head rotation and illumination. Concentrating on these significant areas allows us to extract the most significant facial features to represent human faces, and these features may improve the classification rate of face recognition systems.

The first step in the proposed technique is to locate the facial regions in the face images. Facial feature extraction is then performed on these facial regions. After feature extraction, the features are selected, fused and classified. Through selection, the significant areas are located, and through classification, the input face image is recognised or verified.

The block diagram of the proposed technique, used to conduct experiments with separate and combined features on the FERET benchmark dataset, is depicted in Fig. 2.1. The details are described in the following subsections.

Fig. 2.1. Block diagram of the proposed technique (small and large face datasets, location of facial regions, feature extraction into separate and combined feature sets, GA-based feature selection with the ANN classification error fed back as fitness, and classification of the recognized face).

2.2. Locate facial regions

We first locate the facial regions in each face image and then extract features from them. The experimental face images are extracted from the FERET database. The center coordinate information provided for each facial region, such as the eye center coordinates, nose tip coordinate and mouth center coordinate, is used, and the distance threshold method is applied to locate the local facial regions. The distance threshold method defines distance thresholds in the vertical and horizontal directions for each local facial region; these thresholds decide the size of the facial region.
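As a rough illustration (not part of the original paper), the cropping implied by the distance threshold method could be sketched as follows. The NumPy-based helper, its name and the reading of the thresholds as the full height and width of a region centred on the annotated coordinate are assumptions; the concrete threshold values used in the experiments are given in the next paragraph.

```python
import numpy as np

def crop_region(image: np.ndarray, center_xy, v_thresh: int, h_thresh: int) -> np.ndarray:
    """Cut one facial region out of a grey-level face image.

    v_thresh and h_thresh are treated here as the full height and width of
    the region, centred on the given (x, y) coordinate and clipped to the
    image borders.
    """
    cx, cy = center_xy
    top = max(cy - v_thresh // 2, 0)
    left = max(cx - h_thresh // 2, 0)
    return image[top:top + v_thresh, left:left + h_thresh]

# Hypothetical usage with FERET-style center annotations:
# left_eye = crop_region(face, (le_x, le_y), v_thresh=16, h_thresh=30)
# mouth    = crop_region(face, (m_x, m_y),  v_thresh=12, h_thresh=60)
```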
With the center coordinate information, the facial region is easy to locate. Based on the images in the experimental database, the distance thresholds are set as follows: the vertical distance threshold is set to 16 and the horizontal distance threshold is set to 30 for the eye and nose regions; they are set to 12 and 60, respectively, for the mouth region.

2.3. Average grey level value feature

After locating the facial regions, each facial region is divided equally into small rectangular areas. The average grey level value features are extracted from these small rectangular areas. The average grey level value feature can be expressed as

g_i = \frac{\sum_{(x,y) \in i} p(x, y)}{w \cdot h \cdot v}    (1)

where g_i is the average grey level value feature for the small rectangular area i, p(x, y) is the grey level value of pixel (x, y) inside rectangular area i, w is the width of the small rectangular area, h is its height, and v is the maximum grey level value of the image (255 for the experimental database).

After the division, the average grey level value features are extracted from the small rectangular areas from left to right and top to bottom. In the experiments, the size of the small rectangular area was chosen to be 6 × 4 (w = 6, h = 4). For the left eye region (and likewise the right eye region and the nose region), the size of the extracted feature set is then 20. For the mouth region, the size of the extracted feature set increases to 30 due to the larger size of the mouth region.

2.4. PCA feature

The PCA projection method for face recognition, also called the eigenface method, is a classical face recognition method. The simple idea behind the eigenface method is to capture the largest variances among a set of face images and then use this information to encode and compare face images. The advantage of the eigenface method is the reduction of dimensionality while maximizing the scatter of all the projected samples. Let \{X_1, X_2, \ldots, X_N\} be a set of N sample images taking values in an n-dimensional image space, where each image belongs to one of the c classes \{x_1, x_2, \ldots, x_c\}. A linear transformation is sought that maps the original n-dimensional image space into an m-dimensional feature space, where m < n. The new feature vector y_k \in R^m is defined by

y_k = W^T X_k, \quad k = 1, 2, \ldots, N    (2)

where W \in R^{n \times m} is a matrix with orthonormal columns. W is chosen to maximize the determinant of the total scatter matrix S of the projected samples:

S = \sum_{k=1}^{N} (X_k - \mu)(X_k - \mu)^T    (3)

W_{opt} = \arg\max_W \lvert W^T S W \rvert = (w_1, w_2, \ldots, w_m)    (4)

Here N is the number of sample images and \mu is the mean image of all the samples; \{w_i \mid i = 1, 2, \ldots, m\} is the set of n-dimensional eigenvectors of S corresponding to the m largest eigenvalues. In the experiments, the PCA projection method is applied to the local facial regions instead of the whole face images to extract features.
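For concreteness, a minimal sketch of the two feature extractors is given below. This is not the authors' code; the function names, the NumPy-only implementation and the mean-centring used in the projection comment are assumptions. It computes the average grey level value features of Eq. (1) over non-overlapping rectangles and a per-region PCA basis in the sense of Eqs. (2)-(4).

```python
import numpy as np

def average_grey_features(region: np.ndarray, w: int = 6, h: int = 4, v: int = 255) -> np.ndarray:
    """Average grey level value features (Eq. (1)) for one facial region.

    The region is scanned in non-overlapping w x h rectangles, left to right
    and top to bottom; each feature is the block sum divided by w * h * v.
    """
    rows, cols = region.shape
    feats = []
    for top in range(0, rows - h + 1, h):
        for left in range(0, cols - w + 1, w):
            block = region[top:top + h, left:left + w].astype(float)
            feats.append(block.sum() / (w * h * v))
    return np.asarray(feats)

def pca_basis(samples: np.ndarray, m: int):
    """Mean image and the m leading eigenvectors of the total scatter matrix
    (Eqs. (2)-(4)) for one facial region; `samples` is N x n, one flattened
    region per row."""
    mu = samples.mean(axis=0)
    centred = samples - mu
    scatter = centred.T @ centred                      # S in Eq. (3)
    eigvals, eigvecs = np.linalg.eigh(scatter)         # ascending eigenvalues
    w_opt = eigvecs[:, np.argsort(eigvals)[::-1][:m]]  # m largest eigenvalues
    return mu, w_opt

# Projection of a flattened region x onto the basis: y = w_opt.T @ (x - mu)
```

With the thresholds read as full extents, a 16 × 30 eye or nose region tiled by 6 × 4 rectangles yields the 20 features quoted above, and the 12 × 60 mouth region yields 30.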
2.5. GA–ANN technique

The GA and ANN-based technique is used to identify the significant areas in each facial region and to perform the fusion and selection of features for face recognition. In this research, GAs are used to find potentially significant features which generate a higher recognition rate; the areas that contain these significant features are considered to be the significant areas. The chromosomes represent possible selections of the significant features. Binary encoding is used for the chromosomes, where 1 indicates that a feature is selected and 0 indicates that it is not selected.

In each generation, every chromosome is multiplied element-wise by the input feature set to generate the input feature vector for the ANN, masking out the features that are not selected. The input feature vector F can be represented as

F = CP    (5)

C = (c_1, c_2, \ldots, c_l), \quad c_i \in \{0, 1\}    (6)

P = L + R + N + M    (7)

where C is a single chromosome and c_i is one gene of the chromosome; l is the length of the chromosome, which is the same as the size of the input feature set P. When testing on the separate feature set from each facial region, P represents that separate feature set. As mentioned in the previous section, the size of the left eye feature set L is 20, the size of the right eye feature set R is 20, the size of the nose feature set N is 20 and the size of the mouth feature set M is 30. When they are combined, the size of P is 90. Eq. (7) denotes the combined feature set P, formed by putting the four regional feature sets together.

The input feature vector F is fed to the ANN for classification. An ANN with a single hidden layer is used in this technique, and a resilient backpropagation algorithm is used to train the network. The testing classification error is used to calculate the fitness of the corresponding individual in the GA. During reproduction, the fittest individual, i.e. the one that achieves the best testing classification rate, is retained in the next generation. The chromosomes in each generation of the GA that achieve the best classification rate are recorded; these chromosomes indicate which features are selected and which are not. After all generations, the total number of times that each feature has been selected for the best classification rate is calculated, and all the features are ranked according to the number of times they have been selected. The areas that contain the features in the top n of this ranking are the top n significant areas. For the experiments, n is defined as 3.
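The interaction between the GA and the ANN can be sketched as follows. This is an illustrative outline only, not the authors' implementation: the GA operators (elitism, uniform crossover, bit-flip mutation) are assumptions, and scikit-learn's MLPClassifier is used as a stand-in because resilient backpropagation is not available in that library.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def fitness(chromosome, train_X, train_y, test_X, test_y, hidden_units):
    """Fitness of one individual: testing classification rate of an ANN trained
    on the features kept by the binary chromosome (Eqs. (5)-(7))."""
    mask = chromosome.astype(bool)
    if not mask.any():
        return 0.0
    net = MLPClassifier(hidden_layer_sizes=(hidden_units,), max_iter=3000, random_state=0)
    net.fit(train_X[:, mask], train_y)
    return net.score(test_X[:, mask], test_y)

def ga_select(train_X, train_y, test_X, test_y, hidden_units,
              generations=50, pop_size=15, crossover=0.9, mutation=0.2):
    """Small GA over binary chromosomes; returns the best mask found and the
    per-feature selection counts later used to rank significant areas."""
    n = train_X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n))
    counts = np.zeros(n, dtype=int)
    best_mask, best_fit = pop[0].copy(), -1.0
    for _ in range(generations):
        fits = np.array([fitness(c, train_X, train_y, test_X, test_y, hidden_units)
                         for c in pop])
        gen_best = pop[int(fits.argmax())]
        counts += gen_best                      # record the generation's best selection
        if fits.max() > best_fit:
            best_fit, best_mask = float(fits.max()), gen_best.copy()
        # elitism + uniform crossover of random parent pairs + bit-flip mutation
        parents = pop[rng.integers(0, pop_size, size=(pop_size - 1, 2))]
        cross = rng.random((pop_size - 1, n)) < crossover
        children = np.where(cross, parents[:, 0, :], parents[:, 1, :])
        flip = rng.random(children.shape) < mutation
        children = np.where(flip, 1 - children, children)
        pop = np.vstack([best_mask, children])
    return best_mask, counts
```

The selection counts accumulated over the generations are what Section 4 uses to identify the most frequently selected features and hence the significant areas.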
3. Databases

Three experimental databases were used in this research, all of them extracted from the FERET benchmark database. The preliminary experimental database is a small subset of the FERET database and consists of 13 classes (one class represents one distinct person). In each class there are four face images; three of them are randomly chosen for training and the remaining one is used for testing. The total number of face images in this database is 52. The images are selected carefully in order to have minimum pose variation. Fig. 4.1 shows example images from the preliminary database; the top three rows show training images and the bottom row shows testing images.

The advance databases consist of 50 classes each, with one class representing one distinct person. In the original dataset (DB1) from our previous study, there are four face images in every class; three of them are randomly selected for training and one for testing. The extended dataset (DB2) includes all the images from DB1 plus additional images. In DB2, each class has 4–12 images for training and one for testing; in total there are 376 images for training and 50 images for testing.

Fig. 4.1. Example images of the preliminary database. The top three rows show training images and the bottom row shows testing images.

4. Experimental results

This section describes the experimental results on the small and large databases. The goal of the experiments is to evaluate the proposed technique and to compare it with other existing techniques. All experimental databases are extracted from the FERET benchmark database. Section 4.1 presents the experiments based on the preliminary database, and Section 4.2 describes the advance experiments based on the larger databases.

4.1. Preliminary results

The preliminary experiments were conducted using the preliminary database described in Section 3 (see Fig. 4.1).

At first, experiments on locating the significant areas using GA–ANN were conducted on each facial region separately. The features used in these experiments were average grey level value features; the features extracted from each facial region formed the input feature vector for that region. The size of the small rectangular area for feature extraction was chosen to be 6 × 4. The size of the extracted feature set L from the left eye region was 20, and the same size applied to the extracted feature set R from the right eye region and the extracted feature set N from the nose region. For the mouth region, the size of the extracted feature set M was 30. L, R, N and M can be expressed by the following equations:

L = (l_1, l_2, l_3, \ldots, l_{20}), \quad l_i \in (0, 1)    (8)
R = (r_1, r_2, r_3, \ldots, r_{20}), \quad r_i \in (0, 1)    (9)
N = (n_1, n_2, n_3, \ldots, n_{20}), \quad n_i \in (0, 1)    (10)
M = (m_1, m_2, m_3, \ldots, m_{30}), \quad m_i \in (0, 1)    (11)

These extracted feature sets were then fed to the GA–ANN separately for selection and classification. To keep the experiments consistent, the parameters of the GA–ANN were set exactly the same for every set of experiments. The number of generations was set to 50 and the population size was set to 15. The crossover rate was set to 0.9 and the mutation rate was set to 0.2. The number of hidden units of the ANN was increased from 6 to 44 (in steps of 2 hidden units), and the selections that generated the best recognition rate were recorded. The number of epochs for the ANN was set to 3000.

The best classification results for each facial region feature set are shown in Tables 4.1–4.4, where the highlighted entries in the original tables indicate the highest testing classification rate. The results in Tables 4.1–4.4 show that the eye regions and the mouth region achieve better recognition rates than the nose region. For the nose region, the best recognition rate is just 76.92%, obtained with 34 hidden units. For the left eye region, the best recognition rate is 92.31%, obtained with 14, 24 and 34 hidden units. For the right eye region, the best recognition rate is 92.31%, obtained with 30 and 44 hidden units. For the mouth region, the best recognition rate is also 92.31%, obtained with 10 and 36 hidden units.

The feature selections and combinations obtained when the best recognition rate was achieved for each facial region are shown in Table 4.5. As Table 4.5 shows, the left eye region has four different feature combinations, all of which contain the features l1, l8, l9 and l20. The right eye region has two different feature combinations, both of which contain the features r2, r3, r4, r5, r7, r9, r10, r16 and r19.
The mouth region also has two different feature combinations, both of which contain the features m2, m3, m10, m12, m16, m20, m22 and m29. The nose region has just one feature combination.

Table 4.1
Best classification results for the left eye feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
14 | 1 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 1 | 100 | 92.31 | 0.04719
14 | 1 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 1 0 1 | 97.44 | 92.31 | 0.06064
16 | 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 | 100 | 84.62 | 0.03070
16 | 1 1 0 0 0 1 1 1 1 0 0 1 1 1 1 1 0 0 0 1 | 100 | 84.62 | 0.03960
24 | 1 0 0 0 1 1 0 1 1 0 0 1 1 0 1 1 0 0 1 1 | 100 | 92.31 | 0.03910
30 | 1 1 1 1 1 0 0 0 0 1 0 0 1 1 1 0 1 1 0 1 | 100 | 84.62 | 0.03758
30 | 1 1 1 1 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 | 100 | 84.62 | 0.03948
34 | 1 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 1 1 1 1 | 100 | 92.31 | 0.03698

Table 4.2
Best classification results for the right eye feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
14 | 0 0 0 1 1 0 1 0 0 1 0 0 1 0 1 0 0 1 0 1 | 100 | 84.62 | 0.04686
16 | 0 0 0 1 1 0 1 1 0 1 1 0 0 0 0 0 0 1 1 0 | 100 | 84.62 | 0.05209
24 | 1 0 0 1 0 1 1 1 0 1 0 0 1 0 0 1 0 1 1 1 | 100 | 84.62 | 0.04268
24 | 1 0 0 1 0 1 1 1 0 1 0 1 1 0 0 1 0 1 1 1 | 100 | 84.62 | 0.03669
30 | 0 1 1 1 1 1 1 0 1 1 1 0 0 0 1 1 0 0 1 1 | 100 | 92.31 | 0.03492
36 | 0 1 0 1 1 0 0 1 0 1 0 1 0 0 0 1 0 1 0 1 | 100 | 84.62 | 0.04036

Table 4.3
Best classification results for the nose feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
14 | 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 1 1 0 | 76.92 | 69.23 | 0.10628
16 | 1 1 1 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 1 1 | 92.31 | 61.54 | 0.08497
16 | 1 1 1 1 0 1 0 1 0 0 0 0 1 1 0 0 1 1 1 1 | 87.18 | 61.54 | 0.07875
16 | 1 1 1 0 1 1 0 0 1 1 1 0 1 1 0 0 1 1 1 1 | 97.44 | 61.54 | 0.06720
22 | 0 1 0 0 0 1 0 1 0 0 1 0 1 0 1 0 1 0 0 0 | 87.18 | 61.54 | 0.09534
22 | 0 1 0 0 0 1 0 1 0 0 1 0 1 0 1 0 1 0 1 0 | 94.87 | 61.54 | 0.08828
32 | 1 1 0 0 0 0 0 0 1 1 1 0 1 0 0 0 0 0 1 1 | 89.74 | 69.23 | 0.07957
34 | 1 1 0 1 0 0 0 0 1 1 0 0 1 0 1 1 0 1 1 1 | 97.44 | 76.92 | 0.06574

Table 4.4
Best classification results for the mouth feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
10 | 011101000111101100011100100011 | 94.88 | 92.31 | 0.056066
14 | 011110000100100101110000100100 | 100 | 84.62 | 0.048529
18 | 100011011111011010011000110111 | 100 | 84.62 | 0.03591
30 | 111111100111110001111101000111 | 100 | 84.62 | 0.029281
30 | 111111100111111100111111100011 | 100 | 84.62 | 0.028075
30 | 111111100111111100110011101111 | 100 | 84.62 | 0.02902
32 | 011011000000001001010010101111 | 100 | 84.62 | 0.038419
36 | 011010010101010101110100011010 | 100 | 92.31 | 0.03661

Table 4.5
Feature selections achieving the best recognition rate
Facial region | Hidden units | Feature selection
Left eye region | 14 | l1, l8, l9, l14, l16, l18, l20
Left eye region | 14 | l1, l8, l9, l14, l18, l20
Left eye region | 24 | l1, l5, l6, l8, l9, l12, l13, l15, l16, l19, l20
Left eye region | 34 | l1, l6, l7, l8, l9, l13, l16, l17, l18, l19, l20
Right eye region | 30 | r2, r3, r4, r5, r6, r7, r9, r10, r11, r15, r16, r19, r20
Right eye region | 44 | r2, r3, r4, r5, r7, r9, r10, r14, r16, r19
Nose region | 34 | n1, n2, n4, n9, n10, n13, n15, n16, n18, n19, n20
Mouth region | 10 | m2, m3, m4, m6, m10, m11, m12, m13, m15, m16, m20, m21, m22, m25, m29, m30
Mouth region | 36 | m2, m3, m5, m8, m10, m12, m14, m16, m18, m19, m20, m22, m26, m27, m29

All the feature sets were then combined and fed to the GA–ANN again. The size of the input feature vector increased to 90, and the parameters of the GA–ANN were set exactly the same as in the previous experiments. Table 4.6 lists the best classification results achieved; the recognition rate improved to 100%. For the selections achieving the 100% recognition rate, the selected features were added together to locate the most frequently selected features. The top 10 selected features are shown in Table 4.7. Most of these features are concentrated in the eye regions, and no feature comes from the nose region.

Table 4.6
Classification results for the combined feature set
Hidden units | Training classification rate (%) | Testing classification rate (%) | RMS error
8 | 100 | 100 | 0.030384
24 | 100 | 100 | 0.008929
38 | 100 | 100 | 0.009836
44 | 100 | 100 | 0.01825

Table 4.7
Top 10 most selected features
Rank | Features
1 | l20
2 | l11, r5
3 | m21
4 | r1
5 | m28
6 | r3, r4
7 | r19
8 | m27
9 | l19
10 | r6
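The ranking behind Table 4.7 can be reproduced from the selection counts accumulated by the GA sketch given in Section 2.5 (again an illustrative fragment; the label list and variable names are hypothetical).

```python
import numpy as np

# Hypothetical labels for the combined 90-dimensional feature vector
labels = ([f"l{i}" for i in range(1, 21)] + [f"r{i}" for i in range(1, 21)]
          + [f"n{i}" for i in range(1, 21)] + [f"m{i}" for i in range(1, 31)])

def top_selected(counts: np.ndarray, k: int = 10):
    """Rank features by how often they appeared in the best chromosome."""
    order = np.argsort(counts)[::-1][:k]
    return [(labels[i], int(counts[i])) for i in order]

# Example: counts is the second value returned by ga_select(...)
# print(top_selected(counts))
```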
4.2. Advance experiments

The preliminary experiments achieved very good results, which indicates that the proposed technique is promising. Since the preliminary database is relatively small, the proposed technique needs to be investigated on much larger databases. Two further databases, referred to as Databases 1 and 2, were set up for these experiments. As with the preliminary database, both Databases 1 and 2 are extracted from the FERET database and consist of 50 classes each. In Database 1, there are 150 face images for training and 50 face images for testing. Database 2 includes all the images from Database 1 and enlarges the training set; in Database 2 there are 376 face images for training and 50 face images for testing in total. Section 4.2.1 presents the experimental results from Database 1 and Section 4.2.2 presents the results from Database 2.

4.2.1. Database 1 results

There are four face images per class in Database 1; three of them are randomly selected for training and the remaining one is used for testing. Example images from Database 1 can be found in Fig. 4.2. Two different sets of experiments were conducted on Database 1: in the first set, the average grey level value features were investigated, and in the second set, the PCA features were investigated. Section 4.2.1.1 describes the experiments using average grey level value features and Section 4.2.1.2 explains the experiments using PCA features.

Fig. 4.2. Example images from Database 1. The top three rows show training images and the bottom row shows testing images.

4.2.1.1. Average grey level value features

For the experiments using average grey level value features, two different sizes of small rectangular area for feature extraction were investigated. The size of the small rectangular area was first set to 6 × 4 and then to 10 × 4. Section 4.2.1.1.1 presents the results when the size of the small rectangular area is 6 × 4.

4.2.1.1.1. Small rectangular area size 6 × 4

When the size of the small rectangular area for feature extraction is 6 × 4, the GA–ANN technique was first tested on each facial region feature set separately. During the experiments, the number of hidden units of the ANN was increased from 8 to 64 (in increments of 4 hidden units), and the selections that generated the best recognition rate were recorded. To keep the experiments consistent, the other parameters of the GA–ANN were set exactly the same for every experiment: the number of generations was set to 40, the population size to 10, the crossover rate to 0.9 and the mutation rate to 0.2. The number of epochs for the ANN was set to 10,000.

Tables 4.8–4.11 list the best classification results achieved for each facial region feature set; the highlighted entries in the original tables indicate the highest testing classification rate. The highest testing classification rate is 54% for the left eye feature set (hidden units 44, 56 and 60), 62% for the right eye feature set (hidden units 28), 38% for the nose feature set (hidden units 52) and 70% for the mouth feature set (hidden units 36). The results in Tables 4.8–4.11 show that the mouth region alone achieved the best classification rate, while the nose region achieved the worst classification rate.

Table 4.8
Best classification results for the left eye feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
12 | 1 0 0 0 1 1 1 1 1 1 1 1 0 0 0 1 1 1 0 0 | 82 | 48 | 0.090757
32 | 1 1 0 1 1 0 0 1 1 0 0 1 1 1 0 1 1 0 0 0 | 100 | 52 | 0.065463
40 | 1 1 1 1 1 0 0 0 1 0 1 0 0 1 1 1 0 1 1 0 | 100 | 48 | 0.055730
44 | 1 1 1 1 1 0 0 0 0 1 0 0 1 1 1 1 1 1 0 1 | 100 | 54 | 0.055398
48 | 1 1 1 1 1 0 0 0 0 1 0 0 1 1 1 0 1 1 0 1 | 100 | 50 | 0.055502
56 | 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 0 1 1 1 0 | 100 | 54 | 0.04531
60 | 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 0 1 1 0 1 | 100 | 54 | 0.045535
64 | 1 1 1 1 1 0 0 0 0 1 0 0 1 1 1 0 1 1 0 1 | 100 | 52 | 0.050977

Table 4.9
Best classification results for the right eye feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
16 | 1 1 1 1 1 0 0 1 0 1 0 0 1 1 1 0 1 1 0 1 | 100 | 56 | 0.075269
24 | 1 1 1 1 1 0 0 0 0 1 1 1 0 0 0 1 0 1 0 1 | 98 | 56 | 0.067813
28 | 1 1 1 1 1 0 0 0 1 1 0 0 0 1 0 1 0 0 0 0 | 98.67 | 62 | 0.067738
36 | 1 1 1 1 1 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 | 99.33 | 60 | 0.061682
44 | 1 1 1 1 1 0 0 0 0 1 0 0 1 1 0 1 0 1 0 1 | 100 | 56 | 0.052249
52 | 1 1 0 1 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 1 | 99.33 | 60 | 0.058238
60 | 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 | 99.33 | 58 | 0.062595
64 | 1 1 1 1 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 | 99.33 | 58 | 0.057562

Table 4.10
Best classification results for the nose feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
20 | 1 0 1 0 0 0 1 1 1 1 1 0 1 1 0 1 1 1 0 1 | 82 | 34 | 0.083747
32 | 1 1 1 1 1 1 1 0 1 0 0 1 0 1 0 1 0 1 1 1 | 94.67 | 32 | 0.074111
36 | 1 1 1 0 1 0 1 0 1 1 1 0 0 0 1 0 1 1 1 0 | 91.33 | 32 | 0.078527
44 | 1 0 0 1 0 1 1 0 0 0 0 1 0 0 1 0 0 1 0 1 | 87.33 | 34 | 0.087115
48 | 1 1 0 1 0 0 0 0 0 0 0 0 1 1 0 0 1 0 1 1 | 84.67 | 30 | 0.084655
52 | 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 0 1 1 0 1 | 98.67 | 38 | 0.06674
60 | 1 0 0 1 0 1 1 0 0 0 1 0 0 0 1 0 1 1 0 1 | 88 | 34 | 0.081849
64 | 1 0 1 1 1 0 0 0 0 1 0 0 1 1 1 0 1 1 0 1 | 97.33 | 32 | 0.075388

Table 4.11
Best classification results for the mouth feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
32 | 101101110100011010100100100110 | 100 | 62 | 0.055177
36 | 111111111110111001011000110010 | 100 | 70 | 0.047729
40 | 111111001110011010000100100010 | 100 | 66 | 0.048634
44 | 001110110101100101100100101010 | 100 | 62 | 0.047705
48 | 001110110100011010100100011010 | 100 | 66 | 0.048532
48 | 111111110100011010100100100101 | 100 | 66 | 0.042634
52 | 111111110010011010100100111010 | 100 | 66 | 0.04244
56 | 110001001101101110011000110010 | 100 | 68 | 0.042558
56 | 001101001111101010101001010110 | 100 | 62 | 0.04191
56 | 001101001111101001010101010110 | 100 | 62 | 0.041762
56 | 001101010011101001010101010110 | 100 | 62 | 0.043019
60 | 001010101111101001010101010110 | 100 | 62 | 0.0413
64 | 101001110000010101101000111101 | 100 | 64 | 0.042305

The extracted average grey level value features from the four facial regions were then combined to form the input feature vector for the GA–ANN. The size of the input feature vector increased to 90, and the other parameters of the GA–ANN were set exactly the same as in the experiments with the separate facial feature sets. The feature combination sequence was left eye, right eye, nose and mouth. Table 4.12 lists the results of the combined feature set that achieved above an 80% testing classification rate on Database 1. The results show that the best testing classification rate is 86% (hidden units 40 and 56), with a best training classification rate of 100%. The combined feature set outperformed the separate feature set from each facial region and improved the classification rate significantly.

Table 4.12
Best classification results for the combined feature set in the original order
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
– | 1101000001001000001111001110110111 0001011011100100001011011010000111101 1111101101101110111 | 100 | 80 | 0.03366
36 | 1101000001001000001111001110110111 0001011011100101110100100101111000010 0000010010010001111 | 100 | 80 | 0.03500
40 | 0010111110100111110000111100111101 0010000101000110011111001101101001101 1100111111111101101 | 100 | 86 | 0.02964
48 | 1000101100001101001111001110110111 0001011011100101110100100101111000010 0000010010010001000 | 100 | 82 | 0.02819
56 | 1010111111011011101111001110110111 1110100100011010001010010101111100110 0111110111111101010 | 100 | 86 | 0.01939
60 | 0011110000110001110110100110111111 1100001101000101100011001001011010100 0001001010011011111 | 100 | 84 | 0.02054

To investigate the effect of the feature combination sequence on the recognition rate, the sequence was reversed to mouth, nose, left eye and right eye to form a new input vector, and the same experiments were conducted under the same parameters. Table 4.13 lists the best classification results of the combined feature set in the reverse order.

Table 4.13
Best classification results for the combined feature set in the reverse order
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
28 | 10100101111111110100011011001110101 10101001000001001101100001000011110111 00111111111101101 | 100 | 80 | 0.038631
40 | 00111011010001101010010000101101100 01111101000000001001000100001101101000 00101110000011100 | 100 | 80 | 0.033954
52 | 00011010110001101010010010010101011 00011110000010110001100100110110001001 01101111000001101 | 100 | 84 | 0.023553
56 | 10011101011010010101011100001010011 10100111010011110110100100001101111110 00001000000010101 | 100 | 82 | 0.021568
60 | 00111101011010101010100011110101100 01011100101100110110111011110010010111 11011110111100001 | 100 | 80 | 0.017561
64 | 11011101011010100011110011101101110 00110011010011110111011011001101101000 00100110000011110 | 100 | 86 | 0.016062
64 | 11011101011010101011110011101101110 00110011010011110110111011001101101000 00100110000011110 | 100 | 86 | 0.015144
64 | 01100010100101010101011100001010011 10100100101100001001000100001101101000 00100001111100001 | 100 | 82 | 0.01995
68 | 10011101011010101010100011110101100 01011011010011110110111011110010010111 11011110000011110 | 100 | 82 | 0.015121

The best recognition rate in the reverse order is still 86% (hidden units 64). For the combined feature set in the original order, when the recognition rate is 86% (hidden units 40 and 56), the total selection count of each feature was calculated. Mapping the total selection count of each feature to its corresponding extraction area produces Fig. 4.3, in which the shaded areas are the areas that contain the top selected features; these areas are considered to be the significant areas. There are 36 such areas in total: 9 from the left eye region, 7 from the right eye region, 5 from the nose region and 15 from the mouth region.

The results in Table 4.13 show that the best recognition rate is still 86% when the feature combination sequence is reversed. When the recognition rate is 86% (hidden units 64), the total selection count of each feature was calculated in the same way; mapping these counts to the corresponding extraction areas produces Fig. 4.4, in which the shaded areas again contain the top selected features. There are 49 such areas in total: 7 from the left eye region, 12 from the right eye region, 11 from the nose region and 19 from the mouth region.

4.2.1.2. PCA features

PCA features are extracted separately from each facial region and then combined to form the input feature vector for the GA–ANN. After feature extraction, the sequence of feature combination is left eye, right eye, nose and mouth. Because it is not known in advance how many eigenvectors are suitable for encoding the face images, different numbers of eigenvectors were evaluated; experiments using 10, 14, 18 and 22 eigenvectors were conducted. The parameters of the GA–ANN were set exactly the same for every experiment: the number of generations was set to 40, the population size to 10, the crossover rate to 0.9 and the mutation rate to 0.2. The number of epochs for the ANN was set to 10,000. The number of hidden units was increased from 8 to 68 (in increments of 4 hidden units).
Fig. 4.3. Significant areas in the facial regions (original order combination).
Fig. 4.4. Significant areas in the facial regions (reverse order combination).

Tables 4.14–4.16 present the best classification results for different numbers of eigenvectors. The results given in Table 4.14 show that the best testing classification rate for 10 eigenvectors is 66%, obtained with 56 hidden units; the corresponding best training classification rate is 100%. Table 4.15 shows that the best testing classification rate for 14 eigenvectors is 78%, obtained with 60 and 64 hidden units; the corresponding best training classification rate is 100%. Table 4.16 shows that the best testing classification rate for 18 eigenvectors is 80%, obtained with 60 hidden units; the corresponding best training classification rate is 100%.

Table 4.14
Best classification results for 10 eigenvectors
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
– | 0000001011100100110000101101111010 111001 | 100 | 60 | 0.080433
12 | 1111110101001010001111001110001100 110011 | 98.67 | 60 | 0.076291
36 | 0000001011100100101100110100101100 110011 | 100 | 64 | 0.043994
– | 1000011001111111000001000001011000 110010 | 100 | 62 | 0.035194
52 | 1000011001111111111001000001011000 110010 | 100 | 62 | 0.033143
56 | 0000001011100100101100110100101100 110011 | 100 | 66 | 0.033341

Table 4.15
Best classification results for 14 eigenvectors
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
– | 00001100101100110100101100110011100 001100111111100000100 | 100 | 68 | 0.059964
20 | 11110111101100110100101100110011100 001100111111100011011 | 100 | 68 | 0.055764
40 | 00001100101100110100101100110011100 001100111111100101001 | 100 | 74 | 0.038337
48 | 00001100101100110100101100110111100 001100111111100000100 | 100 | 76 | 0.034414
60 | 00001100101100110100101011110111100 001100111111101100001 | 100 | 78 | 0.026711
64 | 00001100101100110101011100010011100 001100111111100010010 | 100 | 78 | 0.027607

Table 4.16
Best classification results for 18 eigenvectors
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
16 | 11011110101100101010101111110010000 1010001000110101000110110100000000000 | 100 | 70 | 0.064818
32 | 11111001101111101010100100111101111 0101110110110101000110010011101010100 | 100 | 72 | 0.040375
40 | 00001000111010101010111010111010000 0110100000101100000111100111011011100 | 100 | 74 | 0.036662
48 | 11101100011010101010100101001010011 0110100000101100000111100111011011100 | 100 | 68 | 0.030918
56 | 10010111100000111101011010111010000 0110100000101100000111100111011011100 | 100 | 74 | 0.026269
60 | 10101011101010101001001010111110000 0110100000110011011100111111111101101 | 100 | 80 | 0.024114
64 | 00010011100101010101011010111010000 0110100000101100000110110011010100001 | 100 | 74 | 0.027777

4.2.2. Database 2 results

In Database 2, each class has 4–12 images for training and one for testing. Because the combined feature set (using average grey level value features) achieved much better results on Database 1, only combined feature set experiments using average grey level value features were conducted on Database 2. To make the experiments faster, the best feature selections from the previous experiments on Database 1 were used directly to train and test the ANN. The number of epochs was increased to 15,000 because there are more face images in this database, and more hidden units were also used in the experiments.

When the size of the small rectangular area was 6 × 4, the hidden units 40 and hidden units 56 feature selections achieved the best recognition rate on Database 1; these two feature selections were used directly in the experiments on Database 2. The results based on the hidden units 40 feature selection are listed in Table 4.17 and the results based on the hidden units 56 feature selection are presented in Table 4.18. As both tables show, the highest recognition rate improved to 94%.

Table 4.17
Best classification results on Database 2 using the hidden units 40 feature selection from Database 1
Hidden units | Training classification rate | Testing classification rate
30 | 100% | 88%
42 | 100% | 94%
48 | 100% | 88%
52 | 100% | 90%
54 | 100% | 92%
60 | 100% | 90%
66 | 100% | 88%
74 | 100% | 94%
76 | 100% | 90%
82 | 100% | 88%
86 | 100% | 94%

Table 4.18
Best classification results on Database 2 using the hidden units 56 feature selection from Database 1
Hidden units | Training classification rate | Testing classification rate
26 | 100% | 88%
38 | 100% | 92%
44 | 100% | 92%
54 | 100% | 90%
58 | 100% | 94%
68 | 100% | 90%
72 | 100% | 92%
78 | 100% | 88%
80 | 100% | 88%
88 | 100% | 90%

When the size of the small rectangular area was 10 × 4, the hidden units 44 feature selection achieved the best recognition rate on Database 1. Table 4.19 shows the best classification results on Database 2 using the hidden units 44 feature selection. The results given in Table 4.19 show that the highest recognition rate is also improved to 94%.

Table 4.19
Best classification results on Database 2 using the hidden units 44 feature selection
Hidden units | Training classification rate | Testing classification rate
28 | 99.20% | 90%
30 | 99.73% | 88%
34 | 100% | 88%
36 | 99.47% | 90%
38 | 100% | 94%
42 | 100% | 90%
50 | 99.73% | 90%
52 | 100% | 92%
54 | 100% | 94%
56 | 100% | 94%
62 | 100% | 90%
64 | 100% | 92%
66 | 100% | 92%
5. Comparative analysis

The results obtained in this research are compared with the results of the other methods evaluated in a recent study (Zhang et al., 2004). The authors of that study also extracted a dataset from the FERET database as their experimental database, containing 927 images of 119 persons, and evaluated three different methods on it: the kernel associative memory (kAM) method proposed in their study, a PCA-nearest-neighbor method and a simple NN-based template matching method termed ARENA. In this study, we compare against the highest classification rates achieved in their study (Zhang et al., 2004). They conducted two sets of experiments similar to those in our research: the first used 3 images per class for training and the second used 4 images per class for training.

Fig. 5.2 shows the comparison of the best recognition rates between our DB1 (Database 1) experimental results and their first set of results, and Fig. 5.3 shows the comparison of the best recognition rates between our DB2 (Database 2) experimental results and their second set of results. Both figures show that our approach achieves a better recognition rate.

Fig. 5.1. Classification rate comparison between the different feature sets (training and testing rates for the left eye, right eye, nose, mouth and combined feature sets).
Fig. 5.2. Comparison with other approaches (3 images per class in the training set): PCA, ARENA, kAM and GA–ANN.
Fig. 5.3. Comparison with other approaches (4 or more images per class in the training set): PCA, ARENA, kAM and GA–ANN.

6. Conclusions

We have presented a feature selection and fusion technique for face recognition in this paper. A GA for feature selection and an ANN for feature classification are incorporated in the proposed technique, which performs fusion and selection of facial features for face recognition. The significant areas inside each facial region are located through the feature selection.

The FERET benchmark database was adopted to evaluate the proposed technique and to compare it with the existing techniques. Three different databases were used in the experimental investigation, all of them extracted from the FERET benchmark database; Database 2 is the largest, containing 50 classes and 426 face images. The experiments were conducted on a cluster machine at Central Queensland University.

The preliminary experiments were conducted simply to pre-test the proposed technique. They investigated the separate facial region feature sets and the combined feature set using the average grey level value features. The preliminary results were promising: the left eye feature set, the right eye feature set and the mouth feature set all achieved a highest recognition rate of 92.31%, while the nose feature set achieved a highest recognition rate of only 76.92% and was the worst performer. The combined feature set outperformed the separate facial region feature sets by improving the recognition rate to 100%.

On Database 1, many further experiments were conducted. Different sizes of the feature extraction area, different feature extraction techniques and different sequences of feature combination were considered in the experimental investigation. When average grey level value features were used and the size of the small rectangular area was 6 × 4, the left eye feature set achieved a 54% recognition rate, the right eye feature set 62%, the nose feature set 38% and the mouth feature set 70% (see Table 5.1). The mouth feature set was the best performer and the nose feature set was the worst performer. The combined feature set outperformed the separate facial region feature sets by achieving an 86% recognition rate, and the combination sequence did not affect the recognition rate of the combined feature set. For the significant areas, the mouth region contributed the most and the nose region contributed the least.

Table 5.1
Separate facial region feature set results on DB1 (Database 1)
Facial region | Hidden units | Training rate [%] | Testing rate [%]
Left eye region | 44 | 100 | 54
Left eye region | 56 | 100 | 54
Left eye region | 60 | 100 | 54
Left eye region | 64 | 100 | 52
Right eye region | 20 | 97.33 | 60
Right eye region | 28 | 98.67 | 62
Right eye region | 36 | 99.33 | 60
Right eye region | 52 | 99.33 | 60
Nose region | 20 | 82 | 34
Nose region | 44 | 87.33 | 34
Nose region | 52 | 98.67 | 38
Nose region | 56 | 96.67 | 36
Mouth region | 36 | 100 | 70
Mouth region | 40 | 100 | 66
Mouth region | 48 | 100 | 66
Mouth region | 56 | 100 | 68

When the size of the small rectangular area was increased to 10 × 4, the left eye feature set achieved a 54% recognition rate, the right eye feature set 56%, the nose feature set 36% and the mouth feature set 64%. The mouth feature set was still the best performer and the nose feature set was still the worst performer. The combined feature set improved the recognition rate to 86% compared with the separate facial region feature sets (see Table 5.2). The combination sequence slightly affected the recognition rate: the original order combination achieved an 84% recognition rate, while the reverse order combination achieved a slightly higher recognition rate of 86%. For the significant areas, the mouth region still contributed the most and the nose region the least. These results indicate that the mouth region is the most important facial region, and that combining the facial features from all facial regions is much more useful for improving the recognition rate than using any single facial region feature set. Different numbers of eigenvectors were used for the PCA feature experiments; 18 eigenvectors achieved the highest recognition rate of 80%. The average grey level value features (combined feature set) outperformed the PCA features by 6%.

Table 5.2
Combined feature set results on DB1 (Database 1)
Hidden units | Training rate [%] | Testing rate [%]
40 | 100 | 86
44 | 100 | 82
48 | 100 | 82
56 | 100 | 86
60 | 100 | 84
64 | 100 | 84
68 | 100 | 82

The experiments on Database 2 were conducted using only the combined feature set of average grey level value features. The recognition rate improved to 94% (see Table 5.3). The experimental results of the proposed approach were also compared with the results of three other approaches based on the FERET database: PCA, ARENA and kAM. Fig. 5.2 is based on the Database 1 results: the proposed approach improved the recognition rate by 1.3% compared with the kAM method, 36% compared with the PCA method and 41% compared with the ARENA method.

Table 5.3
Combined feature set results on DB2 (Database 2)
Hidden units | Training rate [%] | Testing rate [%]
26 | 99.73 | 88
30 | 100 | 88
38 | 100 | 88
42 | 100 | 94
50 | 100 | 90
Fig. 5.3 is based on the Database 2 results: the proposed technique improved the recognition rate by 2.4% compared with the kAM method, 43.2% compared with the PCA method and 48.3% compared with the ARENA method. The proposed technique achieved the highest recognition rate among the existing techniques evaluated on the FERET database.

References

Bala, J., Huang, J., Vafaie, H., DeJong, K., & Wechsler, H. (1995). Hybrid learning using genetic algorithms and decision trees for pattern classification. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1, 719–724.
Belhumeur, P. N., Hespanha, J. P., & Kriegman, D. J. (1997). Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 711–720.
Brunelli, R., & Poggio, T. (1992). Face recognition through geometrical features. Proceedings of ECCV, 92, 792–800.
Chellappa, R., Wilson, C. L., & Sirohey, S. (1995). Human and machine recognition of faces: A survey. Proceedings of the IEEE, 83, 705–740.
Fang, Y., Tan, T., & Wang, Y. (2002). Fusion of global and local features for face verification. IEEE International Conference on Pattern Recognition, 2, 382–385.
Li, S. Z., & Lu, J. (1999). Face recognition using the nearest feature line method. IEEE Transactions on Neural Networks, 10, 439–443.
Liu, Q., Tang, X., Lu, H., & Ma, S. (2004). Kernel scatter-difference based discriminant analysis for face recognition. In International 17th conference on pattern recognition (Vol. 2, pp. 419–422).
Liu, C., & Wechsler, H. (2000). Evolutionary pursuit and its application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(6), 570–582.
Liu, C., & Wechsler, H. (2003). Independent component analysis of Gabor features for face recognition. IEEE Transactions on Neural Networks, 14, 919–928.
Phillips, P. J., Wechsler, H., Huang, J., & Rauss, P. (1998). The FERET database and evaluation procedure for face recognition algorithms. Image and Vision Computing, 16(5), 295–306.
Sun, Y., & Yin, L. (2005). A genetic algorithm based feature selection approach for 3D face recognition. In Biometric consortium conference, USA.
Turk, M., & Pentland, A. (1991). Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3, 71–86.
Wiskott, L., Fellous, J. M., & Malsburg, C. (1997). Face recognition by elastic bunch graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 775–779.
Yan, S., He, X., Hu, Y., Zhang, H., Li, M., & Cheng, Q. (2004). Bayesian shape localization for face recognition using global and local textures. IEEE Transactions on Circuits and Systems for Video Technology, 1(14), 102–113.
Zhang, B., Zhang, H., & Ge, S. (2004). Face recognition by applying wavelet subband representation and kernel associative memory. IEEE Transactions on Neural Networks, 15, 166–177.