Selection and fusion of facial features for face recognition

Xiaolong Fan, Brijesh Verma *
School of Computing Sciences, Faculty of Business and Informatics, Central Queensland University, Rockhampton, QLD 4701, Australia
* Corresponding author. E-mail addresses: x.fan@cqu.edu.au (X. Fan), b.verma@cqu.edu.au (B. Verma).

Expert Systems with Applications 36 (2009) 7157–7169. doi:10.1016/j.eswa.2008.08.052. © 2008 Elsevier Ltd. All rights reserved.

Keywords: Face recognition; Neural networks; Evolutionary algorithms; Pattern recognition

Abstract

This paper proposes and investigates a facial feature selection and fusion technique for improving the classification accuracy of face recognition systems. The proposed technique is novel in terms of its feature selection and fusion processes. It incorporates neural networks and genetic algorithms for the selection and classification of facial features. The proposed technique is evaluated using both separate facial region features and combined features. The combined features outperform the separate facial region features in the experimental investigation. A comprehensive comparison with other existing face recognition techniques on the FERET benchmark database is included in this paper. The proposed technique produced 94% classification accuracy, which is a significant improvement and the best classification accuracy among the published results in the literature.

1. Introduction

1.1. Background

Face recognition is one of the most remarkable capabilities of human beings. It develops over the early years of childhood and is important for several aspects of our social life. Human beings can remember hundreds or even thousands of faces over a lifetime and can easily identify a familiar face under different perspective variations, such as illumination variations, age variations and pose variations. Face recognition, together with other abilities such as estimating the expression of the people with whom we interact, has played an important role in the course of evolution.

The problem of machine recognition of faces has been studied for more than 30 years. It has attracted research interest from several disciplines such as image processing, pattern recognition, computer vision, neural networks and computer graphics. Such interest has been motivated by the growth of Face Recognition Technology (FRT) applications in many areas, including face identification in law enforcement and forensics, user authentication for building access or automatic teller machines, indexing of, and searching for, faces in video databases, intelligent computer user interfaces, and so on. After the September 11, 2001, terrorist attacks, FRT has gained further interest due to its significant involvement in anti-terror activities. The numerous commercial and law enforcement applications of FRT pose a wide range of technical challenges and require an equally wide range of techniques from different disciplines.

A general statement of the problem of machine recognition of faces can be formulated as follows: given still or video images of a scene, identify or verify one or more persons in the scene using a stored database of faces.
Available collateral information such as race, gender, age, facial expression or speech may be used to narrow the search. The solution to the problem involves face detection, feature extraction from the face region, and face verification or recognition. Face detection refers to determining the exact position and size of a human face in a cluttered scene. Feature extraction refers to obtaining the features that can be fed into a face classification system. Face recognition refers to comparing an input face against models of faces stored in a database of known faces and indicating whether a match is found. Face verification refers to confirming or rejecting the claimed identity of the input face.

Although human beings seem to recognize faces in cluttered scenes with relative ease, machine recognition is much more difficult for a variety of reasons. Firstly, different faces may appear very similar; every face contains two eyes, two ears, one nose and one mouth, which makes discrimination an exacting task. Secondly, different views of the same face may appear quite different due to imaging constraints, such as changes in illumination and variability in facial expressions, and due to the presence of personal accessories such as glasses, beards and hats. Finally, when the face undergoes rotations out of the imaging plane, a large amount of detailed facial structure may be occluded. Therefore, in many implementations of face recognition algorithms to date, the face images are obtained in a constrained environment with controlled illumination, minimal occlusion of facial structures, an uncluttered background, and so on. Face recognition in an unconstrained environment is still a very challenging task.

1.2. Literature review

In the last decade, face recognition has become one of the most active research areas in pattern recognition. Most existing face recognition methods can be broadly classified into three categories: holistic feature-based matching methods, local feature-based matching methods and hybrid matching methods (Chellappa, Wilson, & Sirohey, 1995). In holistic feature-based matching methods, the whole face region is used as the raw input to the recognition system, as in the Principal Component Analysis (PCA) projection method (Turk & Pentland, 1991), the Fisherface method (Belhumeur, Hespanha, & Kriegman, 1997) and the Nearest Feature Line (NFL) method (Li & Lu, 1999). More recently, an Independent Gabor Features (IGF) method (Liu & Wechsler, 2003) and a kernel Associative Memory (kAM) models-based method (Zhang, Zhang, & Ge, 2004) were also applied to face recognition. In local feature-based matching methods, local features such as the eyes, nose and mouth are first extracted, and their locations and local statistics (geometric and/or appearance) are fed into a structural classifier. The geometrical features method (Brunelli & Poggio, 1992) and the Elastic Bunch Graph Matching (EBGM) method (Wiskott, Fellous, & Malsburg, 1997) belong to this category. In hybrid matching methods, both holistic and local features are used for recognition. A feature combination scheme for face recognition by fusion of global and local features is presented in Fang, Tan, and Wang (2002).
A fully automatic system for face recognition in databases with only a small number of samples is presented in Yan et al. (2004); global and local texture features are extracted and used in the recognition.

Genetic Algorithms (GAs) can be used to select an optimal feature set for pattern classification problems, and some researchers have used GAs for face recognition. In Bala, Huang, Vafaie, DeJong, and Wechsler (1995), a GA–ID3 (decision tree learning) method is proposed to find an optimal subset of discriminatory features for pattern classification. The GA was used to search for a possible optimal subset of the extracted features, and ID3 was used to produce a decision tree based on the subset selected by the GA. The GA–ID3 method was tested on the recognition of visual concepts in satellite and face images, and the results showed a significant improvement in classification performance and a good reduction in the dimension of the feature set. In Liu, Tang, Lu, and Ma (2004), a kernel scatter-difference-based discriminant analysis for face recognition is presented. In Sun and Yin (2005), a genetic algorithm was used to select features for 3D face recognition; the method tries to optimize the feature set by capturing good features which minimize the intra-class (within-class) distance and maximize the inter-class (between-class) distance. In Liu and Wechsler (2000), an evolutionary pursuit (EP) approach based on GAs was applied to face recognition; the idea in EP is to search for a face basis through the rotated axes defined in the PCA space. The overall classification rates obtained by the existing techniques are unsatisfactory; therefore there is a need for a better feature selection and fusion technique which could improve the overall classification accuracy of face recognition.

In this paper, a novel feature selection and fusion technique for face recognition is presented. A GA for feature selection and an Artificial Neural Network (ANN) for classification are incorporated into the proposed technique. The proposed technique has been tested on the separate feature set from each facial region and compared with the combined feature set. A large dataset extracted from the FERET benchmark database (Phillips, Wechsler, Huang, & Rauss, 1998) is used for testing. The main research questions are: (1) How can the most significant facial features be selected and combined to improve the overall classification rate of face recognition systems? (2) What is the best combination of these features for a specific classifier?

The original contributions of the research presented in this paper are as follows. (1) Identification of local facial regions using a distance threshold method based on the center coordinate information of each facial region; the facial features are extracted from each facial region. (2) A Genetic Algorithm (GA)-based approach for facial feature selection; the significant areas inside each facial region are located using this approach. (3) An Artificial Neural Network (ANN)-based approach for facial feature classification; the facial features selected by the GA are passed to the ANN for final classification, and the classification error is passed back to the GA to calculate the fitness of each individual. (4) A combined technique for face recognition; the proposed approach is tested on the separate feature set from each facial region and on the combined feature set. The FERET benchmark database is adopted to evaluate and compare the proposed approach.
A comprehensive comparison of the proposed technique with other existing face recognition approaches has been conducted.

2. Proposed technique

This section describes the proposed feature selection and fusion technique for face recognition. Section 2.1 provides an overview of the proposed methodology. Section 2.2 introduces the distance threshold method that is used to locate facial regions. The average grey level value features are discussed in Section 2.3. Section 2.4 describes the PCA features. The details of incorporating GAs and neural networks for feature selection and classification are discussed in Section 2.5.

2.1. Overview

The goal of the proposed technique is to select the most significant facial features effectively and to find the best combination of these features for the classifier. The proposed technique aims to locate the significant areas in facial regions from which the significant features are extracted. Facial regions refer to the separate regions of the face that each contain one local organ, such as the left eye region, right eye region, nose region and mouth region. These facial regions contain the most discriminant facial characteristics of human faces and are the basis for local feature-based feature extraction techniques. Even within these discriminant facial regions, some areas may be more important than others for a recognition task. By locating the most significant areas in the facial regions, the proposed approach removes "noise" information caused by the non-significant areas of each facial region. It may also remove part of the variation information caused by changes in facial expression, head rotation and illumination. Concentrating on these significant areas allows us to extract the most significant facial features to represent human faces, and these features may improve the classification rate of face recognition systems.

The first step in the proposed technique is to locate the facial regions in the face images. Facial feature extraction is then performed on these facial regions. After feature extraction, the features are selected, fused and classified. Through selection, the significant areas are located, and through classification, the input face image is recognised or verified.

The block diagram of the proposed technique, used to conduct experiments with separate and combined features on the FERET benchmark dataset, is depicted in Fig. 2.1. The details are described in the following subsections.

Fig. 2.1. Block diagram of the proposed technique (small and large face datasets, location of facial regions, feature extraction into separate and combined feature sets, GA-based feature selection with the ANN classification error fed back as fitness, and classification of the recognized face).

2.2. Locate facial regions

We first locate the facial regions in each face image and then extract features from them. The experimental face images are extracted from the FERET database. The center coordinate information provided for each facial region, such as the eye center coordinates, nose tip coordinate and mouth center coordinate, is used, and the distance threshold method is applied to locate the local facial regions. The distance threshold method defines distance thresholds in the vertical and horizontal directions for each local facial region; these thresholds decide the size of the facial region.
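As a rough illustration (not part of the original paper), the cropping implied by the distance threshold method could be sketched as follows. The NumPy-based helper, its name and the reading of the thresholds as the full height and width of a region centred on the annotated coordinate are assumptions; the concrete threshold values used in the experiments are given in the next paragraph.

```python
import numpy as np

def crop_region(image: np.ndarray, center_xy, v_thresh: int, h_thresh: int) -> np.ndarray:
    """Cut one facial region out of a grey-level face image.

    v_thresh and h_thresh are treated here as the full height and width of
    the region, centred on the given (x, y) coordinate and clipped to the
    image borders.
    """
    cx, cy = center_xy
    top = max(cy - v_thresh // 2, 0)
    left = max(cx - h_thresh // 2, 0)
    return image[top:top + v_thresh, left:left + h_thresh]

# Hypothetical usage with FERET-style center annotations:
# left_eye = crop_region(face, (le_x, le_y), v_thresh=16, h_thresh=30)
# mouth    = crop_region(face, (m_x, m_y),  v_thresh=12, h_thresh=60)
```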
With the center coordinate information, the facial region is easy to locate. Based on the images in the experimental database, the distance thresholds are set as follows: the vertical distance threshold is set to 16 and the horizontal distance threshold is set to 30 for the eye and nose regions; they are set to 12 and 60, respectively, for the mouth region.

2.3. Average grey level value feature

After locating the facial regions, each facial region is divided equally into small rectangular areas. The average grey level value features are extracted from these small rectangular areas. The average grey level value feature can be expressed as

g_i = \frac{\sum_{(x,y) \in i} p(x, y)}{w \cdot h \cdot v}    (1)

where g_i is the average grey level value feature for the small rectangular area i, p(x, y) is the grey level value of pixel (x, y) inside rectangular area i, w is the width of the small rectangular area, h is its height, and v is the maximum grey level value of the image (255 for the experimental database).

After the division, the average grey level value features are extracted from the small rectangular areas from left to right and top to bottom. In the experiments, the size of the small rectangular area was chosen to be 6 × 4 (w = 6, h = 4). For the left eye region (and likewise the right eye region and the nose region), the size of the extracted feature set is then 20. For the mouth region, the size of the extracted feature set increases to 30 due to the larger size of the mouth region.

2.4. PCA feature

The PCA projection method for face recognition, also called the eigenface method, is a classical face recognition method. The simple idea behind the eigenface method is to capture the largest variances among a set of face images and then use this information to encode and compare face images. The advantage of the eigenface method is the reduction of dimensionality while maximizing the scatter of all the projected samples. Let \{X_1, X_2, \ldots, X_N\} be a set of N sample images taking values in an n-dimensional image space, where each image belongs to one of the c classes \{x_1, x_2, \ldots, x_c\}. A linear transformation is sought that maps the original n-dimensional image space into an m-dimensional feature space, where m < n. The new feature vector y_k \in R^m is defined by

y_k = W^T X_k, \quad k = 1, 2, \ldots, N    (2)

where W \in R^{n \times m} is a matrix with orthonormal columns. W is chosen to maximize the determinant of the total scatter matrix S of the projected samples:

S = \sum_{k=1}^{N} (X_k - \mu)(X_k - \mu)^T    (3)

W_{opt} = \arg\max_W \lvert W^T S W \rvert = (w_1, w_2, \ldots, w_m)    (4)

Here N is the number of sample images and \mu is the mean image of all the samples; \{w_i \mid i = 1, 2, \ldots, m\} is the set of n-dimensional eigenvectors of S corresponding to the m largest eigenvalues. In the experiments, the PCA projection method is applied to the local facial regions instead of the whole face images to extract features.
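For concreteness, a minimal sketch of the two feature extractors is given below. This is not the authors' code; the function names, the NumPy-only implementation and the mean-centring used in the projection comment are assumptions. It computes the average grey level value features of Eq. (1) over non-overlapping rectangles and a per-region PCA basis in the sense of Eqs. (2)-(4).

```python
import numpy as np

def average_grey_features(region: np.ndarray, w: int = 6, h: int = 4, v: int = 255) -> np.ndarray:
    """Average grey level value features (Eq. (1)) for one facial region.

    The region is scanned in non-overlapping w x h rectangles, left to right
    and top to bottom; each feature is the block sum divided by w * h * v.
    """
    rows, cols = region.shape
    feats = []
    for top in range(0, rows - h + 1, h):
        for left in range(0, cols - w + 1, w):
            block = region[top:top + h, left:left + w].astype(float)
            feats.append(block.sum() / (w * h * v))
    return np.asarray(feats)

def pca_basis(samples: np.ndarray, m: int):
    """Mean image and the m leading eigenvectors of the total scatter matrix
    (Eqs. (2)-(4)) for one facial region; `samples` is N x n, one flattened
    region per row."""
    mu = samples.mean(axis=0)
    centred = samples - mu
    scatter = centred.T @ centred                      # S in Eq. (3)
    eigvals, eigvecs = np.linalg.eigh(scatter)         # ascending eigenvalues
    w_opt = eigvecs[:, np.argsort(eigvals)[::-1][:m]]  # m largest eigenvalues
    return mu, w_opt

# Projection of a flattened region x onto the basis: y = w_opt.T @ (x - mu)
```

With the thresholds read as full extents, a 16 × 30 eye or nose region tiled by 6 × 4 rectangles yields the 20 features quoted above, and the 12 × 60 mouth region yields 30.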
2.5. GA–ANN technique

The GA and ANN-based technique is used to identify the significant areas in each facial region and to perform the fusion and selection of features for face recognition. In this research, GAs are used to find potentially significant features which generate a higher recognition rate; the areas that contain these significant features are considered to be the significant areas. The chromosomes represent possible selections of the significant features. Binary encoding is used for the chromosomes, where 1 indicates that a feature is selected and 0 indicates that it is not selected.

In each generation, every chromosome is multiplied element-wise by the input feature set to generate the input feature vector for the ANN, masking out the features that are not selected. The input feature vector F can be represented as

F = CP    (5)

C = (c_1, c_2, \ldots, c_l), \quad c_i \in \{0, 1\}    (6)

P = L + R + N + M    (7)

where C is a single chromosome and c_i is one gene of the chromosome; l is the length of the chromosome, which is the same as the size of the input feature set P. When testing on the separate feature set from each facial region, P represents that separate feature set. As mentioned in the previous section, the size of the left eye feature set L is 20, the size of the right eye feature set R is 20, the size of the nose feature set N is 20 and the size of the mouth feature set M is 30. When they are combined, the size of P is 90. Eq. (7) denotes the combined feature set P, formed by putting the four regional feature sets together.

The input feature vector F is fed to the ANN for classification. An ANN with a single hidden layer is used in this technique, and a resilient backpropagation algorithm is used to train the network. The testing classification error is used to calculate the fitness of the corresponding individual in the GA. During reproduction, the fittest individual, i.e. the one that achieves the best testing classification rate, is retained in the next generation. The chromosomes in each generation of the GA that achieve the best classification rate are recorded; these chromosomes indicate which features are selected and which are not. After all generations, the total number of times that each feature has been selected for the best classification rate is calculated, and all the features are ranked according to the number of times they have been selected. The areas that contain the features in the top n of this ranking are the top n significant areas. For the experiments, n is defined as 3.
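The interaction between the GA and the ANN can be sketched as follows. This is an illustrative outline only, not the authors' implementation: the GA operators (elitism, uniform crossover, bit-flip mutation) are assumptions, and scikit-learn's MLPClassifier is used as a stand-in because resilient backpropagation is not available in that library.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def fitness(chromosome, train_X, train_y, test_X, test_y, hidden_units):
    """Fitness of one individual: testing classification rate of an ANN trained
    on the features kept by the binary chromosome (Eqs. (5)-(7))."""
    mask = chromosome.astype(bool)
    if not mask.any():
        return 0.0
    net = MLPClassifier(hidden_layer_sizes=(hidden_units,), max_iter=3000, random_state=0)
    net.fit(train_X[:, mask], train_y)
    return net.score(test_X[:, mask], test_y)

def ga_select(train_X, train_y, test_X, test_y, hidden_units,
              generations=50, pop_size=15, crossover=0.9, mutation=0.2):
    """Small GA over binary chromosomes; returns the best mask found and the
    per-feature selection counts later used to rank significant areas."""
    n = train_X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n))
    counts = np.zeros(n, dtype=int)
    best_mask, best_fit = pop[0].copy(), -1.0
    for _ in range(generations):
        fits = np.array([fitness(c, train_X, train_y, test_X, test_y, hidden_units)
                         for c in pop])
        gen_best = pop[int(fits.argmax())]
        counts += gen_best                      # record the generation's best selection
        if fits.max() > best_fit:
            best_fit, best_mask = float(fits.max()), gen_best.copy()
        # elitism + uniform crossover of random parent pairs + bit-flip mutation
        parents = pop[rng.integers(0, pop_size, size=(pop_size - 1, 2))]
        cross = rng.random((pop_size - 1, n)) < crossover
        children = np.where(cross, parents[:, 0, :], parents[:, 1, :])
        flip = rng.random(children.shape) < mutation
        children = np.where(flip, 1 - children, children)
        pop = np.vstack([best_mask, children])
    return best_mask, counts
```

The selection counts accumulated over the generations are what Section 4 uses to identify the most frequently selected features and hence the significant areas.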
3. Databases

Three experimental databases were used in this research, all of them extracted from the FERET benchmark database. The preliminary experimental database is a small subset of the FERET database and consists of 13 classes (one class represents one distinct person). In each class there are four face images; three of them are randomly chosen for training and the remaining one is used for testing. The total number of face images in this database is 52. The images are selected carefully in order to have minimum pose variation. Fig. 4.1 shows example images from the preliminary database; the top three rows show training images and the bottom row shows testing images.

The advance databases consist of 50 classes each, with one class representing one distinct person. In the original dataset (DB1) from our previous study, there are four face images in every class; three of them are randomly selected for training and one for testing. The extended dataset (DB2) includes all the images from DB1 plus additional images. In DB2, each class has 4–12 images for training and one for testing; in total there are 376 images for training and 50 images for testing.

Fig. 4.1. Example images of the preliminary database. The top three rows show training images and the bottom row shows testing images.

4. Experimental results

This section describes the experimental results on the small and large databases. The goal of the experiments is to evaluate the proposed technique and to compare it with other existing techniques. All experimental databases are extracted from the FERET benchmark database. Section 4.1 presents the experiments based on the preliminary database, and Section 4.2 describes the advance experiments based on the larger databases.

4.1. Preliminary results

The preliminary experiments were conducted using the preliminary database described in Section 3 (see Fig. 4.1).

At first, experiments on locating the significant areas using GA–ANN were conducted on each facial region separately. The features used in these experiments were average grey level value features; the features extracted from each facial region formed the input feature vector for that region. The size of the small rectangular area for feature extraction was chosen to be 6 × 4. The size of the extracted feature set L from the left eye region was 20, and the same size applied to the extracted feature set R from the right eye region and the extracted feature set N from the nose region. For the mouth region, the size of the extracted feature set M was 30. L, R, N and M can be expressed by the following equations:

L = (l_1, l_2, l_3, \ldots, l_{20}), \quad l_i \in (0, 1)    (8)
R = (r_1, r_2, r_3, \ldots, r_{20}), \quad r_i \in (0, 1)    (9)
N = (n_1, n_2, n_3, \ldots, n_{20}), \quad n_i \in (0, 1)    (10)
M = (m_1, m_2, m_3, \ldots, m_{30}), \quad m_i \in (0, 1)    (11)

These extracted feature sets were then fed to the GA–ANN separately for selection and classification. To keep the experiments consistent, the parameters of the GA–ANN were set exactly the same for every set of experiments. The number of generations was set to 50 and the population size was set to 15. The crossover rate was set to 0.9 and the mutation rate was set to 0.2. The number of hidden units of the ANN was increased from 6 to 44 (in steps of 2 hidden units), and the selections that generated the best recognition rate were recorded. The number of epochs for the ANN was set to 3000.

The best classification results for each facial region feature set are shown in Tables 4.1–4.4, where the highlighted entries in the original tables indicate the highest testing classification rate. The results in Tables 4.1–4.4 show that the eye regions and the mouth region achieve better recognition rates than the nose region. For the nose region, the best recognition rate is just 76.92%, obtained with 34 hidden units. For the left eye region, the best recognition rate is 92.31%, obtained with 14, 24 and 34 hidden units. For the right eye region, the best recognition rate is 92.31%, obtained with 30 and 44 hidden units. For the mouth region, the best recognition rate is also 92.31%, obtained with 10 and 36 hidden units.

The feature selections and combinations obtained when the best recognition rate was achieved for each facial region are shown in Table 4.5. As Table 4.5 shows, the left eye region has four different feature combinations, all of which contain the features l1, l8, l9 and l20. The right eye region has two different feature combinations, both of which contain the features r2, r3, r4, r5, r7, r9, r10, r16 and r19.
The mouth region also has two different feature combinations, both of which contain the features m2, m3, m10, m12, m16, m20, m22 and m29. The nose region has just one feature combination.

Table 4.1
Best classification results for the left eye feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
14 | 1 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 1 | 100 | 92.31 | 0.04719
14 | 1 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 1 0 1 | 97.44 | 92.31 | 0.06064
16 | 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 | 100 | 84.62 | 0.03070
16 | 1 1 0 0 0 1 1 1 1 0 0 1 1 1 1 1 0 0 0 1 | 100 | 84.62 | 0.03960
24 | 1 0 0 0 1 1 0 1 1 0 0 1 1 0 1 1 0 0 1 1 | 100 | 92.31 | 0.03910
30 | 1 1 1 1 1 0 0 0 0 1 0 0 1 1 1 0 1 1 0 1 | 100 | 84.62 | 0.03758
30 | 1 1 1 1 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 | 100 | 84.62 | 0.03948
34 | 1 0 0 0 0 1 1 1 1 0 0 0 1 0 0 1 1 1 1 1 | 100 | 92.31 | 0.03698

Table 4.2
Best classification results for the right eye feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
14 | 0 0 0 1 1 0 1 0 0 1 0 0 1 0 1 0 0 1 0 1 | 100 | 84.62 | 0.04686
16 | 0 0 0 1 1 0 1 1 0 1 1 0 0 0 0 0 0 1 1 0 | 100 | 84.62 | 0.05209
24 | 1 0 0 1 0 1 1 1 0 1 0 0 1 0 0 1 0 1 1 1 | 100 | 84.62 | 0.04268
24 | 1 0 0 1 0 1 1 1 0 1 0 1 1 0 0 1 0 1 1 1 | 100 | 84.62 | 0.03669
30 | 0 1 1 1 1 1 1 0 1 1 1 0 0 0 1 1 0 0 1 1 | 100 | 92.31 | 0.03492
36 | 0 1 0 1 1 0 0 1 0 1 0 1 0 0 0 1 0 1 0 1 | 100 | 84.62 | 0.04036

Table 4.3
Best classification results for the nose feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
14 | 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 1 1 0 | 76.92 | 69.23 | 0.10628
16 | 1 1 1 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 1 1 | 92.31 | 61.54 | 0.08497
16 | 1 1 1 1 0 1 0 1 0 0 0 0 1 1 0 0 1 1 1 1 | 87.18 | 61.54 | 0.07875
16 | 1 1 1 0 1 1 0 0 1 1 1 0 1 1 0 0 1 1 1 1 | 97.44 | 61.54 | 0.06720
22 | 0 1 0 0 0 1 0 1 0 0 1 0 1 0 1 0 1 0 0 0 | 87.18 | 61.54 | 0.09534
22 | 0 1 0 0 0 1 0 1 0 0 1 0 1 0 1 0 1 0 1 0 | 94.87 | 61.54 | 0.08828
32 | 1 1 0 0 0 0 0 0 1 1 1 0 1 0 0 0 0 0 1 1 | 89.74 | 69.23 | 0.07957
34 | 1 1 0 1 0 0 0 0 1 1 0 0 1 0 1 1 0 1 1 1 | 97.44 | 76.92 | 0.06574

Table 4.4
Best classification results for the mouth feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
10 | 011101000111101100011100100011 | 94.88 | 92.31 | 0.056066
14 | 011110000100100101110000100100 | 100 | 84.62 | 0.048529
18 | 100011011111011010011000110111 | 100 | 84.62 | 0.03591
30 | 111111100111110001111101000111 | 100 | 84.62 | 0.029281
30 | 111111100111111100111111100011 | 100 | 84.62 | 0.028075
30 | 111111100111111100110011101111 | 100 | 84.62 | 0.02902
32 | 011011000000001001010010101111 | 100 | 84.62 | 0.038419
36 | 011010010101010101110100011010 | 100 | 92.31 | 0.03661

Table 4.5
Feature selections achieving the best recognition rate
Facial region | Hidden units | Feature selection
Left eye region | 14 | l1, l8, l9, l14, l16, l18, l20
Left eye region | 14 | l1, l8, l9, l14, l18, l20
Left eye region | 24 | l1, l5, l6, l8, l9, l12, l13, l15, l16, l19, l20
Left eye region | 34 | l1, l6, l7, l8, l9, l13, l16, l17, l18, l19, l20
Right eye region | 30 | r2, r3, r4, r5, r6, r7, r9, r10, r11, r15, r16, r19, r20
Right eye region | 44 | r2, r3, r4, r5, r7, r9, r10, r14, r16, r19
Nose region | 34 | n1, n2, n4, n9, n10, n13, n15, n16, n18, n19, n20
Mouth region | 10 | m2, m3, m4, m6, m10, m11, m12, m13, m15, m16, m20, m21, m22, m25, m29, m30
Mouth region | 36 | m2, m3, m5, m8, m10, m12, m14, m16, m18, m19, m20, m22, m26, m27, m29

All the feature sets were then combined and fed to the GA–ANN again. The size of the input feature vector increased to 90, and the parameters of the GA–ANN were set exactly the same as in the previous experiments. Table 4.6 lists the best classification results achieved; the recognition rate improved to 100%. For the selections achieving the 100% recognition rate, the selected features were added together to locate the most frequently selected features. The top 10 selected features are shown in Table 4.7. Most of these features are concentrated in the eye regions, and no feature comes from the nose region.

Table 4.6
Classification results for the combined feature set
Hidden units | Training classification rate (%) | Testing classification rate (%) | RMS error
8 | 100 | 100 | 0.030384
24 | 100 | 100 | 0.008929
38 | 100 | 100 | 0.009836
44 | 100 | 100 | 0.01825

Table 4.7
Top 10 most selected features
Rank | Features
1 | l20
2 | l11, r5
3 | m21
4 | r1
5 | m28
6 | r3, r4
7 | r19
8 | m27
9 | l19
10 | r6
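The ranking behind Table 4.7 can be reproduced from the selection counts accumulated by the GA sketch given in Section 2.5 (again an illustrative fragment; the label list and variable names are hypothetical).

```python
import numpy as np

# Hypothetical labels for the combined 90-dimensional feature vector
labels = ([f"l{i}" for i in range(1, 21)] + [f"r{i}" for i in range(1, 21)]
          + [f"n{i}" for i in range(1, 21)] + [f"m{i}" for i in range(1, 31)])

def top_selected(counts: np.ndarray, k: int = 10):
    """Rank features by how often they appeared in the best chromosome."""
    order = np.argsort(counts)[::-1][:k]
    return [(labels[i], int(counts[i])) for i in order]

# Example: counts is the second value returned by ga_select(...)
# print(top_selected(counts))
```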
4.2. Advance experiments

The preliminary experiments achieved very good results, which indicates that the proposed technique is promising. Since the preliminary database is relatively small, the proposed technique needs to be investigated on much larger databases. Two further databases, referred to as Databases 1 and 2, were set up for these experiments. As with the preliminary database, both Databases 1 and 2 are extracted from the FERET database and consist of 50 classes each. In Database 1, there are 150 face images for training and 50 face images for testing. Database 2 includes all the images from Database 1 and enlarges the training set; in Database 2 there are 376 face images for training and 50 face images for testing in total. Section 4.2.1 presents the experimental results from Database 1 and Section 4.2.2 presents the results from Database 2.

4.2.1. Database 1 results

There are four face images per class in Database 1; three of them are randomly selected for training and the remaining one is used for testing. Example images from Database 1 can be found in Fig. 4.2. Two different sets of experiments were conducted on Database 1: in the first set, the average grey level value features were investigated, and in the second set, the PCA features were investigated. Section 4.2.1.1 describes the experiments using average grey level value features and Section 4.2.1.2 explains the experiments using PCA features.

Fig. 4.2. Example images from Database 1. The top three rows show training images and the bottom row shows testing images.

4.2.1.1. Average grey level value features

For the experiments using average grey level value features, two different sizes of small rectangular area for feature extraction were investigated. The size of the small rectangular area was first set to 6 × 4 and then to 10 × 4. Section 4.2.1.1.1 presents the results when the size of the small rectangular area is 6 × 4.

4.2.1.1.1. Small rectangular area size 6 × 4

When the size of the small rectangular area for feature extraction is 6 × 4, the GA–ANN technique was first tested on each facial region feature set separately. During the experiments, the number of hidden units of the ANN was increased from 8 to 64 (in increments of 4 hidden units), and the selections that generated the best recognition rate were recorded. To keep the experiments consistent, the other parameters of the GA–ANN were set exactly the same for every experiment: the number of generations was set to 40, the population size to 10, the crossover rate to 0.9 and the mutation rate to 0.2. The number of epochs for the ANN was set to 10,000.

Tables 4.8–4.11 list the best classification results achieved for each facial region feature set; the highlighted entries in the original tables indicate the highest testing classification rate. The highest testing classification rate is 54% for the left eye feature set (hidden units 44, 56 and 60), 62% for the right eye feature set (hidden units 28), 38% for the nose feature set (hidden units 52) and 70% for the mouth feature set (hidden units 36). The results in Tables 4.8–4.11 show that the mouth region alone achieved the best classification rate, while the nose region achieved the worst classification rate.

Table 4.8
Best classification results for the left eye feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
12 | 1 0 0 0 1 1 1 1 1 1 1 1 0 0 0 1 1 1 0 0 | 82 | 48 | 0.090757
32 | 1 1 0 1 1 0 0 1 1 0 0 1 1 1 0 1 1 0 0 0 | 100 | 52 | 0.065463
40 | 1 1 1 1 1 0 0 0 1 0 1 0 0 1 1 1 0 1 1 0 | 100 | 48 | 0.055730
44 | 1 1 1 1 1 0 0 0 0 1 0 0 1 1 1 1 1 1 0 1 | 100 | 54 | 0.055398
48 | 1 1 1 1 1 0 0 0 0 1 0 0 1 1 1 0 1 1 0 1 | 100 | 50 | 0.055502
56 | 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 0 1 1 1 0 | 100 | 54 | 0.04531
60 | 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 0 1 1 0 1 | 100 | 54 | 0.045535
64 | 1 1 1 1 1 0 0 0 0 1 0 0 1 1 1 0 1 1 0 1 | 100 | 52 | 0.050977

Table 4.9
Best classification results for the right eye feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
16 | 1 1 1 1 1 0 0 1 0 1 0 0 1 1 1 0 1 1 0 1 | 100 | 56 | 0.075269
24 | 1 1 1 1 1 0 0 0 0 1 1 1 0 0 0 1 0 1 0 1 | 98 | 56 | 0.067813
28 | 1 1 1 1 1 0 0 0 1 1 0 0 0 1 0 1 0 0 0 0 | 98.67 | 62 | 0.067738
36 | 1 1 1 1 1 0 0 0 0 1 0 0 1 0 0 1 0 0 1 0 | 99.33 | 60 | 0.061682
44 | 1 1 1 1 1 0 0 0 0 1 0 0 1 1 0 1 0 1 0 1 | 100 | 56 | 0.052249
52 | 1 1 0 1 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 1 | 99.33 | 60 | 0.058238
60 | 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 | 99.33 | 58 | 0.062595
64 | 1 1 1 1 1 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 | 99.33 | 58 | 0.057562

Table 4.10
Best classification results for the nose feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
20 | 1 0 1 0 0 0 1 1 1 1 1 0 1 1 0 1 1 1 0 1 | 82 | 34 | 0.083747
32 | 1 1 1 1 1 1 1 0 1 0 0 1 0 1 0 1 0 1 1 1 | 94.67 | 32 | 0.074111
36 | 1 1 1 0 1 0 1 0 1 1 1 0 0 0 1 0 1 1 1 0 | 91.33 | 32 | 0.078527
44 | 1 0 0 1 0 1 1 0 0 0 0 1 0 0 1 0 0 1 0 1 | 87.33 | 34 | 0.087115
48 | 1 1 0 1 0 0 0 0 0 0 0 0 1 1 0 0 1 0 1 1 | 84.67 | 30 | 0.084655
52 | 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 0 1 1 0 1 | 98.67 | 38 | 0.06674
60 | 1 0 0 1 0 1 1 0 0 0 1 0 0 0 1 0 1 1 0 1 | 88 | 34 | 0.081849
64 | 1 0 1 1 1 0 0 0 0 1 0 0 1 1 1 0 1 1 0 1 | 97.33 | 32 | 0.075388

Table 4.11
Best classification results for the mouth feature set
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
32 | 101101110100011010100100100110 | 100 | 62 | 0.055177
36 | 111111111110111001011000110010 | 100 | 70 | 0.047729
40 | 111111001110011010000100100010 | 100 | 66 | 0.048634
44 | 001110110101100101100100101010 | 100 | 62 | 0.047705
48 | 001110110100011010100100011010 | 100 | 66 | 0.048532
48 | 111111110100011010100100100101 | 100 | 66 | 0.042634
52 | 111111110010011010100100111010 | 100 | 66 | 0.04244
56 | 110001001101101110011000110010 | 100 | 68 | 0.042558
56 | 001101001111101010101001010110 | 100 | 62 | 0.04191
56 | 001101001111101001010101010110 | 100 | 62 | 0.041762
56 | 001101010011101001010101010110 | 100 | 62 | 0.043019
60 | 001010101111101001010101010110 | 100 | 62 | 0.0413
64 | 101001110000010101101000111101 | 100 | 64 | 0.042305

The extracted average grey level value features from the four facial regions were then combined to form the input feature vector for the GA–ANN. The size of the input feature vector increased to 90, and the other parameters of the GA–ANN were set exactly the same as in the experiments with the separate facial feature sets. The feature combination sequence was left eye, right eye, nose and mouth. Table 4.12 lists the results of the combined feature set that achieved above an 80% testing classification rate on Database 1. The results show that the best testing classification rate is 86% (hidden units 40 and 56), with a best training classification rate of 100%. The combined feature set outperformed the separate feature set from each facial region and improved the classification rate significantly.

Table 4.12
Best classification results for the combined feature set in the original order
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
– | 1101000001001000001111001110110111 0001011011100100001011011010000111101 1111101101101110111 | 100 | 80 | 0.03366
36 | 1101000001001000001111001110110111 0001011011100101110100100101111000010 0000010010010001111 | 100 | 80 | 0.03500
40 | 0010111110100111110000111100111101 0010000101000110011111001101101001101 1100111111111101101 | 100 | 86 | 0.02964
48 | 1000101100001101001111001110110111 0001011011100101110100100101111000010 0000010010010001000 | 100 | 82 | 0.02819
56 | 1010111111011011101111001110110111 1110100100011010001010010101111100110 0111110111111101010 | 100 | 86 | 0.01939
60 | 0011110000110001110110100110111111 1100001101000101100011001001011010100 0001001010011011111 | 100 | 84 | 0.02054

To investigate the effect of the feature combination sequence on the recognition rate, the sequence was reversed to mouth, nose, left eye and right eye to form a new input vector, and the same experiments were conducted under the same parameters. Table 4.13 lists the best classification results of the combined feature set in the reverse order.

Table 4.13
Best classification results for the combined feature set in the reverse order
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
28 | 10100101111111110100011011001110101 10101001000001001101100001000011110111 00111111111101101 | 100 | 80 | 0.038631
40 | 00111011010001101010010000101101100 01111101000000001001000100001101101000 00101110000011100 | 100 | 80 | 0.033954
52 | 00011010110001101010010010010101011 00011110000010110001100100110110001001 01101111000001101 | 100 | 84 | 0.023553
56 | 10011101011010010101011100001010011 10100111010011110110100100001101111110 00001000000010101 | 100 | 82 | 0.021568
60 | 00111101011010101010100011110101100 01011100101100110110111011110010010111 11011110111100001 | 100 | 80 | 0.017561
64 | 11011101011010100011110011101101110 00110011010011110111011011001101101000 00100110000011110 | 100 | 86 | 0.016062
64 | 11011101011010101011110011101101110 00110011010011110110111011001101101000 00100110000011110 | 100 | 86 | 0.015144
64 | 01100010100101010101011100001010011 10100100101100001001000100001101101000 00100001111100001 | 100 | 82 | 0.01995
68 | 10011101011010101010100011110101100 01011011010011110110111011110010010111 11011110000011110 | 100 | 82 | 0.015121

The best recognition rate in the reverse order is still 86% (hidden units 64). For the combined feature set in the original order, when the recognition rate is 86% (hidden units 40 and 56), the total selection count of each feature was calculated. Mapping the total selection count of each feature to its corresponding extraction area produces Fig. 4.3, in which the shaded areas are the areas that contain the top selected features; these areas are considered to be the significant areas. There are 36 such areas in total: 9 from the left eye region, 7 from the right eye region, 5 from the nose region and 15 from the mouth region.

The results in Table 4.13 show that the best recognition rate is still 86% when the feature combination sequence is reversed. When the recognition rate is 86% (hidden units 64), the total selection count of each feature was calculated in the same way; mapping these counts to the corresponding extraction areas produces Fig. 4.4, in which the shaded areas again contain the top selected features. There are 49 such areas in total: 7 from the left eye region, 12 from the right eye region, 11 from the nose region and 19 from the mouth region.

4.2.1.2. PCA features

PCA features are extracted separately from each facial region and then combined to form the input feature vector for the GA–ANN. After feature extraction, the sequence of feature combination is left eye, right eye, nose and mouth. Because it is not known in advance how many eigenvectors are suitable for encoding the face images, different numbers of eigenvectors were evaluated; experiments using 10, 14, 18 and 22 eigenvectors were conducted. The parameters of the GA–ANN were set exactly the same for every experiment: the number of generations was set to 40, the population size to 10, the crossover rate to 0.9 and the mutation rate to 0.2. The number of epochs for the ANN was set to 10,000. The number of hidden units was increased from 8 to 68 (in increments of 4 hidden units).
Fig. 4.3. Significant areas in the facial regions (original order combination).
Fig. 4.4. Significant areas in the facial regions (reverse order combination).

Tables 4.14–4.16 present the best classification results for different numbers of eigenvectors. The results given in Table 4.14 show that the best testing classification rate for 10 eigenvectors is 66%, obtained with 56 hidden units; the corresponding best training classification rate is 100%. Table 4.15 shows that the best testing classification rate for 14 eigenvectors is 78%, obtained with 60 and 64 hidden units; the corresponding best training classification rate is 100%. Table 4.16 shows that the best testing classification rate for 18 eigenvectors is 80%, obtained with 60 hidden units; the corresponding best training classification rate is 100%.

Table 4.14
Best classification results for 10 eigenvectors
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
– | 0000001011100100110000101101111010 111001 | 100 | 60 | 0.080433
12 | 1111110101001010001111001110001100 110011 | 98.67 | 60 | 0.076291
36 | 0000001011100100101100110100101100 110011 | 100 | 64 | 0.043994
– | 1000011001111111000001000001011000 110010 | 100 | 62 | 0.035194
52 | 1000011001111111111001000001011000 110010 | 100 | 62 | 0.033143
56 | 0000001011100100101100110100101100 110011 | 100 | 66 | 0.033341

Table 4.15
Best classification results for 14 eigenvectors
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
– | 00001100101100110100101100110011100 001100111111100000100 | 100 | 68 | 0.059964
20 | 11110111101100110100101100110011100 001100111111100011011 | 100 | 68 | 0.055764
40 | 00001100101100110100101100110011100 001100111111100101001 | 100 | 74 | 0.038337
48 | 00001100101100110100101100110111100 001100111111100000100 | 100 | 76 | 0.034414
60 | 00001100101100110100101011110111100 001100111111101100001 | 100 | 78 | 0.026711
64 | 00001100101100110101011100010011100 001100111111100010010 | 100 | 78 | 0.027607

Table 4.16
Best classification results for 18 eigenvectors
Hidden units | Feature selection (GA chromosome) | Training rate [%] | Testing rate [%] | RMS error
16 | 11011110101100101010101111110010000 1010001000110101000110110100000000000 | 100 | 70 | 0.064818
32 | 11111001101111101010100100111101111 0101110110110101000110010011101010100 | 100 | 72 | 0.040375
40 | 00001000111010101010111010111010000 0110100000101100000111100111011011100 | 100 | 74 | 0.036662
48 | 11101100011010101010100101001010011 0110100000101100000111100111011011100 | 100 | 68 | 0.030918
56 | 10010111100000111101011010111010000 0110100000101100000111100111011011100 | 100 | 74 | 0.026269
60 | 10101011101010101001001010111110000 0110100000110011011100111111111101101 | 100 | 80 | 0.024114
64 | 00010011100101010101011010111010000 0110100000101100000110110011010100001 | 100 | 74 | 0.027777

4.2.2. Database 2 results

In Database 2, each class has 4–12 images for training and one for testing. Because the combined feature set (using average grey level value features) achieved much better results on Database 1, only combined feature set experiments using average grey level value features were conducted on Database 2. To make the experiments faster, the best feature selections from the previous experiments on Database 1 were used directly to train and test the ANN. The number of epochs was increased to 15,000 because there are more face images in this database, and more hidden units were also used in the experiments.

When the size of the small rectangular area was 6 × 4, the hidden units 40 and hidden units 56 feature selections achieved the best recognition rate on Database 1; these two feature selections were used directly in the experiments on Database 2. The results based on the hidden units 40 feature selection are listed in Table 4.17 and the results based on the hidden units 56 feature selection are presented in Table 4.18. As both tables show, the highest recognition rate improved to 94%.

Table 4.17
Best classification results on Database 2 using the hidden units 40 feature selection from Database 1
Hidden units | Training classification rate | Testing classification rate
30 | 100% | 88%
42 | 100% | 94%
48 | 100% | 88%
52 | 100% | 90%
54 | 100% | 92%
60 | 100% | 90%
66 | 100% | 88%
74 | 100% | 94%
76 | 100% | 90%
82 | 100% | 88%
86 | 100% | 94%

Table 4.18
Best classification results on Database 2 using the hidden units 56 feature selection from Database 1
Hidden units | Training classification rate | Testing classification rate
26 | 100% | 88%
38 | 100% | 92%
44 | 100% | 92%
54 | 100% | 90%
58 | 100% | 94%
68 | 100% | 90%
72 | 100% | 92%
78 | 100% | 88%
80 | 100% | 88%
88 | 100% | 90%

When the size of the small rectangular area was 10 × 4, the hidden units 44 feature selection achieved the best recognition rate on Database 1. Table 4.19 shows the best classification results on Database 2 using the hidden units 44 feature selection. The results given in Table 4.19 show that the highest recognition rate is also improved to 94%.

Table 4.19
Best classification results on Database 2 using the hidden units 44 feature selection
Hidden units | Training classification rate | Testing classification rate
28 | 99.20% | 90%
30 | 99.73% | 88%
34 | 100% | 88%
36 | 99.47% | 90%
38 | 100% | 94%
42 | 100% | 90%
50 | 99.73% | 90%
52 | 100% | 92%
54 | 100% | 94%
56 | 100% | 94%
62 | 100% | 90%
64 | 100% | 92%
66 | 100% | 92%
5. Comparative analysis

The results obtained in this research are compared with the results of the other methods evaluated in a recent study (Zhang et al., 2004). The authors of that study also extracted a dataset from the FERET database as their experimental database, containing 927 images of 119 persons, and evaluated three different methods on it: the kernel associative memory (kAM) method proposed in their study, a PCA-nearest-neighbor method and a simple NN-based template matching method termed ARENA. In this study, we compare against the highest classification rates achieved in their study (Zhang et al., 2004). They conducted two sets of experiments similar to those in our research: the first used 3 images per class for training and the second used 4 images per class for training.

Fig. 5.2 shows the comparison of the best recognition rates between our DB1 (Database 1) experimental results and their first set of results, and Fig. 5.3 shows the comparison of the best recognition rates between our DB2 (Database 2) experimental results and their second set of results. Both figures show that our approach achieves a better recognition rate.

Fig. 5.1. Classification rate comparison between the different feature sets (training and testing rates for the left eye, right eye, nose, mouth and combined feature sets).
Fig. 5.2. Comparison with other approaches (3 images per class in the training set): PCA, ARENA, kAM and GA–ANN.
Fig. 5.3. Comparison with other approaches (4 or more images per class in the training set): PCA, ARENA, kAM and GA–ANN.

6. Conclusions

We have presented a feature selection and fusion technique for face recognition in this paper. A GA for feature selection and an ANN for feature classification are incorporated in the proposed technique, which performs fusion and selection of facial features for face recognition. The significant areas inside each facial region are located through the feature selection.

The FERET benchmark database was adopted to evaluate the proposed technique and to compare it with the existing techniques. Three different databases were used in the experimental investigation, all of them extracted from the FERET benchmark database; Database 2 is the largest, containing 50 classes and 426 face images. The experiments were conducted on a cluster machine at Central Queensland University.

The preliminary experiments were conducted simply to pre-test the proposed technique. They investigated the separate facial region feature sets and the combined feature set using the average grey level value features. The preliminary results were promising: the left eye feature set, the right eye feature set and the mouth feature set all achieved a highest recognition rate of 92.31%, while the nose feature set achieved a highest recognition rate of only 76.92% and was the worst performer. The combined feature set outperformed the separate facial region feature sets by improving the recognition rate to 100%.

On Database 1, many further experiments were conducted. Different sizes of the feature extraction area, different feature extraction techniques and different sequences of feature combination were considered in the experimental investigation. When average grey level value features were used and the size of the small rectangular area was 6 × 4, the left eye feature set achieved a 54% recognition rate, the right eye feature set 62%, the nose feature set 38% and the mouth feature set 70% (see Table 5.1). The mouth feature set was the best performer and the nose feature set was the worst performer. The combined feature set outperformed the separate facial region feature sets by achieving an 86% recognition rate, and the combination sequence did not affect the recognition rate of the combined feature set. For the significant areas, the mouth region contributed the most and the nose region contributed the least.

Table 5.1
Separate facial region feature set results on DB1 (Database 1)
Facial region | Hidden units | Training rate [%] | Testing rate [%]
Left eye region | 44 | 100 | 54
Left eye region | 56 | 100 | 54
Left eye region | 60 | 100 | 54
Left eye region | 64 | 100 | 52
Right eye region | 20 | 97.33 | 60
Right eye region | 28 | 98.67 | 62
Right eye region | 36 | 99.33 | 60
Right eye region | 52 | 99.33 | 60
Nose region | 20 | 82 | 34
Nose region | 44 | 87.33 | 34
Nose region | 52 | 98.67 | 38
Nose region | 56 | 96.67 | 36
Mouth region | 36 | 100 | 70
Mouth region | 40 | 100 | 66
Mouth region | 48 | 100 | 66
Mouth region | 56 | 100 | 68

When the size of the small rectangular area was increased to 10 × 4, the left eye feature set achieved a 54% recognition rate, the right eye feature set 56%, the nose feature set 36% and the mouth feature set 64%. The mouth feature set was still the best performer and the nose feature set was still the worst performer. The combined feature set improved the recognition rate to 86% compared with the separate facial region feature sets (see Table 5.2). The combination sequence slightly affected the recognition rate: the original order combination achieved an 84% recognition rate, while the reverse order combination achieved a slightly higher recognition rate of 86%. For the significant areas, the mouth region still contributed the most and the nose region the least. These results indicate that the mouth region is the most important facial region, and that combining the facial features from all facial regions is much more useful for improving the recognition rate than using any single facial region feature set. Different numbers of eigenvectors were used for the PCA feature experiments; 18 eigenvectors achieved the highest recognition rate of 80%. The average grey level value features (combined feature set) outperformed the PCA features by 6%.

Table 5.2
Combined feature set results on DB1 (Database 1)
Hidden units | Training rate [%] | Testing rate [%]
40 | 100 | 86
44 | 100 | 82
48 | 100 | 82
56 | 100 | 86
60 | 100 | 84
64 | 100 | 84
68 | 100 | 82

The experiments on Database 2 were conducted using only the combined feature set of average grey level value features. The recognition rate improved to 94% (see Table 5.3). The experimental results of the proposed approach were also compared with the results of three other approaches based on the FERET database: PCA, ARENA and kAM. Fig. 5.2 is based on the Database 1 results: the proposed approach improved the recognition rate by 1.3% compared with the kAM method, 36% compared with the PCA method and 41% compared with the ARENA method.

Table 5.3
Combined feature set results on DB2 (Database 2)
Hidden units | Training rate [%] | Testing rate [%]
26 | 99.73 | 88
30 | 100 | 88
38 | 100 | 88
42 | 100 | 94
50 | 100 | 90
Fig. 5.3 is based on the Database 2 results: the proposed technique improved the recognition rate by 2.4% compared with the kAM method, 43.2% compared with the PCA method and 48.3% compared with the ARENA method. The proposed technique achieved the highest recognition rate among the existing techniques evaluated on the FERET database.

References

Bala, J., Huang, J., Vafaie, H., DeJong, K., & Wechsler, H. (1995). Hybrid learning using genetic algorithms and decision trees for pattern classification. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1, 719–724.
Belhumeur, P. N., Hespanha, J. P., & Kriegman, D. J. (1997). Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 711–720.
Brunelli, R., & Poggio, T. (1992). Face recognition through geometrical features. Proceedings of ECCV, 92, 792–800.
Chellappa, R., Wilson, C. L., & Sirohey, S. (1995). Human and machine recognition of faces: A survey. Proceedings of the IEEE, 83, 705–740.
Fang, Y., Tan, T., & Wang, Y. (2002). Fusion of global and local features for face verification. IEEE International Conference on Pattern Recognition, 2, 382–385.
Li, S. Z., & Lu, J. (1999). Face recognition using the nearest feature line method. IEEE Transactions on Neural Networks, 10, 439–443.
Liu, Q., Tang, X., Lu, H., & Ma, S. (2004). Kernel scatter-difference based discriminant analysis for face recognition. In International 17th conference on pattern recognition (Vol. 2, pp. 419–422).
Liu, C., & Wechsler, H. (2000). Evolutionary pursuit and its application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(6), 570–582.
Liu, C., & Wechsler, H. (2003). Independent component analysis of Gabor features for face recognition. IEEE Transactions on Neural Networks, 14, 919–928.
Phillips, P. J., Wechsler, H., Huang, J., & Rauss, P. (1998). The FERET database and evaluation procedure for face recognition algorithms. Image and Vision Computing, 16(5), 295–306.
Sun, Y., & Yin, L. (2005). A genetic algorithm based feature selection approach for 3D face recognition. In Biometric consortium conference, USA.
Turk, M., & Pentland, A. (1991). Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3, 71–86.
Wiskott, L., Fellous, J. M., & Malsburg, C. (1997). Face recognition by elastic bunch graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 775–779.
Yan, S., He, X., Hu, Y., Zhang, H., Li, M., & Cheng, Q. (2004). Bayesian shape localization for face recognition using global and local textures. IEEE Transactions on Circuits and Systems for Video Technology, 1(14), 102–113.
Zhang, B., Zhang, H., & Ge, S. (2004). Face recognition by applying wavelet subband representation and kernel associative memory. IEEE Transactions on Neural Networks, 15, 166–177.