key: cord-0685137-bcu0lzqk
authors: Shao, Huikai; Zhong, Dexing
title: Towards Open-set Touchless Palmprint Recognition via Weight-based Meta Metric Learning
date: 2021-08-12
journal: Pattern Recognit
DOI: 10.1016/j.patcog.2021.108247
sha: c5cd1cbdde744a5369d71bafd6ef031fa9b8d2ba
doc_id: 685137
cord_uid: bcu0lzqk

Touchless biometrics has become significant in the wake of novel coronavirus 2019 (COVID-19). Owing to its convenience, user-friendliness, and high accuracy, touchless palmprint recognition shows great potential when hygiene issues are considered during COVID-19. However, previous palmprint recognition methods mainly focus on the closed-set scenario. In this paper, a novel Weight-based Meta Metric Learning (W2ML) method is proposed for accurate open-set touchless palmprint recognition, where only a part of the categories is seen during training. A deep metric learning-based feature extractor is learned in a meta way to improve the generalization ability. Multiple sets are sampled randomly to define support and query sets, which are further combined into meta sets to constrain the set-based distances. In particular, hard sample mining and weighting are adopted to select informative meta sets and improve efficiency. Finally, embeddings with obvious inter-class and intra-class differences are obtained as features for palmprint identification and verification. Experiments are conducted on four palmprint benchmarks including fourteen constrained and unconstrained palmprint datasets. The results show that our W2ML method is more robust and efficient in dealing with the open-set palmprint recognition issue than state-of-the-art methods, with the accuracy increased by up to 9.11% and the Equal Error Rate (EER) decreased by up to 2.97%.

Biometrics is a significant and useful technology for personal authentication using the physical or behavioral characteristics of humans. Several popular biometrics technologies have been widely studied and applied, such as face recognition [1] and fingerprint recognition [2]. Though they have obtained promising performance, they still have shortcomings regarding accuracy and hygiene in some specific scenarios, especially during the outbreak of novel coronavirus 2019 (COVID-19). For example, touch-based fingerprint recognition requires users to press a sensor, which may increase the risk of contracting coronavirus. In this paper, we focus on a promising biometrics technology, touchless palmprint recognition, which has attracted increasing attention from academia and industry [3, 4]. Several touchless palmprint acquisition devices have been built to provide data for research, as shown in Fig. 1. They consist of simple cameras and lights, and some are even smartphones, which shows the potential of touchless palmprint recognition as a convenient and hygienic biometrics technology [5].
There are many excellent palmprint recognition algorithms in the literature, e.g., Local Micro-structure Tetra Pattern (LMTrP) [6] and Local Discriminant Direction Binary Pattern (LDDBP) [24]. However, they often rely on carefully designed hand-crafted features. With the emergence of deep learning, deep metric-based palmprint recognition methods have been proposed to give an end-to-end solution; they show superiority and obtain promising results on specific databases [8, 9]. There are two modes of palmprint recognition, i.e., verification and identification [10]. Verification is a one-to-one matching process that determines whether the tester is who he or she claims to be. Identification is a one-to-many matching process that determines who the tester is. However, most palmprint recognition algorithms focus on closed-set scenarios, where all of the categories are seen during training [14]. When new users join the system, much time has to be spent updating the model. Open-set palmprint recognition, which allows new users to be added easily at any time, is therefore an important biometrics technology to develop. As shown in Fig. 2

2) Hard sample mining and weighting are adopted to improve the efficiency and accuracy. Informative samples are selected to form positive or negative meta sets based on relative distance and are then assigned specific weights to effectively encourage intra-class similarity and inter-class separability.
3) Extensive experiments are conducted on several touchless benchmark palmprint databases, including constrained and unconstrained datasets. The results demonstrate that our W2ML method outperforms other methods by a competitive margin and achieves state-of-the-art results.

This paper consists of 6 sections. Section 2 reviews the related work. Section 3 details our proposed method. Section 4 presents the experiments and results. Section 5 analyzes the results. Section 6 concludes the paper.
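The two matching modes described in this section, verification and identification, can be sketched as follows. This is a minimal illustration and not the authors' code: the toy feature vectors, the `threshold` value, and the function names `verify` and `identify` are assumptions made for the example, and cosine distance is used as the matching score.

```python
import math

def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b) between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def verify(probe, claimed_template, threshold=0.3):
    """One-to-one matching: accept the claimed identity if the distance is small.
    The threshold value is an illustrative assumption."""
    return cosine_distance(probe, claimed_template) <= threshold

def identify(probe, gallery):
    """One-to-many matching: return the enrolled identity closest to the probe."""
    return min(gallery, key=lambda name: cosine_distance(probe, gallery[name]))

# Hypothetical enrolled templates (embeddings) and a probe embedding.
gallery = {"user_a": [1.0, 0.0, 0.1], "user_b": [0.0, 1.0, 0.2]}
probe = [0.9, 0.1, 0.1]

print(verify(probe, gallery["user_a"]))  # True: the claim "I am user_a" is accepted
print(identify(probe, gallery))          # user_a
```

In an open-set setting, enrolling a new user in this sketch only adds one entry to `gallery`; the embedding function itself is not retrained.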
The general pipeline of palmprint recognition includes image acquisition, preprocessing, feature extraction, and matching [15]. Previous methods are mainly based on images collected in a touch-based way. They often require the testers to place their hands on a platform, as in the PolyU palmprint database [16]. However, the touch-based acquisition manner is not very user-friendly and raises hygiene concerns for users [17]. Recently, some palmprint recognition databases have been collected by touchless acquisition devices, such as the Tongji contactless palmprint database [3] and the IITD palmprint database [18]. Some of them are even collected by common mobile phones, such as the Xi'an Jiaotong University Unconstrained Palmprint (XJTU-UP) database [12]. These databases are more suitable for realistic applications, especially in response to the outbreak of COVID-19.

As important steps of palmprint recognition, feature extraction and matching include direction-based, statistics-based, and structure-based methods, which are mainly based on main lines, textures, and folds [15].

Zhao and Zhang [29] proposed Deep Discriminative Representation (DDR) to learn discriminative features with limited palmprint training data using deep convolutional networks. Shao and Zhong [30] proposed a few-shot palmprint recognition framework based on graph neural networks using only a few training samples. Izadpanahkakhk et al. [31] proposed a deep mobile palmprint verification framework with an effective weighted loss function, which could extract discriminative features with high accuracy. Recently, some studies have also focused on cross-database palmprint recognition, such as [32] and [33].

In summary, traditional palmprint recognition methods can obtain relatively accurate results on specific databases, but they usually rely on hand-crafted features.
Deep palmprint recognition methods can construct end-to-end frameworks, but most of them rely on a large amount of training data and require all categories to be seen during training. In contrast, our method gains greater generalization ability to adapt to unseen categories and is more suitable for open-set palmprint recognition.

The goal of metric learning is to learn a distance model that establishes similarity or dissimilarity between different samples, and it has achieved great success in face recognition [34] and image retrieval [35]. Metric learning minimizes the distance of genuine matchings and maximizes the distance of impostor matchings to obtain discriminative features and embeddings. Chopra et al. [36] proposed contrastive loss and a discriminative learning framework based on a Siamese network to drive the similarity metric to be small for faces from the same subject and large for faces from different subjects. Then triplet loss was proposed to improve the performance by using both intra-class and inter-class relations, where three samples form the anchor, positive, and negative samples [37, 38]. Ge et al. [39] proposed a new Hierarchical Triplet Loss (HTL), which collects informative training samples automatically to cope with the limitation of random sampling in the original triplet loss. Song et al. [40] also modified triplet loss and proposed a lifted structure loss that attempts to take full advantage of the training batches. Wang et al. [41] proposed Multi-Similarity loss (MS loss) to provide a principled approach for collecting and weighting informative pairs. Duan et al. [42] proposed Deep Adversarial Metric Learning (DAML) to generate synthetic hard negatives from the original negative samples and trained the feature extractor and hard negative generator using adversarial learning. These methods can obtain satisfactory performance on some visual tasks, such as face identification and verification.
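For readers unfamiliar with the pairwise and triplet objectives mentioned above, the following sketch shows their common textbook forms on toy 2-D embeddings. It is an illustration under stated assumptions, not the exact formulation of [36]-[38]: the margin values, the use of Euclidean distance, and the example vectors are all chosen here for demonstration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(a, b, same_class, margin=1.0):
    """Pull genuine pairs together; push impostor pairs at least `margin` apart."""
    d = euclidean(a, b)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Ask the anchor-negative distance to exceed the anchor-positive distance by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

anchor, positive = [0.0, 0.0], [0.1, 0.0]
easy_negative, hard_negative = [1.0, 0.0], [0.2, 0.0]

# An easy negative already satisfies the margin, so it contributes no loss,
# while a hard negative close to the anchor is penalized.
print(triplet_loss(anchor, positive, easy_negative))
print(triplet_loss(anchor, positive, hard_negative))
print(contrastive_loss(anchor, hard_negative, same_class=False))
```

The example also illustrates why sample selection matters for such losses: triplets that already satisfy the margin produce zero loss and therefore no training signal.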
However, these metric learning methods may not adapt well to palmprint recognition, especially in open-set recognition scenarios. Traditional metric learning methods are hungry for vast amounts of labeled data. For palmprint recognition, due to privacy and cost concerns, it is difficult to collect as much training data as for face recognition. The model is therefore required to learn greater interpretability and generalization ability from a small number of training samples.

Another line of related work is meta learning, which aims to enable a base model to adapt to new tasks by extracting transferable knowledge from auxiliary tasks [43]. Finn et al. [44] proposed Model-Agnostic Meta-Learning (MAML) to search for a weight configuration such that the given network can be effectively fine-tuned within a few update steps. Sung et al. [45] proposed Relation Network (RN) to learn a deep distance metric by computing the scores of query images and support samples. Snell et al. [46] proposed Prototypical Networks, which learn a prototype representation for each class in the metric space and perform classification by computing the distances to the prototype representations. Then, Medina and Devos [47] pre-trained the model using self-supervised learning to improve the performance of Prototypical Networks. Garcia and Bruna [48] proposed a graph neural network-based meta learning method and obtained state-of-the-art results on several tasks. Chen et al. [49] proposed a Deep Meta Metric Learning (DMML) framework for visual recognition and proved that softmax and triplet loss are consistent in the meta space. Xu et al. [50] proposed a detection method based on meta learning to distinguish and compare a pair of traffic samples. Wu et al. [51] proposed a deep adversarial learning-based meta learning method for video-based person re-ID using Variational Recurrent Neural Networks (VRNNs).
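As a concrete illustration of the prototype idea behind Prototypical Networks [46], the sketch below averages the support embeddings of each class and assigns a query to the nearest prototype. It is a simplified stand-in: the embeddings are hand-written 2-D vectors rather than outputs of a learned network, and the labels `palm_1` and `palm_2` are hypothetical.

```python
import math
from collections import defaultdict

def prototypes(support):
    """Average the support embeddings of each class into a single prototype."""
    grouped = defaultdict(list)
    for label, embedding in support:
        grouped[label].append(embedding)
    return {
        label: [sum(column) / len(embeddings) for column in zip(*embeddings)]
        for label, embeddings in grouped.items()
    }

def classify(query, protos):
    """Assign the query to the class whose prototype is nearest (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(protos, key=lambda label: dist(query, protos[label]))

# Toy support set: (label, embedding) pairs for two hypothetical palms.
support = [
    ("palm_1", [0.0, 0.0]), ("palm_1", [0.2, 0.0]),
    ("palm_2", [1.0, 1.0]), ("palm_2", [1.0, 0.8]),
]
protos = prototypes(support)
print(classify([0.1, 0.1], protos))  # palm_1
```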
These meta learning methods are mainly used for few-shot learning scenarios, where many few-shot tasks are formed to evaluate the model during testing. However, this may not be suitable for palmprint identification and verification, because discriminative features need to be extracted and further matched with each other. In this paper, our proposed method focuses on more general metric learning for visual recognition problems, like other palmprint recognition algorithms such as [8] and [14].

Open-set palmprint recognition aims to train the model using only a part of the images. Episode-based training is adopted, and the embeddings of query samples are matched with those of support samples to obtain the correct identity. During testing, traditional meta learning methods are mainly evaluated on few-shot learning tasks, which construct many sub-tasks as in the training iterations. In contrast, our W2ML method focuses on more general metric learning-based biometric recognition problems and obtains the embeddings of images as low-dimensional features to carry out palmprint identification and verification tasks.

The W2ML method is trained end-to-end on episodes to adapt to unseen samples. In each episode, the meta metric is learned to correctly identify the query images in the query set $Q$ using the support images in the support set $S$ by constraining their distances. The optimization objective is formulated in terms of $D(\cdot)$, which represents the distance. Traditional deep metric-based palmprint recognition methods train the model by operating on the distance between sample pairs. For example, the contrastive loss-based DHN tries to pull positive palmprint pairs closer and push negative palmprint pairs apart [14]. In contrast, benefiting from the special training data sampling format of meta learning, our W2ML model is optimized with set-based distances to improve the generalization ability.
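The episode-based sampling described above can be sketched as follows. The sketch assumes that each episode draws N categories and k images per category and holds out one image per category as a query; how the k images are actually split between support and query sets is not stated in this text, so the `n_query` parameter, the function name `sample_episode`, and the toy dataset are assumptions.

```python
import random

def sample_episode(labels_to_images, n_way, k, n_query=1, rng=None):
    """Draw one episode: n_way categories, k images each, split into support and query."""
    rng = rng or random.Random()
    classes = rng.sample(sorted(labels_to_images), n_way)
    support, query = [], []
    for label in classes:
        images = rng.sample(labels_to_images[label], k)
        # Assumed split: the first n_query images are queries, the rest are support.
        query += [(label, image) for image in images[:n_query]]
        support += [(label, image) for image in images[n_query:]]
    return support, query

# Hypothetical training set: 100 categories with 10 image identifiers each.
dataset = {f"palm_{c}": [f"palm_{c}_img{i}" for i in range(10)] for c in range(100)}

# N = 32 categories and k = 4 images per category, the values used in the experiments.
support, query = sample_episode(dataset, n_way=32, k=4, rng=random.Random(0))
print(len(support), len(query))  # 96 32
```

Because a fresh subset of categories is drawn in every episode, the model repeatedly solves small matching tasks over categories it has not just seen, which is the property the meta training relies on.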
Specifically, in the feature space, all the features of the same category in the support set are combined into a meta support set $S_{meta}^{j}$, where $j$ denotes the $j$-th category and $f(\cdot)$ denotes the embedding function implemented by Convolutional Neural Networks (CNN). $w_{ij}$ is the weight for image $x_i^j$. Therefore, the distances between query samples and meta support sets are constrained to learn discriminative representations. Similar to (2), the distance is defined between $x_q^{j'}$ and $S_{meta}^{j}$, where $x_q^{j'}$ is a query image of the $j'$-th category, $S_{meta}^{j}$ is the $j$-th meta support set, and $d(\cdot)$ represents the distance between features of different samples extracted by $f(\cdot)$. $d(\cdot)$ can be Euclidean distance or cosine distance, and cosine distance is adopted in this paper.

During each episode-based training iteration, the query samples and meta support sets of the same class are combined into positive meta sets, and the query samples and meta support sets of different classes are combined into negative meta sets. Our W2ML method is optimized by constraining the distances of positive meta sets and negative meta sets, i.e., (3). However, it is difficult and inefficient to train directly. So, inspired by [41], the strategy of hard sample mining and weighting is adopted.

Firstly, informative samples are selected to form positive or negative meta sets, based on the relative similarity between negative and positive meta sets. For a query sample $x_q^{j'}$, a positive pair $(x_q^{j'}, x_s^{j})$ is selected by a criterion in which $m$ is a margin. Similarly, for the query sample $x_q^{j'}$, a negative pair $(x_q^{j'}, x_s^{j})$ is selected.

Then the selected pairs are weighted. The pair in Fig. 4 (a) is farther apart than that in Fig. 4 (b), and thus its weight should be increased accordingly. For a selected pair $(x_q^{j'}, x_s^{j})$, $j' = j$, in the positive meta sets, its weight is given by (6), where $\alpha$ and $\beta$ are two hyper-parameters. Similarly, the weight (7) of a selected negative pair is calculated from that single negative pair and all negative sample pairs in the negative meta set.
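The mining and weighting steps above can be illustrated with the following sketch. It does not reproduce the selection rules or the weights (6) and (7) of the source, which are not legible in this text. Instead it assumes a margin-based selection in the spirit of [41] and a softmax-style weight that compares one selected pair with all selected pairs of the same kind. The function names, the margin of 0.25, and the use of 2 and 40 as scale factors are assumptions made for illustration.

```python
import math

def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b), the distance adopted in the paper."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def mine_pairs(query, query_label, support, margin):
    """Keep only informative pairs for one query (assumed selection rule).

    A positive pair is kept unless it is closer than the nearest negative by
    more than `margin`; a negative pair is kept unless it is farther than the
    farthest positive by more than `margin`. Assumes both kinds of pairs exist.
    """
    pos = [cosine_distance(query, e) for label, e in support if label == query_label]
    neg = [cosine_distance(query, e) for label, e in support if label != query_label]
    hard_pos = [d for d in pos if d > min(neg) - margin]
    hard_neg = [d for d in neg if d < max(pos) + margin]
    return hard_pos, hard_neg

def relative_weights(distances, scale):
    """Softmax-style weights over the selected pairs (assumed form).

    A positive `scale` gives larger weights to farther pairs, and a negative
    `scale` gives larger weights to closer pairs.
    """
    if not distances:
        return []
    shift = max(scale * d for d in distances)  # subtract the max for numerical stability
    exps = [math.exp(scale * d - shift) for d in distances]
    total = sum(exps)
    return [e / total for e in exps]

# Toy support embeddings for two hypothetical categories "a" and "b".
support = [("a", [1.0, 0.0]), ("a", [0.6, 0.8]), ("b", [0.0, 1.0]), ("b", [-1.0, 0.0])]
hard_pos, hard_neg = mine_pairs([0.8, 0.6], "a", support, margin=0.25)
print(len(hard_pos), len(hard_neg))  # 1 1: the easy positive and easy negative are dropped

# Farther positives and closer negatives receive the larger weights.
pos_weights = relative_weights([0.2, 0.6], scale=2.0)
neg_weights = relative_weights([0.4, 0.9], scale=-40.0)
print(pos_weights, neg_weights)
```

A weighted set-based distance can then be formed as the sum of each selected distance multiplied by its weight, so that hard pairs dominate the contribution of each meta set.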
Therefore, the overall optimization objective of the W2ML method in an episode is formulated in (8), which involves the number of query samples. The weights in (8) are those defined in (6) and (7), respectively. $\alpha$ is set to 2 and $\beta$ is set to 40, as in [41]. The pseudocode of the W2ML method is provided in Algorithm 1.

The XJTU-UP database is collected by five smartphones in an unconstrained manner, i.e., iPhone 6S, HUAWEI Mate8, LG G4, Samsung Galaxy Note5, and MI8 [12]. Two kinds of illumination are adopted, one of which is indoor natural illumination. The ROIs are extracted with the size of 224×224 pixels using the method in [12]. Fig. 5 shows some typical images in the XJTU-UP database.

The Tongji contactless palmprint database consists of 12,000 palm images of right and left hands from 300 individuals, collected with a touchless device [3]. For each hand, 20 images were collected in two sessions, with 10 images in each session. The images are cropped into ROIs with the size of 224×224 pixels. Fig. 6 shows some samples.

The database in [53] is also an unconstrained palmprint database collected by multi-brand smartphones, Huawei and Xiaomi.

During the experiments, each database is divided into a training set and a test set. The network in [54] is adopted as the backbone of the feature extractor, and its last layer is followed by a 128-dimensional fully connected layer to obtain 128-dimensional features. Empirically, for every task, the number of categories N is set to 32 and k is set to 4. In addition, benefiting from transfer learning [55], the weights pre-trained on ImageNet are adopted to initialize the neural networks, which are then fine-tuned on the databases. The experiments are implemented in PyTorch on an NVIDIA GTX 1080 GPU. The Adam optimizer is applied and the base learning rate is set to 0.0002. The implementation details are summarized in Table 1.

In order to further evaluate the effectiveness of our method, cross-database palmprint recognition is also carried out on the XJTU-UP database. HF and IN are selected as training sets and the remaining datasets are selected as test sets.
The first 100 categories are used to train the model and the remaining 100 categories are used for testing. Palmprint identification and verification are also performed.

Open-set palmprint recognition is a difficult task, especially for unconstrained databases. The test set contains many unseen samples from unknown categories. These images vary in illumination, angle, and noise, which may cause significant degradation of performance. The model can only learn potential information from the training set, which requires it to have a strong generalization ability, especially for cross-database recognition. Nevertheless, the results show that our W2ML method obtains promising open-set palmprint identification and verification performance on several databases. Thanks to meta learning, W2ML learns to accurately distinguish between positive and negative pairs in a meta way. Traditional metric learning methods pay much attention to independent samples. They have difficulty finding the difference between palmprint images of different categories and learning the similarity between samples of the same category. W2ML treats the single overall classification task as multiple sub-tasks and adopts a set-based distance instead of a sample-based one to learn a discriminative metric in each sub-task. Informative samples are further selected and assigned specific weights, which drives the model to focus on difficult samples and improves training efficiency. The novel sampling and training therefore make W2ML better suited to open-set palmprint recognition.

In the experiments, different databases are adopted. Constrained palmprint images yield better performance than unconstrained ones. This may be because the unconstrained images have more complex lighting, backgrounds, and angles, as shown in Figs. 5 and 7, which increases the difficulty of ROI and feature extraction.
In contrast, it is relatively easy to segment stable ROIs in the Tongji and IITD databases, which further helps to extract discriminative features. In addition, it can be observed that cross-database palmprint recognition is more difficult. There are significant gaps between different datasets, which prevent the model from learning the feature distribution of the test set well. According to the results, the best performance is obtained when the databases are collected with similar devices or illuminations, so that the gap is relatively small.

In Fig. 9, the accuracy increases consistently with the feature dimension, while the EER decreases consistently with the feature dimension.

In order to evaluate the effectiveness of our model, we further conduct several experiments to compare it with other works. Different deep learning and non-deep learning palmprint recognition algorithms are evaluated, as follows:
• DDBPD [7] extracts several DDBCs by calculating the convolution difference vector and concatenates them as a global feature vector for palmprint recognition.
• LDDBP [24] extracts the discriminative direction features of palmprint images based on an exponential and Gaussian fusion model (EGM).
• DHN [14] is a deep learning-based palmprint recognition method, which transforms palmprint images into binary codes as discriminative features. Here, VGG-16 pre-trained on ImageNet is adopted as the backbone.
• ALDC [56] is a novel double-layer direction extraction method, which extracts apparent and latent direction features for palmprint recognition.
• DEH [28] trains several local feature extractors and concatenates their features as a global discriminative feature based on online gradient boosting. Activation loss and adversarial loss are constructed to increase the diversity of learners. VGG-16 pre-trained on ImageNet is also adopted.
• PalmNet [8] applies Gabor filters in a CNN to extract discriminative specific descriptors of palmprint images.
• Softmax loss [57] is a popular probabilistic interpretation loss widely used for classification tasks.
• Contrastive loss [36] aims to shorten the distances between positive samples and push negative samples away.
• Lifted structure loss [40]

Fine-tuning is adopted and the weights pre-trained on ImageNet are used for initialization. The same validation protocol and evaluation metrics, accuracy and EER, are also adopted. The results are shown in Tables 9, 10, and 11.

When databases with more training data are adopted, deep learning-based methods can obtain better performance, which indicates the superiority of deep learning. In the future, more and more palmprint images collected by touchless devices will become available, so deep learning-based palmprint recognition algorithms may gradually become the mainstream.

Contrastive loss is a sample-based optimization method, which aims to minimize the distances of genuine matchings and push impostor matchings away from each other. However, it treats different samples equally. Similarly, DEH and DHN are mainly based on contrastive loss, so they may be suitable for closed-set palmprint recognition but do not adapt to unseen categories in open-set recognition. Lifted structure loss only considers the negative relative similarity, and it cannot adapt to unseen samples very well. In contrast, our meta learning-based optimization process can learn more potential information for better generalization.

In this paper, a novel deep metric-based method, W2ML, is proposed for open-set touchless palmprint recognition. Only a part of the categories is adopted to train the model, which satisfies the open-set recognition scenario. In order to adapt to unseen palmprint samples, our W2ML method performs metric learning in a meta way to improve its generalization ability and obtain discriminative embeddings. Specifically, multiple subsets are sampled from the training set to define different tasks.
In each sub-task, the features of the same category in the support set are combined into a meta support set. During each episode-based learning iteration, query samples and meta support sets are further combined into positive and negative meta sets to constrain the set-based distances. In addition, hard sample mining and weighting are adopted to select informative meta sets, which are then given specific weights to improve the efficiency. Extensive experiments including palmprint identification and verification are conducted on several constrained and unconstrained palmprint databases. Compared with the baselines, the identification accuracy is increased by up to 9.11% and the EER of palmprint verification is decreased by up to 2.97%. The results demonstrate the superiority of our W2ML method on open-set touchless palmprint recognition.

Touchless palmprint recognition is a significant and promising biometrics technology, especially when hygiene issues are considered in response to the outbreak of COVID-19. Touchless palmprint recognition uses a visual sensor, which does not require users to touch the device directly. It can therefore avoid cross-infection between users, which is particularly important during the COVID-19 pandemic. The experimental results show that our W2ML method improves the performance of touchless palmprint recognition, which helps to promote its practical application. In the future, domain adaptation strategies can be introduced to close the gaps between different databases and improve cross-database open-set palmprint recognition.
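The Equal Error Rate used as the verification metric throughout the paper is the operating point at which the false acceptance rate (FAR) equals the false rejection rate (FRR). The sketch below approximates it by sweeping every observed distance as a decision threshold. The function name and the toy distance scores are assumptions for illustration and are not results from the paper.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from genuine and impostor matching distances.

    A comparison is accepted when its distance is at most the threshold.
    Every observed distance is tried as a threshold, and the error rate is
    taken at the threshold where FAR and FRR are closest to each other.
    """
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(genuine) | set(impostor)):
        far = sum(d <= threshold for d in impostor) / len(impostor)
        frr = sum(d > threshold for d in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Toy distances: one genuine pair is far apart and one impostor pair is close.
genuine = [0.05, 0.10, 0.15, 0.40]
impostor = [0.30, 0.60, 0.70, 0.90]
print(equal_error_rate(genuine, impostor))  # 0.25
```

With perfectly separated genuine and impostor distances the function returns 0.0, so a lower EER corresponds to better verification performance.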
[1] An Encryption Approach Using Information Fusion Techniques Involving Prime Numbers and Face Biometrics
[2] End-to-End Latent Fingerprint Search
[3] Towards contactless palmprint recognition: A novel device, a new benchmark, and a collaborative representation based identification approach
[4] Towards Cross-dataset Palmprint Recognition via Joint Pixel and Feature Alignment
[5] Toward Unconstrained Palmprint Recognition on Consumer Devices: A Literature Review
[6] Palmprint recognition with Local Micro-structure Tetra Pattern
[7] Learning Discriminant Direction Binary Palmprint Descriptor
[8] PalmNet: Gabor-PCA Convolutional Networks for Touchless Palmprint Recognition
[9] Centralized Large Margin Cosine Loss for Open-Set Deep Palmprint Recognition
[10] An introduction to biometric recognition
[11] Developing a contactless palmprint authentication system by introducing a novel ROI extraction method
[12] Efficient Deep Palmprint Recognition via Distilled Hashing Coding, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops
[13] Ordinal palmprint represention for personal identification
[14] A Hand-Based Multi-Biometrics via Deep Hashing Network and Biometric Graph Matching
[15] Decade progress of palmprint recognition: A brief survey
[16] Online palmprint identification
[17] A SIFT-based contactless palmprint verification approach using iterative RANSAC and local palmprint descriptors
[18] Personal Identification Using Multibiometrics Rank-Level Fusion
[19] Competitive coding scheme for palmprint verification
[20] Palmprint verification using binary orientation co-occurrence vector
[21] Fragile Bits in Palmprint Recognition
[22] Palmprint Recognition Using Neighboring Direction Indicator
[23] Histogram of Oriented Lines for Palmprint Recognition
[24] Local Discriminant Direction Binary Pattern for Palmprint Representation and Recognition
[25] Palmprint identification performance improvement via patch-based binarized statistical image features
[26] One-shot cross-dataset palmprint recognition via adversarial domain adaptation
[27] Palmprint Recognition in Uncontrolled and Uncooperative Environment
[28] Effective deep ensemble hashing for open-set palmprint recognition
[29] Deep discriminative representation for generic palmprint recognition
[30] Few-shot palmprint recognition via graph neural networks
[31] Joint feature fusion and optimization via deep discriminative model for mobile palmprint verification
[32] PalmGAN for cross-domain palmprint recognition
[33] Cross-Domain Palmprint Recognition Based on Transfer Convolutional Autoencoder
[34] Face re-identification challenge: Are face recognition models good enough?
[35] Scalable Deep Hashing for Large-Scale Social Image Retrieval
[36] Learning a similarity metric discriminatively, with application to face verification
[37] Distance Metric Learning for Large Margin Nearest Neighbor Classification
[38] Deep Metric Learning Using Triplet Network, 3rd International Workshop on Similarity-Based Pattern Recognition (SIMBAD)
[39] Deep Metric Learning with Hierarchical Triplet Loss, 15th European Conference on Computer Vision
[40] Deep Metric Learning via Lifted Structured Feature Embedding
[41] Multi-Similarity Loss With General Pair Weighting for Deep Metric Learning
[42] Deep Adversarial Metric Learning
[43] Few-Shot Learning for Palmprint Recognition via Meta-Siamese Network
[44] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
[45] Learning to Compare: Relation Network for Few-Shot Learning
[46] Prototypical Networks for Few-shot Learning, 31st Annual Conference on Neural Information Processing Systems (NIPS)
[47] Self-Supervised Prototypical Transfer Learning for Few-Shot Classification
[48] Few-Shot Learning with Graph Neural Networks
[49] Deep Meta Metric Learning
[50] A Method of Few-Shot Network Intrusion Detection Based on Meta-Learning Framework
[51] Few-Shot Deep Adversarial Learning for Video-Based Person Re-Identification
[52] Deep Distillation Hashing for Unconstrained Palmprint Recognition
[53] Towards Palmprint Verification On Smartphones
[54] Deep Residual Learning for Image Recognition
[55] A Survey on Transfer Learning
[56] Local apparent and latent direction extraction for palmprint recognition
[57] Improving neural networks by preventing co-adaptation of feature detectors

Candidate in the School of Automation Science and Engineering, Xi'an Jiaotong University. He received his Bachelor's degree from Chongqing University in 2017.

He was a visiting scholar with University of Illinois at Urbana-Champaign, United States.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.