key: cord-0872110-pa5nhjbc
authors: Cabani, Adnane; Hammoudi, Karim; Benhabiles, Halim; Melkemi, Mahmoud
title: MaskedFace-Net – A dataset of correctly/incorrectly masked face images in the context of COVID-19
date: 2020-11-28
journal: Smart Health (Amst)
DOI: 10.1016/j.smhl.2020.100144
sha: df86f806a8a1e83bb0d069f00cfb5d78997a6dfd
doc_id: 872110
cord_uid: pa5nhjbc

Wearing face masks appears to be a solution for limiting the spread of COVID-19. In this context, efficient recognition systems are expected for checking that people's faces are masked in regulated areas. Hence, a large dataset of masked faces is necessary for training deep learning models to detect people wearing masks and those not wearing masks. Currently, there is no available large dataset of masked face images that makes it possible to check whether faces are correctly masked or not. Indeed, many people do not wear their masks correctly due to bad practices, bad behaviors or vulnerability of individuals (e.g., children, old people). For these reasons, several mask wearing campaigns intend to raise awareness of this problem and of good practices. In this sense, this work proposes an image editing approach and three types of masked face detection datasets; namely, the Correctly Masked Face Dataset (CMFD), the Incorrectly Masked Face Dataset (IMFD) and their combination for global masked face detection (MaskedFace-Net). Realistic masked face datasets are proposed with a twofold objective: i) detecting people having their faces masked or not masked, ii) detecting faces having their masks correctly worn or incorrectly worn (e.g., at airport portals or in crowds). To the best of our knowledge, no large dataset of masked faces provides such a granularity of classification for mask wearing analysis. Moreover, this work globally presents the applied mask-to-face deformable model to permit the generation of other masked face images, notably with specific masks.
Our datasets of masked faces (137,016 images) are available at https://github.com/cabani/MaskedFace-Net. The dataset of face images Flickr-Faces-HQ (FFHQ), made publicly available online by NVIDIA Corporation, has been used for generating MaskedFace-Net.

Wearing face masks appears to be a solution for limiting the spread of COVID-19. In this context, efficient recognition systems are expected for checking that people's faces are masked in regulated areas. To perform this task, a large dataset of masked faces is necessary for training deep learning models to detect people wearing masks and those not wearing masks. Some large datasets of face images with virus-related protection masks are available in the literature, e.g., the MAsked FAces dataset (MAFA) (Ge et al., 2017), the Real-World Masked Face Dataset (RMFD) and a masked face recognition dataset (Wang et al., 2020) composed of the Masked Face Detection Dataset (MFDD), the Real-world Masked Face Recognition Dataset (RMFRD) and the Simulated Masked Face Recognition Dataset (SMFRD). Moreover, many people do not wear their masks correctly due to bad practices, bad behaviors or vulnerability of individuals (e.g., children, old people). Accordingly, several mask wearing campaigns intend to raise awareness of this problem and of good practices (Africa Centres for Disease Control and Prevention (Africa CDC), African Union, 2020; Bouteiller, 2020; Action Santé-Social Côte d'Ivoire, 2020; Colart, 2020). In Hammoudi et al. (2020), a mobile application "CheckYourMask" was designed to let people check whether their mask is correctly worn by taking a selfie, and the creation of a dataset with correctly/incorrectly worn mask classes was suggested. In makeml-mask (2020), a dataset of 853 images with individual or multiple masked faces was proposed for creating detection models that take improperly masked faces into account.
In this paper, we propose a relatively large dataset of 137,016 masked face images that is divided into two masked face categories: correctly worn and incorrectly worn (see samples in Fig. 1a and Fig. 1b, respectively). Specifically, this work proposes three types of masked face detection datasets; namely, the Correctly Masked Face Dataset (CMFD), the Incorrectly Masked Face Dataset (IMFD) and their combination (MaskedFace-Net) for masked face detection (see dataset structure in Fig. 3a). Realistic masked face datasets are proposed with a twofold objective: i) to detect people having their faces masked or not masked, ii) to detect faces having their masks correctly worn or incorrectly worn (e.g., at airport portals or in crowds). To the best of our knowledge, no large dataset of masked faces provides such a granularity of classification for mask wearing analysis. Moreover, this work globally presents the applied mask-to-face deformable model to permit the generation of other masked face images, notably with specific masks.

The dataset of face images Flickr-Faces-HQ (FFHQ) has been selected as a base for creating an enhanced dataset, MaskedFace-Net, composed of correctly and incorrectly masked face images. Indeed, FFHQ contains 70,000 high-quality images of human faces in PNG file format at 1024 × 1024 resolution and is publicly available. The FFHQ dataset offers a lot of variety in terms of age, ethnicity, viewpoint, lighting, and image background. It was originally created as a benchmark for generative adversarial networks (GAN) (Karras et al., 2018).

The data-flow diagram in Fig. 2 shows the major stages of the image editing approach applied for generating the dataset of correctly/incorrectly masked face images "MaskedFace-Net". In particular, the MaskedFace-Net dataset has been created by defining a mask-to-face deformable model. A pseudo-code of the global principle for generating MaskedFace-Net is shown in Fig.
3b with respect to the outputs depicted in Fig. 3a. For each face image of FFHQ (e.g., Fig. 4a), Haar feature-based cascade classifiers are applied for detecting a region of interest (detection of the face rectangle). Then, a specific key point detector "shape predictor 68 face landmarks" (model derived from Sagonas et al. (2016)) is applied to the detected region of interest and automatically detects 68 landmarks of the facial structure (see sample in -19). For the mask, 12 key points have been manually annotated to delineate the mask area (polygonal area). At this stage, four types of mask-to-face mapping have been defined with respect to the targeted cases (see Fig. 3a): mask covering the nose, mouth and chin (i.e., mask correctly worn); mask only covering the nose and mouth; mask only covering the mouth and chin; and mask under the mouth (i.e., three cases of mask incorrectly worn). For each type of mask-to-face mapping (CMFD, IMFD1, IMFD2 or IMFD3), a subset of 12 facial key points is retained from the 68 automatically detected landmarks and then matched to the 12 mask key points. In this way, the mask can fit specific areas of the face for each targeted case. Hence, a mask-to-face deformable model has been created to generate MaskedFace-Net. Moreover, in each targeted case, up to 2 of the 12 mask key points can have their locations randomly displaced within a limited perimeter. In particular, this tolerance makes it possible to vary the height of the mask on the face and thus brings more realism to the generated dataset. Therefore, MaskedFace-Net also contains a high variety of mask positions. Finally, a homography transformation, which relies on the defined point-to-point correspondence of landmarks between the mask image and the face image, is applied for mapping mask pixels over the targeted facial areas. Instances of produced face landmarks and corresponding mask-to-face mapping are displayed for each type in Fig. 5a, Fig. 5b, Fig.
5c, Fig. 5d and Fig. 5e, Fig. 5f, Fig. 5g, Fig. 5h, respectively. For information, Fig. 3b illustrates a nominal scenario of face-related detection.

Performance evaluation of the applied face-related detection is shown in Table 1. In particular, some faces of FFHQ have not been processed (177 images) since face occlusions (e.g., arms, hands) caused the face detection to fail (i.e., no detected face rectangle). After the face detection, the MaskedFace-Net dataset contained 139,646 images (that is, 2 × 69,823, which is consistent with one correctly and one incorrectly masked image for each of the 70,000 − 177 detected faces). Moreover, manual filtering has been performed to delete detected face images whose mask was incorrectly mapped because of failed landmark detection. Indeed, erroneous landmark detection occurs when the visibility of the facial contours is limited (e.g., for profile views of detected faces). Nevertheless, the face-related image detection and editing applied over the FFHQ dataset have been highly effective since more than 95% of FFHQ images were exploited for generating the classes of masked faces. Hence, the resulting MaskedFace-Net dataset contains 137,016 masked face images. The proposed MaskedFace-Net dataset is composed of 49% correctly masked faces (67,193 images) and 51% incorrectly masked faces (69,823 images). Of this latter set, approximately 80% are faces with only the mouth and chin masked, 10% with only the nose and mouth masked and 10% with only the chin masked.

We emphasize that a raw mask-to-face mapping has been applied to the FFHQ dataset. In particular, no images have been filtered according to specific parameters (e.g., age). However, the file naming of MaskedFace-Net includes the one given by the FFHQ dataset. Hence, a correspondence between FFHQ and MaskedFace-Net can be established for related filtering. It is worth mentioning that the minimum age for mask wearing depends on the applicable laws in the countries concerned.
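The correspondence between FFHQ and MaskedFace-Net file names mentioned above could be exploited as in the following minimal Python sketch. It assumes, hypothetically, that each MaskedFace-Net file name starts with the numeric FFHQ identifier followed by a suffix; file names such as `00012_Mask.jpg` are illustrative examples that should be checked against the released files. The function names `ffhq_id_from_masked_name` and `filter_by_ffhq_ids` are introduced here for illustration only.

```python
import re
from pathlib import Path
from typing import Iterable, List, Optional

# Assumption: the FFHQ identifier is the leading run of digits in the file name.
_LEADING_DIGITS = re.compile(r"^(\d+)")


def ffhq_id_from_masked_name(filename: str) -> Optional[str]:
    """Return the leading numeric identifier of a file name, or None if absent."""
    match = _LEADING_DIGITS.match(Path(filename).name)
    return match.group(1) if match else None


def filter_by_ffhq_ids(masked_files: Iterable[str],
                       allowed_ids: Iterable[str]) -> List[str]:
    """Keep only the masked images whose identifier is in allowed_ids."""
    allowed = set(allowed_ids)
    return [f for f in masked_files if ffhq_id_from_masked_name(f) in allowed]


if __name__ == "__main__":
    # Hypothetical file names, used only to demonstrate the matching.
    files = ["00001_Mask.jpg", "00002_Mask_Mouth_Chin.jpg", "00003_Mask.jpg"]
    print(filter_by_ffhq_ids(files, {"00002", "00003"}))
```

Given a set of FFHQ identifiers selected by some external criterion (e.g., an age annotation obtained separately), `filter_by_ffhq_ids` keeps only the matching masked images.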
As an example of such age rules, mask wearing is compulsory from 6 years old in Spain, 11 years old in France and 12 years old in Belgium under certain conditions (RTBF, 2020). Between 2 and 11 years old, opinions differ (Daclin, 2020). Since FFHQ contains face images of all ages, so does MaskedFace-Net. Such datasets could then be exploited for detecting children in crowds who wear a mask while below the recommended age limit. Recently, our MaskedFace-Net dataset has been featured online in the COVID-19 section of "VisualData.io", a major source of computer vision datasets.

An image editing approach has been presented for generating masked face images with realistic image synthesis. A large dataset of 137,016 quality masked face images has been produced and made available online. MaskedFace-Net can be seen as a benchmark dataset for creating machine learning models related to mask wearing analysis; notably, detecting the presence or absence of a mask on detected face images, and the correct or incorrect wearing of the mask on detected masked faces. MaskedFace-Net can then be used for enhancing vision-based monitoring systems for several applications such as checking compliance with laws related to mask wearing or generating crowd statistics. Moreover, the method used for the generation of MaskedFace-Net has been described to permit the generation of masked face images using other types of mask.

MaskedFace-Net has been generated for studying behaviors and contamination processes related to COVID-19. In particular, MaskedFace-Net has been generated for limiting the spread of COVID-19 by supporting health education. MaskedFace-Net may also be a base for studying behaviors and contamination phenomena in the case of a newly appearing virus with a similar transmission type. In no case can the contributors of this work be held responsible for any incident when using the MaskedFace-Net dataset or masks.
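As an illustration of the mask-to-face mapping step described earlier (point correspondences, random displacement of up to 2 mask key points, then a homography), the following Python sketch estimates a homography from matched key points with a plain least-squares solve. It is not the authors' implementation: the function names, the displacement radius and the uniform displacement in a small square window are assumptions made for this example. In practice, library routines such as OpenCV's `cv2.findHomography` and `cv2.warpPerspective` would typically be used to estimate the transform and warp the mask pixels.

```python
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def estimate_homography(src: Sequence[Point],
                        dst: Sequence[Point]) -> List[List[float]]:
    """Least-squares 3x3 homography (last entry fixed to 1) mapping src to dst."""
    if len(src) != len(dst) or len(src) < 4:
        raise ValueError("need at least 4 matched points")
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    # Normal equations (A^T A) h = A^T b for the 8 unknown entries.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(8)] for i in range(8)]
    atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(8)]
    h = solve_linear(ata, atb)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: List[List[float]], p: Point) -> Point:
    """Map one point through the homography (with perspective division)."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def jitter_points(points: Sequence[Point], max_moved: int = 2,
                  radius: float = 3.0,
                  rng: Optional[random.Random] = None) -> List[Point]:
    """Randomly displace at most max_moved points by up to radius per axis."""
    rng = rng or random.Random()
    moved = list(points)
    count = rng.randint(0, min(max_moved, len(moved)))
    for i in rng.sample(range(len(moved)), count):
        x, y = moved[i]
        moved[i] = (x + rng.uniform(-radius, radius),
                    y + rng.uniform(-radius, radius))
    return moved
```

With 12 matched key points, `estimate_homography(jitter_points(mask_points), face_points)` would give the transform used to place each mask pixel on the face; `mask_points` and `face_points` are placeholder names for the annotated mask key points and the retained facial landmarks.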
The authors received no specific funding for this study.

Comment bien mettre son masque
How to wear a face mask correctly
Coronavirus. Comment bien porter son masque ? Les conseils d'une infirmière de la métropole de Lille
Le port du masque: Les gestes à faire et ne pas faire
Coronavirus : Où et à quel âge les enfants doivent-ils porter le masque
Detecting masked faces in the wild with LLE-CNNs
Validating the correct wearing of protection mask by taking a selfie: Design of a mobile application "CheckYourMask" to limit the spread of COVID-19
A style-based generator architecture for generative adversarial networks
Le masque obligatoire dès 6 ans en Espagne, 11 ans en France, 12 ans en Belgique : Pourquoi tant de différences
300 faces in-the-wild challenge: Database and results
Masked face recognition dataset and application

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.