key: cord-0802330-fxy1sego
authors: Lim, Yin Cheng; Abdul Shakor, Ameerah Su'ad; Shaharudin, Rafiza
title: Reliability and Accuracy of 2D Photogrammetry: A Comparison With Direct Measurement
date: 2022-01-25
journal: Front Public Health
DOI: 10.3389/fpubh.2021.813058
sha: ff4d1c03e0044a69f9f6f1354d3597c46e8819d7
doc_id: 802330
cord_uid: fxy1sego

OBJECTIVE: Facial anthropometric data is important for the design of respirators. Two-dimensional (2D) photogrammetry has replaced direct anthropometric method, but the reliability and accuracy of 2D photogrammetry has not been quantified. This study aimed to assess inter-rater reliability of 2D photogrammetry and to examine the reliability and accuracy of 2D photogrammetry with direct measurement. DESIGN: A cross-sectional study. SETTING: Malaysia. PARTICIPANTS: A subset of 96 participants aged 18 and above. PRIMARY AND SECONDARY OUTCOMES: Ten facial dimensions were measured using direct measurement and 2D photogrammetry. An assessment of inter-rater reliability was performed using intra-class correlation (ICC) of the 2D images. In addition, ICC and Bland-Altman analyses were used to assess the reliability and agreement of 2D photogrammetry with direct measurement. RESULTS: Except for head breadth and bigonial breadth, which were also found to have low inter-rater reliability, there was no significant difference in the inter-rater mean value of the 2D photogrammetry. The mean measurements derived from direct measurement and 2D photogrammetry were mostly similar. However, statistical differences were noted for two facial dimensions, i.e., bizygomatic breadth and bigonial breadth, and clinically the magnitude of difference was also significant. There were no statistical differences in respect to the remaining eight facial dimensions, where the smallest mean difference was 0.3 mm and biggest mean difference was 1.0 mm.
The ICC showed head breadth had poor reliability, whilst Bland-Altman analyses showed seven out of 10 facial dimensions using 2D photogrammetry were accurate, as compared to direct measurement. CONCLUSION: Only certain facial measurements can be reliably and accurately measured using 2D photogrammetry, thus it is important to conduct a reliability and validation study before the use of any measurement method in anthropometric studies. The results of this study also suggest that 2D photogrammetry can be used to supplement direct measurement for certain facial dimensions.

Craniometry is a specific component of anthropometry that focuses on the measurement of the anatomical size of the head and face of living subjects. It is widely applied in orthodontic and reconstructive surgery, forensics, and the design of helmets, masks, eyeglasses and respirators (1). Numerous local anthropometric studies have been undertaken to achieve various objectives for the different needs of a range of target groups such as preschool children (2), young adults (3) and older persons (4), and such studies include some that have focused on the facial anthropometry of the Malaysian population (5-10). Many of these studies highlight the importance of incorporating ergonomic principles into design to ensure end-products fit the body conditions and sizes of the target users.

Several methods can be used to measure facial soft tissues. These include manual anthropometry (11, 12) and two-dimensional (2D) (5, 10, 13-16) and three-dimensional (3D) (8, 17, 18) imaging techniques. Manual anthropometry takes direct measurements from the subject using sliding and spreading callipers, flexible measuring tapes and protractors. The main advantages of this method are that it is non-invasive and low cost.
However, despite being considered the gold standard for facial measurement, it has some disadvantages; for example, it is time consuming and it depends on the participant's compliance for reliable results. Furthermore, it is investigator dependent, meaning that there is a possibility that the investigator may apply too much pressure on the equipment during measurement, which may distort soft tissue and introduce measurement errors. Nowadays, 2D and 3D measurement techniques are commonly used to measure human anthropometric characteristics. The 2D imaging technique provides a snapshot of an object, thus it requires the participant's cooperation during image acquisition. Despite evidence to show that the 3D imaging technique is more accurate (19, 20) , the 2D option is still preferred because it is cheap, non-invasive, less time consuming and can be conducted on the ground, as in population surveys. A number of studies have compared the performance of different anthropometry methods (19, (21) (22) (23) (24) (25) . For example, one study that investigated the difference in human skull measurements by comparing conventional cephalometric radiographs against 3D measurements on 3D models found that measurements of the same skull can differ significantly (21) . Likewise, another study also noted significant differences in facial dimensions when using 2D and 3D imaging techniques, and concluded that the two facial anthropometry methods cannot be equivalently used (22) . Conversely, other studies that compared 3D with 2D (19, 23, 24) and 3D with direct measurement (19, 25, 26) found that the methods produced comparable results in terms of identifying facial soft-tissue landmarks. 
Although 2D images have been used widely for facial tissue analysis (5, 10), evidence demonstrating the accuracy of 2D photogrammetry in measuring facial dimensions is lacking, and that which does exist shows contradictory results. While some studies have shown that different anthropometry methods can be used interchangeably (19, 24, 27), other studies have revealed otherwise (21, 22). Moreover, studies that have examined the reliability and agreement of 2D photogrammetry in measuring facial dimensions, as compared to the gold standard manual method, are still limited (19, 20, 28). Thus, this study aimed to determine the inter-rater reliability of 2D photogrammetry in measuring facial anthropometry, as well as the reliability and agreement of 2D photogrammetry, as compared to direct measurement. This validation study will inform a future nationwide population study intended to develop our own bivariate and Principal Component Analysis (PCA) panels, which are critical for the development of respirators.

Abbreviations: 95%CI, 95% Confidence Interval; 2D, 2-Dimensional; 3D, 3-Dimensional; ICC, Intraclass correlation; NHMS, National Health and Morbidity Survey; NIOSH, National Institute for Occupational Safety and Health; PCA, Principal Component Analysis; SD, Standard deviation; SE, Standard error.

A cross-sectional study was conducted among a subset of participants aged 18 and above who were involved in the National Health and Morbidity Survey (NHMS) Malaysia 2020. NHMS 2020 was a national population-based survey aimed at determining the seroprevalence of COVID-19, hepatitis B and hepatitis C in Malaysia. Representative samples were selected randomly from 2000 living quarters in selected Enumeration Blocks (EBs) using two-stage stratified sampling from the Department of Statistics, Malaysia (29). All household members who consented and met the inclusion criteria were recruited into the survey.
For this validation study, a subset of 96 respondents was conveniently selected from one district (Banting, Selangor) in West Malaysia and one district (Tawau, Sabah) from East Malaysia. Participants with a history of previous facial surgery, dental or facial deformity, and those with a beard or moustache were excluded from participating. The 10 facial dimensions listed in Table 1 were measured because they are critical for the development of respirators (30) . A measurer's manual was created prior to the field investigation. The measurer was trained until the measurement errors were less than what was allowed. Usually, the allowable error margin was set at 2 mm for all the dimensions measured (28, 31) . Prior to image acquisition, direct measurements were taken. The 10 selected morphological points were located by inspection and/or palpation in accordance with the 1988 Anthropometric Survey of the US Army Personnel Project (31) . Spreading callipers were used to measure head breadth, zygomatic breadth and bigonial breadth, whereas sliding callipers were used for the remaining seven facial dimensions. During the measurement process, the investigators endeavoured to ensure that the participants were relaxed and seated with a natural head position and relaxed lips. The participants' images were captured using a 20.0-megapixel digital camera (Canon IXUS 190, Tokyo, Japan) positioned on a tripod (Manfrotto MKCOMPACTLT-BK, Cassola, Italy) at a fixed distance of 1.0 metre. The tripod maintained the stability and correct height of the camera according to each participant's height. Participants were requested to wear a surgical cap to remove hair strands from their face and ears when needed. The frontal bony landmarks on their face were labelled with stickers. Before image capture, the participants were asked to look straight ahead holding their head in a neutral position without flexing or extending the neck, and to not smile or frown. 
Also, the head was kept in a posture so that the optical axis of the camera lens would pass through the Frankfurt plane of the head (32). For each participant, one anterior and one lateral photo were taken. A blue screen was used as the background in order to create sufficient contrast with the colour of the skin. The camera height was adjusted based on the height of the subject's ear from the floor. For calibration purposes, a metric-scale ruler was placed above the forehead of the subject for the image taken from the anterior view (Supplementary e-Figure 1). For the image taken from the lateral view, the ruler was placed perpendicular to the nose of the subject (Supplementary e-Figure 2). Each image was checked immediately after it was obtained to ensure the absence of acquisition errors such as imaging artefacts, blurring, absence of surface data, poor orientation, closed eyes, and lack of neutral facial expression. Images with any of these errors were discarded, and new images were obtained to ensure that they met the established requirements. All images were captured in JPEG format and were transferred to a computer after each day of shooting.

The anthropometric dimensions were calculated using the software package Digimizer version 5.4.4. This software is flexible and simple to use for analysing images. Its capabilities include setting contrast and brightness, changing background images, converting images to grayscale, measuring angles, determining the centre of a segment and reducing image noise (33). For the inter-rater reliability assessment, the measurements made by the first observer were compared with those carried out by the second observer on the same photo images after an interval of at least 3 weeks, with no landmarks saved after the first measurement.

The analyses were conducted using SPSS version 26 and MedCalc version 19.8.
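As an illustration of the ruler-based calibration described for the 2D images, the following is a minimal sketch of how a pixel distance between two landmarks could be converted to millimetres. It is not the Digimizer procedure used in the study. The function names `mm_per_pixel` and `landmark_distance_mm`, and all pixel coordinates, are hypothetical and chosen only for illustration.

```python
import math

def mm_per_pixel(ruler_start, ruler_end, ruler_length_mm):
    """Scale factor from two pixel points that mark a known ruler length."""
    pixel_length = math.dist(ruler_start, ruler_end)
    if pixel_length == 0:
        raise ValueError("ruler points must be distinct")
    return ruler_length_mm / pixel_length

def landmark_distance_mm(point_a, point_b, scale):
    """Distance between two landmarks (pixel coordinates) in millimetres."""
    return math.dist(point_a, point_b) * scale

# Made-up coordinates: a 100 mm ruler segment spans 500 pixels in the image.
scale = mm_per_pixel((100, 50), (600, 50), 100.0)

# Two made-up landmarks that lie 180 pixels apart horizontally.
breadth_mm = landmark_distance_mm((300, 400), (480, 400), scale)
print(f"scale = {scale:.3f} mm/pixel, breadth = {breadth_mm:.1f} mm")
```

A single scale factor of this kind is strictly valid only for landmarks lying close to the plane of the ruler, which is one reason magnification and parallax can introduce error in 2D measurements.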
Normal distribution of the data was evaluated using the Shapiro-Wilk test. The results showed that none of the variables violated the normality assumption. To evaluate the inter-rater measurement reproducibility of 2D images, a different observer took measurements using the same method as the first observer, and the results were evaluated using the paired sample t-test and the intra-class correlation (ICC) coefficient. Next, the paired sample t-test and ICC were also used to determine the mean differences and reliability between direct measurement and the average of the 2D photogrammetry measurements. The ICC provides information on the ability to differentiate variation between participants from variation due to measurement. The ICC was defined as the ratio of variance among participants (participant variability) over the total variance (participant variability, observer variability and measurement variability). The ICC value ranges between 0 (no reliability) and 1 (perfect reliability). In line with prior research, in this study an ICC < 0.4 indicates poor reliability, 0.4-0.75 indicates moderate reliability, and ≥ 0.75 indicates excellent reliability (34).

Table 2 | Summary of anthropometric statistics between observers of the 2D photogrammetry and inter-rater reliability coefficient of 2D photogrammetry. Columns: Observer A Mean ± SD; Observer B Mean ± SD; Mean differences ± SD; 95% CI of mean differences.

The degree of agreement between the two methods was further evaluated using Bland-Altman analysis, where the difference between the measurements was plotted against the average of the two measurements. The plot has three horizontal reference lines superimposed on a scatterplot: one line represents the average difference between the measurements, and the upper and lower lines mark two standard deviations (±2 SD) from the mean difference. In a Bland-Altman analysis, two criteria need to be met to establish that the two measurement methods are comparable.
First, the mean differences should be small and close to 0. Second, the SD of this difference should be small (35). However, there are no guidelines on how narrow the limits of agreement need to be before the two methods can be considered interchangeable. For all the statistical analyses, the methods were considered to be in good agreement and interchangeable at an arbitrary value of 2 mm between two observers and two methods (28). The statistical significance level was set as p < 0.005 for all statistical analyses.

The study participants were not involved in the development of this study. The results of the study were not shared with the participants.

A total of 96 participants took part in this study, of whom 51 (53.1%) were female. The mean age of the participants was 43.3 ± 16.9 years and they were predominantly of Malay ethnicity (60, 62.5%).

A reproducibility assessment was conducted to determine the mean differences and the level of reliability of 2D photogrammetry between two observers. The mean values of the two observers for the 10 facial dimensions did not differ significantly, except for bigonial breadth (8.1 mm) and head breadth (11.2 mm) (Table 2).

The mean differences between the direct and the 2D photogrammetry measurements were within 2.0 mm, except for bizygomatic breadth and bigonial breadth (Table 3). The largest mean difference was observed in bigonial breadth (9.3 mm), followed by bizygomatic breadth (3.3 mm). The smallest mean difference between the two methods was found in nose protrusion (0.4 mm) and nose breadth (0.4 mm).

The reliability between 2D photogrammetry and direct measurement ranged from ICC = 0.81 (nose protrusion) to 0.99 (subnasal sellion length) for all measured dimensions except head breadth [ICC: 0.36, 95% confidence interval (CI): 0.05-0.58] (Table 4).
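For readers who wish to reproduce this kind of analysis outside SPSS or MedCalc, the two agreement statistics described in the statistical analysis can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: the article does not state which ICC model was used, so the code assumes a two-way random-effects, absolute-agreement, single-measure ICC, often written ICC(2,1), which matches the variance-ratio definition given in the text. The function names and all measurement values are hypothetical.

```python
from statistics import mean, stdev

def icc_absolute_agreement(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `ratings` holds one row per participant; each row holds that
    participant's k measurements (k observers or k methods).
    The choice of this ICC form is an assumption, not taken from the study.
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares: participants, raters/methods, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

def bland_altman(method_a, method_b, k_sd=2.0):
    """Return (bias, lower limit, upper limit) using bias ± k_sd * SD."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the paired differences
    return bias, bias - k_sd * sd, bias + k_sd * sd

# Made-up measurements in mm for five participants (not study data).
direct = [140.0, 138.5, 142.0, 136.0, 139.5]
photo = [139.0, 138.0, 141.5, 136.5, 138.5]

icc = icc_absolute_agreement([list(pair) for pair in zip(direct, photo)])
bias, lower, upper = bland_altman(direct, photo)
print(f"ICC = {icc:.2f}; bias = {bias:.2f} mm; limits = {lower:.2f} to {upper:.2f} mm")
```

The default of 2 SD for the limits follows the description in the text; 1.96 SD is also commonly used. Because this ICC form measures absolute agreement, a constant offset between two methods lowers the ICC even when the two sets of values are perfectly correlated.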
The highest ICC score was noted for subnasal sellion length (ICC: 0.99, 95% CI: 0.98-0.99), followed by menton sellion length (ICC: 0.98, 95% CI: 0.97-0.99) and minimum frontal breadth (ICC: 0.98, 95% CI: 0.97-0.99).

Supplementary e-Figure 3a to Supplementary e-Figure 3j show the level of agreement between direct measurement and 2D photogrammetry for the 10 facial dimensions according to Bland-Altman plots (Supplementary e-Figure 3). The Y axis displays the difference between the two methods, whereas the X axis shows the mean of the two methods. Ninety-five percentage

Table columns: Direct measurement Mean ± SD; 2D photogrammetry Mean ± SD; Mean differences ± SD; 95% CI of mean differences.

This study evaluated the reliability and accuracy of 2D photogrammetry, as compared to direct measurement, which has been accepted as the gold standard. Our study showed that three facial dimensions, i.e., bigonial breadth, bizygomatic breadth, and head breadth, cannot be measured reliably and accurately using the 2D photogrammetry method. This was because the inter-rater reliability of 2D photogrammetry, as well as the reliability between the two measurement methods, was poor for bigonial breadth and head breadth. There was also a significant difference in the mean values between the two methods for bizygomatic breadth and bigonial breadth. Thus, only seven out of 10 facial dimensions can be measured reliably and accurately using 2D photogrammetry. The main reason for inaccurate head breadth may be the demography of respondents in this study. In Malaysia, the predominant religion is Islam and most female Muslims wear the hijab as a demonstration of their faith following the requirements of their religion. However, even without the hijab, the head breadth cannot be measured accurately because of varying hair thickness.
Because of the limitations of 2D photogrammetry, it is also quite impossible to view zygomatic and gonial landmarks from the anterior view in 2D photogrammetry, even after marking the bony landmarks with stickers. The remaining seven dimensions showed no difference in terms of mean value and had a high level of agreement according to the ICC analysis. The 2D photogrammetric method has been used widely by international (13) (14) (15) (16) and local studies (5, 10) . However, studies that compare 2D photogrammetry with direct measurement are scarce and have some limitations (19, 20, 28) . There is also a lack of consensus among the existing studies. One study showed that 2D photogrammetry is not as accurate as the direct and 3D measurement methods for certain facial dimensions (19) , while the two other studies showed that 2D photogrammetry is comparable to direct measurement (20, 28) . Moreover, previous studies have mainly focused on oral maxillofacial dimensions, in contrast to our study, and none of the studies assessed the reliability and agreement of 2D photogrammetry with direct measurement simultaneously (20, 28) . Furthermore, appropriate data analysis should be employed to confirm that tested and validated tools are both reliable and accurate. Hence, the validation of the 10 facial dimensions considered in our study will be an important step in our future research, which aims to produce an anthropometric database of Malaysian head and facial measurements. The same 10 facial dimensions were used by the United States National Institute for Occupational Safety and Health in 2003 on 3997 civilian workers and by the Chinese government in 2008 on 3000 Chinese civilian workers to develop respirator fit test panels (30, 36) . Respirator fit test panels provide an objective measurement for selecting representative human test samples based on their facial dimensions for use in research, testing, certification and most importantly for respirator development. 
Likewise, the 10 critical facial dimensions measured in our study can be used to develop two respirator fit test panels, i.e., a bivariate panel using face length and face width and a PCA panel using all 10 facial dimensions. The bivariate panel is simpler to use than the PCA panel. However, the inclusion of the eight additional facial measurements allows the PCA panel to apply better criteria to exclude the use of extreme face sizes. These 10 dimensions have been found to be associated with respirator fit and leakage and can predict the remaining face dimensions well (1) . Moreover, the study in the United States showed that respirators designed to fit PCA panel are expected to accommodate more than 95% of current US civilian workers (30) . We acknowledge that our study has some limitations. First, direct measurements were only measured by one observer, thus reliability of this method cannot be calculated, as compared to the 2D photogrammetry measurements. However, we believe that the measurement errors in the direct measurement procedure were minimal in view of training that was conducted prior to the validation part of the study. Even though the facial dimensions were not measured using the 3D photogrammetry method, which has been found to be more accurate, direct measurement or 2D photogrammetry are more feasible for a nationwide population survey, especially in low-and middle-income countries. However, the disadvantages of the 2D technique include measurement errors due to subjective analysis, magnification errors, parallax, variation in lighting, and variation in head orientation. On the other hand, the novelty of our work lies in the robust validation analysis that we undertook to validate the results generated by 2D photogrammetry against the gold standard of direct measurement. In addition to comparing the mean values of these two methods, we also used ICC and Bland-Altman Limit of Agreement analysis. 
The Bland-Altman Limit of Agreement and the ICC are the most popular methods to investigate statistical agreement and to assess the reliability of medical instruments, respectively (37). Agreement and reliability parameters are equally important in determining the quality of the applied method, and these two parameters have not been assessed together in previous validation studies (19, 20). It is important to note that a method with good reliability will not be useful if it is not in good agreement with the reference method, and vice versa. The other strength of our study lies in the reporting of the CI value when using the limits of agreement approach, as this means that the data can be generalised to a larger population. Moreover, the advantage of using the Bland-Altman approach is that it can reveal both systematic errors (bias) and random errors (limit of agreement) (38).

This study reveals that only seven out of 10 facial measurements can be measured reliably and accurately using 2D photogrammetry, thus it is important that a validation and reliability study is conducted before the use of any measurement method in anthropometric studies. The results of this study also suggest that, given its practical benefits of being inexpensive, non-invasive, operator dependent and less time consuming, 2D photogrammetry can be used to supplement direct measurement for facial dimensions. Our future study, which will take place during the COVID-19 pandemic, will use a combination of direct measurement and 2D photogrammetry to create an anthropometric database of Malaysian head and facial measurements from over 3,000 participants. The use of 2D photogrammetry can also help to reduce exposure between observers and participants. The findings also indicate the important role that 2D photogrammetry can play in assessing certain facial morphologies in countries that have limited 3D scanner resources.
Lastly, future studies that compare and validate 2D photogrammetry against direct measurement for other facial dimensions are also warranted, so that more dimensions can be measured accurately and reliably in this way.

The original contributions presented in the study are included in the article/Supplementary Materials; further inquiries can be directed to the corresponding author.

The studies involving human participants were reviewed and approved by the Medical Research and Ethics Committee (NMRR-20-1217-55489). The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

The study conception was by YL and RS. YL and RS designed the study. YL collected the data. YL, AA, and RS conducted the statistical analysis and interpreted the results. YL and AA drafted the manuscript. All authors have read and approved the final version of the submitted manuscript.

The funding for this study (NMRR-20-1217-55489) and the funding for publication were from the National Institutes of Health, Ministry of Health, Malaysia.

The effect of subject characteristics and respirator features on respirator fit
Foot anthropometry for shoe design among preschool children in Malaysia
Anthropometric study among adults of different ethnicity in Malaysia
Designing homes for the elderly based on the anthropometry of older Malaysians
Photogrammetric analysis of nasolabial angle and mentolabial angle norm in Malaysian adults
Craniofacial anthropometric norms of Malays
Craniofacial anthropometric norms of Malaysian Indians
Mohd Razi R. Anthropometric study of three-dimensional facial morphology in Malay adults
Nasofacial anthropometric study among university students of three races in Malaysia
Photogrammetric analysis of palpebral fissure dimensions and its position in Malaysian South Indian ethnic adults by gender
Head-and-face anthropometric survey of Chinese workers
Facial anthropometric differences among gender, ethnicity, and age groups
Analysis of the soft tissue facial profile of Croatians using of linear measurements
Angular photogrammetric analysis of the soft tissue facial profile of Turkish adults
Angular photogrammetric analysis of the soft tissue facial profile
Angular photogrammetric analysis and evaluation of facial esthetics of young Ivorians with normal dental occlusion
Three dimensional evaluation of facial asymmetry after mandibular reconstruction: validation of a new method using stereophotogrammetry
Three-dimensional imaging in orthodontics
Comparison of three methods of facial measurement
A comparison study of different facial soft tissue analysis methods
Comparison between 2D and 3D cephalometry on CBCT scans of human skulls
Comparison between 2D and 3D methods of quantifying facial morphology
Accuracy and reliability of 2-dimensional photography versus 3-dimensional soft tissue imaging
Comparison of 3D scanning versus 2D photography for the identification of facial soft-tissue landmarks
Comparison of a novel surface laser scanning anthropometric technique to traditional methods for facial parameter measurements
Anthropometric precision and accuracy of digital three-dimensional photogrammetry: comparing the Genex and 3dMD imaging systems with one another and with direct anthropometry
Precise evaluation of anthropometric 2D software processing of hand in comparison with direct method
Accuracy and reliability of 3D stereophotogrammetry: a comparison to direct anthropometry and 2D photogrammetry
NHMS 2020 Malaysia: Ministry of Health
New respirator fit test panels representing the current U.S. civilian work force
Measurer's handbook: US Army anthropometric survey
Simple instrument to define the Frankfurt horizontal plane for soft-tissue measurements of the face
Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research
Statistical methods for assessing agreement between two methods of clinical measurement
New respirator fit test panels representing the current Chinese civilian workers
Statistical methods used to test for agreement of medical instruments measuring continuous variables in method comparison studies: a systematic review
Method comparison studies in medicine

The authors would like to thank the Director General of Health, Malaysia, for his permission to publish this paper. The authors are most grateful to all the participants of this study.

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpubh.

Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's Note: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.