key: cord-0859724-o8xw2tcj
authors: Mumma, Joel M.; Jordan, Ellen; Ayeni, Oluwateniola; Kaufman, Noah; Wheatley, Marisa J.; Grindle, Amanda; Morgan, Jill
title: Development and validation of the discomfort of cloth Masks-12 (DCM-12) scale
date: 2021-10-20
journal: Appl Ergon
DOI: 10.1016/j.apergo.2021.103616
sha: 5c9df712aee39a028304ad64c94bf313bcb8a338
doc_id: 859724
cord_uid: o8xw2tcj

During the COVID-19 pandemic, the use of face masks by the public has helped to slow the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the community. Cloth masks have been recommended because of their effectiveness, availability, and reusability. Like other types of face masks, however, user discomfort while wearing cloth masks is thought to engender behaviors that limit the effectiveness of cloth masks as source control (e.g., adjusting or removing one's mask temporarily while in public). To design cloth masks that are more tolerable, a measurement instrument for assessing subjective user discomfort is needed. Across two studies, we identified and confirmed a two-dimensional factor structure underlying the discomfort of cloth masks – discomfort related to the breathability and discomfort related to the tightness of the mask against the face and head. Additionally, we provide replicable evidence that both factor-subscales predict the self-reported frequencies of problematic mask-wearing behaviors.

During the COVID-19 pandemic, the use of face masks by the public has been an effective, non-pharmaceutical intervention for slowing the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the community (Czypionka et al., 2021; Howard et al., 2021). However, the discomfort of wearing face masks over long periods of time has limited compliance with recommendations to wear them (Bakhit et al., 2021; Czypionka et al., 2021). Although the public has used different types of face masks (e.g., surgical masks, N95s, or KN95s), organizations such as the U.S. Centers for Disease Control and Prevention (CDC, 2020) and the World Health Organization (WHO, 2020) have recommended the use of cloth masks for the public because of their effectiveness, availability, and reusability, particularly when the demand for disposable medical-grade masks outstrips supply (MacIntyre et al., 2015). Compared to other types of masks, however, the discomfort of cloth masks has received less empirical attention (Chughtai et al., 2016; MacIntyre et al., 2015), despite calls in the literature to minimize user discomfort (e.g., Bhattacharjee et al., 2020; Czypionka et al., 2021).

To improve the tolerability of cloth masks, it is necessary to measure user discomfort. Physical properties of cloth masks that affect user discomfort can be measured objectively. For example, the breathability of mask material(s) is typically operationalized as the pressure drop (e.g., in units of mm H₂O/cm²) between the two sides of the mask material (Kwong et al., 2021), with a lower pressure drop indicating greater breathability. Nonetheless, it is also important to measure discomfort subjectively, which can reveal how physical properties of masks manifest in user discomfort (Choi et al., 2020), particularly in real-world conditions (Meyer et al., 1997; Radonovich et al., 2019). Moreover, measuring discomfort subjectively can also reveal potential limitations of objectively defined standards (e.g., pressure drop). For example, users of tight-fitting respirators (e.g., N95s) in clinical settings frequently report breathing difficulty (Baig et al., 2010; Radonovich et al., 2009), despite wearing respirators that meet objective breathability standards.

Constructs, like discomfort, should be assessed with an instrument that produces psychometrically reliable and valid measurements, such as a multi-item scale(s) (DeVellis, 2017). While such instruments have been developed to measure the discomfort of tight-fitting respirators (N95s and elastomeric half-mask respirators; LaVela et al., 2017), and for face masks, broadly defined (Howard, 2020), there is no instrument that assesses the discomfort of cloth masks specifically. Instruments developed for tight-fitting respirators (e.g., the Respirator Comfort, Wearing Experience, and Function Instrument; LaVela et al., 2017) are inappropriate because, unlike cloth masks, tight-fitting respirators are designed to seal tightly against the user's face, which may cause discomfort to manifest in ways unique to these types of masks, such as leaving painful marks or indentations on the face or redness from the metal nose-bridge (LaVela et al., 2017). Other instruments that include measures of the discomfort of face masks, broadly defined (e.g., the Face Mask Perceptions Scale; Howard, 2020), appear to assess only a single dimension of discomfort, such as breathability (e.g., "Face masks disrupt my breathing," and "Face masks get too hot,"); users of cloth masks likely experience other dimensions of discomfort as well, which may manifest as physical irritation or pain (Guo et al., 2008; Roberge et al., 2012).

Otherwise, the discomfort of face masks has been assessed with a single, global item (Cheok et al., 2021; Luximon et al., 2016; Roberge et al., 2013; Scarano et al., 2020; Shenal et al., 2012; Smart et al., 2020), which is questionable for a construct as ambiguous as "discomfort," or with single items that each assess one dimension of discomfort (e.g., breathing discomfort or mask-tightness; Cheok et al., 2021; Goh et al., 2019; Luximon et al., 2016; Roberge et al., 2013; Smart et al., 2020). Compared to multi-item scale scores, ratings of a single item are more affected by random measurement error and by the specific wording of that item, both of which average out in multi-item scale scores (DeVellis, 2017). Additionally, multi-item scale scores can offer a more discriminating response scale than a single item with limited response options can, which enhances validity by allowing for stronger correlations with external variables (e.g., user behaviors; Sarstedt and Wilczynski, 2009).

Regarding validity, measures of the subjective discomfort of face masks have distinguished between different types of respirators (Radonovich et al., 2019; Shenal et al., 2012) and are associated with user compliance (Chughtai et al., 2016; Shenal et al., 2012). Beyond compliance, important (but less studied; Bakhit et al., 2021) criteria are behaviors that limit the effectiveness of cloth masks during use, such as adjusting one's mask, touching one's face, and removing one's mask temporarily (Kellerer et al., 2021). As others have observed, the frequency of these problematic behaviors may be reduced with masks that are less uncomfortable (Shiraly et al., 2020; Smart et al., 2020). To this end, the goal of the present studies was to develop an instrument that produces a reliable and valid measure(s) of the discomfort of cloth masks. In Study 1, we used exploratory factor analysis to reveal the factor-structure of the discomfort of cloth masks. In Study 2, we used confirmatory factor analysis to verify this factor-structure in an independent sample. In both studies, we assessed the criterion-related validity of our measures with the self-reported frequency of problematic mask-wearing behaviors, with which we expected our measures of discomfort to be positively related.

We defined "discomfort" as any unpleasant sensation that relates to the body, such as pain, irritation, or other sensations arising from physiological strain (e.g., breathing difficulty or heat stress), rather than the mind (e.g., anxiety or self-consciousness). We then identified studies that assessed the discomfort of face masks, comprising studies on tight-fitting respirators (e.g., N95s or elastomeric half-mask respirators) or loose-fitting masks (e.g., surgical masks or cloth masks). We reviewed the item(s) comprising each scale to identify categories of items that recurred across scales and that were relevant to cloth face masks.

Additionally, we identified other ways in which cloth masks can be uncomfortable that were not captured by these categories. Five categories of items emerged: 1) heat build-up (Baig et al., 2010; Guo et al., 2008; Howard, 2020; LaVela et al., 2017; Roberge et al., 2010; Roberge et al., 2012; Roberge et al., 2013; Smart et al., 2020) , 2) moisture build-up (Guo et al., 2008; LaVela et al., 2017; Roberge et al., 2010; Roberge et al., 2012) , 3) breathing difficulty (Choi et al., 2020; Goh et al., 2019; Guo et al., 2008; Howard, 2020; LaVela et al., 2017; Roberge et al., 2013; Smart et al., 2020) , 4) pain (LaVela et al., 2017; Roberge et al., 2012) , and 5) irritation (Baig et al., 2010; Guo et al., 2008; LaVela et al., 2017; Roberge et al., 2010; Roberge et al., 2012) . We assumed these categories could reflect related, but potentially distinct, dimensions of discomfort. Consequently, we generated multiple items for each category (from four to seven items; Fabrigar et al., 1999) , resulting in a total of 30 items. After evaluating the items for relevance, clarity, and thoroughness, we performed cognitive interviews with five individuals, who were asked to think-aloud while rating the items. We revised item-wording based on this feedback.

In April 2021, participants were recruited through Amazon Mechanical Turk (MTurk) and were required to have at least a 95% approval rate on MTurk (Keith et al., 2017). To determine eligibility, participants were asked if English was their primary language and to rate how often they wore a cloth mask (as a single mask or as the inner layer of two masks) when going out in public over the last three months (1 = "Never," 2 = "Sometimes," 3 = "About half the time," 4 = "Most of the time," and 5 = "Always"). We defined a cloth mask as any mask that is worn over the nose and mouth, made of fabric (e.g., cotton or polyester), attaches to your head (e.g., with ear loops or ties), and can be washed. We excluded participants who selected "Never" or indicated that English was not their primary language.

Eligible participants were then invited to complete a second survey; participants first answered questions about demographics and their use of cloth masks, including for how long they typically wear cloth masks as well as the frequency at which they engage in certain (problematic) mask-wearing behaviors when in public: adjusting one's mask, touching one's face, and removing one's mask temporarily. Participants rated the frequency of each behavior using 6 response options (Never = 1, Very Rarely = 2, Rarely = 3, Occasionally = 4, Frequently = 5, Very Frequently = 6). Participants then rated the items assessing discomfort in a random order (Bandalos, 2021) using 4 response options (1 = "Not at all," 2 = "A little bit," 3 = "Somewhat," 4 = "Very much so,"). We included three attention checks throughout the survey (Keith et al., 2017) , removing participants who failed any attention check. All research procedures were approved by the Emory University Institutional Review Board on April 6th, 2021 (Protocol Number: STUDY00001556).

We conducted a preliminary statistical evaluation to identify problematic items, comprising pairs of items with excessively strong correlations (r's ≥ 0.80; Gamst et al., 2015; Pett et al., 2003), which may generate a spurious factor(s) in exploratory factor analysis (EFA), or items with uniformly low pairwise correlations (r's < 0.30; Tabachnick and Fidell, 2001). After removing problematic items, we assessed the factorability of the correlation matrix using Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy.
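The screening and factorability checks described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the SPSS procedure used in the study; the function names (`screen_items`, `bartlett_sphericity`, `kmo`) and the toy correlation matrix are hypothetical.

```python
import numpy as np

def screen_items(R, high=0.80, low=0.30):
    """Flag redundant item pairs (|r| >= high) and items whose
    correlations with every other item are all below `low`."""
    p = R.shape[0]
    off = np.abs(R - np.eye(p))  # absolute off-diagonal correlations
    redundant = [(i, j) for i in range(p) for j in range(i + 1, p)
                 if off[i, j] >= high]
    weak = [i for i in range(p) if off[i].max() < low]
    return redundant, weak

def bartlett_sphericity(R, n):
    """Bartlett's chi-square statistic and degrees of freedom for
    testing whether R differs from an identity matrix."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure: squared correlations relative
    to squared correlations plus squared partial correlations."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)
    mask = ~np.eye(R.shape[0], dtype=bool)
    r2 = (R[mask] ** 2).sum()
    a2 = (partial[mask] ** 2).sum()
    return r2 / (r2 + a2)

# Hypothetical 3-item correlation matrix, for illustration only.
R_toy = np.array([[1.00, 0.85, 0.10],
                  [0.85, 1.00, 0.20],
                  [0.10, 0.20, 1.00]])
redundant, weak = screen_items(R_toy)
chi2, df = bartlett_sphericity(R_toy, n=235)
print(redundant, weak, df, round(kmo(R_toy), 2))
```

For 27 items, `bartlett_sphericity` returns 27 × 26 / 2 = 351 degrees of freedom, the value reported for Study 1.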

We then performed EFA using principal axis factoring. To determine the number of factors to retain, we conducted a Parallel Analysis (using permutations of the raw data; O'Connor, 2000), a scree test (using eigenvalues from the reduced correlation matrix; Fabrigar and Wegener, 2011; O'Connor, 2000), and considered the interpretability and parsimony of solutions comprising well-defined factors (i.e., factors reflected by at least three items; Costello and Osborne, 2005; Tabachnick and Fidell, 2001; Worthington and Whittaker, 2006). As we expected factors to be correlated (e.g., heat build-up and moisture build-up), we used an oblique rotation (Direct Quartimin) and used the factor loadings in the pattern matrix (Fabrigar and Wegener, 2011; Tabachnick and Fidell, 2001) to interpret the extracted factors.
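A permutation-based Parallel Analysis of this kind might look like the sketch below. It assumes that the reduced correlation matrix places squared multiple correlations on the diagonal and that each item's ratings are permuted independently; the function names and the simulated two-factor data are hypothetical, and the study itself used O'Connor's (2000) SPSS programs.

```python
import numpy as np

def reduced_eigenvalues(X):
    """Eigenvalues (descending) of the correlation matrix with squared
    multiple correlations substituted on the diagonal."""
    R = np.corrcoef(X, rowvar=False)
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    Rr = R.copy()
    np.fill_diagonal(Rr, smc)
    return np.sort(np.linalg.eigvalsh(Rr))[::-1]

def parallel_analysis(X, n_perm=200, percentile=95, seed=0):
    """Compare observed eigenvalues with the chosen percentile of
    eigenvalues from column-wise permutations of the raw data."""
    rng = np.random.default_rng(seed)
    observed = reduced_eigenvalues(X)
    null = np.empty((n_perm, X.shape[1]))
    for b in range(n_perm):
        # Permuting each column separately destroys inter-item
        # correlations while keeping each item's distribution.
        Xp = np.column_stack([rng.permutation(col) for col in X.T])
        null[b] = reduced_eigenvalues(Xp)
    threshold = np.percentile(null, percentile, axis=0)
    n_factors = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break
        n_factors += 1
    return observed, threshold, n_factors

# Simulated data with two uncorrelated factors (illustration only).
rng = np.random.default_rng(42)
factors = rng.normal(size=(300, 2))
loadings = np.zeros((8, 2))
loadings[:4, 0] = 0.7
loadings[4:, 1] = 0.7
X_demo = factors @ loadings.T + rng.normal(scale=np.sqrt(0.51), size=(300, 8))
observed, threshold, k = parallel_analysis(X_demo, n_perm=100)
print(k)
```

Factors are counted until the first observed eigenvalue that fails to exceed its permutation threshold, which is one common stopping rule rather than the only one.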

To better approximate simple structure (Worthington and Whittaker, 2006), we then removed items based on their contribution to the solution; candidates for removal were items with a low communality (i.e., <0.40; Costello and Osborne, 2005), items that failed to load substantially on any factor (i.e., <0.40; Howard, 2016), items with a high cross-loading (i.e., loading ≥0.32 on more than one factor; Costello and Osborne, 2005; Howard, 2016), or otherwise, items with a difference of less than 0.20 between their highest and lowest factor loadings (Howard, 2016; Worthington and Whittaker, 2006).
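The removal criteria above can be expressed as a small rule check. The sketch below is illustrative only: `flag_item` is a hypothetical helper, the loadings are invented, and in the study items were removed one at a time with EFA repeated after each removal rather than flagged in a single pass.

```python
def flag_item(loadings, communality, min_comm=0.40, min_load=0.40,
              cross=0.32, min_gap=0.20):
    """Return the reasons an item is a candidate for removal, given its
    pattern-matrix loadings on each factor and its communality."""
    abs_l = sorted((abs(x) for x in loadings), reverse=True)
    reasons = []
    if communality < min_comm:
        reasons.append("low communality")
    if abs_l[0] < min_load:
        reasons.append("no substantial loading")
    if sum(l >= cross for l in abs_l) > 1:
        reasons.append("cross-loading")
    if abs_l[0] - abs_l[-1] < min_gap:
        reasons.append("small loading gap")
    return reasons

# Hypothetical items: (loadings on Factor 1 and Factor 2, communality).
items = {
    "clean item": ([0.75, 0.05], 0.60),
    "cross-loader": ([0.45, 0.35], 0.50),
    "weak item": ([0.30, 0.02], 0.10),
}
for name, (loads, comm) in items.items():
    print(name, flag_item(loads, comm))
```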

To provide evidence for the criterion-related validity of the factor-subscales, we assessed the extent to which factor-subscale scores were related to the self-reported frequency of three problematic mask-wearing behaviors. For each participant, we calculated the unweighted mean of the items comprising each factor-subscale, which we used to predict the self-reported frequency of each problematic behavior using multiple linear regression. All statistical analyses were performed in SPSS version 27. A p value < 0.05 was considered statistically significant.
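As a rough illustration of this scoring and regression step, the sketch below computes unweighted subscale means and fits an ordinary least squares model with NumPy. The column layout (first six columns for one subscale, last six for the other), the function names, and the random data are all assumptions made for the example; the study ran these analyses in SPSS.

```python
import numpy as np

def subscale_scores(ratings, item_index):
    """Unweighted mean of the selected item columns for each respondent."""
    return ratings[:, item_index].mean(axis=1)

def ols(y, predictors):
    """Least-squares fit with an intercept; returns coefficients
    (intercept first) and R-squared."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    return beta, r2

# Hypothetical data: 235 respondents, 12 items rated 1-4,
# and one behavior rated 1-6.
rng = np.random.default_rng(0)
ratings = rng.integers(1, 5, size=(235, 12)).astype(float)
behavior = rng.integers(1, 7, size=235).astype(float)

factor1 = subscale_scores(ratings, list(range(0, 6)))   # assumed item columns
factor2 = subscale_scores(ratings, list(range(6, 12)))  # assumed item columns
beta, r2 = ols(behavior, [factor1, factor2])
print(beta, r2)
```

Entering both subscale scores in the same model is what allows each coefficient to be read as the contribution of one dimension of discomfort with the other held constant.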

We received complete data from 246 respondents, 11 (4.5%) of whom were excluded for failing at least one attention check or completing the survey more than once. The remaining respondents (n = 235; Median age = 38 years, IQR: 28-41 years; 52.8% female, 0% other) were mostly from North America (63.8%), followed by the Indian subcontinent (23.4%), Europe (7.2%), South America (4.7%), and elsewhere (<1%). Respondents reported wearing a cloth mask for a median duration of 2 h (IQR: 1-5 h), in total, on a typical day on which they wear a cloth mask in public.

There were no missing data. The mean Pearson correlation between items was r = 0.43 (SD = 0.12), with all items having at least one pairwise correlation ≥ 0.30. Inspection of the correlation matrix revealed a handful of item-pairs with excessively strong correlations (r's ≥ 0.80). One pair comprised items related to ear pain, "My ears bother me," and "My ears hurt," (r = 0.82), and three inter-related pairs comprised items related to breathing difficulty, "It is hard to breathe," "I have trouble breathing," and "I feel out of breath," (Range: 0.80-0.82). Given the undesirable redundancy within these pairs (DeVellis, 2017), we removed one item related to ear pain ("My ears hurt,") and two redundant items related to breathing difficulty ("I have trouble breathing," and "I feel out of breath,") before performing EFA.

We assessed the factorability of the correlation matrix of the remaining 27 items. Bartlett's test of sphericity was significant (χ² = 4105.67, df = 351; p < 0.001), which indicates that the correlation matrix was different from an identity matrix, and the Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy was 0.94, well above the recommended minimum value of 0.60 (Howard, 2016; Tabachnick and Fidell, 2001). Thus, the correlation matrix was appropriate for EFA. A scree test of the eigenvalues of the reduced correlation matrix suggested a 2-factor solution whereas Parallel Analysis suggested up to a 5-factor solution (EFA eigenvalues: 11.94, 1.71, 0.91, 0.73, 0.58; Parallel Analysis 95th percentile eigenvalues: 0.92, 0.78, 0.69, 0.61, 0.55). To choose among these solutions, we evaluated the interpretability and parsimony of solutions comprising 2 to 5 factors (Fabrigar and Wegener, 2011). The 2-factor solution provided the most well-defined, parsimonious, and interpretable set of factors; Factor 1 comprised items that reflect discomfort related to the breathability of cloth masks (e.g., "I feel hot," "It is hard to breathe," and "I want fresh air,") and Factor 2 comprised items that reflect discomfort related to the tightness of cloth masks against the face, head, and ears (e.g., "My head hurts," "My ears feel pinched," "My face feels ticklish," and "My face hurts,"). We retained the 2-factor solution for subsequent analyses.

To better approximate simple structure, we iteratively removed nine items in total, repeating EFA after removing each item. Having obtained a clear factor structure (Worthington and Whittaker, 2006), we then optimized the length of each factor-subscale by removing items that loaded most weakly on their primary factor. As methodologists have recommended a ratio of at least three to five items per factor (Velicer and Fava, 1998), we reduced each factor-subscale to six items (Table 1), repeating EFA after removing each item. The two factors, which were strongly correlated (r = 0.53), accounted for 55.92% of the variance in the 12 items. These 12 items comprise the Discomfort of Cloth Masks-12 (DCM-12) scale (see appendix).

The mean breathability- and tightness-related discomfort subscale scores (1 = "Not at all," to 4 = "Very much so,") were M = 2.5 (SD = 0.8) and M = 1.8 (SD = 0.7), respectively. The mean self-reported frequency (1 = "Never," to 6 = "Very frequently,") of adjusting one's mask in public, touching one's face (either outside or underneath their mask) in public, and removing one's mask temporarily in public was M = 4.0 (SD = 1.2), M = 3.2 (SD = 1.3), and M = 2.7 (SD = 1.4), respectively. Multiple regression analyses (Table 2) revealed that both breathability- and tightness-related discomfort predicted the frequency at which respondents reported touching their face (R² = 0.14, p < 0.001) and removing their mask temporarily while in public (R² = 0.17, p < 0.001). However, only breathability-related discomfort predicted the frequency at which respondents reported adjusting their mask while in public (R² = 0.08, p < 0.001).

The goal of Study 2 was to confirm the 2-dimensional factor structure of the DCM-12 using confirmatory factor analysis in an independent sample. Moreover, as Study 1 provided evidence of the criterion-related validity of the DCM-12 subscale scores, we also sought to test the replicability of these relationships.

Table 1. Item factor loadings from the pattern matrix with item communality, mean rating, and standard deviation. Note. Factor 1 = Breathability-related discomfort. Factor 2 = Tightness-related discomfort. All items were rated on a 4-point scale (1 = "Not at all," 2 = "A little bit," 3 = "Somewhat," 4 = "Very much so,"). For each item, the primary factor loading is in bold.

We used structural equation modeling to perform confirmatory factor analysis (CFA) with the lavaan package (version 0.6-9; Rosseel, 2012) in R statistical software. To estimate model parameters, we used robust unweighted least squares with a mean-and-variance (ULSMV) corrected χ² statistic (Forero et al., 2009; Savalei and Rhemtulla, 2012; Shi et al., 2018), which is recommended for CFA models involving items rated on fewer than five response options. We specified an initial model with two correlated latent variables, each reflected by six items (Table 3), and no correlated residuals between items. We then evaluated standard measures of global fit (Brown, 2006; Worthington and Whittaker, 2006): Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA) with a 90% confidence interval, and standardized root mean square residual (SRMR). To determine goodness of fit, we applied conventional cutoff values: CFI ≥ 0.95 (Hu and Bentler, 1999), an RMSEA value ≤ 0.05, ≤ 0.08, or ≤ 0.10 as indicating good, acceptable, and marginal fit, respectively (Browne and Cudeck, 1992), and an SRMR ≤ 0.08 (Hu and Bentler, 1999). We report but did not use χ² as a measure of global fit because of its sensitivity to sample size (Worthington and Whittaker, 2006). Additionally, we evaluated two measures of local fit, correlated residuals and Lagrange multiplier modification indices. To determine goodness of fit at the local level, we considered correlated residuals > |0.10| (Kline, 2016) and modification indices ≥ 11 (i.e., p < 0.001) as suggestive of local misfit.

To assess the reliability of each factor-subscale, we used the semTools package (version 0.5-5; Jorgensen et al., 2021) in R statistical software to calculate McDonald's ω, which is appropriate when items are unit-weighted to form a total scale score and violate the assumptions of tau-equivalence and uncorrelated errors that Cronbach's α makes (McNeish, 2018). Additionally, we assessed convergent and discriminant validity using the average variance extracted (AVE) for each factor. An AVE ≥ 0.50 indicates adequate convergent validity whereas discriminant validity is supported if the AVE for each factor is greater than the variance shared between those factors (Fornell and Larcker, 1981).
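To make these quantities concrete, the sketch below computes ω and AVE from completely standardized loadings and applies the Fornell and Larcker (1981) comparison. It uses simplified textbook formulas that assume uncorrelated residuals and continuous indicators, so it will not necessarily match semTools output for ordinal items; the function names and the example loadings are hypothetical.

```python
def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def omega(loadings):
    """McDonald's omega for a single factor with uncorrelated residuals."""
    s = sum(loadings)
    residual = sum(1.0 - l * l for l in loadings)
    return (s * s) / (s * s + residual)

def fornell_larcker(ave_a, ave_b, factor_corr):
    """Discriminant validity is supported when both AVEs exceed the
    variance shared by the two factors (the squared correlation)."""
    return min(ave_a, ave_b) > factor_corr ** 2

# Hypothetical standardized loadings for one six-item subscale.
example_loadings = [0.80, 0.75, 0.85, 0.70, 0.78, 0.82]
print(round(ave(example_loadings), 2), round(omega(example_loadings), 2))

# Comparison using the AVEs and latent correlation reported for Study 2.
print(fornell_larcker(0.63, 0.68, 0.78))
```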

In May 2021, we recruited participants through MTurk, who were required to have at least a 95% approval rate and not have participated in Study 1. As in Study 1, we excluded respondents whose primary language was not English or who reported never having worn a cloth mask when in public in the last 3 months. We targeted a sample size of at least 200 respondents, which exceeds MacCallum et al.'s (1999) recommended sample size (100-200) for factor analysis models with strongly determined factors (i.e., an item-to-factor ratio of 20:3) and item communalities around 0.5; the mean item communality in Study 1 was M = 0.6 (SD = 0.1) with an item-to-factor ratio of 18:3. We included three attention checks throughout the survey, removing participants who failed any attention check. To test for differences in demographic variables between the two studies, we used Fisher's exact test for categorical variables and the Mann-Whitney U test for (skewed) continuous variables.

To provide replicable evidence for the criterion-related validity of the factor-subscales (as in Study 1), we assessed the extent to which factor-subscale scores were related to the self-reported frequency of three problematic mask-wearing behaviors using multiple regression.

We received complete data from 230 respondents, 16 of whom were excluded for failing at least one attention check or completing the survey more than once. The remaining respondents (n = 214; Median age = 35 years, IQR: 29-45 years; 43.5% female, <1% other) were mostly from North America (63.5%), followed by the Indian subcontinent (25%), Europe (5.7%), South America (2.8%), and elsewhere (2.8%). Respondents reported wearing a cloth mask for a median duration of 3 h (IQR: 1-5 h), in total, on days on which they wear a cloth mask in public. Respondents in Studies 1 and 2 did not differ in age (W = 23,214, p = 0.21), continent of origin (p = 0.66), gender (p = 0.07), or how long they typically wear a cloth mask in public (W = 23,289, p = 0.27).

Table 2. Multiple regression analyses predicting self-reported frequency of behaviors from factor-subscale scores. Note. Each question began with the phrase, "While wearing a cloth mask in public," and was rated using a 6-point scale (Never = 1, Very Rarely = 2, Rarely = 3, Occasionally = 4, Frequently = 5, Very Frequently = 6). Factor 1 = Breathability-related discomfort subscale score; Factor 2 = Tightness-related discomfort subscale score. *p < 0.05, **p < 0.01, ***p < 0.001.

Table 3. Completely standardized item factor loadings with standard error and item communality. Note. Residuals of items marked † were allowed to correlate post hoc. Factor 1 = Breathability-related discomfort; Factor 2 = Tightness-related discomfort. ***p < 0.001.

There were no missing data. The global fit of the initial model was acceptable according to most criteria: χ² = 185.98 (df = 53, p < 0.001), CFI = 0.95, RMSEA = 0.11 (90% CI: 0.09-0.12), and SRMR = 0.07. However, correlation residuals and modification indices (MIs) both suggested that the residuals of one pair of items should be allowed to correlate ("My ears feel pinched," and "My ears bother me;" r = 0.31, MI = 25.77); both items loaded strongly onto the same factor and were very similar in meaning, suggesting a method effect (Bandalos, 2021; Brown, 2006) rather than a substantive misspecification. Allowing residuals to correlate between these items sufficiently improved global model fit according to all criteria: χ² = 127.55 (df = 52, p < 0.001), CFI = 0.97, RMSEA = 0.08 (90% CI: 0.07-0.10), and SRMR = 0.05. Additionally, correlated residuals (largest |r| = 0.17) and MIs (largest MI = 8.8) were uniformly low. Completely standardized factor loadings (Table 3) were uniformly high, ranging from 0.66 to 0.90 (all p's < 0.001). McDonald's ω for the breathability- and tightness-related discomfort factor-subscales were 0.88 and 0.86, respectively. The two latent factors were strongly correlated (r = 0.78, p < 0.001), but not excessively so (r > 0.80 or 0.85; Brown, 2006). The AVE for breathability- (0.63) and tightness-related discomfort (0.68) were each greater than the variance shared between the factors (r² = 0.61) and greater than 0.50, which suggests adequate discriminant and convergent validity, respectively (Fornell and Larcker, 1981). Thus, a 2-factor structure fit the data well.
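As a quick arithmetic check on the reported fit statistics, the sketch below applies the common point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))) to the two models, with N = 214. This is an assumption about the formula: lavaan's handling of ULSMV-corrected statistics may differ in detail, although this version reproduces the rounded values reported above.

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of RMSEA from a chi-square statistic, its degrees
    of freedom, and the sample size (N - 1 in the denominator)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

initial = rmsea(185.98, 53, 214)  # model without correlated residuals
revised = rmsea(127.55, 52, 214)  # model with one correlated residual
print(round(initial, 2), round(revised, 2))  # prints 0.11 0.08
```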

The mean breathability- and tightness-related discomfort scores (1 = "Not at all," to 4 = "Very much so,") were M = 2.7 (SD = 0.8) and M = 2.0 (SD = 0.8), respectively. The mean self-reported frequency (1 = "Never," to 6 = "Very frequently,") of adjusting one's mask in public, touching one's face (either outside or underneath their mask) in public, and removing one's mask temporarily in public was M = 4.0 (SD = 1.1), M = 3.4 (SD = 1.4), and M = 2.8 (SD = 1.4), respectively. Multiple regression analyses (Table 4) revealed that both breathability- and tightness-related discomfort predicted the frequency at which respondents reported touching their face while wearing a cloth mask in public (R² = 0.22, p < 0.001). However, only breathability-related discomfort predicted the frequency at which respondents reported adjusting their mask (R² = 0.08, p < 0.001) whereas only tightness-related discomfort predicted how often respondents reported removing their mask temporarily (R² = 0.20, p < 0.001).

Wearing cloth masks in public settings is effective at slowing the spread of respiratory-borne pathogens, such as SARS-CoV-2, in the community (Czypionka et al., 2021; Howard et al., 2021) . The effectiveness of cloth masks, however, depends upon the extent to which the public uses them appropriately (Howard et al., 2021) . As appropriate use is thought to be partly related to the degree of discomfort of wearing cloth masks, it is important to measure user discomfort to design masks that will be tolerable for end-users. To this end, we identified and confirmed the factor structure underlying items assessing the discomfort of cloth masks; the resulting DCM-12 measures two dimensions of discomfort: breathability-related discomfort, which is reflected by the degree of breathing difficulty and heat build-up, and tightness-related discomfort, which is reflected by the degree of pain or irritation in the face, head, and ears.

Across two studies, both dimensions of discomfort were strongly and positively correlated, which is unsurprising given the relationship between the fit (i.e., tightness) and the breathability of a mask. Lee et al. (2020) observed that the fit of cloth masks affects their breathability because fit controls the size of the gap(s) between the user's face and mask (i.e., leakage; Kwong et al., 2021), which allows exhaled air to escape more easily than if it were to pass through the mask material. On the other hand, unique variance in each dimension is likely to be related, in part, to properties of masks that are independent of one another; independent of fit, the breathability of a cloth mask is affected by properties of the mask material, such as its filtration efficiency, and the number of layers comprising the mask (Kwong et al., 2021). Independent of the mask material(s), an overly tight mask can cause pain and irritation via pressure and friction against the face, head, or ears, particularly over extended periods of use (Lee et al., 2020).

The separability of these dimensions is further demonstrated by their differential relationships with problematic mask-wearing behaviors. Across both studies, we provide replicable evidence that the degree of breathability- and tightness-related discomfort not only accounted for unique variance in certain behaviors but were also uniquely related to different behaviors; as discomfort related to the breathability and the tightness of a mask increased, so too did the self-reported frequency of touching one's face. Across both studies, however, the frequency of adjusting one's mask and removing one's mask temporarily were consistently predicted by breathability- and tightness-related discomfort, respectively. This suggests that adjusting one's mask may be done primarily to relieve discomfort from a lack of breathability, rather than to relieve pain and irritation from the tightness of a mask. Removing one's mask temporarily, however, removes pressure from the face, head, or ears, thereby relieving the discomfort from the tightness of the mask.

The 2-factor structure of the DCM-12 differs from the common, unidimensional conceptualization of the discomfort of face masks as being related to breathability only (e.g., Choi et al., 2020; Goh et al., 2019; M. C. Howard, 2020; Roberge et al., 2013; Smart et al., 2020). It is unlikely that the 2-dimensional structure of discomfort is entirely unique to cloth masks because other types of masks, both loose- and tight-fitting, have the same issues of fit and breathability. However, the items reflecting these dimensions may need to be contextualized for different types of face masks; for example, tightness-related discomfort in tight-fitting respirators can manifest in ways unique to these types of mask (e.g., painful indentations on the face; Baig et al., 2010; LaVela et al., 2017; Roberge et al., 2010). Lastly, the multidimensional nature of discomfort should discourage the use of single, global items to measure discomfort (e.g., Goh et al., 2019; Roberge et al., 2010; Shenal et al., 2012); not only do such items lack diagnostic value, but their ambiguity may undermine their reliability and validity.

The present studies are not without limitations. We focused on cloth masks, which we defined as any mask that is worn over the nose and mouth, made of fabric, attaches to your head, and can be washed; consequently, we cannot speak to the appropriateness of the DCM-12 for measuring the discomfort of other types of loose-fitting masks, such as surgical masks, which are a popular alternative to cloth masks. Additionally, we did not elicit details of the cloth masks that respondents wore, which most certainly varied with respect to the type of fabric (e.g., cotton or polyester), number of layers of fabric, patterns (e.g., pleated or flat), or how the mask attaches to the head (e.g., with elastic ear loops or fabric ties). Consequently, we could not assess the known-groups validity of the DCM-12, as related instruments have for other types of face masks (Radonovich et al., 2019) . Lastly, we relied on self-reports of mask-wearing behaviors as our criterion variable. Although arguably less reliable than independent observations of behavior, the DCM-12 nonetheless revealed clear and consistent relationships between discomfort and self-reported behaviors.

Table 4. Multiple regression analyses predicting frequency of behaviors from factor-subscale scores. Note. Each question began with the stem, "While wearing a cloth mask in public," and was rated using the same 6-point scale (Never = 1, Very Rarely = 2, Rarely = 3, Occasionally = 4, Frequently = 5, Very Frequently = 6). Factor 1 = Breathability-related discomfort subscale score; Factor 2 = Tightness-related discomfort subscale score. *p < 0.05, **p < 0.01, ***p < 0.001.

The DCM-12 also has applied value. The items are applicable to cloth masks generally and the relatively small number of items allows the DCM-12 to be administered quickly. Future research should assess other forms of validity, such as known-groups validity, as well as criterion-related validity based upon observations of user behaviors. Ultimately, we hope that the DCM-12 will aid in the design of cloth masks that are not only effective as source control but are also tolerable for extended periods of use. In designing cloth masks, we recommend that user discomfort be assessed early and often in the design process because design decisions intended to maximize the effectiveness of cloth masks as source control may create discomfort, which leads to user behaviors that undermine the effectiveness of the mask. For example, all else being equal, a looser-fitting mask is more breathable, but leakage reduces the filtration efficiency of a mask (Cappa et al., 2021), while a tighter-fitting mask will have better filtration efficiency but will be less breathable, potentially leading users to remove their mask temporarily while in public. Considering user discomfort may reduce the frequency of behaviors that undermine the effectiveness of cloth masks.
As the response to the COVID-19 pandemic has demonstrated, masking as a non-pharmaceutical intervention is most effective when there are high levels of compliance in the community, underscoring the importance of behavior in the success of public health efforts to reduce the transmission of respiratory-borne pathogens.
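
For readers who wish to run an analysis of the kind summarized in Table 4 on their own data, the following is a minimal sketch rather than the authors' code. It regresses a behavior-frequency rating (coded 0 to 5, as in the Table 4 note) on the two factor-subscale scores using ordinary least squares. The function name, the synthetic data, the score range, and the coefficient values are all illustrative assumptions; significance tests are omitted.

```python
import numpy as np

# Response labels and numeric codes, taken from the Table 4 note.
FREQUENCY_CODES = {
    "Never": 0, "Very Rarely": 1, "Rarely": 2,
    "Occasionally": 3, "Frequently": 4, "Very Frequently": 5,
}


def fit_two_predictor_regression(factor1, factor2, behavior):
    """Ordinary least squares for: behavior ~ intercept + factor1 + factor2.

    Hypothetical helper (not from the paper). Returns the intercept, the two
    unstandardized slopes, and R^2.
    """
    factor1 = np.asarray(factor1, dtype=float)
    factor2 = np.asarray(factor2, dtype=float)
    behavior = np.asarray(behavior, dtype=float)

    # Design matrix: a column of ones for the intercept plus both predictors.
    X = np.column_stack([np.ones_like(factor1), factor1, factor2])
    coef, *_ = np.linalg.lstsq(X, behavior, rcond=None)

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    residuals = behavior - X @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((behavior - behavior.mean()) ** 2).sum())
    return {
        "intercept": float(coef[0]),
        "b_factor1": float(coef[1]),
        "b_factor2": float(coef[2]),
        "r_squared": 1.0 - ss_res / ss_tot,
    }


if __name__ == "__main__":
    # Synthetic data only: the score range and "true" coefficients below are
    # arbitrary assumptions chosen for illustration, not results from the paper.
    rng = np.random.default_rng(0)
    n = 200
    breathability = rng.uniform(1, 5, n)  # Factor 1 subscale score (assumed range)
    tightness = rng.uniform(1, 5, n)      # Factor 2 subscale score (assumed range)
    behavior = 0.5 + 0.6 * breathability + 0.3 * tightness + rng.normal(0, 0.5, n)

    print(fit_two_predictor_regression(breathability, tightness, behavior))
```

Because the ratings are ordinal, a linear model is only one reasonable choice; an ordinal regression may be preferable depending on the distribution of responses.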

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Health care workers' views about respirator use and features that should be included in the next generation of respirators

Downsides of face masks and possible mitigation strategies: a systematic review and meta-analysis

Item meaning and order as causes of correlated residuals in confirmatory factor

Last-resort strategies during mask shortages: optimal design features of cloth masks and decontamination of disposable masks during the COVID-19 pandemic

Confirmatory Factor Analysis for Applied Research (First)

Alternative ways of assessing model fit

Expiratory aerosol particle escape from surgical masks due to imperfect sealing

Science Brief: Community Use of Cloth Masks to Control the Spread of SARS-CoV-2

Appropriate attitude promotes mask wearing in spite of a significant experience of varying discomfort

Evaluation of wearing comfort of dust masks

Compliance with the use of medical and cloth masks among healthcare workers in Vietnam

Best practices in exploratory factor analysis: four recommendations for getting the most from your analysis

Masks and face coverings for the lay public: a narrative update

Scale Development: Theory and Applications (Fourth

Exploratory Factor Analysis

Evaluating the use of exploratory factor analysis in psychological research

Factor analysis with ordinal indicators: a Monte Carlo study comparing DWLS and ULS estimation

Structural equation models with unobservable variables and measurement error: algebra and statistics

Scale development and validation

A randomised clinical trial to evaluate the safety, fit, comfort of a novel N95 mask in children

Evaluation on masks with exhaust valves and with exhaust holes from physiological and subjective responses

An evidence review of face masks against COVID-19

A review of exploratory factor analysis decisions and overview of current practices: what we are doing and how can we improve?

Understanding face mask use to prevent coronavirus and other illnesses: development of a multidimensional face mask perceptions scale

Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives

2021. semTools: Useful Tools for Structural Equation Modeling

Systems perspective of Amazon mechanical Turk for organizational research: review and recommendations

Behavior in the use of face masks in the context of COVID-19

Principles and Practice of Structural Equation Modeling

Review of the breathability and filtration efficiency of common household materials for face masks

Development and initial validation of the respirator comfort, wearing experience, and function instrument

Reusable face masks as alternative for disposable medical masks: factors that affect their wear-comfort

Time dependent infrared thermographic evaluation of facemasks

Sample size in factor analysis

A cluster randomised trial of cloth masks compared with medical masks in healthcare workers

Thanks coefficient alpha, we'll take it from here

Field study of subjective assessment of negative pressure half-masks. Influence of the work conditions on comfort and efficiency

SPSS and SAS programs for determining the number of components using parallel analysis and Velicer's MAP test

Making Sense of Factor Analysis: the Use of Factor Analysis for Instrument Development in Health Care Research

A Report of an Interagency Working Group of the U.S. Federal Government. U.S. Department of Veterans Affairs

A tolerability assessment of new respiratory protective devices developed for health care personnel: a randomized simulated clinical study

When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions

Surgical mask placement over N95 filtering facepiece respirators: physiological effects on healthcare workers

Physiological impact of the N95 filtering facepiece respirator on healthcare workers

Absence of consequential changes in physiological, thermal and subjective responses from wearing a surgical mask

Impact of low filter resistances on subjective and physiological responses to filtering facepiece respirators

Lavaan: an R package for structural equation modeling

More for less? A comparison of single-item and multiitem measures

The performance of robust test statistics with categorical data

Facial skin temperature and discomfort when wearing protective face masks: thermal infrared imaging evaluation and hands moving the mask

Discomfort and exertion associated with prolonged wear of respiratory protection in a health care setting

Examining chi-square test statistics under conditions of large model size and ordinal data

Face touching in the time of COVID-19 in Shiraz

Assessment of the wearability of facemasks against air pollution in primary school-aged children in London

Using Multivariate Statistics

Effects of variable and subject sampling on factor pattern recovery

Scale development research: a content analysis and recommendations for best practices

We wish to thank Marilyn Garcia and Anagha Nair for their assistance preparing the manuscript and Colleen Kraft and Erik Brownsword for their feedback on earlier drafts.

Supplementary data to this article can be found online at https://doi.org/10.1016/j.apergo.2021.103616.

This research was supported by contract award 75D30120C09509, Centers for Disease Control and Prevention (CDC), Broad Agency Announcement 75D301-20-R-68024. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Centers for Disease Control and Prevention.