title: Italian telephone-based Mini-Mental State Examination (Itel-MMSE): item-level psychometric properties
authors: Aiello, Edoardo Nicolò; Esposito, Antonella; Pucci, Veronica; Mondini, Sara; Bolognini, Nadia; Appollonio, Ildebrando
date: 2022-01-08
journal: Aging Clin Exp Res
DOI: 10.1007/s40520-021-02041-4

BACKGROUND: The Italian telephone-based Mini-Mental State Examination (Itel-MMSE), despite being psychometrically sound, has shown relevant ceiling effects, which may negatively impact the interpretation of its scores. To overcome this issue, this study aimed at providing item-level insights on the Itel-MMSE through Item Response Theory (IRT) analyses.

METHODS: Five hundred and sixty-seven healthy Italian adults (227 males, 340 females; mean age: 51 ± 17 years, range 18–96; mean education: 13.31 ± 4.3 years) were recruited. A two-parameter logistic IRT model was implemented to assess item discrimination and difficulty of the Itel-MMSE. Construct unidimensionality, statistical independence of items, and model and item fit were tested. Informativity levels were also assessed graphically.

RESULTS: With respect to the Itel-MMSE total score, ceiling effects were found in 92.7% of participants. Unidimensionality was violated; both model and item fit were poor; a few items showed statistical dependence. Both the whole test and its items proved to be scarcely informative, especially for medium-to-high levels of ability, except for the attention and spatial orientation subtests, which consistently yielded the highest discriminative capability.

DISCUSSION: The Itel-MMSE appears to be most informative in low-performing healthy individuals. However, the present findings should not lead practitioners to aprioristically equate ceiling effects/low informativity with clinical uselessness. Items assessing attention and, to a lesser extent, spatial orientation appear to be the most informative.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s40520-021-02041-4.

In the context of remote neurological and geriatric healthcare, telephone-based cognitive screening tests allow the cognitive assessment of populations with poor access to in-person services due to either logistical issues (e.g., motor disabilities) or unequal geographical coverage of clinics [1]. Moreover, telephone-based cognitive screening tests are useful for the implementation of epidemiological studies on cognitive impairment [2] and dementia prevention campaigns [3]. Within the international literature on telephone-based cognitive screening tools, versions of the renowned Mini-Mental State Examination (MMSE [4]) administrable via the telephone are highly represented [5, 6]. In Italy, the telephone-based MMSE (Itel-MMSE) has shown good validity, reliability, and diagnostic properties [7, 8]. However, it has also shown relevant ceiling effects [8], which may negatively impact the interpretation of its scores, as has been highlighted for the in-person version of the MMSE [9]. A possible approach to overcoming such issues is to deliver item-level information in the framework of Item Response Theory (IRT). IRT allows item measurement properties to be assessed in relation to the underlying trait that the items are meant to target [10]. When approaching cognitive screening through IRT, two item features are of major interest: difficulty, that is, the level of latent ability required for an individual to "pass" an item, and discrimination, namely the capability of an item to discriminate between individuals with different levels of ability [10].
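In the two-parameter logistic (2PL) model adopted later in this study, these two features correspond to the item parameters a_i (discrimination) and b_i (difficulty). The following is the standard textbook formulation of the item characteristic function, reported here only for reference and not reproduced from the paper:

```latex
% 2PL item characteristic function
% \theta : latent ability (here, cognitive efficiency)
% a_i    : discrimination (slope) of item i
% b_i    : difficulty of item i (the ability level at which P_i(\theta) = .5)
P_i(\theta) = \frac{1}{1 + \exp\!\left[-a_i\,(\theta - b_i)\right]}
```

Larger values of a_i yield steeper curves (sharper separation between ability levels), whereas larger values of b_i shift the curve towards higher ability levels (harder items).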
In this view, IRT would undoubtedly help practitioners gain as much information as possible from each item when an overall score is not sufficient to draw clinical judgments on individual cognitive profiles [11], as in the case of ceiling effects. With respect to the Italian paper-and-pencil MMSE, IRT has proved to be a successful approach for revealing item-level features related to demographic confounders [12]. Consistently, this study was intended to provide IRT-based, item-level psychometric insights into the Itel-MMSE in an Italian population sample, with the aim of helping practitioners interpret its outcome.

Five hundred and sixty-seven individuals (227 males, 340 females; mean age: 50.99 ± 17.04 years, range = 18–96 years; mean education: 13.31 ± 4.25 years, range = 1–26 years) with no history of neurological, psychiatric, or severe general medical conditions (i.e., organ/system failures, non-compensated metabolic disorders), active psychopharmacological therapies, or hearing deficits were recruited from different regions of Italy (445 participants recruited in Northern, 122 in Central, and 286 in Southern Italy; see Supplementary Table 1 for sample stratification by age, education, and sex). Participants were recruited between 2020 and 2021 via word-of-mouth advertising through personal acquaintances of researchers from the University of Milano-Bicocca and the University of Padua. Medical history was investigated by means of a semi-structured interview exploring the areas of neurological, psychiatric, general medical, and psychopharmacological history. This study was conducted in accordance with the Declaration of Helsinki and approved by the Ethical Committees of the University of Milano-Bicocca and the University of Padua. Participants provided informed consent to participation.

The Itel-MMSE by Metitieri et al. [7] omits the items on reading, writing, oral comprehension, and constructional praxis; moreover, the "place" item for spatial orientation is dropped, and only one naming-to-description item is maintained. As in the in-person format, if at least one error is committed on the serial subtraction task, an alternative serial spelling task is administered to test sustained attention. The total score of the test ranges from 0 to 22.

An in-depth sound check was preliminarily carried out to ensure a good quality of the call. Participants were instructed on the actions required to execute the tasks. The presence of an informant was required to make sure that no facilitation occurred during test administration, as well as to confirm address information (which allowed spatial orientation to be tested).

R 4.0.1 [13] and SPSS 27 [14] were adopted for the statistical analyses. Each of the 22 items was dichotomized, and cognitive efficiency was addressed as the latent ability. With respect to the sustained attention tasks, only serial subtraction was taken into account (as it was administered first and thus yielded no missing values). Item difficulty and discrimination were examined by means of a two-parameter logistic IRT model implemented through the R package mirt [15]. "Canonical" difficulty was assumed to range from −4 to +4 [16]. Cut-offs for "high" and "very high" discrimination were set at ≥ 1.5 and ≥ 1.7, respectively [16]. According to the guidelines proposed by Şahin and Anıl [17], the minimum sample size for the accurate estimation of a two-parameter logistic IRT model (i.e., including item difficulty and discrimination) was set at N ≈ 500 on the basis of the number of items of the Itel-MMSE (n = 22).
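As a concrete illustration of this fitting step, the following is a minimal sketch rather than the authors' original script: since the Itel-MMSE raw data are not available here, a simulated 22-item dichotomous dataset (generated with mirt::simdata) stands in for them, and all object names are hypothetical.

```r
library(mirt)

## Simulated stand-in for the 567 x 22 matrix of dichotomized Itel-MMSE items
set.seed(1)
n_items <- 22
sim_a <- matrix(rlnorm(n_items, meanlog = 0.2, sdlog = 0.3))  # hypothetical discriminations
sim_d <- matrix(rnorm(n_items, mean = 1.5, sd = 1))           # hypothetical (easy) intercepts
itel_items <- simdata(a = sim_a, d = sim_d, N = 567, itemtype = "dich")

## Unidimensional two-parameter logistic (2PL) model
fit_2pl <- mirt(itel_items, model = 1, itemtype = "2PL")

## Classical IRT parameterization: a = discrimination, b = difficulty
coef(fit_2pl, IRTpars = TRUE, simplify = TRUE)$items[, c("a", "b")]
```

With real data, the simulated matrix would simply be replaced by the participants-by-items 0/1 response matrix.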
mirt [15] was also adopted to check the IRT assumptions. Unidimensionality (i.e., whether the test actually measures a unique latent ability/cognitive efficiency) was tested through an exploratory factor analysis [10]. Local independence (i.e., whether items are statistically independent of each other, net of their association with the latent trait) [10] was checked via Yen's Q3 statistic (i.e., the correlation between item residuals, which quantify the discrepancy between expected and observed values) [10, 18]. Each item yielding Q3 ≥ .36 towards the others was judged as locally dependent [19]. Item fit (i.e., the degree of consistency between items and the test as a whole in measuring the latent trait) and model fit (i.e., whether the estimated two-parameter logistic model is consistent with the observed data) were judged by assessing the root mean square error of approximation (RMSEA); higher RMSEA values are suggestive of a deviation from the ideal fit [20]. RMSEA values ≥ .06 were taken to index poor item/model fit.

Mean aggregated Itel-MMSE scores were the following: total: 21.54 ± .88 (range = 14–22); orientation: 8.84 ± .42 (range = 6–9); attention: 4.93 ± .38 (range = 1–5); memory: 5.84 ± .47 (range = 2–6); and language: 1.93 ± .28 (range = 0–2). Ceiling effects for the total score were cumulatively detected in 90.2% of participants (69.7% of participants scoring the maximum, 20.5% scoring 21 out of 22). Cronbach's α was acceptable (0.65). However, the exploratory factor analysis suggested a modest violation of the unidimensionality assumption (Supplementary Table 2), with several items loading < 0.3. Accordingly, most items showed poor fit (Table 1), as did the model as a whole (RMSEA = 0.062). Partial local dependence was detected for the first 3 items assessing spatial orientation and the last 2 serial subtractions (.48 ≤ |Q3| ≤ .75). The other items proved to be locally independent (Q3 ≤ 0.28). A summary of item difficulty and discrimination is shown in Table 2. As for difficulty, estimates overall proved to be unreliable (i.e., extremely high difficulty in spite of clear ceiling effects), whereas serial calculation items and, to a lesser extent, spatial orientation ones proved to be the most discriminative, as visually suggested by item characteristic functions (i.e., a graphical representation of difficulty and discrimination parameters; Fig. 1) and item information curves (representing the extent of informativity of each item with respect to individuals' level of ability; Fig. 2). The visual representation of the expected total score is suggestive of a strong ceiling effect and a generally scarce discriminative capability, except for individuals with low ability levels (Supplementary Fig. 1). Consistently, the test information curve (visually representing the extent to which the test as a whole is informative with respect to the individual's ability) showed informativity peaks and low standard errors for low levels of ability, whereas informativity was poor and estimates were unreliable for ability levels equal to or greater than the mean (Fig. 3).
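The assumption and fit checks reported above (unidimensionality, Yen's Q3, and RMSEA-based item/model fit) are all available in mirt. The following minimal sketch continues the simulated example from the previous snippet and therefore does not reproduce the results just described:

```r
## Unidimensionality: standardized loadings of the single-factor solution
## (loadings < .30, as reported in the Results, would question the assumption)
summary(fit_2pl)

## Local independence: Yen's Q3, i.e., correlations between item residuals
q3 <- residuals(fit_2pl, type = "Q3")
range(q3[lower.tri(q3)])   # |Q3| >= .36 would flag locally dependent item pairs

## Item fit: RMSEA based on the S-X2 statistic (values >= .06 read here as poor fit)
itemfit(fit_2pl)

## Model fit: limited-information M2 statistic and its RMSEA
M2(fit_2pl)
```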
The present work provides Italian practitioners and researchers with item-level psychometric insights on the Itel-MMSE from a large Italian population sample, in order to ease the interpretation of its outcomes in both clinical and experimental settings. The Itel-MMSE proved to be scarcely informative of individuals' cognitive profiles as far as most of its items are concerned, except for those assessing sustained attention (i.e., serial subtraction), as well as for certain ones exploring spatial orientation. Moreover, these items tended to yield the highest level of informativity for low levels of ability, this being especially true for spatial orientation ones. Consistently, as a whole, the Itel-MMSE was most informative in low-performing individuals. In this regard, previous evidence on the clinical usability of the Itel-MMSE showed that this test may offer a valid estimate of the cognitive status of patients with mild, moderate, and even severe dementia [7], whereas healthy individuals score the maximum on the Itel-MMSE [8]. Further studies are needed to determine to which degree of cognitive impairment the Itel-MMSE is sensitive, since previous findings suggest it might also be able to detect mild deficits [7], at variance with the present results.

The poor features of the Itel-MMSE in the healthy population are attributable to the low across-individual variability in scores, which may in turn be related to strong ceiling effects [9]. However, these findings should not lead practitioners to aprioristically equate ceiling effects/low informativity with clinical uselessness. Indeed, cognitive measures showing ceiling effects in healthy individuals might not necessarily behave in the same way when applied to patients [21]. For instance, a given patient failing 2 out of 3 immediate recall items will undoubtedly be judged as more impaired than another scoring the maximum, even though this subset of items shows near-zero variability and markedly goes to ceiling in unimpaired individuals. Further investigations on the Itel-MMSE are therefore needed to determine whether such inferences actually apply to target clinical populations [22]. In this last respect, it might be of interest for future studies to explore the potential of the Itel-MMSE in detecting cognitive complaints, related to the COVID-19 pandemic and its social restrictions, in healthy adults and in the elderly [23, 24].

Table 2 Item difficulty and discrimination for the Itel-MMSE. "Canonical" difficulty was addressed as ranging from −4 to +4 (†) [16]. Cut-offs for "high" and "very high" discrimination were set at ≥ 1.5 (*) and ≥ 1.7 (**), respectively [16]. Item difficulty refers to the level of latent ability required for an individual to "pass" an item; item discrimination refers to the capability of an item to discriminate between individuals having different levels of ability.

Fig. 1 Item characteristic functions for Itel-MMSE items. On the x-axis, levels of ability θ (theoretically ranging from −∞ to +∞, conventionally from −6 to +6) are expressed as the logarithm of the odds ratio: above-zero values indicate levels of ability above the mean; below-zero values indicate levels of ability below the mean. On the y-axis, the probability of a correct response P(θ) (ranging from 0 to 1). Flatter curves are suggestive of low difficulty (ceiling effect) and scarce discriminative capability. By contrast, the steeper portions of the curve in relation to a given level of ability represent the ability level(s) at which the test is most discriminative. The items that prove to be the most difficult and discriminative are those of Serial subtraction and Spatial orientation-1 and -3.
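Curves of the kind shown in Figs. 1-3 (and the expected total score in Supplementary Fig. 1) are standard graphical output of mirt. A minimal sketch, again based on the simulated model object from the snippets above rather than on the authors' data:

```r
## Standard mirt plots for a fitted unidimensional model ('fit_2pl' from above)
plot(fit_2pl, type = "trace")      # item characteristic curves (cf. Fig. 1)
plot(fit_2pl, type = "infotrace")  # item information curves (cf. Fig. 2)
plot(fit_2pl, type = "infoSE")     # test information and SE(theta) (cf. Fig. 3)
plot(fit_2pl, type = "score")      # expected total score by theta (cf. Supplementary Fig. 1)
```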
Furthermore, the present item-level investigation sheds further light on the factorial structure underlying the MMSE, regardless of its version (in-person vs. telephone-based). Indeed, the findings reported here underscore the multi-domain nature of the Itel-MMSE, with attention items somehow "carrying" the whole test as far as informativity is concerned, in agreement with previous contributions within the field of clinical usability of the in-person MMSE [25, 26].

A major limitation of the present study should be borne in mind, namely, the fact that the two-parameter logistic model estimates might have been distorted by the poor fit of both the model and the items. However, at an explorative and descriptive level, the information reported here may prove useful both for practitioners adopting the Itel-MMSE as a telephone-based cognitive screening tool and for promoting further studies on the statistical properties and usability of this instrument.

A more appropriate standardization study of the Itel-MMSE, in line with current methodological standards on remote cognitive testing [27, 28], is desirable. Indeed, norms specific to the remote modality of delivery still need to be derived [27], and the construct and criterion validity of the test needs to be assessed, also exploring its association with other telephone-based measures of cognitive status as well as with in-person cognitive screening tests [29, 30], including an assessment of the equivalence between the Itel-MMSE and its in-person version by means of ad-hoc statistical methods [31]. Such a standardization study should also include the following aspects: the exploration of both the inter-rater and the test-retest reliability of the Itel-MMSE, considering the potential subjectivity in scoring approaches with remote assessments [30]; and the assessment of the diagnostic properties of the Itel-MMSE most relevant for screening tests (i.e., sensitivity, specificity, positive and negative predictive values, and likelihood ratios) [22, 32], as well as of its ability to discriminate cases from controls and its sensitivity to detect changes over time [22, 32]. With respect to psychometrics, diagnostics, and norms for telephone-based cognitive screening tests, two recent works by Aiello et al. [33, 34] can be taken as model approaches to standardization.
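As a reminder of how the screening diagnostics listed above are computed, the following minimal sketch derives them from a hypothetical 2x2 classification table (test-positive/negative vs. patients/controls); the counts are illustrative only and unrelated to the Itel-MMSE:

```r
## Hypothetical 2x2 table: rows = screening outcome, columns = clinical status
tp <- 40; fn <- 10   # patients correctly vs. incorrectly classified
fp <- 15; tn <- 85   # controls incorrectly vs. correctly classified

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
ppv <- tp / (tp + fp)                       # positive predictive value
npv <- tn / (tn + fn)                       # negative predictive value
lr_pos <- sensitivity / (1 - specificity)   # positive likelihood ratio
lr_neg <- (1 - sensitivity) / specificity   # negative likelihood ratio

round(c(Se = sensitivity, Sp = specificity, PPV = ppv, NPV = npv,
        `LR+` = lr_pos, `LR-` = lr_neg), 2)
```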
In conclusion, this study suggests that, when interpreting Itel-MMSE scores, clinicians should focus on those items targeting attention and, to a lesser extent, spatial orientation; overall, the test appears to be more useful for screening cognition in low-performing individuals.

The online version contains supplementary material available at https://doi.org/10.1007/s40520-021-02041-4.

Fig. 2 Item information curves for Itel-MMSE items. On the x-axis, levels of ability θ (conventional range = −6 to +6) are expressed as the logarithm of the odds ratio: above-zero values indicate levels of ability above the mean; below-zero values indicate levels of ability below the mean. On the y-axis, the level of informativity of a given item I(θ) (ranging from 0 to +∞). Peaks in the curves index the ability levels at which a given item is informative. The items that prove to be the most informative with respect to the individual level of cognitive functioning are Spatial orientation-1 and Serial subtraction-3, -4, and -5. These items are more informative in low performers (i.e., low ability levels, −4 < θ < 0).

Fig. 3 Test information curve and standard error for the Itel-MMSE. On the x-axis, levels of ability θ (conventional range = −6 to +6) are expressed as the logarithm of the odds ratio: above-zero values index levels of ability above the mean; below-zero values index levels of ability below the mean. On the y-axis, the level of informativity of the test as a whole I(θ) (ranging from 0 to +∞) is reported on the left, whereas the standard error of the informativity estimates SE(θ) (ranging from 0 to +∞) is reported on the right. The solid line represents the test information curve, whereas the dotted one represents the standard error of the estimates. Peaks in the solid line represent the levels of ability at which the test is most informative; by contrast, peaks in the dotted line show the levels of ability at which informativity estimates are unreliable. The total Itel-MMSE score is highly informative in low-performing individuals (i.e., low levels of ability, −4 < θ < 0).
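For reference, the item and test information functions underlying Figs. 2 and 3, and the standard error plotted alongside the test information curve, take the following standard form under the 2PL model (a textbook result, not reproduced from the paper):

```latex
% Item information, test information, and standard error of measurement (2PL)
I_i(\theta) = a_i^{2}\, P_i(\theta)\,\bigl[1 - P_i(\theta)\bigr], \qquad
I(\theta) = \sum_{i=1}^{22} I_i(\theta), \qquad
SE(\theta) = \frac{1}{\sqrt{I(\theta)}}
```

This makes explicit why informativity peaks where SE(θ) dips, as visible in Fig. 3.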
References

[1] Telephone screening to identify potential dementia cases in a population-based sample of older adults.
[2] A critical review of the use of telephone tests to identify cognitive impairment in epidemiology and clinical research.
[3] Validation of multi-stage telephone-based identification of cognitive impairment and dementia.
[4] "Mini-mental state": a practical method for grading the cognitive state of patients for the clinician.
[5] Telephone-based screening tools for mild cognitive impairment and dementia in aging studies: a review of validated instruments.
[6] Accuracy of telephone-based cognitive screening tests: systematic review and meta-analysis.
[7] The Itel-MMSE: an Italian telephone version of the Mini-Mental State Examination.
[8] Validity of the Italian telephone version of the Mini-Mental State Examination in the elderly healthy population.
[9] The Mini-Mental State Examination revisited: ceiling and floor effects after score adjustment for educational level in an aging Mexican population.
[10] An introduction to item response theory and Rasch models for speech-language pathologists.
[11] The Montreal Cognitive Assessment (MoCA): updated norms and psychometric insights into adaptive testing from healthy individuals in Northern Italy.
[12] Differential item functioning related to education and age in the Italian version of the Mini-Mental State Examination.
[13] R: A language and environment for statistical computing.
[14] IBM Corp (2021) IBM SPSS Statistics for Windows, Version 27.0.
[15] mirt: a multidimensional item response theory package for the R environment.
[16] Item response theory for medical educationists.
[17] The effects of test length and sample size on item parameters in item response theory.
[18] Scaling performance assessments: strategies for managing local item dependence.
[19] Development and validation of an item bank for depression screening in the Chinese population using computer adaptive testing: a simulation study.
[20] RMSEA, CFI, and TLI in structural equation modeling with ordered categorical data: the story they tell depends on the estimation methods.
[21] The frontal assessment battery 20 years later: normative data for a shortened version (FAB15).
[22] Psychometrics and diagnostics of Italian cognitive screening tests: a systematic review.
[23] Mental health status of Italian elderly subjects during and after quarantine for the COVID-19 pandemic: a cross-sectional and longitudinal study.
[24] Subjective cognitive failures and their psychological correlates in a large Italian sample during quarantine/self-isolation for COVID-19.
[25] The factorial structure of the Mini-Mental State Examination (MMSE) in Alzheimer's disease.
[26] The factorial structure of the Mini-Mental State Examination (MMSE) in Japanese dementia patients.
[27] Inter Organizational Practice Committee guidance/recommendation for models of care during the novel coronavirus pandemic.
[28] Tele-neuropsychological assessment tools in Italy: a systematic review on psychometric properties and usability.
[29] Distance assessment for detecting cognitive impairment in older adults: a systematic review of psychometric evidence. Dement Geriatr Cogn Disord.
[30] Reliability of telephone and videoconference methods of cognitive assessment in older adults with and without dementia.
[31] Equivalence tests: a practical primer for t tests, correlations, and meta-analyses.
[32] Are we measuring the same thing? Psychometric and research considerations when adopting new testing modes in the time of COVID-19.
[33] Telephone Interview for Cognitive Status (TICS): Italian adaptation, psychometrics and diagnostics.
[34] ALS Cognitive Behavioral Screen-Phone Version (ALS-CBS™-PhV): norms, psychometrics, and diagnostics in an Italian population sample.

Acknowledgements The authors thank Dr. Tiziana Metitieri and Dr.