key: cord-1056232-b6bl0pci
authors: Zanin, Elia; Aiello, Edoardo Nicolò; Diana, Lorenzo; Fusi, Giulia; Bonato, Mario; Niang, Aida; Ognibene, Francesca; Corvaglia, Alessia; De Caro, Carmen; Cintoli, Simona; Marchetti, Giulia; Vestri, Alec
title: Tele-neuropsychological assessment tools in Italy: a systematic review on psychometric properties and usability
date: 2021-11-09
journal: Neurol Sci
DOI: 10.1007/s10072-021-05719-9
sha: bfea6a649fef0679e6853c0eaf799be6aca96c4f
doc_id: 1056232
cord_uid: b6bl0pci

BACKGROUND: The current COVID-19 pandemic has abruptly catalysed a shift towards remote assessment in neuropsychological practice (tele-neuropsychology, t-NPs). Although the validity of t-NPs diagnostics is gaining recognition worldwide, little is known about its implementation in Italy. The present review by the Italian working group on tele-neuropsychology (TELA) aims at describing the availability, psychometric properties, and feasibility of t-NPs tools currently available in Italy. METHODS: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed. This work was pre-registered on the Prospective Register of Systematic Reviews (PROSPERO; CRD42021239687). Observational studies reporting telephone-, videoconference- or web-based assessment of cognition/behaviour, in Italian, in both healthy participants (HPs) and patients were included. Bias assessment was performed through ad hoc scales. RESULTS: Fourteen studies were included from an initial N = 895 (4 databases searched). Studies were subdivided into those focused on psychometric properties and those characterized by a predominant applied nature. The majority of studies addressed either adult/elderly HPs or neurological/internal patients. Multi-domain screening tools for cognition, behaviour, mood/anxiety and quality of life were the most represented. Findings regarding validity, reliability, sensitivity, specificity and clinical usability were reported for cognitive screenings, namely the telephone- and videoconference-based Mini-Mental State Examination and the Telephone Interview for Cognitive Status. DISCUSSION: Positive, albeit preliminary, evidence regarding the psychometric properties and feasibility of Italian t-NPs brief screening tools in both clinical and non-clinical populations is herewith provided. Further studies exploring clinical usability of t-NPs and psychometric properties/feasibility of tests for the in-depth assessment of specific cognitive domains are necessary.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10072-021-05719-9.

The ongoing pandemic due to the novel coronavirus disease (COVID-19) has catalysed a physiological, although slowly ongoing, shift towards tele-neuropsychology (t-NPs) in healthcare systems [1]. Indeed, while before the pandemic t-NPs had been primarily employed either in those settings not allowing traditional care [2] or in countries covering large territories [1, 3], an unprecedented increase of its application has been witnessed in response to the pandemic [4].

Neuropsychological (NPs) practice had indeed to move from traditional, face-to-face examinations towards remote interactions between patients and practitioners [5] . This abrupt transition not only led clinicians to adopt unprecedented solutions for NPs assessment, but also prompted researchers to explore the equivalence of remote vs. in-person evaluations [6, 7] , as well as to adapt standard NPs tests to remote administrations [3] .

Increasing interest has been devoted to diagnostic t-NPs also with the broader aim of providing a more widespread, effective, and efficient access to such healthcare services [7]. Indeed, patients with medical conditions affecting cognition/behaviour may have poor access to in-person NPs evaluations due to logistical issues, which may hamper or delay the early detection of NPs deficits and thus negatively impact patients' prognosis [8]. Moreover, it has already been proposed that t-NPs may become essential also for the longitudinal monitoring of these clinical populations [4, 9]. This emerging picture is well represented in Italy, which witnessed an exponential rise of attention towards t-NPs in the absence of gold standards/good practices [10]. Indeed, although in 2012 the Italian National Healthcare System provided national address lines for telemedicine, 1 there had not been many chances for them to be widely implemented, as traditional care was still predominant until the first months of 2020. After all, such address lines highlight that remotely delivered healthcare services are not meant to entirely replace those provided within traditional, in-person settings (page 10) 1 . By contrast, telemedicine is intended to be evidence-based, i.e., to be practiced according to current national and international scientific contributions on the topic, also with specific regard to target diseases (page 28) 1 .

Elia Zanin and Edoardo Nicolò Aiello contributed equally to this work.

* Edoardo Nicolò Aiello e.aiello5@campus.unimib.it

Recent international systematic reviews have highlighted that NPs instruments administered via videoconference/over the telephone are characterised by high psychometric (e.g., validity, reliability) and diagnostic (e.g., sensitivity, specificity) quality [1, 3, 11, 12], and that videoconference-based t-NPs assessments are substantially comparable to in-person evaluations [11]. Consistently, limited albeit promising evidence on the validity and reliability of web-based NPs tools has been provided [13, 14]. Altogether, the aforementioned findings appear to endorse the feasibility of t-NPs approaches.

However, given the relevance of socio-demographic, cultural and language differences in NPs testing [15], evidence of the feasibility and statistical goodness of such approaches has to be examined with respect to each country/language-specific context. Furthermore, little is known about psychometric properties, clinical usability and experimental applications of Italian t-NPs tools.

The present systematic review was thus performed by a collaborative panel of experts coming from different areas of Italy, the Italian working group on tele-neuropsychology (TELA; https://tela20.net/), in order to shed light on the state of the art of remote NPs testing in this country. By carrying out such an investigation, we set ourselves the broader aim of providing practical insights to both Italian clinicians and researchers in the field of t-NPs, while hopefully promoting further studies on the topic.

The present systematic review was performed according to the Preferred Reporting Items for Systematic reviews and Meta-Analyses guidelines (PRISMA) [16]. The PRISMA checklist is provided in Supplementary Table 1. This systematic review was pre-registered on the International Prospective Register of Systematic Reviews (PROSPERO), identification number: 2021 CRD42021239687 (https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42021239687).

The online search strategy was conducted through the major public scientific databases -PubMed, PsychInfo, Embase, and Scopus -and ended on March 15, 2021. The following search terms were entered into the databases: (ital*) AND (cognit* OR neuropsychol* OR psychometri* OR teleneuropsychol* OR tele-neuropsychol*) AND (assess* OR screen* OR test OR evaluat*) AND (telephon* OR phone OR telephone-based OR phone-based OR remote OR videoconferenc* OR webcam OR telehealth OR tele-health OR telemedic* OR web OR self-administered OR online). Fields of search were the title, abstract and possibly key words and subjects/indexing. No date limit was set, and only contributions written in English or Italian language were included. Grey literature was not searched for. Cross-referencing papers were further examined for relevant articles during the initial search.
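The search string above follows a simple structure: terms are joined by OR within each concept block and the four blocks are joined by AND. The minimal sketch below shows how such a string could be assembled programmatically; the helper name `build_query` and the block labels are illustrative assumptions, and database-specific field tags or syntax variants are not modelled.

```python
# Concept blocks copied from the search strategy reported in the text.
# The dictionary keys are hypothetical labels used only for readability.
BLOCKS = {
    "country": ["ital*"],
    "construct": ["cognit*", "neuropsychol*", "psychometri*",
                  "teleneuropsychol*", "tele-neuropsychol*"],
    "activity": ["assess*", "screen*", "test", "evaluat*"],
    "modality": ["telephon*", "phone", "telephone-based", "phone-based",
                 "remote", "videoconferenc*", "webcam", "telehealth",
                 "tele-health", "telemedic*", "web", "self-administered",
                 "online"],
}


def build_query(blocks):
    """Join terms with OR inside each block and join the blocks with AND."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in blocks.values()]
    return " AND ".join(groups)


query = build_query(BLOCKS)
print(query)
```

Keeping the blocks in a single structure makes it easier to reuse the same terms across databases and to document later amendments to the strategy.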

Both observational studies (cross-sectional and longitudinal) and case report/series performed in Italy and quantitatively assessing, in Italian, cognition and/or behaviour in both healthy individuals and patients affected by different medical conditions were considered for eligibility. For a study to be included, it had to report telephone-, videoconference- or web-based NPs assessment tools being remotely administered with or without supervision. Abstracts, reviews, meta-analyses, research protocols, qualitative studies and opinion papers were excluded. Studies addressing Italian individuals that were nonetheless comprised within a sample including participants from other countries were also excluded in order to avoid cross-cultural biases.

Formal quality assessment was performed by two independent raters (C.S. and D.L.) by means of the Standard Quality Assessment Criteria (SQAC) [17] and the Study Quality Assessment Tools (SQAT, https://www.nhlbi.nih.gov/health-topics/study-quality-assessment-tools). Appropriate SQAT scales were adopted based on the design of each study (e.g., cohort vs. case-control). Disagreements were resolved via discussion with a third independent rater (A.EN.). Non-applicable items were removed from both SQAC (range = 0-20) and SQAT (cross-sectional studies: range = 0-12; case-control studies: range = 0-8) scales.

The study selection process is shown in Fig. 1.

The search yielded N = 895 potentially relevant articles. After duplicate removal, N = 491 papers were available for screening. Both screening (by assessing titles and abstracts against exclusion criteria) and eligibility (by reading full-texts which passed the screening to determine whether they actually met inclusion criteria) stages were performed independently by three of the Authors (D.L., F.G., D.C.) blinded to each other's decisions via Rayyan (https://rayyan.qcri.org/welcome). Disagreements were resolved by a fourth independent rater (A.EN.). Among the initial results, 42 contributions were identified through first-level searches and their full-texts were accessed. A total of N = 28 were then excluded (criteria reported in Fig. 1). A total of N = 14 studies were included in this review. Data extraction was performed by four independent Authors (N.A., C.A., D.C., and C.S.), whereas a fifth independent author (A.EN.) checked the extracted data and resolved disagreements. The following outcomes were extracted from selected studies: authors and year; number of participants; age, education, and sex of participants (if patients were present); main features of the disease; presence and type of a control group; cohort- vs. population-based nature of the study; theoretical (i.e., standardization) vs. applied (e.g., clinical usability) nature of the study; modality of assessment; cognitive domains or behavioural aspects assessed; tests adopted; first- (i.e., screening) vs. second-level (i.e., domain-specific) assessment; investigated psychometric properties; and possible comparisons between patients' and healthy individuals' scores.
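The selection counts reported above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the number of duplicates and the number of records excluded at title/abstract screening are derived here as differences and are not stated in the text, and the derivation assumes that no records entered the flow after duplicate removal (Fig. 1 remains the reference for the actual flow).

```python
# Counts reported in the text.
identified = 895          # records retrieved from the four databases
after_duplicates = 491    # records available for screening
full_text_assessed = 42   # full-texts accessed
full_text_excluded = 28   # full-texts excluded

# Derived values (assumptions of this sketch, not figures given by the authors).
duplicates_removed = identified - after_duplicates
excluded_at_screening = after_duplicates - full_text_assessed
included = full_text_assessed - full_text_excluded

flow = {
    "identified": identified,
    "duplicates removed (derived)": duplicates_removed,
    "screened": after_duplicates,
    "excluded at screening (derived)": excluded_at_screening,
    "full-texts assessed": full_text_assessed,
    "full-texts excluded": full_text_excluded,
    "included": included,
}
for stage, n in flow.items():
    print(f"{stage}: {n}")
```

The final count agrees with the N = 14 studies included in the review.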

Mean SQAC score was 17.93 ± 1.69 (15-20), whereas mean SQAT score was 5.63 ± 1.3 (4-8) for cohort and 4.67 ± 1.03 (3-6) for case-control studies. Results were divided into two sections according to the nature of the study: (a) studies mainly focused on psychometric properties; (b) predominantly applied investigations (e.g., mostly focused on clinical usability). For a study to be included in category (a), at least one statistical feature had to be assessed among validity, reliability, sensitivity, or specificity. Studies included healthy participants (HPs) and/or patients suffering from medical conditions possibly affecting NPs functioning. Notably, no studies performed from the onset of the COVID-19 outbreak fell under category (a), whilst post-COVID studies were by far the most represented type in category (b).

Below we provide a narrative, qualitative synthesis of findings aimed at offering relevant insights for clinical usability and future research, whereas a detailed summary of key points for each record is provided in Tables 1 and 2.

The results of the studies that mainly focused on psychometric properties are summarized in Table 1. In a pioneering study, De Leo et al. [22] administered an ad hoc, telephone-based, 31-item questionnaire assessing functional outcomes (encompassing cognition and mood) to 574 elderly HPs from Veneto (Northeastern Italy). They prospectively explored the impact of tele-monitoring on health parameters of elderly participants who were either used or new to these services. Cognition was assessed by 5 items derived from the Mini-Mental State Examination (MMSE) [32], whereas depression levels were assessed by 5 items from the Self-Rating Depression Scale [33]. Cronbach's α was high for items evaluating both cognition and mood (.91 and .89, respectively). At baseline, "old users" reported better scores than "new users" on both cognitive and mood items, independently of education. Overall findings suggested that tele-monitoring was useful for reducing possibly superfluous access to healthcare facilities by elderly people.
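Since internal consistency is the most frequently reported statistic across the included studies, a minimal sketch of how Cronbach's α is computed from item-level scores may be useful. It uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores); the function name `cronbach_alpha` and the toy response matrix are illustrative assumptions and do not reproduce data from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    items = list(zip(*scores))                        # one tuple per item
    item_var = sum(variance(col) for col in items)    # sum of item variances
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical pass/fail responses of six respondents to five items.
toy = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 2))  # about 0.84 for this toy matrix
```

Sample variances (n − 1 denominator) are used for both the items and the total score, so the ratio is unaffected by that choice as long as it is applied consistently.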

In two different studies, performance on the in-person MMSE and the Italian telephone-based MMSE (Itel-MMSE) was compared both in 104 cognitively impaired patients (CI; different aetiologies) [18] and in 107 HPs [19]. The Itel-MMSE ranges from 0 to 22 as it lacks items relying on visual processing and includes a single naming-to-description task. Significant positive correlations were found between the two modalities in both populations (r = .85 and .26, respectively) [18, 19]. The Itel-MMSE showed both high inter-rater (r = .82-.9) and high test-retest reliability (r = .9-.95) in patients with CI, also proving to be sensitive to CI severity [18].

Vanacore et al. [19] further explored the association between the Itel-MMSE and a wider set of in-person, standard NPs tests. The Itel-MMSE showed acceptable internal consistency (Cronbach's α = .37, p < .001) and correlated significantly with the in-person MMSE, age (r = .2), education (r = .29), as well as with independent constructional praxis (r = .24) and attention tests (r = .54). Taking Equivalent Scores equal to 1 or 2 as the gold standard for poor cognitive functioning [34], the optimal cut-off yielded a sensitivity of 75%.

The usability of an Italian version of the Telephone Interview for Cognitive Status (I-TICS) was investigated by Dal Forno et al. [20] in 45 patients with probable AD and in 64 HPs. The TICS [35] is a 41-item screening test covering spatio-temporal orientation, language (lexical retrieval, repetition, comprehension), semantics, short-term memory, attention, and executive functioning (working memory and abstraction). The I-TICS showed good internal consistency (Cronbach's α = .91) and strongly correlated with MMSE scores in both groups (r = .90). At the optimal cut-off, high levels of specificity (86%) and sensitivity (84%) were detected (when addressing AD diagnostic criteria as the gold standard) [36] . Moreover, substantial agreement was found between I-TICS and MMSE cut-offs. In a sub-sample of patients, this screening proved to be sensitive to CI involution over time, whereas only moderate test-retest reliability was detected in HPs.
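To illustrate how the sensitivity and specificity reported for the I-TICS translate into other diagnostic metrics, the sketch below derives likelihood ratios and predictive values. The function name `diagnostic_metrics` is hypothetical, and the prevalence is an assumption of this sketch: it is taken from the sample composition (45 patients out of 109 participants), which does not reflect the base rate in any real-world screening setting.

```python
def diagnostic_metrics(sensitivity, specificity, prevalence):
    """Likelihood ratios and predictive values from sensitivity/specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    # Expected proportions of each outcome in the tested population.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return {
        "LR+": lr_pos,
        "LR-": lr_neg,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Sensitivity and specificity as reported for the I-TICS optimal cut-off;
# the prevalence is the sample proportion of patients (an assumption).
metrics = diagnostic_metrics(0.84, 0.86, 45 / (45 + 64))
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because predictive values depend on prevalence, the same cut-off would yield a lower positive predictive value in a population where cognitive impairment is rarer than in this sample.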

Psychometric properties and feasibility of a videoconference-based MMSE (VMMSE) were investigated in two studies [21, 23] . Timpano et al.'s [21] VMMSE included 28 out of the 30 original items; constructional praxis and writing tasks were excluded due to difficulties in visually assessing participants' performances. Moreover, naming stimuli were substituted by line-drawings and parallel At the optimal cut-off, the VMMSE yielded a classification accuracy of .96 (when tested against the MMSE). The authors also provided several epidemiological statistics of interest (see Table 1 ).

Carotenuto et al. [23] longitudinally administered the MMSE and the Alzheimer's Disease Assessment Scale-Cognitive Subscale (ADAS-Cog) [37] both face-to-face and via videoconference in 28 AD patients with graded severity of CI. Assessment was repeated at 6, 12, 18 and 24 months. With respect to both mildly and moderately impaired patients, no significant between-modality differences were detected in either scores or administration times. However, both videoconference-administered Moreover, an ad hoc questionnaire investigating the acceptance of remote assessment yielded moderately high satisfaction levels in both patients and caregivers.

Within a study on the comparison between the online and in-person versions of a self-report questionnaire assessing empathy (Questionnaire of Cognitive and Affective Empathy, QCAE) [38], Di Girolamo et al. [24] administered to 285 HPs the web-based Italian versions of the Toronto Alexithymia Scale-20 (TAS-20) [39, 40] and of the Reading the Mind in the Eyes-Test (RME-T) [41-43] as correlational measures of social cognition. The QCAE was partially related to TAS-20 scores (r = |.27|), whereas no associations were found with the RME-T. Internal consistency was high for the TAS-20 and low for the RME-T (Cronbach's α = .85 and .32, respectively).

Lassandro et al. [25] administered a web version of the Paediatric Quality of Life Inventory Multidimensional Fatigue Scale (PedsQL MFS) [44, 45] to a sample of 191 paediatric patients (mean age 11 years) with chronic immune thrombocytopenia (ITP) and to 248 caregivers. The PedsQL MFS is a self- and parent-report tool investigating fatigue along different dimensions, including cognition. Both the patient- and parent-report versions revealed high internal consistency (Cronbach's α = .89). No significant differences emerged in fatigue perception between patients and their caregivers. Results were compared to those of HPs who had completed the PedsQL MFS in its standard version, with ITP patients showing a greater perception of fatigue than HPs.

The results of the predominantly applied investigations are summarized in Table 2. Simeon et al. [26] administered over the telephone a modified version of the TICS (TICS-m) [46] to 1514 women (71.1 years on average) living in Southern Italy (Naples). They investigated the association between cognition and possible risk factors within retrospectively collected data on dietary habits. The TICS-m ranges from 0 to 39, as it adds a long-term memory task while dropping six items that proved to have poor discriminative capability (assessing personal orientation, naming, repetition, comprehension and abstraction) [46]. Within multiple prediction models, age, body mass index and glycaemic load negatively affected cognition, whereas education had a positive influence.

Costabile et al. [28] developed an online survey to investigate changes in functional outcomes during the first COVID-19 lockdown in patients affected by multiple sclerosis (MS) compared to HPs. Cognition and mood were assessed via the relevant sub-scales of the Quality of Life in Neurological Disorders (Neuro-QoL) [47]; Raven-like matrices were also administered to evaluate abstract reasoning. MS patients (N = 497) scored lower than HPs (N = 348) on Neuro-QoL cognitive items, whereas the two groups performed comparably on progressive matrices. Moreover, depression levels were higher in MS patients than in HPs. Cognitive dysfunction proved to be associated with depressed mood, as well as to negatively affect patients' global functional outcome.

Within a pilot study on the usability of a smartphone app to monitor motor and non-motor symptoms of Parkinson's disease (PD) in 54 patients, Motolese et al. [27] administered by telephone the Non-Motor Symptoms Questionnaire (NMSQ; a patient-report tool exploring a wide range of manifestations beyond extrapyramidal ones, e.g., cognition- and mood-related) [48], the Unified Parkinson's Disease Rating Scale (UPDRS-I, -II, and -IV; the most widespread PD-specific functional scale exploring both motor and non-motor manifestations) [49], the Geriatric Depression Scale short form (GDSsf; a renowned measure of depressed mood among geriatric populations) [50, 51] and the Parkinson's Disease Questionnaire-8 (PDQ-8; a PD-specific measure of QoL) [52]. However, no results were discussed by the authors about the remote usability of such instruments.

The Geriatric Depression Scale-5 (GDS-5) [51, 53] was also administered during the first COVID-19 wave via telephone by Carlos et al. [31] to 204 elderly patients with different degrees of CI, along with an ad hoc questionnaire investigating psychosocial changes in relation to cognitive status, mood and presence of subjective memory complaints. Overall, patients with lower levels of cognitive functioning more frequently reported depressive symptoms and more severe memory complaints.

Within an online survey aimed at exploring the association between insomnia and psychological outcomes during the first COVID-19 lockdown, Bacaro et al. [29] administered the Hospital Anxiety and Depression Scale [54, 55] to 1989 healthy adults, reporting a relation between insomnia severity and both anxiety (HADS-A) and depression (HADS-D) levels.

Lastly, Rainero et al. [30] administered a telephone-based survey to more than 4000 caregivers of patients with AD, Lewy body, frontotemporal and vascular dementia (LBD, FTD and VaD) to explore patients' clinical alterations as well as caregivers' burden during the first lockdown. Within the survey, the Clinical Dementia Rating (CDR) [56] was administered along with an ad hoc questionnaire exploring changes in perception, attention, language, memory and behaviour. Worsening of cognitive and behavioural symptoms was observed in >50% of patients (especially those affected by LBD and AD). Memory deficits and disorientation were the most frequently reported in LBD and AD patients, whereas FTD patients predominantly showed language impairment. As for behavioural symptoms, irritability, apathy, agitation and sleep disturbances were highly represented, with about 25% of patients displaying novel symptoms during quarantine. A higher level of awareness of the quarantine situation proved to be protective against NPs dysfunctions, whereas previous physical independence was found to be a risk factor. Increasing levels of anxiety, depression, irritability and distress were reported by caregivers, 80% of whom showed appreciation towards telemedicine support.

The present work provides Italian practitioners with synoptic evidence regarding the feasibility and psychometric properties of t-NPs screening instruments currently available in Italy for both clinical and experimental use.

Although the equivalence of remote vs. in-person administration cannot be aprioristically assumed [57], our review is suggestive of the validity, reliability and usability of telephone-, videoconference- and web-based brief screening instruments for cognitive and/or behavioural impairment in the Italian population. This finding is in agreement with previous international synoptic contributions on the validity, reliability and diagnostic quality of t-NPs screening instruments [1, 3, 11, 12]. Overall, remotely administered instruments have shown moderate-to-high internal consistency and both construct and criterion/ecological validity, with respect to both standardized (e.g., [18]) and ad hoc, semi-structured [22] tools.

Positive, albeit preliminary, information on the convergence between remote and in-person administrations has been reported for the MMSE, the most widely used cognitive screening for dementia worldwide [58]. Indeed, although the Itel-MMSE has been shown to be predictive of the in-person MMSE [18], such evidence of criterion validity should be interpreted with caution due to the difference between the two ranges (0-22 and 0-30, respectively).

Similarly, Timpano et al.'s [21] investigation on the VMMSE lacks information regarding its association with the MMSE and describes an instrument whose items do not completely overlap with those of the MMSE (both quantitatively and qualitatively). Moreover, it should be taken into account that the full-range videoconference-based MMSE happened to overestimate the degree of CI in patients suffering from dementia [23], which possibly suggests that evaluations through videoconference might introduce further systematic error variance when severely impaired patients are considered.

As for the statistical properties of remote MMSEs that have been tested within the included studies, both their telephone- and videoconference-based formats proved to be reliable and sensitive to changes in cognition over time. Furthermore, based on available evidence, both sensitivity and specificity proved to be higher when the MMSE is administered through videoconference (87% and 97%, respectively) [21] vs. the telephone [19]. This last finding might be due to the diagnostic relevance of those items removed from the Itel-MMSE (ranging 0-22), which, by contrast, are included within the VMMSE (ranging 0-28).

The TICS proved to be a promising screening tool, as it was shown to be valid, reliable, moderately sensitive and specific, as well as feasible in both HPs and patients [20, 26]. However, both the paucity of studies adopting it and the existing differences between the administered protocols (I-TICS vs. TICS-m) make it necessary to collect more systematic evidence.

Remarkably, normative cut-off values for some telephone- and/or videoconference-based cognitive screenings (i.e., Itel-MMSE, VMMSE and I-TICS) have been provided. However, the statistical approach that was applied to derive these cut-offs should lead practitioners to exert caution when using them in clinical contexts. Indeed, although relatively adequate sample sizes have been adopted, such normative values were derived without either adjusting for confounding predictors (e.g., age, education and sex via regression-based methods) or controlling for inferential errors when judging a given performance as impaired or not [34].

This work also provides emerging evidence regarding the feasibility of remotely administered clinical scales assessing mood and other significant symptoms and/or QoL in everyday contexts. So far, the self-report modality adopted by these tests, in combination with the complexity of the target constructs (which may exceed that of cognitive ones, as they are influenced by several psycho-social variables), makes any comparison with the evidence herewith reported on the psychometric quality of cognitive tests challenging. Future research is therefore recommended to explore the psychometric properties of mood/quality of life scales via modalities that highly diverge from face-to-face ones, by specifically focussing on whether they are equivalently valid when administered in settings that lack face-to-face interpersonal dynamics between patients and practitioners.

From the clinical point of view, two main topics need to be addressed. First, only specific clinical groups have been considered in the included works (i.e., only neurodegenerative conditions), whereas the generalizability of remote assessment tools to different neurological/neuropsychiatric populations has still to be tested -given the heterogeneity of NPs profiles across disorders. Second, while more attention has been paid to cognitive screeners, further investigation is needed to explore the feasibility and psychometric properties of domain-specific assessment tools. This seems even more the case for cognitive functions that either strongly rely on perceptual elements for their assessment or necessarily need the presence of a clinician for scoring the outcome of the test (e.g., visuo-spatial abilities and language, respectively).

A proof-of-concept contribution to both the aforementioned instances has been provided, for the English language, by De Witte et al. [59], who developed a telephone-based screening for language deficits (domain-specific) for remotely monitoring patients who underwent surgical treatment of brain neoplasms (disease-specific). Moreover, a fair number of domain-specific tests are validated for web or videoconference administration for the English language [1, 11].

It is also worth bearing in mind that although the pandemic is an unprecedented accelerator for the implementation of t-NPs methods, their use should not be limited to such a contingency. Indeed, remote evaluations can help level out geographical differences in accessing NPs diagnostics (e.g., for people living in rural areas and underserved populations) [60, 61], as well as logistic difficulties faced by patients with motor disorders [57], at both baseline and (even more so) follow-up. In this regard, t-NPs tools may come in handy not only for neurological populations, but also for patients affected by any internal medical condition that might likewise affect cognition/behaviour [62].

t-NPs tools might also help circumvent those limitations of classical tests which are due to the paper and pencil format. For instance, the brief presentation time allowed by computer-based assessment results in a more sensitive detection of subtle visuo-spatial deficits when compared to standard tests with unlimited presentation time [63] . t-NPs might also improve accessibility allowing repeated testing to monitor the progression of a disease or the recovery after acute events.

From an experimental viewpoint, t-NPs assessment can help implement large-scale, population-based epidemiological investigations on cognitive/behavioural disorders [64], while also opening up to prevention campaigns (e.g., as far as pathological ageing is concerned) [65] and to an easier implementation of both baseline and follow-up NPs assessment during decentralized clinical trials (e.g., aimed at testing the efficacy of pharmacotherapies for dementing illnesses) [64].

Finally, for t-NPs practice to take hold in Italy, formal and widespread acknowledgment from the healthcare system would be needed towards remotely-delivered services -the recognition of which has to this day been limited to the current pandemic.

Our findings suggest that t-NPs assessment approaches are far from being fully developed in Italy -as for instance shown by the relatively low diffusion of web-based instruments and the lack of studies on remotely administered domain-specific tests that would allow performing a comprehensive and multidimensional assessment.

Indeed, only one of the included studies has taken into consideration domain-specific instruments, which assess alexithymia and theory of mind [24].

In respect to such a predominant trend towards adopting telephone-based screening tools, it should be noted that current national address lines somehow legitimate this circumscribed application of t-NPs. Indeed, this document 1 hints at the fact that longer, in-depth assessments (possibly mediated by videoconference) may come with the risk of appearing to patients as devaluing their interpersonal relationship with practitioners. At the same time, higher endorsement can be traced 1 towards brief evaluations (≤15′) -which are believed to "bridge the gap" and promote continuity within the care management, this in turn positively impacting on patients' perception of their relational dynamics with clinicians.

While only a minority of studies have adopted either videoconference- or web-based channels, the most frequent method for remote administration was the telephone. In this respect, it is worth highlighting that although different settings are included under the same "tele-" prefix, each of them is characterized by specific psychometric features, which define their benefits as well as their limitations [66]. Future studies may thus focus on better profiling the idiosyncrasies of each t-NPs testing modality (telephone-, videoconference- or web-based).

Aiming to stimulate a growing scientific debate, this review stands as a starting point for researchers and clinicians towards standardizing t-NPs tools for Italian practitioners' toolbox by addressing methodological issues specific to these media of assessment.

More specifically, we encourage researchers devoted to the development/standardization of t-NPs assessment tools to consider the following aspects. First, equivalence between scores yielded by remote and in-person formats of, or proxy measures for, a given instrument should be assessed (e.g., via the equivalence testing procedures proposed by Lakens [67]). Second, validity testing should regard both in-person and independent remote measures as either correlational measures (when testing construct validity) or outcome variables (when testing criterion validity). In addition, particular attention should be paid to inter-rater reliability, since remote modalities of administration might suffer more than in-person ones from across-practitioner discrepancies in delivering instructions and/or in scoring procedures [68]. A comprehensive examination of diagnostic properties should also be carried out, covering not only sensitivity and specificity but also their derived metrics (e.g., positive and negative predictive values and likelihood ratios) [69]. Moreover, thresholds for significant changes in cognition over time that also take into account practice effects should be identified through ad hoc statistical methods (e.g., the Reliable Change Index) in order to improve the longitudinal applicability of t-NPs tools [70]. Finally, regression-based norms that also control for inferential errors when deriving cut-off values should be provided for t-NPs tests, in line with the current neuropsychometric methodology adopted for Italian paper-and-pencil tests [34].
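
As an illustration of the Reliable Change Index mentioned above, the following minimal sketch may be useful. It assumes the Jacobson and Truax standard error of the difference combined with a mean practice-effect correction; this variant is chosen here for illustration only and is not prescribed by the present review. The function name, parameter names and numerical values are hypothetical.

```python
import math


def reliable_change_index(baseline, retest, sd_baseline, test_retest_r,
                          mean_practice_gain=0.0):
    """Practice-adjusted Reliable Change Index (illustrative sketch).

    sd_baseline and test_retest_r are assumed to come from a normative
    test-retest sample; mean_practice_gain is the mean retest-minus-baseline
    change observed in that same sample.
    """
    # Standard error of measurement of a single score.
    sem = sd_baseline * math.sqrt(1.0 - test_retest_r)
    # Standard error of the difference between two scores.
    se_diff = math.sqrt(2.0) * sem
    # Observed change, corrected for the expected practice effect.
    return ((retest - baseline) - mean_practice_gain) / se_diff


# Hypothetical telephone-based screening scores for one patient.
rci = reliable_change_index(baseline=27, retest=23, sd_baseline=2.0,
                            test_retest_r=0.80, mean_practice_gain=0.5)
print(round(rci, 2))       # -3.56
print(abs(rci) > 1.96)     # True: change exceeds the two-tailed 5% threshold
```

Under these assumptions, an absolute value above 1.96 would flag a change unlikely to be explained by measurement error and practice alone.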

With respect to the abovementioned methodological-statistical aspects, the development and standardization of Italian t-NPs tools should address those issues that have also been highlighted within the international literature [68, 71, 72], which reveals the need for (1) stronger validity evidence (as validity appears to be often neglected, unlike reliability, which is more often examined); (2) "cross-modal" psychometric investigations (e.g., comparing in-person to remote assessments); (3) equivalence/invariance testing (especially when a tool happens to be adapted from the paper-and-pencil format); (4) item-level examinations of those tasks which entail specific issues when remotely delivered (e.g., those requiring motor skills).
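
The cross-modal equivalence testing called for here can be sketched as a paired two one-sided tests (TOST) procedure. This is a minimal illustration, not the procedure of any included study: it assumes each participant is scored both in person and remotely, it uses a large-sample normal approximation instead of the t distribution (which is anti-conservative in small samples), and the equivalence margin, function name and scores are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_tost(in_person, remote, margin, alpha=0.05):
    """Paired TOST with a normal approximation (illustrative sketch).

    margin is the smallest raw-score difference considered clinically
    relevant; it must be chosen a priori by the researcher.
    """
    if len(in_person) != len(remote):
        raise ValueError("each participant needs both scores")
    diffs = [r - p for p, r in zip(in_person, remote)]
    if all(d == diffs[0] for d in diffs):
        raise ValueError("differences have zero variance")
    mean_diff = mean(diffs)
    se = stdev(diffs) / math.sqrt(len(diffs))
    std_normal = NormalDist()
    # One-sided test against the lower bound (H0: difference <= -margin).
    p_lower = 1.0 - std_normal.cdf((mean_diff + margin) / se)
    # One-sided test against the upper bound (H0: difference >= +margin).
    p_upper = std_normal.cdf((mean_diff - margin) / se)
    # Equivalence is claimed only if both one-sided tests reject.
    p_tost = max(p_lower, p_upper)
    return mean_diff, p_tost, p_tost < alpha


# Hypothetical screening scores for ten participants.
in_person = [27, 28, 25, 30, 26, 29, 24, 28, 27, 26]
remote = [26, 28, 26, 29, 26, 29, 25, 27, 27, 27]
mean_diff, p_value, equivalent = paired_tost(in_person, remote, margin=1.0)
print(mean_diff, equivalent)
```

A non-significant difference test alone would not support equivalence; the TOST logic instead requires the mean difference to be significantly inside both bounds.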

It follows that when it comes to adopting a given t-NPs tool, practitioners are advised to take into account (1) evidence on its validity, especially construct validity against both in-person and remotely administered measures; (2) evidence on its reliability, with a focus on inter-rater and test-retest reliability; (3) whether norms have been derived specifically for its remote administration; (4) evidence on its basic diagnostic properties (e.g., sensitivity, specificity and possibly derived metrics such as positive and negative predictive values and likelihood ratios); (5) evidence on its clinical usability in target populations [68, 71–73].
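
The diagnostic properties listed under point (4) all derive from a 2×2 table crossing the test outcome with the reference diagnosis. The sketch below shows how they relate; the function name and the counts are hypothetical and do not come from any of the included studies.

```python
import math


def diagnostic_metrics(tp, fp, fn, tn):
    """Diagnostic metrics from a 2x2 table (illustrative sketch).

    tp/fn: impaired patients scoring below/above the cut-off;
    fp/tn: unimpaired participants scoring below/above the cut-off.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Predictive values depend on the prevalence within the sample.
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    # Likelihood ratios; infinite when the denominator is zero.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else math.inf
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else math.inf
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts: 50 impaired patients and 100 unimpaired participants.
metrics = diagnostic_metrics(tp=45, fp=10, fn=5, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because predictive values shift with prevalence, values obtained in a balanced validation sample would not transfer directly to a low-prevalence screening setting, whereas likelihood ratios are less affected.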

Finally, the rate of acceptability and the feasibility of a given t-NPs tool with specific regard to its administration modality (e.g., telephone-based vs. videoconference-based) and target populations (e.g., "older" vs. "younger" elderly individuals; high vs. low digital literacy) should represent a further critical point both when developing/standardizing it and when choosing to adopt it [6, 74].

In conclusion, Italian t-NPs screening tools appear to be promising as far as both feasibility and psychometric goodness are concerned. Our work is intended to represent a first step towards scientific, evidence-based recognition of t-NPs practice within current national guidelines. However, further studies have to be carried out by Italian researchers in order to examine the feasibility and psychometric properties of domain-specific t-NPs tools. Future explorations might also clarify, within the Italian context, the relevance of t-NPs to clinical practice beyond the current pandemic. So far, as the available instruments that have been investigated in the Italian context are still very limited in number and lack clear clinical applicability, this review stands as a starting point to promote further research on the potential of t-NPs in Italy.

Validity of teleneuropsychology for older adults in response to COVID-19: a systematic and critical review

Randomized controlled clinical trial of "virtual house calls" for Parkinson disease

Accuracy of telephone-based cognitive screening tests: systematic review and meta-analysis

Dementia care and COVID-19 pandemic: a necessary digital revolution

A survey of international clinical teleneuropsychology service provision prior to and in the context of COVID-19

Inter organizational practice committee recommendations/ guidance for teleneuropsychology in response to the COVID-19 pandemic

Transitioning to telehealth neuropsychology service: considerations across adult and pediatric care settings

Telemedicine and the evaluation of cognitive impairment: the additive value of neuropsychological assessment

Feasibility and psychometric integrity of mobile phone-based intensive measurement of cognition in older adults

Neuropsychology in the times of COVID-19. The role of the psychologist in taking charge of patients with alterations of cognitive functions

Neuropsychological test administration by videoconference: a systematic review and meta-analysis

Telephone-based screening tools for mild cognitive impairment and dementia in aging studies: a review of validated instruments

Initial validation of a web-based self-administered neuropsychological test battery for older adults and seniors

Initial assessment of reliability of a self-administered web-based neuropsychological test battery

Norms Selection in Neuropsychological Assessment

The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration

Standard quality assessment criteria for evaluating primary research papers from a variety of fields. Edmonton: Alberta Heritage Foundation for

The ITEL-MMSE: an Italian telephone version of the Mini-mental state examination

Validity of the Italian telephone version of the Mini-mental state examination in the elderly healthy population

Use of an Italian version of the telephone interview for cognitive status in Alzheimer's disease

Videoconference-based Mini mental state examination: a validation study

Assessment of quality of life in the elderly assisted at home through a tele-check service

Cognitive assessment of patients with Alzheimer's disease by telemedicine: pilot study

The questionnaire of cognitive and affective empathy: a comparison between paper-and-pencil versus online formats in Italian samples

Fatigue perception in a cohort of children with chronic immune thrombocytopenia and their caregivers using the PedsQL MFS: real-life multicenter experience of the Italian Association of

Dietary glycemic load and risk of cognitive impairment in women: findings from the EPIC-Naples cohort

Parkinson's disease remote patient monitoring during the COVID-19 lockdown

COVID-19 pandemic and mental distress in multiple sclerosis: implications for clinical management

Insomnia in the Italian population during COVID-19 outbreak: a snapshot on one major risk factor for depression and anxiety

The impact of COVID-19 quarantine on patients with dementia and family caregivers: a nation-wide survey

Life during COVID-19 lockdown in Italy: the influence of cognitive state on psychosocial, behavioral and lifestyle profiles of older adults

Mini-mental state: a practical method for grading the cognitive state of patients for the clinician

A self-rating depression scale

Outer and inner tolerance limits: their usefulness for the construction of norms and the standardization of neuropsychological tests

The telephone interview for cognitive status

Clinical diagnosis of Alzheimer's disease: report of the NINCDS-ADRDA work group under the auspices of Department of Health and Human Services Task Force on Alzheimer's disease

A new rating scale for Alzheimer's disease

The QCAE: a questionnaire of cognitive and affective empathy

The twenty-item Toronto alexithymia scale-I. item selection and cross-validation of the factor structure

Cross validation of the factor structure of the 20-item Toronto alexithymia scale: an Italian multicenter study

Another advanced test of theory of mind: evidence from very high functioning adults with autism or Asperger syndrome

The "Reading the mind in the eyes" test revised version: a study with normal adults, and adults with Asperger syndrome or high-functioning autism

The "Reading the mind in the eyes" test: systematic review of psychometric properties and a validation study in Italy

The PedsQL in pediatric cancer: reliability and validity of the pediatric quality of life inventory generic Core scales, multidimensional fatigue scale, and Cancer module

The PedsQL multidimensional fatigue scale in pediatric rheumatology: reliability and validity

Hereditary influences on cognitive functioning in older men: a study of 4000 twin pairs

Neuro-QOL: brief measures of health-related quality of life for clinical research in neurology

International multicenter pilot study of the first comprehensive self-completed nonmotor symptoms questionnaire for Parkinson's disease: the NMSQuest study

Movement Disorder Society-sponsored revision of the unified Parkinson's disease rating scale (MDS-UPDRS): scale presentation and clinimetric testing results

Geriatric depression scale (GDS): recent evidence and development of a shorter version

Geriatric depression scale

Cross-cultural evaluation of the short form 8-item Parkinson's disease questionnaire (PDQ-8): results from America

Validation of the five-item geriatric depression scale in elderly subjects in three different settings

Detecting psychological distress in cancer patients: validity of the Italian version of the hospital anxiety and depression scale

The hospital anxiety and depression scale

Clinical dementia rating training and reliability in multicentre studies: the Alzheimer's disease cooperative study experience

Telephone based cognitive-behavioral screening for frontotemporal changes in patients with amyotrophic lateral sclerosis (ALS)

A meta-analysis of the accuracy of the Mini-Mental State Examination in the detection of dementia and mild cognitive impairment

A valid alternative for in-person language assessments in brain tumor patients: feasibility and validity measures of the new TeleLanguage test

Telephone screening to identify potential dementia cases in a population-based sample of older adults

Remote neuropsychological assessment in rural American Indians with and without cognitive impairment

Validation of a brief telephone battery for neurocognitive assessment of patients with pulmonary arterial hypertension

Neglect and extinction depend greatly on task demands: a review

A critical review of the use of telephone tests to identify cognitive impairment in epidemiology and clinical research

Validation of multi-stage telephone-based identification of cognitive impairment and dementia

Construct validity, ecological validity and acceptance of self-administered online neuropsychological assessment in adults

Equivalence tests: a practical primer for t tests, correlations, and meta-analyses

Reliability of telephone and videoconference methods of cognitive assessment in older adults with and without dementia

A review of sensitivity, specificity, and likelihood ratios: evaluating the utility of the electrocardiogram as a screening tool in hypertrophic cardiomyopathy

A test for the assessment of pragmatic abilities and cognitive substrates (APACS): normative data and psychometric properties

Distance assessment for detecting cognitive impairment in older adults: a systematic review of psychometric evidence

Are we measuring the same thing? Psychometric and research considerations when adopting new testing modes in the time of COVID-19

Psychometrics and diagnostics of Italian cognitive screening test: a systematic review

Teleneuropsychology clinic development and patient satisfaction


Elia Zanin 1 · Edoardo Nicolò Aiello 2,3 · Lorenzo Diana 2,3 · Giulia Fusi 4 · Mario Bonato 5 · Aida Niang 6 · Francesca Ognibene 7 · Alessia Corvaglia 8 · Carmen De Caro 7 · Simona Cintoli 9 · Giulia Marchetti 5 · Alec Vestri 10 · for the Italian working group on tele-neuropsychology (TELA)