key: cord-0025810-gxwtiwbe
authors: Pogam, Marie-Annick Le; Seematter-Bagnoud, Laurence; Niemi, Tapio; Assouline, Dan; Gross, Nathan; Trächsel, Bastien; Rousson, Valentin; Peytremann-Bridevaux, Isabelle; Burnand, Bernard; Santos-Eggimann, Brigitte
title: Development and validation of a knowledge-based score to predict Fried's frailty phenotype across multiple settings using one-year hospital discharge data: The electronic frailty score
date: 2022-01-10
journal: EClinicalMedicine
DOI: 10.1016/j.eclinm.2021.101260
sha: f8b3b79885bd67ea8994d8d0403d1b25d97d4066
doc_id: 25810
cord_uid: gxwtiwbe

BACKGROUND: Most claims-based frailty instruments have been designed for group stratification of older populations according to the risk of adverse health outcomes and not frailty itself. We aimed to develop and validate a tool based on one-year hospital discharge data for stratification on Fried's frailty phenotype (FP).

METHODS: We used a three-stage development/validation approach. First, we created a clinical knowledge-driven electronic frailty score (eFS) calculated as the number of deficient organs/systems among 18 critical ones identified from the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10) diagnoses coded in the year before FP assessment. Second, for eFS development and internal validation, we linked individual records from the Lc65+ cohort database to inpatient discharge data from Lausanne University Hospital (CHUV) for the period 2004-2015. The development/internal validation sample included community-dwelling, non-institutionalised residents of Lausanne (Switzerland) recruited in the Lc65+ cohort in three waves (2004, 2009, and 2014), aged 65-70 years at enrolment, and hospitalised at the CHUV at least once in the year preceding the FP assessment. Using this sample, we selected the best performing model for predicting the dichotomised FP, with the eFS or ICD-10-based variables as predictors. Third, we conducted an external validation using 2016 Swiss nationwide hospital discharge data and compared the performance of the eFS model in predicting 13 adverse outcomes to three models relying on well-designed and validated claims-based scores (Claims-based Frailty Index, Hospital Frailty Risk Score, Dr Foster Global Frailty Score).

FINDINGS: In the development/internal validation sample (n = 469), 14·3% of participants (n = 67) were frail. Among 34 models tested, the best-subsets logistic regression model with four predictors (age and sex at FP assessment, time since last hospital discharge, eFS) performed best in predicting the dichotomised FP (area under the curve=0·71; F1 score=0·39) and one-year adverse health outcomes. On the external validation sample (n = 54,815; 153 acute care hospitals), the eFS model demonstrated a similar performance to the three other claims-based scoring models. According to the eFS model, the external validation sample showed an estimated prevalence of 56·8% (n = 31,135) of frail older inpatients at admission.

INTERPRETATION: The eFS model is an inexpensive, transportable and valid tool allowing reliable group stratification and individual prioritisation for comprehensive frailty assessment and may be applied to both hospitalised and community-dwelling older adults.

FUNDING: The study received no external funding.

Frailty is a common clinical condition in the older population resulting from a progressive and cumulative loss of physiological reserve in multiple interconnected organs or systems. Combined or not with comorbidity and disability, it increases vulnerability to adverse health outcomes after minor stressors. 1−4 In high-income countries, frailty is estimated to affect 10% of community-dwellers aged 65 years and over (65+) and 25-50% of those aged 85+. 4, 5 In acute care hospitals, the prevalence of frailty in 65+ inpatients is even higher, approaching 50%. 6 Several frailty tools have been developed for clinical assessment in various health or social care settings. 6, 7 However, two complementary instruments based on different theoretical constructs 8 have been extensively validated and are widely used. The Fried frailty phenotype (FP) assesses physical frailty through five criteria: unintentional weight loss; weakness or poor handgrip strength; self-reported exhaustion; slow walking speed; and low physical activity. 1 By contrast, the Rockwood Frailty Index measures the proportion of accumulated deficits in an individual among a list of all potential ones, i.e. symptoms, signs, disabilities, diseases, and laboratory test results. 9 In recent years, claims-based frailty instruments have been designed using clinical knowledge or data-driven approaches for the routine identification of frail cohorts of community-dwellers or inpatients and risk stratification for poor outcomes. 10−13 Despite the well-known limitations of data routinely collected for billing purposes, they provide real-world longitudinal information on patients from various health care settings and geographic locations. 14 However, most instruments have been developed to predict adverse health outcomes, not frailty itself, making them difficult to differentiate from other morbidity or mortality prediction models. 13, 15 In addition, those validated against the Fried or Rockwood reference standards generally demonstrated poor-to-moderate predictive performance. 11, 16 Finally, they often lack transportability, reliability over time or contemporaneity as they rely on billing systems from only a few countries (USA, Canada, UK) and generally dated data. 13 This study aimed to develop and validate an electronic frailty score (eFS) based on recent Swiss hospital discharge data to predict the FP and related adverse health outcomes in the following year. 2 We also compared the performance of the eFS in predicting adverse outcomes in older inpatients to three similar scores derived from Medicare data or UK hospital episode statistics: the Claims-based Frailty Index (CFI); the Hospital Frailty Risk Score (HFRS); and the Dr Foster Global Frailty score (GFS). 10

We used a three-stage, clinical knowledge and data-driven approach to develop and validate the eFS. First, we operationalised Fried's biological concept of frailty, which results from a nonlinear, multisystem, physiological dysregulation, independent of age and concurrent acute diseases. 2, 3 Three expert physicians reviewed the approximately 14,000 codes of the 2012-2014 International Statistical Classification of Diseases and Related Health Problems 10th revision German Modification (ICD-10-GM) and created 18 lists of codes, considered as markers of probable and persistent deficiencies in critical systems or organs (Appendix 1). 2, 17 Lists were not mutually exclusive: diseases or conditions affecting several systems were included in all the corresponding lists. We then calculated the eFS for each participant as the number of deficient organs/systems among the 18 markers using ICD-10-GM diagnoses coded in the previous year (Appendix 1). The eFS is therefore a cumulative score which treats each list of codes/marker equally within the overall score and can range from 0 to 18.
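As an illustration of the counting rule described above, the following Python sketch computes a score of this kind from a list of diagnosis codes. The four marker lists, their code prefixes, and the function name `electronic_frailty_score` are hypothetical placeholders chosen for the example; the actual 18 lists are those of Appendix 1 and are not reproduced here. Matching diagnoses by code prefix is likewise an assumption.

```python
from typing import Dict, Iterable, Set

# Hypothetical marker lists (system -> ICD-10-GM code prefixes, dots removed).
# Illustrative only: the real 18 lists are defined in Appendix 1.
MARKER_LISTS: Dict[str, Set[str]] = {
    "heart": {"I50", "I25"},
    "kidneys": {"N18"},
    "endocrine": {"E10", "E11"},
    # Lists are not mutually exclusive: this prefix also matches "endocrine".
    "nervous": {"G30", "E104"},
}


def electronic_frailty_score(
    codes: Iterable[str],
    marker_lists: Dict[str, Set[str]] = MARKER_LISTS,
) -> int:
    """Count the systems with at least one matching diagnosis code.

    Each system contributes at most one point, whatever the number of
    matching diagnoses, so the score ranges from 0 to len(marker_lists).
    """
    normalised = [code.replace(".", "").upper() for code in codes]
    deficient = 0
    for prefixes in marker_lists.values():
        if any(code.startswith(p) for code in normalised for p in prefixes):
            deficient += 1
    return deficient


# Diagnoses coded in the previous year for one fictitious participant.
score = electronic_frailty_score(["I50.1", "E10.40", "J18.9"])
```

With these made-up lists, the heart failure code counts once for the heart, the diabetes code counts for both the endocrine and nervous lists, and the pneumonia code matches no list.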

Second, we performed predictive modelling of the FP dichotomized as follows: participants were categorised as 'frail' if they met at least three of Fried's five locally-adapted criteria, 18 while robust (no criteria) and pre-frail (one or two criteria) participants were classified as 'non-frail'. Predictive models tested included a common list of candidate predictors, specific predictors, and interaction terms (Appendix 3). The common list of predictors (CL) included age at FP assessment (years, continuous), sex (male/female), cumulative length of stay in acute care (days, quartiles), number of admissions in acute care (1/>1), cumulative time spent in an intensive care unit (hours, 0/>0), presence of at least one emergency admission at CHUV in the year before FP assessment (0/1), and time between last discharge from acute care at CHUV and FP assessment (days: 0-29, 30-89, 90-365). Specific candidate predictors comprised a list of 122 three-character truncated ICD-10-GM diagnostic codes (Diag), based on CHUV inpatient discharge data from the year before FP assessment and concerning ≥ 1% of inpatient stays (i.e., ≥ five stays), the eFS, and the 20 lists of ICD-10-GM diagnostic codes selected to characterise the 18 organ/system deficiencies in the eFS (eFS_20Lists). Thus, we tested four lists of candidate predictors and the interactions between them: (1) CL (P = 7 predictors); (2) CL+Diag (P = 163 predictors); (3) CL+eFS (P = 8 predictors); and (4) CL+eFS_20Lists (P = 27 predictors). This approach allowed us to compare the performance of a predictive model based on a data-driven selection of diagnostic codes (CL+Diag) to those of models based respectively on the cumulative eFS knowledge-driven score (CL+eFS) and an alternative eFS score with a weighting of the different lists of codes composing the eFS (i.e., a mix of knowledge and data-driven approaches). All models underwent internal validation.
Third, as an external validation, we assessed the eFS performance for predicting adverse outcomes in Swiss older inpatients and compared it to the CFI, HFRS, and GFS.
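The dichotomisation of the FP used in the second stage can be written down in a few lines. The sketch below is only illustrative: the criterion keys and function names are our own labels, and the locally adapted measurement of each criterion is not modelled.

```python
from typing import Mapping

# Labels for Fried's five criteria (hypothetical key names).
FRIED_CRITERIA = (
    "weight_loss", "weakness", "exhaustion", "slowness", "low_activity",
)


def fried_category(criteria_met: Mapping[str, bool]) -> str:
    """Return 'robust' (0 criteria), 'pre-frail' (1-2) or 'frail' (>= 3)."""
    n_met = sum(bool(criteria_met.get(name, False)) for name in FRIED_CRITERIA)
    if n_met >= 3:
        return "frail"
    if n_met >= 1:
        return "pre-frail"
    return "robust"


def is_frail(criteria_met: Mapping[str, bool]) -> bool:
    """Binary outcome used for modelling: frail versus non-frail."""
    return fried_category(criteria_met) == "frail"


example = {"weakness": True, "exhaustion": True, "slowness": True}
outcome = is_frail(example)
```

Robust and pre-frail participants both map to the non-frail class, so the binary outcome is positive only when at least three criteria are met.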

Ethical approval was provided by the Cantonal Human Research Ethics Committee (CER-VD reference no. 2016-00900). Written informed consent was obtained from each participant.

Participants were recruited from the Lc65+ cohort, which consists of 4668 individuals comprising three representative samples of approximately 1500 community-dwelling residents of Lausanne (Switzerland) recruited in 2004 (C1), 2009 (C2), and 2014 (C3) and aged 65-70 years at enrolment. 18 Follow-up includes completion of annual self-administered questionnaires on health status and health service utilisation over the past year and triennial face-to-face visits with FP assessment. 18 Our study inclusion criteria included clinical assessment for FP at least once between enrolment and 2015, having spent at least one night in an acute care unit at Lausanne University Hospital (CHUV) within 12 months before FP assessment, and informed consent to link their cohort data to CHUV inpatient discharge data, including re-use of hospital data.

We linked individual records of the Lc65+ cohort database to CHUV inpatient discharge data for the period 2004-2015 using a deterministic record linkage method (i.e. agreement on last name, first name, sex and birth date). 19 To overcome the data cluster structure, we selected a single FP measure per participant. For C1 and C2 participants who underwent four and two FP assessments, respectively, during the 2004-2015 period, we retained the one with the shortest time interval between hospital discharge and FP assessment; C3 participants had only a single measure of FP. Thus, the development/internal validation dataset contained a single FP assessment per participant with their corresponding age and sex at FP assessment, inpatient discharge data for the year before and after FP assessment, time since last discharge from hospital, and date of death if applicable.
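The linkage and the selection of one FP measure per participant can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout (dictionaries with `last_name`, `first_name`, `sex`, `birth_date`, `id`), the function names, and the case and whitespace normalisation of names are hypothetical and not taken from the study protocol.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple


def link_key(record: dict) -> tuple:
    """Deterministic key: exact agreement on name, sex and birth date."""
    return (
        record["last_name"].strip().lower(),   # normalisation is an assumption
        record["first_name"].strip().lower(),
        record["sex"],
        record["birth_date"],
    )


def link_records(cohort: List[dict], stays: List[dict]) -> Dict[str, List[dict]]:
    """Attach to each cohort participant the hospital stays sharing their key."""
    by_key: Dict[tuple, List[dict]] = {}
    for stay in stays:
        by_key.setdefault(link_key(stay), []).append(stay)
    return {person["id"]: by_key.get(link_key(person), []) for person in cohort}


def select_fp_assessment(
    assessments: List[date],
    discharges: List[date],
    window_days: int = 365,
) -> Optional[Tuple[date, int]]:
    """Keep the assessment with the shortest delay since a prior discharge.

    Returns (assessment date, days since last discharge), or None when no
    assessment was preceded by a discharge within the window.
    """
    best: Optional[Tuple[date, int]] = None
    for assessed in assessments:
        gaps = [
            (assessed - d).days
            for d in discharges
            if 0 <= (assessed - d).days <= window_days
        ]
        if gaps and (best is None or min(gaps) < best[1]):
            best = (assessed, min(gaps))
    return best


chosen = select_fp_assessment(
    assessments=[date(2008, 6, 1), date(2011, 6, 1), date(2014, 6, 1)],
    discharges=[date(2008, 1, 10), date(2011, 5, 1)],
)
```

In this fictitious example, the 2011 assessment is retained because it follows a discharge by 31 days, against 143 days for the 2008 assessment, while the 2014 assessment has no discharge in the preceding year.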

To predict the FP, we performed best-subsets (BS-LR) 20 and Lasso-penalized 21 logistic regressions (Lasso-LR) and supervised machine learning (ML) classifiers suitable for small-n-large-p datasets and binary outcomes: random forests and support vector machines. 22, 23 We randomly split the original dataset into a training and a test dataset (75% and 25% of participants, respectively), keeping the prevalence of frailty in both datasets identical to the original one, i.e. approximately 14%. We applied the best-subset strategy to the training dataset to select the model with the lowest Akaike Information Criterion. 20 We also developed each Lasso-LR and ML model on a training dataset using a nested five-fold cross-validation procedure to select the predictors and optimise the hyperparameters based on a grid-search method over a list of possible values. 22 Moderately and highly correlated predictors (i.e. Spearman correlation coefficient ≥ 0.3) were excluded from candidate predictors for all models as none of these models handles multicollinearity well (Appendix 3). Finally, we accounted for a highly imbalanced classification in ML algorithms using weighting and synthetic minority oversampling techniques. 24, 25

For internal validation, we applied each model developed on a training dataset to the corresponding test dataset. We repeated the split-sample procedure 1000 times to produce 1000 training and test datasets, allowing us to examine the effect of sampling on predictor selection and model performance given our small sample size and the low prevalence of frailty. The final performance metrics of each model were derived by averaging those calculated over the test datasets. We selected the best performing model based on the highest average area under the receiver operating characteristic curve (AUC) and F1 score. 22 Further details on the assessment of model performance are given in Appendix 3.
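The repeated split-sample procedure can be sketched with the Python standard library alone. This is an illustration of the general technique, not the authors' code (which relied on bestglm, scikit-learn and imbalanced-learn): the function names and the `fit_predict` callback, which stands in for any model fitted on the training indices and scored on the test indices, are assumptions made for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int], test_fraction: float = 0.25, rng: random.Random = None
) -> Tuple[List[int], List[int]]:
    """Split indices into train/test while preserving class proportions."""
    rng = rng or random.Random()
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a frail case outranks a non-frail one."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def repeated_split_auc(
    labels: Sequence[int],
    fit_predict: Callable[[List[int], List[int]], List[float]],
    n_repeats: int = 1000,
    seed: int = 0,
) -> float:
    """Average the test-set AUC over repeated stratified 75/25 splits."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_repeats):
        train, test = stratified_split(labels, 0.25, rng)
        scores = fit_predict(train, test)  # refit on train, score the test set
        aucs.append(auc([labels[i] for i in test], scores))
    return sum(aucs) / len(aucs)


# Same class sizes as the development sample: 67 frail, 402 non-frail.
labels = [1] * 67 + [0] * 402
train_idx, test_idx = stratified_split(labels, 0.25, random.Random(42))
```

With these class sizes, each test set holds 17 frail and 100 non-frail participants, so the prevalence of frailty stays close to 14% in both partitions.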

We also tested whether the eFS was associated with adverse health outcomes within 12 months after FP assessment: death, a prolonged length of hospital stay (at least one stay ≥8 days), the number of hospitalisations, and admission to a nursing home. We verified that these associations remained after accounting for multimorbidity operationalised by the updated Charlson comorbidity index (CCI). 26 We applied logistic and Poisson regression models for binary and count outcomes, respectively.

To assess the generalisability of the eFS, we tested its performance in predicting adverse health outcomes using a large independent validation sample of 54,815 index stays (one stay per patient) between Jan 1 and Dec 31 2016 in 153 Swiss acute care hospitals, "with patients from a different but plausibly related population". 27 Participant characteristics and selection criteria are detailed in Appendix 4. With age, sex, CCI, and the eFS as predictors, we fitted two-level logistic and Poisson regression models for 11 binary and two count outcomes, respectively; the first level being the patient and the second level the hospital. The 13 predicted adverse health outcomes are detailed in Appendix 4. We excluded inpatients who died during the index stay or within 30 days/one year after discharge from the index stay (n = 4596, 5265 and 10,020, respectively) from the validation dataset when predicting non-terminal outcomes (i.e. admissions or institutionalisation after discharge from index stay) to avoid competing risks with terminal ones (death).

Finally, to compare the eFS performance in predicting the 13 adverse outcomes with the CFI, HFRS and GFS, we adapted these three scores to Swiss hospital discharge data and the ICD-10-GM classification and assessed the predictive performance of models with age, sex and the adapted score as predictors. All models were assessed with and without accounting for the CCI, apart from the GFS model where the CCI is already a variable. Model performance was compared using AUCs and F1 scores for binary outcomes and mean squared prediction errors for count ones. Detailed ICD-10-GM codes of the three mapped scores are shown in Appendix 2.

We described the characteristics of the included population and compared these between frail and non-frail individuals. We compared medians for continuous variables and relative frequencies for categorical variables using the Mann-Whitney U-test and Fisher's exact test, respectively.

Data analyses were conducted using Python version 3.9.1 and R version 4.02.2 software. We performed the split-sample procedure, model selection, and validation using the bestglm package in R, the scikit-learn library in Python, and the imbalanced-learn package (version 0.7.0), which is compatible with scikit-learn.

There was no funding source for this study. The corresponding author (MALP), LSB, TN, DA, BT had full access to all the data in the study. MALP, LSB, BB, BSE took the decision to submit for publication.

Among the 469 Lc65+ cohort participants included in the development/internal validation sample, the prevalence of FP was 14·3% (n = 67), while pre-frail and robust participants represented 46·5% (n = 218) and 39·2% (n = 184), respectively. The mean age of study participants at FP assessment was relatively low (71·6 years [SD 3·55]), and females were slightly more represented (52·0%, n = 244) (Table 1). As expected, frail participants were more likely to be older (p = 0·0002) or female (p = 0·017). The median cumulative length of stay at CHUV for all participants was 8·0 days (IQR 11·0) and increased significantly with the severity of frailty. Finally, the median time interval between the last discharge from CHUV and FP assessment was approximately 4.5 months for the whole sample, with frail participants experiencing shorter median time intervals than non-frail ones (4.3 vs 5.3 months).

Median eFS and CCI at FP assessment amounted to 2 and 0, respectively, and were significantly higher in frail participants (p = 0·009 and p = 0·003). Frail participants were at higher risk of long-term adverse health events, more than four times more likely to die and be institutionalized at one year, and approximately 2.4 times more likely to have a prolonged hospital stay (p = 0·005, p = 0·041, and p = 0·002, respectively). When considering the six organs/systems included in the eFS (heart and endocrine, nervous, respiratory, digestive, and hematologic systems), frail participants were significantly more frequently affected than the non-frail (p < 0·05). More than 40% of frail participants were affected by a cardiovascular deficiency, while 10%-40% presented at least one deficiency of the kidneys or musculoskeletal, immune, endocrine, nervous, respiratory, or digestive systems. Figure 1 displays the proportion of frail and non-frail participants for each of the 18 eFS components.

Among the 34 models tested, the best performing and most parsimonious one for predicting the dichotomised FP was the BS-LR model with four predictors: age and sex at FP assessment, time since last discharge from CHUV, and the eFS. The averages of performance metrics and cut-off values for the 34 models, including the BS-LR, are shown in Appendix 3. Based on the development/internal validation sample, the adjusted ORs (95% CIs) of being frail for the four predictors were: 1·64 (0·84-3·18) for age 71-75 years and 4·40 (2·20-8·80) for age 76-80 years compared to those aged 66-70 years; 2·44 (1·34-4·42) for females compared to males; 1·70 (0·67-4·29) for time since last discharge from CHUV <1 month and 1·96 (1·07-3·06) for a time between 1 and 3 months compared to a time >3 months; and 1·40 (1·19-1·65) for a one-point increase in the eFS. We also confirmed that the eFS was associated with all adverse health outcomes of interest (death, prolonged length of hospital stay, number of hospitalisations, and nursing home admission within 12 months after FP assessment). These associations remained significant when taking age, sex and the CCI into account (Figure 2).
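To show how the adjusted ORs reported above combine, the following sketch multiplies them on the log-odds scale. It assumes a main-effects logistic model without interaction terms and uses the point estimates only; because the intercept is not reported in this section, it yields odds relative to the reference profile (male, aged 66-70 years, last discharge >3 months ago, eFS of 0), not absolute probabilities. The dictionary layout and function name are ours.

```python
import math

# Point estimates of the adjusted ORs reported in the text (reference = 1.00).
ADJUSTED_OR = {
    "age": {"66-70": 1.00, "71-75": 1.64, "76-80": 4.40},
    "sex": {"male": 1.00, "female": 2.44},
    "time_since_discharge": {">3 months": 1.00, "1-3 months": 1.96, "<1 month": 1.70},
}
OR_PER_EFS_POINT = 1.40


def relative_odds(age: str, sex: str, time_since_discharge: str, efs: int) -> float:
    """Odds of being frail relative to the reference profile.

    Assumes additive effects on the log-odds scale (no interactions).
    """
    log_odds_ratio = (
        math.log(ADJUSTED_OR["age"][age])
        + math.log(ADJUSTED_OR["sex"][sex])
        + math.log(ADJUSTED_OR["time_since_discharge"][time_since_discharge])
        + efs * math.log(OR_PER_EFS_POINT)
    )
    return math.exp(log_odds_ratio)


# A woman aged 76-80 years, last discharged >3 months ago, with an eFS of 3.
ratio = relative_odds("76-80", "female", ">3 months", 3)
```

Under these assumptions the ratio is 4.40 × 2.44 × 1.40³, i.e. odds about 29 times those of the reference profile; the wide confidence intervals of the individual ORs mean this figure is indicative only.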

External validation confirmed that the eFS was a significant predictor of the 13 adverse outcomes, despite small effect sizes. These effects remained significant after taking the CCI, sex, and age into account (Appendix 5). Comparisons between the eFS model and the three models with recently published claims-based frailty scores showed similar performance in predicting adverse health outcomes (Table 2). However, we observed slight differences among the AUCs and F1 scores of the different models. The eFS model performed best in predicting mortality, prolonged index stay, readmission, and the number of admissions in the following year. By contrast, the HFRS and GFS models performed better in predicting institutionalisation. The GFS and CFI models better predicted 30-day and one-year mortality, prolonged index stay, 30-day readmission, and institutionalisation at discharge. Finally, the HFRS model performed better at predicting one-year readmission and institutionalisation.

The estimated proportion of frail inpatients at admission was 56·8% in the external validation dataset. The median of hospital-level proportions of frail inpatients on admission was 56·3% (range 40·3-73·6%). Patients admitted to hospitals with a low prevalence of frailty on admission were younger and had a lower eFS and CCI. Details of the distribution of hospital-level proportions of frail inpatients at admission for the 153 acute care hospitals included in the external validation dataset, as well as the violin plots of the eFS, CCI, and age at admission for the 10% of hospitals (n = 16) with the lowest estimated proportions of frail inpatients on admission and the 10% (n = 16) with the highest, are provided in Appendix 6, Figures 1 and 2.

Figure 1. Proportion of frail and non-frail phenotypes for each of the 18 components of the electronic Frailty Score*

1-Immune system; 2-Blood cells and hematopoietic system; 3-Endocrine system; 4-Metabolic system; 5-Nervous system; 6-Visual system; 7-Hearing system; 8-Heart; 9-Vascular system; 10-Respiratory system; 11-Naso-oro-pharyngo-laryngeal system; 12-Digestive system (excluding liver); 13-Liver; 14-Cutaneous system; 15-Musculoskeletal system; 16-Lymphatic system; 17-Urinary system (excluding kidneys); 18-Kidneys.

* Proportions were calculated for the frail (n = 67) and non-frail (n = 402) study participants hospitalised at least once in the 12 months before Fried's frailty phenotype assessment. One study participant may have several deficient systems and organs.

Articles www.thelancet.com Vol 44 Month February, 2022

Figure 2. Adjusted odds ratios/incidence rate ratios of frail phenotype and adverse health outcomes according to the electronic Frailty Score, Charlson comorbidity index, age, and sex

eFS=electronic Frailty Score; CCI=2011 updated Charlson comorbidity index; FP=Fried's Frailty Phenotype; aOR=adjusted odds ratio; aIRR=adjusted incidence rate ratio; 95%CI=95% confidence interval.

*reference = male. Markers in the graph represent aOR or aIRR estimates and vertical lines the 95%CI for these estimates. 95%CIs crossing the horizontal red line represent aORs or aIRRs that are not significantly different from one (i.e. no effect of the corresponding parameter).

The table includes the performance metrics (AUC, F1 score) of the hierarchical logistic regression models for binary variables and Poisson for discrete variables. Performance comparisons between models containing the different frailty scores (eFS, HFRS, GFS, and CFI) should be made between models (columns) of the same colour. Models in orange correspond to those containing the following predictors: age, gender, CCI +/- frailty score or cumulative length of stay in acute care during the previous year. The models in blue are the same ones without the CCI. The model with the CFI score does not include CCI as a predictor because CCI is already part of the score (the column is therefore coloured in orange). Boxed areas with a more contrasting colour indicate the best performance (highest F1 score, then highest AUC).

By linking hospital discharge data to population cohort data, we were able to develop and validate the eFS predictive model, allowing the classification of non-institutionalised Swiss patients into phenotypically frail and non-frail. This model is parsimonious (four predictors), clinically meaningful, and easily replicable with an AUC of 0·71 and an F1 score of 0·39. Based on a nationwide validation sample, we found that the eFS was a significant predictor of one-year mortality or institutionalisation and shorter-term adverse outcomes, such as prolonged hospitalisation or readmission, even when controlling for age, sex, and the CCI. Furthermore, using an external validation sample, our score predicted all adverse health outcomes equally well or better when compared to the HFRS, 11 GFS, 12 and CFI. 10 To our knowledge, this is the first attempt to estimate the prevalence of frailty in older Swiss inpatients, both at the national and hospital level. In addition, these results are consistent with the literature. 6, 28 Similar to Segal et al., 10 we anchored the eFS on Fried's FP as it is based on pathophysiological hypotheses 1, 2 and clinically operationalisable. 1, 2, 29 Moreover, it is also widely recognized as a valid measure of frailty in various populations and health care settings 1, 2, 6, 7, 29 and a reference standard when comparing instruments to identify frailty. 30 Although too burdensome to be routinely implemented in the healthcare system, Fried's FP is considered a valuable tool for the initial stratification of the older population at risk of disability and for proposing preventive interventions. 4 The Lc65+ cohort provides FPs for a large and representative sample of older adults in Switzerland, but it does not cover the entire population or clinical encounters. 10 Our eFS based on available diagnoses will enable a nationwide and routine prediction of the FP without costly and time-consuming clinical measurement.

The latest claims-based frailty measures were developed using a clinical knowledge-driven or a data-driven approach. 13 The former is considered straightforward but tends to generate frailty instruments with low sensitivity. The latter generally provides better predictive performance of frailty and adverse health outcomes but has potentially clinically meaningless predictors due to their selection by "black-box" ML algorithms. 13 In our study, we adopted both approaches, together with a reference standard, to select the eFS diagnostic codes and were able to show that the clinical knowledge-driven selection exhibits the best predictive performance, with a fair sensitivity of the eFS model (0·73). Notably, the eFS can be easily replicated elsewhere using lists of clinically meaningful ICD-10 codes, and the prediction of frailty or adverse health outcomes only requires simple logistic regression modelling. Indeed, ML offered no advantage over regression in terms of predictive performance. 10, 13 As recommended, we validated the eFS against clinical frailty (i.e. FP), disability (i.e. institutionalisation) and mortality, and showed that it was a significant predictor of all these outcomes even after adjustment on the CCI. These results confirm that the eFS model can be differentiated from mortality models and predict frailty as defined by Fried et al., independently of comorbidity. 1, 13, 31 In addition, when comparing the eFS model to three models relying on well-designed and validated claims-based frailty measures, 10−12 we found that they shared a similar discriminatory ability and predictive accuracy. The eFS model could therefore represent a valid and transportable surrogate of the FP. In terms of feasibility and usability, the eFS is easy to integrate into national, regional or healthcare provider databases (i.e. health claims databases or electronic medical records) and could help discriminate frail and non-frail individuals in both community-dwelling and inpatient populations. Advantages and disadvantages of the four claims-based scores regarding feasibility and usability are listed in Appendix 7.

Our study has some limitations. First, the nervous system component of the eFS comprises ICD-10-GM codes related to dementia and other cognitive impairments, depression and other persistent psychiatric disorders, and addiction health problems. Adding these codes may have decreased the performance of the eFS model in predicting FP. Indeed, Fried's FP is often thought to reduce frailty to physical deficiencies and ignore mental and cognitive health problems. However, several authors have provided evidence to contradict these statements. 29 Furthermore, studies have shown that the five dimensions of the phenotype, depression, and cognitive impairment belonged to a common construct. 29 As a result, any decrease in the performance of the eFS model should be limited.

Second, for eFS development and internal validation, we linked the Lc65+ cohort data to CHUV discharge data only. We also applied inclusion criteria for Lc65+ cohort participants, i.e., community-dwelling, non-institutionalised residents of Lausanne recruited in 2004, 2009, and 2014, aged 65-70 years at enrolment, and hospitalised at the CHUV at least once in the year preceding the FP assessment. Finally, we selected a single FP assessment per participant to avoid data clustering. The resulting internal development/validation sample size was consequently small with few cases: 469 participants among which 67 (14·3%) were frail. This selection process might have altered predictive model development (e.g. predictor selections and estimations, overfitting) 32 and limited the generalisability of the eFS model to the general population. We took these limitations into account by applying an appropriate statistical methodology. Furthermore, we confirmed that the eFS was also a good predictor of FP in the entire Lc65+ cohort population. The AUC, estimated by bootstrapping, of a 2-level (i.e., FP and participant) logistic regression model with four predictors (age, sex, number of hospitalisations at CHUV in the last 12 months, eFS) for the sample of Lc65+ cohort participants who had at least one frailty assessment during the study period, whether or not they had been hospitalised in the previous 12 months, was 0·71. The unselected sample included 3,497 FP measures for 1,648 separate participants with a prevalence of frailty of 5·5% (n = 191). The optimal predicted probability cut-off to classify participants as frail was 0·21, resulting in a sensitivity and specificity of 54% and 79%, respectively. The eFS model, with or without adjustment for CCI, was also predictive of mortality, the occurrence of at least one hospital stay > eight days, institutionalisation, and the number of hospital admissions in the year following the FP assessment (Appendix 8).

Selecting participants who were previously admitted to CHUV has also likely led to a selection bias of more severe hospitalisations and the omission of stays in specialised or smaller hospitals. We may also have induced measurement bias for two predictors: eFS (underestimation) and time since the last hospitalisation (overestimation). Both biases constitute a limitation that we could not control for or minimise. In particular, we could not rely on diagnoses or the number and location of hospitalisations provided by participants during interviews or questionnaires as they were often affected by recall bias.

Third, comparisons across three other claims-based frailty measures may have been affected by measurement bias. Although we successfully adapted the other scores to Swiss hospital discharge data and the ICD-10-GM classification, we were unable to account for "race" in the CFI and to translate some ICD codes, particularly "W falls" codes, which do not exist in the ICD-10-GM (Appendix 2). In addition, we did not use the original variable weights for the CFI and GFS, but instead estimated new ones related to our national data to optimise the relationship between predictors and outcomes. 33, 34 However, models with the competing scores (HFRS, GFS, CFI) were fitted to external validation data only and not to test and training data, which may have given the eFS score an advantage. Finally, we measured the scores on data from the previous year, excluding the index stay. This modified time frame may have biased the CFI and HFRS calculations as the former is based on the last six months data before FP assessment, and the latter on the last 12 months data, including the index stay.

Fourth, we may have underestimated the prevalence of frailty in the external validation sample by selecting a relatively young population (i.e. inpatients aged 66-80 years). However, we limited this selection bias by including participants hospitalised at least once in the previous year and therefore at higher risk for frailty.

Fifth, given its low PPV (0·28), the eFS model cannot be used for the clinical screening of frail older persons. However, its high NPV (0·94) allows non-frail individuals to be identified with good reliability and could help prioritise individuals who should receive a comprehensive geriatric assessment. 13 This automated prescreening should thus assist in medical decision-making, especially for primary care professionals, by better targeting patients eligible for tailored interventions to prevent, delay or reverse frailty, but also by protecting the most at-risk patients from overly aggressive diagnostic or treatment procedures that could worsen their health and well-being without increasing their life expectancy. 3, 35, 36 It should also result in a decreased caregiver workload and a reduced burden and costs for the health care system.
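The combination of a low PPV and a high NPV is expected when the condition is uncommon. The sketch below applies Bayes' rule to show this dependence on prevalence. The prevalence of 14·3% is the proportion of frail participants in the development sample; the sensitivity and specificity values are hypothetical placeholders, not the operating characteristics of the eFS model.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical test characteristics (assumed for illustration only).
SENSITIVITY = 0.70
SPECIFICITY = 0.70

# Prevalence as in the development sample versus a high-prevalence setting.
ppv_low, npv_low = predictive_values(SENSITIVITY, SPECIFICITY, prevalence=0.143)
ppv_high, npv_high = predictive_values(SENSITIVITY, SPECIFICITY, prevalence=0.50)

print(f"prevalence 14.3%: PPV={ppv_low:.2f}, NPV={npv_low:.2f}")
print(f"prevalence 50.0%: PPV={ppv_high:.2f}, NPV={npv_high:.2f}")
```

With the same test characteristics, the PPV rises and the NPV falls as prevalence increases, which is why a model of this kind is better suited to ruling frailty out than to confirming it in a low-prevalence population.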

Sixth, as for other claims-based indices, the eFS accuracy and reliability may be affected by the quality of diagnostic coding in Swiss hospitals, including the level of health information technology available and the completeness of documentation by healthcare providers, especially at CHUV. 13, 37 The quality of Swiss hospital discharge data was not constant over the study period, but has significantly improved since the first mandatory collection in 1998. It has been considered excellent by the Federal Statistical Office since 2014. Likewise, local data integrity and medical information density have only been considered very good since 2001. 38 Thus, the temporal variation in the number of available diagnoses may have led to an underestimation of the scores of participants included at the beginning of the study period compared to those included at the end. Finally, coding quality also varies among hospitals and countries, which represents an inevitable limitation of claims-based studies. Regarding health information technology infrastructure and routine documentation of frailty, there is no personal health recording system in Switzerland that integrates all patient medical information or nationwide data collection on the frailty status of the population. The only available source of information on frailty is the Lc65+ cohort study, which is representative of Lausanne's older population. 18 However, in the near future, the Swiss Frailty Network and Repository intends to promote the development and validation of an electronic medical record-based frailty score for older inpatients admitted to Swiss university hospitals. 39

In conclusion, the eFS is a new claims-based frailty score with the ability to automatically identify frail groups of older persons based on diagnoses and a few easily accessible data. The eFS model is also easily transportable and may theoretically be applied to both hospitalised and community-dwelling older adults, provided they have been hospitalised at least once in the year preceding frailty assessment. Routine measurement of the prevalence of frailty at the national, regional or provider levels using this eFS model should help health authorities plan and allocate resources. Its integration into national or regional healthcare databases (i.e. health administrative databases or electronic medical records) should also contribute to a better consideration of frailty in billing systems and improve the case-mix adjustment when comparing performance or quality of care between hospitals or healthcare services. 13, 37 Regarding clinical decision making, the eFS model should help prioritise individuals at risk of adverse health outcomes and in need of comprehensive assessment and personalised care. 13 Future research could explore how accurately the eFS predicts dependency in institutionalised older adults and those living at home, 13 or assesses changes in frailty status over time, particularly after frailty-attuned interventions or during the COVID-19 pandemic. 40 Additional studies could also investigate how the eFS performs in other countries with ICD-10-based coding systems or compared to claims-based deficit accumulation frailty indexes.
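As described in the Methods, the eFS is the number of deficient organs/systems, among 18 critical ones, identified from ICD-10 diagnoses coded in the year before frailty assessment. The counting principle can be sketched as follows. The mapping below is a deliberately simplified placeholder that groups codes by ICD-10 chapter letter; it is an assumption made for illustration and does not reproduce the 18 organ/system definitions or the code lists selected by the authors.

```python
# Placeholder mapping from ICD-10 chapter letter to an organ/system label.
# Assumption for illustration only: the eFS uses its own curated code lists.
SYSTEM_BY_CHAPTER = {
    "I": "circulatory",
    "J": "respiratory",
    "K": "digestive",
    "N": "genitourinary",
    "G": "nervous",
}

def count_deficient_systems(codes, mapping=SYSTEM_BY_CHAPTER):
    """Count the distinct organs/systems with at least one mapped diagnosis."""
    systems = set()
    for code in codes:
        chapter = code[:1].upper()
        if chapter in mapping:
            systems.add(mapping[chapter])
    return len(systems)

# Hypothetical diagnoses pooled over one year of hospital stays.
example_codes = ["I50", "I10", "J44", "Z95"]
example_score = count_deficient_systems(example_codes)
```

Here the two circulatory codes count once, the respiratory code counts once, and the unmapped code is ignored. Counting each system at most once is what makes the score a count of deficient systems rather than a count of diagnoses.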

MALP conceptualised the study, wrote the protocol and submitted it for approval to the ethics committee, did the literature review, selected ICD-10 codes, acquired, analysed and interpreted the data, coordinated the project, and wrote and revised the manuscript. LSB and BSE discussed the protocol, selected ICD-10 codes, discussed the analyses and interpreted the data, and revised the manuscript; TN, DA, BT analysed and interpreted the data and wrote and revised the manuscript. IPB, BB and VR discussed the protocol, discussed the analyses and interpreted the data, and revised the manuscript. NG selected ICD-10 codes, interpreted the data and revised the manuscript. The authors vouch for the accuracy and completeness of the data. All authors commented upon and approved the final manuscript.

All authors declare no competing interests.

Requests for data sharing should be submitted to the corresponding author for consideration.

Frailty in older adults: evidence for a phenotype

Nonlinear multisystem physiological dysregulation associated with frailty in older women: implications for etiology and treatment

The physical frailty syndrome as a transition from homeostatic symphony to cacophony

Frailty in elderly people

Prevalence of frailty in community-dwelling older persons: a systematic review

What do we know about frailty in the acute care setting? A scoping review

Prevalence of frailty in nursing homes: a systematic review and meta-analysis

The frailty phenotype and the frailty index: different instruments for different purposes

Accumulation of deficits as a proxy measure of aging

Development of a claims-based frailty indicator anchored to a well-established frailty phenotype

Development and validation of a hospital frailty risk score focusing on older people in acute care settings using electronic hospital records: an observational study

Dr Foster global frailty score: an international retrospective observational study developing and validating a risk prediction model for hospitalised older persons from administrative data sets

Measuring frailty in health care databases for clinical care and research

Using administrative data to study persons with disabilities

Claims-based frailty indices: a systematic review

Measuring frailty in administrative claims data: comparative performance of four claims-based frailty measures in the U.S. medicare data

Frailty and the endocrine system

The Lausanne cohort Lc65+: a population-based prospective study of the manifestations, determinants and outcomes of frailty

AHRQ Methods for Effective Health Care. Linking Data for Health Services Research: A Framework and Instructional Guide

Best subsets logistic-regression

Regression shrinkage and selection via the lasso

The elements of statistical learning: data mining, inference, and prediction

Clinical prediction models: a practical approach to development, validation, and updating

SMOTE: Synthetic minority over-sampling technique

Using random forest to learn imbalanced data

Updating and validating the Charlson comorbidity index and score for risk adjustment in hospital discharge abstracts using data from 6 countries

Internal and external validation of predictive models: a simulation study of bias and precision in small samples

Quantifying the prevalence of frailty in English hospitals

Screening for frailty: older populations and older individuals

Predicting risk and outcomes for frail older adults: an umbrella review of frailty screening tools

Untangling the concepts of disability, frailty, and comorbidity: implications for improved targeting and care

Minimum sample size for developing a multivariable prediction model: Part II - binary and time-to-event outcomes

Regression coefficient-based scoring system should be used to assign weights to the risk index

Improved comorbidity adjustment for predicting mortality in Medicare populations

Management of frailty: opportunities, challenges, and future directions

The value of unstructured electronic health record data in geriatric syndrome case identification

Claims-based frailty indices may function as long-term risk estimates for elderly patients after hospitalization. Lancet Reg Health Eur

Statistiques de l'assurance-maladie: Indicateurs de qualité des hôpitaux suisses de soins aigus 2019

Swiss Frailty Network and Repository: protocol of a Swiss personalized health network's driver project observational study

Association between Clinical Frailty Scale score and hospital mortality in adult patients with COVID-19 (COMET): an international, multicentre, retrospective, observational cohort study

Technical support for data linkage was provided by the Information Systems Department, Lausanne University Hospital, and Mr Juan Manuel Blanco, a research fellow at the Department of Epidemiology and Health Systems of Unisanté (University Centre of General Medicine and Public Health). We thank all participants in the Lc65+ cohort study and the research assistants who collected and managed the cohort data. We also acknowledge Rosemary Sudan for editorial assistance. This work was partially presented as an oral presentation at the 2018 International Population Data Linkage Conference, 12-14 September 2018, Banff, Alberta, Canada.

The study received no external funding.

Supplementary material associated with this article can be found in the online version at doi:10.1016/j.eclinm.2021.101260.