key: cord-1003644-zexo8e69
authors: Ponzano, Marta; Schiavetti, Irene; Bovis, Francesca; Landi, Doriana; Carmisciano, Luca; De Rossi, Nicola; Cordioli, Cinzia; Moiola, Lucia; Radaelli, Marta; Immovilli, Paolo; Capobianco, Marco; Bragadin, Margherita Monti; Cocco, Eleonora; Scandellari, Cinzia; Cavalla, Paola; Pesci, Ilaria; Confalonieri, Paolo; Perini, Paola; Bergamaschi, Roberto; Inglese, Matilde; Petracca, Maria; Trojano, Maria; Tedeschi, Gioacchino; Comi, Giancarlo; Battaglia, Mario Alberto; Patti, Francesco; Fragoso, Yara Dadalti; Sen, Sedat; Siva, Aksel; Karabudak, Rana; Efendi, Husnu; Furlan, Roberto; Salvetti, Marco; Sormani, Maria Pia
title: A multiparametric score for assessing the individual risk of severe Covid-19 among patients with Multiple Sclerosis.
date: 2022-05-25
journal: Mult Scler Relat Disord
DOI: 10.1016/j.msard.2022.103909
sha: 23e6cec5776efe5c873923773725e1059c723b30
doc_id: 1003644
cord_uid: zexo8e69

BACKGROUND: Many risk factors for the development of severe forms of Covid-19 have been identified, some applying to the general population and others specific to multiple sclerosis (MS) patients. However, a score for quantifying the individual risk of severe Covid-19 in patients with MS is not available. The aim of this study was to construct such a score and to evaluate its performance.

METHODS: Data on patients with MS infected with Covid-19 in Italy, Turkey and South America were extracted from the MuSC-19 platform. After imputation of missing values, data were separated into a training dataset (70%) and a validation dataset (30%). Univariable logistic regression models were fitted in the training dataset to identify the main risk factors to be included in the multivariable logistic regression analyses. To select the most relevant variables we applied three different approaches: 1) multivariable stepwise selection, 2) Lasso regression, 3) Bayesian model averaging.
Three scores were defined as the linear combination of the coefficients estimated in the models multiplied by the corresponding values of the variables, and higher scores were associated with a higher risk of a severe Covid-19 course. The performance of the three scores was compared in the validation dataset based on the area under the ROC curve (AUC), and an optimal cut-off was calculated in the training dataset for the best-performing score. The probability of a severe Covid-19 course was calculated from that score.

RESULTS: 3852 patients were included in the study (2696 in the training dataset and 1156 in the validation dataset). 17% of the patients required hospitalization, and the risk factors for a severe Covid-19 course were older age, male sex, living in Turkey or South America rather than Italy, presence of comorbidities, progressive MS, longer disease duration, higher Expanded Disability Status Scale (EDSS), methylprednisolone use and anti-CD20 treatment. The score with the best performance was the one derived using the Lasso selection approach (AUC = 0.72), built with the following variables: age, sex, country, BMI, presence of comorbidities, EDSS, methylprednisolone use and treatment. An Excel spreadsheet to calculate the score and the probability of severe Covid-19 is available at the following link: https://osf.io/ac47u/?view_only=691814d57b564a34b3596e4fcdcf8580.

CONCLUSIONS: The originality of this study lies in building a tool to quantify the individual risk of Covid-19 severity based on the patient's characteristics. Owing to its modest predictive ability and the need for external validation, this tool is not ready to be fully used in clinical practice to make important decisions or interventions. However, it can be used as an additional instrument to identify high-risk patients and persuade them to take important measures to prevent Covid-19 infection (i.e.
getting vaccinated against Covid-19, adhering to social distancing, and using personal protective equipment).

Since the start of the Covid-19 pandemic, many risk factors for the development of severe forms of the disease have been identified, including older age, male gender and presence of comorbidities [1] [2] [3] [4]. Patients with Multiple Sclerosis (MS) are in general more vulnerable and at higher risk of infections than the general population, and Covid-19 has raised additional concern for these patients, especially for those on disease-modifying therapies [5; 6; 7]. Among Italian patients with MS, the risk of a severe Covid-19 course was found to be two times higher than in the general population [8], and MS-specific risk factors for a severe Covid-19 course have been identified in many studies, including higher EDSS, progressive phenotype, disease duration, corticosteroid use within 1 month of Covid-19 onset and anti-CD20 therapy [9; 10]. Several Covid-19 severity indexes have already been developed to identify patients at higher risk of hospitalization, admission to the intensive care unit (ICU) and death [11, 12, 13]. However, no such score currently exists specifically for patients with MS. The aim of this study was thus to develop a prognostic score to help clinicians assess the individual risk of their patients. The score was developed taking into consideration both general and MS-specific patient characteristics, and internal validation was conducted.
The TRIPOD statement for transparent reporting of a multivariable prediction model for individual prognosis was followed [14].

Methods

Data on MS patients infected with Covid-19 in Italy, Turkey and South America were extracted from the web-based platform of the MuSC-19 project, which contains clinician-reported data from several MS centers around the world. Details on data sharing agreements, ethical committee approval and the types of variables collected have been reported elsewhere [7]. Details on the location of the participating centers are reported in Supplementary Table 1. The presence of comorbidities was defined as the recording of at least one of the following underlying pathologies: cerebrovascular disease, hematological disease, coronary heart disease, hypertension, diabetes, chronic liver disease, chronic kidney disease, malignant tumor, HBV, HIV, major depressive disorder, other (if specified). We excluded patients with suspected Covid-19 but without a positive Covid-19 test result, as well as patients enrolled in the first three months of the pandemic, because of the low reliability of the data collected at that time. Only patients enrolled between May 2020 and the end of the study (17 September 2021) were thus included.

Demographic and MS characteristics of the patients were presented as frequencies (%), mean (standard deviation) or median (interquartile range). Because of missing values, we performed multiple imputation (MI) by chained equations with 10 imputations. Multiple imputation produced 10 separate datasets, and the analyses were conducted according to the theoretical rules of MI [15]. In the imputation models, in addition to the variables with missing values (age, smoking habits, type of MS, disease duration and EDSS), we included sex, country, BMI and type of treatment as predictors, given the relevance of these variables in characterizing the patients.
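The "theoretical rules of MI" are not spelled out in the text. They are commonly taken to mean Rubin's rules, under which a coefficient estimated separately in each imputed dataset is pooled by averaging the estimates and combining the within- and between-imputation variance. The following is a minimal sketch under that assumption; the function name `pool_rubin` and all numbers are hypothetical illustrations, not values from the study.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates  : per-imputation point estimates (e.g. log odds ratios)
    std_errors : per-imputation standard errors
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations with matching lengths")
    pooled = mean(estimates)                       # average of the m estimates
    within = mean([se ** 2 for se in std_errors])  # mean within-imputation variance
    between = variance(estimates)                  # sample variance across imputations
    total = within + (1 + 1 / m) * between         # Rubin's total variance
    return pooled, sqrt(total)


# Hypothetical log odds ratios for one predictor in 10 imputed datasets
betas = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.53, 0.47, 0.50, 0.50]
ses = [0.10] * 10
beta, se = pool_rubin(betas, ses)
```

The pooled standard error is never smaller than the average per-imputation standard error, because the between-imputation term reflects the extra uncertainty due to the missing values.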
Subsequently, we split the data into a training dataset (70%) and a validation dataset (30%) using computer-generated random allocation. To verify the comparability of the two datasets, patient characteristics were compared using the Chi-squared test or Fisher's exact test for categorical variables and the Mann-Whitney U test for continuous variables. Univariable logistic regression models were fitted in the training dataset to identify factors discriminating between a mild and a severe course of Covid-19 (mild vs hospitalization or death). The multivariable model was then fitted excluding the variables with a p-value ≥ 0.10 in the univariable analysis, as well as MS type and disease duration because of collinearity. Subsequently, we reincluded the non-significant univariable predictors and applied three different approaches for selecting the most relevant variables:

- Model 1: multivariable stepwise selection followed by a multivariable logistic regression model with 500 bootstrap replications on the selected variables.
- Model 2: Lasso regression selection followed by a multivariable logistic regression model with 500 bootstrap replications on the selected variables [16]. The optimal value of the penalty parameter was determined using 10-fold cross-validation.
- Model 3: Bayesian model averaging (BMA) [17] for logistic regression models with Covid-19 severity as the dependent variable. BMA computation was performed using the R package BAS (Bayesian adaptive sampling), assigning equal prior probabilities to all models so as not to make any a priori assumptions. Factors with a posterior inclusion probability (PIP) ≥ 0.7 were selected.

The coefficients estimated in the models were used to derive three scores, each defined as the linear combination of the coefficients multiplied by the corresponding values of the p variables:

$$\text{Score} = \beta_1 \times \text{var}_1 + \beta_2 \times \text{var}_2 + \dots + \beta_p \times \text{var}_p$$

Higher scores represented a greater risk of a severe Covid-19 course. The discriminating performance of the three scores was evaluated in the validation set as the area under the ROC curve (AUC). For the best-performing score, we identified an optimal cut-off in the training dataset based on the Liu criterion, which maximizes the product of sensitivity and specificity [18]. The Liu criterion is appropriate in this context because it finds an optimal cut-point for dichotomizing a continuous variable based on both sensitivity and specificity. As a sensitivity analysis, we also estimated the optimal cut-off as the point on the ROC curve closest to (0,1) and using the Youden method; the estimated cut-points were very similar (Liu: 3.02; nearest to (0,1): 3.10; Youden: 3.32). Subsequently, in the validation sample we derived sensitivity, specificity and their corresponding 95% confidence intervals (CI) to assess the performance of the binary score. For the best-performing model, we also estimated the probability of a severe Covid-19 outcome from the estimated coefficients. All statistical analyses and multiple imputation were performed using Stata version 16.0 (Stata Corporation, College Station, TX, USA), except for the BMA (R v3.5).

Out of the 4820 patients from Italy, Turkey and South America enrolled in the MuSC-19 platform at the cut-off date of 17 September 2021, 3852 remained after applying the exclusion criteria (Figure 1). Coefficients (log odds ratios) and standard errors derived from the three models are reported in Table 3. The variables included in the three models largely overlapped, except for BMI and treatment with interferon (included only in Model 2) and methylprednisolone use (not included in Model 3).
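The score construction, AUC evaluation and cut-off selection described in the Methods can be sketched as follows. This is an illustrative sketch, not the authors' Stata code: the coefficient values, variable names, intercept and toy data are hypothetical, and the probability uses the standard logistic transform of intercept plus score, which is an assumption about how the estimated coefficients were combined.

```python
from math import exp


def linear_score(coefs, values):
    """Score = sum of beta_j * var_j over the selected variables."""
    return sum(coefs[name] * values[name] for name in coefs)


def severe_probability(score, intercept):
    """Standard logistic transform (assumed): 1 / (1 + exp(-(b0 + score)))."""
    return 1.0 / (1.0 + exp(-(intercept + score)))


def sens_spec(scores, outcomes, cutoff):
    """Sensitivity and specificity when score > cutoff is classified as high risk."""
    tp = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s > cutoff)
    tn = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s <= cutoff)
    pos = sum(outcomes)
    neg = len(outcomes) - pos
    return tp / pos, tn / neg


def best_cutoff(scores, outcomes, criterion="liu"):
    """Scan the observed scores as candidate cut-offs and return the best one.

    liu     : maximize sensitivity * specificity
    youden  : maximize sensitivity + specificity - 1
    nearest : minimize the distance to the (0, 1) corner of the ROC plot
    """
    def objective(cutoff):
        se, sp = sens_spec(scores, outcomes, cutoff)
        if criterion == "liu":
            return se * sp
        if criterion == "youden":
            return se + sp - 1
        if criterion == "nearest":
            return -((1 - se) ** 2 + (1 - sp) ** 2)
        raise ValueError(f"unknown criterion: {criterion}")
    return max(sorted(set(scores)), key=objective)


def auc(scores, outcomes):
    """AUC as the share of (severe, mild) pairs where the severe case scores higher."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)  # ties count 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical coefficients and patient, for illustration only
coefs = {"age_per_10y": 0.4, "male": 0.3, "comorbidity": 0.5}
patient = {"age_per_10y": 5.0, "male": 1, "comorbidity": 0}
score = linear_score(coefs, patient)
prob = severe_probability(score, intercept=-4.0)

# Toy training data: scores and outcomes (1 = hospitalization or death)
train_scores = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
train_outcomes = [0, 0, 0, 0, 1, 0, 1, 1, 1]
cutoff = best_cutoff(train_scores, train_outcomes, "liu")
```

As in the study design, the cut-off would be chosen on the training data and then applied unchanged to the validation data, for example with `sens_spec(validation_scores, validation_outcomes, cutoff)`, where the two validation lists are placeholders for held-out data.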
Performances of the three models are reported in Table 4. The optimal cut-point for the score was 3.02, and patients were classified as being at higher risk of severe Covid-19 if their score was above 3.02. Applying this cut-off in the validation sample yielded a sensitivity of 68% (95% CI: 60%-74%) and a specificity of 59% (95% CI: 56%-62%) (Table 5). Estimated probabilities of severe Covid-19 ranged from 0.02 to 0.89, with an observed mean of 0.17 (standard deviation = 0.13). To facilitate the application of the score in daily practice, an Excel spreadsheet that allows entry of the patient's characteristics and automatically calculates the score and the estimated probability of severe disease can be downloaded at the following link: https://osf.io/ac47u/?view_only=691814d57b564a34b3596e4fcdcf8580

In this work, we identified several risk factors for severe Covid-19, some related to general characteristics and others specific to MS. These results are consistent with previous findings and thus confirm what has already been shown elsewhere [4; 9]. The originality of this study lies in building a score to quantify the individual risk of severe Covid-19 among patients with MS. To identify the features contributing to this score, we fitted three models based on different statistical approaches, and the results remained quite consistent, which supports the reliability of the variable selection. Additionally, when constructing the scores, we evaluated the contribution of the selected MS characteristics (EDSS, methylprednisolone use, treatment) to the performance of the scores and observed only a slight improvement compared with scores based exclusively on general characteristics. It follows that, although some characteristics of MS are known to play a role in the severity of Covid-19, the general characteristics of the patients seem to be more relevant.
The score was found to have a modest predictive ability (AUC = 0.72; for the dichotomized score, sensitivity = 0.68 and specificity = 0.59). As such, the score cannot be used in clinical practice to make important decisions such as treatment changes, requests for sick leave or resource allocation planning. Additionally, although the very large sample size of our study allowed us to split the data into training and validation datasets while still maintaining a large sample size, external validation of the score on an independent dataset is needed to further support our results before it can be fully used in practice [19, 20]. However, until further research fully validates the score and improves its predictive ability, we suggest an initial use of the score in practice that may seem less ambitious than one would expect but is still important. In particular, the score may be used as a supplementary tool for quantifying personal risk, in order to give higher-risk patients an additional reason to get vaccinated against Covid-19 if they have not yet done so, and to adhere appropriately to social distancing and the use of protective equipment to decrease the risk of infection [21; 22]. In this initial context of application, the modest predictive ability is not of much concern, and the fact that sensitivity is higher than specificity is even preferable, since it is better to have more false positives than many false negatives. Additionally, in this context, the fact that we prospectively followed patients infected with Covid-19 before the start of the vaccination programs no longer seems a limitation, since the enrolled patients better reflect the patients who have hesitated to get vaccinated.
To identify patients at higher risk of a severe Covid-19 course, clinicians can compare the observed score with the derived cut-off: the further the observed value lies above the threshold, the higher the patient's risk, and the further it lies below the threshold, the lower the risk. The clinician can also directly derive the estimated probability of a severe Covid-19 course. All these calculations (the continuous score, the magnitude and sign of the difference between the observed value and the cut-off, and the estimated probability) may be used together to gain a more complete understanding of the patient's risk and can be easily derived using the provided user-friendly Excel spreadsheet.

... Future research should also evaluate the performance of the score in other countries. Differences in hospitalization rates among countries can indeed depend on the quality and accessibility of the health service, but also on national guidelines regarding hospital admission [23, 24]. As a first preliminary analysis, or in the absence of studies presenting region-specific scores, for patients outside Italy, Turkey and South America the score may be calculated on the basis of which of the three regions is most similar to the country under study in terms of National

World Health Organization
A novel coronavirus from patients with pneumonia in China
World Health Organization: WHO Director-General's opening remarks at the media briefing on COVID-19 -11
Risk factors for severe and critically ill COVID-19 patients: A review
Hospital admission due to infections in multiple sclerosis patients
Disease-modifying therapies and infectious risks in multiple sclerosis
Severity in Multiple Sclerosis
COVID-19 Severity in Multiple Sclerosis: Putting Data Into Context. Neurol Neuroimmunol Neuroinflamm
Risk factors of severe COVID-19 in people with multiple sclerosis: A systematic review and meta-analysis
Severe outcomes of COVID-19 among patients with multiple sclerosis under anti-CD-20 therapies: A systematic review and meta-analysis. Mult Scler Relat Disord
Severity Index: A predictive score for hospitalized patients
Prediction model and risk scores of ICU admission and mortality in COVID-19
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement
TX: StataCorp LLC
16. Tibshirani R. Regression shrinkage and selection via the lasso
Bayesian model averaging: a tutorial
Classification accuracy and cut point selection
Prognosis and prognostic research: Validating a prognostic model
Prognosis and prognostic research: Validating a prognostic model
Adherence to social distancing and use of personal protective equipment and the risk of SARS-CoV-2 infection in a cohort of patients with multiple sclerosis
COVID-19 vaccine hesitancy in Iranian patients with multiple sclerosis
Coronavirus disease 2019 in Latin American patients with multiple sclerosis
The MuSC-19 study: The Egyptian cohort

Figure 1. Flowchart of patient inclusion and exclusion