key: cord-0913643-ar9rkpbr authors: Jamshidi, E.; Asgary, A.; Tavakoli, N.; Zali, A.; Esmaily, H.; Jamaldini, S. H.; Daaee, A.; Babajani, A.; Sendani Kashi, M. A.; Jamshidi, M.; Rahi, S.; Mansouri, N. title: Using Machine Learning to Predict Mortality for COVID-19 Patients on Day Zero in the ICU date: 2021-02-08 journal: nan DOI: 10.1101/2021.02.04.21251131 sha: 95de098e5d055006d0c51d22c9078e7e4b31ffef doc_id: 913643 cord_uid: ar9rkpbr Rationale: Given the expanding number of COVID-19 cases and the potential for upcoming waves of infection, there is an urgent need for early prediction of the severity of the disease in intensive care unit (ICU) patients to optimize treatment strategies. Objectives: To predict mortality early using machine learning, based on typical laboratory results and clinical data registered on the day of ICU admission. Methods: We retrospectively studied 263 COVID-19 ICU patients. To find parameters with the highest predictive values, Kolmogorov-Smirnov and Pearson chi-squared tests were used. Logistic regression and random forest (RF) algorithms were utilized to build classification models. The impact of each marker on the RF model predictions was studied by implementing the local interpretable model-agnostic explanation technique (LIME-SP). Results: Among 66 documented parameters, 15 factors with the highest predictive values were identified as follows: gender, age, blood urea nitrogen (BUN), creatinine, international normalized ratio (INR), albumin, mean corpuscular volume, white blood cell count, segmented neutrophil count, lymphocyte count, red cell distribution width (RDW), and mean cell hemoglobin along with a history of neurological, cardiovascular, and respiratory disorders. Our RF model can predict patient outcomes with a sensitivity of 70% and a specificity of 75%. Conclusions: The most decisive variables in our model were increased levels of BUN, lowered albumin levels, increased creatinine, INR, and RDW along with gender and age. 
Complete blood count parameters were also crucial for some patients. Considering the importance of early triage decisions, this model can be a useful tool in COVID-19 ICU decision-making. Nahal Mansouri (nahal.mansouri@chuv.ch); Sahand Jamal Rahi (sahand.rahi@epfl.ch). The authors received no financial support for the research, authorship, and/or publication of this article. To date, COVID-19 has affected more than 82 million people worldwide and caused more than 1.8 million deaths 1. Complications are more common among elderly patients and people with preexisting conditions, and the rate of intensive care unit (ICU) admission is substantially higher in these groups 2, 3. ICU admissions rely on the critical care capacity of the health care system. Iran, which is the primary testbed for this study, was one of the first countries hit by COVID-19. The ICU admission rate is around 32% of all hospitalizations, and the ICU mortality rate is about 39% 4. With the potential upcoming waves of COVID-19 infections, these numbers are expected to rise, leading to shortages of ICU beds and critical management equipment. There is also the risk of a global shortage of effective medical supplies, making the judicious use of these medications a top priority for healthcare systems. An individual-based prediction model is essential for tailoring treatment strategies and would aid in expanding our insights into the pathogenesis of COVID-19. A number of risk assessment scores are available to predict the severity of different diseases in ICU patients 5. Predictors of the need for intensive respiratory or vasopressor support in patients with COVID-19 and of mortality in COVID-19 patients with pneumonia have been put forth 6, 7. Beyond the general benefits of data-driven decision-making, the pandemic has also exposed the need for computational assistance to health care providers, who under the rush of severely ill patients may make mistakes in judgement 8, 14, 15. 
Stressful conditions and burnout in health care providers can reduce their clinical performance, and a lack of accurate judgment can lead to increased mortality rates 16, 17. Artificial intelligence can help healthcare professionals determine more precisely who needs a critical level of care. Indeed, the effective use of AI could mitigate the severity of this outbreak. Here, we propose a personalized machine-learning (ML) method for predicting mortality in COVID-19 patients based on routinely available laboratory and clinical data on the day of ICU admission. All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. On the day of the ICU admission, 66 parameters were assessed for each patient including 11 demographic characteristics (e.g., age and gender), past medical history and comorbidities (including nine different preexisting conditions), and 55 laboratory biomarkers. These parameters are listed in Table 1. Of the measurements, 69% were reported on the day of admission, 27% were reported one day after, and 4% were reported within two days of ICU admission because of sampling limitations and laboratory practice. We excluded patients whose laboratory data were obtained more than two days after the date of admission to the ICU. The aim was to predict patient survival. To select the parameters with the highest predictive value, under the null hypothesis that the distributions are the same in the two outcome groups, the two-sample Kolmogorov-Smirnov (KS) test, shown in Supplementary Figure 1, was used for numerical parameters (age and laboratory biomarkers), and the Pearson chi-squared (χ²) test, shown in Supplementary Figure 2, was used for categorical parameters (e.g., gender and
Data processing was carried out in four steps. First, because of incomplete laboratory data and in order to reduce difficulties associated with missing values, 235 patients having data for at least seven of the ten selected biomarkers were retained. Second, samples were randomly separated into 10 independent sets with stratification over outcomes for 10-fold cross-validation to ensure the generalizability of the models 19. Of the 10 subsets, a single subset was retained as a validation set for model testing and the remaining nine subsets were used as training data. The cross-validation process was then iterated ten times, with each of the 10 subsets being used as the validation data exactly once. Third, numerical parameters were standardized by scaling the features to zero mean and unit variance. Lastly, missing biomarker values were imputed using the k-nearest neighbor (k-NN) algorithm, and a binary indicator of missingness for each biomarker was added to the dataset 18, 20. Standardization and imputation were performed separately in each cross-validation iteration, using only the training set samples. Logistic regression (LR) and random forest (RF) methods were used to build classification models using the Python scikit-learn package 21. The performance of each method on training and validation sets in each cross-validation iteration was compared using a receiver operating characteristic (ROC) curve, which is shown in Figure 1. To prevent overfitting in the training
process, the LR model was trained with an L2 regularization factor equal to one, and the RF was forced to hold more than 10% of samples in each of its terminal leaves 22, 23. To find the most influential parameters in the LR model prediction, we used the regression coefficients, which are shown in the Supplementary Material. Using the local interpretable model-agnostic explanation submodular-pick (LIME-SP) method, we identified different patterns across the whole feature space in the RF model 24. The LIME-SP method interprets the model's predictions in different parts of the feature space by approximating the model's predictions around a given sample with linear models, which are more interpretable. In our study, LIME-SP was performed on 100 random samples to find six submodules with the most disparity in their selected markers, as shown in Figure 2. To identify meaningful clinical differences between patients, seven parameters with the highest predictive values were derived from each submodule. [Figure 1] The median age of patients was 69 years with an interquartile range (IQR) of 54-78. The minimum and maximum ages were 20 and 98 years, respectively. One hundred fifty-three patients (65.1%) were men, and 82 (34.9%) were women. One hundred five (39.9%) were discharged from the ICU after recovery and 158 (60.1%) patients lost their lives. The most frequent comorbidities among the patients were hypertension, diabetes, and cardiovascular disorders in 94, 92, and 86 patients, respectively. Among the 158 deceased patients, neurological
disorders were the most prevalent comorbidity (42 patients, 84%). The statistical analysis and the availability of each parameter in our dataset are summarized in Table 1. [Table 1] In the RF model, the optimum point between overfitting and efficiency was found by selecting predict a patient's outcome with a sensitivity of 70% and a specificity of 75%, while the sensitivity for the LR model was 65% and the specificity was 70%. By using the LIME technique, variables that provide the most information on the probability of each patient's death were identified. Among the six submodules identified with the highest disparity among 100 patients, albumin, BUN, and RDW were present in five of them. Age, MCH, and creatinine were present in four of the abovementioned submodules. This highlights the importance of these measurements among the recorded parameters. Additionally, BUN (in three of these submodules), RDW (in two submodules), and age (in one submodule) were the most decisive ones. This model could predict a patient's outcome reliably (AUC between 80% and 85%) over a 15-day period, as shown in Figure 3. The mortality rate was highest between zero and four days. Given that the model was designed for first-day ICU admissions, moving away from this day reduced the accuracy of the predictions and the efficacy of the LIME method for clinical interventions, as expected. [Figure 3] To evaluate the clinical capability of the model, the decision curve (DC) and the clinical impact curve (CIC) were investigated 25. The DC framework measures the clinical "net benefit" for the prediction model relative to the current treatment strategy for all or no patients. 
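The net benefit that a decision curve plots can be illustrated with a short sketch. This is not the authors' code: it uses the standard decision-curve definition (true positives minus false positives weighted by the odds of the threshold probability, both divided by the number of patients), and the outcomes, risks, and function names below are hypothetical values chosen only for illustration.

```python
import numpy as np


def net_benefit(y_true, risk, threshold):
    """Net benefit of intervening on patients whose predicted risk >= threshold."""
    y_true = np.asarray(y_true, dtype=bool)
    treated = np.asarray(risk, dtype=float) >= threshold
    n = y_true.size
    true_pos = np.sum(treated & y_true)
    false_pos = np.sum(treated & ~y_true)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of intervening on every patient regardless of predicted risk."""
    prevalence = np.mean(np.asarray(y_true, dtype=bool))
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = died) and predicted risks, for illustration only.
outcomes = [1, 1, 1, 0, 0, 0, 1, 0]
risks = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

for t in (0.2, 0.5, 0.8):
    print(f"threshold={t:.1f}  model={net_benefit(outcomes, risks, t):+.3f}  "
          f"treat-all={net_benefit_treat_all(outcomes, t):+.3f}  treat-none=+0.000")
```

A model is considered useful at a given threshold when its net benefit exceeds both reference strategies: treating all patients and treating none (whose net benefit is zero by definition).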
Net benefit is measured over a spectrum of threshold probabilities, defined as the minimum disease risk at which further intervention is required. Based on the DC, CIC, and on the assumption of the same interventions for high-risk patients, our model indicated a superior or equal net benefit within a wide range of risk thresholds and patient outcomes, as shown in Figure 4. [Figure 4] The aim of this study was to develop an interpretable ML model to predict the mortality of COVID-19 patients at the time of admission to the ICU. To the best of our knowledge, this is the first study to develop a predictive model of mortality in patients with severe COVID-19 infection at such an early stage using routine laboratory results and demographic characteristics. Only patients who had at least seven of the 10 selected biomarkers were included in the training phase of the modeling, and missing parameters were imputed using k-NN based on the data. As can be seen in the models' ROC curves, the RF algorithm outperformed LR in predicting the outcome. This difference is mainly due to the non-linear correlation between variables, reflecting the complexity of the problem. The application of the LIME-SP method allowed us to determine the patient-specific marker set on which each patient's prognosis is based. This technique explains the predictions by perturbing the input of data samples and evaluating the effects. The output of LIME is a list of features, reflecting each feature's contribution to a given prediction. 
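The perturb-and-evaluate idea described above can be sketched in a few lines. The code below is a simplified illustration written from scratch with NumPy and scikit-learn; it is not the `lime` package, it does not implement submodular pick, and it is not the authors' code. The synthetic data, the `marker_i` feature names, the Gaussian perturbation scheme, and the kernel width are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge


def explain_locally(predict_proba, x, scale, n_samples=2000, seed=0):
    """Return per-feature local contributions around instance x (LIME-style sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    scale = np.asarray(scale, dtype=float)
    # 1. Perturb the instance with Gaussian noise sized to each feature's spread.
    offsets = rng.normal(size=(n_samples, x.size))
    perturbed = x + offsets * scale
    # 2. Query the black-box model on the perturbed points.
    target = predict_proba(perturbed)[:, 1]
    # 3. Weight perturbed points by proximity to x (exponential kernel).
    distance = np.linalg.norm(offsets, axis=1)
    width = 0.75 * np.sqrt(x.size)  # assumed kernel width
    weights = np.exp(-(distance ** 2) / width ** 2)
    # 4. Fit a weighted linear surrogate; its coefficients are the local contributions.
    surrogate = Ridge(alpha=1.0).fit(offsets, target, sample_weight=weights)
    return surrogate.coef_


# Synthetic stand-in for the patient data (hypothetical, for illustration only).
X, y = make_classification(n_samples=300, n_features=6, n_informative=3, random_state=0)
forest = RandomForestClassifier(n_estimators=100, min_samples_leaf=0.1, random_state=0)
forest.fit(X, y)

contributions = explain_locally(forest.predict_proba, X[0], X.std(axis=0))
for i in np.argsort(-np.abs(contributions)):
    print(f"marker_{i}: {contributions[i]:+.4f}")
```

Each printed coefficient estimates how the predicted probability of the positive class changes per standard deviation of that feature in the neighborhood of the chosen sample, which is the kind of per-patient feature list the text refers to.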
Understanding the 'reasoning' of the ML model is crucial for increasing physicians' confidence in selecting treatments based on the prognosis scores. Using the LIME method, the significance of variables with high predictive value was determined for each prediction made for an individual. The evaluation of the variables in the individual's personalized prediction can lead to supportive measures and help determine treatment strategies according to the interpretation of the individual prognosis. As severe COVID-19 may result from various underlying etiologies, our model can help categorize patients into groups with distinct clinical prognoses, thus allowing personalized treatments. In addition to targeted therapies, the differentiation between patients may reveal disease mechanisms that coincide or that occur under specific preexisting conditions. Future cohort studies could explore these assumptions with increased sample sizes. In this study, hypoalbuminemia and renal function were identified as the main factors with high predictive values for the model. These findings are in agreement with recent results showing that hypoalbuminemia is an indicator of poor prognosis for COVID-19 patients 29. It is well documented that endogenous albumin is the primary extracellular molecule responsible for regulating the plasma redox state among plasma antioxidants 30. Moreover, it has been shown that albumin downregulates the expression of the angiotensin-converting enzyme 2 (ACE2), which may explain the association of hypoalbuminemia with severe COVID-19 31. Intravenous albumin therapy has been shown to improve multiple organ functions 32. 
Therefore, early treatment with human albumin in severe cases of COVID-19, before the drop in albumin levels, might have positive outcomes and needs to be further investigated. Furthermore, increased levels of BUN and Cr were observed in our study, which is an indication of kidney damage. An abrupt loss of kidney function in COVID-19 is strongly associated with increased mortality and morbidity 33. There are multiple mechanisms supporting this association 34, 35. One of the findings of this study is the identification of RDW (a measure of the variability of the sizes of red blood cells) as an influential parameter. This result is in line with recently published reports 36. Elevated RDW, known as anisocytosis, reflects a higher heterogeneity in erythrocyte sizes caused by erythrocyte maturation and degradation abnormalities. Several studies have found that elevated RDW is associated with inflammatory markers in the blood such as IL-6, tumor necrosis factor-α, and CRP, which are common in severely ill COVID-19 patients 37. hospitalized patients on different days after the initial ICU admission were used for their model 38. In contrast, since our goal was to predict mortality risk as early as possible for ICU patients, we were limited to using only the laboratory results on day zero. For patients with severe COVID-19 infection, early decision making is critical for successful clinical management. Additionally, laboratory results from other days may not always become available. We also identified lymphocyte count as a predictor of mortality, as in the previous study; however, CRP levels and LDH did not reach statistical significance. 
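The significance screening referred to here (a two-sample KS test for numerical parameters and a Pearson chi-squared test for categorical ones, as described in the Methods) can be sketched with SciPy. The marker values and contingency counts below are randomly generated or invented for illustration and are not taken from the study; `correction=False` is passed so that `chi2_contingency` computes the uncorrected Pearson statistic rather than applying Yates' continuity correction to the 2x2 table.

```python
import numpy as np
from scipy.stats import chi2_contingency, ks_2samp

rng = np.random.default_rng(0)

# Hypothetical values of one numerical marker in survivors and non-survivors.
survived = rng.normal(loc=20.0, scale=8.0, size=100)
died = rng.normal(loc=30.0, scale=8.0, size=150)

# Null hypothesis: the marker has the same distribution in both outcome groups.
ks_stat, ks_p = ks_2samp(survived, died)
print(f"KS statistic={ks_stat:.3f}, p-value={ks_p:.2e}")

# The same marker with most measurements missing: fewer observations give
# weaker evidence, so the p-value is larger even for similar distributions.
ks_stat_small, ks_p_small = ks_2samp(survived[:8], died[:8])
print(f"KS statistic={ks_stat_small:.3f}, p-value={ks_p_small:.2e} (8 values per group)")

# Hypothetical 2x2 table for a categorical parameter:
# rows = outcome (survived, died), columns = condition (absent, present).
table = np.array([[70, 30],
                  [83, 75]])
chi2, chi2_p, dof, expected = chi2_contingency(table, correction=False)
print(f"chi-squared={chi2:.3f}, dof={dof}, p-value={chi2_p:.4f}")
```

Parameters whose p-values fall below a chosen significance level would be kept as candidate predictors; the subsampled comparison shows why a marker with many missing values can fail that screen.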
Although IL-6 has been found to be a good predictor of disease severity by other studies, it did not reach statistical significance in our model 39. IL-6 had a considerable KS statistic, but because of the high number of missing values, its p-value was not significant compared to other markers. The fact that IL-6 is not always measured upon ICU admission is precisely why it is not suitable for our purposes. The missingness indicator of some markers in both LR and RF models has an impact on the predictions based on the regression coefficient and LIME, which can be the result of the model compensating for the imputation error. However, the missingness indicator may also indicate the existence of bias in biomarker reporting 40. Such biases (e.g., sampling bias) are an inevitable part of retrospective studies. They can be addressed using domain-adaptation techniques such as correlation alignment (CORAL) in future studies using additional data 41, 42. Another limitation of this study may be the lack of an objective criterion for ICU admission. Moreover, different treatment strategies can change the survival outcome for patients who may have had similar profiles when admitted to the ICU. In future studies, the accuracy of this model may be further improved by adding chest imaging data and by using a larger dataset. Possible targets for our ML framework include the prediction of other crucial information such as the patients' need for mechanical ventilation, the occurrence of cytokine release syndrome, the severity of acute respiratory distress syndrome, the cause of death, and the right treatment strategy. 
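The preprocessing that produces the missingness indicators discussed above (standardization, k-NN imputation with indicator columns, and 10-fold stratified cross-validation, as described in the Methods) can be sketched with scikit-learn. This is an illustrative reconstruction under stated assumptions, not the authors' code: the data are synthetic, `C=1.0` is assumed to correspond to the "L2 regularization factor equal to one", `min_samples_leaf=0.1` is assumed to express the 10% leaf-size constraint, and the number of neighbors and trees are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for ten biomarkers with values missing at random.
X, y = make_classification(n_samples=235, n_features=10, n_informative=5, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.15] = np.nan

# Keep only samples that have at least seven of the ten biomarkers.
keep = np.isnan(X).sum(axis=1) <= 3
X, y = X[keep], y[keep]


def build(model):
    # Placing the steps in a pipeline means scaling and imputation are refit
    # on the training folds only in every cross-validation iteration.
    return make_pipeline(
        StandardScaler(),                               # NaNs are ignored when fitting
        KNNImputer(n_neighbors=5, add_indicator=True),  # adds missingness indicators
        model,
    )


models = {
    "LR": LogisticRegression(C=1.0, max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=100, min_samples_leaf=0.1, random_state=0),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
auc = {name: cross_val_score(build(model), X, y, cv=cv, scoring="roc_auc")
       for name, model in models.items()}

for name, scores in auc.items():
    print(f"{name}: mean validation AUC = {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Because the indicator columns are passed to the classifier alongside the imputed values, a model can learn from the fact that a marker was not measured, which is one way the reporting bias mentioned above could enter the predictions.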
In conclusion, we evaluated 66 parameters in COVID-19 patients at the time of ICU admission. Of those parameters, 15 metrics with the highest predictive values were identified: gender, age, BUN, Cr, INR, albumin, MCV, RDW, MCH, WBC, segmented neutrophil count, lymphocyte count, and past medical history of neurological, respiratory, and cardiovascular disorders. In addition, by using the LIME-SP method, we identified different submodules clarifying distinct clinical manifestations of severe COVID-19. The ML model trained in this study could help clinicians determine rapidly which patients are likely to have worse outcomes and, given the limited resources and reliance on supportive care, allow physicians to make more informed decisions. The authors reported no potential conflict of interest. The data that support the findings of this study are available from the corresponding authors upon request.
COVID-19) Dashboard
Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the New York City Area
Clinical features of COVID-19 in elderly patients: A comparison with young and middle-aged patients
Rate of Intensive Care Unit admission and outcomes among patients with coronavirus: A systematic review and Meta-analysis
Scoring systems in the intensive care unit: A compendium
Comparison of severity scores for COVID-19 patients with pneumonia: a retrospective study
Application of Artificial Neural Network for Prediction of Risk of Multiple Sclerosis Based on Single Nucleotide Polymorphism Genotypes
Artificial Intelligence (AI) applications for COVID-19 pandemic
Factors Associated With Mental Health Outcomes Among Health Care Workers Exposed to Coronavirus Disease
Exposure to COVID-19 patients increases physician trainee stress and burnout
Do doctors experiencing burnout make more errors?
Estimating the Attributable Cost of Physician Burnout in the United States
Feature Selection for Dimensionality Reduction
Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation
K-Nearest Neighbor (K-NN) based Missing Data Imputation
scikit-learn
Learning to Rank from Medical Imaging Data
Feature selection, L1 vs. L2 regularization, and rotational invariance. Twenty-first International Conference on Machine Learning (ICML '04)
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Decision curve analysis: a novel method for evaluating prediction models
COVID-19: consider cytokine storm syndromes and immunosuppression
Induction of pro-inflammatory cytokines (IL-1 and IL-6) and lung inflammation by Coronavirus-19 (COVI-19 or SARS-CoV-2): antiinflammatory strategies
Proposed Mechanisms of Targeting COVID-19 by Delivering Mesenchymal Stem Cells and Their Exosomes to Damaged Organs
Low albumin levels are associated with poorer outcomes in a case series of COVID-19 patients in Spain: a retrospective cohort study
Specific antioxidant properties of human serum albumin. Ann Intensive Care
Albumin caused the increasing production of angiotensin II due to the dysregulation of ACE/ACE2 expression in HK2 cells
The clinical use of albumin: the point of view of a specialist in intensive care
Kidney disease is associated with in-hospital death of patients with COVID-19
Hypoxia: The Force that Drives Chronic Kidney Disease
Acute Kidney Injury in COVID-19: Emerging Evidence of a Distinct Pathophysiology
Elevated RDW is Associated with Increased Mortality Risk in COVID-19
Red cell distribution width, inflammatory markers and cardiorespiratory fitness: results from the National Health and Nutrition Examination Survey
An interpretable mortality prediction model for COVID-19 patients
Elevated levels of IL-6 and CRP predict the need for mechanical ventilation in COVID-19
A Potential for Bias When Rounding in Multiple Imputation
Correlation Alignment for Unsupervised Domain Adaptation