key: cord-0977590-wjpxk0iw authors: Cavallaro, M.; Moiz, H.; Keeling, M. J.; McCarthy, N. D. title: Contrasting factors associated with COVID-19-related ICU and death outcomes: interpretable multivariable analyses of the UK CHESS dataset. date: 2020-12-07 journal: nan DOI: 10.1101/2020.12.03.20242941 sha: 14e8e68c0729060ea9b9187a7e1b4b1cc59a67e8 doc_id: 977590 cord_uid: wjpxk0iw Identifying factors associated with severe COVID-19 is a priority to guide clinical care and resource use in this pandemic. This cohort comprised 13954 in-patients with confirmed COVID-19. Study outcomes were death and intensive care unit admission (ICUA). Multivariable logistic regression estimated odds ratios adjusted for 37 covariates (comorbidities, demographic, and others). Gradient boosted decision tree (GBDT) classification generated Shapley values evaluating the impact of covariates for each patient. Deaths due to COVID-19 were associated with immunosuppression due to disease (Odds Ratio 1.39, 95%CI [1.10-1.76]), type-2 diabetes (1.31, [1.17-1.46]), chronic respiratory disease (1.19, [1.05-1.35]), obesity (1.16, [1.01-1.33]), age (1.56/10-year increment, [1.52-1.61]), and male sex (1.54, [1.42-1.68]). For some factors, associations with ICUA differed in direction (e.g., age, chronic respiratory disease) and in scale (e.g., obesity, 3.37 [2.90-3.92]). Ethnicity was strongly but variably associated with both outcomes; for example, Irish ethnicity was negatively associated with death but not with ICUA. GBDTs had similar performance (ROC-AUC, ICUA 0.83, death 0.68 for GBDT; 0.80 and 0.68 for logistic regression). Shapley explanations overall were consistent with odds ratios. Chronic heart disease, hypertension, other comorbidities, and some ethnicities had Shapley impacts on death ranging from positive to negative among different patients, although they were consistently associated with ICUA for all.
Immunosuppressive disease, type-2 diabetes, and chronic liver and respiratory diseases had positive impacts on death with either positive or negative impacts on ICUA. Very different association of some factors, e.g., obesity, with death and ICUA may guide review of practice. Shapley explanation identified varying effects among patients, emphasising the importance of individual patient assessment. COVID-19, due to SARS-CoV-2 betacoronavirus, emerged in Wuhan, China in late 2019 and has spread globally. It can cause severe complications of pneumonia, acute respiratory distress syndrome, sepsis, and septic shock 1. It has, as of October 24, 2020, infected over 42 million people and killed over 1.1 million people 2. Certain patient subsets, such as the elderly and those with comorbidities, are at an increased risk of severe outcomes from COVID-19 such as admission to intensive care units, respiratory distress requiring mechanical ventilation, and death 3, 4. Clinicians can use predictive factors to prioritize patients at higher risk of clinical deterioration and public health authorities can use them to target public health interventions. Identifying factors associated with severe disease has been described as an urgent research priority. Several studies have sought to identify factors predicting poor outcome following COVID-19 infection 5, 6 and assist clinician decision making [7] [8] [9]. A traditional method such as logistic regression can infer the odds ratios (ORs) of the outcome in the presence of a risk factor. Modern machine-learning technologies, widely implemented during the COVID-19 pandemic, can handle more complex patient data types, offer greater generality, and produce more accurate predictions than the previous methods, but at the cost of losing transparency and interpretability 10. Surveillance systems support these analyses.
The COVID-19 Hospitalization in England Surveillance System (CHESS), a UK system distributed by Public Health England (PHE) and adapted from the UK severe influenza surveillance system, collects extensive patient admission data including known comorbidities and important demographic information (such as age, sex, and ethnicity). This large national dataset reduces limitations inherent in small cohorts, enabling more reliable identification of associations. Analysis was performed using logistic regression and a more general machine-learning model (the gradient-boosted decision tree, GBDT), which generated interpretable predictions by means of the Shapley additive explanation, a technique that, to the best of our knowledge, has not yet been deployed in COVID-19 studies and mitigates the interpretability issue in machine-learning outputs. Through these methods, we demonstrated the extent to which pre-existing conditions differentially predicted death and intensive care unit (ICU) admission. Some factors affected both similarly but others proved to be protective for one while increasing the risk for the other, or showed very different effect sizes. We also identified variation of effects among patients. These results may be useful to clinicians assessing hospitalized patients with COVID-19. They may also provide a greater context or benchmark for individuals evaluating or interpreting complex automated clinical decision systems designed to identify those most at-risk. Description of cohort and outcomes included in the study. With 8628 patients, white British was the largest group in the cohort and was therefore chosen as the reference category. 1895 patients did not identify themselves with any ethnicity and were labelled as "NA". With the exception of age and admission date, all features were stratified to binary variables. Entries labelled "diabetes" whose type was unknown and not recorded in the database as "type 1" have been considered as "type 2".
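The diabetes recoding rule just described can be illustrated with a minimal sketch. The function name `encode_diabetes`, the raw label strings, and the treatment of missing entries as "no diabetes" are all assumptions made for illustration; the actual CHESS field values and the authors' preprocessing code are not reproduced here.

```python
def encode_diabetes(label):
    """Return (type1_flag, type2_flag) for a raw diabetes entry.

    Hypothetical rule of thumb following the text: an entry is counted as
    type 1 only when it is explicitly recorded as such; any other diabetes
    entry, including one of unknown type, is counted as type 2.
    """
    # Assumption: a missing or empty entry means no diabetes was recorded.
    if label is None or label.strip() == "":
        return (0, 0)
    text = label.strip().lower()
    if "type 1" in text:
        return (1, 0)
    # Unknown or unspecified type falls through to type 2.
    return (0, 1)


# Toy entries (invented labels, not real CHESS records).
examples = ["diabetes", "Diabetes type 1", "diabetes type 2", None]
flags = [encode_diabetes(entry) for entry in examples]
```

Each entry yields two binary columns, which matches the statement that all features other than age and admission date were binary.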
Death and ICU admission were chosen as outcomes. The median age of this sample was 70 years (IQR 56-81, range 1-105), 59.25% were men and 0.18% had an unrecorded sex. The prevalence of comorbidities is reported in Table S1 and ethnicity in Table S2. Cross-correlations between recorded ethnicities and pre-existing conditions are illustrated in Figure S1. Logistic regression models were used to estimate odds ratios (ORs) of all 37 pre-existing conditions and demographic factors for both outcomes. Standard errors (SEs) and confidence intervals (CIs) of the ORs were computed using the Taylor series-based delta method and the profile likelihood method, respectively, and statistical significance was assessed using a Wald test. In addition, we applied a "gradient boosted decision tree" (GBDT) machine-learning model with a logistic objective function. A GBDT aggregates a large number of weak prediction models, in this case decision trees, into a robust prediction algorithm, where the presence of many trees mitigates the errors due to a single-tree prediction. Each individual tree consists of a series of nodes that represent binary decision splits against one of the input variables, with its final output being determined by the nodes at the end of the tree (known as leaves). The model was implemented in the XGBoost library (version 0.81) 23 and depended on a number of hyper-parameters. To avoid over-fitting, these hyper-parameters were selected by means of Bayesian optimization of the c-statistic using 5-fold cross-validation over the training set 24, with the L1-regularisation parameter held constant at 0.5. We used Shapley additive explanation (SHAP) analysis to understand the result of a GBDT model fit 11,12. The importance of each feature in the model output is represented by the so-called Shapley values, introduced in the game theory literature and providing a theoretically justified method for allocation of credit among a group of players.
In the context of machine learning, the same mathematics is used to allocate the credit for the GBDT prediction among the features included in the study, for each patient. The chief output of this approach is a matrix of Shapley values $\phi_{ij}$, where $i$ indicates a patient and $j$ is a pre-existing condition or other patient characteristic, $j = 1, 2, \ldots, M$. We also refer to a Shapley value $\phi_{ij}$ as the impact of $j$ on the outcome for the patient $i$. Similar to the logistic regression model, for each patient $i$, the trained GBDT model returns a decision value $y_i$ to be interpreted as the logarithm of the odds that the outcome is poor. The Shapley values are unique allocations of credit in explaining the decision among all the features, where for our case, negative values ($\phi_{ij} < 0$) tip the decision value towards good outcome, while positive values ($\phi_{ij} > 0$) towards bad (i.e., ICU or death). The model output satisfies $y_i = \sum_{j=0}^{M} \phi_{ij}$, where $\phi_{i0}$ is a bias term. In tree-based models, the same idea has been extended to allocate the credit to pairs of features, thus yielding $y_i = \sum_{j} \sum_{k} \Phi_{ijk}$, where the $\Phi_{ijk}$s are referred to as SHAP interaction values 12. The diagonal term $\Phi_{ijj}$ encodes the net effect on the model prediction of a feature $j$, stripped of its interactions with the other features $k \neq j$, and is referred to as the SHAP main effect of $j$. We used an implementation specific to tree-based models accessible via the XGBoost and SHAP libraries; we refer the reader to references 11,12 for a more comprehensive discussion and for the implementation details. To rank the features by their overall importance, we average the Shapley values over the patients whose variable value $x_{ij}$ is higher than its mean $\bar{x}_j$ to obtain the importance score $I_j = \sum_i \phi_{ij} \, \mathbb{1}_{x_{ij} > \bar{x}_j} / \sum_i \mathbb{1}_{x_{ij} > \bar{x}_j}$, where $\mathbb{1}_{x_{ij} > \bar{x}_j}$ is 1 if $x_{ij} > \bar{x}_j$ and 0 otherwise. All models were fitted to a randomly chosen 90% of data entries, while the remaining entries were used for validation.
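As a concrete illustration of the importance score and of the additive decomposition of the decision value, the following minimal sketch works on a small hand-made matrix. It assumes the Shapley values have already been computed; the helper names `importance_scores` and `decision_value`, and the toy numbers, are illustrative only and are not taken from the authors' code.

```python
from statistics import mean


def importance_scores(shap_values, features):
    """Average each feature's Shapley values over the patients whose
    feature value is above the cohort mean of that feature."""
    n_features = len(features[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in features]
        threshold = mean(column)
        # Keep only patients with x_ij above the mean of feature j.
        selected = [shap_values[i][j] for i, x in enumerate(column) if x > threshold]
        # Assumption: a feature with no patient above its mean gets NaN.
        scores.append(mean(selected) if selected else float("nan"))
    return scores


def decision_value(bias, shap_row):
    """Log-odds decision value as the bias term plus the patient's Shapley values."""
    return bias + sum(shap_row)


# Toy cohort: 4 patients, 2 binary features (invented numbers).
features = [[1, 0], [1, 1], [0, 0], [0, 1]]
shap_values = [[0.4, -0.1], [0.6, 0.3], [-0.2, -0.1], [-0.2, 0.5]]

scores = importance_scores(shap_values, features)
first_patient_log_odds = decision_value(-1.0, shap_values[0])
```

For a binary feature the mean lies strictly between 0 and 1 whenever both values occur, so the score reduces to the average Shapley value over the patients who have the condition.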
Goodness-of-prediction was assessed by means of the c-statistic of the receiver operating characteristic curve (ROC-AUC) on the validation set, with bootstrapped 2.5%-97.5% confidence intervals. Data management was performed using Python (version 3.7.1) and Pandas (version 0.23.4), with analyses carried out using Python, Scikit-learn (version 0.20.1), and R (version 3.4.3). All code for data management and analysis is archived online at https://github.com/mcavallaro/CovidC Risk factors showed strong associations with both death and admission to ICU, but the strength and even direction of these associations differed substantially across these outcomes (Figure 1). From logistic regression analysis (Table 1 and Figure S2), deaths due to COVID-19 were strongly associated with immunosuppression due to disease (OR 1. [...] The regression was adjusted for other comorbidities including type-1 diabetes, chronic liver disease, serious mental illness, chronic renal disease, chronic neurological condition, chronic heart disease, hypertension, and asthma, none of which were significantly associated with death (P>0.05). Having any comorbidity other than these was recorded in the dataset as "other comorbidity" and appeared to be a protective factor (OR death, 0.87, 95%CI 0.79-0.95). Some self-reported ethnicities, compared to white British, were associated with substantially increased (e.g., Indian (OR 1.84, 95%CI 1.42-2.73)) or decreased (e.g., white Irish (OR 0.49, 95%CI 0.35-0.92)) risk of death (Figure 1, Table 1). Asymptomatic testing was associated with substantially lower risk of death (OR 0.29, 95%CI 0.18-0.45). Among co-morbidities, ICU admission (Table 1 and Figure S3) was strongly positively associated with obesity ( [...] Figure S3 and detailed in Table 1.
The association of each predictor with death and with ICU admission are shown in Figure 1 , highlighting some contrasts in direction and magnitude while other risk factors appear more consistently associated with the two outcomes. The overall associations obtained from the GBDT model were consistent with the logistic model results. The receiver operating characteristic (ROC) curves for the logistic regression models are plotted in Figure S4 . The ROC-Area Under the Curve (AUC) scores for the logistic regression classifiers were 0.68 (95%CI 0.65-0.71) and 0.8 (95%CI 0.77-0.82) for death and ICU outcome predictions, respectively. Generalized collinearity diagnostics by means of variance inflation factor (VIF) excluded severe collinearity (Table S6 ). The scores of GBDT for classification task were 0.68 (95% CI: 0.66-0.71) and 0.83 (95% CI: 0.81-0.85) for the death and ICU outcome predictions, respectively. In addition to outcome prediction, the GBDT analysis with Shapley value explanations yielded the impact of each feature on both death and ICU outcome for each single patient (summarised in Figures S5 and S6 ). We contrasted the Shapley values for impacts on death and ICU admission in Figure 2 . All patients with obesity, serious mental illness, immunosuppressing treatment, male sex, asymptomatic admission, and those whose self-reported ethnicity was other black, Indian, black Caribbean, other Asian, other white, and NA had concordant impacts to death and ICU admission. In almost all asthma patients, it is possible to appreciate negative impact on death and positive impact on ICU admission. Patients with type-1 diabetes, chronic renal disease, or chronic neurological disease show positive association with death and negative association with ICU outcome, although with very dispersed Shapley value distributions. 
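The concordant/discordant reading of the paired Shapley values can be made explicit with a small sketch: a patient's impact is concordant when the Shapley values for death and for ICU admission share a sign, and discordant otherwise. The function names, the "neutral" label for exact zeros, and the toy pairs are assumptions for illustration, not part of the original analysis.

```python
def impact_agreement(shap_death, shap_icu):
    """Label one patient's pair of Shapley values by sign agreement."""
    # Assumption: an exactly zero impact on either outcome is labelled "neutral".
    if shap_death == 0 or shap_icu == 0:
        return "neutral"
    same_sign = (shap_death > 0) == (shap_icu > 0)
    return "concordant" if same_sign else "discordant"


def summarise(pairs):
    """Count concordant, discordant, and neutral patients for one feature."""
    counts = {"concordant": 0, "discordant": 0, "neutral": 0}
    for shap_death, shap_icu in pairs:
        counts[impact_agreement(shap_death, shap_icu)] += 1
    return counts


# Toy (death, ICU) Shapley pairs for one feature (invented numbers).
pairs = [(0.3, 0.5), (-0.2, 0.4), (0.1, -0.6), (-0.3, -0.1), (0.0, 0.2)]
counts = summarise(pairs)
```

A feature whose patients are almost all discordant, with negative impact on death and positive impact on ICU admission, would correspond to the asthma pattern described above.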
Upon visual inspection, the scatter points for chronic liver disease, type-1 diabetes, chronic neurological, and chronic heart comorbidities show two (or more) clusters with respect to the impact on death. The hypertension scatter plot displays a neat partition with respect to the impact on ICU outcome, showing that this variable was associated with intensive care. Its impact on death is less clear, with patients having discordant or concordant Shapley values for death. The cases of type-2 diabetes and chronic respiratory disease appear diametrically opposite to these, as all patients with such conditions had positive Shapley values for death with qualitatively different impacts on ICU outcome. The features were ranked according to their median ORs and their importance scores I (defined in methods), showing that these two are ordinally associated in both death (Spearman's ρ=0.47, P=0.005) and ICU outcomes (Spearman's ρ=0.97, $P = 13 \times 10^{-22}$), as shown in Figure S7. The explanation model for the GBDT was therefore largely consistent with the interpretable logistic linear model. The analysis of SHAP main effects also revealed the non-linear relations between outcomes and the age and admission day (Figures S8 and S9). The probability of death rose above 30 years of age. Likelihood of ICU admission decreased markedly above 60. This cohort study investigated the association between patient characteristics (demographics and comorbidities) and severe outcomes with COVID-19 using a large national dataset in England (the CHESS database). Our findings on many factors were largely consistent with the patterns observed worldwide in studies on patients infected with SARS-CoV-2 13-25. Both logistic and GBDT models predicted admission to ICU more accurately than death. Obese patients were approximately 3.4-fold more likely to be admitted to ICU (the strongest association for any co-morbid condition), while the association with mortality was much weaker (OR 1.16).
In a US study involving 3615 patients, patients with a body mass index (BMI) between 30 and 35 were 2-fold more likely to reach the ICU and those with a BMI of over 35 were 3-fold more likely, when compared to those with a BMI of less than 30 13. These very high levels of ICU admission in our and other works, as well as the contrastingly weaker association with COVID-19 mortality, could be explained by clinicians tending, relatively, to over-admit obese patients to ICU. It could also reflect ICU admission being very effective in reducing mortality in this group; this is an important area for further research 26. Hypertension and asthma were associated with ICU admission but not death. Others have reported increased risk of severe COVID-19 among asthmatics, with the increase driven only by patients with non-allergic asthma 15. Hypertension has been associated with severe COVID-19 disease in previous univariable studies but there is no clear evidence that hypertension is an independent risk factor 17.

medRxiv preprint, this version posted December 7, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.

Black or Asian minority ethnic groups showed higher odds of death and substantially higher odds of ICU admission in our data compared to white British patients. Similar findings to ours have been demonstrated UK-wide. Multivariable analyses from large multi-ethnic cohorts have suggested that Asian and black patient groups experienced excess levels of mortality, hospital admission, and intensive care admission even when differences in age, sex, deprivation, geographical region, and some key comorbidities were taken into account 5, 16, 18, 19. White Irish ethnicity was associated with substantially lower risk of death.
This finding, adjusted for all covariates, echoes findings in an earlier study comparing death rates standardised for age and region using census data 18 . Chinese ethnicity predicted ICU admission (OR 10.2) most strongly, followed by black Caribbean (OR 5.2). For these and other minority groups the association with ICU admission far exceeded that of death. An unrecorded or unknown ethnicity was strongly negatively associated with ICU admission, but not strongly associated with death. This may indicate increased recording of ethnicity on ICU admission, a potential cause of bias in estimating true differences in risk of ICU admission across ethnicities. Age, type-1 diabetes, and neurological, heart, and respiratory diseases were negatively associated with ICU admission but not death. Age and chronic respiratory disease were strongly positively associated with death. Data gathered across USA showed that deaths are 90 times higher in the 65-74 age group than the 18-29 age group and 630 times higher in the 85 and older group 27 . This may reflect judgements of limited capacity to benefit from ICU admission due to age and some comorbidities. Type-2 diabetes is broadly reported to be associated with poor outcome in COVID-19 patients, while studies reporting outcome for type-1 diabetes are rare 20, 21 . A national general practice based analysis in England demonstrated that both type-1 and type-2 diabetes are associated with increased risk of in-hospital death with COVID-19 22 . Our multiply adjusted analysis of the CHESS dataset confirmed that type-2 diabetes had a strong association with mortality (and nonsignificant association with ICU admission), while type-1 diabetes' association was positive but not statistically significant. On the other hand, type-1 diabetes was negatively associated with ICU outcome. There is uncertainty regarding the effect of diabetes and glycaemic control on COVID-19 outcome. 
Whilst some suggest a 3-fold increase in intensive care admission and death 23 , others found no association between glycaemic control and severe outcome 25 . Potential mechanisms for effects could include hyperinsulinemia or the interaction of SARS-CoV-2 with ACE2 receptors expressed in pancreatic β cells 20, 28 . Male sex was positively and similarly associated with both ICU admission and death. The increased risk of male deaths is consistent with worldwide data, in which, on average, 1.4-fold more men than women have died from SARS-CoV2, with some countries reporting greater than 2-fold male deaths 29 . Increased expression of the ACE2 receptor may occur in men and has been suggested as a possible explanation for this finding 24 . Asymptomatic testing and pregnancy demonstrated a strong negative association with both death and ICU admission. These results were expected in view of NHS trusts undertaking surveillance swabs for asymptomatic people, including among elective hospital admissions. Different machine-learning models have been leveraged to predict COVID-19 patients at risk of sudden deterioration. A study over 162 infected patients in Israel demonstrated that artificial intelligence may allow accurate risk prediction for COVID-19 patients using three models (neural networks, random trees, and random forests) 30 ; a random forest model was used over 1987 patients for early prediction of ICU transfer 31 ; the GBDT model was deployed on blood-sample data from 485 patients in Wuhan, China 32 ; GBDT models outperformed conventional early-warning scoring systems for ventilation requirement prediction over 197 patients 33 ; deep learning and ensemble models were reported to perform well for early warning and triaging in China 9,34 . 
These models are very complex, but evidence indicates that mortality predictions can be obtained from more parsimonious models, upon selecting the most important features, thus facilitating more efficient implementation of machine-learning in clinical environments 35 . Despite these successes, prediction models have been found overall to be poorly reported and at high risk of bias in a systematic review 36 . A comprehensive list of relevant works is out of the scope of this paper, but it is worth underlining that machine-learning methods typically excel in outcome prediction but lack ease of interpretation of the result. In this study, we bridged the gap between performance and interpretability in machine learning for poor outcome predictions in COVID-19 patients. We trained GBDT models (see methods section) and extracted not-only their predictions, but also the extent to which each potential risk factor contributed to the prediction overall (thus permitting comparisons with the more easily-interpretable logistic regression model) and for each patient. So-called "Shapley values" quantify such information, as summarised in Figure S5 and S6 for death and ICU respectively. Overall, the association of patient features with the final outcome (measured by the SHAP importance score, see methods and Figure S7 ) is consistent with the logistic regression results, although the two models are intrinsically different. Moreover, for each feature, we derived an individual Shapley value for each patient, allowing us to consider the variation in effects among patients. As a first example, we discuss interpretation of type-2 diabetes. In the summary plots of Figures 2, S5 , and S6, the red markers correspond to type-2 diabetes patients and blue to patients without type-2 diabetes. In the summary plot for death outcome (Figures 2 and S5) , the red and blue markers are grouped into two distinct clusters. 
All the type-2 diabetes patients had positive Shapley values, thus showing that such a comorbidity was always associated with death, while all the other patients had nearly zero Shapley values. Conversely, in the summary plot for ICU outcome (Figure S6 [...] overall importance of a potential risk factor, but also its range of effects over the patients. In this case our interpretation is that, although consistently increasing the risk of death, the presence of type-2 diabetes had a more variable impact on decision making around ICU admission, in some cases apparently adding to the case for admission and in some cases diminishing it. Being male was positively associated with both death and ICU admission. Its impacts were concordant in sign and confined within a narrow range of values. Conversely, for example, chronic renal disease and immunosuppressive treatment had low impact on predicting death for some patients, but very high impact for others, perhaps reflecting that these categories comprise a number of diverse conditions and therapies. Considering ethnicity, most minority groups were consistently and positively associated with ICU admission but the impacts attributed to Pakistani ethnicity were much more variable. Shapley value analysis of the GBDT model also excels in explaining the nonlinear relations between covariates and their importance to outcome prediction. In Figure S7 A, the predicted probability of death is shown to increase with age, in part due to the increasing presence of comorbidities which are correlated with increasing age (Figure S1). The isolated effects of age (the SHAP main effects for age) illustrated in Figure S8 C show a sharp rise from age 30.
For ICU admission, the SHAP main effects for age drop abruptly from the 60th year of age (Figure S9), which may suggest an age threshold being applied in clinical decision making on ICU admission. During the first peak of the COVID-19 epidemic, healthcare services were under variable strain and clinical expertise was growing over time. Declining in-hospital mortality was observed in Italy 37 and England 38 during the first pandemic peak. This may reflect a mix of changing pressure, developing clinical expertise, and variable follow-up time following admission. We included the patient's admission day in our models to allow for these effects in adjustment (logistic regression) and attribution of impact (machine learning). Hospital admission later than March decreased both death and ICU admission. While all our models had excellent performances, it is worth noting that, for both models, prediction of ICU outcome was significantly better than prediction of death alone. Including laboratory test results in the predictor variables may improve death prediction 39. This study confirms that, in hospitalised patients, the risk of severe COVID-19, defined as either death or transfer to intensive care unit, is strongly associated with known demographic factors and comorbidities. We found that the association of these variables with death was often qualitatively and quantitatively different from their association with ICU admission. This was consistently derived by means of two different predictive models, i.e., the standard logistic and the GBDT machine-learning models. The Shapley value explanation of the latter model also highlights the impact of each factor for each patient. These results allow an insight into the variable impact of individual risk factors on clinical decision support systems. We suggest that these should not only grant the optimal average prediction, but also provide interpretable outputs for validation by domain experts.
These aspects are particularly valuable to tackle COVID-19, a complex disease that can cause a variety of symptoms and clinical outcomes, depending on the patients' conditions, and rapidly overwhelm healthcare systems, thus requiring large-scale automated decision systems.

Tables and figures

Table 1. Estimated odds ratios (ORs) from adjusted logistic regressions and importance (Imp) scores of death and intensive-care unit admission (ICUA) outcomes.

[...] to discordant and concordant associations. The figure highlights mismatches in the ORs of a number of variables, e.g., asthma and "other comorbidity" were risk factors for ICUA but protective for death outcome. Chronic respiratory disease was a risk factor for death but negatively associated with ICU admission.
For most ethnicities the ORs of death and ICUA were concordant in sign but of different magnitude. Abbreviations: Mental ill.: serious mental illness; Resp. dis.: respiratory disease; Neuro. dis.: neurological disease; Immunos. dis.: immunosuppression due to disease; T1D: type-1 diabetes; T2D: type-2 diabetes; Eth. NA: ethnicity unrecorded; Immunos. treat.: immunosuppression due to treatment; Asymp.: asymptomatic admission.

Figure 2. Contrasting Shapley values for impact on death and intensive-care unit admission (ICUA) for all variables included in this study. Each marker in the scatter plots corresponds to an in-patient. Colors from red to blue indicate the value of the underlying variable (in binary variables, red color means the feature is present, blue otherwise; in the age feature, red to blue shades correspond to old to young ages; in admission day, red to blue shades correspond to early to late dates). The explanation models assigned a concordant (discordant) impact on death and ICUA to the patients in the white (grey) regions. The scatter plots expose not only the importance of a potential risk factor but also its range of effects over the cohort.
All patients with immunosuppressive disease, type-2 diabetes, liver and respiratory disease, and Pakistani self-defined ethnicity had positive Shapley values for death, with impact on ICU ranging from negative to positive values, thus suggesting that these conditions were always leaning towards death but not consistently towards ICUA. Conversely, hypertension always had a positive impact on ICUA, whilst it could have either a positive or a negative impact on death for different patients. The Shapley values for death for many features appear clustered (type-1 diabetes, chronic liver, neurological, and heart disease comorbidities), thus suggesting the presence of different groups under the same labels with different effects on patient health. Inspection of the age pattern suggests the presence of a group of young patients (blue markers) with negative impact on both death and ICU outcome, old-age patients with positive impact on death and negative impact on ICU outcome, and intermediate-age patients with impacts negative on death and positive on ICU outcome. For abbreviations, see the caption of Figure 1.

Figure S1. Correlation heatmap between self-defined ethnicities and pre-existing conditions. Color shades from blue to red correspond to increasing values of the Pearson correlation coefficient (white: no correlations are present). NA labels in-patients who did not identify themselves with any ethnicity.
Figure S2. Odds ratios for death outcome from Table 1. Features are grouped in comorbidities, ethnicities, and others (top to bottom). Significance star codes are * P ≤ 0.05, ** P ≤ 0.01, *** P ≤ 0.001, **** P ≤ 0.0001; no star corresponds to a non-significant statistical association, and more stars correspond to higher statistical significance.

Figure S3. Odds ratios for ICU outcome from Table 1. Features are grouped in comorbidities, ethnicities, and others (top to bottom). Stars are codes for significance (* P ≤ 0.05, ** P ≤ 0.01, *** P ≤ 0.001, **** P ≤ 0.0001); no star corresponds to a non-significant statistical association, and more stars correspond to higher statistical significance.

Figure S4. ROC curves (C-statistics) of the logistic regression classifiers over the validation set. Confidence intervals are obtained by means of bootstrapping.
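The caption of Figure S4 does not say which bootstrap variant was used. As one possible reading, the sketch below computes the C-statistic as the fraction of positive/negative pairs ranked correctly and derives a percentile bootstrap interval by resampling patients with replacement. The function names, the percentile method, the number of resamples, and the example data are all assumptions for illustration.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed variant, see lead-in)."""
    roc_auc(labels, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


# Hypothetical outcomes (1 = event) and predicted probabilities; not study data.
labels = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
scores = [0.10, 0.35, 0.40, 0.20, 0.80, 0.65, 0.50, 0.30, 0.15, 0.90]

auc = roc_auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
```

The pairwise AUC is quadratic in the cohort size, so a rank-based implementation would be preferable for a dataset of the size analysed here.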
Figure S5. Summary plot of Shapley values for impact on death outcome. For each potential risk factor, the Shapley values of each patient are represented as a swarm plot distribution. Colors from red to blue indicate the value of the underlying variable (for binary variables, red means the feature is present, blue otherwise; for age, red to blue shades correspond to old to young ages; for admission day, red to blue shades correspond to early to late dates). Shapley values for age are scaled by a factor 0.5 to fit the plot range. Features are grouped in comorbidities, ethnicities, and others. Within each group, the factors are ranked according to the importance score Imp (see Table 1); in other words, higher in the list are the conditions most likely to be associated with the worst outcome. For binary variables, Imp is obtained by averaging the Shapley values over the red points. Immunosuppression by disease, type-2 diabetes mellitus, being male, and chronic liver, renal, respiratory and neurological conditions consistently appear to have a positive impact on death outcome for all patients. Asthma was found to have a negative impact on death for all patients.
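For a binary variable, the importance score Imp described above is the mean Shapley value over the patients in whom the feature is present (the red points). A minimal sketch of that average follows; the function name `importance_score` and the example values are hypothetical, not study data.

```python
def importance_score(shap_values, feature_present):
    """Mean Shapley value over the patients for whom the binary feature is present."""
    selected = [s for s, present in zip(shap_values, feature_present) if present]
    if not selected:
        raise ValueError("feature is absent for every patient")
    return sum(selected) / len(selected)


# Hypothetical per-patient Shapley values for one binary feature; illustrative only.
shap_values = [0.25, -0.02, 0.35, -0.01, 0.30]
feature_present = [True, False, True, False, True]

# Only the three patients with the feature contribute to the average.
imp = importance_score(shap_values, feature_present)
```

Averaging only over patients with the feature keeps the score from being diluted by the many patients for whom a rare condition is absent.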
Figure S6. Summary plot of Shapley values for impact on ICU outcome. Colors and keys as in Figure S5. The presence of obesity, hypertension, immunosuppression treatment, and "other comorbidity" were clear indicators of ICU outcome for all patients. Shapley values for age are scaled as in Figure S5.

Figure S7. SHAP importance scores from the explanation model for GBDT vs logarithm of odds ratios (ORs) from logistic regression for death (A) and intensive-care unit (ICU) admission (B). Each point represents a feature (see Table 1). Red markers correspond to the features whose association with the outcome was not significant according to the logistic regression (P > 0.05). The x-axis error bars are 68% confidence intervals. The y-axis error bars are standard errors of the importance scores. The SHAP importance I allows us to assess to what extent a feature contributes to the GBDT prediction. This plot shows that the importance scores are consistent with the well-known logistic regression coefficients, although the underlying models used to generate these two quantities are fundamentally different.

Figure S8. A-B) Probability of death predicted by GBDT for each patient in the training set. C-D) SHAP main effect for age and admission date. These effects can be ascribed to the age/admission date alone, regardless of the other covariates. The strong pattern in the main effect for admission date highlights the importance of incorporating timing in predictive models.

Figure S9. A-B) Probability of intensive-care unit admission predicted by GBDT for each patient in the training set. C-D) SHAP main effect for age and admission date.
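The x-axis of Figure S7 shows log odds ratios with 68% intervals, which under a normal approximation span roughly one standard error either side of the estimate. The sketch below back-calculates such an interval from a reported OR and its 95% CI, using the obesity and death estimate from the abstract (1.16, [1.01-1.33]). This is an illustrative reconstruction under the normal approximation, not necessarily how the authors produced the error bars; the function name `log_or_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def log_or_interval(odds_ratio, ci95_low, ci95_high, level=0.68):
    """Return (log OR, lower, upper) at the requested level on the log scale."""
    z95 = NormalDist().inv_cdf(0.975)
    # Standard error recovered from the width of the 95% interval on the log scale.
    se = (math.log(ci95_high) - math.log(ci95_low)) / (2 * z95)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = math.log(odds_ratio)
    return centre, centre - z * se, centre + z * se


# Obesity and death, as reported in the abstract: OR 1.16, 95% CI [1.01-1.33].
log_or, low68, high68 = log_or_interval(1.16, 1.01, 1.33)
```

Because the reported bounds are rounded to two decimals, the recovered standard error is approximate.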
Transmission, Diagnosis, and Treatment of Coronavirus Disease 2019 (COVID-19): A Review
WHO Coronavirus Disease (COVID-19) Dashboard
Comorbidities and the risk of severe or fatal outcomes associated with coronavirus disease 2019: A systematic review and meta-analysis
Clinical Characteristics of Patients Who Died of Coronavirus Disease 2019 in China
Factors associated with COVID-19-related death using OpenSAFELY
Rapid Epidemiological Analysis of Comorbidities and Treatments as risk factors for COVID-19 in Scotland (REACT-SCOT): A population-based case-control study
Living risk prediction algorithm (QCOVID) for risk of hospital admission and mortality from coronavirus 19 in adults: national derivation and validation cohort study
Risk stratification of patients admitted to hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: development and validation of the 4C Mortality Score
Machine learning based early warning system enables accurate mortality risk prediction for COVID-19
Interpretable Machine Learning - A Brief History, State-of-the-Art and Challenges
From local explanations to global understanding with explainable AI for trees
Obesity in Patients Younger Than 60 Years Is a Risk Factor for COVID-19 Hospital Admission
High Prevalence of Obesity in Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) Requiring Invasive Mechanical Ventilation
Ethnicity and Risk of Death in Patients Hospitalised for COVID-19 Infection: An Observational Cohort Study in an Urban Catchment Area
Hypertension and related diseases in the era of COVID-19: a report from the Japanese Society of Hypertension Task Force on COVID-19
Asian and Minority Ethnic groups in England are at increased risk of death from COVID-19: indirect standardisation of NHS mortality data
Ethnicity and Outcomes of COVID-19 Patients in England
COVID-19 in people with diabetes: understanding the reasons for worse outcomes
Coronavirus Infections and Type 2 Diabetes-Shared Pathways with Therapeutic Implications
Associations of type 1 and type 2 diabetes with COVID-19-related mortality in England: a whole-population study
COVID-19 in People with Diabetes: Urgently Needed Lessons from Early Reports
Biological sex impacts COVID-19 outcomes
Phenotypic characteristics and prognosis of inpatients with COVID-19 and diabetes: the CORONADO study
High Prevalence of Obesity in Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) Requiring Invasive Mechanical Ventilation
COVID-19 and diabetes: a co-conspiracy? The Lancet Diabetes & Endocrinology
COVID-19 sex-disaggregated data tracker - Global Health 50/50
Utilization of machine-learning models to accurately predict the risk for critical COVID-19
Using Machine Learning to Predict ICU Transfer in Hospitalized COVID-19 Patients
An interpretable mortality prediction model for COVID-19 patients
Prediction of respiratory decompensation in Covid-19 patients using machine learning: The READY trial
Early triage of critically ill COVID-19 patients using deep learning
Clinical features of COVID-19 mortality: development and validation of a clinical prediction model
Prediction models for diagnosis and prognosis of covid-19: Systematic review and critical appraisal
Decreased in-hospital mortality in patients with COVID-19 pneumonia