key: cord-0831092-yrdd6u2c
authors: Altini, Nicola; Brunetti, Antonio; Mazzoleni, Stefano; Moncelli, Fabrizio; Zagaria, Ilenia; Prencipe, Berardino; Lorusso, Erika; Buonamico, Enrico; Carpagnano, Giovanna Elisiana; Bavaro, Davide Fiore; Poliseno, Mariacristina; Saracino, Annalisa; Schirinzi, Annalisa; Laterza, Riccardo; Di Serio, Francesca; D’Introno, Alessia; Pesce, Francesco; Bevilacqua, Vitoantonio
title: Predictive Machine Learning Models and Survival Analysis for COVID-19 Prognosis Based on Hematochemical Parameters
date: 2021-12-20
journal: Sensors (Basel)
DOI: 10.3390/s21248503
sha: 779775e9262bb11adc676b666332f96ec1b89bdf
doc_id: 831092
cord_uid: yrdd6u2c

The coronavirus disease 2019 (COVID-19) pandemic has affected hundreds of millions of individuals and caused millions of deaths worldwide. Predicting the clinical course of the disease is of pivotal importance to manage patients. Several studies have found hematochemical alterations in COVID-19 patients, such as altered inflammatory markers. We retrospectively analyzed the anamnestic data and laboratory parameters of 303 patients diagnosed with COVID-19 who were admitted to the Polyclinic Hospital of Bari during the first phase of the COVID-19 global pandemic. After the pre-processing phase, we performed a survival analysis with Kaplan–Meier curves and Cox Regression, with the aim of discovering the most unfavorable predictors. The target outcomes were mortality or admission to the intensive care unit (ICU). Different machine learning models were also compared to build a robust classifier relying on a low number of strongly significant factors to estimate the risk of death or admission to ICU. From the survival analysis, it emerged that the most significant laboratory parameter for both outcomes was C-reactive protein min; [Formula: see text] (95% CI 6.548–49.277, p < 0.001) for death, [Formula: see text] (95% CI 1.000–3.200, p = 0.050) for admission to ICU. The second most important parameter was Erythrocytes max; [Formula: see text] (95% CI 1.141–2.729, p < 0.05) for death, [Formula: see text] (95% CI 0.895–2.452, p = 0.127) for admission to ICU. The best model for predicting the risk of death was the decision tree, which resulted in ROC-AUC of 89.66%, whereas the best model for predicting the admission to ICU was support vector machine, which had ROC-AUC of 95.07%. The hematochemical predictors identified in this study can be utilized as a strong prognostic signature to characterize the severity of the disease in COVID-19 patients.

In December 2019, in Wuhan, province of Hubei (China), several local health facilities reported cases of pneumonia of unknown origin, which were later identified as the first human cases of COVID-19 [1, 2]. The SARS-CoV-2 virus pandemic has caused more than 5,000,000 deaths and a total of over 250,000,000 confirmed cases, globally, as of November 2021 [3, 4]. Most patients have mild, self-limiting respiratory infections, with symptoms such as fever, headache, dry cough, fatigue, and muscle pain, but some may rapidly develop fatal complications, including acute respiratory distress syndrome (ARDS) or respiratory failure, multiple organ dysfunction, and septic shock, which require hospitalization and could lead to the death of the patient [1, 5].

This pandemic has put a strain on all global health systems and represents a formidable opportunity to highlight the value of laboratory medicine and to focus on new methods to support and speed up the identification of patients with higher risks of progression to severe stages of the disease.

Accurate prediction of COVID-19 mortality and the identification of factors related to the severity of the disease would allow for targeted strategies in those patients with higher risk of death or developing severe disease; thus, reducing the burden of unnecessary hospitalizations and the health system overload [6] .

A better (and clearer) understanding of predictive factors for COVID-19 is crucial for the development of clinical decision support systems that can accurately and rapidly detect the patients with increased risk of worsening conditions [7] .

Towards this aim, we retrospectively analyzed data from a cohort of 303 patients with reverse transcription-polymerase chain reaction (RT-PCR) confirmed COVID-19, hospitalized at Polyclinic Hospital of Bari, during the first phase of the COVID-19 global pandemic from 14 March to 10 September 2020. Statistical methods and survival analysis, together with the development of machine learning classifiers, were carried out on these data, with the purpose of identifying hematochemical parameters that better reflect and contribute to the risk assessment.

The paper is structured as follows. Section 2 summarizes the relevant literature on the predictive models for COVID-19. Section 3 describes the details of the data collection process, the patient cohort, and the analysis framework. Section 4 details the methods exploited for carrying out the analysis, and explains the feature selection process and the development of machine learning (ML) classifiers for the risk assessment, considering both the death and the admission to the intensive care unit (ICU) as target outcomes. As for the admission to the ICU, we included patients who were admitted at the start to the ICU or were transferred to the ICU from the other COVID Units. In Section 5, we present and discuss the obtained results. Lastly, in Section 6, we summarize the findings of this research.

Different authors considered the task of performing statistical analysis or developing ML models to predict the severity of COVID-19 disease [8-18]. Tjendra et al. [12] performed a meta-analysis summarizing 72 papers on the predictive role of different biomarkers in COVID-19 patients. According to them, white blood cells, lymphocyte and platelet counts, C-reactive protein (CRP), ferritin, and interleukin-6 were found to be potential prognostic markers of evolution of the disease to a severe form.

Yoshida et al. [8] discovered sex disparities in clinical and biological parameters of severe outcomes in 776 adults with COVID-19, hospitalized in a U.S. healthcare system. The data from the cohort were acquired in New Orleans, LA, between 27 February and 15 July 2020.

Nachtigall et al. [9] retrospectively analyzed 1904 patients admitted to a national network of hospitals in Germany. The authors considered demographic data, comorbidities, and clinical outcomes, and revealed that the most important risk factors for death were older age, precedent lung disease, and male sex.

Banoei et al. [10] performed a multivariate predictive analysis on a subset of 108 out of 250 features, encompassing comorbidities, blood markers, and clinical features. The features considered were those captured at the admission time from a cohort of 250 hospitalized patients with COVID-19. The strongest mortality predictors were diabetes, coronary artery disease, altered mental status, dementia and age greater than 65 years. Among the biochemical markers, the most relevant were CRP, lactate, and prothrombin.

Zuccaro et al. [11] considered a cohort of 426 consecutive hospitalized patients from a hospital in Lombardy, Italy, in the period 12 February-30 March 2020. They concluded that male sex, older age, hospital admission after 4 March, and number of comorbidities were independent risk factors related to in-hospital mortality.

Zhou et al. [13] retrospectively analyzed 116 patients admitted to Chongqing Public Health Medical Center, China, in the period 24 January-7 February, 2020, with a diagnosis of mild or moderate COVID-19. According to the authors, three factors were found to be independent predictors of progression to severe disease, during two weeks after admission: high value of creatine kinase, low value of CD4+ T-cell count, and age higher than 65 years.

Niu et al. [14] included a cohort of 150 patients diagnosed with COVID-19 from Huanggang Central Hospital in the period 23 January-5 March, 2020. By exploiting univariate and multivariate logistic regression, the authors explored which were the most relevant risk factors associated with in-hospital death. This analysis allowed concluding that diabetes, high value of lactate dehydrogenase on admission, and higher sequential organ failure assessment score increased the odds of in-hospital death. A summary of the related works is available in Table 1 .

Deep learning (DL) approaches are becoming more relevant in the biomedical and health domains, and literature already exists for what concerns the COVID-19 pandemic [19] . Even though most of the literature focuses on tasks, such as medical image analysis, biomedical signal processing, and natural language processing, which are domains different from ours, there is a recent trend in exploiting DL models for irregularly sampled time series (ISTS) data. Sun et al. performed a review of the DL methods for addressing the issues arising from ISTS data [20] . They also consider a COVID-19 dataset, coming from the work of Yan et al. [21] , for which they discover that, for mortality prediction, T-LSTM [22] and GRU-D [23] are the top performing models. With respect to DL approaches, the statistical and machine learning framework developed in this paper more easily allows one to interpret the results, also from a clinical significance point of view.

Most of the available works in the literature considered demographic data, comorbidities, and blood markers. In this work, our purpose was to realize a predictive model based on hematochemical parameters. Unlike previous works, such as Banoei et al. [10], which considered blood markers at admission time, we included time series data for hematochemical factors, allowing the construction of a more reliable predictive model. Niu et al. [14] considered the evolution of parameters over time, but based their conclusions on a cohort smaller than ours, composed of only 150 patients. As predictive models, they mainly considered univariate and multivariate logistic regression, whereas we compared a wide variety of methods: decision tree (DT), random forest (RF), Gaussian naive Bayes (GNB), support vector machines (SVM), K-nearest neighbors (KNN), and adaptive boosting. Finally, other authors, such as Nachtigall et al. [9], did not consider blood parameters in their analyses. Therefore, our paper can be considered a contribution over the existing literature, especially because we performed, in a cohort of 303 patients, statistical and survival analyses and a systematic comparison of predictive models over time series of hematochemical parameters.

The demographic and anamnestic data were collected by clinicians and specialists from four different COVID-Units of the Polyclinic Hospital of Bari (Apulia, Southern Italy): Intensive Care Unit (41 patients), Infectious Disease Unit (224 patients), Pneumology Unit (122 patients), and Internal Medicine Unit (324 patients). In total, data of 434 patients were collected. Laboratory tests were performed by specialists from the Clinic Pathology Unit of the aforementioned Hospital, providing data of 367 patients. The intersection among demographic, clinical, and laboratory data resulted in a dataset of 303 patients.

Specifically, demographic data included variables such as age and sex; the clinical characteristics examined were date of hospitalization, date of transfer to ICU, date of discharge from all COVID units including the ICU, date of death, and days of hospitalization; as for laboratory tests, a total of 69 hematochemical parameters were analyzed. The full list of hematochemical parameters considered for the study is available in supplementary materials.

The target outcomes were in-hospital death and admission to ICU. Events were considered to have occurred only if they happened within the follow-up period.

A workflow of the process followed for carrying out this study, from the data collection to results, is depicted in Figure 1 . 

Overall, 303 patients with COVID-19 were enrolled in the study, of which 184 (60.7%) were male and 119 (39.3%) were female.

The following data are reported as mean ± standard deviation. The age of the study cohort was 64.2 ± 17.7 years (range 19-99 years). The hospitalization time was 22.3 ± 17.1 days (range 0-126 days) and the ICU staying time was 3.7 ± 10.5 days (range 0-94 days).

During the time of hospitalization, 218/303 (71.9%) patients were discharged alive, 85/303 (28.1%) died before discharge, and 74/303 (24.4%) were admitted to the ICU. Among the ICU patients, 49/74 (66.2%) died and 25/74 (33.8%) survived.

Of the 184 male patients, 54 (29.3%) died, 130 (70.7%) were discharged alive, and 53 (28.8%) were admitted to the ICU, whereas of the 119 female patients, 31 (26.1%) died, 88 (73.9%) were discharged alive, and 21 (17.6%) needed admission to the ICU (Table 2).

The mean age of the dead patients was 74.08 ± 13.15 years, whereas the mean age of the survived patients was 60.36 ± 17.81 years.

In the following, four age classes were considered: under 55 years old, between 55 and 65 years old, between 65 and 80 years old and over 80 years old.

As shown in Table 2 , the highest mortality rate was observed in the two oldest age groups (65-80 years and over 80 years), whereas the highest rate of admission or transfer to the ICU was found among patients between 65 and 80 years of age. Patients younger than 55 years and older than 80 years were less likely to be admitted to the ICU. 

The analysis performed in this study was carried out in the Python 3 programming language. The frameworks exploited included Pandas (for data handling), Scikit-Learn (for training and validating machine learning algorithms), SciPy (to perform the statistical analysis), Seaborn and Matplotlib (to visualize the data).

The data collected from the different units were merged into a single dataset, which we used for the remainder of the study. The obtained dataset contained both (a) demographic and clinical data and (b) hematochemical parameters of the patient cohort. Since time series data were available for many of the laboratory tests examined, and these make it possible to follow the progression of the clinical state over time, five features were extracted from each series: minimum, maximum, mean, first, and last values [24].
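As an illustration of this step, the five summary features can be derived from a long-format table of laboratory results with a Pandas group-by. This is a minimal sketch on made-up values; the column names (`patient_id`, `date`, `crp`) and the data are assumptions, not the study's actual schema.

```python
import pandas as pd

# Hypothetical long-format laboratory table: one row per patient per test date.
labs = pd.DataFrame({
    "patient_id": [1, 2, 1, 1, 2],
    "date": pd.to_datetime(
        ["2020-03-20", "2020-04-01", "2020-03-15", "2020-03-25", "2020-04-05"]
    ),
    "crp": [30.5, 3.2, 12.0, 8.1, 4.0],
})

# Sort chronologically so that "first" and "last" refer to the earliest and
# latest measurement of each patient.
labs = labs.sort_values(["patient_id", "date"])

# Five features per parameter: min, max, mean, first, last.
features = labs.groupby("patient_id")["crp"].agg(["min", "max", "mean", "first", "last"])
features.columns = [f"crp_{name}" for name in features.columns]
print(features)
```

Repeating the aggregation for each of the 69 parameters yields 345 columns, which, together with age and sex, matches the 347 predictors reported later in the text.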

Outlier removal was performed by discarding, for each parameter, the most extreme values on both the upper and lower sides of the distribution, using a 0.25% percentile cutoff and retaining the remaining values.

For the machine learning predictive models, in order to handle missing values, imputation with the KNNImputer algorithm was performed. It exploits the Euclidean distance to find the nearest neighbors and imputes the missing values with the uniformly averaged values from the specified number of neighbors [25] .

Lastly, the data were rescaled into the range [0, 1]. This process is useful for features that are not normally distributed and preserves zero entries in sparse data.

According to the literature, the application of these algorithms should lead to an increase of the machine learning classifiers performance [26] .
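The pre-processing chain described above (outlier removal, KNN imputation, rescaling to [0, 1]) could be sketched as follows. This is only an illustration on synthetic data: the exact percentile cutoffs (0.25% on each tail here), the number of neighbors (the Scikit-Learn default of 5), and the column names are assumptions, since the text does not fix them.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.preprocessing import MinMaxScaler

# Synthetic, skewed "laboratory" features with a few missing entries.
rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.lognormal(size=(200, 4)),
    columns=["crp_min", "crp_mean", "ast_min", "bilirubin_min"],  # hypothetical names
)
X.iloc[::17, 1] = np.nan

# Outlier removal (assumed cutoffs): values outside the 0.25th-99.75th
# percentile range of each column are set to NaN.
lower, upper = X.quantile(0.0025), X.quantile(0.9975)
X_masked = X.where((X >= lower) & (X <= upper))

# KNN imputation: NaN-aware Euclidean distance, uniform average of neighbors.
imputer = KNNImputer(n_neighbors=5, weights="uniform")
X_imputed = imputer.fit_transform(X_masked)

# Rescale every feature to the [0, 1] range.
X_scaled = MinMaxScaler(feature_range=(0, 1)).fit_transform(X_imputed)
```

In a predictive setting, one would typically fit the imputer and the scaler on the training portion only and then apply them to the test portion, to avoid information leaking from the test set.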

The variables of interest were divided into quantitative variables, i.e., continuous variables that contain numerical values, such as age, and the minimum, average, maximum, first and last values of each hematochemical parameter examined, and qualitative variables, i.e., variables describing the patient's status, such as sex, death, or admission to the ICU.

Descriptive statistics. For categorical variables, absolute and relative frequencies were considered. For continuous variables, the mean, median (second quartile), first quartile, third quartile, and interquartile range were extracted.

Inferential statistics. Inferential statistics was carried out using the Chi-squared test for the categorical variables and the Mann-Whitney U test for the continuous variables. For both kinds of tests, the significance threshold was set to 0.05. Even though some debate exists about thresholds for p-value [27], 0.05 is the historical and the most widely adopted threshold for testing statistical significance. In order to make our work comparable with the majority of existing literature, we decided to adopt the same threshold.
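Both tests are available in SciPy. The sketch below applies the Chi-squared test to the sex-by-outcome counts reported for this cohort (the "no event" counts are obtained by subtraction from the group totals), and the Mann-Whitney U test to synthetic age samples drawn for illustration only; the synthetic draws are an assumption and do not reproduce the patient-level data.

```python
import numpy as np
from scipy.stats import chi2_contingency, mannwhitneyu

# Rows: male, female. Columns: event, no event.
death_by_sex = np.array([[54, 130], [31, 88]])
icu_by_sex = np.array([[53, 131], [21, 98]])

# chi2_contingency applies Yates' continuity correction to 2x2 tables by default.
chi2_death, p_death, dof_death, _ = chi2_contingency(death_by_sex)
chi2_icu, p_icu, dof_icu, _ = chi2_contingency(icu_by_sex)

# Mann-Whitney U test on a continuous variable (synthetic ages, illustration only).
rng = np.random.default_rng(0)
age_deceased = rng.normal(74.08, 13.15, size=85)
age_survived = rng.normal(60.36, 17.81, size=218)
u_stat, p_age = mannwhitneyu(age_deceased, age_survived, alternative="two-sided")

alpha = 0.05
print(f"death vs sex: p = {p_death:.3f}, significant = {p_death < alpha}")
print(f"ICU vs sex:   p = {p_icu:.3f}, significant = {p_icu < alpha}")
print(f"age vs death: p = {p_age:.3g}, significant = {p_age < alpha}")
```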

Survival analysis corresponds to a set of statistical methodologies used to model and analyze temporal data, in order to investigate the time required for the occurrence of the event under study.

In this study, the Kaplan-Meier method has been exploited for categorical variables (i.e., age classes and sex) to estimate the survival time and generate survival curves, which were obtained by plotting the survival probabilities in relation to the hospitalization days for both outcomes, i.e., in-hospital mortality and admission to ICU [28] .
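The product-limit estimate behind a Kaplan-Meier curve can be written in a few lines of NumPy. The following is an illustrative implementation on made-up durations, not the code used in the study; the function name `kaplan_meier` and the toy data are assumptions.

```python
import numpy as np

def kaplan_meier(durations, events):
    """Return event times and the Kaplan-Meier survival estimate at those times.

    durations: follow-up time of each subject (e.g., hospitalization days).
    events: 1 if the event (e.g., death) was observed, 0 if censored.
    """
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(durations[events])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)               # subjects still under observation
        n_events = np.sum((durations == t) & events)   # events occurring exactly at t
        s *= 1.0 - n_events / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Toy example: five subjects, two of them censored (event = 0).
times, surv = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
print(times, surv)
```

Computing the estimate separately for each group (for instance, each sex or age class) and plotting the resulting step functions gives curves of the kind described in the text.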

Instead, Cox regression was applied for the blood parameters, considering the laboratory normality ranges. It is a powerful technique to study the impact of several risk factors on patients' survival at the same time.

In Cox regression, the dependent variable is the incidence rate of a given event considered as the number of events per person in the time between the entry into the study and the date of the last observation [29] . The events under consideration were death and admission to the ICU.
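For reference, the Cox proportional hazards model in its standard textbook form is given below. The notation is generic and is not taken from the original text: h_0(t) is the baseline hazard, x_1, ..., x_m are the covariates (here, indicators of a parameter being outside its normality range), and the hazard ratio (HR) of the j-th covariate is the exponential of its coefficient.

```latex
h(t \mid x_1, \dots, x_m) = h_0(t)\,\exp\!\left(\sum_{j=1}^{m} \beta_j x_j\right),
\qquad
\mathrm{HR}_j = e^{\beta_j}
```

An HR greater than 1 therefore indicates that the covariate is associated with an increased rate of the event, and an HR lower than 1 with a decreased rate.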

The feature selection process consists of choosing a subset of relevant features in order to use machine learning methods effectively, speeding up the algorithms, increasing the prediction accuracy and the comprehensibility of the data [30] .

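For the feature selection step, coefficients resulting from a multivariate logistic regression applied to the two different outcomes were exploited [31].

Considering the logistic regression in Equation (1):

$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k \qquad (1)$$

where p is the probability of the outcome, x_1, ..., x_k are the predictors, β_0 is the intercept, β_1, ..., β_k are the regression coefficients, and k is the number of predictors. The features are preserved only if their respective coefficients meet the criteria in Equation (2):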

where |β i | is the absolute value of the i-th coefficient β i , mean([β 1 , . . . , β k ]) is the mean of the coefficients and std([β 1 , . . . , β k ]) is the standard deviation of the coefficients. In this way, only the features mostly related to the patient's outcome have been retained.
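A coefficient-based selection of this kind could be sketched with Scikit-Learn as follows. The threshold used here, the mean of the coefficients plus one standard deviation, is an assumption made for illustration from the quantities named in the text; the helper name `select_by_coefficient` and the synthetic data are also hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler

def select_by_coefficient(beta):
    """Indices i with |beta_i| >= mean(beta) + std(beta) (assumed threshold form)."""
    beta = np.asarray(beta, dtype=float)
    threshold = beta.mean() + beta.std()
    return np.flatnonzero(np.abs(beta) >= threshold), threshold

# Deterministic toy check: only the two large coefficients are retained.
toy_selected, toy_threshold = select_by_coefficient([0.1, -0.2, 3.0, 0.05, -2.5])

# Same rule applied to a multivariate logistic regression on synthetic data.
X, y = make_classification(n_samples=303, n_features=40, n_informative=5, random_state=0)
X = MinMaxScaler().fit_transform(X)
model = LogisticRegression(max_iter=1000).fit(X, y)
selected, threshold = select_by_coefficient(model.coef_.ravel())
print(f"threshold = {threshold:.3f}, retained features: {selected.tolist()}")
```

Running the selection once per outcome and intersecting the two index sets mirrors the way a single feature subset is obtained for both death and ICU admission.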

Splitting of the data. After the pre-processing stage, the dataset consisted of 303 patients and 347 predictors, composed of the five features for each of the 69 hematochemical parameters plus age and sex information. In order to reduce the number of features, a selection was carried out as described in Section 5.2, resulting in a subset of only six predictors. This dataset was divided into two subsets, using an 80/20 split, resulting in a training set composed of 242 patients, and a test set composed of 61 patients.
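With Scikit-Learn, an 80/20 hold-out split of a 303-sample dataset yields exactly these sizes, since the test size is rounded up. The sketch below uses random placeholder features; the random seed and the absence of stratification are assumptions, as the text does not specify them.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 303 patients, 6 selected predictors, 85 positive outcomes.
rng = np.random.default_rng(0)
X = rng.random((303, 6))
y = np.zeros(303, dtype=int)
y[:85] = 1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape)
```

Passing `stratify=y` would additionally keep the proportion of events similar in the two subsets.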

Predictive models. In order to analyze the predictive capacity of the selected variables, it was decided to compare different machine learning models. The following six classifiers have been considered:

1. Decision tree [32, 33];
2. Random forest;
3. Gaussian naive Bayes;
4. Support vector machine;
5. K-nearest neighbors;
6. Adaptive boosting or AdaBoost [39, 40].

Models evaluation and settings. In order to evaluate the models during the hyperparameters exploration, the exhaustive grid search with k-fold cross-validation has been implemented [41] . Final models have been assessed on the hold-out test set. As shown by the literature, this method is used also to improve the classification accuracy [42] . Details about the tuning of hyperparameters with grid search are provided in Appendix A.

The k-fold cross validation has been implemented directly in the grid search and has the advantage of providing a precise estimation of the accuracy of the model and using more data to validate the model [43] .

In order to assess the performances of the different models, receiver operating characteristic (ROC) curves and confusion matrices have been exploited.
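The tuning and evaluation procedure can be illustrated with Scikit-Learn's `GridSearchCV`. The sketch below tunes a decision tree over the grid reported in Appendix A with 10-fold cross-validation, then computes the ROC-AUC and the confusion matrix on a hold-out test set. The data are synthetic and the random seeds are assumptions, so the printed numbers are not comparable to the results of the study.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the six selected predictors.
X, y = make_classification(
    n_samples=303, n_features=6, n_informative=4, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Decision tree grid as described in Appendix A.
param_grid = {
    "max_depth": list(range(1, 9)),
    "criterion": ["gini", "entropy"],
    "splitter": ["best", "random"],
}

# Exhaustive grid search with 10-fold cross-validation, optimizing accuracy.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=10, scoring="accuracy"
)
search.fit(X_train, y_train)

# Final assessment on the hold-out test set.
best_model = search.best_estimator_
test_scores = best_model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, test_scores)
cm = confusion_matrix(y_test, best_model.predict(X_test))
print(search.best_params_, f"ROC-AUC = {auc:.3f}")
print(cm)
```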

Statistically significant differences in the risk of death, as well as in the risk of admission to the ICU, were found among the age groups (p-value < 0.001). Mortality risk was similar for male and female subjects (p-value 0.622), whereas statistically significant differences were observed in the risk of admission to the ICU (p-value 0.032), with men more likely to be admitted to the ICU than women. These results are reported in Table 1 .

The Kaplan-Meier survival curves showed a similar survival pattern for males and females (Figure 2A,B) . Instead, as shown in Figure 2C ,D, divergences in mortality were observed between the younger and the older age groups.

The results of the feature selection process are shown in Tables 3 and 4, together with the logistic regression coefficients, indicated in the column "Logit coeff". Only features that satisfied Equation (2) have been reported, i.e., features with coefficients higher than the thresholds 2.772 and 3.911, respectively, for mortality and admission to the ICU. From this analysis, 32 features were found to be significant for mortality and 28 features for admission to the ICU.

In order to extract a unique feature subset, only the features that were found to be significant for both outcomes were retained. They were Ionized calcium max, CRP mean, CRP min, Total bilirubin min, Erythrocyte max, Aspartate aminotransferase (AST) min.

The subset obtained was analyzed using the Mann-Whitney U test to check the statistical significance of each feature; among the six features, three were statistically significant for both outcomes, with a p-value < 0.05: CRP mean, CRP min, Total bilirubin min.

We also investigated whether the considered feature sets, both the starting one with all the features and the one with the selected prognostic signature, were discriminative in an embedding scatter plot at reduced dimensionality, exploiting principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) [44] techniques. Two plots were made: one for survived and deceased patients (Figure 3) and one for patients who were or were not transferred to the ICU (Figure 4). Violin plots that depict the distribution differences between the conditions, both for death and admission to the ICU, are reported in Figures 5 and 6.

Table 5 shows the results of the Cox regression analysis used to estimate the relationship between the risk predictive factors, i.e., all the six significant hematochemical features examined, and the mortality rate or the rate of admission to ICU.

Regarding mortality risk, an HR higher than 1 was found for all the six features, meaning that patients who had values of the features outside the normality range are at increased risk of mortality. Nonetheless, only the features CRP min and erythrocytes max were statistically significant, with p < 0.001 and p < 0.05, respectively. It has to be noted that, when we first performed the Cox regression analysis, the HR for CRP mean was 3.11 × 10^6, with a 95% CI for log(HR) spanning from −5020 to 5050, because this feature was overrange in almost every hospitalized patient and in 100% of the deceased patients. In fact, the associated p-value was 0.995, meaning that its HR was not statistically significant. Therefore, we repeated the Cox regression analysis without this parameter, before reporting the results in Table 5.

Regarding the admission to ICU, an HR greater than 1 was observed for all the features, except for Ionized calcium max. However, in this case, no feature reached statistical significance at the 0.05 level. The most important predictor was CRP min, with HR = 1.789 (95% CI 1.000-3.200, p = 0.050).
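The relationship between a hazard ratio, its confidence interval, and its p-value can be checked with a short calculation. The sketch below assumes a Wald-type interval, i.e., symmetric around log(HR) on the logarithmic scale, which is the usual output of Cox regression software but is not stated explicitly in the text; under that assumption, the values reported for CRP min and ICU admission are mutually consistent.

```python
import math
from scipy.stats import norm

# Values reported for CRP min, admission to ICU.
hr, ci_low, ci_high = 1.789, 1.000, 3.200

# Under a Wald interval, log(HR) +/- z * SE gives the CI bounds on the log scale.
z_crit = norm.ppf(0.975)  # about 1.96 for a 95% interval
se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)

# The point estimate should sit at the geometric mean of the two bounds.
hr_from_ci = math.sqrt(ci_low * ci_high)

# Two-sided Wald test of log(HR) = 0.
z = math.log(hr) / se
p_value = 2 * norm.sf(abs(z))

print(f"HR implied by the CI: {hr_from_ci:.3f}")
print(f"SE of log(HR): {se:.3f}, z = {z:.3f}, p = {p_value:.3f}")
```

A lower bound of exactly 1.000 corresponds to a p-value at the 0.05 threshold, which is why this predictor sits at the border of statistical significance.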

Thus, CRP min can be considered the most important risk factor for both outcomes. A medical discussion about these features is provided in Section 5.4.

The hazard ratio with the 95% confidence interval for all features is plotted in the logarithmic scale in Figure 7 . 

Regarding the predictive models, only the hematochemical parameters have been considered. According to the feature selection stage, only hematochemical tests that were significant for both outcomes were retained. They were Ionized calcium max, CRP mean, CRP min, Erythrocytes max, Total bilirubin min, and AST min.

Machine learning algorithms considered for realizing the predictive models were decision tree, random forest, Gaussian naive Bayes, support vector machines, K-nearest neighbors and AdaBoost, using the exhaustive grid search cross validation to obtain the highest possible accuracy. The performances of the different models are displayed in Figures 8 and 9 .

The decision tree was found to be the model with the highest ROC-AUC for the mortality prediction task, whereas SVM was the best model for predicting admission to ICU. Figures 10 and 11 depict the ROC curves, showing the performances on both the train set and the test set for the best models.


These results permit identifying a subset of features that can be used to predict the worsening state of COVID-19.

In the cohort under study, we observed that the patients who died or who were admitted to ICU presented alterations in the values of some hematochemical tests, which we identify as the most predictive factors.

Particularly, we found that the CRP min was overrange in 96.4% (41.5%) of the deceased (alive) patients and 76.7% (50.4%) of the patients admitted (not admitted) to the ICU, resulting in the main predictor for mortality risk and, although not statistically significant, for the risk of admission to the ICU. These data are in accordance with the literature, which suggests that the CRP is strongly associated with mortality in patients with COVID-19 [35, 45, 46]. On the other hand, it is well known that CRP is a marker for systemic inflammation already associated with severe disease in bacterial or viral infections.

It has been reported that, compared to moderate cases, severe COVID-19 cases had lower red blood cell counts and hemoglobin levels [47]. It has also been stated that COVID-19 is associated with red blood cell (RBC) damage and that the virus negatively affects the process of RBC formation, thus being responsible for multiple organ damage [48]. Indeed, the statistical analysis showed that, in the study cohort, the percentage of patients with under range values of erythrocytes max was 45.2% (23.9%) in deceased (alive) patients and 41.1% (26.2%) in patients admitted (not admitted) to the ICU [49]. However, the feature was shown to be statistically significant only for mortality risk.

In our cohort, we also observed that dead patients and patients admitted to the ICU had higher Total bilirubin min value compared, respectively, to the survived and patients not admitted to the ICU. Thus, the hyper-bilirubin level can also be exploited as a predictor of worsening conditions in COVID-19 patients. Accordingly, a pooled analysis reported that patients with severe COVID-19 display higher bilirubin levels compared to those with milder forms [50] . An elevated bilirubin level is regarded as a vital marker of altered liver function, indicating a likely liver injury due to the infection [51] . However, hyper-bilirubin levels may be also due to erythrocyte damage and an increased hemolysis rate.

As to the AST min value, it was found to be statistically significantly higher in deceased subjects compared to those who were discharged alive. In fact, the extracted min feature was over range in 35.7% of deceased and 8.8% of survived patients, respectively. As with elevated bilirubin levels, increased AST values may indicate liver injury due to the SARS-CoV-2 infection and a poorer outcome [52, 53].

Finally, the last feature extracted was Ionized calcium max, which we found to be under range in a high percentage of patients with COVID-19, irrespective of the severity of the disease. No significant differences were in fact observed between deceased and surviving patients. A retrospective case-control study by Pal et al. analyzing 72 patients with non-severe COVID-19 and an equal number of healthy controls reported that hypocalcemia was highly prevalent, even in COVID-19 patients with non-severe disease. They suggest that hypocalcemia may be intrinsic to the disease per se [54]. Cappellini et al. also found a decrease in whole blood ionized calcium levels in COVID-19 versus non-COVID-19 subjects, with the difference being statistically significant [55]. Thus, the lower serum calcium levels observed may be due to a direct viral action on the regulation of the normal ion homeostasis, as shown for other viruses.

The limitations of the present study are mainly: (a) the acquired cohort comes from a single hospital; therefore, the generalization capability of the developed models on other cohorts needs to be assessed; (b) only features extracted from time series data of the blood parameters were considered, not the raw data.

Artificial Intelligence can play a pivotal role in processing and analyzing patient data for efficient diagnosis and prognosis. In this paper, we retrospectively analyzed a cohort of hospitalized patients with confirmed diagnoses of COVID, with the purpose of recognizing and evaluating a set of hematochemical parameters, which can be strong predictors of the disease severity, considering, as outcomes, the mortality rate and the rate of admission to ICU.

Starting from the data collection of 303 patients and 347 extracted features, considering five features for each of the 69 hematochemical parameters, in addition to age and sex information, through statistical feature selection techniques, the subset of predictors was reduced to only six features for both target outcomes. They were the Ionized calcium max, CRP mean, CRP min, Total bilirubin min, Erythrocyte max, AST min. We showed that modifications in the value of the six selected predictors are often present in the most severe cases of the disease that are at high risk of deterioration [35, 45, 46, 52, 53, 55-60], with CRP min being the main predictor. The best predictive model was the decision tree for the mortality prediction task, with ROC-AUC of 89.66%, and the SVM for the ICU admission prediction, with ROC-AUC of 95.07%, confirming the possibility of utilizing these models for both outcome predictions.

In conclusion, the developed models can aid in the realization of a clinical decision support system, which can assist clinicians in the assessment of COVID-19 severity, increasing the precision, accuracy, and speed of the prediction.

Due to the reliability and accuracy of the developed models, it will be possible to carry out a better risk stratification of hospitalized COVID-19 patients, helping to reduce severe cases of the disease and deaths.

Future works include the validation of these models on further groups of patients, which would allow a better understanding of the value of the identified predictors. Furthermore, DL models, such as recurrent neural networks (RNNs) [61] or long short-term memory (LSTM) [62], which are architectures designed for modeling temporal sequences, can be exploited to obtain higher accuracy, although at the cost of results that are more difficult to interpret [63].

Informed Consent Statement: Patient consent was waived because this was a retrospective observational study with anonymized data, already acquired for medical diagnostic purposes.

The data presented in this study are available upon request from the corresponding author.

The authors declare no conflict of interest.


In regard to the architecture of the ML classifiers, the exhaustive grid search technique has been implemented on a defined subset of the hyperparameters, each in a specific range of possible values, in order to limit the searching time of the k-fold cross-validation procedure (k = 10). The aim of this phase was to optimize the accuracy, resulting in a subset of optimal hyperparameters for each classifier. Then, the classifiers were validated on the hold-out test set, as described in Section 5.3. The setting of hyperparameters has been performed twice, for the death outcome and the admission to ICU outcome.

The hyperparameters tuned for the DT were: the maximum depth of the tree (max_depth), optimized in a range from 1 to 8; the criterion used to measure the quality of a split (criterion), which could be either gini or entropy; and the strategy used to choose the split at each node (splitter), which could be either random or best.
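The tuning procedure described above can be sketched with scikit-learn, here using the decision tree ranges as an example. This is only an illustrative sketch, not the authors' code: the synthetic data stand in for the six selected predictors, and the 80/20 hold-out split and the random seeds are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the six selected predictors (NOT the study data).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           random_state=0)

# Hold-out split; the 80/20 proportion is an assumption of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Decision tree grid with the ranges reported in the text.
param_grid = {
    "max_depth": list(range(1, 9)),      # 1 to 8
    "criterion": ["gini", "entropy"],
    "splitter": ["best", "random"],
}

# Exhaustive grid search, 10-fold cross-validation, optimizing accuracy.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid,
                      scoring="accuracy", cv=10)
search.fit(X_train, y_train)

best_params = search.best_params_
# The refitted best estimator is then evaluated once on the hold-out set.
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

In a setting like the one described, the same search would be run separately for each outcome (death and ICU admission), each with its own label vector.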

The only hyperparameter tuned for the GNB classifier was the portion of the largest variance of all features that is added to the variances for calculation stability (var_smoothing), which was optimized in a range between $10^{-10}$ and 10, with 10 steps.
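As a hedged illustration of this one-dimensional search, the candidate values can be generated as below. The text does not state how the 10 steps were spaced; logarithmic spacing is an assumption made here because the range spans eleven orders of magnitude. The data are synthetic placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

# Ten candidate values from 1e-10 to 10 (log spacing is an assumption).
var_smoothing_grid = np.logspace(-10, 1, num=10)

# Synthetic placeholder data with six features (NOT the study data).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

gnb_search = GridSearchCV(GaussianNB(), {"var_smoothing": var_smoothing_grid},
                          scoring="accuracy", cv=10)
gnb_search.fit(X, y)
best_var_smoothing = gnb_search.best_params_["var_smoothing"]
print(best_var_smoothing)
```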

The hyperparameters tuned for the SVM were: the regularization parameter (C), chosen from the set {1, 10, 100, 1000}; the kernel type used in the algorithm (kernel), chosen from the set {linear, poly, rbf, sigmoid}; and the kernel coefficient (gamma), which could be either $10^{-3}$ or $10^{-4}$ (this parameter is set only when the kernel is not linear).
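One way to express a grid in which gamma is searched only for non-linear kernels is to pass scikit-learn a list of sub-grids. The sketch below is an assumption about how such a search could be encoded, not a record of the authors' implementation; it only builds the grid and counts the candidates.

```python
from sklearn.model_selection import GridSearchCV, ParameterGrid
from sklearn.svm import SVC

C_values = [1, 10, 100, 1000]

# Two sub-grids: gamma is only varied when the kernel is not linear.
svm_param_grid = [
    {"kernel": ["linear"], "C": C_values},
    {"kernel": ["poly", "rbf", "sigmoid"], "C": C_values,
     "gamma": [1e-3, 1e-4]},
]

# 4 linear candidates + 3 kernels * 4 C values * 2 gammas = 28 candidates.
n_candidates = len(ParameterGrid(svm_param_grid))
print(n_candidates)

# The search object would then be fitted on the training data (not done here).
svm_search = GridSearchCV(SVC(), svm_param_grid, scoring="accuracy", cv=10)
```

Splitting the grid this way avoids refitting the linear kernel once per gamma value, since gamma has no effect on it.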

The hyperparameters tuned for the KNN were: the number of neighbors to use (n_neighbors), in the range from 1 to 10; and the distance metric used by the tree (metric), chosen from the set {euclidean, manhattan, chebyshev}.

The hyperparameters tuned for the RF were: the criterion, as for the DT; the max_depth, tuned in the range from 1 to 10; and the dichotomous bootstrap value, which decides whether each tree is built on the whole dataset or only on a bootstrap sample.

The hyperparameters tuned for the AdaBoost were: the weight applied to each classifier at each iteration (learning_rate), in a range from $10^{-4}$ to 1; and the maximum number of estimators used (n_estimators), in a range from 10 to 100.
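For the KNN, RF, and AdaBoost classifiers, the search spaces described above could be collected as in the following sketch. The step sizes for the AdaBoost ranges are not given in the text; the powers of ten for learning_rate and the steps of 10 for n_estimators are assumptions, as is the `search_spaces` name itself.

```python
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import ParameterGrid
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical container mapping each classifier to its search space.
search_spaces = {
    "KNN": (KNeighborsClassifier(), {
        "n_neighbors": list(range(1, 11)),                 # 1 to 10
        "metric": ["euclidean", "manhattan", "chebyshev"],
    }),
    "RF": (RandomForestClassifier(random_state=0), {
        "criterion": ["gini", "entropy"],
        "max_depth": list(range(1, 11)),                   # 1 to 10
        "bootstrap": [True, False],
    }),
    "AdaBoost": (AdaBoostClassifier(random_state=0), {
        "learning_rate": [1e-4, 1e-3, 1e-2, 1e-1, 1.0],    # assumed steps
        "n_estimators": list(range(10, 101, 10)),          # assumed steps
    }),
}

# Check that every grid key is a real parameter of its estimator,
# then count the candidate configurations per classifier.
for name, (estimator, grid) in search_spaces.items():
    assert set(grid) <= set(estimator.get_params()), name

grid_sizes = {name: len(ParameterGrid(grid))
              for name, (_, grid) in search_spaces.items()}
print(grid_sizes)
```

Each (estimator, grid) pair can then be passed to a 10-fold grid search, once per outcome.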

From our experiments, the optimal configuration of hyperparameters for each classifier is as listed below. 

COVID-19 outbreak: An overview

Risk factors for severity and mortality in adult COVID-19 inpatients in Wuhan

World Health Organization. COVID-19 Weekly Epidemiological Update; World Health Organization

Development of a prognostic model for mortality in COVID-19 infection using machine learning

From SARS to COVID-19: What lessons have we learned?

Predictors of mortality for patients with COVID-19 pneumonia caused by SARS-CoV-2: A prospective cohort study

Clinical features of COVID-19 mortality: Development and validation of a clinical prediction model

Clinical characteristics and outcomes in women and men hospitalized for coronavirus disease

Clinical course and factors associated with outcomes among 1904 patients hospitalized with COVID-19 in Germany: An observational study

Machine-learning-based COVID-19 mortality prediction model and identification of patients at low and high risk of dying

Competing-risk analysis of coronavirus disease 2019 in-hospital mortality in a Northern Italian centre from SMAtteo COvid19 REgistry (SMACORE)

Predicting disease severity and outcome in COVID-19 patients: A review of multiple biomarkers

Predictive factors of progression to severe COVID-19

Development of a predictive model for mortality in hospitalized patients with COVID-19

Lung Segmentation and Characterization in COVID-19 Patients for Assessing Pulmonary Thromboembolism: An Approach Based on Deep Learning and Radiomics

Automated Triage System for Intensive Care Admissions during the COVID-19 Pandemic Using Hybrid XGBoost-AHP Approach

COVID-19 Case Recognition from Chest CT Images by Deep Learning, Entropy-Controlled Firefly Optimization, and Parallel Feature Fusion

Vital Signs Prediction for COVID-19

Deep Learning applications for COVID-19

A review of deep learning methods for irregularly sampled medical time series data

An interpretable mortality prediction model for COVID-19 patients

Patient subtyping via time-aware LSTM networks

Recurrent neural networks for multivariate time series with missing values

Patient specific predictions in the intensive care unit using a Bayesian ensemble

Missing value estimation methods for DNA microarrays

Scikit-learn: Machine learning in Python

Statistical significance: P value, 0.05 threshold, and applications to radiomics-Reasons for a conservative approach

The analysis of survival data: The Kaplan-Meier method

The analysis of survival data in nephrology: Basic concepts and methods of Cox regression

Feature selection: A literature review

Logistic regression for dependent binary observations

Deep learning-based decision-tree classifier for COVID-19 diagnosis from chest X-ray imaging

Covid Symptom Severity Using Decision Tree

COVID-19 patient health prediction using boosted random forest algorithm. Front. Public Health

C-reactive protein levels in the early stage of COVID-19

Naive Bayes classifier for predicting the factors that influence death due to covid-19 in China

A novel approach to predict COVID-19 using support vector machine. In Data Science for COVID-19

Prediction of COVID-19 Possibilities using K-Nearest Neighbour Classification Algorithm

Prediction and Feature Importance Analysis for Severity of COVID-19 in South Korea Using Artificial Intelligence: Model Development and Validation

Machine-learning approaches in COVID-19 survival analysis and discharge-time likelihood prediction using clinical data

Grid search, random search, genetic algorithm: A big comparison for NAS

Application of improved grid search algorithm on SVM for classification of tumor gene

Complete Cross-Validation for Nearest Neighbor Classifiers

Visualizing data using t-SNE

C-reactive protein predicts outcome in COVID-19: Is it also a therapeutic target?

The role of C-reactive protein as a prognostic marker in COVID-19

Anemia and iron metabolism in COVID-19: A systematic review and meta-analysis

Erythrocytes as a target of sars cov-2 in pathogenesis of COVID-19

Silent hypoxia: Higher NO in red blood cells of COVID-19 patients

Bilirubin levels in patients with mild and severe Covid-19: A pooled analysis

Bilirubin levels as potential indicators of disease severity in coronavirus disease patients: A retrospective cohort study

Letter to the Editor: COVID-19-Related Liver Injury: The Interpretation for Aspartate Aminotransferase Needs to Be Cautious

Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in

High prevalence of hypocalcemia in non-severe COVID-19 patients: A retrospective case-control study

Low levels of total and ionized calcium in blood of COVID-19 patients

Serum calcium as a biomarker of clinical severity and prognosis in patients with coronavirus disease

Serum Calcium and Vitamin D levels: Correlation with severity of COVID-19 in hospitalized patients in Royal Hospital

On the use of lymphocyte to neutrophil ratios in laboratory medicine

Predictive value of the neutrophil to lymphocyte ratio for disease deterioration and serious adverse outcomes in patients with COVID-19: A prospective cohort study

A learning algorithm for continually running fully recurrent neural networks

Long short-term memory

Predicting COVID-19 disease progression and patient outcomes based on temporal deep learning