key: cord-0833255-g5zam27j authors: Venturini, Sergio; Orso, Daniele; Cugini, Francesco; Crapis, Massimo; Fossati, Sara; Callegari, Astrid; Pellis, Tommaso; Tonizzo, Maurizio; Grembiale, Alessandro; Rosso, Alessia; Tamburrini, Mario; D'Andrea, Natascia; Vetrugno, Luigi; Bove, Tiziana title: Classification and analysis of outcome predictors in non‐critically ill COVID‐19 patients date: 2021-04-09 journal: Intern Med J DOI: 10.1111/imj.15140 sha: 48ec6d592ffb251b884e1251fac01ee3296f1fbd doc_id: 833255 cord_uid: g5zam27j BACKGROUND: Early detection of severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2)‐infected patients who could develop a severe form of COVID‐19 is of great importance for providing adequate care and optimising the use of limited resources. AIMS: To use several machine learning classification models to analyse a series of non‐critically ill COVID‐19 patients admitted to a general medicine ward and to verify whether any of the recorded clinical variables could predict the clinical outcome. METHODS: We retrospectively analysed non‐critically ill patients with COVID‐19 admitted to the general ward of the hospital in Pordenone from 1 March 2020 to 30 April 2020. Patients' characteristics were compared based on clinical outcomes. Through several machine learning classification models, some predictors for clinical outcome were detected. RESULTS: In the considered period, we analysed 176 consecutive patients admitted: 119 (67.6%) were discharged, 35 (19.9%) died and 22 (12.5%) were transferred to the intensive care unit. The most accurate models were a random forest model (M2) and a conditional inference tree model (M5) (accuracy = 0.79; 95% confidence interval 0.64–0.90, for both). For M2, glomerular filtration rate and creatinine were the most accurate predictors for the outcome, followed by age and fraction of inspired oxygen. 
For M5, serum sodium, body temperature and the ratio of arterial pressure of oxygen to inspiratory fraction of oxygen were the most reliable predictors. CONCLUSIONS: In non‐critically ill COVID‐19 patients admitted to a medical ward, glomerular filtration rate, creatinine and serum sodium were promising predictors for the clinical outcome. Some factors not determined by COVID‐19, such as age or dementia, influence clinical outcomes. The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes COVID-19, emerged as a public health problem in China in late 2019, stemming from a zoonotic source, and was declared a pandemic in March 2020. 1 COVID-19's incubation period is believed to be up to 14 days. The main presenting features of COVID-19 are fever, cough and dyspnoea. 2, 3 However, a complete picture of the clinical course and clinical presentation of COVID-19 has not yet been described. 4 Indeed, the clinical spectrum of COVID-19 is varied and ranges from very mild to critical cases. Many studies have focussed on general characteristics presented at the beginning of the disease and tried to identify major risk factors related to mortality, such as advanced age, cardiovascular disease, diabetes, chronic respiratory disease, hypertension and cancer. [5] [6] [7] In other studies, obesity and smoking were associated with increased risk. 8, 9 Early identification of patients who could develop a particularly severe form of COVID-19 is of great importance to provide adequate care and optimise resources. 10 The identification of some predictors could allow us to detect prognostically unfavourable developments. Such predictors must be measurable at any time, rapidly obtainable and sustainable even in contexts with limited resources. These findings could improve awareness of COVID-19 and, in the near term, improve the identification of patients and the assessment of their prognostic risk. 
They could also improve understanding of the clinical evolution of COVID-19 during hospitalisation and the classification of prognosis. We analysed a series of non-critically ill COVID-19 patients admitted to a medical department, using different machine learning classification models, to verify whether any of the recorded clinical variables could predict clinical outcome. We retrospectively analysed the data of non-critically ill COVID-19 patients admitted to the general medicine department of the hospital in Pordenone from 1 March 2020 to 30 April 2020. We included patients with COVID-19 who were admitted from the emergency department and were positive for SARS-CoV-2 by real-time polymerase chain reaction on a nasopharyngeal swab. Exclusion criteria were pregnancy, trauma and age <18 years. We also excluded patients transferred from other wards or hospitals. The study followed international and national regulations in accordance with the Declaration of Helsinki. 
The following clinical data were collected: age, gender, weight, height, body mass index, length of stay, the delay between onset of symptoms and hospitalisation ('Onset'), admitting medical ward, clinical presentation (fever, cough, shortness of breath, myalgia, diarrhoea, gastrointestinal complaints), clinical history (smoking, arterial hypertension, use of angiotensin-converting enzyme (ACE) inhibitor drugs, coronary artery disease, diabetes mellitus, obesity, atrial fibrillation, neoplasm, rheumatic diseases, dementia, respiratory disease, liver failure, metabolic syndrome and the number of comorbidities), vitals (systolic arterial pressure, diastolic arterial pressure, heart rate, body temperature), blood gas analysis (pH, the arterial partial pressure of oxygen and carbon dioxide), the inspiratory fraction of oxygen administered, the ratio of arterial pressure of oxygen to inspiratory fraction of oxygen (PaO2/FiO2), arterial oxygen saturation, blood chemistry tests at hospitalisation (white blood cell count, neutrophils, lymphocytes, red blood cell count, haemoglobin, platelets, C-reactive protein, procalcitonin, creatinine and glomerular filtration rate (GFR) calculated via the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation, plasma sodium and potassium), liver function (transaminases, gamma-glutamyl transferase, lactate dehydrogenase, bilirubin), coagulation system (prothrombin and activated partial thromboplastin times, fibrinogen, D-dimer), interleukin-6, prediction scores (the Padua venous thromboembolism (VTE) score, the International Society of Thrombosis and Hemostasis score for Disseminated Intravascular Coagulation and the Sequential Organ Failure Assessment (SOFA) score), administered therapy (hydroxychloroquine, azithromycin, lopinavir/ritonavir (Kaletra), darunavir/cobicistat (Rezolsta) and low-molecular-weight heparin), and clinical outcomes (discharge, death and transfer to intensive care unit 
(ICU)). All recorded characteristics were compared after dividing patients according to their clinical outcome. Continuous variables with a parametric distribution were compared using the Student t test, and those with a nonparametric distribution using the Kruskal-Wallis test. The categorical variables were assessed using the Chi-squared test (or Fisher exact test, if appropriate). All the variables of the data set were entered into five types of regression tree model. Missing values were imputed based on Gibbs sampling. The machine learning models were evaluated with a resampling procedure based on five-fold cross-validation. A two-tailed P-value of ≤0.05 was considered statistically significant. A correction for multiplicity by Benjamini and Hochberg's method was applied when appropriate. The models were compared with each other based on their accuracy, sensitivity and specificity. All statistical analyses were performed using the open-source R-CRAN software (version 4.0.0; R Foundation for Statistical Computing, Vienna, Austria). The main packages used were 'compareGroups', 'randomForest', 'mice', 'rpart', 'party' and 'caret'. During the study period considered, 176 non-critically ill COVID-19 patients were hospitalised (Fig. 1). The clinical characteristics of the population are shown in Table 1. The median age was 75.0 years (95% confidence interval 72.0-77.0). Male gender was slightly predominant (55.7%). Seventy-three (41.5%) patients were hospitalised in a ward reserved for COVID-19 patients, while the rest were hospitalised in a general medicine ward (103 patients, corresponding to 58.5% of the population). The most common presenting symptoms were fever (78.4%), shortness of breath (52.8%) and cough (48.3%). The most common comorbidity was hypertension (53.4%), and 48.3% were on an ACE inhibitor drug at admission. Chronic kidney disease was present in 66 (37.5%) patients. 
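The model comparison described in the statistical methods rests on accuracy, sensitivity and specificity for a three-class outcome. As a minimal sketch of how such figures are commonly derived (one outcome against the other two), the Python below builds a confusion table and computes the three metrics. The study itself used R; the function names and the toy labels here are hypothetical and are not study data.

```python
from typing import Dict, Sequence

# The three clinical outcomes considered in the study.
OUTCOMES = ("discharged", "died", "icu")

Counts = Dict[str, Dict[str, int]]


def confusion_counts(actual: Sequence[str], predicted: Sequence[str]) -> Counts:
    """counts[a][p] = number of patients with actual outcome a predicted as p."""
    counts = {a: {p: 0 for p in OUTCOMES} for a in OUTCOMES}
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    return counts


def accuracy(counts: Counts) -> float:
    """Share of patients whose predicted outcome equals the actual one."""
    total = sum(sum(row.values()) for row in counts.values())
    correct = sum(counts[c][c] for c in OUTCOMES)
    return correct / total


def sensitivity(counts: Counts, cls: str) -> float:
    """True positives / all patients who actually had outcome cls."""
    true_pos = counts[cls][cls]
    false_neg = sum(counts[cls][p] for p in OUTCOMES if p != cls)
    return true_pos / (true_pos + false_neg)


def specificity(counts: Counts, cls: str) -> float:
    """True negatives / all patients who actually had a different outcome."""
    false_pos = sum(counts[a][cls] for a in OUTCOMES if a != cls)
    true_neg = sum(
        counts[a][p] for a in OUTCOMES for p in OUTCOMES if a != cls and p != cls
    )
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Invented labels for ten hypothetical patients (illustration only).
    actual = ["discharged"] * 6 + ["died"] * 2 + ["icu"] * 2
    predicted = ["discharged"] * 6 + ["died", "discharged", "discharged", "icu"]
    table = confusion_counts(actual, predicted)
    print("accuracy:", accuracy(table))
    for outcome in OUTCOMES:
        print(outcome, sensitivity(table, outcome), specificity(table, outcome))
```

In this toy example, a model that defaults to "discharged" scores high sensitivity for discharge and high specificity for death and ICU transfer, while sensitivity for those two outcomes stays low. Overall accuracy alone would hide that imbalance.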
Sixty-three percent of patients were given hydroxychloroquine during hospitalisation. Most were discharged because they recovered (119 patients, corresponding to 67.6%), 35 (19.9%) patients died and 22 (12.5%) patients evolved into critical illness and were admitted to the ICU. When the population was subdivided by outcome (Table 1), the variables that differed significantly between groups were: age (86 vs 72 years for deceased patients compared to recovered patients, P < 0.001); the time elapsed between the onset of symptoms and hospitalisation (3 days for deceased patients compared to 7 days for recovered patients, P < 0.001); the hospitalisation ward (predominantly the COVID-19 ward for deceased patients, P = 0.024); a clinical history of coronary artery disease (with a low prevalence among recovered patients, P = 0.05); the prevalence of some form of dementia (only one patient with dementia among those admitted to the ICU, P < 0.001); the number of concomitant pathologies (at least three for deceased patients compared to two for patients who recovered or were admitted to the ICU, P < 0.001); body temperature, which was higher for ICU patients and lower for deceased patients (37.9 °C vs 36.8 °C, respectively, P < 0.001); and the ratio between arterial pressure of oxygen (PaO2) and inspiratory fraction of oxygen (FiO2), which was lower for deceased patients (236 mmHg for deceased patients, 292 mmHg for patients admitted to the ICU and 314 mmHg for recovered patients, P < 0.001). Regarding blood chemistry tests: lymphocyte count was lower in deceased patients and higher in recovered patients (0.78 vs 1.19 × 10³, respectively, P = 0.001); haemoglobin and platelet count were lower in deceased patients (11.6 g/dL and 156 000/mL, respectively, P = 0.010 and 0.002); procalcitonin and C-reactive protein were higher in the deceased patients (P < 0.001 for both). 
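Because many variables are compared across the three outcome groups, the multiplicity correction mentioned in the statistical methods affects how a list of P-values like the one above is read. As a minimal sketch, the Benjamini and Hochberg adjustment can be written in a few lines of Python. The function name `benjamini_hochberg` and the example P-values are hypothetical; the study used R, and its raw P-values are not reproduced here.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the original order.

    Each P-value is scaled by m / rank (rank 1 = smallest), then a running
    minimum is taken from the largest P-value downwards so that the adjusted
    values never decrease as the raw P-values increase.
    """
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented P-values for four hypothetical comparisons.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw P = {p:.3f}  adjusted P = {q:.3f}")
```

A comparison is then called significant when its adjusted value is at or below the chosen threshold (0.05 in this study), which controls the expected proportion of false discoveries rather than the chance of any single false positive.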
The creatinine level was higher in the deceased patients, and, conversely, GFR was lower in these patients (108 μg/L and 45.9 mL/min/1.73 m², P < 0.001 for both). Hydroxychloroquine, azithromycin and darunavir/cobicistat were mainly administered to patients who were subsequently transferred to the ICU (P < 0.001 for hydroxychloroquine and darunavir/cobicistat; P = 0.019 for azithromycin). Of the five proposed regression tree models, the two with the greatest accuracy were M2 and M5 (accuracy = 79%, for both) (Table 2). The worst-performing model was M4 (accuracy = 65%) (Fig. 2). All models showed good sensitivity in predicting discharge and good specificity in predicting death or the need for transfer to the ICU, but poor sensitivity for both of the latter outcomes. For M5, the three most useful predictors were serum sodium level, body temperature and the PaO2/FiO2 ratio (Fig. 3). For M2, the most accurate predictors were renal function, age and, depending on whether accuracy or the Gini coefficient was used, FiO2 or serum sodium (Fig. 4). By analysing our population of COVID-19 patients admitted to a medical department, we have derived and internally validated a series of predictive models with varying degrees of accuracy. The most accurate models (the random forest model and the conditional inference tree model) substantially agree in defining the predictive factors for clinical worsening. These predictive factors are related to kidney function (GFR and creatinine) or serum sodium.
Classic classification decision tree ('rpart' package) (M4). A creatinine greater or less than 1.2 mg/dL (106.1 μmol/L), a white blood cell count greater or less than 11 000/mL and an age greater than 79 years are the main determinants of the patient's clinical outcome. This model's accuracy is 65% (95% confidence interval 49-59%) (see the main text for details). ICU, intensive care unit; WBC, white blood cells. 
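The creatinine threshold in the caption above is quoted in both mg/dL and μmol/L, and GFR in this study was estimated from creatinine with the CKD-EPI equation. The text does not say which version of the equation was applied, so the sketch below assumes the 2009 creatinine-based form; the function names and the example patient are hypothetical and not taken from the study data.

```python
def creatinine_mg_dl_to_umol_l(creatinine_mg_dl: float) -> float:
    """Convert serum creatinine from mg/dL to micromol/L (factor 88.4)."""
    return creatinine_mg_dl * 88.4


def ckd_epi_2009(
    creatinine_mg_dl: float,
    age_years: float,
    female: bool,
    black: bool = False,
) -> float:
    """Estimated GFR in mL/min/1.73 m^2, CKD-EPI 2009 creatinine equation.

    Assumption: the 2009 version is used here only for illustration; the
    article does not specify which CKD-EPI version was applied.
    """
    kappa = 0.7 if female else 0.9          # sex-specific creatinine scale
    alpha = -0.329 if female else -0.411    # exponent below the scale value
    ratio = creatinine_mg_dl / kappa
    egfr = (
        141.0
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.209
        * 0.993 ** age_years
    )
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


if __name__ == "__main__":
    # Hypothetical patient: 86-year-old man with creatinine at the 1.2 mg/dL
    # threshold named in the decision-tree caption.
    print(round(creatinine_mg_dl_to_umol_l(1.2), 1), "micromol/L")
    print(round(ckd_epi_2009(1.2, 86, female=False), 1), "mL/min/1.73 m^2")
```

Because age enters the equation as a multiplicative decay term, the same creatinine value corresponds to a much lower estimated GFR in an elderly patient. That is one reason creatinine and GFR can carry different information in the tree-based models.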
Recursive partitioning regression tree model using permutation tests to rank predictor variables ('party' package) (M5). The plasma sodium level (i.e. whether greater or less than 144 mEq/L) represents the first predictive node. The following nodes are body temperature (i.e. whether higher or lower than 37.7 °C), hospitalisation in the ward dedicated to COVID-19 patients only, and a ratio of arterial pressure of oxygen to inspiratory fraction of oxygen (PaO2/FiO2) higher or lower than 112. The probability percentages for (from left to right) death, discharge or transfer to intensive care (ICU) are reported at the bottom of the decision tree. The accuracy of this model is 79% (95% confidence interval 64-90%). For further details, see the main text.
Internal Medicine Journal 51 (2021) 506-514
Compared to the literature on COVID-19, our population has a comparable mean age and male predominance. Fever and cough were the most represented symptoms, as already described in the literature. 11, 12 Furthermore, as previously reported, coronary heart disease was associated with a worse clinical outcome. 13 The prevalence of dementia has not been reported in the literature, and our population has shown a clear prevalence among the most severely ill patients. However, dementia is not a strong predictor in any of the derived predictive models. It is therefore an association that should be considered, especially in the context of non-critically ill patients. These patients are likely to require a greater workload than patients without dementia, and their COVID-19 infection may be less obvious to diagnose. 14, 15 Furthermore, when the population was analysed by outcome, patients with more severe disease had a more pronounced inflammatory profile: in our population, higher fever, C-reactive protein and procalcitonin (despite a lower lymphocyte count). 
Our population appears to be comparable to the population studied in the Wuhan district: patients with more severe forms of COVID-19 had altered coagulation status and lower lymphocyte counts. 16 Similarly, a study on Detroit's population also showed that lymphocyte counts (along with coagulation changes) are a factor related to a worse clinical course. 17 Several authors have shown that alterations in the platelet aggregation/coagulation system are present in the most severe forms of COVID-19. 18, 19 This finding was confirmed in our population. Patients who required a higher level of care (or who died) had lower platelet counts and higher D-dimer levels than patients with a good clinical course. 20 Due to our study's observational nature, we are unable to establish whether this finding is a simple correlation, an epiphenomenon of a general pro-inflammatory state, or a causal pathological mechanism responsible for death (i.e. evidence of disseminated intravascular coagulation). The present study is also important in highlighting how some factors not directly related, at least presumptively, to the pathological process caused by COVID-19 influence the outcome of an unselected population of COVID-19 patients. Indeed, age and general clinical condition appear to play a role in determining patients' clinical course. Older patients with various comorbidities (e.g. dementia) are frail and therefore more prone to a negative outcome, regardless of the severity of the disease process that led to hospitalisation. This finding is even more important if, as has been done, we intend to compare heterogeneous populations of COVID-19 patients to determine the ultimate mechanisms by which SARS-CoV-2 can cause death. A recent retrospective study conducted on a population from the Wuhan district of China showed that age (along with comorbidities and renal function) is an independent prognostic factor of mortality. 
Our population does not seem to differ much from the Chinese population, at least in this respect. 21 As for the proposed classification models, the derived models are generally accurate in predicting dichotomous outcomes (i.e. survival or death). However, they are less accurate for more complex problems such as the need to increase the intensity of care. This inaccuracy arises because factors not strictly related to clinical presentation contribute to this decision, such as patient and family member expectations, ethical reasoning, quality of life and local logistical considerations. 22 The main determinants of patient outcomes were renal function and serum sodium. It is not possible to establish to what extent COVID-19 causes acute renal failure or renal injury. However, renal replacement therapy may play a decisive role in the outcome of these patients. This aspect of COVID-19 is probably under-emphasised in the literature, which focuses more on respiratory failure. 23, 24 Respiratory failure is a leading cause of death in COVID-19 patients and implies a burden of care in terms of applied technologies (ventilators) and clinical skills (experts in ICU medicine). 25 However, it is also true that most patients with COVID-19 likely do not require complex respiratory care but instead need comprehensive management of multiorgan function. In the international Health Outcome Predictive Evaluation (HOPE) registry, fewer than 10% of patients had chronic renal failure, but 30% of patients admitted for COVID-19 had a reduction in GFR on admission. These patients with previously unrecognised or de novo renal failure had higher mortality than patients with normal renal function. 26 The authors proposed the hypothesis of direct nephropathic damage by SARS-CoV-2 and immune-mediated damage induced by the cytokine cascade. 
26 As for the inpatient ward, patients initially admitted to the COVID-19 ward more often presented with respiratory symptoms. Patients admitted to the COVID-19 ward with respiratory failure more frequently had a poor prognosis. However, patients with good respiratory performance had discharge rates similar to those of patients admitted to the non-COVID-19 ward. It is also significant that patients admitted to the non-COVID-19 ward were moved to the ICU more frequently than patients admitted to the COVID-19 ward (16 vs 6). Worsening respiratory failure was probably only part of the reason for transfer to the ICU. Equally relevant were other organ failures, such as kidney failure. Finally, concerning the therapies administered, more drugs were administered to patients who recovered or were transferred to the ICU than to deceased patients. Given our study's observational nature, it is impossible to establish a causal relationship for this association. Randomised clinical trials should be implemented to address this issue. Ours is a retrospective observational study that evaluated the correlation between clinical variables and the outcomes of a series of non-critically ill COVID-19 patients. For this reason, although we have been able to establish the correlation between some factors and the clinical outcome, through some applications of machine learning classification models, we have not determined the causal link between these predictors and the outcome. Furthermore, as with all predictive models, the result depends on the variables initially included in the data set. We cannot rule out that some variables that we have not considered may have a stronger correlation with some of the clinical outcomes. In non-critically ill COVID-19 patients admitted to a medical ward, GFR, creatinine and serum sodium are promising predictors of clinical outcome. Some factors not determined by COVID-19, such as age or dementia, are associated with the clinical outcome. 
Some machine learning classification methods, such as a random forest model and a conditional inference tree model, seem robust enough to derive a predictive model. Similar studies on similar populations are needed to validate these derived models externally. Data are available following a reasoned request.
1. WHO announces COVID-19 outbreak a pandemic
2. A novel coronavirus from patients with pneumonia in China
3. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China
4. WHO international health regulations (IHR) vs COVID-19 uncertainty
5. Clinical features of 85 fatal cases of COVID-19 from Wuhan. A retrospective observational study
6. Clinical characteristics and outcomes of hospitalised patients with COVID-19 treated in Hubei (epicentre) and outside Hubei (non-epicentre): a nationwide analysis of China
7. Clinical characteristics of 113 deceased patients with coronavirus disease 2019: retrospective study
8. Relationship between obesity, diabetes and ICU admission in COVID-19 patients
9. Tobacco smoking and COVID-19 pandemic: old and new issues. A summary of the evidence from the scientific literature
10. Economic analysis of healthcare-associated infection prevention and control interventions in medical and surgical units: systematic review using a discounting approach
11. The newly emerged COVID-19 disease: a systemic review
12. The COVID-19 pandemic: a comprehensive review of taxonomy, genetics, epidemiology, diagnosis, treatment, and control
13. A review of cardiac manifestations and predictors of outcome in patients with COVID-19
14. Neurological diseases as mortality predictive factors for patients with COVID-19: a retrospective cohort study
15. Dementia care in the time of COVID-19 pandemic
16. Clinical course and outcome of 107 patients infected with the novel coronavirus, SARS-CoV-2, discharged from two hospitals in Wuhan
17. Clinical characteristics and morbidity associated with coronavirus disease 2019 in a series of patients in metropolitan Detroit
18. Association of Padua prediction score with in-hospital prognosis in COVID-19 patients
19. Systematic review of the prognostic utility of D-dimer, disseminated intravascular coagulation, and anticoagulant therapy in COVID-19 critically ill patients
20. D-dimer triage for COVID-19
21. Clinical characteristics and outcomes of older patients with coronavirus disease 2019 (COVID-19) in Wuhan, China: a single-centered, retrospective study
22. Resource allocation in ICU: ethical considerations
23. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study
24. Clinical characteristics of non-critically ill patients with novel coronavirus infection (COVID-19) in a Fangcang Hospital
25. Intensive care during the coronavirus epidemic
26. Impact of renal function on admission in COVID-19 patients: an analysis of the international HOPE COVID-19 (Health Outcome Predictive Evaluation for COVID 19) registry