key: cord-1003731-x3zxq1oi
authors: Kim, Hyung-Jun; Heo, JoonNyung; Han, Deokjae; Oh, Hong Sang
title: Validation of Machine Learning Models to Predict Adverse Outcomes in Patients with COVID-19: A Prospective Pilot Study
date: 2022-04-20
journal: Yonsei Med J
DOI: 10.3349/ymj.2022.63.5.422
sha: 620f24c6b493a37640d34ed54275f1082e4b7809
doc_id: 1003731
cord_uid: x3zxq1oi

PURPOSE: We previously developed machine learning models for predicting the need for intensive care and oxygen among patients with coronavirus disease (COVID-19). Here, we aimed to prospectively validate the accuracy of these models. MATERIALS AND METHODS: Probabilities of the need for intensive care [intensive care unit (ICU) score] and oxygen (oxygen score) were calculated from information provided by hospitalized COVID-19 patients (n=44) via a web-based application. The performance of baseline scores to predict 30-day outcomes was assessed. RESULTS: Among 44 patients, 5 and 15 patients needed intensive care and oxygen, respectively. The areas under the curve of the ICU score and oxygen score to predict 30-day outcomes were 0.774 [95% confidence interval (CI): 0.614–0.934] and 0.728 (95% CI: 0.559–0.898), respectively. The ICU scores of patients needing intensive care increased daily by 0.71 points (95% CI: 0.20–1.22) after hospitalization and by 0.85 points (95% CI: 0.36–1.35) after symptom onset, which were significantly different from those in individuals not needing intensive care (p=0.002 and <0.001, respectively). Trends in daily oxygen scores overall were not markedly different; however, when the scores were evaluated within <7 days after symptom onset, the patients needing oxygen showed a higher daily increase in oxygen scores [1.81 (95% CI: 0.48 to 3.14) vs. -0.28 (95% CI: -1.00 to 0.43), p=0.007]. CONCLUSION: Our machine learning models showed good performance for predicting the outcomes of COVID-19 patients and could thus be useful for patient triage and monitoring.

After initial reports in late December 2019, coronavirus disease (COVID-19) has become a worldwide pandemic. 1 Among several modalities used to treat the disease, 2-7 only a few have proven to be effective. 2, 3 Supportive care and respiratory support are now considered the mainstays of treatment for COVID-19. 8 As such, medical professionals must focus on triaging patients to the appropriate level of care to reduce the risk of medical supply shortages. 9 Numerous efforts have been made to establish risk factors for deterioration among patients with COVID-19. 10-13 However, most of these require laboratory or radiographic findings, which can be time consuming and costly to obtain. Therefore, we developed an easy-to-use machine learning model to predict the risk of needing intensive care among COVID-19 patients using easily obtainable patient information (e.g., demographics, comorbidities, subjective symptoms, and body temperature). 14 A second model was developed using the need for oxygen supplementation as the outcome. The models were integrated into a web-based application developed during the early phase of the pandemic. 15, 16 However, the original models were developed from and validated in a retrospective cohort. 14 In addition, the information was uploaded by attending physicians and not by the patients themselves. Because our easy-to-use machine learning models were designed to be used directly by patients, the validity of the models when information is uploaded directly by patients must be investigated.

Thus, in this study, we aimed to prospectively evaluate the validity of the models for predicting the need for intensive care or oxygen supplementation among patients with COVID-19.

In this prospective observational study, we screened all adult (age≥18 years) patients with COVID-19 confirmed by polymerase chain reaction who were admitted to the Armed Forces Capital Hospital Trauma Center, Seongnam, South Korea from September 19, 2020 through November 19, 2020. We enrolled only those who volunteered to participate. The Armed Forces Capital Hospital Trauma Center is a 60-bed hospital constructed on March 5, 2020 to treat trauma patients. Because of a sudden upsurge in COVID-19 patients, a part of the center was converted into a 40-bed COVID-19 care unit. The unit is capable of providing general supportive care, including oxygen supplementation. Intensive care, such as mechanical ventilation, vasopressor use, and extracorporeal life support, is not possible in this unit, and patients needing intensive care are transferred to other hospitals.

This study was approved by the Institutional Review Board of the Armed Forces Capital Hospital (approval number: AFCH-20-IRB-037) and was conducted in accordance with the amended Declaration of Helsinki. The need for informed consent was waived because patients volunteered to enter their information directly into the application without any invasive measurements. In addition, acquiring written informed consent was considered dangerous due to the highly transmissible nature of COVID-19. 17

Patients provided their data, including demographics, smoking history, underlying comorbidities, activities of daily living, symptoms, and body temperatures, directly to an online web-based application (Supplementary Fig. 1A, only online). 14 In addition to their baseline data, the patients were encouraged to provide their daily symptoms and body temperature if possible. Additional information, such as symptom onset, patient outcome, date of admission, and date of discharge, was collected by an attending physician. 17

This study included two prediction models: one predicting the need for intensive care [intensive care unit (ICU) score] and the other predicting the need for oxygen supplementation (oxygen score). The need for intensive care was defined as admission to the ICU, use of extracorporeal life support, mechanical ventilation, vasopressors, or death within the first 30 days of admission. 14 This accounted for patients who could not be admitted to the ICU owing to limited hospital facilities. The need for oxygen supplementation during the first 30 days was included as another outcome because it is a useful criterion for hospitalization. 18 Both models were originally derived from and validated in a separate nationwide cohort that included hospitalized patients with COVID-19 from 100 hospitals in South Korea. Patient information was uploaded to an online case report form by the attending physicians in each center, and the database was managed by the Korean Disease Control and Prevention Agency (https://icreat.nih.go.kr/). 14 We used patient characteristics that could be easily provided by patients, such as demographics, smoking history, symptoms, and body temperature, to derive a machine learning model with an AutoML method. 19 The details of the included variables are presented in Supplementary Table 1 (only online). Patients hospitalized from January 25, 2020 through March 20, 2020 were assigned to the model derivation group, and those hospitalized from March 21, 2020 through June 3, 2020 were assigned to the model validation group. Detailed descriptions of model derivation and validation for ICU score are provided in an earlier report. 14 Results of the calculated probability of the need for intensive care based on the predefined XGBoost model are presented as numbers ranging from 0 (lowest probability) to 100 (highest probability). 14 Oxygen score was derived in a similar manner using the same variables. 
The ICU score and oxygen score showed excellent discrimination performance for predicting patient outcomes in both the derivation and validation groups of the previous cohort (Supplementary Fig. 2, only online). 14 Variables with high feature importance included activities of daily living, age, dyspnea, body temperature, sex, and symptoms of dyspnea (Fig. 1). Details on these variables are presented in Supplementary Tables 2 and 3 (only online).
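The temporal split and the 0 to 100 score scale described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the assumption that a score is simply the model's predicted probability multiplied by 100 is ours; the text states only that scores range from 0 (lowest probability) to 100 (highest probability).

```python
from datetime import date
from typing import Optional

# Hospitalization windows stated in the text.
DERIVATION_START = date(2020, 1, 25)
DERIVATION_END = date(2020, 3, 20)
VALIDATION_END = date(2020, 6, 3)


def assign_group(admission: date) -> Optional[str]:
    """Assign a patient to the derivation or validation group by admission date."""
    if DERIVATION_START <= admission <= DERIVATION_END:
        return "derivation"
    if DERIVATION_END < admission <= VALIDATION_END:
        return "validation"
    return None  # outside the period covered by the original cohort


def to_score(probability: float) -> float:
    """Rescale a predicted probability in [0, 1] to a 0-100 score (assumed linear)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return 100.0 * probability
```

Splitting by calendar date rather than at random means the validation group contains only patients admitted after every patient in the derivation group, which is a stricter test of how a model carries over to later cases.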

Probabilities of the need for intensive care and oxygen supplementation according to ICU and oxygen scores were calculated automatically after data were uploaded by patients. The probabilities are presented as numbers from 0 to 100, with 0 referring to the lowest probability of requiring intensive care or oxygen supplementation in each model and 100 referring to the highest. The attending physician could inspect these scores along with details via a web-based application for physicians (Supplementary Fig. 1B, only online). We calculated the area under the receiver operating characteristic curve (AUC) with 95% CIs to assess the discrimination performance of initial scores.
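As an illustration of the discrimination measure, the sketch below computes the AUC as the probability that a randomly chosen patient with the outcome has a higher baseline score than a randomly chosen patient without it (ties count one half). The confidence interval uses the Hanley-McNeil approximation, which is an assumption on our part: the text does not state which interval method was used, so this sketch is not expected to reproduce the reported intervals. Function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def auc_mann_whitney(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc: float, n_pos: int, n_neg: int,
                     z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% CI for an AUC (Hanley-McNeil standard error)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_pos - 1) * (q1 - auc * auc)
           + (n_neg - 1) * (q2 - auc * auc)) / (n_pos * n_neg)
    se = math.sqrt(max(var, 0.0))
    return max(0.0, auc - z * se), min(1.0, auc + z * se)
```

With only 5 events among 44 patients, any interval of this kind will be wide, which is consistent with the broad CIs reported for both scores.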

Although both scores were derived to predict 30-day outcomes using baseline scores, serial data were gathered for further analyses. We utilized linear mixed-effects models to evaluate the repeated measures of each score. These models are useful for analyzing repeated measures as they can use all available data and account for repeated measurements within subjects. 20,21 Separate linear mixed-effects models were used to estimate associations between changes in ICU and oxygen scores and the presence of each outcome (need for intensive care or oxygen) with a patient-specific intercept. 20 Each model included terms for time (hospital days), outcome, and the interaction between them. Two different baseline timepoints were used for both scores: the day of hospitalization and the day of symptom onset. Scores provided on or after the day of discharge were excluded from the analyses. In addition, scores provided on or after the day of oxygen supplementation were excluded from the analysis of oxygen scores. All statistical analyses were performed using Stata version 16 (StataCorp. 2019. Stata Statistical Software: release 16. StataCorp LLC, College Station, TX, USA).
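One way to write the model described above is sketched below. The notation is ours, not the authors': $y_{ij}$ is the score of patient $i$ at measurement $j$, $t_{ij}$ is the number of days since the chosen baseline, $g_i$ is 1 if patient $i$ had the outcome and 0 otherwise, and $u_i$ is the patient-specific intercept. The normality assumptions are the conventional ones for a random-intercept model and are not stated in the text.

```latex
\begin{equation}
y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 g_i + \beta_3 \,(t_{ij} \times g_i) + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{equation}
```

Under this formulation, the average daily change is $\beta_1$ in patients without the outcome and $\beta_1 + \beta_3$ in patients with it, so a between-group comparison of daily changes corresponds to testing $\beta_3 = 0$.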

Among 82 patients hospitalized in the COVID-19 care unit during the study period, 44 patients volunteered to participate in our study. Among those patients, 5 and 15 patients needed intensive care and supplementary oxygen, respectively. All 5 patients who needed intensive care also needed supplementary oxygen. The remaining 29 patients were discharged without the need for oxygen supplementation or intensive care (Fig. 2).

The median patient age was 61 years (IQR: 53-64 years). Twenty-nine patients (65.9%) were female, and 35 (79.6%) were never smokers. The most common underlying comorbidities were hypertension (n=9, 20.5%) and diabetes mellitus (n=8, 18.2%). Almost all patients (n=43, 97.7%) were able to perform their daily activities independently. The median duration between symptom onset and hospitalization was 2 days (IQR: 1-4.5 days), and the most frequent symptoms were cough (n=26, 59.1%), sputum (n=20, 45.5%), and sore throat (n=18, 40.9%). The baseline demographics, underlying comorbidities, and symptoms did not differ according to the requirement of oxygen or intensive care. However, patients who required intensive care had higher baseline body temperatures than those who did not need oxygen or intensive care (p=0.006) (Table 1).

The median ICU score at baseline was 2.59 (IQR: 2.10-5.24). The patients needed intensive care within a median of 8 days (IQR: 7-9 days) from hospitalization and a median of 9 days (IQR: 8-11 days) from symptom onset. The AUC of ICU scores to predict the need for intensive care within 30 days was 0.774 (95% CI: 0.614-0.934) (Fig. 3A).

Baseline oxygen score was only assessed after oxygen supplementation in three patients, and they were excluded from the analyses of oxygen scores. The median oxygen score at baseline was 6.55 (IQR: 4.30-8.61) in 41 patients, 8.07 (IQR: 6.39-16.14) in 12 patients who needed oxygen supplementation, and 6.00 (IQR: 6.28-16.14) in 29 patients who did not need it (p=0.022). The patients needed oxygen supplementation within a median of 4 days (IQR: 3-6 days) from hospitalization and within a median of 7.5 days (IQR: 6-9.5 days) from symptom onset. The AUC of oxygen scores to predict the need for oxygen supplementation within 30 days was 0.728 (95% CI: 0.559-0.898) (Fig. 3B).

In total, 464 scores were calculated for both ICU and oxygen scores. Among those scores, 24 were provided on or after the day of discharge and were therefore excluded. ICU score was measured at least twice in all 44 patients, with a median of 10 measurements (IQR: 7-12.5) per patient. When the baseline score was defined as that obtained on the admission day, the 5 patients who needed intensive care showed an average daily increase of 0.71 points (95% CI: 0.20 to 1.22), while the 39 patients who did not need intensive care showed an average daily change of -0.11 points (95% CI: -0.20 to -0.02), with a significant difference between the two groups (p=0.002) (Fig. 4A). A similar significant difference was found when the baseline score was set as that obtained on the day of symptom onset [+0.85 points (95% CI: 0.36 to 1.35) vs. -0.10 points (95% CI: -0.19 to -0.01), p<0.001] (Fig. 4B).
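To make the notion of an average daily change concrete, the sketch below fits an ordinary least-squares line to one patient's serial scores and returns its slope in points per day. This is a deliberate simplification and not the mixed-effects analysis used in the study: it treats a single patient in isolation and ignores between-patient variation. The function name and the example values are hypothetical.

```python
from typing import Sequence


def daily_slope(days: Sequence[float], scores: Sequence[float]) -> float:
    """Least-squares slope of score against days since baseline (points/day)."""
    n = len(days)
    if n != len(scores) or n < 2:
        raise ValueError("need at least two paired measurements")
    mean_d = sum(days) / n
    mean_s = sum(scores) / n
    sxx = sum((d - mean_d) ** 2 for d in days)
    if sxx == 0:
        raise ValueError("all measurements share the same day")
    sxy = sum((d - mean_d) * (s - mean_s) for d, s in zip(days, scores))
    return sxy / sxx


# Hypothetical patient whose score rises steadily after admission.
example = daily_slope([0, 1, 2, 3], [2.0, 2.7, 3.4, 4.1])
```

A positive slope of this kind in an individual patient would be the per-patient analogue of the group-level daily increase reported above; the mixed model pools such trends across patients while allowing each patient their own starting level.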

With respect to oxygen score, only 353 scores from 41 patients were included, because 87 scores were obtained on or after the day of oxygen supplementation. The oxygen score was measured at least twice in all 41 patients, with a median of eight measurements (IQR: 4-11) per patient. There was no significant difference in daily changes in oxygen scores starting from the day of hospitalization between the 12 patients who required oxygen and the 29 patients who did not require oxygen (p=0.113) (Fig. 4C). The difference remained nonsignificant between the two groups of patients when the baseline score was set as that obtained on the day of symptom onset (p=0.349) (Fig. 4D). However, when only the scores calculated within less than 7 days of symptom onset were analyzed, the patients who required oxygen supplementation showed a significantly higher increase in their daily oxygen score than those who did not need oxygen supplementation [1.81 (95% CI: 0.48 to 3.14) vs. -0.28 (95% CI: -1.00 to 0.43), p=0.007].
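The exclusion rules applied to the serial scores can be sketched as simple date filters. The record structure and function names below are hypothetical and only illustrate the rules as stated in the text: scores on or after the day of discharge are dropped from all analyses, scores on or after the day oxygen was started are additionally dropped from the oxygen analysis, and the early-course analysis keeps only scores recorded less than 7 days after symptom onset.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ScoreRecord:
    patient_id: str
    recorded_on: date
    symptom_onset: date
    discharged_on: date
    oxygen_started_on: Optional[date] = None  # None if oxygen was never needed


def usable_for_icu_analysis(r: ScoreRecord) -> bool:
    """Keep scores recorded strictly before the day of discharge."""
    return r.recorded_on < r.discharged_on


def usable_for_oxygen_analysis(r: ScoreRecord) -> bool:
    """Additionally drop scores recorded on or after the day oxygen was started."""
    if not usable_for_icu_analysis(r):
        return False
    return r.oxygen_started_on is None or r.recorded_on < r.oxygen_started_on


def within_first_week(r: ScoreRecord) -> bool:
    """Keep scores recorded less than 7 days after symptom onset."""
    return (r.recorded_on - r.symptom_onset).days < 7
```

The reported counts are consistent with applying the two exclusions in sequence: 464 - 24 = 440 scores remain after removing those on or after discharge, and 440 - 87 = 353 remain for the oxygen analysis.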

In this study, we aimed to validate our previously developed machine learning models for predicting the need for intensive care and oxygen supplementation among patients with confirmed COVID-19. Initial scores predicted 30-day outcomes with good discrimination performance. In addition, we found distinct patterns of changes in daily scores according to patient outcomes. To the best of our knowledge, this study is the first to evaluate prognostic models with real-world data provided directly by patients. Further, this study is also the first to evaluate patterns of changes in daily scores in these patients.

Our models have several advantages. First, they can enable early triage with minimal resources. Early triage is crucial for achieving good outcomes in patients with COVID-19 because the time window between symptom onset and critical events is very short. 22, 23 Unlike other models, 24-28 our models do not include radiographic or laboratory findings as prediction variables. In contrast to fully equipped higher-level facilities, quarantine facilities may not have advanced medical equipment, 29 and underdeveloped areas may not have any medical facilities at all. Second, our web-based application can facilitate prognostic evaluation in a large group of patients. As the number of patients with COVID-19 increases, it is difficult to predict each patient's prognosis individually given limited resources. A lack of efficient triage can lead to higher mortality rates, particularly in areas with a sudden upsurge of COVID-19 cases. 30, 31 Given that our models are web based and calculate risk automatically with data provided directly by patients, large-scale risk calculation is possible. Third, our models can be used for telemedicine in the era of COVID-19. 32 A previous study has reported that nearly half of all patients with COVID-19 are treated on an outpatient basis. 33 Even when patients are hospitalized, a time lag exists between symptom onset and hospitalization, 34, 35 leaving the possibility of acute patient deterioration before active in-hospital management. Our models can be applied to actively monitor patients who are on home quarantine and to identify those at higher risk of deterioration who require early hospitalization.

Among the patients who required supplementary oxygen, oxygen scores increased within 7 days of symptom onset, but decreased thereafter. This can be partly explained by the general supportive care given after hospitalization.
Among the 12 patients included in the serial analysis of oxygen score, all 12 patients (100.0%) received antipyretics, and 11 patients (91.7%) received antitussives. Even among the other 29 patients who did not need supplementary oxygen, 23 patients (79.3%) received antipyretics, and 18 patients (62.1%) received antitussives. Considering that our models do not include laboratory or radiographic findings, the variables that can change daily are limited to body temperature and subjective symptoms. Such variables are prone to change according to the extent of general supportive care. However, the median interval from symptom onset to oxygen supplementation in this study was 7.5 days, similar to other previous studies. 36, 37 Therefore, a serial measurement of oxygen score can be helpful for preemptive identification of patients who may need oxygen supplementation.

The discrimination performance of our models seemed slightly lower than that calculated from a previous retrospective cohort (Supplementary Fig. 2, only online), 14 which can be attributed to the following reasons. First, there is possible selection bias in the patients included in this study. The COVID-19 unit of our center was a temporary unit capable of simple supportive care, and the number of healthcare providers was limited. Accordingly, it could not accommodate patients with severe underlying comorbidities, such as cancer, chronic lung disease, or dementia, or those dependent on others for their daily activities. Also, patients not capable of using smartphones could not participate in our study. Second, information uploaded directly by patients may differ from that recorded by healthcare providers. Patients uploaded their information directly via a web-based application in this prospective cohort (Supplementary Fig. 1A, only online), 15, 16 unlike the data collected from the derivation cohort, which were uploaded by attending physicians. 14

We can make several recommendations from this study. First, patients with higher scores should be prioritized for transfer to higher-level facilities. Although an exact score cutoff for predicting short-term outcomes has yet to be established, our models can serve as decision-support tools when active monitoring is impossible due to a shortage of medical resources or manpower. Second, patients should record their daily status for up to 7 to 10 days after symptom onset. Our study revealed a median interval of 9 days (IQR: 8-11 days) between symptom onset and the need for intensive care, and a median of 7.5 days (IQR: 6-9.5 days) between symptom onset and oxygen supplementation. These durations are similar to those in previous reports 22, 36, 37 and also match the trends in daily changes in ICU scores and oxygen scores in our study.
If scores remain low during the first 10 days, we can carefully expect a mild disease course without the need for oxygen supplementation or intensive care. In addition, day of symptom onset seems to be a better baseline timepoint than day of hospitalization for repeated measures.

Despite these advantages, our study also has some limitations. First, the attending physician was not blinded to the calculated scores during the study period. However, all patients received standard care regardless of the calculated scores. Second, because this was a pilot study, the number of patients was limited. Further studies with larger sample sizes are needed to establish the validity of the model and to determine its usefulness for reducing overall mortality rates among patients with COVID-19.

In this pilot study, our models showed fair discrimination performance for identifying patients at risk of adverse outcomes who may need intensive care and oxygen supplementation. Thus, they can be useful for predicting patient outcomes and for patient monitoring during the early disease course. Further validation studies are needed to draw a definitive conclusion.

A novel coronavirus from patients with pneumonia in China

Remdesivir for the treatment of COVID-19 -final report

Dexamethasone in hospitalized patients with COVID-19

Treatment of 5 critically ill patients with COVID-19 with convalescent plasma

Efficacy of tocilizumab in patients hospitalized with COVID-19

A trial of lopinavir-ritonavir in adults hospitalized with severe COVID-19

Effect of hydroxychloroquine in hospitalized patients with COVID-19

Pathophysiology, transmission, diagnosis, and treatment of coronavirus disease 2019 (COVID-19): a review

Critical supply shortages-the need for ventilators and personal protective equipment during the COVID-19 pandemic

Prediction models for diagnosis and prognosis of COVID-19: systematic review and critical appraisal

A predictive model for disease progression in non-severely ill patients with coronavirus disease 2019

An interpretable mortality prediction model for COVID-19 patients

Using machine learning to predict ICU transfer in hospitalized COVID-19 patients

An easy-to-use machine learning model to predict the prognosis of patients with COVID-19: retrospective cohort study

COVID-19 outcome prediction and monitoring solution for military hospitals in South Korea: development and evaluation of an application

A patient self-checkup app for COVID-19: development and usage pattern anal

The reproductive number of COVID-19 is higher compared to SARS coronavirus

Oxygen saturations less than 92% are associated with major adverse events in outpatients with pneumonia: a population-based cohort study

Automated machine learning: review of the state-of-the-art and opportunities for healthcare

Newton-Raphson and EM algorithms for linear mixed-effects models for repeated-measures data

Using the general linear mixed model to analyse unbalanced repeated measures and longitudinal data

Characterization and clinical course of 1000 patients with coronavirus disease 2019 in New York: retrospective case series

Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study

A tool for early prediction of severe coronavirus disease 2019 (COVID-19): a multicenter study using the risk nomogram in Wuhan and Guangdong, China

Prediction for progression risk in patients with COVID-19 pneumonia: the CALL score

Risk stratification of patients admitted to hospital with covid-19 using the ISARIC WHO clinical characterisation protocol: development and validation of the 4C Mortality Score

Prediction models for the clinical severity of patients with COVID-19 in Korea: retrospective multicenter cohort study

Comparing machine learning algorithms for predicting ICU admission and mortality in COVID-19

Managing COVID-19 in a novel, rapidly deployable community isolation quarantine facility

Wuhan and Hubei COVID-19 mortality analysis reveals the critical role of timely supply of medical resources

Estimating the risk of COVID-19 death during the course of the outbreak in Korea

Virtually perfect? Telemedicine for COVID-19

Factors associated with hospital admission and critical illness among 5279 people with coronavirus disease 2019 in New York city: prospective cohort study

Belgian Collaborative Group on COVID-19 Hospital Surveillance. Time between symptom onset, hospitalisation and recovery or death: statistical analysis of Belgian COVID-19 patients

Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data

Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study

Epidemiologic features and clinical course of patients infected with SARS-CoV-2 in Singapore

We would like to express our appreciation of all health care workers involved in the diagnosis and treatment of patients with COVID-19 in South Korea. We would also like to thank the Korean Disease Control and Prevention Agency and the healthcare providers from the 100 hospitals for their efforts in collecting medical records used to develop the ICU model and the oxygen model in this study.