key: cord-0948166-q34w8ewl
authors: Liu, Fang-Yan; Sun, Xue-Lian; Zhang, Yong; Ge, Lin; Wang, Jing; Liang, Xiao; Li, Jun-Fen; Wang, Chang-Liang; Xing, Zheng-Tao; Chhetri, Jagadish K.; Sun, Peng; Chan, Piu
title: Evaluation of the Risk Prediction Tools for Patients With Coronavirus Disease 2019 in Wuhan, China: A Single-Centered, Retrospective, Observational Study
date: 2020-08-25
journal: Crit Care Med
DOI: 10.1097/ccm.0000000000004549
sha: 237c920dcf8e399c5ec703de07ba88be373f30b8
doc_id: 948166
cord_uid: q34w8ewl

OBJECTIVES: To evaluate and compare the performance of National Early Warning Score, National Early Warning Score 2, Rapid Emergency Medicine Score, Confusion, Respiratory rate, Blood pressure, Age 65 score, and quick Sepsis-related Organ Failure Assessment in predicting in-hospital death in patients with coronavirus disease 2019. DESIGN: A retrospective, observational study. SETTING: Single center, West Campus of Wuhan Union Hospital, a temporary center to manage critically ill patients with coronavirus disease 2019. PATIENTS: A total of 673 consecutive adult patients with coronavirus disease 2019 between January 30, 2020, and March 14, 2020. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Data on demography, comorbidities, vital signs, mental status, oxygen saturation, and use of supplemental oxygen at admission to the ward were collected from medical records and used to score National Early Warning Score, National Early Warning Score 2, Rapid Emergency Medicine Score, Confusion, Respiratory rate, Blood pressure, Age 65 score, and quick Sepsis-related Organ Failure Assessment. Total number of patients was 673 (51% male) and median (interquartile range) age was 61 years (50–69 yr). One hundred twenty-one patients died (18%). For predicting in-hospital death, the areas under the receiver operating characteristic curve (95% CI) for National Early Warning Score, National Early Warning Score 2, Rapid Emergency Medicine Score, Confusion, Respiratory rate, Blood pressure, Age 65 score, and quick Sepsis-related Organ Failure Assessment were 0.882 (0.847–0.916), 0.880 (0.845–0.914), 0.839 (0.800–0.879), 0.766 (0.718–0.814), and 0.694 (0.641–0.746), respectively. Among the parameters of National Early Warning Score, the oxygen saturation score was found to be the most significant predictor of in-hospital death. The area under the receiver operating characteristic curve (95% CI) for oxygen saturation score was 0.875 (0.834–0.916).
CONCLUSIONS: In this single-center study, the discrimination of National Early Warning Score/National Early Warning Score 2 for predicting mortality in patients with coronavirus disease 2019 admitted to the ward was found to be superior to Rapid Emergency Medicine Score, Confusion, Respiratory rate, Blood pressure, Age 65 score, and quick Sepsis-related Organ Failure Assessment. Peripheral oxygen saturation could independently predict in-hospital death in these patients. Further validation of our finding in multiple settings is needed to determine its applicability for coronavirus disease 2019.

The novel coronavirus disease 2019 (COVID-19) was initially reported in the city of Wuhan, China, and has now spread globally (1-4). As of June 17, 2020, there were 8,061,550 registered cases in 216 countries, and 440,290 people had lost their lives (4).

Based on past reporting, the majority of patients with COVID-19 had mild symptoms with a good prognosis (5); however, some patients progressed rapidly to a critical state such as acute respiratory failure, multiple organ failure, or septic shock (6). For patients infected with COVID-19, early identification of the severity of illness could facilitate appropriate supportive care and prompt access to the ICU if necessary. In an epidemic situation, symptomatic treatment in general isolation is recommended for patients with mild COVID-19 (7). Early intensive care is warranted for patients whose condition worsens rapidly, in order to reduce mortality. At the same time, appropriate use of intensive care during the epidemic may alleviate the shortage of medical resources (8). Therefore, an easy-to-use risk prediction tool for assessing the possibility of deterioration in patients with COVID-19 is needed. Such a tool could help clinicians stratify patients into relevant risk categories and make appropriate clinical decisions in a state of emergency, where every second counts.

Several scoring systems for detecting potentially critically ill patients in acute settings have been proposed and used in emergency departments. Some of the most commonly used are the National Early Warning Score (NEWS)/National Early Warning Score 2 (NEWS2) (9-12), Rapid Emergency Medicine Score (REMS) (13, 14), Confusion, Respiratory rate, Blood pressure, Age 65 (CRB65) score (15), and quick Sepsis-related Organ Failure Assessment (qSOFA) (16, 17), which might serve as risk prediction tools for patients infected with COVID-19. These tools include parameters that are easily available in a very basic clinical setting (Supplementary Table 1). We aimed to retrospectively use the clinical data available in a single center to compare these five easy-to-use risk prediction tools for patients with COVID-19. Our study is an attempt to identify the best easy-to-use risk prediction tool to aid clinical assessment in a simple clinical setting for the management of COVID-19 patients in a triage situation.

This was a single-centered retrospective observational study conducted at the West Campus of Wuhan Union Hospital (Wuhan, China), which is a large university hospital and one of the major designated referral and treatment hospitals for critically ill adult patients (≥ 18 yr old) with COVID-19 (Supplementary Material 1, Supplemental Digital Content 6, http://links.lww.com/CCM/F709). We analyzed all data from consecutive patients admitted to the center between January 30, 2020, and March 14, 2020, who had been diagnosed with COVID-19 as per the World Health Organization interim guidance (18). Laboratory confirmation of COVID-19 infection was performed by the local health authority as described elsewhere (1). All these patients had either been discharged or had died by April 12, 2020.

In accordance with previously published retrospective chart review guidelines (19), the data abstractors were trained clinicians, directly supervised by the principal investigator (PI). Abstractors used data abstraction forms, worked from clear definitions of the variables to be abstracted, and were blinded to the study outcome. Elements of the electronic medical records (including nursing records and radiological examinations) of all consecutive patients with laboratory-confirmed COVID-19 infection were abstracted. Patient outcomes were abstracted by two nurses. Patient records missing more than one vital sign were excluded; where only one vital sign was missing, the value was imputed using the median value. Clinical data, including demographics, comorbidities, vital signs, mental status, oxygen saturation, and use of supplemental oxygen at admission to the ward, were abstracted by research assistants trained and supervised by the PI. The collected data were used to score NEWS, NEWS2, REMS, CRB65, and qSOFA (Supplementary Table 1 to Table 5, Supplemental Digital Content 5, http://links.lww.com/CCM/F708). The mental status for the different tools was based on the hospital-recorded level of consciousness (conscious, confusion, delirium, drowsiness, sopor, mild coma, medium coma, or deep coma). For NEWS/NEWS2, we allocated patients recorded as "confusion, delirium, drowsiness, sopor, mild coma, medium coma, deep coma" a score of 3; for REMS, we allocated patients recorded as "conscious" a score of 0, "confusion, delirium, drowsiness, sopor" a score of 1, "mild coma" a score of 2, "medium coma" a score of 3, and "deep coma" a score of 4; and for qSOFA, we allocated patients recorded as "confusion, delirium, drowsiness, sopor, mild coma, medium coma, deep coma" a score of 1, that is, having "altered mental status."
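The mental-status mapping and the handling of missing vital signs described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's actual procedure (the analysis was done in SAS). The function names, the record layout (a dict of vital signs with `None` for a missing value), the use of the cohort median for imputation, and the score of 0 for "conscious" under NEWS/NEWS2 and qSOFA are assumptions made for the example.

```python
from statistics import median

# Recorded levels treated as altered mental status (as listed in the text).
ALTERED = ("confusion", "delirium", "drowsiness", "sopor",
           "mild coma", "medium coma", "deep coma")

# REMS consciousness points per recorded level (as listed in the text).
REMS_POINTS = {"conscious": 0, "confusion": 1, "delirium": 1,
               "drowsiness": 1, "sopor": 1, "mild coma": 2,
               "medium coma": 3, "deep coma": 4}


def consciousness_points(level):
    """Return the consciousness component for NEWS/NEWS2, REMS, and qSOFA."""
    level = level.strip().lower()
    if level not in REMS_POINTS:
        raise ValueError(f"unrecognised level of consciousness: {level!r}")
    altered = level in ALTERED
    return {
        "NEWS": 3 if altered else 0,    # 0 for "conscious" is an assumption
        "REMS": REMS_POINTS[level],
        "qSOFA": 1 if altered else 0,   # 0 for "conscious" is an assumption
    }


def impute_or_exclude(record, cohort):
    """Return a completed copy of `record`, or None if it must be excluded.

    Records missing more than one vital sign are excluded; a single missing
    value is replaced by the median of the observed values in `cohort`
    (assumed here to be the cohort-wide median).
    """
    missing = [name for name, value in record.items() if value is None]
    if len(missing) > 1:
        return None
    completed = dict(record)
    for name in missing:
        observed = [r[name] for r in cohort if r.get(name) is not None]
        completed[name] = median(observed)
    return completed
```

For example, `consciousness_points("sopor")` gives 3 points for NEWS/NEWS2, 1 for REMS, and 1 for qSOFA, matching the allocation described in the text.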

The endpoint was in-hospital death.

Categorical variables were described as frequencies and percentages, and continuous variables were described as median and interquartile range (IQR) values. The discrimination of NEWS, NEWS2, REMS, CRB65, and qSOFA for the prediction of in-hospital death was assessed and compared using the area under the receiver operating characteristic (AUROC) curve. An AUROC of at least 0.70 was defined as "acceptable discrimination" and an AUROC of at least 0.80 was defined as "excellent discrimination" (20). The optimal cutoff point was defined as the threshold value with the maximal Youden index (21). AUROCs were compared according to the method of DeLong, DeLong, and Clarke-Pearson (22). Calibration was assessed using the Hosmer-Lemeshow test (23) and by visual assessment of loess calibration curves of predicted probabilities against observed risk (24). To evaluate the predictive power of each of the constituent elements of the best score, we first undertook univariate analysis, using logistic regression to estimate the association between the variables and death. We then undertook multivariate analysis to determine which individual variables were independent predictors of death. All significant predictor variables (univariate analysis p < 0.1) were entered into a logistic regression model, with death as the outcome. Receiver operating characteristic (ROC) curves were then created to determine the effectiveness of these predictors. In these univariate and multivariate analyses, we used the variables as they are categorized in NEWS (e.g., when a patient's respiration rate was 22 breaths/min, his/her respiration rate score was 2; see Supplementary Table 1, Supplemental Digital Content 1, http://links.lww.com/CCM/F704), rather than the raw data. A p value of less than 0.05 was considered significant. All analyses were performed using SAS Version 9.4 (SAS Institute, Cary, NC).
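To make the two central quantities concrete, the following Python sketch computes an AUROC and a Youden-optimal cutoff for an integer-valued score. It is an illustration only: the study used SAS, the function names are hypothetical, and the eight-patient data set is invented toy data, not study data. The AUROC is computed through its rank interpretation (the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with ties counted as one half), and the Youden index is J = sensitivity + specificity - 1 for the rule "score ≥ cutoff".

```python
def auroc(scores, died):
    """AUROC via pairwise comparison of non-survivors against survivors."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    # Each (non-survivor, survivor) pair contributes 1 if ranked correctly,
    # 0.5 if tied, and 0 otherwise.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, died):
    """Return (cutoff, sensitivity, specificity) maximising the Youden index.

    A patient is classified as high risk when score >= cutoff. If several
    cutoffs tie, the lowest one is returned.
    """
    n_pos = sum(1 for d in died if d)
    n_neg = len(died) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, died) if d and s >= cutoff)
        tn = sum(1 for s, d in zip(scores, died) if not d and s < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Invented toy data: admission score and in-hospital death (1 = died).
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8]
toy_died = [0, 0, 0, 1, 0, 1, 1, 1]
print(auroc(toy_scores, toy_died))
print(youden_cutoff(toy_scores, toy_died))
```

The pairwise AUROC is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formulation would be preferable for much larger data sets.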

This study was approved by the Ethics Committee of Wuhan Union Hospital (2020-LSZ-0129). As the study was observational and the data were anonymized for analysis, individual patient consent was waived by the ethics committee.

For this study, medical data of the 673 patients with COVID-19 who fulfilled the inclusion criteria were included (Supplementary Fig. 1, Supplemental Digital Content 7, http://links.lww.com/CCM/F710). The median (IQR) age was 61 years (50-69 yr); 364 patients (54%) were older than 60 years, 341 (51%) were men, and 227 (34%) had chronic diseases, including hypertension, diabetes, chronic cardiac disease, chronic pulmonary disease, cerebrovascular disease, chronic liver disease, chronic kidney disease, and malignancy. A total of 121 patients (18%) died. The baseline characteristics of patients are presented in Table 1.

Performance of NEWS, NEWS2, REMS, CRB65, and qSOFA

Figure 1 shows ROC curves comparing the ability of NEWS, NEWS2, REMS, CRB65, and qSOFA to predict in-hospital death. AUROC analysis demonstrated the discrimination of the five scoring tools as follows: NEWS, 0.882 (95% CI, 0.847-0.916); NEWS2, 0.880 (95% CI, 0.845-0.914); REMS, 0.839 (95% CI, 0.800-0.879); CRB65, 0.766 (95% CI, 0.718-0.814); and qSOFA, 0.694 (95% CI, 0.641-0.746). NEWS was found to be the best in predicting in-hospital death, and NEWS greater than or equal to 5 was the optimal threshold, with a sensitivity of 84.3% and specificity of 76.8% (Table 2). The DeLong test found the AUROC of NEWS to be significantly larger than that of REMS (p = 0.048), CRB65 (p < 0.001), and qSOFA (p < 0.001) (Supplementary Table 6, Supplemental Digital Content 8, http://links.lww.com/CCM/F711). Calibration plots are shown in Supplementary Figure 2 (Supplemental Digital Content 9, http://links.lww.com/CCM/F712), which showed NEWS, NEWS2, REMS, CRB65, and qSOFA to be well calibrated.

Univariate analysis (Table 3) showed that of the seven parameters in the NEWS score, respiration rate score, oxygen saturation score, supplemental oxygen, temperature score, heart rate score, and AVPU score were associated with death in patients with COVID-19 (at the level of p < 0.1).

Multivariate analysis showed that respiration rate score, oxygen saturation score, temperature score, and AVPU score were independent predictors of death in patients with COVID-19. After adjusting for other variables, supplemental oxygen and heart rate score did not predict death. Supplementary Figure 3 (Supplemental Digital Content 10, http://links.lww.com/CCM/F713) shows ROC curves comparing the ability of each parameter of NEWS to predict in-hospital death. AUROC analysis demonstrated the discrimination of each parameter as follows: respiration rate score, 0.687 (95% CI, 0.636-0.738); oxygen saturation score, 0.875 (95% CI, 0.838-0.913); systolic blood pressure score, 0.511 (95% CI, 0.476-0.546); temperature score, 0.544 (95% CI, 0.503-0.585); heart rate score, 0.534 (95% CI, 0.479-0.589); and AVPU score, 0.566 (95% CI, 0.534-0.597). Oxygen saturation score was found to be the best in predicting in-hospital death, and oxygen saturation score greater than or equal to 2 (i.e., oxygen saturation ≤ 93%) was the optimal threshold, with a sensitivity of 77.7% and specificity of 89.7%. The DeLong test found the AUROC of oxygen saturation score to be significantly larger than that of respiration rate score (p < 0.001), systolic blood pressure score (p < 0.001), temperature score (p < 0.001), heart rate score (p < 0.001), and AVPU score (p < 0.001) (Supplementary Table 7, Supplemental Digital Content 11, http://links.lww.com/CCM/F714). Therefore, we chose to compare the oxygen saturation score with the other prediction scoring systems (Fig. 1). The oxygen saturation score was found to be better than CRB65 (p < 0.001) and qSOFA (p < 0.001) and not worse than NEWS (p = 0.63), NEWS2 (p = 0.74), and REMS (p = 0.1) in predicting in-hospital death (Supplementary Table 8, Supplemental Digital Content 12, http://links.lww.com/CCM/F715). Calibration plots showed that oxygen saturation score was well calibrated (Supplementary Fig. 2, Supplemental Digital Content 9, http://links.lww.com/CCM/F712). The oxygen saturation score appears to give a better specificity but a lower sensitivity than NEWS.
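As an illustration of how raw vital signs translate into the categorized sub-scores analyzed here, the Python sketch below bands respiration rate and oxygen saturation. The band limits are an assumption: they follow the published NEWS chart as commonly reproduced (Supplementary Table 1 is not available in this text), and they are consistent with the two facts stated in the paper, namely that a respiration rate of 22 breaths/min scores 2 and that an oxygen saturation score of 2 or more corresponds to oxygen saturation ≤ 93%. The function names are hypothetical.

```python
def respiration_rate_score(breaths_per_min):
    """NEWS respiration rate sub-score (assumed bands from the NEWS chart)."""
    if breaths_per_min <= 8:
        return 3
    if breaths_per_min <= 11:
        return 1
    if breaths_per_min <= 20:
        return 0
    if breaths_per_min <= 24:
        return 2
    return 3


def oxygen_saturation_score(spo2_percent):
    """NEWS oxygen saturation sub-score (assumed bands from the NEWS chart)."""
    if spo2_percent <= 91:
        return 3
    if spo2_percent <= 93:
        return 2
    if spo2_percent <= 95:
        return 1
    return 0


def high_risk_by_oxygen_saturation(spo2_percent):
    """Apply the threshold reported in the text: sub-score >= 2 (SpO2 <= 93%)."""
    return oxygen_saturation_score(spo2_percent) >= 2
```

Under these bands, a patient admitted with an oxygen saturation of 93% would be flagged by the single-parameter rule, whereas one admitted at 94% would not.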

It is vital to determine as quickly as possible which patients with COVID-19 infection are at high risk of death, especially in settings with poor healthcare resources, so as to make proper use of all available resources. Thus, the risk prediction tools employed to aid triage decisions should be based on rapidly obtainable parameters that relate directly to prognosis. We have compared five such well-established tools (NEWS, NEWS2, REMS, qSOFA, and CRB65) for predicting the risk of death in a critical situation.

NEWS is a validated clinical assessment tool developed by the Royal College of Physicians in the United Kingdom (9-11). NEWS comprises respiratory rate, oxygen saturation, temperature, systolic blood pressure, heart rate, and level of consciousness (Supplementary Table 1 unvalidated. In our current study, we found that the discrimination of both NEWS and NEWS2 for the prediction of in-hospital death in patients with COVID-19 infection in Wuhan was excellent. There was no significant difference between the performances of NEWS and NEWS2, which could be due to the small number of patients with hypercapnic respiratory failure. Therefore, for ease of use, NEWS might be more practical. Redfern et al (25) found that NEWS had better discrimination than qSOFA for identifying high-risk non-ICU patients with/without infection. In this study, we found that NEWS/NEWS2 performed better than the other three risk prediction scoring systems as a predictor of in-hospital death in patients with COVID-19. A previous study has shown that NEWS was good at predicting acute mortality (10). In this study, we found that NEWS was also good at predicting in-hospital mortality in patients with COVID-19. As a predictor of in-hospital death, the optimal cut point for NEWS was 5. With a sensitivity of 84.3% and specificity of 76.8%, it was more accurate and had a higher negative predictive value (NPV) (95.7%) and positive predictive value (PPV) (44.4%) than REMS, CRB65, and qSOFA (Table 2). In addition, our results suggest that it is more appropriate to admit patients with scores of 5 and above to the ICU, because the clinical situation of patients with COVID-19 can deteriorate rapidly. Using NEWS as a triage aid would not only simplify the decision-making process but also improve the quality of medical care and save medical resources for the patients who need them.
Based on the risk prediction score, critical care resources could be reserved for the identified high-risk patients. During the COVID-19 pandemic, critical care is a scarce resource facing soaring demand; thus, preplanned resource allocation and prioritization based on NEWS could substantially improve care efficiency and ultimately lead to better outcomes, while also signaling severity early enough to promote early intervention and reduce fatality.
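The link between the reported sensitivity, specificity, and predictive values for NEWS ≥ 5 can be checked with a short calculation. The Python sketch below derives PPV and NPV from sensitivity, specificity, and the in-hospital mortality of the cohort (121 of 673); the function name is hypothetical, and the result is only a consistency check on rounded inputs, not a reanalysis of the study data.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence                # true positives (fraction)
    fn = (1 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1 - prevalence)          # true negatives
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    return tp / (tp + fp), tn / (tn + fn)


# NEWS >= 5: sensitivity 84.3%, specificity 76.8%; 121 deaths among 673.
ppv, npv = predictive_values(0.843, 0.768, 121 / 673)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

The computed values should agree with the reported PPV of 44.4% and NPV of 95.7% to within rounding of the inputs. Because PPV and NPV depend on prevalence, the same threshold would yield different predictive values in a cohort with lower mortality.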

A recent study suggested applying the Multilobular infiltration, hypo-Lymphocytosis, Bacterial coinfection, Smoking history, hyper-Tension and Age (MuLBSTA) score to predict the risk of mortality in COVID-19 infection (5). The MuLBSTA score requires six indexes: age, smoking history, hypertension, bacterial coinfection, multilobular infiltration, and lymphopenia (26). Because this score takes longer to assess and, more importantly, includes laboratory tests, we did not include it in this study. In addition, it should be noted that many countries with poor medical resources or other limited resources may be unable to perform such complicated tests in an emergency situation or on site. Thus, simple tools such as NEWS could be more useful for rapid assessment and management during the COVID-19 pandemic. NEWS requires only vital signs, oxygen saturation, use of supplemental oxygen, and mental status, so nurses or even volunteers can use the tool without much training.

Baker et al (27) found that even a single deranged physiologic parameter at admission was associated with mortality in a critically ill population. Our study showed that the median admission oxygen saturation of nonsurvivors and survivors with COVID-19 was 86% (IQR, 73.5-93%) and 96% (IQR, 96-98%), respectively. The value of oxygen saturation was confirmed by multivariate analysis. Furthermore, our study demonstrated that oxygen saturation score had excellent discrimination and good calibration. The discrimination of oxygen saturation score was better than that of CRB65 and qSOFA and not worse than that of NEWS, NEWS2, and REMS in predicting in-hospital death. In other words, the oxygen saturation level alone may also be used to predict death in patients with COVID-19 infection. Peripheral oxygen saturation can be quickly and easily obtained by any health professional almost anywhere. It may be used as an inexpensive, reliable tool for assessing the severity of COVID-19 and detecting patients at high risk in poor resource settings. However, our results suggest that while oxygen saturation score gives a higher specificity and PPV, it has a lower sensitivity than NEWS. The lower sensitivity indicates that oxygen saturation score is relatively poor at correctly identifying, from admission data, those patients who will subsequently die in hospital. Smith et al (28) found that early warning scores, such as NEWS, provide better detection of adverse outcomes at a lower trigger rate than a single vital sign variable. Therefore, if resources are not limited, NEWS would be a better choice.

It should also be noted that during the initial phase of the COVID-19 outbreak, due to the rapid spread and limited medical resources, patients admitted to hospital were more likely to have a higher risk of death. Therefore, more than 30% of patients included in this study were severe and critically ill patients. Thus, these patients had a higher mortality rate (17.9%) than the previously reported case fatality rate in China (5.6%) (29).

The results of this study must be interpreted in light of some limitations. First, as this was a retrospective study, all data were collected as part of usual care rather than for research, and thus some clinically meaningful data were not available (e.g., the mode of oxygen delivery for all patients). Second, this is a single-center study, and our findings may not be generalizable to all hospitals. Third, further external validation of oxygen saturation score as a prediction model may be needed. Fourth, our study was not intended to create a new triage system but rather to validate and compare existing scoring systems widely used in emergency situations for application to COVID-19, based on data from our large sample in the initial phase of the pandemic, so that different triage systems can be applied in different situations for the further care of COVID-19 patients. Depending on the available medical resources, the balance between the PPV and NPV of each scoring system should be weighed. Last, data on suspected but undiagnosed cases were not included in our study. A future multicenter study including as many patients as possible is warranted to gain a more comprehensive understanding of the performance of these scoring systems in detecting potentially critically ill patients with COVID-19 infection.

In this single-center study, the discrimination of NEWS/NEWS2 for predicting in-hospital death in patients with COVID-19 admitted to the ward was found to be superior to that of REMS, CRB65, and qSOFA. In addition, peripheral oxygen saturation could independently predict in-hospital death in these patients. Further validation in multiple different settings may be needed to determine the widespread applicability of these tools.

Sun contributed equally and shared first authorship

Sun and Chan had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis

Sun involved in administrative, technical, or material support

Direct URL citations appear in the printed text and are provided in the HTML and PDF versions of this article on the journal

Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in

A novel coronavirus from patients with pneumonia in China

Clinical findings in a group of patients infected with the 2019 novel coronavirus (SARS-Cov-2) outside of Wuhan, China: Retrospective case series

Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: Summary of a report of 72314 cases from the Chinese Center for Disease Control and Prevention

Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: A descriptive study

World Health Organization: Novel Coronavirus (COVID-19) Situation Report-46

Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: A single-centered, retrospective, observational study

Standardising the Assessment of Acute-Illness Severity in the NHS

The ability of the National Early Warning Score (NEWS) to discriminate patients at risk of early cardiac arrest, unanticipated intensive care unit admission, and death

A prospective validation of National Early Warning Score in emergency intensive care unit patients at Beijing

National Early Warning Score (NEWS) 2: Standardising the Assessment of Acute-Illness Severity in the NHS

Comparison of the Rapid Emergency Medicine Score and APACHE II in nonsurgical emergency department patients

Rapid Emergency Medicine score: A new prognostic tool for in-hospital mortality in nonsurgical emergency department patients

Defining community acquired pneumonia severity on presentation to hospital: An international derivation and validation study

The third international consensus definitions for sepsis and septic shock (Sepsis-3)

Assessment of clinical criteria for sepsis: For the third international consensus definitions for sepsis and septic shock (Sepsis-3)

World Health Organization: Clinical Management of Severe Acute Respiratory Infection When Novel Coronavirus (nCoV) Infection Is Suspected: Interim Guidance

Reassessing the methods of medical record review studies in emergency medicine research

Applied Logistic Regression

Index for rating diagnostic tests

Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach

Hosmer D: A goodness-of-fit test of the multiple logistic regression model

A calibration hierarchy for risk models was defined: From utopia to empirical data

A comparison of the Quick Sequential (Sepsis-Related) Organ Failure Assessment Score and the National Early Warning Score in non-ICU patients with/without infection

Clinical features predicting mortality risk in patients with viral pneumonia: The MuLBSTA score

Single deranged physiologic parameters are associated with mortality in a low-income country

A comparison of the ability of the physiologic components of Medical Emergency Team Criteria and the U.K. National Early Warning Score to discriminate patients at risk of a range of adverse clinical outcomes

National Health Commission of the People's Republic of China: National Health Commission of the People's Republic of China Home Page

We thank Hui Qiu, BN and Shuang Xu, MD from the Union Hospital, Tongji Medical College, and Huazhong University of Science and Technology for their help in collecting the data.

The authors have disclosed that they do not have any potential conflicts of interest. For information regarding this article, E-mail: simple1111@hust.edu.cn (Dr. P. Sun); pbchan@hotmail.com (Dr. Chan)