key: cord-1003520-ekzrzzdb
authors: Zhang, Kai; Zhang, Xing; Ding, Wenyun; Xuan, Nanxia; Tian, Baoping; Huang, Tiancha; Zhang, Zhaocai; Cui, Wei; Huang, Huaqiong; Zhang, Gensheng
title: The Prognostic Accuracy of National Early Warning Score 2 on Predicting Clinical Deterioration for Patients With COVID-19: A Systematic Review and Meta-Analysis
date: 2021-07-09
journal: Front Med (Lausanne)
DOI: 10.3389/fmed.2021.699880
sha: 7f98b652d98aba0f5d8ec2dfde6f7090f198c2b8
doc_id: 1003520
cord_uid: ekzrzzdb

Background: During the coronavirus disease 2019 (COVID-19) pandemic, the National Early Warning Score 2 (NEWS2) is recommended for the risk stratification of COVID-19 patients, but little is known about its ability to detect severe cases. Therefore, our purpose is to assess the prognostic accuracy of NEWS2 on predicting clinical deterioration for patients with COVID-19.

Methods: We searched PubMed, Embase, Scopus, and the Cochrane Library from December 2019 to March 2021. Clinical deterioration was defined as the need for intensive respiratory support, admission to the intensive care unit, or in-hospital death. Sensitivity, specificity, and likelihood ratios were pooled by using the bivariate random-effects model. Overall prognostic performance was summarized by using the area under the curve (AUC). We performed subgroup analyses to assess the prognostic accuracy of NEWS2 in different conditions.

Results: Eighteen studies with 6,922 participants were included. The NEWS2 of five or more was commonly used for predicting clinical deterioration. The pooled sensitivity, specificity, and AUC were 0.82, 0.67, and 0.82, respectively. Benefitting from adding a new SpO2 scoring scale for patients with hypercapnic respiratory failure, the NEWS2 showed better sensitivity (0.82 vs. 0.75) and discrimination (0.82 vs. 0.76) than the original NEWS. In addition, the NEWS2 was a sensitive method (sensitivity: 0.88) for predicting short-term deterioration within 72 h.
Conclusions: The NEWS2 had moderate sensitivity and specificity in predicting the deterioration of patients with COVID-19. Our results support the use of NEWS2 monitoring as a sensitive method to initially assess COVID-19 patients at hospital admission, although it has a relatively high false-trigger rate. Our findings indicated that the development of enhanced or modified NEWS may be necessary.

The recent outbreak of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has challenged healthcare systems worldwide (1). SARS-CoV-2 has resulted in more than 12.5 million confirmed cases, with more than 2.7 million deaths (2). Although the majority of patients with COVID-19 are symptomless or oligosymptomatic, about one-fifth of patients may develop severe COVID-19 with a high risk of mortality (3, 4). Thus, for patients with COVID-19, early identification of deteriorating patients is important because it could direct finite resources toward those patients in greatest clinical need. However, risk stratification and early identification of patients at high risk of clinical deterioration at admission remain major challenges. Frontline health workers constantly face the challenge of determining the severity and prognosis of COVID-19 cases in order to provide high-quality care and allocate resources effectively (5). Therefore, there is a need for an easy-to-use and effective risk-predictive tool to assess the possibility of deterioration of patients with COVID-19.
Abbreviations: COVID-19, coronavirus disease 2019; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; UK, United Kingdom; ICU, intensive care unit; ED, emergency department; qSOFA, quick Sequential Organ Failure Assessment; AUC, area under the curve; NEWS, National Early Warning Score; CI, confidence interval; PLR, positive likelihood ratio; NLR, negative likelihood ratio; DOR, diagnostic odds ratio; HSROC, hierarchical summary receiver operating characteristic.

The National Early Warning Score (NEWS), first introduced in 2012 and updated in 2017 (NEWS2), has received a formal endorsement from the National Health Service to become the early warning system for deterioration of acutely ill patients in the United Kingdom (UK) (6, 7). The NEWS/NEWS2 is a scoring system based on routine physiological parameters, which can be obtained easily and rapidly at the bedside. Each indicator is given a score, where 0 is considered normal, and simple addition allows a total score from 0 to 20. A score of 5 or more represents the key threshold for urgent response, and patients with a score of 7 or more would be deemed to have a high clinical risk and trigger a high-level clinical alert (Table 1) (6, 7). Since some components (e.g., temperature, oxygen saturation, and supplemental oxygen dependency) have been shown to be associated with the progression of COVID-19 (8, 9), guidelines from the Royal College of Physicians (10) and the Swiss Society of Intensive Care Medicine (11) advocate the use of the NEWS2 for initial assessment in patients with COVID-19. However, these recommendations were based only on expert opinion, and there have been no published meta-analyses evaluating the predictive performance of the NEWS2. Therefore, the aim of the present study was to evaluate the prognostic accuracy of the NEWS2 on predicting clinical deterioration for patients with COVID-19. In addition, we performed a comparison of the NEWS2 with the original NEWS.
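The two response thresholds described above can be written as a simple lookup. The following is a minimal sketch in Python and is not part of the original study: the function name and the band labels are hypothetical, and it looks only at the aggregate score, assuming that any additional trigger rules in the full NEWS2 chart are handled elsewhere.

```python
def news2_risk_band(total_score: int) -> str:
    """Classify an aggregate NEWS2 score using the thresholds quoted in the text.

    The band labels are illustrative names, not official terminology.
    """
    if total_score < 0:
        raise ValueError("NEWS2 aggregate score cannot be negative")
    if total_score >= 7:
        return "high_risk_alert"       # 7 or more: high-level clinical alert
    if total_score >= 5:
        return "urgent_response"       # 5 or 6: key threshold for urgent response
    return "below_urgent_threshold"    # 0 to 4


if __name__ == "__main__":
    for score in (3, 5, 8):
        print(score, news2_risk_band(score))
```

Checking the higher threshold first keeps the two cut-offs independent, so either can be changed when exploring alternative trigger values.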
We followed the PRISMA statement (12) to structure the meta-analysis (Supplementary Material 1). A predefined protocol has been registered in PROSPERO (CRD42021243845).

The basic inclusion criteria were as follows: (1) recruited adult patients with confirmed SARS-CoV-2 infection, (2) applied the NEWS2 or the NEWS to predict clinical deterioration (including the need for intensive respiratory support, admission to the ICU, or in-hospital death), and (3) provided sufficient data to estimate the prognostic accuracy. There was no language restriction. The detailed search strategies and inclusion and exclusion criteria are recorded in Supplementary Material 3.

Two authors independently retrieved and extracted studies according to the inclusion criteria. We recorded the numbers of true positives, false positives, false negatives, and true negatives, either directly from the articles or by recalculation from the reported sensitivity and specificity. Any disagreement in the process was resolved by discussion. Two authors employed the PROBAST to assess the risk of bias and applicability concerns of the included studies (13). The detailed quality assessment standard is recorded in Supplementary Material 3.

We used a bivariate random-effects regression model (14) to pool the sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and area under the curve (AUC) as point estimates with 95% confidence intervals (CI). We also constructed the hierarchical summary receiver operating characteristic (HSROC) curve to present the summary point estimates of sensitivity and specificity. I² statistics were calculated to assess the statistical heterogeneity between the included studies, where I² > 50% indicated a substantial level of heterogeneity (15). We performed subgroup analyses to evaluate the performance of the NEWS2 in different conditions. Studies were stratified

A total of 8,746 published studies were initially identified.
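The recalculation of the 2 × 2 counts mentioned in the data-extraction step can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' code: the class and function names are hypothetical, the input numbers are invented for demonstration, and the sketch assumes a study reports sensitivity, specificity, and the numbers of patients with and without deterioration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # deteriorated, score at or above threshold
    fp: int  # did not deteriorate, score at or above threshold
    fn: int  # deteriorated, score below threshold
    tn: int  # did not deteriorate, score below threshold

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def reconstruct_two_by_two(sensitivity: float, specificity: float,
                           n_deteriorated: int, n_stable: int) -> TwoByTwo:
    """Rebuild the 2 x 2 counts from reported accuracy and group sizes.

    Counts are rounded to the nearest integer, so the rebuilt table may
    differ by one patient from the true table when the reported
    sensitivity or specificity was itself rounded.
    """
    for value in (sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("sensitivity and specificity must lie in [0, 1]")
    tp = round(sensitivity * n_deteriorated)
    tn = round(specificity * n_stable)
    return TwoByTwo(tp=tp, fp=n_stable - tn, fn=n_deteriorated - tp, tn=tn)


if __name__ == "__main__":
    # Hypothetical study: 50 patients deteriorated, 200 did not.
    table = reconstruct_two_by_two(0.80, 0.70, n_deteriorated=50, n_stable=200)
    print(table)
```

Deriving the false negatives and false positives by subtraction guarantees that the row totals match the reported group sizes even after rounding.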
After removing the duplicate articles and screening the abstracts, we identified 40 studies for full-text assessment, of which 22 were excluded with reasons (the list of excluded studies with reasons is shown in Supplementary Material 4). Finally, we included 18 studies (17-34) in our meta-analyses (Figure 1). Table 2 shows the basic information and characteristics of the included studies. A total of 6,922 participants were included in the analysis, with the mortality rate in each study ranging from 6 to 47%. Three studies (22, 24, 28) were relatively small in sample size (<100), and six studies (17, 19, 21, 26, 30, 34) enrolled more than 400 patients. Fifteen studies (17-19, 21, 22, 24-27, 29-34) investigated general ward patients, and three (20, 23, 28) investigated only the emergency department (ED) population. Fourteen studies (18-26, 28, 30, 32-34) used the NEWS2, while another four studies (17, 27, 29, 31) only used the original NEWS. Moreover, in six studies (18, 19, 23, 24, 28, 31), the investigators employed a positive quick Sequential Organ Failure Assessment (qSOFA ≥2) to predict clinical deterioration.

Table 3 shows the summary results of the quality assessments by using PROBAST (PROBAST, Prediction model Risk Of Bias ASsessment Tool; ROB, risk of bias; "+" indicates low ROB/low concern regarding applicability; "-" indicates high ROB/high concern regarding applicability; and "?" indicates unclear ROB/unclear concern regarding applicability). Overall, 16 studies had a high or unclear risk of bias, mainly because of inappropriate handling of missing data (11 studies excluded participants with missing values from the analyses, and five studies did not explicitly state how missing data were handled).
Four studies had a high or unclear concern regarding applicability since the threshold value of the NEWS or the time interval between the evaluation of the predictor and the determination of the outcome was not consistent with other studies. The details of the quality assessment are reported in Supplementary Material 5. In addition, the Deeks' funnel plot indicated that there was a potential publication bias among the included studies (Deeks' test: P < 0.10, Supplementary Material 6).

Eleven studies used the NEWS2 to predict clinical deterioration for patients with COVID-19. Figure 2 shows the forest plot of sensitivity and specificity for the NEWS2; the pooled sensitivity and specificity of the NEWS2 were 0.82 (95% CI: 0.75, 0.87) and 0.67 (95% CI: 0.58, 0.75). The pooled PLR and NLR of the NEWS2 were 2.50 (95% CI: 1.96, 3.20) and 0.27 (95% CI: 0.20, 0.37). In seven studies reporting the prognostic accuracy of the NEWS (Figure 3), the pooled sensitivity and specificity were 0.75 (95% CI: 0.63, 0.84) and 0.65 (95% CI: 0.52, 0.76). Figure 4 shows the HSROC curves for the NEWS2 (Figure 4A) and the NEWS (Figure 4B); the AUC was 0.82 (95% CI: 0.79, 0.85) and 0.76 (95% CI: 0.72, 0.79), respectively. Considerable heterogeneity existed across the studies.

In six studies, the researchers employed the qSOFA to predict clinical deterioration. The pooled sensitivity, specificity, and AUC of qSOFA were 0.26 (95% CI: 0.20, 0.33), 0.94 (95% CI: 0.86, 0.97), and 0.64 (95% CI: 0.60, 0.80), respectively (Table 4).

There was evidence that the prognostic performance of the NEWS2 varied across different subgroups (Table 4). The performance of the NEWS2 for predicting clinical deterioration within 72 h was better than that during hospitalization (AUC: 0.86 vs. 0.80). In addition, the NEWS2 had more moderate sensitivity and specificity and better discrimination in patients with a less severe disease (mortality rate, <10%).
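As a rough consistency check on the pooled NEWS2 estimates reported above, the likelihood ratios can be recomputed from the pooled sensitivity (0.82) and specificity (0.67). The sketch below is illustrative Python, not the authors' analysis: the function names are hypothetical, and the 20% pre-test probability in the usage example is an assumed value chosen only for demonstration. Because the bivariate model pools these quantities jointly, the reported PLR (2.50) and NLR (0.27) need not match the naive ratios exactly.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (PLR, NLR) from a sensitivity/specificity pair."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly in (0, 1)")
    plr = sensitivity / (1.0 - specificity)   # how much a positive score raises the odds
    nlr = (1.0 - sensitivity) / specificity   # how much a negative score lowers the odds
    return plr, nlr


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Update a pre-test probability with a likelihood ratio via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly in (0, 1)")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    plr, nlr = likelihood_ratios(0.82, 0.67)
    print(f"naive PLR = {plr:.2f}, naive NLR = {nlr:.2f}")  # about 2.48 and 0.27

    # Assumed pre-test probability of deterioration (hypothetical value).
    pre_test = 0.20
    print(f"after NEWS2 >= 5: {post_test_probability(pre_test, 2.50):.3f}")
    print(f"after NEWS2 < 5:  {post_test_probability(pre_test, 0.27):.3f}")
```

Under the assumed 20% pre-test probability, a positive score roughly doubles the estimated risk while a negative score lowers it to about 6%, which fits the paper's description of the NEWS2 as a sensitive screen with a relatively high false-trigger rate.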
In sensitivity analyses, we restricted the analyses to studies that evaluated the NEWS2 at hospital admission or studies that used the threshold of ≥5; the pooled sensitivity, specificity, PLR, NLR, and AUC were largely consistent with the primary results (Table 4).

It is vital to determine as quickly as possible which patients with COVID-19 are at a high risk of deterioration, especially in settings with poor healthcare resources, so as to make proper use of all available resources. To the best of our knowledge, this is the first meta-analysis to evaluate the prognostic accuracy of the NEWS2 on predicting clinical deterioration for patients with COVID-19. In general, the NEWS2 has good discrimination in predicting the combined outcome of the need for intensive respiratory support, admission to the ICU, or in-hospital death. Its high sensitivity means that the NEWS2 can be used as a sensitive method to initially assess COVID-19 patients at hospital admission. In addition, our results showed that using a threshold of 5 results in high sensitivity (0.83), moderate specificity (0.65), and good discrimination (0.82). This means that early interventions should be implemented as soon as possible for COVID-19 patients with a NEWS2 of five or more because the clinical situation of those patients is expected to deteriorate rapidly.

The estimates of the pooled results showed considerable heterogeneity between studies. Investigating the sources of heterogeneity and the prognostic performance of the NEWS2 in different conditions were important objectives of our study. First of all, the NEWS2, an updated version of the NEWS, differs from the original NEWS by the inclusion of a new SpO2 scoring scale for use in patients with hypercapnic respiratory failure. Oxygen supplementation has been proven to be an independent risk factor for novel coronavirus pneumonia progressing to a critical condition (35). Liu et al.
(26) demonstrated that the oxygen saturation level had a good prognostic performance for predicting death in patients with COVID-19 infection. Thus, benefitting from adding a specific scale for patients with hypercapnic respiratory failure, the NEWS2 showed better sensitivity and discrimination than the original NEWS. Second, the time window between score calculation and outcome measurement could also account for heterogeneity. Since predictive accuracy can improve when the score is calculated close to the occurrence of the outcome, the NEWS2 has a high sensitivity in predicting clinical deterioration within 72 h for patients with COVID-19. The result supports the use of NEWS2 monitoring as a sensitive method to conduct an initial assessment of COVID-19 patients at hospital admission. Third, the severity of disease might affect the prognostic accuracy as well. For patients with higher mortality rates (≥10%), the NEWS has a high sensitivity but a relatively low specificity, indicating a relatively high false-trigger rate. However, the sensitivity and specificity of the NEWS2 are more moderate in patients with lower mortality (<10%, mostly in the ED). The result supports using the NEWS2 as an adjunct to the process of triage and disposition of newly admitted patients with COVID-19, especially in overcrowded emergency rooms (20). Moreover, study location might be a source of heterogeneity because differences in the healthcare systems of each country could affect clinical outcomes. Specifically, early warning score systems have been introduced and linked to effective clinical responses in many UK hospitals (36). This might introduce a treatment paradox, in which some deteriorating patients receive rapid medical interventions after triggering the alert. Hence, the actual deterioration rate tends to be lower than predicted, which biases our estimate of accuracy.
In addition, the primary outcome consists of the need for intensive respiratory support, admission to the ICU, and in-hospital mortality. The indications for intensive respiratory support and the standards for ICU admission varied among the included studies, which might affect the occurrence of positive results and become a source of heterogeneity.

The NEWS2 is a summary score derived from six physiological parameters; some parameters relate to the degree of respiratory failure, such as oxygen saturation and oxygen supplementation. Since COVID-19 is often characterized by solitary respiratory failure (37, 38), an advantage of the NEWS2 compared to other scoring systems is that both hypoxemia and supportive oxygen treatment are included as scoring parameters. This could explain its relatively better performance compared to other scoring systems. In our study, compared with qSOFA, the NEWS had better discrimination and moderate sensitivity and specificity.

Although our research suggests that the NEWS2 has good prognostic performance, it is worth highlighting some potential pitfalls in clinical practice. For instance, in patients with COVID-19, the oxygen requirement might increase rapidly if their respiratory function continues to worsen, but the increased oxygen requirement does not directly cause an increase in the NEWS2 since oxygen supplementation is only a binary variable (yes or no) in the NEWS2 scoring system. Therefore, clinically, we suggest that any increase in oxygen requirement for patients with COVID-19 should arouse the attention of clinicians. Furthermore, given that older patients with COVID-19 have a higher proportion of severe cases and a higher fatality ratio (39, 40), the pandemic has prompted the need to pay particular attention to the health of older persons. Evidence also showed that increased age was independently associated with poor prognosis in COVID-19 patients (8).
A Chinese group put forward a modified version of the NEWS2 with the addition of age >65 years as an independent component, termed NEWS-C (41). An external validation study found that the NEWS-C has the best predictive accuracy among common scoring systems for predicting the deterioration of respiratory function in patients with COVID-19 (31). Therefore, it is possible that the prognostic accuracy of the NEWS2 could be improved by modifying the score. Notably, the NEWS2 is not an alternative to the clinical judgment by experienced clinicians; it should be utilized to help in clinical decision-making by providing objective data. According to the guidelines of the Royal College of Physicians (10), patients with the NEWS2 <5 should also receive strict monitoring because a considerable proportion may still rapidly progress to severe respiratory failure. Finally, in addition to the initial assessment of illness severity, the NEWS was originally designed as a track-and-trigger tool to identify acute clinical deterioration and guide the clinical response for patients. By recording the score on a regular basis, the trends in the clinical response of a patient can be tracked, providing an early warning of clinical deterioration and the need for more intensive treatment (6). Baker et al. found an increasing trend of the NEWS2 beginning many hours prior to the occurrence of a serious clinical deterioration event (18). Therefore, the score should be calculated not only at the admission of patients but also throughout their hospital stay to evaluate a possible deterioration in their clinical situation.

The strengths of this meta-analysis include, first, a standard protocol and comprehensive search strategies across multiple databases. Thus, we believe that we did not miss any relevant studies. Second, a statistically robust hierarchical model was employed to estimate pooled results and to construct HSROC plots.
This approach allows for both between-study variability in sensitivity and specificity and flexibility in the estimation of summary statistics (42). Our findings can contribute to a better understanding of the NEWS2 in patients with COVID-19, which could be useful for implementing the NEWS2 in clinical practice.

Meanwhile, there are some important limitations in the meta-analysis. First, previous research suggests that heterogeneity is widely observed in systematic reviews of diagnostic test accuracy (43, 44). We also identified significant heterogeneity among the included studies, which might affect the credibility of the pooled estimates. Second, most of the included studies were single-center studies with a relatively small sample size, which may limit the generalizability and certainty of our analysis. Furthermore, the NEWS2 was not designed as a single-time-point predictive tool. Since existing research only shows the prognostic accuracy of the NEWS2 in predicting clinical deterioration at a single time point (mostly at the time of admission), we could not evaluate the NEWS2 in any other context. In addition, the timings of the NEWS2 measurement were not entirely consistent in the included studies. We assume that the accuracy might be improved if multiple time points were considered, and that the trend of the NEWS2 over time has potential value for predicting mortality, just like the delta SOFA (45).

We performed the first meta-analysis to examine the prognostic accuracy of the NEWS2 on predicting clinical deterioration for patients with COVID-19. The NEWS2 has moderate sensitivity and specificity in predicting the deterioration of patients with COVID-19, and the threshold of 5 is an optimal trigger threshold for activating a rapid response.
Our results support the recommendations for use of NEWS2 monitoring as a sensitive method to initially assess COVID-19 patients at hospital admission, although it has a relatively high false-trigger rate. However, the discriminative power of the NEWS2 is far from excellent. Further improvements of the NEWS2 by modifying the score or combining more important predictors are still necessary. In addition, the value of a single assessment is limited. Further research should focus on the utility of longitudinal NEWS2 monitoring to identify deteriorating patients and guide clinical response, not solely for initial assessment at hospital admission.

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.

KZ conceived the idea, performed the analysis, and drafted the manuscript. XZ and WD contributed to the study design, data acquisition, and interpretation. NX, BT, and TH helped in the statistical analysis. ZZ and WC critically revised the manuscript for important intellectual content. GZ and HH helped to frame the idea of the study and provided technical support. All authors have read and approved the submitted version.

This work was supported in part by grants from the National Natural Science Foundation of China (No. 81971871, GZ) and the Medical and Health Research Program of Zhejiang Province (No. 2021KY174, GZ). The sponsors of this study had no role in study design, data collection, data analysis, data interpretation, or writing of the report.

References:
1. A novel coronavirus outbreak of global health concern
2. WHO Coronavirus (COVID-19) Dashboard. Available online at
3. Global, regional, and national estimates of the population at increased risk of severe COVID-19 due to underlying health conditions in 2020: a modelling study
4. Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72,314 cases from the Chinese Center for Disease Control and Prevention
5. AGS position statement: resource allocation strategies and age-related considerations in the COVID-19 era and beyond
6. Standardising the Assessment of Acute-Illness Severity in the NHS
7. National Early Warning Score (NEWS) 2. Standardising the Assessment of Acute-Illness Severity in the NHS. Updated report of a working party
8. Risk factors of critical & mortal COVID-19 cases: a systematic literature review and meta-analysis
9. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study
10. NEWS2 and Deterioration in COVID-19. Available online at
11. Recommendations for the admission of patients with COVID-19 to intensive care and intermediate care units (ICUs and IMCUs)
12. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration
13. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies
14. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews
15. Quantifying heterogeneity in a meta-analysis
16. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed
17. A fuller picture of COVID-19 prognosis: the added value of vulnerability measures to predict mortality in hospitalised older adults
18. National early warning score 2 (NEWS2) to identify inpatient COVID-19 deterioration: a retrospective analysis
19. Utility of established prognostic scores in COVID-19 hospital admissions: multicentre prospective evaluation of CURB-65, NEWS2 and qSOFA
20. Predicting intensive care unit admission and death for COVID-19 patients in the emergency department using early warning scores
21. Comparison of severity scores for COVID-19 patients with pneumonia: a retrospective study
22. Predictive value of national early warning score 2 (NEWS2) for intensive care unit admission in patients with SARS-CoV-2 infection
23. Predicting severe COVID-19 in the emergency department
24. COVID-19: symptoms, course of illness and use of clinical scoring systems for the first 42 patients admitted to a Norwegian local hospital
25. Prognostic accuracy of the SIRS, qSOFA, and NEWS for early detection of clinical deterioration in SARS-CoV-2 infected patients
26. Evaluation of the risk prediction tools for patients with coronavirus disease 2019 in Wuhan, China: a single-centered, retrospective, observational study
27. Prognostic factors in patients admitted to an urban teaching hospital with COVID-19 infection
28. National early warning score 2 (NEWS2) on admission predicts severe disease and in-hospital mortality from Covid-19 - a prospective cohort study
29. National early warning score to predict intensive care unit transfer and mortality in COVID-19 in a French cohort
30. Use of the first National Early Warning Score recorded within 24 hours of admission to estimate the risk of in-hospital mortality in unplanned COVID-19 patients: a retrospective cohort study
31. Prognostic accuracy of early warning scores for clinical deterioration in patients with COVID-19
32. National early warning score 2 (NEWS2) better predicts critical coronavirus disease 2019 (COVID-19) illness than COVID-GRAM, a multi-centre study
33. Early warning scores in patients with suspected COVID-19 infection in emergency departments
34. The ROX index has greater predictive validity than NEWS2 for deterioration in Covid-19
35. Lower mortality of COVID-19 by early recognition and intervention: experience from Jiangsu Province. Ann Intensive Care
36. Distributions of the national early warning score (NEWS) across a healthcare system following a large-scale roll-out
37. Presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with COVID-19 in the New York City Area
38. Risk factors associated with acute respiratory distress syndrome and death in patients with coronavirus disease 2019 pneumonia in Wuhan, China
39. Clinical characteristics of coronavirus disease 2019 in China
40. Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy Region
41. Novel coronavirus infection during the 2019-2020 epidemic: preparing intensive care units-the experience in Sichuan Province, China
42. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations
43. A methodological review of how heterogeneity has been examined in systematic reviews of diagnostic test accuracy
44. Performance of the quick sequential (sepsis-related) organ failure assessment score as a prognostic tool in infected patients outside the intensive care unit: a systematic review and meta-analysis
45. SOFA and mortality endpoints in randomized controlled trials: a systematic review and meta-regression analysis

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2021.699880/full#supplementary-material

Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2021 Zhang, Zhang, Ding, Xuan, Tian, Huang, Zhang, Cui, Huang and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.