key: cord-0965199-0bkeqja5 authors: Boero, Enrico; Rovida, Serena; Schreiber, Annia; Berchialla, Paola; Charrier, Lorena; Cravino, Marta Maria; Converso, Marcella; Gollini, Paola; Puppo, Mattia; Gravina, Angela; Fornelli, Giorgia; Labarile, Giulia; Sciacca, Santi; Bove, Tiziana; Karakitsos, Dimitrios; Aprà, Franco; Blaivas, Michael; Vetrugno, Luigi title: The COVID‐19 Worsening Score (COWS)—a predictive bedside tool for critical illness date: 2021-01-24 journal: Echocardiography DOI: 10.1111/echo.14962 sha: 3adb9e3757a7f60fa993888bf9b2af04a54cd62c doc_id: 965199 cord_uid: 0bkeqja5 OBJECTIVES: To evaluate the accuracy of a new COVID‐19 prognostic score based on lung ultrasound (LUS) and previously validated variables in predicting critical illness. METHODS: We conducted a single‐center retrospective cohort development and internal validation study of the COVID‐19 Worsening Score (COWS), based on a combination of the previously validated COVID‐GRAM score (GRAM) variables and LUS. Adult COVID‐19 patients admitted to the emergency department (ED) were enrolled. Ten variables previously identified by GRAM, days from symptom onset, LUS findings, and peripheral oxygen saturation/fraction of inspired oxygen (P/F) ratio were analyzed. LUS score as a single predictor was assessed. We evaluated GRAM model's performance, the impact of adding LUS, and then developed a new model based on the most predictive variables. RESULTS: Among 274 COVID‐19 patients enrolled, 174 developed critical illness. The GRAM score identified 51 patients at high risk of developing critical illness and 132 at low risk. LUS score over 15 (range 0 to 36) was associated with a higher risk ratio of critical illness (RR, 2.05; 95% confidence interval [CI], 1.52‐2.77; area under the curve [AUC], 0.63; 95% CI 0.676‐0.634). 
The newly developed COVID‐19 Worsening Score relies on five variables to classify high‐ and low‐risk patients with an overall accuracy of 80% and negative predictive value of 93% (95% CI, 87%‐98%). Patients scoring more than 0.183 on COWS showed a RR of developing critical illness of 8.07 (95% CI, 4.97‐11.1). CONCLUSIONS: COWS accurately identifies patients who are unlikely to need intensive care unit (ICU) admission, preserving resources for the remaining high‐risk patients.
By the beginning of 2020, a novel disease called COVID-19 was recognized and eventually defined as a pandemic by the WHO. 1 The disease-causing virus, known as SARS-CoV-2, has a high tropism for the lower respiratory tract and can produce an infection with a broad clinical spectrum, ranging from asymptomatic disease to severe acute respiratory failure, often requiring intensive care unit (ICU) admission. 2 Since the beginning of the pandemic, many healthcare facilities have reorganized entire departments, where multidisciplinary teams collaborated to provide care for COVID-19 patients. The worldwide medical community has made a massive effort to better understand the pathophysiology of this disease, in order to provide appropriate care, optimize hospital resources, and increase workflow efficiency. In this context, an easy-to-use standardized scoring system would have greatly helped clinicians with different backgrounds to identify patients at higher risk of developing critical illness. Aiming to support better resource allocation, several prediction models have been developed over the last few months. Vital parameters, comorbidities, and blood test results have been combined to predict disease severity and outcomes for hospitalized COVID-19 patients. 
3-11 Among them, Liang et al developed the COVID-GRAM score, which showed success in the early prediction of critical illness development, defined as admission to the ICU, need for invasive mechanical ventilation (IMV), or death. 4 However, the GRAM score requires ten independent variables, including laboratory results and chest X-ray, as well as an online calculation to risk-stratify patients. Despite its accuracy, its use could be time-consuming, as not all required parameters are readily available in all settings. In fact, during the first pandemic peak, healthcare facilities experienced an unexpected patient influx to the emergency department (ED) and medical wards with an average of 60 to 80 COVID-19 patients per hour. Based on this very early Italian experience, such patient influxes made serial radiological imaging unfeasible. For this reason, a less burdensome and more rapid prognostic score may be of considerable benefit. Several of the above-mentioned prognostic scores integrated radiological data (ie, chest X-ray or CT scan), but no study has yet investigated the performance of lung ultrasound (LUS) as a prognostic tool in COVID-19 patients. LUS is available at the patient's bedside, and its reliability and speed as a tool to evaluate acute respiratory disorders in real time have been well established. 12, 13 Moreover, COVID-19 has a distinctive distribution pattern involving mainly the peripheral and lower regions of the lungs, 14 and presumably this is why LUS demonstrated superior sensitivity to CT scan for pleural and subpleural abnormalities. 15 According to the available literature and contingent need, LUS may play a central role in this pandemic, where the risk of healthcare workers' exposure and patient overflow has been a primary concern. 
We hypothesized that a new prognostic score, integrating previously validated variables and LUS findings instead of chest radiography, could work as well as the GRAM score for the early identification of COVID-19 patients developing critical illness. Hence, we first tested the GRAM score on our cohort and then developed and internally validated the new COVID-19 Worsening Score (COWS).
We conducted a single-center retrospective cohort validation study of the GRAM score and subsequently developed and internally validated a new prognostic score. The study was conducted in an Italian tertiary hospital in Turin (San (protocol #82995). The hospital review board waived patients' consent due to the retrospective nature of the study and anonymous data handling and analysis. Patient demographic characteristics, comorbidities, presenting symptoms and date of their onset, clinical signs, laboratory test results, and sonographic and radiological findings (chest X-ray and/or CT) were collected within 48 hours of ED admission. The arterial oxygen partial pressure to fractional-inspired oxygen (P/F) ratio was also recorded. The adverse outcome referred to as critical illness in the results section was defined by the occurrence of at least one of the fol-
Among the patients' collected data, we selected the ten variables previously identified in the GRAM score. We chose these ten variables due to their ability to predict the severity of respiratory failure and progression to critical illness. 20 Moreover, the P/F ratio on admission and the number of days from symptom onset were included in the analysis. Missing data were sought in other available materials, such as handover records and notes. In patients who underwent a CT scan, we considered the following findings: the number of pulmonary lobes involved, the presence of emphysema, and the percentage of well-aerated lung. These radiological features were predictors of ICU admission or death in COVID-19 in a previous study. 
11 CT scans (obtained by 64 Slice Discovery HD 750 CT Scanner, General Electric) and chest X-rays were analyzed by a radiologist with more than ten years of chest imaging experience who was blinded to patients' outcomes. The LUS protocol adopted for the study comprised six scanning areas per hemithorax, as previously described. 21 Each hemithorax was assessed in one upper and one lower area in each of the three regions divided by the parasternal, anterior axillary, and posterior axillary lines, respectively. The image focus was placed at the level of the pleural line, maintaining the image depth at 8-12 cm. 13 An already validated aeration score was assigned to each area, 22
Continuous variables were reported as mean and standard deviation (SD) or median with interquartile range (IQR) as appropriate, and categorical variables as numbers and percentages. The LUS score was first evaluated at the univariate level as a predictor of the adverse evolution of COVID-19 infection. Restricted cubic splines were modeled to assess the nonlinear effect, and significance was tested by the Wald chi-square. The significance level was set at 0.05. Finally, the LUS score was dichotomized by ROC curve analysis. The COVID-GRAM model was applied to our sample to evaluate its performance in classifying high- and low-risk patients according to the threshold identified by the ROC curve analysis. The COVID-GRAM model with the LUS score added was then evaluated. Aiming to develop a novel and easy-to-use prognostic score, we adopted a selection strategy based on Bayesian model averaging. The number of comorbidities, LUS score, P/F ratio, dyspnea, and duration of symptoms (days) showed a posterior probability of inclusion greater than 30% and were retained in the final logistic regression model labeled as COWS. Thirty percent was chosen as the cutoff through sensitivity analysis to maximize the bootstrapped predictive accuracy of the selected model. 
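The dichotomization of a continuous score by ROC curve analysis, described above, can be illustrated with a minimal sketch. The text does not state which criterion was used to pick the cutoff; the example below assumes Youden's J statistic (sensitivity + specificity - 1), a common choice, and runs on invented toy values rather than study data. The function name `youden_cutoff` is hypothetical.

```python
def youden_cutoff(scores, outcomes):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A patient is classified as high-risk when score > cutoff.
    ASSUMPTION: Youden's J is used as the selection criterion; the
    source only says the score was dichotomized by ROC curve analysis.
    """
    pairs = list(zip(scores, outcomes))
    n_pos = sum(1 for _, y in pairs if y)
    n_neg = len(pairs) - n_pos
    best = None
    # Every observed score value is a candidate cutoff.
    for c in sorted(set(scores)):
        true_pos = sum(1 for s, y in pairs if y and s > c)
        true_neg = sum(1 for s, y in pairs if not y and s <= c)
        sens = true_pos / n_pos
        spec = true_neg / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Toy data, invented for illustration only (1 = critical illness).
toy_scores = [2, 5, 9, 11, 14, 17, 21, 25, 28, 33]
toy_outcomes = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]
cutoff, sens, spec = youden_cutoff(toy_scores, toy_outcomes)
print(cutoff, round(sens, 3), round(spec, 3))
```

Other criteria (for example, the point closest to the top-left corner of the ROC plot, or a cutoff chosen for a target sensitivity) would be applied the same way by changing the quantity that is maximized.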
The performance of the model was assessed in terms of Somers' concordance index Dxy (the closer to 1, the better), Brier score (scores closer to zero indicate a better prediction), and calibration slope. An internal validation to correct measures of predictive performance for optimism (over-fitting) was performed by bootstrapping 500 samples of the data. To improve the prediction, a bootstrap-based shrinkage method was applied to re-estimate the regression coefficients. The overall optimism across all models was estimated by deriving a shrinkage coefficient equal to the average calibration slope from the bootstrap samples. The shrinkage coefficient was applied to the original coefficients to account for over-fitting. Finally, the intercept was re-estimated based on the shrunken coefficients to ensure that the overall calibration was maintained, producing the final model. All analyses were carried out using R 4.0.0. 23
Between February 26 and May 17, 2020, 274 COVID-19 patients were admitted to the wards from the ED (Figure 1). Baseline clinical characteristics are summarized in Table 1. One hundred and seventy-four patients had a final adverse outcome (critical illness), while 100 patients had a favorable outcome (noncritical illness). Complete data for the study analysis, including LUS findings, were available in 143 cases. The mean time between ED admission and outcome was 5.1 days (SD, 5.4; median 3.8; IQR, 1-7).
to discriminate between high- and low-risk patients, we identified 51 patients at high risk and 132 at low risk of developing critical illness (Figure 2). When applied to the 143 patients who were included in the final analysis, no difference in GRAM score performance was found. As an intermediate analysis, we investigated whether combining the dichotomous LUS score with the COVID-GRAM score could improve on the performance of the GRAM score alone in predicting adverse outcome. 
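Two of the performance measures named in the statistical analysis, Somers' Dxy and the Brier score, can be made concrete with a short sketch. This is not the authors' R code: it is a plain-Python illustration on invented toy predictions, and the function names are hypothetical. For a binary outcome, Dxy is the proportion of concordant minus discordant case/non-case pairs, which equals 2 × (AUC - 0.5).

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def somers_dxy(probs, outcomes):
    """Somers' Dxy for a binary outcome.

    Counts every (case, non-case) pair: concordant if the case has the
    higher predicted risk, discordant if lower; ties contribute zero.
    """
    cases = [p for p, y in zip(probs, outcomes) if y == 1]
    controls = [p for p, y in zip(probs, outcomes) if y == 0]
    concordant = discordant = 0
    for pc in cases:
        for pn in controls:
            if pc > pn:
                concordant += 1
            elif pc < pn:
                discordant += 1
    return (concordant - discordant) / (len(cases) * len(controls))


# Toy predictions, invented for illustration only (1 = critical illness).
toy_probs = [0.10, 0.40, 0.35, 0.80]
toy_outcomes = [0, 0, 1, 1]
brier = brier_score(toy_probs, toy_outcomes)
dxy = somers_dxy(toy_probs, toy_outcomes)
print(round(brier, 6), dxy)
```

In an optimism-corrected validation such as the one described, these quantities would be recomputed on each bootstrap resample and on the original data, and the average difference subtracted from the apparent value.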
We named this combined score
FIGURE 2 GRAM score-derived risk groups (on the left) and outcomes (on the right); gray shadows link classification to outcomes, and their width is proportional to the number of patients
By using the Bayesian averaging model, we selected five predictive variables with their relative coefficients as follows: LUS score greater than 15, the number of comorbidities, days from symptom onset, dyspnea at presentation, and P/F ratio (Table 3). COWS ranged from 0 to 1, and the optimal accuracy was iden- Figure 6).
In this study, we developed and validated a new prognostic bedside score for early identification of COVID-19-related critical illness and named it the COVID-19 Worsening Score (COWS). This new score integrated LUS findings and three selected variables of the previously validated COVID-GRAM score. Since COWS does not require laboratory or radiological results, it enables rapid stratification of patients upon ED arrival. This aspect is critical when considering the large COVID-19 patient influxes seen worldwide, which occasionally necessitated opening outdoor tent areas and screening patients in parking lots. The overall accuracy of COWS is 80%, which is equal to that of the GRAM score. However, with a negative predictive value of 93%, COWS discriminates low-risk patients better than the GRAM score and may thus help in reducing inappropriate ICU admissions and optimizing hospital resources. Moreover, the ability to anticipate clinical worsening could provide benefits to patients, such as shortening the time spent on spontaneous breathing or on NIV, to prevent patient self-inflicted lung injury (P-SILI). 24 This product had better predictive performance than single biomarkers, as shown by an internal validation study. 3 Several radiological scoring systems were also implemented to assess the severity of the disease and predict patients' outcomes. 
A chest X-ray (CXR) scoring system on an 18-point scale, known as the Brixia score, was proposed to quantify and monitor the severity of lung abnormalities. 5 The Brixia score, when combined with the patient's age and the presence of immunosuppression, was shown to predict in-hospital mortality. 6 In a retro- results showed that in established COVID-19 cases, the higher the LUS score, the greater was the risk of developing critical illness. We found that a LUS score higher than 15 helps discriminate between favorable and adverse outcomes in our cohort of patients. This result is consistent with previous findings reported by Soummer et al. 28 Thus, a score based on sonographic (ie, anatomical), functional, and clinical clues may be the most reliable means to provide a quick evaluation of the patient from complementary points of view. COWS is based on LUS, P/F ratio, dyspnea, number of comorbidities, and days from symptom onset. For this reason, it acts as both a quick bedside tool and a screening test with a high negative predictive value. These two features suggest its usefulness for the rapid evaluation of multiple patients presenting to the ED, avoiding inappropriate resource use on low-risk patients and saving costly resources for the smaller number of high-risk patients. To this extent, the use of COWS may help increase appropriateness in the deployment of radiological resources, ventilatory equipment, and ICU admissions. Finally, one of the advantages of COWS compared to the GRAM score may also be its quick repeatability over time. In the likely event of a second-wave massive inflow of patients overwhelming hospital resources, patients may be ranked according to their calculated predicted risk to support decisions on resource allocation. In particular, stratifying patients by means of COWS may help set the appropriate monitoring level and aid in the difficult process of applying reverse triage criteria for ICU access in extreme conditions. 
29 In the context of a long-lasting epidemic, where a hub-and-spoke model of COVID-19 hospitals might be used, COWS may speed up the selection of low-risk patients who may be safely transferred to spokes, keeping high-risk patients in the hub center.
This study has several limitations. Firstly, it is a retrospective single-center study and the sample size was relatively limited, as complete data were available for 143 patients. Moreover, although the assessment of internal validity suggests potential usefulness of our newly developed score in clinical practice, external validation is needed to enhance the generalizability of our findings. A larger multicenter, prospective study would also be advisable to obtain a greater sample size. Secondly, despite the ability of COWS to identify low-risk patients, recognition of high-risk patients remains suboptimal, and further adjustment should be applied. Thirdly, we used the GRAM variable selection to build our model, instead of starting from all the possible variables collected in our patients. However, this approach is reasonable, as the selected variables had a plausible clinical relation with the outcome. Fourthly, the P/F ratio may have been calculated at very different FiO2 values and with different levels of PEEP (from ZEEP to even 10 cmH2O). Finally, we tried to assess whether a thoracic CT scan might be combined with COWS as a second-level examination in selected patients to improve the overall accuracy, but we did not find promising results for this purpose, possibly due to the limited number of observations. Of note, the cross-sectional area of fat tissue at T7-T8 vertebral height, assessed in Colombi et al, 11 was not measured in our study due to CT software limitations.
The COVID-19 pandemic has severely challenged hospitals' capacity to provide intensive levels of care. 
After validating the COVID-GRAM score in our population, we identified a simplified version of the score by integrating LUS findings with functional and selected clinical data. COWS is a quick, easy-to-calculate bedside score. It can accurately identify patients who are unlikely to deteriorate or need ICU admission, sparing resources for the minority of COVID-19 patients at high risk of developing critical illness.
We thank Dr Sergio Livigni for his wise guidance as a supervisor of critical care practice in our institution and for easing the process of data collection and knowledge sharing. We thank Dr Savino Sciascia for his interest in clinical research and for his help with text revision and IRB approval.
The data that support the findings of this study are available from San Giovanni Bosco Hospital, Turin. Restrictions apply to the availability of these data, which were used under license for this study. Data are available from the authors with the permission of San
COVID-19) outbreak - WHO announces COVID-19 outbreak a pandemic
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Risk factors associated with acute respiratory distress syndrome and death in patients with coronavirus disease 2019 pneumonia in Wuhan, China
Development and validation of a clinical risk score to predict the occurrence of critical illness in hospitalized patients With COVID-19
COVID-19 outbreak in Italy: experimental chest X-ray scoring system for quantifying and monitoring disease progression
Chest X-ray severity index as a predictor of in-hospital mortality in coronavirus disease 2019: a study of 302 patients from Italy
Chest X-ray has poor diagnostic accuracy and prognostic significance in COVID-19: a propensity matched database study
The limited sensitivity of chest computed tomography relative to reverse transcription polymerase chain reaction for severe acute respiratory syndrome coronavirus-2 infection
Association of radiologic findings with mortality of patients infected with 2019 novel coronavirus in Wuhan
The clinical and chest CT features associated with severe and critical COVID-19 Pneumonia
Well-aerated Lung on Admitting Chest CT to Predict Adverse Outcome in COVID-19 Pneumonia
The comet-tail artefact: an ultrasound sign of alveolar-interstitial syndrome
International evidence-based recommendations for point-of-care lung ultrasound
Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: a descriptive study
Lung ultrasonography versus chest CT in COVID-19 pneumonia: a two-centered retrospective comparison study from China
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration
Diagnosis and treatment of adults with community-acquired pneumonia
Clinical Findings in 111 Cases of Influenza A (H7N9) Virus Infection
Severe SARS-CoV-2 infections: practical considerations and management strategy for intensivists
Ultrasound for "lung monitoring" of ventilated patients
Lung ultrasound for diagnosis and monitoring of ventilator-associated pneumonia
Development Core Team T. A Language and Environment for Statistical Computing
Mechanical ventilation to minimize progression of lung injury in acute respiratory failure
Management of COVID-19 respiratory distress
Chest CT for early detection and management of coronavirus disease (COVID-19): a report of 314 patients admitted to emergency department with suspected pneumonia
Correlation of Chest CT and RT-PCR Testing for Coronavirus Disease 2019 (COVID-19) in China: a report of 1014 Cases
Ultrasound assessment of lung aeration loss during a successful weaning trial predicts postextubation distress*
Clinical Ethics Recommendations for the Allocation of Intensive Care Treatments in exceptional, resource-limited circumstances -Version n. 1 Posted on March