key: cord-0740237-xxtnwrj2
authors: Levy, T. J.; Richardson, S.; Coppa, K.; Barnaby, D. P.; McGinn, T.; Becker, L. B.; Davidson, K. W.; Hirsch, J. S.; Zanos, T.; Consortium, Northwell COVID-19 Research
title: Estimating Survival of Hospitalized COVID-19 Patients from Admission Information
date: 2020-04-27
journal: medRxiv : the preprint server for health sciences
DOI: 10.1101/2020.04.22.20075416
sha: d166c3ec40170c1ac88df8e2471934e679b296d2
doc_id: 740237
cord_uid: xxtnwrj2

Background: While clinical characteristics and a range of mortality risk factors of COVID-19 patients have been reported, a practical early clinical survival calculator specialized for this unique cohort of patients has not yet been introduced. Such a tool would provide timely and valuable guidance in clinical care decision-making during this global pandemic.

Methods: Demographic, laboratory, clinical, and treatment data (from 13 acute care facilities at Northwell Health) were extracted from electronic medical records and used to build and test the predictive accuracy of a survival probability calculator, the Northwell COVID-19 Survival (NOCOS) calculator, for hospitalized COVID-19 patients. The NOCOS calculator was constructed using multivariate regression with L1 regularization (LASSO). Model predictive performance was measured using Receiver Operating Characteristic (ROC) curves and the Area Under the Curve (AUC) of the calculators tested.

Results: A total of 5,233 inpatients were included in the study. Patient age, serum blood urea nitrogen (BUN), Emergency Severity Index (ESI), red cell distribution width (RCDW), absolute neutrophil count, serum bicarbonate, and glucose were identified as the optimal early predictors of survival by multivariate LASSO regression. The predictive performance of the NOCOS calculator was assessed for 14 consecutive days.
Conclusions: We present a rapidly developed and deployed estimate of survival probability that outperforms other general risk models. The 7 early predictors of in-hospital survival can help clinicians identify patients with increased probabilities of survival and provide critical decision support.

The World Health Organization designated coronavirus disease 2019 (COVID-19) a global pandemic on March 11, 2020, with over 1 million confirmed worldwide cases [1]. Estimates of severe disease range from 20% to 30% and case fatality rates from 2% to 7% [2, 3]. As healthcare facilities across the world struggle to provide care for increasing numbers of critically ill patients, many countries are reporting or anticipating significant ventilator and equipment shortages [4-6]. The development of evidence-based resource allocation tools and processes will be necessary to ensure that we meet our ethical duty to provide the most benefit for the largest number of people. In cities across the globe, physicians faced with resource limitations are independently deciding which patients to aggressively resuscitate and ventilate and from whom to withhold artificial respiratory support [6, 7].

Aiding healthcare workers with robust predictive survival models ensures more informed decision-making and efficient, just resource allocation while reducing physician stress and burnout. An early, simple, and clinically relevant model to predict survival in hospitalized COVID-19 patients brings objectivity to emotionally fraught decisions and conversations with patients and families. At the time of this study, no multivariate models predicting survival in larger cohorts (>100) of patients with COVID-19 had been published, although reports from China have identified age, Sequential Organ Failure Assessment (SOFA) score, and D-dimer level as potential predictors.
[8] Our objectives were to use parameters available early to clinicians to characterize and predict survival for hospitalized COVID-19 patients within the largest health system in New York.

By including an L1-norm regularization term that promotes sparsity, LASSO regression is well suited for determining the optimal subset of measurements. The magnitudes of the coefficients relate to the predictive values of the normalized measurements, while the coefficients of non-predictive measurements converge exactly to 0. The data are normalized by taking the z-score so that all measurements are sampled from a distribution with a mean of 0 and a standard deviation of 1. The mean and standard deviation of the measurements with nonzero coefficients are stored as model hyperparameters during training and applied to test data. Missing measurements were imputed to the mean. The regularization factor λ is another hyperparameter; it is determined by sweeping λ over a range, evaluating the performance, and choosing the value that gives the best tradeoff between maximizing performance and minimizing the number of predictors. After optimizing for λ, the number of predictors was fixed at 7 inputs. The performance is measured as the area under the Receiver Operating Characteristic (ROC) curve.

To estimate the class-conditional distributions (survived and expired) of the LASSO predictions as Gaussian likelihood functions, the training set is evaluated with the model using leave-one-out cross-validation, which guards against overfitting. The posterior probability that the patient will survive is

$$P(S \mid x) = \frac{\mathcal{N}(x;\, \mu_1, \sigma_1^2)\, P(S)}{\mathcal{N}(x;\, \mu_1, \sigma_1^2)\, P(S) + \mathcal{N}(x;\, \mu_0, \sigma_0^2)\, P(E)}$$

where $x$ is the LASSO prediction, $\mu_1, \sigma_1$ and $\mu_0, \sigma_0$ are the means and standard deviations of the Gaussian likelihoods for the survived and expired classes, and $P(S)$ and $P(E) = 1 - P(S)$ are the class priors.

[...] inputs. Figure 3 shows the performance of fixed NOCOS when tested using the up-to-date values of the seven measurements, with the AUC increasing steadily to values close to 0.91.
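The normalization, mean imputation, and Gaussian posterior described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy measurements, the coefficients, the class means and standard deviations, and the prior are all hypothetical, and the LASSO fit itself is assumed to have been done elsewhere.

```python
import math
from statistics import mean, pstdev


def fit_scaler(train_rows):
    """Return one (mean, std) pair per column, skipping missing values (None).

    Assumes every column has at least two distinct observed values,
    so that the standard deviation is not zero.
    """
    stats = []
    for column in zip(*train_rows):
        observed = [v for v in column if v is not None]
        stats.append((mean(observed), pstdev(observed)))
    return stats


def standardize(row, stats):
    """Z-score a row using the stored training statistics.

    A missing value is imputed to the training mean, which is z = 0.
    """
    return [0.0 if v is None else (v - mu) / sd
            for v, (mu, sd) in zip(row, stats)]


def linear_score(z, coefficients, intercept=0.0):
    """Linear prediction from standardized inputs (coefficients are hypothetical)."""
    return intercept + sum(c * x for c, x in zip(coefficients, z))


def gaussian_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def survival_posterior(score, survived, expired, prior_survived):
    """Bayes' rule with one Gaussian likelihood per class.

    `survived` and `expired` are (mean, std) pairs of the score in each class.
    """
    a = gaussian_pdf(score, *survived) * prior_survived
    b = gaussian_pdf(score, *expired) * (1.0 - prior_survived)
    return a / (a + b)


# Toy training data with three hypothetical measurements per patient.
train = [[60, 20.0, None], [80, 40.0, 2.0], [70, None, 4.0]]
stats = fit_scaler(train)

# New patient with the second measurement missing.
z = standardize([70, None, 5.0], stats)
score = linear_score(z, coefficients=[-0.5, -0.3, 0.2])
p = survival_posterior(score, survived=(1.0, 1.0), expired=(-1.0, 1.0),
                       prior_survived=0.5)
print(f"estimated survival probability: {p:.3f}")
```

Storing the training mean and standard deviation and reusing them on new patients keeps test data from influencing the normalization, which is the reason the text treats them as model hyperparameters.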
In this study, we successfully developed a simple and practical survival calculator for [...] Developed to be parsimonious and easy to use, the predicted survival probability can be used to assist clinical decision-making and ease physician burden in this unprecedented situation. The output of this calculator (which is freely available at https://feinstein.northwell.edu/nocos) is an easily comprehensible probability, which can be communicated to physicians and nurses, families, and other administrative teams.

The variables included in our model, which were selected by the LASSO regularization, all have clinical face validity. It is well established with many diseases, and particularly with COVID-19, that older age confers an increased mortality risk [8]. [...] molecule, leading to impaired red blood cells as well as free radical formation and toxic effects on the lungs [20]. These findings suggest potential therapeutic approaches to reduce sudden decompensation, organ failure, and death of these patients.

A major strength of this work is the development of a powerful predictive model that is typically usable by clinicians within 60 minutes of a patient's initial presentation. Although the calculator performs well with these very early measurements, its predictive performance improves when the measurements are updated throughout the patient's hospitalization (Figure 3), showing that, as expected, the most accurate prediction is given by the most up-to-date values of the seven measures. We also restricted inputs to commonly collected, discrete, and objective data. The calculator's simplicity and reliance on quantitative measurements make it generalizable and easy to deploy to all interested stakeholders, including front-line providers and hospital administrators organizing the distribution of scarce resources.
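The AUC values quoted above summarize how well a score separates survivors from non-survivors. As a rough sketch, the AUC equals the probability that a randomly chosen survivor receives a higher score than a randomly chosen non-survivor, with ties counted as one half. The function name and the toy scores below are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` uses 1 for the positive class (here: survived) and 0 otherwise.
    This is quadratic in the number of patients, which is fine for a sketch.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy example: predicted survival probabilities and observed outcomes.
toy_scores = [0.9, 0.2, 0.8, 0.1]
toy_labels = [1, 1, 0, 0]
print(auc(toy_scores, toy_labels))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so recomputing this quantity on each day's updated measurements gives a curve of the kind the text describes for Figure 3.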
While we present the calculator output as a probability score, a specific operating point can also be chosen to provide a binary outcome prediction with significant accuracy. Choosing an operating point is left up to stakeholders; local clinical teams have the flexibility to adjust thresholds toward a more stringent or more risk-averse solution (Table 2), based on the rapidly changing needs during this pandemic.

Estimates of survival or mortality can be calculated from clinical measurements using anything from simple algorithmic rules and thresholds to linear regression models and more complex machine learning (ML) algorithms. To augment medical decision-making, approaches ranging from modulating single parameters to advanced predictive modeling have been applied to forecast decompensation, mortality, and survival, among other clinical outcomes [21-23]. Early work with small patient cohorts of COVID-19 has led to models that identify some [...] [24, 25]. However, these studies are limited by small numbers of patients and by the inclusion of qualitative and subjective variables, which are prone to mislabeling and are not always readily available. Our approach benefits from a simple, straightforward formula of typical measurements acquired from ED patients; a patient base at least 20-fold larger than previous studies; and data-driven feature selection based on predictive value through the LASSO regularization.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

Due to the challenging situation during the ongoing COVID-19 global health crisis, there is a need for robust tools to aid in complex clinical decision-making.
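The operating-point choice discussed above can be illustrated with a short Python sketch. It turns survival probabilities into binary predictions at a chosen threshold and tallies a confusion matrix of the kind summarized in Table 2. The function names, the thresholds, and the toy data are hypothetical; none of the numbers come from the study.

```python
def confusion_at(threshold, survival_probs, survived):
    """Count outcomes when 'predicted to survive' means prob >= threshold.

    `survived` holds the observed outcomes (1 = survived, 0 = expired).
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for prob, outcome in zip(survival_probs, survived):
        predicted = prob >= threshold
        if predicted and outcome:
            counts["tp"] += 1      # predicted survival, patient survived
        elif predicted and not outcome:
            counts["fp"] += 1      # predicted survival, patient expired
        elif not predicted and outcome:
            counts["fn"] += 1      # predicted death, patient survived
        else:
            counts["tn"] += 1      # predicted death, patient expired
    return counts


def sensitivity_specificity(counts):
    """Fraction of survivors and of non-survivors that were classified correctly."""
    positives = counts["tp"] + counts["fn"]
    negatives = counts["tn"] + counts["fp"]
    sensitivity = counts["tp"] / positives if positives else float("nan")
    specificity = counts["tn"] / negatives if negatives else float("nan")
    return sensitivity, specificity


# Toy data: five patients, evaluated at two hypothetical operating points.
probs = [0.95, 0.80, 0.60, 0.40, 0.20]
outcomes = [1, 1, 0, 1, 0]
for threshold in (0.5, 0.9):
    cm = confusion_at(threshold, probs, outcomes)
    print(threshold, cm, sensitivity_specificity(cm))
```

Raising the threshold trades sensitivity for specificity, which is the adjustment toward a more stringent or more risk-averse setting that the text leaves to local clinical teams.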
Using well-known clinical calculators such as SOFA or CURB-65 shows ostensible promise; however, these calculators are limited both in their accuracy and in the ease of collecting the measurements needed to construct the scores. Input variables such as confusion (for the CURB-65 score) and Glasgow Coma Scale (for the SOFA score) are ambiguous, hard to measure, and frequently unavailable. Similar difficulties are encountered when trying a novel combination of SOFA score with age and D-dimer values [8]. In our study, 78.3% of patients were missing the D-dimer measurement in the emergency department. In contrast, the NOCOS calculator is based on commonly collected laboratory results and a guideline-based ESI triage acuity score. Moreover, the calculator is trained and tested on the patient cohort of interest and can account for the evolving nature of this pandemic through daily or more frequent updates and model retraining [26].

The proposed calculator has some limitations. It was designed to be linear with only essential predictors included, and non-linear or convolutional/recurrent models may provide improved performance. Moreover, the model does not integrate additional, more complex information such as radiology X-ray or CT-scan reads. Due to the retrospective study design, not all laboratory tests (including lactate dehydrogenase, interleukin-6, and serum ferritin) were done on all patients, and the performance of these variables could not be adequately assessed. These data were automatically extracted from the EHR database, and some patient-level details could not be extracted.
However, our NOCOS calculator aimed to leverage easily obtainable data, obviating the need to sift through charts to obtain a predictive result. Given the complexity of data acquisition and model development in the midst of a pandemic, we prioritized the creation and rapid dissemination of a more straightforward, clinically relevant implementation. Although the validation cohort comprised patients admitted to hospitals within the New York metropolitan area, we believe the model will generalize well given the diverse demographic composition of the region and of the Northwell Health patient population.

The severity of the SARS-CoV-2 pandemic has strained hospitals' resources, including space, materials, and front-line healthcare workers, in an unprecedented way. Providers are often forced to make important clinical decisions under immense time pressure and with limited information. Tools that could aid them in these circumstances are timely and important. The Northwell COVID-19 Survival calculator answers a clinical need and provides early information to physicians making a range of difficult-but-critical decisions every day.

Financial Disclosures: The authors report no real or apparent conflicts of interest.

Funding Sources: This work was supported by grants R24AG064191 from the National Institute on Aging and R01LM012836 from the National Library of Medicine of the National [...]

Role of the Funding Sources: The views expressed in this paper are those of the authors and do not represent the views of the National Institutes of Health, the United States Department of Health and Human Services, or any other government entity.
Other declarations: The investigators were independent from the funders; Todd J. Levy and Theodoros P. Zanos had full access to the data and can take responsibility for the integrity of the data and the accuracy of the data analysis; Theodoros P. Zanos affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained. The data that support the findings of this study are available on request from COVID19@northwell.edu. The data are not publicly available due to restrictions as it could compromise the privacy of research participants.
Table 2. Confusion matrices for multiple operating points for the five calculators tested on data from the final day of the testing week, ending April 1

References
1. Coronavirus COVID-19 Global Cases. Center for Systems Science and Engineering
2. Coronavirus Disease 2019 (COVID-19) in Italy
3. Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention
4. There Aren't Enough Ventilators to Cope With the Coronavirus. The New York Times
5. The resilience of the Spanish health system against the COVID-19 pandemic
6. Facing Covid-19 in Italy - Ethics, Logistics, and Therapeutics on the Epidemic's Front Line
7. NYU Langone Tells ER Doctors to 'Think More Critically' About Who Gets Ventilators
8. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study