key: cord-1013141-a1ugw861
authors: Woo, S. H.; Rios-Diaz, A. J.; Kubey, A. A.; Cheney-Peters, D. R.; Ackermann, L. L.; Chalikonda, D. M.; Venkataraman, C. M.; Riley, J. M.; Baram, M.
title: Development and Validation of a Web-Based Severe COVID-19 Risk Prediction Model
date: 2020-07-18
journal: nan
DOI: 10.1101/2020.07.16.20155739
sha: 3b909e7681826cb51d2d9e6e85855fde1533cace
doc_id: 1013141
cord_uid: a1ugw861

Background: Coronavirus disease 2019 (COVID-19) carries high morbidity and mortality globally. Identification of patients at risk for clinical deterioration upon presentation would aid in triaging, prognostication, and allocation of resources and experimental treatments.

Research Question: Can we develop and validate a web-based risk prediction model for identification of patients who may develop severe COVID-19, defined as intensive care unit (ICU) admission, mechanical ventilation, and/or death?

Methods: This retrospective cohort study reviewed 415 patients admitted to a large urban academic medical center and community hospitals. Covariates included demographic, clinical, and laboratory data. The independent association of predictors with severe COVID-19 was determined using multivariable logistic regression. A derivation cohort (n=311, 75%) was used to develop the prediction models. The models were tested in a validation cohort (n=104, 25%).

Results: The median age was 66 years (interquartile range [IQR] 54-77) and the majority were male (55%) and non-White (65.8%). The 14-day severe COVID-19 rate was 39.3%; 31.6% required ICU admission, 24.6% required mechanical ventilation, and 21.2% died. Machine learning algorithms and clinical judgment were used to improve model performance and clinical utility, resulting in the selection of eight predictors: age, sex, dyspnea, diabetes mellitus, troponin, C-reactive protein, D-dimer, and aspartate aminotransferase.
The discriminative ability was excellent for both the severe COVID-19 (training area under the curve [AUC]=0.82, validation AUC=0.82) and mortality (training AUC=0.85, validation AUC=0.81) models. These models were incorporated into a mobile-friendly website.

Interpretation: This web-based risk prediction model can be used at the bedside for prediction of severe COVID-19 using data mostly available at the time of presentation.

COVID-19 is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1]. The disease has spread rapidly throughout the world, with more than 6 million confirmed cases and 372,035 deaths as of June 1st, 2020 [2]. Deaths in the United States (U.S.) exceed 100,000 [2, 3]. COVID-19 is associated with a high fatality rate, roughly 6% worldwide with variation by country [2, 4-6]. The severity of COVID-19 illness varies from asymptomatic to severe disease that requires ICU admission [7-10]. Patients with severe disease who are admitted to the ICU and require mechanical ventilation experience the highest mortality, reported as high as 53.4% [9, 11, 12]. Part of COVID-19's complexity is its variable time course and severity (e-Figure 1). Therefore, early identification of patients at risk for progression to severe COVID-19 is paramount for accurate triage, determining appropriate diagnostic and treatment approaches, and resource allocation.

The progressing health crisis precipitated by this pandemic has been exacerbated by the lack of data on key clinical factors associated with severe presentation of disease. Emerging studies have reported several independent risk factors for the development of severe adverse outcomes among patients with COVID-19 [13, 14]. In the U.S., available COVID-19 studies are descriptive in nature [9, 15, 16]. A large study retrospectively analyzed 5,700 hospitalized patients from 12 hospitals in the New York City area. Comorbidities such as hypertension (57%), obesity

All rights reserved.
No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted July 18, 2020. https://doi.org/10.1101/2020.07.16.20155739 doi: medRxiv preprint

This retrospective cohort study included adult patients (>18 years) admitted to a large academic medical center and community hospitals with a diagnosis of COVID-19, defined as a positive SARS-CoV-2 polymerase chain reaction (PCR) test. Patients were admitted between March 1st and April 30th, 2020. The study exclusion criteria were transfer from an outside hospital with no access to the data, admission to labor and delivery, and admission for a primary surgical or trauma reason. Patients were grouped according to the severity of COVID-19. Non-severe COVID-19 was defined as requiring hospitalization but not meeting the definition of severe COVID-19 (intensive care unit (ICU) admission, mechanical ventilation, and/or death).

Demographics, clinical information, laboratory findings at the time of initial presentation, as well as outcome data were obtained through review of the electronic health record (EHR, Epic
difficulty breathing, shortness of breath, or caretaker reporting they appear as such], white blood cell count (WBC), absolute lymphocyte count, hematocrit, platelet count, serum sodium, serum creatinine, C-reactive protein (CRP; 0-5 mg/dL, 5.1-15 mg/dL, >15 mg/dL), elevated troponin (HsTroponin T >19 ng/L; Troponin I ≥0.04 ng/mL), creatine kinase (CK), ferritin (0-300 ng/mL, 300.1-1000 ng/mL, >1000 ng/mL), lactate, lactate dehydrogenase (LDH), aspartate aminotransferase (AST; 0-80 IU/L, >80 IU/L), D-dimer (ng/mL), and acute kidney injury (defined as an increase in creatinine ≥0.3 mg/dL from baseline) upon admission to the hospital. The extracted data points were verified by two independent reviewers, with consensus reached by a third independent reviewer if disagreement was noted. Information on race was limited to data available within the EHR system, which classifies Hispanic as a distinct race category rather than under a separate category for ethnicity as defined by the U.S. Census Bureau [21].

The primary outcome was a composite measure defined by ICU admission(s), use of mechanical ventilation, and/or death within 14 days of hospitalization (referred to as "severe COVID-19"). These outcomes were also assessed independently. The secondary outcome was death within 14 days of hospital admission. When patients were discharged before the end of the 14-day follow-up period, outcome assessment was carried through readmissions if records were available.

Counts and frequencies were used to report categorical data.
Means with standard deviation (SD) and medians with interquartile ranges (IQR) were used to report normally and non-normally distributed data, respectively. Chi-squared tests were used for comparison of categorical data. T-tests and Wilcoxon rank sum tests were used to compare continuous data as appropriate. The eligible sample (n=415) was randomly split into a derivation group (75%; n=311) and a validation group (25%; n=104). The prediction risk model was created using the derivation sample. Covariates considered for inclusion in the model were identified a priori, without knowledge of the outcome data, based on clinical judgement and potential confounders identified in the literature [22]. The discriminative ability and performance of the model were assessed by calculating the AUC. The Hosmer-Lemeshow test was used to assess goodness-of-fit. Statistical significance was set a priori at p<0.05. Python (version 3.6.6), Statsmodels (version 0.9.0, for regression), and RStudio (version 1.1.463) were used for statistical analysis. The final model's predicted probability equation was incorporated into a mobile-friendly web-based application (www.covidmodel.org) developed using the Scikit-Learn module.

Despite institutional COVID-19 treatment guidelines, not all patients had uniform laboratory testing done at the time of presentation, resulting in some missing data. We classified these missing laboratory values as separate categories and tested their association with the primary outcome.
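The handling of missing laboratory values described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the `missing` label, and the toy records are all hypothetical, and only the CRP cut-points (0-5, 5.1-15, >15 mg/dL) are taken from the text. It bins each CRP value, keeps unmeasured values as their own level, and reports the severe COVID-19 rate per level.

```python
from collections import defaultdict

# Hypothetical label for labs that were not drawn at presentation.
MISSING = "missing"


def crp_category(crp_mg_dl):
    """Map a CRP value (mg/dL) to the study's bins; None stays a separate level."""
    if crp_mg_dl is None:
        return MISSING
    if crp_mg_dl <= 5:
        return "0-5"
    if crp_mg_dl <= 15:
        return "5.1-15"
    return ">15"


def severe_rate_by_category(records):
    """Return {category: fraction with severe COVID-19} for (crp, severe) pairs."""
    counts = defaultdict(lambda: [0, 0])  # category -> [severe count, total count]
    for crp, severe in records:
        cell = counts[crp_category(crp)]
        cell[0] += int(severe)
        cell[1] += 1
    return {cat: severe / total for cat, (severe, total) in counts.items()}


# Synthetic (crp, severe) pairs for illustration only; these are not study data.
toy_records = [
    (2.0, False), (4.5, False), (3.1, True), (1.0, False),
    (8.0, True), (12.0, False),
    (20.0, True), (17.5, True),
    (None, False), (None, False), (None, True), (None, False),
]

rates = severe_rate_by_category(toy_records)
for category in ("0-5", "5.1-15", ">15", MISSING):
    print(f"{category:>7}: {rates[category]:.2f}")
```

Keeping `missing` as its own level makes it possible to compare its outcome rate against the measured bins before deciding how to treat it in the regression.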
Patients with missing CRP and AST values had the lowest rates of severe COVID-19 and behaved similarly to patients with laboratory values within the normal reference range. Therefore, these patients were combined into the same category as patients within the normal reference range and kept in their respective derivation or validation cohorts.

The Thomas Jefferson University Institutional Review Board approved this study and waived informed consent from study participants.

A total of 415 patients with COVID-19 were included, of whom 164 (39%) developed severe COVID-19. The median age was 66 years (IQR 54-77). The majority were male (55%), and the largest racial group was Black (44%), followed by White (34%), Asian (13%), and Hispanic (8%) (which our EHR categorized under race). Table 1 shows univariate analyses of patient demographic and clinical characteristics. Patients who developed severe COVID-19 were more likely to be older (age 70.6 vs 61.4; p<0.001), male (66.3% vs 48.0%; p<0.001), and present with dyspnea (75.5% vs 61.5%; p=0.004). They were also more likely to have a past medical history of diabetes (44.8% vs 28.6%; p=0.001), coronary artery disease (23.3% vs 14.7%; p=0.04), and/or prior stroke (20.2% vs 11.9%; p=0.03). Analyses of laboratory data upon presentation showed that severe COVID-19 was associated with higher levels of CRP, AST, and D-dimer (Figure 1-A). Notably, a CRP of 0-5 mg/dL was associated with a 27.5% rate of severe COVID-19 compared with 40.9% in those with a CRP of 5.1-15 mg/dL and 72.5% in those with a CRP >15 mg/dL.
A D-dimer of 0-300 ng/mL was associated with a 26.7% rate of severe COVID-19 compared with 38.9% in those with a D-dimer of 300.1-1000 ng/mL and 68.8% in those with a D-dimer >1000 ng/mL. Table 3).

The percentage of patients developing the primary outcome of severe COVID-19 was 39.3% (n=163). Analysis of individual outcomes showed that 31.6% (n=131) of patients were admitted to the ICU, 24.6% (n=102) required mechanical ventilation, and 21.2% (n=88) died within 14 days of admission. Figure 2 shows the distribution of hospital days in which patients first required mechanical ventilation. Mean time to mechanical ventilation was 3.73 hospital days (SD 2.7). After the first two days, 55.9% (n=57) required mechanical ventilation (Figure 2).

Multivariable analyses on the training cohort demonstrated that increased age, male sex, diabetes, dyspnea, CRP >15 mg/dL, AST >80 IU/L, and D-dimer >1000 ng/mL were independently associated with severe COVID-19 ( (Figure 3). The probability of the occurrence of severe COVID-19 and COVID-19-related death can be calculated using the predicted probability equation, which is based on the model's intercept

COVID-19-associated morbidity, mortality, and stress on the healthcare system are expected to continue.
To address the ongoing pandemic, reliable and easy-to-use models of deterioration to severe COVID-19 will better allow clinicians and health systems to make evidence-based patient care decisions. This study developed and internally validated a web-based model to predict 14-day risk of progression to severe COVID-19 using a cohort of 415 diverse U.S. patients hospitalized at a large academic medical center and four community hospitals. Compared to other large case studies in the U.S., patients in our study have similar demographics, clinical characteristics, and mortality rates [9, 27].

Several studies from China, Italy, and the U.S. have identified characteristics of patients who have poor outcomes from COVID-19 based on retrospective analyses [7, 9, 28-30]. Similar to Zhou et al., we identified an increased risk of severe COVID-19 with older age [13]. An analysis of 1,150 adults in New York City hospitals indicated that chronic pulmonary disease, followed by cardiovascular disease, older age, higher concentrations of interleukin-6, and D-dimer at admission were the strongest predictors of mortality, with Black and Hispanic patients presenting later in the disease course compared to White patients [31]. These findings are similar to the clinical predictors used in our study, in which an elevated CRP and D-dimer are suggestive of a profound inflammatory state. In agreement with data from a multicenter observational study, we found increased odds of death with male sex and history of diabetes.
[32] Similarly, positive troponin was found to be associated with increased odds of death (OR=2.21, p=0.02), which is also consistent with prior studies [17, 22, 33]. For this reason, and for practical purposes, troponin was included in both the severe COVID-19 and mortality models.

This study has several strengths. First, the study participants represent a racially diverse (self-reported) group of patients residing in three states. Second, this population represents a cohort of inpatients that reflects common U.S. comorbid conditions, which will allow wide applicability for clinical use in the U.S. and other countries with similar demographics and comorbidity types and rates. Third, this study is one of the first prediction models in the U.S. Fourth, the data used for model development are derived from multiple hospital types, including a large academic center and several community hospitals, thereby reducing the likelihood that observed outcomes are confounded by a single institution's unique treatment approach. Last, the model is available for real-time clinical use via a web-based calculator.

This study is not without limitations. First, the data were extracted retrospectively, relied on EHR provider documentation, and were limited to variables contained in the EHR. To help ensure data validity, the variables and outcomes of interest were extracted by physician-investigators and validated by an independent researcher.
Second, the sample size is relatively small compared to larger studies from China. These models had excellent performance during the internal validation process; therefore, we chose to prioritize dissemination given the urgent need for prediction models tailored specifically to the U.S. to care for patients suffering from COVID-19. Third, given the rapidly changing "standard of care" for COVID-19 and institutional efforts to educate clinicians in near real-time, there was likely significant practice variation both within each hospital and between hospitals between March 1, 2020 and April 30, 2020 that might affect outcomes. Nonetheless, we have provided a mobile-friendly model for prediction of severe COVID-19 upon presentation.

In conclusion, this study presents an internally validated prediction model for progression to severe COVID-19 and mortality in hospitalized patients that can be used in real-time at the bedside.

Adjusted odds ratio of 14-day mortality predictors in multivariable logistic regression.

All patients (n=415)

Initial white blood cell count (×10^9/L)   6.6   9.1   <0.001
Initial hemoglobin (g/dL)   12.7   12.9   0.34
C-reactive protein (mg/dL) (reference: 0-5)

A pneumonia outbreak associated with a new coronavirus of probable bat origin
COVID-19 United States Cases by County - Johns Hopkins Coronavirus Resource Center. Johns Hopkins Coronavirus Resource Center
COVID-19 Map - Johns Hopkins Coronavirus Resource Center. Johns Hopkins Coronavirus Resource Center
The Novel Coronavirus Pneumonia Emergency Response Epidemiology Team. The Epidemiological Characteristics of an Outbreak of 2019 Novel Coronavirus Diseases (COVID-19) - China
Clinical Characteristics of 138 Hospitalized Patients With 2019 Novel Coronavirus-Infected Pneumonia in Wuhan, China
Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China
Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the
Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention
Risk Factors Associated With Mortality Among Patients With COVID-19 in Intensive Care Units in
Factors associated with death in critically ill patients with coronavirus disease 2019 in the US
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study
COVID-19 in an Integrated Health Care System in California
Characteristics and Outcomes of 21 Critically Ill Patients With COVID-19 in Washington State
Cardiovascular Disease, Drug Therapy, and Mortality in Covid-19
Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal
Triage of Scarce Critical Care Resources in COVID-19 An Implementation Guide for Regional Allocation: An Expert Panel Report of the Task Force for Mass Critical Care and the
Limited ability of SOFA and MOD scores to discriminate outcome: a prospective evaluation in 1,436 patients
Association of Cardiac Injury With Mortality in Hospitalized Patients With COVID-19 in
Obesity and infection
The effect of age on the development and outcome of adult sepsis
Sepsis in diabetes: A bad duo
Scikit-Learn: Machine learning in python
Characteristics and Clinical Outcomes of Adult Patients Hospitalized with COVID-19 - Georgia
Baseline Characteristics and Outcomes of 1591 Patients Infected With SARS-CoV-2 Admitted to ICUs of the Lombardy Region
Compliance with triage to intensive care recommendations