key: cord-0946778-l2zvvgjf authors: Myers, Laura C.; Kipnis, Patricia; O’Suilleabhain, Liam; Escobar, Gabriel; Liu, Vincent X. title: Performance of Predictive Models for 30-Day Hospitalization and Mortality after COVID-19 Infection date: 2022-01-01 journal: Annals of the American Thoracic Society DOI: 10.1513/annalsats.202103-267rl sha: af925f128f6fcc3ab6ff856e2658411a2a38baca doc_id: 946778 cord_uid: l2zvvgjf

Identifying which patients have the highest risk of hospitalization for coronavirus disease (COVID-19) can inform personalized recommendations about exposure risk, proactive screening and treatments at the health system level, and vaccine distribution across a population. Previously developed COVID-19 risk tools estimate mortality of those admitted to the hospital (1). We developed and implemented a real-time COVID-19 risk score (CRS) predicting 30-day nonelective hospitalization to inform elective surgery workflows (2), which we subsequently updated to incorporate newer data. Similar health system scores, including the Veterans Health Administration COVID-19 (VACO) index, have been developed to predict 30-day mortality across a population (3). Here, we report performance of CRS for predicting both 30-day nonelective hospitalization and mortality and compare the performance of CRS with the VACO index for these outcomes. These analyses were uniquely feasible given our position in an integrated health system (Kaiser Permanente Northern California [KPNC]) with population-level data. We identified patients' first positive severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) polymerase chain reaction result between February 1, 2020 and September 30, 2020 in adult members (aged >18 yr) at KPNC, which serves 4.5 million members across 21 medical centers. We developed and internally validated CRS using a split 75% train/25% test dataset. We performed multivariable logistic regression using age, sex, and comorbidity point score version 2 (COPS2) (4).
COPS2 is a weighted score of 75 hierarchical condition categories (5) over the 1 year preceding the first positive test. COPS2 is easily generated as a sum and has been externally validated in Canada with better discrimination than Elixhauser or Charlson (6). Because of low turnover in KPNC, more than 90% of patients had data for 12 months prior to COVID-19 diagnosis, indicating that there were few missing data. We used a truncated power function to spline at 3 knots for age and COPS2 (7). The final model included age, age^2, age^3, cubic spline of age, sex, COPS2, COPS2^2, COPS2^3, and cubic spline of COPS2. Our outcome was 30-day nonelective hospitalization, defined as hospitalizations originating in the emergency department. By regional policy, patients seen in clinic and sent to the hospital must pass through the emergency room for stabilization and triage, so there are no direct admits from clinic. We chose this outcome because it is a more proximal outcome than 30-day mortality and potentially more intervenable. Each patient could contribute one hospitalization in the dataset, which was the hospitalization closest to the first positive test. We also examined CRS performance for the 30-day mortality outcome. We calculated the VACO index on the same cohort using the published variable specifications and coefficients for age, sex, history of myocardial infarction/peripheral vascular disease, and interaction terms for age and Charlson comorbidity (3). We report concordance statistics (c-statistics) of CRS and the VACO index for predicting 30-day nonelective hospitalization and 30-day mortality with bootstrapped 95% confidence intervals (CIs) and sensitivity and specificity at a positive predictive value threshold near 20%. We also calculated the net reclassification index for CRS relative to the VACO index for both outcomes to assess comparative performance (8).
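The discrimination and reclassification measures described above can be illustrated with a minimal Python sketch. It is not the SAS code used in this study: the function names, the percentile bootstrap, the tie handling, and the toy data are all assumptions made for illustration, and the two-category net reclassification components are computed at thresholds supplied by the caller rather than at the event-rate-based thresholds of reference (8).

```python
import random


def c_statistic(y, p):
    """Concordance statistic: share of event/non-event pairs in which the
    event has the higher predicted risk (ties count as one half)."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in nonevents:
            score += 1.0 if e > n else 0.5 if e == n else 0.0
    return score / (len(events) * len(nonevents))


def bootstrap_ci(y, p, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the c-statistic. Patients are resampled
    with replacement; resamples lacking either class are redrawn."""
    rng = random.Random(seed)
    n = len(y)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if 0 < sum(yb) < n:
            stats.append(c_statistic(yb, [p[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def two_category_nri(y, p_new, p_old, thr_new, thr_old):
    """Return (event NRI, non-event NRI) for a high/low risk split.
    Event NRI: net share of events moved up into the high-risk category.
    Non-event NRI: net share of non-events moved down into the low-risk one."""
    n_events = sum(y)
    n_nonevents = len(y) - n_events
    net_up_events = 0
    net_down_nonevents = 0
    for yi, new, old in zip(y, p_new, p_old):
        move = int(new >= thr_new) - int(old >= thr_old)  # +1 up, -1 down
        if yi == 1:
            net_up_events += move
        else:
            net_down_nonevents -= move
    return net_up_events / n_events, net_down_nonevents / n_nonevents


# Invented toy data (not study data): 3 events followed by 3 non-events.
y_toy = [1, 1, 1, 0, 0, 0]
p_new_toy = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]
p_old_toy = [0.9, 0.4, 0.3, 0.6, 0.2, 0.1]

print(c_statistic(y_toy, p_new_toy))
print(bootstrap_ci(y_toy, p_new_toy, n_boot=200))
print(two_category_nri(y_toy, p_new_toy, p_old_toy, 0.5, 0.5))
```

Passing a separate threshold for each score reflects that two models on different risk scales would generally reach a positive predictive value near 20% at different cut points. The pairwise loop is quadratic in cohort size, so a rank-based formulation would be preferable for a cohort of tens of thousands of patients.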
We show calibration plots for the outcome for which the model was originally derived (hospitalization for CRS and mortality for VACO) along with calibration statistics (calibration-in-the-large and calibration slope) and 95% CIs. We performed calibration on the validation cohort for CRS and the full cohort for VACO. The KPNC Institutional Review Board approved the project with a waiver of informed consent. We used SAS 9.4. From more than 3.2 million patients, we examined 36,137 patients with positive tests, of whom 3,397 (9.4%) had nonelective hospitalization and 361 (1.4%) died within 30 days of the test. Most tests (34,435, 95.3%) were performed in outpatient (nonemergency department) settings. Mean age was 43.4 ± 16.3 years (Table 1). The c-statistics for CRS were 0.82 (95% CI, 0.81-0.83) for nonelective hospitalization and 0.93 (95% CI, 0.92-0.94) for 30-day mortality in the test set. The c-statistics for the VACO index were 0.78 (95% CI, 0.77-0.79) for nonelective hospitalization and 0.93 (95% CI, 0.92-0.94) for 30-day mortality. At a positive predictive value threshold of 20%, CRS had higher sensitivity but lower specificity for hospitalization compared with the VACO index, resulting in net reclassification of 8.0% more hospitalizations in the higher risk category and 3.5% fewer nonhospitalizations in the lower risk category compared with the results from the VACO index (Table 2). For 30-day mortality, CRS had lower sensitivity but slightly higher specificity compared with the VACO index, resulting in net reclassification of 11.0% fewer deaths in the higher risk category and 0.7% more nondeaths in the lower risk category. The calibration plots reveal that CRS is well calibrated for patients at low risk of hospitalization (0-15% risk), which is the vast majority of the cohort (88%) (Figure 1A). In contrast, VACO overestimates risk across the full range of risk (Figure 1B).
The calibration-in-the-large was 0.17 (95% CI, 0.03-0.31) for nonelective hospitalization for CRS on the validation cohort and -0.39 (95% CI, -0.56 to -0.22) for mortality for VACO on the full cohort. The calibration slope was 1.04 (95% CI, 0.97-1.10) and 1.24 (95% CI, 1.16-1.32) for CRS and VACO, respectively.

Supported by the Permanente Medical Group, Inc., and Kaiser Foundation Hospitals, Inc. The funders were not involved in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

Author Contributions: L.C.M. interpreted the results and drafted the manuscript. P.K. and L.O'S. performed the analysis. G.E. coordinated the data extraction and interpreted the results. V.X.L. conceived of the idea and oversaw the analysis. All authors critically reviewed the manuscript. P.K. and V.X.L. had full access to the data and take responsibility for the accuracy of the analysis.

We evaluated the performance of models for two patient-centered outcomes following COVID-19 infection using data from a large, integrated health system. We found that CRS model discrimination was very good for nonelective hospitalization, the outcome for which it was developed (c-statistic, 0.82). For 30-day mortality, CRS was comparable to the VACO index (both 0.93). CRS showed good calibration for patients with low risk of hospitalization, which was the majority of the cohort (88%). Given this result, the model has been used in KPNC to target patients for outpatient procedures who were at low risk of hospitalization due to COVID-19 during the initial reopening of operating rooms. External validation of CRS is necessary. VACO overestimated risk across the range of risk in this external validation, which is consistent with its previous validation at the Veterans Administration (3).
Because the veteran population is older, predominantly male, with higher chronic disease burden (9), external validation of VACO is important for generalizability. KPNC members are diverse and tend to reflect the general population (10). In our external validation, VACO index discrimination was better (0.93) than what was reported in the two internal validations (0.81-0.84) (3). This finding could be due to a more complete dataset at KPNC or a lower outcome rate (1.4% mortality in the KPNC cohort vs. 6.8% across the two veterans' validation cohorts). Regardless, the VACO index showed strong potential for generalizability in an independent cohort despite differences in the composition of the veteran and KPNC cohorts. Both models focus on a selected set of variables known to account for key drivers of adverse outcomes in COVID-19 that can be easily identified using electronic health record data. Both models have already been implemented in real-time systems or online calculators to facilitate rapid dissemination and everyday use. For example, CRS has been deployed at KPNC to assist with diverse health system workflows for patients with COVID-19; the VACO index is a publicly available tool on MDCalc. We did not include race in CRS because the model could be used for resource allocation. Our group has previously shown that severity of illness scores used during crisis standard of care could prioritize white patients over Black patients because they overestimate risk for Black patients (11). Therefore, we did not want to create a model that prioritized one race to receive resources, which might propagate health disparities. This is a complex, important, and emerging area. Further work is necessary to demonstrate equity of this model across racial subgroups. This study fills important gaps.
A previous model by Jehi and colleagues predicting 30-day hospitalization in COVID-19 infection required 10 input variables, including smoking status, symptoms, and chronic medications (12). Our model is much simpler and had similar performance (0.82 in this study compared with 0.81) (12). While numerous models have been developed to predict COVID-19 mortality, models calibrated for hospitalization offer higher potential for health system decision-making, which can include proactive screening and outreach, recommendations for specific treatments, and patient education and counseling. With both scores, the use of data through late 2020 also increases confidence that these models show good discrimination over multiple waves of infection. There are several limitations. Patients may have had positive tests outside our health system that were not counted. Similarly, deaths that occurred outside the hospital may not have been reported to KPNC at the time of analysis. The strengths of this study include its large, population-based sample with minimal missing data spanning all waves of the pandemic. Few data sources contain the longitudinal data necessary to evaluate these models on the population level.

1. COVID-19@Spain and COVID@HULP Study Groups. Development and validation of a prediction model for 30-day mortality in hospitalised patients with COVID-19: the COVID-19 SEIMC score.
2. COVID-19 and long-term planning for procedure-based specialties during extended mitigation and suppression strategies.
3. Development and validation of a 30-day mortality index based on preexisting medical administrative data from 13,323 COVID-19 patients: The Veterans Health Administration COVID-19 (VACO) Index.
4. Risk-adjusting hospital mortality using a comprehensive electronic record in an integrated health care delivery system.
5. Healthcare Cost and Utilization Project.
6. The Kaiser Permanente inpatient risk adjustment methodology was valid in an external patient population.
7. The elements of statistical learning.
8. Net reclassification index at event rate: properties and relationships.
9. Health and health behavior differences: U.S. military, veteran, and civilian men.
10. Similarity of adult Kaiser Permanente members to the adult population in Kaiser Permanente's Northern California service area: comparisons based on the 2017/2018 cycle of the California health interview survey.
11. Equitably allocating resources during crises: racial differences in mortality prediction models.
12. Development and validation of a model for individualized prediction of hospitalization risk in 4,536 patients with COVID-19.

Author disclosures are available with the text of this article at www.atsjournals.org.

*Corresponding author (e-mail: laura.c.myers@kp.org).