key: cord-0990886-85dg6prb authors: Reimer, Jody R; Ahmed, Sharia M; Brintz, Benjamin; Shah, Rashmee U; Keegan, Lindsay T; Ferrari, Matthew J; Leung, Daniel T title: Using a clinical prediction rule to prioritize diagnostic testing leads to reduced transmission and hospital burden: A modeling example of early SARS-CoV-2 date: 2021-02-23 journal: Clin Infect Dis DOI: 10.1093/cid/ciab177 sha: 35f24e4ebbed8cee587742a49f3b4a461041b7a5 doc_id: 990886 cord_uid: 85dg6prb BACKGROUND: Prompt identification of infections is critical for slowing the spread of infectious diseases. However, diagnostic testing shortages are common in emerging diseases, low resource settings, and during outbreaks. This forces difficult decisions regarding who receives a test, often without knowing the implications of those decisions on population-level transmission dynamics. Clinical prediction rules (CPRs) are commonly used tools to guide clinical decisions. METHODS: Using early SARS-CoV-2 as an example, we used data from electronic health records to develop a parsimonious 5-variable CPR to identify those who are most likely to test positive. To consider the implications of gains in daily case detection at the population level, we incorporated testing using the CPR into a compartmentalized model of SARS-CoV-2. RESULTS: We found that applying this CPR (AUC: 0.69 (95% CI: 0.68 - 0.70)) to prioritize testing increased the proportion of those testing positive in settings of limited testing capacity. We found that prioritized testing led to a delayed and lowered infection peak (i.e., “flattens the curve”), with the greatest impact at lower values of the effective reproductive number (such as with concurrent community mitigation efforts), and when higher proportions of infectious persons seek testing. Additionally, prioritized testing resulted in reductions in overall infections as well as hospital and intensive care unit (ICU) burden. 
CONCLUSION: We highlight the population-level benefits of evidence-based allocation of limited diagnostic capacity.
The ongoing COVID-19 pandemic has demonstrated the importance of rapid identification of infections in managing an epidemic, as it allows for rapid isolation of cases, contact tracing and quarantining of contacts, thereby limiting onward transmission. However, as seen at the onset of the current pandemic, diagnostic testing capacity is often limited in the emergence of novel infections, in low resource settings, or during outbreaks [1] [2] [3]. When diagnostic testing is unavailable, clinical case definitions are used instead in clinical management and public health response [4]. The rationing of diagnostic testing may result in those with more severe disease or at higher risk of complications receiving tests, as definitive diagnosis is critical to guide care [5]. However, because of their symptoms, severely ill patients may also be less mobile, thereby limiting the indirect benefit of their diagnostic testing on reducing onward transmission. Therefore, tools are needed to guide clinicians in the face of limited testing capacity. Clinical prediction rules (CPRs) are commonly used tools to help guide clinical management decisions, such as who should undergo testing or receive limited clinical resources. They provide standardization and consistency in care between physicians, as well as improved diagnostic accuracy [6]. Some widely used CPRs include the Centor criteria [7] for diagnosis and treatment of strep pharyngitis, the Ottawa ankle rule [8] for appropriate use of X-ray in the setting of ankle trauma, and the CURB65 score [9] for triage of patients with pneumonia. As CPRs are usually developed to improve patient care, their evaluation has focused on their impact on patient-level outcomes; the impact of CPRs on population health, including on transmission dynamics of infectious pathogens, has not been widely studied.
Compartmental models, such as the susceptible-exposed-infected-removed (SEIR) model, are often used to describe disease dynamics through a population. They combine epidemiological information (e.g. transmissibility, duration of infectiousness, reproductive number) to provide a picture of the population-level disease dynamics over time [10, 11]. To our knowledge, compartmental models have not yet been used to evaluate the impact of CPRs on population-level public health outcomes. Many diagnostic models for SARS-CoV-2 now exist [12], each specific to a given population and time, typically focused on achieving optimal patient care. Using a single health system in Utah as a proof-of-concept, we developed a CPR and incorporated it into an SEIR model of the ongoing SARS-CoV-2 pandemic to evaluate the population-level impact that could have been achieved by using a CPR to prioritize testing early in the pandemic, when testing capacity was limited. Many countries, including the United States, have experienced shortages in diagnostic testing capacity, and these shortages will likely continue in many settings worldwide [13] [14] [15], as well as in future outbreaks of emerging pathogens. Our primary objective was to measure the impact that prioritized testing (using the CPR) could have had on the course of the SARS-CoV-2 pandemic, including the magnitude and timing of the outbreak peak as well as the associated impact on hospitalization and intensive care unit (ICU) burden. Additionally, we determined the conditions (e.g., test availability, test seeking volume, effective reproductive number) in which prioritized testing would have resulted in the greatest reduction of SARS-CoV-2 infections and hospitalizations. Potential benefits of CPR-guided testing continue to be relevant for surges in the SARS-CoV-2 pandemic, for future emerging infections, and for outbreaks of common infections (e.g., cholera, measles) in settings with limited diagnostic capacity.
All patients tested for SARS-CoV-2 in the University of Utah Health (UHealth) system were eligible for our study. Data were gathered from a period where testing eligibility was based on presenting with at least one of cough, fever, shortness of breath, or a high risk of exposure given recent travel or contact with a laboratory-confirmed case (March 1, 2020 -April 6, 2020). We use the phrase test eligible to describe any person seeking a test who satisfies these conditions. We considered age, gender, state ranked area deprivation index, smoking status, reported symptoms, healthcare worker status, travel history, and exposure to a confirmed SARS-CoV-2 case as predictive variables. Random forest regression and logistic regression models were considered for our CPR. Our final CPR was a logistic regression model using the top 5 predictors to output the probability of an individual testing positive for SARS-CoV-2. Full details on data processing, the predictive variables, and the construction of the CPR are available in the Supplementary Materials S1. This study was reviewed by the University of Utah Institutional Review Board (IRB) and determined to be exempt. We first explored the effects of prioritized versus indiscriminate testing per day (Fig. 1A ). On a given day, we assumed a certain number, N eligible , of people seek testing and are test eligible (have cough, fever, shortness of breath, or known exposure and seek testing). Of those who seek testing, a certain proportion q would test positive for SARS-CoV-2 if given a test and the rest, (1-q), would test negative. We assumed a limited number, N tests , of SARS-CoV-2 tests were available daily. Using simulations (details in Supplementary Material S2), we measured the proportion of test eligible, SARS-CoV-2 positive patients who received testing under the two testing regimes: prioritized and indiscriminate testing. 
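The daily comparison described above can be illustrated with a minimal sketch. It is not the authors' code (the study used R, and the actual procedure is in Supplementary Material S2, which is not reproduced here). The function names are hypothetical, and the way CPR scores are generated is an assumption made only for illustration: scores are drawn from two unit-variance normal distributions whose separation is chosen to give the stated AUC.

```python
import random
from statistics import NormalDist, mean


def simulate_day(n_eligible, n_tests, q, auc, rng):
    """Simulate one day; return the fraction of true positives tested under
    (prioritized, indiscriminate) testing, or None if nobody is positive."""
    # ASSUMPTION (not from the paper): binormal scores. Negatives ~ N(0, 1),
    # positives ~ N(d, 1), with d chosen so that Phi(d / sqrt(2)) equals `auc`.
    d = 2 ** 0.5 * NormalDist().inv_cdf(auc)
    is_positive = [rng.random() < q for _ in range(n_eligible)]
    n_positive = sum(is_positive)
    if n_positive == 0:
        return None
    scores = [rng.gauss(d if pos else 0.0, 1.0) for pos in is_positive]
    k = min(n_tests, n_eligible)

    # Prioritized: test the k people with the highest scores.
    ranked = sorted(range(n_eligible), key=scores.__getitem__, reverse=True)
    found_prioritized = sum(is_positive[i] for i in ranked[:k])

    # Indiscriminate: test k people chosen uniformly at random.
    found_random = sum(is_positive[i] for i in rng.sample(range(n_eligible), k))

    return found_prioritized / n_positive, found_random / n_positive


def compare_strategies(n_eligible, n_tests, q, auc=0.69, n_reps=200, seed=1):
    """Average both detection fractions over n_reps simulated days."""
    rng = random.Random(seed)
    days = [simulate_day(n_eligible, n_tests, q, auc, rng) for _ in range(n_reps)]
    days = [day for day in days if day is not None]
    return mean(p for p, _ in days), mean(r for _, r in days)


if __name__ == "__main__":
    # Illustrative values only: 5% positivity, tests for 10% of those eligible.
    prioritized, indiscriminate = compare_strategies(1000, 100, q=0.05)
    print("fold change in positives detected:", prioritized / indiscriminate)
```

Under indiscriminate testing the expected fraction of positives tested is simply N tests / N eligible, so any gain from prioritization comes entirely from how well the score ranks true positives above negatives.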
We also considered the effect of prioritized testing on disease spread in the population over longer time scales (months-to-years). We incorporated the same processes described above into a stochastic SEIR model parametrized for COVID-19. On each modeled day, we simulated the steps shown in Fig. 1B, with parameters as in Table 1. Further simulation details are in Supplementary Materials S3. We ran simulations assuming a total population of 3.2 million, the approximate population of the state of Utah [16]. We assumed an initial condition of 15 people in the infectious class and all others in the susceptible class. We ran our simulations for a period of 2 years. For each set of parameters considered, we ran 1000 stochastic simulations and then calculated the mean value of each of the total susceptible (S + T_S), exposed (E + T_E), infectious (I + T_I), and removed (R + T_R) groups, as well as 95% prediction intervals. We then calculated several metrics including the timing of the peak of the mean infection curve; the peak value of the mean infection curve; and the mean total number of infections by the end of the simulation. These metrics allowed us to compare expected outcomes between the models with indiscriminate testing and prioritized testing. To highlight the associated implications for healthcare demand, we also modeled the daily occupancy of hospital beds and ICU beds (details in Supplementary Material S3). We then calculated the mean number of people-days (i.e., the number of people on a given day) where demand for hospitalization exceeds Utah's capacity of 4,869 hospital beds and the number of people-days where demand for ICU beds exceeds Utah's capacity of 687 ICU beds [18, 19]. Note that these numbers are for total hospital and ICU beds, not those set aside for COVID-19 patients, and thus provide an upper bound for hospital capacity. All analyses and simulations were conducted using R statistical software (version 3.6.0, [20]).
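As a small illustration of the people-days metric described above, the sketch below sums the daily excess of bed demand over capacity and averages it across simulated trajectories. This is one reading of the definition in the text, not the authors' implementation (which was written in R); the function names are hypothetical, and only the two capacity figures come from the text.

```python
HOSPITAL_BEDS = 4869  # Utah hospital bed capacity quoted in the text
ICU_BEDS = 687        # Utah ICU bed capacity quoted in the text


def people_days_over_capacity(daily_occupancy, capacity):
    """Sum, over all days, the number of people needing a bed beyond capacity."""
    return sum(max(0, occupied - capacity) for occupied in daily_occupancy)


def mean_people_days_over_capacity(trajectories, capacity):
    """Average the people-days metric over a list of simulated trajectories."""
    totals = [people_days_over_capacity(t, capacity) for t in trajectories]
    return sum(totals) / len(totals)


if __name__ == "__main__":
    # Hypothetical four-day hospital occupancy series, for illustration only.
    demand = [4000, 5000, 4869, 6000]
    print(people_days_over_capacity(demand, HOSPITAL_BEDS))
```

Because the capacities are total beds rather than beds reserved for COVID-19 patients, a metric computed this way understates how often demand would exceed the beds actually available.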
All code is archived and available online at doi:10.5281/zenodo.3924186. During the period March 1 -April 6, 2020, 1,983 patients were tested for SARS-CoV-2 at UHealth. After removing observations with missing covariate data, we obtained an analytic sample size of 1,928. Our final parsimonious 5-variable CPR had a cross-validated AUC of 0.69 (95% CI: 0.68 -0.70). In all the results that follow, we used this 5-variable CPR. We explored using additional variables but found this only marginally improved predictive ability (AUC up to 0.71; Fig. S1 and Table S2 ), at the expense of requiring much greater data entry effort by clinicians. We also considered alternative versions of the CPR in light of varying predictor availability in different clinical contexts. We explored models excluding symptoms, including vital signs, and including a race/ethnicity variable (Table S1) . Again, these did not meaningfully improve predictive ability (AUC up to 0.72; Table S2 ). Finally, we explored using random forest regression to fit the models, but logistic regression estimates had consistently higher AUCs. When comparing indiscriminate testing to prioritized testing, the absolute difference in the number of people infected with COVID-19 who were tested was greatest for intermediate levels of testing availability, achieving the greatest benefit to disease detection when between 40-60% of test eligible people received testing (vertical difference between solid lines in Fig. 2) . However, the proportional increase in the number of people infected with COVID-19 who were tested was greatest for low testing capacity, with the largest fold changes seen when <20% of test eligible people received testing (dotted line in Fig. 2 ). 
For example, if the rate of SARS-CoV-2 positivity among test eligible people was 5% and there was test capacity for only 10% of those test eligible people, we would expect to see a nearly 3-fold increase in the number of patients testing positive on a given day if using prioritized testing instead of indiscriminate testing (Fig. 2A) . These results were sensitive to the proportion of SARS-CoV-2 positive patients who are test eligible, with greater differences between prioritized and random testing strategies seen for low rates of SARS-CoV-2 positivity (compare Fig. 2A-2E ). Results were robust to the total number of test eligible persons. Using our stochastic SEIR compartmental model, we show that prioritized testing delays the timing and reduces the prevalence at the infection peak and reduces final size of the pandemic (Fig. 3 , Table 2 ). For our base parameter set, prioritized testing as compared with indiscriminate testing resulted in a 30 day delay in the timing of the infection peak and a 22% decrease in the peak number of infections. The differences in the timing and numbers of infections between a model with prioritized versus indiscriminate testing were greatest for lower values of the effective reproductive number, R e (Fig. 3 , Table 2 ). When alternate CPRs with similar AUC values were considered, results varied only marginally (Table S2) . Alternate CPRs with higher AUC values did not necessarily perform better on all metrics (Table S3 ). Increasing the proportion of infectious test eligible people (w I ) had a positive impact on the magnitude of the differences between the indiscriminate and prioritized testing models (Fig. 3, Table 2 ). Increasing the number of tests available (N tests ) increased the differences for low values of N tests but then had reduced benefits for higher values (Table 2) , consistent with Fig. 2 . 
Varying the delay in test results, from 0 to 4 days, we observed only small differences in overall disease dynamics.
The availability of diagnostic testing may be limited during either the initial phase of an outbreak with an emerging pathogen, or even in later phases in under-resourced settings, resulting in rationing of diagnostic tests, which can have unintended population-level implications. Using SARS-CoV-2 in Utah as a proof-of-concept, we found that a CPR to prioritize testing positively impacts both the number of laboratory-confirmed cases per day and long-term disease dynamics when testing is scarce. We incorporated our model of prioritized testing into an SEIR model and showed the value of our CPR, with appreciable delays in the timing and reductions in the height of the infection peak, decreases in the total number of infections, and reductions in the number of people-days above hospital and ICU capacity. This novel combination of analytic methods allowed us to highlight both the individual- and population-level benefits of the CPR. In spite of our CPR having only moderate discriminatory performance (AUC=0.69), our results show that prioritizing diagnostic testing, even based on less-than-perfect CPRs, still has a meaningful impact on individual and population disease burden. Furthermore, future predictive models built following more extensive and improved data collection (e.g. standardized collection by clinicians over a longer time) may improve CPR performance, thereby further improving the impact of prioritized testing on community disease burden. We found that prioritized testing yielded the greatest absolute gains for intermediate testing capacity (capacity to test between 40-60% of test eligible people), and highest proportional gains for low testing capacity.
Improved diagnostic triage through prioritized testing leads to diagnosis of individuals earlier in their course of disease, with potential for benefit through earlier initiation of therapies or medical monitoring, and isolation or contact-tracing precautions [21]. At the population level, we found a notable impact of prioritized testing on COVID-19 dynamics, leading to reductions in infections, hospitalizations, and ICU utilization, as well as delaying the infection peak, providing more time for health systems to prepare for the surge. The magnitude of this impact was sensitive to several key parameters. For example, when R e was lowered, as may happen with the introduction of other public health interventions such as social distancing, the effects of prioritized testing increased. This suggests a synergistic effect between prioritized testing and other non-pharmaceutical interventions: implementing prioritized testing concurrently with other non-pharmaceutical interventions that reduce R e can help to maximize potential gains. Increasing the proportion of infectious people who seek testing (w I ) increases the effects of prioritized testing because of the indirect benefit (reduction of R e ) of isolating those individuals quickly. This may occur in populations with a higher proportion of symptomatic individuals, such as older populations [22] or those with other known risk factors [23]. Alternatively, the proportion of infectious individuals seeking testing could be increased intentionally through interventions such as contact tracing or campaigns to encourage test-seeking behavior.
(Table S3). Secondly, there are several logistical challenges. Implementation of such a prioritization system would require its incorporation into a telephone or web-based triage, or through a health worker-based assessment. Additionally, our model assumes that all individuals seeking testing would present at the same time.
In most clinical settings, the implementation of such a CPR would involve the use of a probability threshold, set based on data from the previous day(s) and the expected number of test eligible people. The optimal setting of this threshold, given stochastic testing demands and infection dynamics, would be an area for future exploration during clinical trials. Third, we did not consider the implications of the sensitivity and specificity of SARS-CoV-2 tests; low sensitivity and specificity in the diagnostic tests would reduce the utility of testing in general, and thus also of prioritized testing. Finally, our SEIR model was chosen as a tool to demonstrate the relative impact of the CPR using a generalizable framework familiar to our intended audience, and thus omitted explicit consideration of some SARS-CoV-2 transmission mechanisms (e.g., superspreader events). As knowledge about any emerging pathogen continues to evolve, additional details which could help with detailed forecasting can and should be included for specific populations, appropriate for a specific time and place. The limited availability of SARS-CoV-2 testing has hampered disease mitigation efforts in many locations. By incorporating a diagnostic CPR into a transmission dynamics model, we have demonstrated the potential efficacy of prioritized testing for delaying and reducing peak infections and the consequent healthcare demand. By highlighting parameter regimes in which these effects are greatest, we have suggested situations in which it may be most efficacious to consider using a CPR to prioritize testing during testing shortages caused by the emergence of a novel infectious disease such as SARS-CoV-2.
Laboratory Response to Ebola - West Africa and United States
World Health Organization.
Coronavirus disease
Isolation of a Novel Coronavirus from a Man with Pneumonia in Saudi Arabia
COVID-19): 2020 Interim Case Definition
Cumulative incidence and diagnosis of SARS-CoV-2 infection in New York
Clinical prediction rules: Challenges, barriers, and promise
Large-scale validation of the Centor and McIsaac scores to predict group A streptococcal pharyngitis
Accuracy of Ottawa ankle rules to exclude fractures of the ankle and mid-foot: Systematic review
Defining community acquired pneumonia severity on presentation to hospital: An international derivation and validation study
Mathematical models in the evaluation of health programmes
Opportunities and challenges in modeling emerging infectious diseases
Prediction models for diagnosis and prognosis of covid-19 infection: Systematic review and critical appraisal
Why US coronavirus testing barely improved in April - Vox. Available at
The COVID-19 testing challenge Accessed 28
Coronavirus Test Obstacles: A Shortage of Face Masks and Swabs - The New York Times. 2020. Available at
A scenario modeling pipeline for COVID-19 emergency planning
American Hospital Directory. Individual Hospital Statistics for Utah
Harvard Global Health Institute. US Hospital Capacity
R: A Language and Environment for Statistical Computing
Predicting partner HIV testing and counseling following a partner notification intervention
Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China
Prevalence of comorbidities and its effects in coronavirus disease 2019 patients: A systematic review and meta-analysis
Overview of COVID-19 Surveillance Accessed 21
An Introduction to Statistical Learning
Random Forest vs Logistic Regression: Binary Classification for Heterogeneous Datasets
The real time effective reproductive number for COVID-19 in the United States
High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2