key: cord-0744092-fgqcpwkz
title: Individualizing risk prediction for positive COVID-19 testing: results from 11,672 patients.
authors: Jehi, Lara; Ji, Xinge; Milinovich, Alex; Erzurum, Serpil; Rubin, Brian; Gordon, Steve; Young, James; Kattan, Michael W.
date: 2020-06-10
journal: Chest
DOI: 10.1016/j.chest.2020.05.580
sha: f35772b0f3eb352416e53bd4ab735c7f650b5c51
doc_id: 744092
cord_uid: fgqcpwkz

Abstract

Background: Coronavirus disease 2019 (COVID-19) is sweeping the globe. Despite multiple case series, actionable knowledge to proactively tailor decision-making is missing.

Research Question: Can a statistical model accurately predict infection with SARS-CoV-2?

Study Design and Methods: We developed a prospective registry of all patients tested for COVID-19 at Cleveland Clinic to create individualized risk prediction models. We focus here on the likelihood of a positive nasal or oropharyngeal COVID-19 test [COVID-19 (+)]. A least absolute shrinkage and selection operator (LASSO) logistic regression algorithm was constructed, which removed variables that did not contribute to the model's cross-validated concordance index. Following external validation in a temporally and geographically distinct cohort, the statistical prediction model was illustrated as a nomogram and deployed in an online risk calculator.

Results: 11,672 patients fulfilled study criteria in the development cohort, including 818 (7.0%) COVID-19 (+), and 2,295 patients fulfilled criteria in the validation cohort, including 290 COVID-19 (+). Males, African Americans, older patients, and those with known COVID-19 exposure were at higher risk of being COVID-19 (+). Risk was reduced in those who had received the pneumococcal polysaccharide or influenza vaccine, or who were on melatonin, paroxetine, or carvedilol. Our model had favorable discrimination (c-statistic = 0.863 in the development cohort; 0.840 in the validation cohort) and calibration. We present sensitivity, specificity, negative predictive value, and positive predictive value at different prediction cut-offs. The calculator is freely available at https://riskcalc.org/COVID19.

Interpretation: Prediction of a COVID-19 (+) test is possible and could help direct healthcare resources. We demonstrate the relevance of age, race, gender, and socioeconomic characteristics in COVID-19 susceptibility and suggest a potential modifying role of certain common vaccinations and drugs identified in drug-repurposing studies.

Funding: NIH/NCATS UL1TR002548

The first infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the novel virus responsible for coronavirus disease 2019 (COVID-19), was reported in the United States on January 21, 2020 1. Three months later, the US healthcare system and our society are struggling in an ever-changing environment of social distancing policies and projected utilization requirements, with constantly shifting treatment guidelines. A scientific approach to planning and delivering healthcare is sorely needed to match our limited resources with the persistently unmet demand. This supply-versus-demand gap is most obvious with diagnostic testing.
Plagued by technical and regulatory challenges 2, the production of COVID-19 test reagents and tests is lagging behind what is needed to fight a pandemic of this scale. Consequently, most hospitals are limiting testing to symptomatic patients and their own exposed healthcare workers. This is occurring at a time when experts are calling for expanding testing capabilities beyond symptomatic individuals to better measure the infection's transmissibility, limit its spread by quarantining those infected, and characterize COVID-19's epidemiology 3. Recent loosening of the FDA testing regulations and the development of point-of-care testing will make more tests available, but given the anticipated demand, it is unlikely that the testing supply will be sufficient. Even if enough testing supplies become available, indications driven by scientific data are still needed. Another challenge is the suboptimal diagnostic performance of the test itself 4, raising concerns that false-negative results will complicate efforts to contain the pandemic. Unless we develop intelligent targeting of our testing capabilities, we will be significantly handicapped in our ability to assess the extent of the disease, direct clinical care, and ultimately control COVID-19.

We developed a prospective registry aligning data collection for research with the clinical care of all patients tested for COVID-19 in our integrated health system. We present here the first analysis of our Cleveland Clinic COVID-19 Registry, aiming to develop and validate a statistical prediction model to guide utilization of this scarce resource by predicting an individualized risk of a "positive test". A nomogram is a visual statistical tool that can take into account numerous variables to predict an outcome of interest for a patient 5.

Study population: We included all patients, regardless of age, who were tested for COVID-19 at all Cleveland Clinic locations in Ohio and Florida. Albeit imperfect, this provides better representation of the population than testing restricted to the Cleveland Clinic main campus. Cleveland Clinic Institutional Review Board approval was obtained concurrently with the initiation of testing capabilities (IRB #20-283). The requirement for written informed consent was waived.

Data collection: Demographics, comorbidities, travel and COVID-19 exposure history, medications, presenting symptoms, treatment, and disease outcomes are collected (supplemental data 2). Registry variables were chosen to reflect the available literature on COVID-19 disease characterization, progression, and proposed treatments, including medications proposed to have potential benefit in drug-repurposing studies 6. Capture of detailed research data is facilitated by standardized clinical templates implemented across the healthcare system as patients sought care for COVID-19-related concerns. Data were extracted via previously validated automated feeds 7 from our electronic health record (EPIC, Epic Systems Corporation) and manually by a study team trained on uniform sources for the study variables. Study data were collected and managed using Research Electronic Data Capture (REDCap) electronic data capture tools hosted at Cleveland Clinic 8, 9.

COVID-19 testing protocols: The clinical framework for our testing practice is shown in Figure 1. As testing demand increased, we adapted our organizational policies and protocols to reconcile demand with patient and caregiver safety.
This occurred in three phases:

• Phase I (March 12-13, 2020): We expanded primary care through telemedicine. Patients who called with concerns that they had COVID-19 were screened through a virtual visit (VV) using Cleveland Clinic's Express Care® Online or by calling their primary care provider. If they needed to travel to one of our locations, we asked them to call ahead before arrival. Our goal was to limit exposure of caregivers and to ensure that physicians could order testing when appropriate, while following the Centers for Disease Control and Prevention (CDC) testing recommendations. A doctor's order was required for testing.

• Phase II (March 14-17, 2020): Drive-through testing was initiated on Saturday, March 14. Patients still needed a doctor's order for a COVID-19 test, and testing guidelines were similar to Phase I. Upon arrival at the drive-through location, patients stayed in their car, provided their doctor's order, and remained in their car as samples were collected. Patients were tested regardless of their ability to pay and were not charged copays.

• Phase III (March 18 onwards): Given high testing demand, a low initial testing yield, and a backlog of tests awaiting processing, there was a shift to testing high-risk patients (Figure 1).

Test samples were obtained through nasopharyngeal and oropharyngeal swabs, both collected and pooled for testing. Tests were run using the CDC assay with Roche MagNA Pure extraction and ABI 7500 DX PCR instruments, per the standard laboratory testing in our organization.

Statistical analysis: Spline knots were applied to continuous variables to relax the linearity assumption. A least absolute shrinkage and selection operator (LASSO) logistic regression algorithm was performed to retain the most predictive features. A 10-fold cross-validation method was applied to find the regularization parameter lambda that gave the optimal mean cross-validated concordance index. Predictors with nonzero coefficients in the LASSO regression model were chosen for calculating predicted risk.

Model validation: The final model was first internally validated by assessing discrimination and calibration with 1,000 bootstrap resamples. The LASSO procedure, including 10-fold cross-validation for optimizing lambda, was repeated within each resample. We then validated the model in a temporally and geographically distinct cohort of 2,295 patients tested at the Cleveland Clinic hospitals in Florida from 4/2/2020 to 4/16/2020. This was done to assess the model's stability over time and its generalizability to another geographical region.

Model performance: Discrimination was measured with the concordance index 10. Calibration was assessed visually by plotting the nomogram-predicted probabilities against the observed event proportions; the closer the calibration curve lies along the 45° line, the better the calibration. A scaled Brier score (IPA) 11 was also calculated, as it has some advantages over the more popular concordance index. The IPA ranges from -1 to 1, where a value of 0 indicates a useless model and negative values imply a harmful model. Finally, decision curve analysis (DCA) 12 was conducted to inform clinicians about the range of threshold probabilities for which the prediction model might be of clinical value. We then calculated sensitivity, specificity, positive predictive value, and negative predictive value for different recommended test cut-offs (Figure 4). We adhered to the TRIPOD checklist for prediction model development.
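For readers who want to reproduce the general approach, the following is a minimal sketch, not the authors' published code, of LASSO-penalized logistic regression with 10-fold cross-validation to select the penalty strength, scored by the concordance index (equivalent to the ROC AUC for a binary outcome). It uses scikit-learn; the dataset, column names, and coefficients below are entirely hypothetical.

```python
# Minimal sketch of LASSO logistic regression with 10-fold CV (hypothetical data).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.normal(50, 15, n),
    "male": rng.integers(0, 2, n),
    "known_exposure": rng.integers(0, 2, n),
    "flu_vaccine": rng.integers(0, 2, n),
})
# Simulate a binary test result from a logistic model (purely illustrative).
logit = -3 + 0.03 * df["age"] + 0.5 * df["male"] + 1.2 * df["known_exposure"] - 0.4 * df["flu_vaccine"]
df["covid_positive"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = df.drop(columns="covid_positive").to_numpy()
y = df["covid_positive"].to_numpy()

# L1 (LASSO) penalty shrinks uninformative coefficients exactly to zero;
# Cs is a grid of inverse regularization strengths (1/lambda) searched by 10-fold CV.
model = LogisticRegressionCV(
    penalty="l1", solver="liblinear", Cs=20, cv=10, scoring="roc_auc", max_iter=5000
).fit(X, y)

selected = [c for c, b in zip(df.columns[:-1], model.coef_.ravel()) if b != 0]
print("retained predictors:", selected)
print("apparent c-statistic:", round(roc_auc_score(y, model.predict_proba(X)[:, 1]), 3))
```

In the study itself, spline terms for continuous variables and many more candidate predictors would enter this step; only predictors with nonzero coefficients are carried forward into the nomogram.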
Patient characteristics: 11,672 patients presented with symptoms of a respiratory tract infection or with other risk factors for COVID-19 before April 2, 2020, and underwent testing according to the framework illustrated in Figure 1. The testing yield changed as the selection criteria became stricter (Supplemental Figure 1). Between April 2 and 16, 2020, 2,295 patients were tested in Florida (Florida validation cohort). The clinical characteristics of the development and validation cohorts are shown in Table 1.

Nomogram results: Imputation methods were evaluated with 1,000 repeated bootstrapped samples. Models based on median imputation appeared to outperform those based on multiple imputation by chained equations (MICE), so median imputation was selected as the basis of the final model. Variables examined that did not add value beyond those included in our final model for predicting the COVID-19 test result included being a Cleveland Clinic healthcare worker, fatigue, sputum production, shortness of breath, diarrhea, and transplant history. The bootstrap-corrected concordance index in the development cohort was 0.863 (95% CI: 0.852, 0.874), and the IPA was 20.9% (95% CI: 18.1%, 23.7%). The concordance index in the Florida validation cohort was 0.839 (95% CI: 0.817, 0.861), and the IPA was 18.7% (95% CI: 13.6%, 23.9%). Figure 3 shows the calibration curves in the development and validation cohorts. In the development cohort, the predicted risk matches the observed proportions at low predictions before the model begins to overpredict at high risk levels. Calibration in the Florida validation cohort is acceptable, although predictions above 40% become too high as the predicted probability increases.

Cut-off definition: Given that the tool provides a probability that an individual subject will test positive, the challenge is how to use the tool in practice. This would usually require choosing a cut-off below which the risk is sufficiently low that the subject would not be tested. Figure 4 illustrates the trade-off by plotting the proportion of negative tests avoided versus the proportion of positive tests retained as the cut-off is increased. A decision curve analysis showed that if the threshold of action is 1.3% or less, the model is no better than simply assuming everyone is "high risk"; once the threshold becomes greater than 1.3%, using the model to determine who is high risk is preferable. The nomogram and its online version, available at https://riskcalc.org/COVID19/, are shown in Figure 2.
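As a rough illustration of how the reported discrimination and IPA figures can be computed from a vector of predicted probabilities and observed results, here is a hedged sketch. It uses simple percentile-bootstrap confidence intervals on a fixed prediction vector rather than the full optimism-corrected procedure with model refitting described in the Methods, and the arrays `y_val` and `pred_val` are hypothetical.

```python
# Sketch (not the authors' code): concordance index (ROC AUC), scaled Brier
# score (IPA), and simple percentile-bootstrap confidence intervals.
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss

def scaled_brier(y, pred):
    # IPA = 1 - Brier(model) / Brier(null model that always predicts the prevalence)
    brier_model = brier_score_loss(y, pred)
    brier_null = brier_score_loss(y, np.full(len(y), y.mean()))
    return 1.0 - brier_model / brier_null

def bootstrap_ci(metric, y, pred, n_boot=1000, seed=0):
    # Percentile bootstrap over patients; resamples with only one outcome class are skipped.
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))
        if y[idx].min() == y[idx].max():
            continue
        stats.append(metric(y[idx], pred[idx]))
    return np.percentile(stats, [2.5, 97.5])

# Example usage on hypothetical validation-cohort arrays:
# c_index = roc_auc_score(y_val, pred_val)
# ipa = scaled_brier(y_val, pred_val)
# c_lo, c_hi = bootstrap_ci(roc_auc_score, y_val, pred_val)
# ipa_lo, ipa_hi = bootstrap_ci(scaled_brier, y_val, pred_val)
```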
The COVID-19 pandemic has significantly impacted the world, changing medical practice and our society. Some countries are now recovering from it, but many regions are just beginning to be affected. In the United States, some states are still preparing for a "surge" that may overwhelm the healthcare delivery system, while others are preparing to "re-open" and lift social distancing measures. In a "pre-surge" situation, the resources needed to address every step of a patient's trajectory through COVID-19 are limited, from testing through hospitalization and intensive care if needed. In a "pre-reopening" situation, tools to better identify individuals at risk of developing COVID-19 are sorely needed to inform policy.

We developed the Cleveland Clinic COVID-19 Registry to include all patients tested for COVID-19 (rather than just those with the disease) to better understand disease epidemiology and to develop nomograms: tools that go beyond cohort descriptions to individualize risk prediction for any given patient. This could empower front-line healthcare providers and inform decision-making, immediately impacting clinical care. We present here our first such nomogram, one that predicts the risk of a positive COVID-19 test. We want to emphasize that our work should not be interpreted as "accepting" or rationalizing inadequate testing capacity. Our tool should not take the pressure off doing what is clinically right for individual patients by expanding testing capabilities.

COVID-19 testing challenge: The available COVID-19 clinical literature is mostly based on small case series or descriptive cohort studies of patients already documented to have COVID-19 13-22. This provides some information on the population that may be at greatest risk of adverse outcomes if infected with the virus, but does little to inform us about who is at greatest risk of becoming infected. The proportion of COVID-19 (-) tests fell significantly in our patient population with stricter testing guidelines (Supplemental Figure 1), but the yield remained very low, suggesting that our ability to clinically differentiate COVID-19 from other respiratory illnesses at the early stages of the disease is limited, further supporting the need for better tools to individualize testing indications.

COVID-19 risk factors: Some of our predictors of developing COVID-19 confirm prior literature. For example, we corroborate a recent World Health Organization report suggesting that men may be at higher risk of developing COVID-19 23, thought to reflect underlying hormonal or genetic risk. Our finding of a higher COVID-19 risk with advancing age can be explained by known age-related changes in the renin-angiotensin system in mice 24. Carvedilol was recently found to inhibit angiotensin II-induced proliferation and contraction in hepatic stellate cells through the RhoA/Rho-kinase pathway 27. It is unclear whether it has similar effects on ACE-2 in lung endothelium. With ACE-2 being key in the pathophysiology of infection with SARS-CoV-2, our findings are intriguing. These findings would have to be reproduced and validated in clinical trials before their full significance can be assessed.

When interpreting our multivariable model, it is important to recognize that a single predictor cannot be interpreted in isolation. For example, it is artificial to claim that a drug is reducing risk since, in reality, other variables tend to differ between patients who are and are not on a given drug. Moving a patient on a nomogram axis while holding all other axes constant is hypothetical, since he or she is likely moving on other axes when moved on one. This is the case for all multivariable statistical prediction models.

Nomogram performance: Model performance, as measured by the concordance index, is very good in both the development and the validation cohort (c-statistic = 0.863 and 0.839, respectively). This level of discrimination is clearly superior to a coin toss or to assuming all patients are at equivalent risk (both c-statistics = 0.5). The internal calibration of the model is excellent at low predicted probabilities (see Figure 3), but some regression to the mean is apparent at predictions beyond roughly 40% in the validation cohort.
That the model overpredicts risk at that level would seem to be of little concern, since such predictions are already considerably high clinically and likely beyond any threshold of action. Moreover, the metric that considers calibration, the IPA value, confirms that the model predicts better than chance or no model at all. The good performance of our model in a geographically distinct region (Florida), and over time (the validation cohort was tested in a later timeframe), suggests that the patterns and predictors identified in our model are likely consistent across health systems and regions, rather than specific to the unique spread of the virus within Cleveland's social structures.

Clinical utility: As with any predictive tool, the utility of a nomogram depends on the clinical context. The decision curve analysis suggests that if the goal is to distinguish patients with a risk below 1.3% (or any higher cut-off) from those at higher risk, the prediction model is useful. In other words, using the model to determine whom to test detects more true positives per test performed than testing everyone, as long as one is willing to test 1,000 subjects to detect 13 cases. Any cut-off choice involves a trade-off between avoiding negative tests and missing positive cases, illustrated in Figure 4. Using a low prediction cut-off (e.g., 1.3% from the tool) as a trigger to order testing will allow us to continue to identify the vast majority of COVID-19 (+) cases (assuming our other selection criteria for testing remain constant) while avoiding testing a large proportion of patients who are indeed COVID-19 (-). This may be appropriate when testing supplies are abundant and one wants to comprehensively survey the extent of COVID-19 in the population. Conversely, in a resource-limited setting (e.g., a hospital facing a surge), a cut-off greater than 1.3% may be more appropriate to avoid unnecessary testing.
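To make the threshold discussion concrete, the following is a minimal sketch of the two calculations referenced above: net benefit at a given threshold probability (the quantity plotted in a decision curve analysis) and the trade-off between positive tests retained and negative tests avoided at a given cut-off. It assumes hypothetical arrays `y` (observed 0/1 results) and `pred` (model-predicted probabilities) and is not the authors' code.

```python
# Sketch of decision-curve and cut-off trade-off calculations (hypothetical inputs).
import numpy as np

def net_benefit(y, pred, threshold):
    # Net benefit of testing everyone whose predicted risk is >= threshold:
    # true positives per patient minus false positives per patient, weighted by
    # the odds of the threshold probability (Vickers & Elkin decision curve analysis).
    n = len(y)
    test = pred >= threshold
    tp = np.sum(test & (y == 1))
    fp = np.sum(test & (y == 0))
    return tp / n - (fp / n) * (threshold / (1 - threshold))

def test_all_net_benefit(y, threshold):
    # Reference strategy: treat everyone as "high risk" and test them all.
    prevalence = np.mean(y)
    return prevalence - (1 - prevalence) * (threshold / (1 - threshold))

def cutoff_tradeoff(y, pred, cutoff):
    # Proportion of COVID-19 (+) cases still captured (sensitivity) and proportion
    # of negative tests avoided if only patients with predicted risk >= cutoff are tested.
    test = pred >= cutoff
    retained = np.sum(test & (y == 1)) / np.sum(y == 1)
    avoided = np.sum(~test & (y == 0)) / np.sum(y == 0)
    return retained, avoided

# Example usage around the 1.3% threshold discussed above:
# for pt in (0.013, 0.05, 0.10):
#     print(pt, net_benefit(y, pred, pt), test_all_net_benefit(y, pt), cutoff_tradeoff(y, pred, pt))
```

The model adds value at the thresholds where its net benefit exceeds that of the test-everyone strategy, which per the decision curve analysis above occurs once the threshold exceeds 1.3%.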
Study limitations: Available real-time reverse transcriptase polymerase chain reaction (rRT-PCR) tests of nasopharyngeal swabs have typically been used for diagnosis, but data suggest suboptimal test performance: the SARS-CoV-2 virus was detected in only 63% of nasal swabs and 32% of pharyngeal swabs in patients with known disease 4. In our study, we collected both swabs, hoping to at least partly address this limitation. Although we validated our model in a temporally and geographically distinct cohort, we acknowledge that our results depend on the particular time and place the data were collected. As the pandemic evolves, our results may not reflect the updated distribution of the virus in any given region; to accommodate an ever-increasing COVID-19 prevalence, the model will need to be recalibrated and refit over time, and the online calculator will reflect this updating. Our online risk calculator is publicly available, but direct integration with the electronic health record could further improve its utility. Our study is not designed to evaluate the very real issue of healthcare disparities, which would require a population-based approach to the study of healthcare delivery, beyond the scope of the work presented here. Our conclusions are highly dependent on access to testing sites and doctors' orders rather than on population-based predictors of positive results.

We provide an online risk calculator that can effectively identify individualized risk of a positive COVID-19 test. Such a tool provides immediate benefit to our patients and healthcare providers as we face anticipated increased demand and limited resources, but it does not obviate the critical need for adequate testing: the scarcity of resources must not be accepted as an unalterable fact, and we should resist the inevitability of resource shortages and inequities in healthcare. We also provide some mechanistic and therapeutic insights.

Figure 1: Timeline illustrating the evolution of the clinical framework for COVID-19 test ordering during the first 10 days of testing. *Patients were only sent to the Emergency Department (ED) if they needed evaluation of additional symptoms, and not purely to obtain COVID-19 testing. ***Guidelines for ordering COVID-19 testing followed the CDC recommendations. The main change in Phase III was a better definition of high-risk categories, rather than reliance on "physician discretion". VV = virtual visit. Of note, only 6.7% of patients were tested in Phases I and II on the basis of physician discretion alone, a number too small to support any modeling work in that group.

Figure 2: This figure illustrates the graphical version of the model (nomogram, 2A) and the corresponding online risk calculator found at https://riskcalc.org/COVID19/ (2B). The example for both is a 60-year-old white male, former smoker, who presented with cough, fever, and a known family member with COVID-19. He has coronary artery disease, did not receive vaccination against influenza or pneumococcal pneumonia this year, and takes only melatonin to help with sleep. No labs were done at the time of COVID-19 testing. His predicted risk of testing positive is 13.79%. If race is changed to black, with all other variables held constant, his risk almost doubles to an absolute value of 23.95%.

To read the nomogram (2A): Step 1: Find the patient's characteristic on each line and draw a vertical arrow from it to the Points line; the intersection identifies the points attributed to this characteristic. Our example patient earns 32 points for fever, 24 points for cough, 85 points for being 60 years old, and so on. Step 2: Repeat Step 1 for each of the patient's characteristics. Step 3: Add all the points collected and mark the total on the Total Points line; draw a downward arrow from the total points, and its intersection with the "Risk of COVID-19 positive" line provides the individualized patient risk, here 13.79%.

To use the online calculator (2B): Step 1: Enter patient data. Step 2: Run the calculator. Step 3: Obtain the individualized prediction.

Figure 3: Calibration curves for the model predicting the likelihood of a positive test. The x-axis displays the predicted probabilities generated by the statistical model, and the y-axis shows the fraction of patients who were COVID-19 (+) at a given predicted probability. The 45° line indicates perfect calibration, where, for example, a predicted probability of 0.2 is associated with an actual observed proportion of 0.2. The solid black line indicates the model's relationship with the outcome; the closer it lies to the 45° line, the closer the model's predicted probability is to the actual proportion. Figure 3A shows the calibration curve in the development cohort of 11,672 patients tested in the Cleveland Clinic Health System before April 2. Figure 3B shows the calibration curve in the Florida validation cohort (2,295 patients tested at Cleveland Clinic Florida from 4/2/2020 to 4/16/2020). As demonstrated, there is good correspondence between the predicted probability of a positive test and the observed frequency of COVID-19 (+) in both cohorts.
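For readers unfamiliar with how a nomogram's point system relates to the underlying logistic model, the following is a purely illustrative sketch with hypothetical coefficients (not the published model): each predictor's contribution to the linear predictor is rescaled onto a common points scale, as described in the Figure 2 legend above, and the total maps back to a predicted probability.

```python
# Hypothetical illustration of a logistic-regression nomogram's point system.
import numpy as np

# Hypothetical coefficients and predictor ranges; NOT the published model.
intercept = -7.0
coefs = {"fever": 0.9, "cough": 0.7, "age_per_year": 0.04, "known_exposure": 1.3}
ranges = {"fever": 1, "cough": 1, "age_per_year": 90, "known_exposure": 1}

# Points: each predictor's maximum possible contribution to the linear predictor
# is rescaled so the largest one equals 100 points (the usual nomogram convention).
max_contrib = max(coefs[k] * ranges[k] for k in coefs)
points_per_unit = {k: 100 * coefs[k] / max_contrib for k in coefs}

def predicted_risk(patient):
    # Logistic model: risk = 1 / (1 + exp(-(intercept + sum of beta_j * x_j)))
    lp = intercept + sum(coefs[k] * v for k, v in patient.items())
    return 1.0 / (1.0 + np.exp(-lp))

# A patient loosely resembling the worked example in the Figure 2 legend (values hypothetical).
patient = {"fever": 1, "cough": 1, "age_per_year": 60, "known_exposure": 1}
total_points = sum(points_per_unit[k] * v for k, v in patient.items())
print(f"total points: {total_points:.0f}, predicted risk: {100 * predicted_risk(patient):.1f}%")
```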
References:
The coronavirus pandemic in five powerful charts.
Defining the Epidemiology of Covid-19 - Studies Needed.
Detection of SARS-CoV-2 in Different Types of Clinical Specimens.
Network-based drug repurposing for novel coronavirus 2019-nCoV/SARS-CoV-2.
Extracting and utilizing electronic health data from Epic for research.
Research electronic data capture (REDCap) - A metadata-driven methodology and workflow process for providing translational research informatics support.
The REDCap consortium: Building an international community of software partners.
Evaluating the yield of medical tests.
The index of prediction accuracy: an intuitive measure useful for evaluating risk prediction models.
Assessing the performance of prediction models: a framework for traditional and novel measures.
Epidemiological, clinical and virological characteristics of 74 cases of coronavirus-infected disease 2019 (COVID-19) with gastrointestinal symptoms.
Epidemiological and Clinical Predictors of COVID-19.
Clinical course and mortality risk of severe COVID-19. Lancet.
Host susceptibility to severe COVID-19 and establishment of a host risk score: findings of 487 cases outside Wuhan. Crit Care.
Clinical Features of 69 Cases with Coronavirus Disease.
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study.
Risk Factors Associated With Acute Respiratory Distress Syndrome and Death in Patients With Coronavirus Disease.
Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study.
Clinical characteristics of 140 patients infected with SARS-CoV-2 in Wuhan, China. Allergy.
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study.
Age-Associated Changes in the Vascular Renin-Angiotensin System in Mice.
Ageing reduces angiotensin II type 1 receptor antagonism mediated pre-conditioning effects in ischemic kidneys by inducing oxidative and inflammatory stress.
Synthetic Toll-like receptor 4 (TLR4) and TLR7 ligands as influenza virus vaccine adjuvants induce rapid, sustained, and broadly protective responses.
Carvedilol Inhibits Angiotensin II-Induced Proliferation and Contraction in Hepatic Stellate Cells through the RhoA/Rho-Kinase Pathway.