key: cord-0704004-dmczi8yx
authors: Lopez, Cesar A.; Cunningham, Clark H.; Pugh, Sierra; Brandt, Katerina; Vanna, Usaphea P.; Delacruz, Matthew J.; Guerra, Quique; Goldstein, Samuel Jacob; Hou, Yixuan J.; Gearhart, Margaret; Wiethorn, Christine; Pope, Candace; Amditis, Carolyn; Pruitt, Kathryn; Newberry-Dillon, Cinthia; Schmitz, John; Premkumar, Lakshmanane; Adimora, Adaora A.; Emch, Michael; Boyce, Ross; Aiello, Allison E.; Fosdick, Bailey K.; Larremore, Daniel B.; de Silva, Aravinda M.; Juliano, Jonathan J; Markmann, Alena J.
title: Disparities in SARS-CoV-2 seroprevalence among individuals presenting for care in central North Carolina over a six-month period
date: 2021-03-30
journal: medRxiv
DOI: 10.1101/2021.03.25.21254320
sha: 6458353b0f5cb7419391ff3c97a962bdb469ba0f
doc_id: 704004
cord_uid: dmczi8yx

BACKGROUND: Robust community-level SARS-CoV-2 prevalence estimates have been difficult to obtain in the American South and outside of major metropolitan areas. Furthermore, though some previous studies have investigated the association of demographic factors such as race with SARS-CoV-2 exposure risk, fewer have correlated exposure risk to surrogates for socioeconomic status such as health insurance coverage.

METHODS: We used a highly specific serological assay utilizing the receptor binding domain of the SARS-CoV-2 spike-protein to identify SARS-CoV-2 antibodies in remnant blood samples collected by the University of North Carolina Health system. We estimated the prevalence of SARS-CoV-2 in this cohort with Bayesian regression, as well as the association of critical demographic factors with higher prevalence odds.

FINDINGS: Between April 21st and October 3rd of 2020, a total of 9,624 unique samples were collected from clinical sites in central NC, and we observed a seroprevalence increase from 2·9% (1·7, 4·3) to 9·1% (7·2, 11·1) over the study period.
Individuals who identified as Latinx had the highest odds ratio of SARS-CoV-2 exposure at 7·77 overall (5·20, 12·10). Increased odds were also observed among Black individuals and individuals without public or private health insurance.

INTERPRETATION: Our data suggest that, for this care-accessing cohort, SARS-CoV-2 seroprevalence was significantly higher than cumulative total cases reported for the study geographical area six months into the COVID-19 pandemic in North Carolina. The increased odds of seropositivity by ethnoracial grouping as well as health insurance highlight the urgent and ongoing need to address underlying health and social disparities in these populations.

The CDC recommends selecting a threshold such that the test has 99·5% specificity [2]. We followed this recommendation here, specifying the cutoff to be the standard estimate of the 0·995 quantile (based on the quantile function in R) of the negative lab samples. Using the 274 negative controls, the cutoff was 2·57, with empirical sensitivity of 89·7% and empirical specificity of 99·3%. Therefore, a sample is considered positive if its average OD value is at least 2·57 times the average OD of the corresponding plate negative controls.

We fit a Bayesian autoregressive logistic model to estimate weekly prevalence while accounting for uncertainty in test sensitivity and specificity. Let $n_t$ give the number of samples in week $t$, and $y_t$ give the number of samples that [...]

Assuming seroprevalence varies smoothly, we define an AR(1) process for the $\pi_t$ as follows. First, let $\beta_t = \operatorname{logit}(\pi_t)$. Then we model $\beta_t$ as

$$\beta_t \sim \operatorname{normal}(\alpha + \phi\beta_{t-1}, \sigma_\beta^2), \quad t = 2, \ldots, T,$$

$$\beta_1 \sim \operatorname{normal}(\alpha, 0·5).$$

As we expect autocorrelation and we are on the logit scale, we expect $\sigma_\beta^2$ to be relatively small, so a relatively vague prior is assumed:

$$\sigma_\beta^2 \sim \operatorname{normal}^{+}(0, 0·5),$$

where $\operatorname{normal}^{+}$ indicates the folded normal distribution.
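As a rough illustration of the AR(1) prior just described, the following Python sketch draws one path of weekly prevalences by simulating $\beta_t$ on the logit scale and transforming back to $\pi_t$. It is not the authors' code. The function names, the values chosen for $\alpha$ and $\phi$, the 24-week horizon, and the reading of the second argument of normal(·, ·) as a variance rather than a standard deviation are all assumptions made for this example.

```python
import math
import random


def expit(x):
    """Inverse of the logit function: maps a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def draw_folded_normal(variance, rng):
    """Draw from a zero-centred folded normal, i.e. |N(0, variance)|."""
    return abs(rng.gauss(0.0, math.sqrt(variance)))


def simulate_prevalence_path(alpha, phi, sigma2_beta, n_weeks, rng, beta1_var=0.5):
    """Simulate pi_1..pi_T from the AR(1) prior on beta_t = logit(pi_t).

    Assumption: sigma2_beta and beta1_var are variances, so their square
    roots are passed to rng.gauss, which expects a standard deviation.
    """
    beta = rng.gauss(alpha, math.sqrt(beta1_var))  # beta_1 ~ normal(alpha, 0.5)
    path = [expit(beta)]
    for _ in range(2, n_weeks + 1):
        # beta_t ~ normal(alpha + phi * beta_{t-1}, sigma2_beta)
        beta = rng.gauss(alpha + phi * beta, math.sqrt(sigma2_beta))
        path.append(expit(beta))
    return path


if __name__ == "__main__":
    rng = random.Random(2020)
    sigma2 = draw_folded_normal(0.5, rng)  # sigma_beta^2 ~ normal+(0, 0.5)
    # alpha, phi and the 24-week horizon are hypothetical illustration values.
    path = simulate_prevalence_path(alpha=-0.3, phi=0.9, sigma2_beta=sigma2,
                                    n_weeks=24, rng=rng)
    print([round(p, 3) for p in path])
```

Repeating the simulation for several draws of $\sigma_\beta^2$ gives a feel for how strongly the prior ties neighbouring weeks together: small values produce smooth prevalence paths, larger values allow abrupt week-to-week jumps.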
We found changing the prior variance of $\sigma_\beta^2$ had minimal effect on the estimates and associated uncertainty of $\{\pi_t\}$. Similarly, we put vague priors on $\alpha$, $\phi$, sens, and spec:

$$\text{spec} \sim \operatorname{uniform}(0, 1).$$

Finally, to estimate sensitivity and specificity, we assume

$$y_{\text{spec}} \sim \operatorname{binomial}(n_{\text{spec}}, \text{spec}),$$

$$y_{\text{sens}} \sim \operatorname{binomial}(n_{\text{sens}}, \text{sens}),$$

where $y_{\text{spec}}$ is the number of negative controls that tested negative out of $n_{\text{spec}}$ negative controls. Similarly, $y_{\text{sens}}$ is the number of positive controls that tested positive out of $n_{\text{sens}}$ positive controls.

We fit a Bayesian logistic regression model with main effects for sex, race/ethnicity, age, in/out-patient status, and payor. Interactions were considered, but were not found to significantly improve the fit. This model allows us to simultaneously model the hospital data and the lab validation data.

To ensure each category in our main effects had a sufficient sample size, some categories were collapsed. All outpatient, emergency, or unknown patients were listed as "outpatient." Additionally, the "other" and "unknown" categories for payor were collapsed. Finally, the one patient with sex listed as "X" was removed from the dataset for this analysis.

We define the likelihood and present the average estimated odds ratio. As before, to estimate sensitivity and specificity, we assume

$$\text{sens} \sim \operatorname{uniform}(0, 1),$$

$$\text{spec} \sim \operatorname{uniform}(0, 1).$$

For results from the Bayesian models, we reported posterior means and equal-tail 95% credible intervals (i.e., the 2.5% and 97.5% quantiles of the posterior draws).
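The posterior summaries described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' analysis code: the function names are hypothetical, the draws are synthetic stand-ins for MCMC output, and the quantile rule (linear interpolation between order statistics, which is the default of the `quantile` function in R) is an assumption about how the quantiles were computed.

```python
import random
import statistics


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list (0 <= q <= 1)."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = q * (len(sorted_values) - 1)      # fractional 0-based position
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def summarize_posterior(draws, level=0.95):
    """Return (posterior mean, lower bound, upper bound) of an equal-tail interval."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0              # 0.025 in each tail for a 95% interval
    return (statistics.fmean(ordered),
            quantile(ordered, tail),
            quantile(ordered, 1.0 - tail))


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic draws standing in for posterior samples of a weekly prevalence.
    fake_draws = [rng.betavariate(30, 300) for _ in range(4000)]
    mean, lo, hi = summarize_posterior(fake_draws)
    print(f"{mean:.3f} ({lo:.3f}, {hi:.3f})")
```

An equal-tail interval leaves the same posterior probability in each tail, so for skewed quantities such as odds ratios it need not be symmetric around the posterior mean.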
In Table S1, we calculated standard 95% confidence intervals:

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Insurance status was determined from the most recent clinical encounter prior to the sampled blood draw. [...] "Unknown/Other" consists of individuals for whom the health insurance payor was left blank or otherwise unidentifiable, as well as listed insurance that read "Legal Liability / Liability Insurance" [...]

Race and ethnicity identity was ascertained from that listed in the EMR for each patient. The categories listed under [...] Native Hawaiian or other Pacific Islander [...] Hispanic or Latino", or were listed as "Patient Refused" or "Unknown". In our report, we collapse race and ethnicity from separate variables into a single variable in order to investigate the impact of systemic racism on SARS-CoV-2 seroprevalence by both race and ethnicity at the same time, though the constructs of race and ethnicity are inherently surrogate measures of racism and other forms of marginalization. We therefore binned individuals into the following groups: [...] Unknown" were binned as "Non-Latinx Black", similarly for "White or Caucasian" as "Non-Latinx White", similarly for all other groups as "Non-Latinx Other". We do not further separate out other intersections of race and ethnicity because the number of individuals becomes too small to make conclusive claims on odds of seropositivity. We here opt to use "Latinx" in place of "Hispanic", though it is not the only way to refer to this grouping of individuals that often share cultural characteristics, language, religion, and ancestral geography and history.
[7] We also compare racial, ethnic [...]

1. Statistical methods and tool for cut point analysis in immunogenicity assays
2. Interim Guidelines for COVID-19 Antibody Testing
3. Bayesian analysis of tests with unknown specificity and sensitivity
4. RStan: the R interface to Stan
5. Bayesian data analysis
6. Race, socioeconomic status, and health: Complexities, ongoing challenges, and research opportunities
7. Affective communities and millennial desires: Latinx, or why my computer won't