key: cord-0902454-ia9zg8o8
authors: Takahashi, Saki; Greenhouse, Bryan; Rodríguez-Barraquer, Isabel
title: Are SARS-CoV-2 seroprevalence estimates biased?
date: 2020-08-28
journal: J Infect Dis
DOI: 10.1093/infdis/jiaa523
sha: a7cc214b674ae0ff19a557ccb2dff30cd4782dab
doc_id: 902454
cord_uid: ia9zg8o8

Serosurveys are needed to understand how many people have been infected by SARS-CoV-2 and where we are in the epidemic curve. Unprecedented serosurveillance efforts have been launched (e.g., Solidarity II, NIH) to generate local and global infection estimates to guide social distancing policies. Recently published results suggest that the proportion of the population that has been infected, even in places with explosive outbreaks such as Spain or New York City, is low and far from the levels required for herd immunity [1-3].

Multiple serological assays and rapid tests are now available and, as of July 16th, the Food and Drug Administration has authorized the use of twenty-nine, which report a range of test performance characteristics (i.e., sensitivity and specificity, as well as positive and negative predictive values) [4]. Assay validation requires samples from individuals with known infection status in order to determine test performance characteristics. Due to potential cross-reactivity of antibody responses to seasonal coronaviruses, much of the focus of assay development has been on ensuring near-perfect specificity, to minimize the risk of false positive results. This is particularly important during early stages of the epidemic, when the number of true positives is expected to be very low. However, if the purpose of deploying a serological assay is to quantify the proportion of the population that has been infected by SARS-CoV-2 (i.e., serosurveillance), adequate characterization of assay sensitivity to detect prior infection in the general population is important as well.
We raise this issue because in the absence of such characterization, it will not be possible to generate accurate estimates of population-level exposure to this novel pathogen [5].

A growing body of evidence suggests that asymptomatic and mild SARS-CoV-2 infections, together making up over 95% of all infections, may be associated with lower antibody titers than more severe infections [6-16]. Similarly, it is known that antibody levels peak a few weeks after infection and then decay gradually [17-19]. Yet positive controls used for assay optimization and validation are usually limited to early convalescence samples from hospitalized patients with severe disease, leading to what is commonly known as spectrum bias. Sensitivities estimated from these sample sets may therefore overestimate the actual sensitivity that the assay would have when applied to the general population, leading to underestimates of the true seroprevalence.

To illustrate this point, we quantified the amount of bias in estimating population seroprevalence potentially introduced by the choice of positive controls used to evaluate assay sensitivity. Because sensitivity has yet to be quantified for most assays, we assumed that sensitivity was highest for severe infections and considered a range of values for asymptomatic and mild infections. Similarly, we assumed that sensitivity peaked early after infection and then decayed over time (Figure 2C). For simplicity, we fixed test specificity at 100%. Assays with imperfect sensitivity lead to underestimates of the true seroprevalence (Figure 1B, grey line), but this is easily corrected if the actual sensitivity of the assay in the sampled population is known (Figure 1B, purple line) [20].
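The correction for an imperfect test can be sketched in a few lines of Python. This is an illustration only, not the authors' procedure (which is described in their Extended Methods): it uses the standard Rogan-Gladen form of the correction, the function names are hypothetical, and the 20% true prevalence is an assumed value chosen for the example. The severity mix (43% asymptomatic, 52% mild, 5% severe) and the sensitivities (40%, 60%, 95%) are the ones given for Figure 1B in the figure legend, with specificity fixed at 100% as in the text.

```python
def observed_prevalence(true_prev, sensitivity, specificity=1.0):
    """Expected fraction of the population testing positive."""
    true_pos = true_prev * sensitivity
    false_pos = (1.0 - true_prev) * (1.0 - specificity)
    return true_pos + false_pos


def corrected_prevalence(observed_prev, sensitivity, specificity=1.0):
    """Rogan-Gladen style correction of a raw seropositive fraction."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (observed_prev + specificity - 1.0) / denom
    return min(max(estimate, 0.0), 1.0)  # keep the estimate in [0, 1]


# Values quoted for Figure 1B in the figure legend.
severity_mix = {"asymptomatic": 0.43, "mild": 0.52, "severe": 0.05}
sensitivity_by_severity = {"asymptomatic": 0.40, "mild": 0.60, "severe": 0.95}

# Sensitivity the assay actually has in the general population:
# a weighted average over the severity mix.
population_se = sum(
    severity_mix[k] * sensitivity_by_severity[k] for k in severity_mix
)

true_prev = 0.20  # ASSUMPTION: illustrative true prevalence, not from the paper

raw = observed_prevalence(true_prev, population_se)

# Correction with the population-level sensitivity recovers the truth.
well_corrected = corrected_prevalence(raw, population_se)

# Correction with a sensitivity validated only on severe cases does not.
biased_corrected = corrected_prevalence(raw, sensitivity_by_severity["severe"])

print(f"population sensitivity:        {population_se:.4f}")
print(f"raw seropositive fraction:     {raw:.4f}")
print(f"corrected (population Se):     {well_corrected:.4f}")
print(f"corrected (severe-only Se):    {biased_corrected:.4f}")
```

Under these inputs the population-level sensitivity is about 53%, the raw seropositive fraction is about 10.6%, and correcting with the severe-only sensitivity of 95% yields roughly 11.2% rather than the assumed 20%. The correction only recovers the true prevalence when the sensitivity it uses matches the population being sampled.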
However, if test sensitivity has been determined from positive control sets skewed towards those with severe clinical outcomes (high antibody levels), the estimated prevalence, even after correction, will still underestimate the true prevalence (Figure 1B, cyan and gold lines). The magnitude of the underestimate will depend on how biased the distribution of positive controls is relative to the population, and on how much assay sensitivity varies with disease severity (Figure 1C). Similarly, corrected estimates of prevalence will only equal the true prevalence if decreases in sensitivity due to waning antibody responses over time can be accounted for (Figure 2C). If spectrum bias from both clinical outcomes and time since infection is present, the true prevalence will be underestimated to a greater degree.

Figure 2. ... includes only recent samples taken within 0-60 days after infection. We assume for these simulations that the validation controls, and all infections, represent severe infections with a baseline test sensitivity of 95%. (C) For a range of true prevalence values (x-axis), we calculated the prevalence values that would be estimated in the population at days 60, 180, and 300 (y-axis), assuming that sensitivity (Se) is reduced to 80% of the baseline level in infections that are 60-180 days old and to 60% of the baseline level in infections that are 180+ days old, with a specificity of 100%, and correcting for test characteristics using the positive control set from the last bar in Figure 2B. We used the procedure described in the Extended Methods section to calculate the estimated prevalences. We also considered prevalence values that would be estimated in the population if the two sources of spectrum bias (both clinical outcomes and time since infection) are present.
For this simulation (filled squares), we calculated the prevalence that would be estimated at day 180, now assuming that the distribution of true clinical outcomes and their test sensitivities is equal to that in Figure 1B (i.e., in terms of severity: 43% asymptomatic, 52% mild, 5% severe, and in terms of sensitivity: 95% in severe, 60% in mild, and 40% in asymptomatic), and that the positive controls are all severe, recent infections.

[1] Estudio ENE-COVID19: Primera Ronda
[2] Repeated seroprevalence of anti-SARS-CoV-2 IgG antibodies in a population
[3] Amid Ongoing COVID-19 Pandemic, Governor Cuomo Announces Results of Completed Antibody Testing Study of 15,000 People Showing 12.3 Percent of Population Has
[4] Food and Drug Administration
[5] Serology assays to manage COVID-19
[6] Test performance evaluation of SARS-CoV-2 serological assays
[7] SARS-CoV-2 specific antibody responses in COVID-19 patients
[8] Neutralizing antibody responses to SARS-CoV-2 in a COVID-19 recovered patient cohort and their implications
[9] Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand
[10] Systemic and mucosal antibody secretion specific to SARS-CoV-2 during mild versus severe COVID-19
[11] Antibody profiling of COVID-19 patients in an urban low-incidence region in Northern Germany
[12] assays for COVID-19 epidemiological screening: our experience [Internet]. Pathology. medRxiv; 2020
[13] Clinical and immunological assessment of asymptomatic SARS-CoV-2 infections
[14] Antibody responses to SARS-CoV-2 in patients with COVID-19
[15] Magnitude and kinetics of anti-SARS-CoV-2 antibody responses and their relationship to disease severity
[16] Antibody dynamics to SARS-CoV-2 in Asymptomatic and Mild COVID-19 patients
[17] Interpreting Diagnostic Tests for SARS-CoV-2
[18] Serological signatures of SARS-CoV-2 infection: Implications for antibody-based diagnostics
[19] Dynamics and significance of the antibody response to SARS-CoV-2 infection
[20] Estimating Prevalence Using an Imperfect Test