key: cord-0700485-r702bn03 authors: Peng, James; Liu, Jamin; Mann, Sabrina A; Mitchell, Anthea M; Laurie, Matthew T; Sunshine, Sara; Pilarowski, Genay; Ayscue, Patrick; Kistler, Amy; Vanaerschot, Manu; Li, Lucy M; McGeever, Aaron; Chow, Eric D; Marquez, Carina; Nakamura, Robert; Rubio, Luis; Chamie, Gabriel; Jones, Diane; Jacobo, Jon; Rojas, Susana; Rojas, Susy; Tulier-Laiwa, Valerie; Black, Douglas; Martinez, Jackie; Naso, Jamie; Schwab, Joshua; Petersen, Maya; Havlir, Diane; DeRisi, Joseph title: Estimation of secondary household attack rates for emergent spike L452R SARS-CoV-2 variants detected by genomic surveillance at a community-based testing site in San Francisco date: 2021-03-31 journal: Clin Infect Dis DOI: 10.1093/cid/ciab283 sha: e284d6c1556979454e1e785d52342c4a882e5498 doc_id: 700485 cord_uid: r702bn03 BACKGROUND: Sequencing of the SARS-CoV-2 viral genome from patient samples is an important epidemiological tool for monitoring and responding to the pandemic, including the emergence of new mutations in specific communities. METHODS: SARS-CoV-2 genomic sequences were generated from positive samples collected, along with epidemiological metadata, at a walk-up, rapid testing site in the Mission District of San Francisco, California during November 22-December 1, 2020 and January 10-29, 2021. Secondary household attack rates and mean sample viral load were estimated and compared across observed variants. RESULTS: A total of 12,124 tests were performed yielding 1,099 positives. From these, 928 high quality genomes were generated. Certain viral lineages bearing spike mutations, defined in part by L452R, S13I, and W152C, comprised 54.4% of the total sequences from January, compared to 15.7% in November. Household contacts exposed to the “California” or “West Coast” variants (B.1.427 and B.1.429) were at higher risk of infection compared to household contacts exposed to lineages lacking these variants (0.36 vs 0.29, RR=1.28; 95% CI: 1.00-1.64).
The reproductive number was estimated to be modestly higher than other lineages spreading in California during the second half of 2020. Viral loads were similar among persons infected with West Coast versus non-West Coast strains, as was the proportion of individuals with symptoms (60.9% vs 64.3%). CONCLUSIONS: The increase in prevalence, relative household attack rates, and reproductive number are consistent with a modest transmissibility increase of the West Coast variants. Genomic surveillance during the SARS-CoV-2 pandemic is a critical source of situational intelligence for epidemiological control measures, including outbreak investigations and detection of emergent variants [1]. Countries with robust, unified public health systems and systematic genomic surveillance have been able to rapidly detect SARS-CoV-2 variants with increased transmission characteristics, and mutations that potentially subvert both naturally acquired and vaccination-based immunity (e.g. COVID-19 Genomics UK Consortium). Examples include the rapidly spreading B.1.1.7 lineage documented in the UK and the B.1.351 lineage described from South Africa, or the P.1/P.2 lineages that harbor the spike E484K mutation, which is associated with reduced neutralization in laboratory experiments [2] [3] [4] [5]. In the US, genomic surveillance is sparse relative to the number of confirmed cases (27.8 million as of Feb 20, 2021), with 123,672 genomes deposited in the GISAID database, representing only 0.4% of the total reported cases. Despite the low rates of US genomic surveillance, independent local programs and efforts have contributed to our understanding of variant emergence and spread [6] [7] [8]. The appearance of new nonsynonymous mutations highlights the utility of this approach in the US [9].
Genomic sequencing of SARS-CoV-2 in California has predominantly been conducted by academic researchers and non-profit biomedical research institutions (for example, the Chan Zuckerberg Biohub and the Andersen Lab at the Scripps Research Institute) in conjunction with state and local public health partners. These efforts identified an apparent increase in the prevalence of lineages B.1.427 and B.1.429 ("California" or "West Coast" variant), which share S gene nonsynonymous mutations at sites 13, 152, 452, and 614, during December 2020 to February 2021 when California was experiencing the largest peak of cases observed during the pandemic. While the cluster of mutations was first observed in a sample from May 2020, these variants rose from representing <1% of the consensus genomes recovered from California samples collected in October 2020 (5/546; 0.91%) to over 50% of those collected during January 2021 (2, Over November 22-December 1, 2020 and January 10-29, 2021, BinaxNOW™ rapid antigen tests were performed at the 24th & Mission BART (public transit) station in the Mission District of San Francisco, a setting of ongoing community transmission, predominantly among Latinx persons [10, 11]. Tests for SARS-CoV-2 were performed free of charge on a walk-up, no-appointment basis, including persons > 1 year of age and regardless of symptoms, through "Unidos en Salud", an academic, community (Latino Task Force) and city partnership. Certified lab assistants collected 2 bilateral anterior nasal swabs. The first was tested with BinaxNOW™, immediately followed by a separate bilateral swab for SARS-CoV-2 genomic sequencing [11, 12]. Results were reported to participants within 2 hours, and all persons in a household (regardless of symptom status) corresponding to a positive BinaxNOW case were offered BinaxNOW testing.
All persons testing BinaxNOW positive were offered participation in a longitudinal Community Wellness Team support program [13, 14]. SARS-CoV-2 genomes were recovered using ARTIC Network V3 primers [15] and sequenced on an Illumina NovaSeq platform. Consensus genomes generated from the resulting raw .fastq files using IDseq [16] were used for subsequent analysis. Full details are included in Supplementary Methods. Households (n=328) tested in January and meeting the following inclusion criteria were eligible for secondary attack rate analyses: 1) ≥1 adult (aged ≥18 years) with a positive BinaxNOW result; 2) ≥1 case in household sequenced; and, 3) ≥2 persons tested with BinaxNOW during the study period. Households in which sequences represented both West Coast and non-West Coast variants were excluded (n=9). The index case was defined as the first adult to test positive. Crude household attack rates, stratified by variant classification, were calculated as i) the proportion of positive BinaxNOW results among tested household contacts; and, ii) the mean of the household-specific secondary attack rate, with 95% CI based on cluster-level bootstrap. Generalized estimating equations were used to fit Poisson regressions, with cluster-robust standard errors and an exchangeable working covariance matrix. Because symptoms and disease severity may be affected by strain, these factors were not included in the a priori adjustment set. We evaluated for overdispersion [17], and conducted sensitivity analyses using targeted maximum likelihood estimation (TMLE) combined with Super Learning to relax parametric model assumptions; influence curve-based standard error estimates used household as the unit of independence [18]. We compared the growth rates of B. The UCSF Committee on Human Research determined that the study met criteria for public health surveillance. All participants provided informed consent for dual testing.
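The two crude attack rate definitions and the cluster-level bootstrap described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the `households` data are hypothetical toy values, the function names are invented for this example, and the percentile interval is one common bootstrap choice that the text does not confirm was the one used.

```python
import random
from statistics import mean

# Hypothetical toy data (NOT study data): one (positives, tested) pair per
# household, counting household contacts only (index case excluded).
households = [(1, 3), (0, 2), (2, 2), (0, 1), (1, 4), (3, 5)]

def pooled_attack_rate(hh):
    """Definition (i): positive contacts / tested contacts, pooled over households."""
    return sum(p for p, _ in hh) / sum(n for _, n in hh)

def mean_household_attack_rate(hh):
    """Definition (ii): mean of the household-specific attack rates."""
    return mean(p / n for p, n in hh)

def cluster_bootstrap_ci(hh, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI obtained by resampling whole households with replacement.

    Resampling households (clusters) rather than individuals keeps the
    within-household correlation of outcomes intact in every replicate.
    """
    rng = random.Random(seed)
    draws = sorted(stat(rng.choices(hh, k=len(hh))) for _ in range(n_boot))
    lo = draws[round((alpha / 2) * n_boot)]
    hi = draws[round((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

for name, stat in [("pooled", pooled_attack_rate),
                   ("household mean", mean_household_attack_rate)]:
    lo, hi = cluster_bootstrap_ci(households, stat)
    print(f"{name}: {stat(households):.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

The two definitions weight households differently: the pooled proportion gives more weight to large households, while the mean of household-specific rates weights every household equally, so the two estimates need not agree.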
From November 22 to December 1, 2020, 3,302 rapid direct antigen tests were performed on 3,122 unique individuals; sample characteristics from this testing have been previously described [11]. From January 10-29, using identical methods, 8,822 rapid direct antigen tests were performed on 7,696 unique individuals, representing 5,239 households; household attack rate analyses were restricted to January samples, described here (Supplementary Table 1 Table 2, sequences deposited in GISAID). These 986 samples, together with an additional 191 SARS-CoV-2 genome sequences generated from the same testing site during the period of November 22-December 1, 2020 [11, 19], had adequate coverage of the full genome or spike protein for further analysis based on S gene sequence (Supplementary Table 3). Classification as either a West Coast variant or a non-West Coast variant was determined for 846 of all samples sequenced. [20], full length sequences were distributed among the major clades (Supplementary Figure 1). Notably, mutations at spike position 501 were not observed, and thus no instances of the B.1.1.7 strain or any other strain bearing the N501Y mutation were detected in any sample during this period in January 2021. A single individual was found to have been infected with the P.2 strain, which carries the spike E484K mutation and was described in Brazil from a re-infection case [5]. This mutation has been associated with decreased neutralization in laboratory experiments [2, 4]. We observed SARS-CoV-2 genome sequences that belonged to PANGO lineages B. This increase in frequency is consistent with an expansion of viruses more broadly in California carrying these same mutations [21]. Additional non-synonymous mutations were observed throughout the genome, including 108 unique non-synonymous mutations in the spike gene, several within functionally significant regions of the protein (Figure 2C, Supplementary Table 3).
Twelve unique mutations were observed in the receptor binding domain, most of which have yet to be investigated for functional effects. Additionally, 8 unique mutations were found adjacent to the polybasic furin cleavage site at the S1/S2 junction, which is reported to have a potential role in determination of virulence and host cell tropism [22] [23] [24] [25]. Moderately prevalent mutations were observed at spike position 681 (P681H, n=34 and P681R, n=1), which is within the furin recognition site, and at spike position 677, where two different amino acid substitutions were observed in this cohort (Q677H, n=22 and Q677P, n=11). Multiple mutations at both of these sites have been previously observed [9]. The SARS-CoV-2 RT-PCR cycle thresholds (Ct) for nasal swab samples from which whole (Table 2). Secondary cases were identified a median of 1 day after index cases (IQR 0-4). Based on unadjusted Poisson regression with cluster-robust standard errors, household contacts exposed to the West Coast variant had an estimated 28% higher risk of secondary infection, compared to household contacts exposed to a non-West Coast variant (RR: 1.28, 95% CI: 1.00-1.64). Relative attack rates were generally similar when stratified by household characteristics and by the characteristics of secondary contacts (Table 3); secondary attack rates among children aged <12 years were 51.9% (41/79) and 39.7% (31/78) when exposed to West Coast and non-West Coast strains, respectively. Sensitivity analyses in which parametric assumptions were relaxed using TMLE and Super Learning yielded similar estimates (Supplementary Table 5). Using Bayesian phylogenetic analysis, we estimated the reproductive number to be 1. We monitored SARS-CoV-2 viral variants by genomic sequencing and integration of metadata from households at a community-based "test-and-respond" program.
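The secondary attack rates reported above for children aged <12 years can be reproduced from the stated counts. The sketch below is simple arithmetic on those counts; the variable names are illustrative, and the crude ratio it prints ignores household clustering, so it is not the model-based relative risk estimated in the paper.

```python
# Counts reported in the text for household contacts aged <12 years.
wc_pos, wc_tested = 41, 79      # exposed to West Coast variants
non_pos, non_tested = 31, 78    # exposed to non-West Coast variants

def attack_rate(positives, tested):
    """Crude secondary attack rate: positive contacts / tested contacts."""
    return positives / tested

sar_wc = attack_rate(wc_pos, wc_tested)
sar_non = attack_rate(non_pos, non_tested)

# Crude ratio of the two attack rates (no adjustment, no clustering).
crude_ratio = sar_wc / sar_non

print(f"{sar_wc:.1%} vs {sar_non:.1%}, crude ratio {crude_ratio:.2f}")
# prints "51.9% vs 39.7%, crude ratio 1.31"
```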
We found that the West Coast variants (PANGO lineages B.1.427 and B.1.429) increased in prevalence relative to wild type from November to January in the San Francisco Bay Area among persons tested in the same community-based location. These data extend and confirm prior observations from convenience, outbreak, and clinical samples reporting apparent increases in relative prevalence of the West Coast variants [21]. Household secondary attack rates of the West Coast variants were modestly higher than for non-West Coast variants, suggesting the potential for increased transmissibility. The West Coast variants comprise two closely related lineages (B.1.427 and B.1.429) that share identical sets of mutations in the spike protein, but differ by additional synonymous and non-synonymous mutations in other genes. While the frequency of both lineages increased in this study and in California more widely [21], and the estimated increase in risk of secondary household infection relative to non-West Coast variants was fairly consistent across lineages, the point estimate was somewhat higher for B.1.429. Although moderate compared to increased transmissibility of other previously identified variants, even small increases in transmissibility could contribute to a substantial increase in cases, particularly in the context of reproductive numbers just below one. While this finding may be due to chance, future work should continue to monitor individual lineages. The household attack rate observed here was higher than that reported in a recent global meta-analysis [26], even for the non-West Coast variants. It was similar to, or lower than, attack rates reported in other US settings. Prior US reports, however, were based on substantially smaller sample sizes. Our findings that the West Coast variants increased in relative prevalence and had higher household secondary attack rates potentially suggest higher transmissibility.
However, the West Coast variant has been detected in multiple locations, and has been detected since May 2020 in California without relative expansion until the peak associated with the holiday season of November-January. Using Bayesian phylogenetic analysis, the estimated reproductive number for both West Coast lineages was found to be modestly higher than other circulating lineages. We found no significant differences in viral load (using Ct) between West Coast and non- carries the E484K mutation [2], was detected in this study. Surprisingly, this case did not have a travel history, highlighting the risk of cryptic transmission. In addition to the mutations associated with spike L452R in the West Coast variants, we observed, at lower frequencies, other mutations of interest, such as those at spike positions 677 and 681, both of which have been reported previously on their own [9]. This study has several limitations. First, testing was conducted at a walk-up testing site, and thus these are inherently convenience samples; however, this would not be expected to impose a differential selection bias for those with or without any particular variant. Second, clear classification of the index case was not always possible, particularly when multiple adults from a household tested positive on the same date; further, secondary household attack rate calculations do not account for potential external sources of infection other than the index case. However, the relative risk of secondary infection from household exposure to West Coast versus non-West Coast variants was similar among children, a group less likely to have been misclassified as non-index or to be exposed to external infection.
Third, household testing coverage was incomplete and in some cases consisted of only a single follow-up test; this might contribute to an under- (or over-) estimate of secondary attack rate, and while we again have no reason to suspect differential ascertainment by strain, this could bias estimates of relative risk. The occurrence of variants in SARS-CoV-2 was always expected; however, it is often difficult to understand the clinical and epidemiological importance of any given single or set of co-occurring mutations. While further epidemiological and laboratory experiments will be required to fully understand the community impact and mechanistic underpinnings of each variant, it is clear that enhanced genomic surveillance paired with community engagement, testing, and response capacity is an important tool in the arsenal against this pandemic.

Public health actions to control new SARS-CoV-2 variants
Landscape analysis of escape variants identifies SARS-CoV-2 spike mutations that attenuate monoclonal and serum antibody neutralization
Genomic characterization of a novel SARS-CoV-2 lineage from Rio de Janeiro
Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants
E484K as an innovative phylogenetic event for viral evolution: Genomic analysis of the E484K spike mutation in SARS-CoV-2 lineages from Brazil
Emergence of an early SARS-CoV-2 epidemic in the United States
Coast-to-Coast Spread of SARS-CoV-2 during the Early Epidemic in the United States
Real-time public health communication of local SARS-CoV-2 genomic epidemiology
Emergence in late 2020 of multiple lineages of SARS-CoV-2 Spike protein variants affecting amino acid position 677
Field performance and public health response using the BinaxNOW™ Rapid SARS-CoV-2 antigen detection assay during community-based testing
Performance Characteristics of a Rapid Severe Acute Respiratory Syndrome Coronavirus 2 Antigen Detection Assay at a Public Plaza Testing Site in San Francisco
The COVID-19 Symptom to Isolation Cascade in a Latinx Community: A Call to Action
Evaluation of a novel community-based COVID-19 'Test-to-Care' model for low-income populations
IDseq-An open source cloud-based
Data analysis using regression and multilevel/hierarchical models. Cambridge
Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies
disease and diplomacy: GISAID's innovative contribution to global health
SARS-CoV-2 Community Transmission disproportionately affects Latinx population during Shelter-in-Place in San Francisco
Emergence of a novel SARS-CoV-2 strain in Southern California
A Multibasic Cleavage Site in the Spike Protein of SARS-CoV-2 Is Essential for Infection of Human Lung Cells
The sequence at Spike S1/S2 site enables cleavage by furin and phospho-regulation in SARS-CoV2 but not in SARS-CoV1 or
Loss of furin cleavage site attenuates SARS-CoV-2 pathogenesis
Proteolytic Cleavage of the SARS-CoV-2 Spike Protein and the Role of the Novel S1/S2 Site