key: cord-0901311-r8naytbo authors: Voss, William N.; Hou, Yixuan J.; Johnson, Nicole V.; Delidakis, George; Kim, Jin Eyun; Javanmardi, Kamyab; Horton, Andrew P.; Bartzoka, Foteini; Paresi, Chelsea J.; Tanno, Yuri; Chou, Chia-Wei; Abbasi, Shawn A.; Pickens, Whitney; George, Katia; Boutz, Daniel R.; Towers, Dalton M.; McDaniel, Jonathan R.; Billick, Daniel; Goike, Jule; Rowe, Lori; Batra, Dhwani; Pohl, Jan; Lee, Justin; Gangappa, Shivaprakash; Sambhara, Suryaprakash; Gadush, Michelle; Wang, Nianshuang; Person, Maria D.; Iverson, Brent L.; Gollihar, Jimmy D.; Dye, John; Herbert, Andrew; Finkelstein, Ilya J.; Baric, Ralph S.; McLellan, Jason S.; Georgiou, George; Lavinder, Jason J.; Ippolito, Gregory C. title: Prevalent, protective, and convergent IgG recognition of SARS-CoV-2 non-RBD spike epitopes date: 2021-05-04 journal: Science DOI: 10.1126/science.abg5268 sha: 3f9fd96b6fbc0603b9b719f561627e3dd579b509 doc_id: 901311 cord_uid: r8naytbo The molecular composition and binding epitopes of the immunoglobulin G (IgG) antibodies that circulate in blood plasma following SARS-CoV-2 infection are unknown. Proteomic deconvolution of the IgG repertoire to the spike glycoprotein in convalescent subjects revealed that the response is directed predominantly (>80%) against epitopes residing outside the receptor-binding domain (RBD). In one subject, just four IgG lineages accounted for 93.5% of the response, including an N-terminal domain (NTD)-directed antibody that was protective against lethal viral challenge. Genetic, structural, and functional characterization of a multi-donor class of “public” antibodies revealed an NTD epitope that is recurrently mutated among emerging SARS-CoV-2 variants of concern. These data show that “public” NTD-directed and other non-RBD plasma antibodies are prevalent and have implications for SARS-CoV-2 protection and antibody escape. 
The SARS-CoV-2 spike ectodomain (S-ECD) folds into a multidomain architecture (1, 2) and includes the RBD, which is essential for viral infectivity, and the structurally adjacent NTD, which plays an uncertain role. Humoral immunity to the spike (S) surface glycoprotein can correlate with protection (3), and it is the primary antigenic target for most vaccines and monoclonal antibodies (mAbs). That the B cell repertoire can recognize multiple spike epitopes is supported by extensive single-cell cloning campaigns (4–9). However, the identity, abundance, and clonality of the IgG plasma antibody repertoire and the epitopes it may target are not known (10–12). Divergence between the two repertoires is biologically plausible (13–17), and the evidence in COVID-19 includes a paradoxical disconnect between virus-neutralizing IgG titers and RBD-specific B cell immunity (6, 11, 18, 19). To analyze the IgG repertoire, blood was collected during early convalescence from four seroconverted study subjects (P1-P4) who experienced mild COVID-19 disease that manifested with plasma virus-neutralization titers in the lowest quartile (P1 and P3), the second highest quartile (P2), or the highest quartile (P4) compared to a larger cohort (table S1 and fig. S1). The lineage composition and relative abundance of constituent IgG antibodies comprising the plasma response to either intact stabilized S-ECD [S-2P (1)] or RBD were determined using the Ig-Seq pipeline (13, 14, 20), which integrates analytical proteomics of affinity-purified IgG fractions with peripheral B cell antibody variable region repertoires (BCR-Seq). IgG lineages detected by Ig-Seq in the S-ECD fraction but absent from the RBD fraction were deemed to be reactive with spike epitopes outside the RBD. In subject P3, we detected six IgG lineages that bound to S-ECD (Fig. 1A).
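The differential rule described above (a lineage detected in the S-ECD fraction but absent from the RBD fraction is scored as non-RBD) can be sketched as a simple set comparison. This is a minimal illustration and not the Ig-Seq pipeline itself: the function name, the lineage labels, and the abundance values are hypothetical placeholders, not data from this study.

```python
def classify_lineages(s_ecd, rbd):
    """Split S-ECD-binding lineages into non-RBD and RBD groups.

    s_ecd: dict mapping lineage name -> relative abundance (%) in the
           S-ECD affinity-purified fraction.
    rbd:   set of lineage names also detected in the RBD fraction.
    Returns (non_rbd, rbd_hits, percent of S-ECD abundance that is non-RBD).
    """
    total = sum(s_ecd.values())
    non_rbd = {name: ab for name, ab in s_ecd.items() if name not in rbd}
    rbd_hits = {name: ab for name, ab in s_ecd.items() if name in rbd}
    non_rbd_percent = 100.0 * sum(non_rbd.values()) / total
    return non_rbd, rbd_hits, non_rbd_percent


# Hypothetical example values (NOT measurements from the paper).
s_ecd_fraction = {"LinA": 50.0, "LinB": 30.0, "LinC": 15.0, "LinD": 5.0}
rbd_fraction = {"LinC"}

non_rbd, rbd_hits, pct = classify_lineages(s_ecd_fraction, rbd_fraction)
print(sorted(non_rbd), sorted(rbd_hits), round(pct, 1))
```

With these placeholder inputs, three of the four lineages are scored as non-RBD. The real analysis additionally depends on peptide-level identification and abundance quantification that this sketch does not attempt to model.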
Four of these (Lin.1 to Lin.4) accounted for 93.5% abundance of the total plasma IgG S-ECD response and exhibited extensive intralineage diversity (fig. S2) indicative of clonal expansion and selection. Notably, the top three lineages (Lin.1 to Lin.3; >85% abundance) all bound to non-RBD epitopes (S2 subunit or NTD). Bulk serology ELISAs recapitulated the Ig-Seq result and demonstrated similarly high levels of non-RBD-binding IgG (P>0.05) (Fig. 1B), confirming that RBD-binding plasma antibodies comprise only a minor proportion of all spike-binding IgG in naturally infected individuals (21). In all four subjects, the detected plasma IgG repertoire to S-ECD was oligoclonal, comprising only 6-22 lineages, with the top-ranked lineage comprising 15 to 50% total abundance. On average, 84% of the anti-S-ECD plasma IgG repertoire bound to epitopes outside the RBD (Fig.
1C), a finding consistent with data from single B cell analyses (22), and the most abundant plasma IgG lineage in all donors recognized a non-RBD epitope (Figs. 1A and 2A and fig. S3). Binding analysis of P3 mAbs CM29-CM32, representing the most expanded clones within each of lineages Lin.1 to Lin.4, showed that CM29 (Lin.1) recognized the S2 subunit (KD = 6.6 nM), CM30 and CM31 (Lin.2 and Lin.3, with KD = 0.8 and 37.7 nM, respectively) were specific for the NTD, and CM32 (Lin.4) bound the RBD (KD = 6.0 nM), as expected from the Ig-Seq differential affinity purifications (Fig. 1A and table S2). CM30 potently neutralized authentic SARS-CoV-2 in vitro (IC50 = 0.83 μg/ml), CM32 was slightly less potent (2.1 μg/ml), whereas CM29 and CM31 showed minimal neutralization activity (Fig. 1D). We then determined the capacity of mAbs CM29-CM32, singly and in combination, to confer prophylactic protection in vivo against virus challenge using the MA10 mouse model of SARS-CoV-2 infection (23, 24). Even though the RBD-directed mAb CM32 could neutralize authentic virus in vitro and had relatively high antibody-dependent cellular phagocytosis (ADCP) activity (fig. S4), it did not protect in vivo (fig. S5), possibly due to amino acid changes in the MA10 virus. Similarly, no protection was observed for the non-neutralizing S2-directed mAb CM29 or the non-neutralizing NTD-directed mAb CM31. The neutralizing mAb CM30, derived from the top-ranking NTD-targeting IgG lineage (21% abundance), was the sole plasma antibody that conferred complete protection against MA10 viral challenge (Fig. 1, E and F, and fig. S5). Interestingly, administration of a cocktail comprising the top non-RBD plasma mAbs CM29-CM31 (>85% of the IgG plasma lineages to S-ECD; Fig. 1A) showed the most robust protection and lung viral titers below the limit of detection (LOD) in high viral load challenge (10⁴ PFU). Subject P2, with ~10-fold higher neutralizing titer compared to subject P3 (fig.
S1 and table S1), displayed a more polyclonal IgG response (Fig. 2A), with 12/15 lineages (>80% total abundance) in the anti-S-ECD repertoire recognizing non-RBD epitopes. Conspicuously, as with P3, the most abundant S-ECD-directed plasma antibodies target the S2 subunit, with the four topmost lineages (68% total abundance) binding to S2. MAbs CM25 and CM17, representative of two NTD-targeting lineages each comprising ~2.5% of the response at day 56 (Ig-Seq Lin.6 and Lin.9) (Fig. 2A), were both encoded by unmutated or near-germline IGHV1-24. We found an additional NTD-targeting unmutated IGHV1-24 plasma mAb (CM58) in subject P4. CM17, CM25, and CM58 bound S-ECD with similar single-digit nM affinity (Fig. 2B and table S2), and all three potently neutralized SARS-CoV-2 virus, with IC50 values of 0.01-0.81 μg/ml, comparable to the S309 anti-RBD control (25) (Fig. 2C, fig. S6, and table S2). For all three mAbs, pre-administration in the MA10 mouse model resulted in significantly reduced lung viral titers post-infection with 10⁵ PFU (Fig. 2D; P<0.001), resulting in 100% survival, compared to just 40% in the control group (Fig. 2E). CM17- and CM25-treated cohorts exhibited only minimal weight loss (Fig. 2F). Thus, IGHV1-24 is intrinsically suited for potent and protective targeting of the NTD. B cell expression of IGHV1-24 in COVID-19 (~5 to 8%) (5, 7, 26) is ~10-fold higher than in healthy individuals (0.4 to 0.8%) (27). Moreover, we could detect IGHV1-24 plasma antibodies only in S-ECD fractions (mean 3.7%), but not among anti-RBD IgGs (Fig. 3, A and B). Alignment of CM17, CM25, and CM58 with four neutralizing IGHV1-24 anti-NTD mAbs cloned from peripheral B cells [4A8 (4), 1-68 (5), 1-87 (5), COVA2-37 (7)] and an additional antibody [COV2-2199 (8)] identified a class of convergent VH immune receptor sequences (Fig. 3C).
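The ~10-fold figure for IGHV1-24 usage quoted above follows from the two percentage ranges given in the text. The short sketch below makes the arithmetic explicit; the function name is hypothetical, and treating the ranges as simple bounds is an assumption made here for illustration only.

```python
def fold_range(case_low, case_high, ref_low, ref_high):
    """Return the (smallest, largest) fold difference between two ranges."""
    return case_low / ref_high, case_high / ref_low


# Percentages quoted in the text: ~5 to 8% of B cells in COVID-19,
# 0.4 to 0.8% in healthy individuals.
lo, hi = fold_range(5.0, 8.0, 0.4, 0.8)

# Ratio of the range midpoints, as one rough point estimate.
midpoint_fold = ((5.0 + 8.0) / 2) / ((0.4 + 0.8) / 2)

print(round(lo, 2), round(hi, 2), round(midpoint_fold, 1))
```

The bounds span roughly 6- to 20-fold, and the ratio of the midpoints is close to 11, which is consistent with the approximate 10-fold enrichment stated in the text.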
In all cases, three glutamate (Glu) residues (Glu36, Glu59, and Glu80) located in complementarity-determining region (CDR)-H1, CDR-H2, and framework H3 (FWR-H3), respectively, as well as a phenylalanine (Phe) residue (Phe56) in CDR-H2, were invariably unmutated and are unique to the electronegative IGHV1-24 (pI=4.6). The convergent VH genes paired promiscuously with six distinct light-chain VL genes, yet CDR-H3 peptide lengths were restricted (14 or 21 amino acids) (table S3). A "checkerboard" binding-competition experiment (Fig. 3D) indicated the presence of at least two epitope clusters on the NTD, including one targeted by all of the tested IGHV1-24 mAbs (4A8, CM25, CM17, CM58, and 1-68) and the IGHV3-11 mAb CM30. Another NTD epitope was identified by CM31 (IGHV2-5, 6.4% mutation), which overlapped with CM30 (IGHV3-11; 3.1% mutation), CM58, and 1-68 but did not compete with the other three IGHV1-24 NTD mAbs. To better understand the IGHV1-24 interactions with the spike NTD, we determined a cryo-EM structure of CM25 Fabs bound to trimeric S-ECD (Fig. 4A and figs. S7 and S8). Focused refinement of the CM25-NTD interface resulted in a 3.5-Å reconstruction that revealed a heavy-chain-dominant mode of binding, with substantial contacts mediated by interactions between the three CDRs and the N3 and N5 loops of the NTD (Fig. 4B). The light chain contributes only 11% (86 Å²) of the total CM25 binding interface, mainly through a stacked hydrophobic interaction between CDR-L2 Tyr55 and Pro251 within the N5 loop. Unique germline IGHV1-24 residues contribute 20% (149 Å²) of the total binding interface. CDR-H1 interacts extensively through hydrogen bonds and contacts between hydrophobic residues, including a salt bridge formed between the conserved Glu36 residue and the N5 loop residue Arg246 (Fig. 4C). The common IGHV1-24 Phe56 residue in CDR-H2 forms a pi-cation interaction with Lys147 in the N3 loop (Fig. 4C).
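The buried-surface figures quoted above each pair an area with a percentage of the total CM25 interface, so each implies an estimate of that total. The sketch below back-calculates the total from the two pairs and checks whether they agree once rounding of the percentages is allowed for. The function names and the assumption that percentages were rounded to the nearest whole number are illustrative and not taken from the source.

```python
def implied_total(area, percent):
    """Total interface area (square angstroms) implied by a partial area and its share."""
    return area / (percent / 100.0)


def implied_total_bounds(area, percent, half_step=0.5):
    """Range of totals if `percent` was rounded to the nearest whole number."""
    low = area / ((percent + half_step) / 100.0)
    high = area / ((percent - half_step) / 100.0)
    return low, high


# Values quoted in the text for the CM25-NTD interface.
light_chain = (86.0, 11.0)   # light chain: 86 square angstroms, 11%
germline = (149.0, 20.0)     # unique germline IGHV1-24 residues: 149, 20%

totals = [implied_total(*light_chain), implied_total(*germline)]
lc_low, lc_high = implied_total_bounds(*light_chain)
gl_low, gl_high = implied_total_bounds(*germline)

# The two estimates are compatible if their rounding intervals overlap.
compatible = max(lc_low, gl_low) < min(lc_high, gl_high)
print([round(t) for t in totals], compatible)
```

The two point estimates differ by a few percent, but their rounding intervals overlap, so the quoted figures are mutually consistent with a single total interface of roughly 750 to 765 Å² under the stated rounding assumption.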
CM25 contains a 14-amino-acid CDR-H3 loop that contributes 35% (261 Å²) of the total interface, including the AV aliphatic motif found in all but one of the convergent IGHV1-24 NTD-binding mAbs. Ala109 and Val110 are buried at the interface in a binding pocket framed by the N3 and N5 loops. A comparison of CM25 with an extant structure of an IGHV1-24 NTD-binding antibody isolated by B cell cloning, 4A8 (4), revealed that the AV dipeptide interaction is structurally conserved, and the 21-amino-acid CDR-H3 of 4A8 extends along the outside of the NTD, contributing three additional contacts and 46% (415 Å²) of the total binding interface (Fig. 4D). Both structures show extensive contacts between the heavy chain of the Fabs and the NTD N3 and N5 loops. The Glu36-Arg246 salt bridge and an identical CDR-H2 contact between Phe56 and Lys147 are conserved in the 4A8-NTD interface. SARS-CoV-2 variants of concern contain mutations in the NTD N3 and N5 loops, including Y144/Y145Δ and K147E (UK lineage B.1.1.7), W152C (California B.1.429), and 242-244Δ or R246I (South Africa B.1.351). Alanine substitutions at several of these positions ablated binding by public IGHV1-24 antibodies or reduced their affinity more than fivefold, as exemplified by 4A8, CM17, and CM25 (Fig. 4E and fig. S9), a result consistent with the CM25-NTD and 4A8-NTD structures. Additionally, we confirmed that an engineered N3-N5 double mutant and native B.1.351 (28) both evade neutralization by mAbs CM25 and 4A8 (Fig. 4F). Thus, mutations in SARS-CoV-2 variants confer escape from public neutralizing anti-NTD antibodies. In conclusion, we find that the convalescent plasma IgG response to SARS-CoV-2 is oligoclonal and directed overwhelmingly toward non-RBD epitopes in the S-ECD. This includes public, near-germline, and potently neutralizing antibodies against the NTD.
The degree to which public anti-NTD antibodies contribute to protection is likely related to their relative levels in plasma, which can be dominant in some individuals. Our finding that mutations present in circulating SARS-CoV-2 variants can impair or ablate binding and neutralization by public anti-NTD antibodies may constitute a mechanism of viral escape in a subset of the population. Numerous other NTD mutations, which overlap with the structural epitope recognized by the public IGHV1-24 antibody class, have been described in additional circulating variants, in laboratory escape mutants, and in immunocompromised patients (12, 29–33).

Data and materials availability: FASTQ VH and VH:VL sequence files have been deposited in the NCBI Sequence Read Archive with accession number PRJNA422864. The monoclonal antibodies have been deposited in GenBank (https://www.ncbi.nlm.nih.gov/genbank/) with accession numbers MZ049539 to MZ049552. Coordinates for the CM25 Fab in complex with trimeric spike ectodomain have been deposited to the Protein Data Bank as PDB ID 7M8J. Cryo-EM maps have been deposited to the Electron Microscopy Data Bank under accession code EMD-23717. These structural data are presented in Fig. 4, table S4, and figs. S7 and S8. All other data are available in the main text or the supplementary materials.

This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.
1. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
2. Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
3. Correlates of protection against SARS-CoV-2 in rhesus macaques
4. A neutralizing human antibody binds to the N-terminal domain of the Spike protein of SARS-CoV-2
5. Potent neutralizing antibodies against multiple epitopes on SARS-CoV-2 spike
6. Convergent antibody responses to SARS-CoV-2 in convalescent individuals
7. Potent neutralizing antibodies from COVID-19 patients define multiple targets of vulnerability
8. Potently neutralizing and protective human antibodies against SARS-CoV-2
9. Broad neutralization of SARS-related viruses by human monoclonal antibodies
10. Orthogonal SARS-CoV-2 Serological Assays Enable Surveillance of Low-Prevalence Communities and Reveal Durable Humoral Immunity
11. Humoral and circulating follicular helper T cell responses in recovered patients with COVID-19
12. Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants
13. Identification and characterization of the constituent human serum antibodies elicited by vaccination
14. Next-generation sequencing and protein mass spectrometry for the comprehensive analysis of human cellular and serum antibody repertoires
15. Memory B cells, but not long-lived plasma cells, possess antigen specificities for viral escape mutants
16. The extent of affinity maturation differs between the memory and antibody-forming cell compartments in the primary immune response
17. Structures of Human Antibodies Bound to SARS-CoV-2 Spike Reveal Common Epitopes and Recurrent Features of Antibodies
18. Neutralizing Antibody Activity in Recovered COVID-19 Patients
19. Evaluating the Association of Clinical Characteristics With Neutralizing Antibody Levels in Patients Who Have Recovered From Mild COVID-19 in
20. In-depth determination and analysis of the human paired heavy- and light-chain antibody repertoire
21. Comprehensive mapping of mutations in the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human plasma antibodies
22. Isolation of potent SARS-CoV-2 neutralizing antibodies and protection from disease in a small animal model
23. A Mouse-Adapted SARS-CoV-2 Induces Acute Lung Injury and Mortality in Standard Laboratory Mice
24. A mouse-adapted model of SARS-CoV-2 to test COVID-19 countermeasures
25. Cross-neutralization of SARS-CoV-2 by a human monoclonal SARS-CoV antibody
26. Human B Cell Clonal Expansion and Convergent Antibody Responses to SARS-CoV-2
27. Individual variation in the germline Ig gene repertoire inferred from variable region gene rearrangements
28. Detection of a SARS-CoV-2 variant of concern in South Africa
29. Recurrent deletions in the SARS-CoV-2 spike glycoprotein drive antibody escape
30. Persistence and Evolution of SARS-CoV-2 in an Immunocompromised Host
31. Case Study: Prolonged Infectious SARS-CoV-2 Shedding from an Asymptomatic Immunocompromised Individual with
32. The ongoing evolution of variants of concern and interest of SARS-CoV-2 in Brazil revealed by convergent indels in the amino (N)-terminal domain of the Spike protein
33. SARS-CoV-2 escape in vitro from a highly neutralizing COVID-19 convalescent plasma
34. Structure-based design of prefusion-stabilized SARS-CoV-2 spikes
35. Controlling the SARS-CoV-2 spike glycoprotein conformation
36. Treatment of Coronavirus Disease 2019 (COVID-19) Patients with Convalescent Plasma
37. Antibody repertoires in humanized NOD-scid-IL2Rγ(null) mice and human B cells reveals human-like diversification and tolerance checkpoints in the mouse
38. Ultra-high-throughput sequencing of the immune receptor repertoire from millions of lymphocytes
39. Trimmomatic: A flexible trimmer for Illumina sequence data
40. MiXCR: Software for comprehensive adaptive immunity profiling
41. Search and clustering orders of magnitude faster than BLAST
42. Immunoglobulins or Antibodies: IMGT® Bridging Genes, Structures and Functions
43. Mutation in myosin heavy chain 6 causes atrial septal defect
44. Rapid characterization of spike variants via mammalian cell surface display
45. SARS-CoV-2 Reverse Genetics Reveals a Variable Infection Gradient in the Respiratory Tract
46. Real-time cryo-electron microscopy data preprocessing with Warp
47. cryoSPARC: Algorithms for rapid unsupervised cryo-EM structure determination
48. SAbPred: A structure-based antibody prediction server
49. UCSF Chimera-A visualization system for exploratory research and analysis
50. Coot: Model-building tools for molecular graphics
51. PHENIX: Building new software for automated crystallographic structure determination

We wish to thank Drs. G. Fenves, D. Jaffee, and A. Matouschek for their support. The authors are grateful for the administrative expertise of E. K. Miller, to The LaMontagne Center for Infectious Disease, for the university's core facilities, and to Dr. C.-L. Hsieh for providing reagents and advice.

Funding: Funding for USAMRIID was provided through the CARES Act with programmatic oversight from the Military Infectious Diseases Research Program-project 14066041. Opinions, conclusions, interpretations, and recommendations are those of the authors and are not necessarily endorsed by the U.S. Army. The mention of trade names or commercial products does not constitute endorsement or recommendation for use by the Department of the Army or the Department of Defense. The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention. Molecular graphics and analyses were performed with UCSF Chimera, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from NIH P41-GM103311. The Sauer Structural Biology Laboratory is supported by the University of Texas College of Natural Sciences and by award RR160023 from the Cancer Prevention and Research Institute of Texas (CPRIT).

Author contributions: Conceptualization