key: cord-263452-y2ral8nx
authors: Watanabe, Yasunori; Allen, Joel D.; Wrapp, Daniel; McLellan, Jason S.; Crispin, Max
title: Site-specific glycan analysis of the SARS-CoV-2 spike
date: 2020-05-04
journal: Science
DOI: 10.1126/science.abb9983
sha:
doc_id: 263452
cord_uid: y2ral8nx

The emergence of the betacoronavirus, SARS-CoV-2, the causative agent of COVID-19, represents a significant threat to global human health. Vaccine development is focused on the principal target of the humoral immune response, the spike (S) glycoprotein, which mediates cell entry and membrane fusion. The SARS-CoV-2 S gene encodes 22 N-linked glycan sequons per protomer, which likely play a role in protein folding and immune evasion. Here, using a site-specific mass spectrometric approach, we reveal the glycan structures on a recombinant SARS-CoV-2 S immunogen. This analysis enables mapping of the glycan-processing states across the trimeric viral spike. We show how SARS-CoV-2 S glycans differ from typical host glycan processing, which may have implications in viral pathobiology and vaccine design.

Impaired glycan maturation resulting in the presence of oligomannose-type glycans can be a sensitive reporter of native-like protein architecture (8), and site-specific glycan analysis can be used to compare different immunogens and monitor manufacturing processes (18). Additionally, glycosylation can influence the trafficking of recombinant immunogen to germinal centers (19). To resolve the site-specific glycosylation of SARS-CoV-2 S protein and visualize the distribution of glycoforms across the protein surface, we expressed and purified three biological replicates of recombinant soluble material in a manner identical to that used to obtain the high-resolution cryo-electron microscopy (cryo-EM) structure, albeit without glycan processing blockade using kifunensine (4). This variant of the S protein contains all 22 glycans on the SARS-CoV-2 S protein (Fig. 1A).
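The 22 sequons per protomer mentioned above follow the canonical N-linked glycosylation motif N-X-S/T, where X is any residue except proline. As a minimal sketch of how such sites can be located in a protein sequence, the Python snippet below scans for the motif with a regular expression. The function name `find_sequons` and the toy sequence are illustrative assumptions; the toy sequence is not the spike sequence, and this is not the authors' pipeline.

```python
import re
from typing import List

# N-X-S/T sequon, X != proline. The zero-width lookahead lets
# overlapping sequons (e.g. "NNTS") be reported separately.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> List[int]:
    """Return the 1-based positions of the Asn in each N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Hypothetical toy sequence (NOT the SARS-CoV-2 S sequence).
toy = "MNGTANPSQNNTSK"
print(find_sequons(toy))  # prints [2, 10, 11]; "NPS" is skipped because X is proline
```

Run against a full-length S protomer sequence, a scan of this kind would be expected to return the sequon positions referred to in the text (for example N234 or N343), although a sequon in the sequence does not by itself show that the site is occupied.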
Stabilization of the trimeric prefusion structure was achieved using the "2P" stabilizing mutations (20) at residues 986 and 987, a "GSAS" substitution at the furin cleavage site (residues 682-685), and a C-terminal trimerization motif. This helps to maintain quaternary architecture during glycan processing. Prior to analysis, supernatant containing the recombinant SARS-CoV-2 S was purified by size-exclusion chromatography to ensure that only native-like trimeric protein was analyzed (Fig. 1B and fig. S1). The trimeric conformation of the purified material was validated using negative-stain electron microscopy (Fig. 1C).

To determine the site-specific glycosylation of SARS-CoV-2 S, we employed trypsin, chymotrypsin, and alpha-lytic protease to generate three glycopeptide samples. These proteases were selected to generate glycopeptides that contain a single N-linked glycan sequon. The glycopeptides were analyzed by liquid chromatography-mass spectrometry (LC-MS), and the glycan compositions were determined for all 22 N-linked glycan sites (Fig. 2). To convey the main processing features at
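The rationale for using several proteases, namely that each glycopeptide should carry a single N-linked sequon, can be illustrated with an in-silico digestion. The sketch below is an assumption-laden simplification: it uses textbook cleavage rules for trypsin (after K/R) and chymotrypsin (after F/W/Y), both blocked by a following proline, and it omits alpha-lytic protease. The names `RULES`, `digest`, `count_sequons`, and `single_sequon_peptides` and the toy sequence are hypothetical and are not taken from the paper.

```python
import re
from typing import List

# Simplified cleavage rules (assumption): cut C-terminal to the listed
# residues unless the following residue is proline.
RULES = {
    "trypsin": "KR",
    "chymotrypsin": "FWY",
}


def digest(sequence: str, protease: str) -> List[str]:
    """Split a sequence into peptides using the simplified rule for a protease."""
    residues = RULES[protease]
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in residues and nxt != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def count_sequons(peptide: str) -> int:
    """Count N-X-S/T sequons (X != P) fully contained in the peptide."""
    return len(re.findall(r"(?=N[^P][ST])", peptide))


def single_sequon_peptides(sequence: str, protease: str) -> List[str]:
    """Keep only peptides that carry exactly one sequon."""
    return [p for p in digest(sequence, protease) if count_sequons(p) == 1]


# Hypothetical toy sequence with three sequons (NGT, NQS, NVT).
toy = "ANGTKSNQSFNVTRAA"
print(single_sequon_peptides(toy, "trypsin"))       # prints ['ANGTK']
print(single_sequon_peptides(toy, "chymotrypsin"))  # prints ['NVTRAA']
```

In this toy case trypsin isolates the first sequon and chymotrypsin the third, while neither resolves the second on its own peptide; this is the kind of gap that a third protease with a different specificity can help to close. Sequons split by a cleavage site are not counted by this simplified check.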
There are two sites on SARS-CoV-2 S that are principally oligomannose-type: N234 and N709. The predominant oligomannose-type glycan structure observed across the protein, with the exception of N234, is Man5GlcNAc2, which demonstrates that these sites are largely accessible to α1,2-mannosidases but are poor substrates for GlcNAcT-I, which is the gateway enzyme in the formation of hybrid- and complex-type glycans in the Golgi apparatus. The stage at which processing is impeded is a signature related to the density and presentation of glycans on the viral spike. For example, the more densely glycosylated spikes of HIV-1 Env and Lassa virus GPC exhibit numerous sites dominated by Man9GlcNAc2 (21-24). A mixture of oligomannose- and complex-type glycans can be found at sites N61, N122, N603, N717, N801 and N1074 (Fig. 2). Of the 22 sites on the S protein, 8 contain significant populations of oligomannose-type glycans, highlighting how the processing of the SARS-CoV-2 S glycans is divergent from host glycoproteins (25). The remaining 14 sites are dominated by processed, complex-type glycans.

Although unoccupied glycosylation sites were detected on SARS-CoV-2 S, when quantified they were revealed to form a very minor component of the total peptide pool (table S2). In HIV-1 immunogen research, the holes generated by unoccupied glycan sites have been shown to be immunogenic and potentially give rise to distracting epitopes (26). The high occupancy of N-linked glycan sequons of SARS-CoV-2 S indicates that recombinant immunogens will not require further optimization to enhance site occupancy.

Using the cryo-EM structure of the trimeric SARS-CoV-2 S protein (PDB ID 6VSB) (4), we mapped the glycosylation status of the coronavirus spike mimetic onto the experimentally determined 3D structure (Fig. 3).
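The grouping of sites into oligomannose-type, mixed, and complex-type populations rests on assigning each detected glycan composition to a processing class. The sketch below shows one way this could be done from monosaccharide counts alone, assuming a common heuristic: HexNAc(2)Hex(5-9) is treated as oligomannose-type (Man5GlcNAc2 to Man9GlcNAc2), HexNAc(3) with at least five hexoses as hybrid-type, and other compositions with three or more HexNAc as complex-type. These cut-offs, the function names, and the example abundances are assumptions for illustration; composition alone cannot distinguish isomers, and the values are not data from the paper.

```python
from collections import defaultdict
from typing import Dict, Tuple


def classify(hexnac: int, hexose: int) -> str:
    """Assign a processing class from HexNAc and hexose counts (heuristic)."""
    if hexnac == 2 and 5 <= hexose <= 9:
        return "oligomannose"  # Man5GlcNAc2 ... Man9GlcNAc2
    if hexnac == 3 and hexose >= 5:
        return "hybrid"
    if hexnac >= 3:
        return "complex"
    return "other"


def summarize_site(abundances: Dict[Tuple[int, int], float]) -> Dict[str, float]:
    """Sum relative abundances per class and normalize to fractions of the site total."""
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("site has no quantified glycans")
    fractions: Dict[str, float] = defaultdict(float)
    for (hexnac, hexose), amount in abundances.items():
        fractions[classify(hexnac, hexose)] += amount / total
    return dict(fractions)


# Hypothetical abundances for one site, keyed by (HexNAc, Hex); NOT measured data.
example_site = {(2, 5): 60.0, (2, 9): 10.0, (4, 5): 30.0}
for glycan_class, frac in summarize_site(example_site).items():
    print(f"{glycan_class}: {frac:.0%}")
```

A site summarized in this way could then be labeled as principally oligomannose-type, mixed, or complex-type by applying whatever abundance thresholds the analyst chooses; the paper does not state such thresholds in this section.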
This combined mass spectrometric and cryo-EM analysis reveals how the N-linked glycans occlude distinct regions across the surface of the SARS-CoV-2 spike. Shielding of the receptor binding sites on the SARS-CoV-2 spike by proximal glycosylation sites (N165, N234, N343) can be observed, especially when the receptor binding domain is in the "down" conformation. The shielding of receptor binding sites by glycans is a common feature of viral glycoproteins, as observed on SARS-CoV-1 S (10, 13), HIV-1 Env (27), influenza HA (28, 29), and LASV GPC (24). Given the functional constraints of receptor binding sites and the resulting low mutation rates of these residues, it is likely that there is selective pressure to utilize N-linked glycans to camouflage one of the most conserved and potentially vulnerable areas of their respective glycoproteins (30, 31).

It is interesting to note the dispersion of oligomannose-type glycans across both the S1 and S2 subunits. This is in contrast to other viral glycoproteins; for example, the dense glycan clusters in several strains of HIV-1 Env induce oligomannose-type glycans that are recognized by antibodies (32, 33). In SARS-CoV-2 S the oligomannose-type structures are likely protected by the protein component, as exemplified by the N234 glycan, which is partially sandwiched between the N-terminal and receptor-binding domains (Fig. 3).

We characterized the N-linked glycans on extended flexible loop structures (N74 and N149) and at the membrane-proximal C terminus (N1158, N1173, N1194) that were not resolved in the cryo-EM maps (4). These were determined to be complex-type glycans, consistent with steric accessibility of these residues. While the oligomannose-type glycan content (28%) (table S2) is above that observed on typical host glycoproteins, it is lower than that of other viral glycoproteins. For example, one of the most densely glycosylated viral spike proteins is HIV-1 Env, which exhibits ~60% oligomannose-type glycans (21, 34).
This suggests that SARS-CoV-2 S protein is less densely glycosylated and that the glycans form less of a shield compared with other viral glycoproteins including HIV-1 Env and LASV GPC, which may be beneficial for the elicitation of neutralizing antibodies.

Additionally, the processing of complex-type glycans is an important consideration in immunogen engineering, especially considering that epitopes of neutralizing antibodies against SARS-CoV-2 S can contain fucosylated glycans at N343 (35). Across the 22 N-linked glycosylation sites, 52% are fucosylated and 15% of the glycans contain at least one sialic acid residue (table S2 and fig. S3). Our analysis reveals that N343 is highly fucosylated, with 98% of detected glycans bearing fucose residues. Glycan modifications can be heavily influenced by the cellular expression system utilized. We have previously demonstrated for HIV-1 Env glycosylation that the processing of complex-type glycans is driven by the producer cell but that the levels of oligomannose-type glycans were largely independent of the expression system and are much more closely related to the protein structure and glycan density (36).

Highly dense glycan shields, such as those observed on LASV GPC and HIV-1 Env, feature so-called mannose clusters (22, 24) on the protein surface (Fig. 4). While small mannose-type clusters have been characterized on the S1 subunit of Middle East respiratory syndrome (MERS) CoV S (10), no such phenomenon has been observed for SARS-CoV-1 or SARS-CoV-2 S proteins. The site-specific glycosylation analysis reported here suggests that the glycan shield of SARS-CoV-2 S is consistent with other coronaviruses and similarly exhibits numerous vulnerabilities throughout the glycan shield (10). Finally, we detected trace levels of O-linked glycosylation at T323/S325, with over 99% of these sites unmodified (fig. S4), suggesting that O-linked glycosylation of this region is minimal when the structure is native-like.
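Percentages such as the fraction of fucosylated or sialylated glycans at a site can be thought of as abundance-weighted fractions over the glycoforms detected there. The following sketch shows that arithmetic under stated assumptions: the `Glycoform` record, the helper names, and the example abundances are hypothetical, and the way the paper aggregates values across sites (table S2) may differ from this per-site calculation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Glycoform:
    """One glycan composition at a site with its relative abundance."""
    hexnac: int
    hexose: int
    fuc: int
    neuac: int
    abundance: float  # any consistent unit, e.g. summed peak area


def weighted_fraction(glycoforms: Iterable[Glycoform],
                      predicate: Callable[[Glycoform], bool]) -> float:
    """Abundance-weighted fraction of glycoforms that satisfy the predicate."""
    forms = list(glycoforms)
    total = sum(g.abundance for g in forms)
    if total <= 0:
        raise ValueError("site has no quantified glycans")
    return sum(g.abundance for g in forms if predicate(g)) / total


def is_fucosylated(g: Glycoform) -> bool:
    return g.fuc >= 1


def is_sialylated(g: Glycoform) -> bool:
    return g.neuac >= 1


# Hypothetical glycoforms for a single site; NOT measured data.
site = [
    Glycoform(hexnac=4, hexose=5, fuc=1, neuac=0, abundance=50.0),
    Glycoform(hexnac=4, hexose=5, fuc=1, neuac=1, abundance=25.0),
    Glycoform(hexnac=2, hexose=5, fuc=0, neuac=0, abundance=25.0),
]
print(f"fucosylated: {weighted_fraction(site, is_fucosylated):.0%}")  # prints fucosylated: 75%
print(f"sialylated: {weighted_fraction(site, is_sialylated):.0%}")    # prints sialylated: 25%
```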
Our glycosylation analysis of SARS-CoV-2 offers a detailed benchmark of site-specific glycan signatures characteristic of a natively folded trimeric spike. As an increasing number of glycoprotein-based vaccine candidates are being developed, their detailed glycan analysis offers a route for comparing immunogen integrity and will also be important to monitor as manufacturing processes are scaled for clinical use. Glycan profiling will therefore also be an important measure of antigen quality in the manufacture of serological testing kits. Finally, with the advent of nucleotide-based vaccines, it will be important to understand how those delivery mechanisms impact immunogen processing and presentation.

material transfer agreement with The University of Texas at Austin. This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.

4), with one RBD in the "up" conformation and the other two RBDs in the "down" conformation. The glycans are colored according to oligomannose content as defined by the key. ACE2 receptor binding sites are highlighted in light blue. The S1 and S2 subunits are rendered with translucent surface representation, colored light and dark grey, respectively. Note that the flexible loops on which N74 and N149 glycan sites reside are represented as dashed lines with glycan sites on the loops mapped at their approximate regions.

(8, 21). Site-specific N-linked glycan oligomannose quantifications are colored according to the key.
All glycoproteins were expressed as soluble trimers in HEK 293F cells apart from LASV GPC, which was derived from virus-like particles from Madin-Darby canine kidney II cells.

Exploiting the defensive sugars of HIV-1 for drug and vaccine design
Unexpected receptor functional mimicry elucidates activation of coronavirus fusion
Cryo-EM analysis of a feline coronavirus spike protein reveals a unique structure and camouflaging glycans
The intracellular sites of early replication and budding of SARS-coronavirus
Identification of N-linked carbohydrates from severe acute respiratory syndrome (SARS) spike glycoprotein
Innate immune recognition of glycans targets HIV nanoparticle immunogens to germinal centers
Immunogenicity and structures of a rationally designed prefusion MERS-CoV spike antigen
Site-specific glycosylation of virion-derived HIV-1 Env is mimicked by a soluble trimeric immunogen
Composition and antigenic effects of individual glycan sites of a trimeric HIV-1 envelope glycoprotein
Mapping the complete glycoproteome of virion-derived HIV-1 gp120 provides insights into broadly neutralizing antibody binding
Electron-microscopy-based epitope mapping defines specificities of polyclonal antibodies elicited during HIV-1 BG505 envelope trimer immunization
Rational HIV immunogen design to target specific germline B cell receptors
Structural basis of preexisting immunity to the 2009 H1N1 pandemic influenza virus
Antibody neutralization and escape by HIV-1
Tracking global patterns of N-linked glycosylation site variation in highly variable viral glycoproteins: HIV, SIV, and HCV envelopes and influenza hemagglutinin
Promiscuous glycan site recognition by antibodies to the high-mannose patch of gp120 broadens neutralization of HIV
Structural and functional analysis of a potent sarbecovirus neutralizing antibody
Cell- and protein-directed glycosylation of native cleaved HIV-1 envelope
SARS-CoV-2 spike site-specific N-linked glycan analysis
Grigorieff, cisTEM, user-friendly software for single-particle image processing