key: cord-1056249-fi9xuwu6
authors: Antonopoulos, Aristotelis; Broome, Steven; Sharov, Victor; Ziegenfuss, Christopher; Easton, Richard L; Panico, Maria; Dell, Anne; Morris, Howard R; Haslam, Stuart M
title: Site-specific characterisation of SARS-CoV-2 spike glycoprotein receptor binding domain
date: 2020-09-03
journal: Glycobiology
DOI: 10.1093/glycob/cwaa085
sha: 25293778bba206bbe185c0f874ec8a607877e15d
doc_id: 1056249
cord_uid: fi9xuwu6

The novel coronavirus SARS-CoV-2, the infective agent causing COVID-19, is having a global impact both in terms of human disease and socially and economically. Its heavily glycosylated spike glycoprotein is fundamental to the infection process, via the interaction of its receptor binding domain with the glycoprotein angiotensin converting enzyme 2 on human cell surfaces. We therefore utilized an integrated glycomic and glycoproteomic analytical strategy to characterise both N- and O-glycan site-specific glycosylation within the receptor binding domain. We demonstrate the presence of complex type N-glycans with unusual fucosylated LacdiNAc at both sites N331 and N343 and a single site of O-glycosylation on T323.

The COVID-19 pandemic caused by the SARS-CoV-2 virus is having a global impact, with the latest WHO data indicating 21,294,845 cases and 761,779 deaths (WHO COVID-19 Situation Report 209). The spike glycoprotein of SARS-CoV-2 plays a vital role in facilitating viral infectivity of host cells. It exists as a homotrimer on the viral surface. Each monomer comprises an S1 subunit, which contains the host cell receptor binding domain (RBD), and a membrane-anchored S2 subunit involved in fusion of viral and host membranes; the two subunits are formed by proteolytic processing (Walls et al. 2020). The RBD is responsible for binding the angiotensin converting enzyme 2 (ACE2) receptor on human cell surfaces (Shang et al. 2020). Informatic analysis of the amino acid sequence derived for the spike glycoprotein predicts it to be heavily N- and O-glycosylated.
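The N-linked part of such a sequence-based prediction can be illustrated with a scan for the N-X-S/T consensus sequon (X being any residue except proline). The following is a minimal sketch and not the informatic tool behind the prediction cited above: the 28-residue segment is simply pieced together from the RBD peptides quoted later in this paper (Arg319, 320-VQPTE-324, SIVR, 329-FPNITNLCPFGE-340 and 341-VFNATR-346), and the names `SEGMENT`, `FIRST_RESIDUE` and `find_sequons` are illustrative. O-glycosylation has no comparably simple consensus motif and is not covered.

```python
import re

# Residues 319-346 of the RBD, assembled from the peptides quoted in this
# paper (assumption: they are contiguous, as their residue numbering implies).
SEGMENT = "RVQPTESIVRFPNITNLCPFGEVFNATR"
FIRST_RESIDUE = 319

# N-glycosylation sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps the match one residue long so overlapping sequons
# would still be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence, first_residue=1):
    """Return (residue_number, tripeptide) for every N-X-S/T sequon."""
    hits = []
    for match in SEQUON.finditer(sequence):
        i = match.start()
        hits.append((first_residue + i, sequence[i:i + 3]))
    return hits


if __name__ == "__main__":
    for position, motif in find_sequons(SEGMENT, FIRST_RESIDUE):
        print(f"N{position}: {motif}")
```

On this segment the scan reports N331 (NIT) and N343 (NAT), the two N-linked sites characterised in this work, while N334 (NLC) is not a sequon.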
Such heavy glycosylation is likely to be functionally important for host cell recognition and viral entry, immune system interactions and any potential future vaccine or diagnostic developments (Kumar et al. 2020). Rapid progress has already been made in characterizing spike glycoprotein glycosylation, with two detailed glycoproteomic studies (Shajahan et al. 2020; Watanabe et al. 2020). However, there is variability between the presented structural data. For example, Shajahan et al. indicate that site N616 expresses only high mannose glycans, whereas Watanabe et al. indicate that this site expresses mostly complex N-glycans with only traces of high mannose and hybrid N-glycans. This indicates both the complexity of glycoproteomic analysis and the potential impact that the structure of the spike glycoprotein recombinant constructs and the bioprocessing conditions can have on recombinant glycoprotein glycosylation.

We used an integrated glycomic and glycoproteomic strategy to characterize the N- and O-glycosylation of recombinant SARS-CoV-2, Spike glycoprotein S1 Subunit, RBD (Arg319-Phe541) expressed in HEK293 cells. We demonstrate the presence of complex type N-glycans with unusual fucosylated LacdiNAc and a single site of O-glycosylation on Thr323 at the beginning of the RBD. We discuss our data in the context of previous spike glycoprotein characterisations, but also highlight the limitations of the characterisation of recombinant spike glycoproteins.

Recombinant SARS-CoV-2, Spike protein S1 Subunit RBD (Arg319-Phe541) derived from transfected human HEK293 cells was obtained from RayBiotech (Georgia, Product Number 230-30162). The sample was buffer exchanged into 0. The isolated fractions identified in the LC-MS analyses as containing N-linked glycopeptides were dissolved in 50 mM ammonium bicarbonate, pH 8.4, to which was added 1U of PNGase F (Roche p/n 11365169001), and incubated at 37°C for 20 hours.
Released N-glycans were separated from the peptides using a RP C18 Sep-Pak and permethylated (Ciucanu and Kerek, 1984). Permethylated samples were dissolved in 10 μL of 50% methanol in water, and 1 μL was mixed with 1 μL of 2,5-dihydroxybenzoic acid (10 mg/mL in 50% methanol in water). MALDI data were acquired in the positive ion reflector mode using a 5800 mass spectrometer (AB Sciex, Framingham MA). The MALDI data were analysed using Data Explorer 4.9 (AB Sciex). The processed spectra were subjected to manual assignment and annotation with the aid of GlycoWorkBench (Ceroni et al. 2008). The proposed assignments for the selected peaks were based on composition together with knowledge of the biosynthetic pathways. Proposed structures were further confirmed by MS/MS analysis.

Glycomics and glycoproteomics analysis of the N331 and N343 glycosylation

The recombinant SARS-CoV-2 S1 RBD expressed in HEK293 cells was subjected to a two-stage enzymatic digestion strategy employing Trypsin followed by Glu-C, in order to allow separation of the two N-linked consensus site peptides, which would otherwise be present within a single tryptic peptide. The resulting glycopeptides were purified to allow separation of the 329-FPNITNLCPFGE-340 and 341-VFNATR-346 glycopeptides, which contain the glycosylation sites N331 and N343 respectively (hereafter referred to as N331 and N343 glycopeptides Table SII).

Evidence for the full NeuAc2Hex2HexNAc2 Core 2 structure on peptide 320-VQPTE-324 was found in data within the elution region of the N343 glycopeptide glycoforms, which has clearly led to signal suppression. Nevertheless, reasonable confirmatory data are found, as seen in Supplementary Figure S2, where the key fragment ions arising from the VQPTE glycopeptide are shown, culminating in a molecular ion at m/z 1885.67.
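The peptides named above, and the molecular ion quoted for the VQPTE glycopeptide, can be checked with a short in silico calculation. The following is a minimal sketch under simplifying assumptions and is not the software used in this study: trypsin is taken to cleave C-terminal to K/R unless the next residue is P, Glu-C is taken to cleave C-terminal to E only, no missed cleavages are modelled, and the 319-346 segment is assembled from the peptides quoted in the text. The function names are illustrative; the masses are standard monoisotopic residue values.

```python
# Residues 319-346, assembled from the peptides quoted in the text.
SEGMENT = "RVQPTESIVRFPNITNLCPFGEVFNATR"
FIRST_RESIDUE = 319

# Standard monoisotopic residue masses (Da) for the residues in SEGMENT.
AMINO_ACID = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "Q": 128.05858, "E": 129.04259,
    "F": 147.06841, "R": 156.10111,
}
SUGAR = {"Hex": 162.05282, "HexNAc": 203.07937, "NeuAc": 291.09542}
WATER = 18.01056
PROTON = 1.00728


def cleave(peptides, after, skip_before=""):
    """Cut each (start, sequence) pair C-terminal to the residues in `after`.

    A site is skipped when the following residue is in `skip_before`.
    """
    pieces = []
    for start, seq in peptides:
        begin = 0
        for i in range(len(seq) - 1):  # never cut after the final residue
            if seq[i] in after and seq[i + 1] not in skip_before:
                pieces.append((start + begin, seq[begin:i + 1]))
                begin = i + 1
        pieces.append((start + begin, seq[begin:]))
    return pieces


def two_stage_digest(sequence, first_residue):
    """Trypsin first, then Glu-C applied to every tryptic peptide."""
    tryptic = cleave([(first_residue, sequence)], after="KR", skip_before="P")
    return cleave(tryptic, after="E")


def glycopeptide_mh(peptide, glycan):
    """Monoisotopic [M+H]+ of a peptide carrying the given sugar residues."""
    mass = sum(AMINO_ACID[aa] for aa in peptide) + WATER
    mass += sum(SUGAR[name] * count for name, count in glycan.items())
    return mass + PROTON


if __name__ == "__main__":
    for start, seq in two_stage_digest(SEGMENT, FIRST_RESIDUE):
        print(f"{start}-{seq}-{start + len(seq) - 1}")
    core2 = {"NeuAc": 2, "Hex": 2, "HexNAc": 2}
    print(f"VQPTE + NeuAc2Hex2HexNAc2 [M+H]+ = {glycopeptide_mh('VQPTE', core2):.2f}")
```

Under these assumptions the tryptic step leaves both sequons on the single peptide FPNITNLCPFGEVFNATR, and the Glu-C step yields VQPTE, SIVR, FPNITNLCPFGE and VFNATR. The calculated [M+H]+ for VQPTE carrying NeuAc2Hex2HexNAc2 is about 1885.74, within 0.1 m/z units of the reported molecular ion.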
Searches for glycosylated SIVR peptides, including in data from Glu-C re-digested material, proved negative, with the free SIVR peptide itself found at 6.5 mins, although simple substitutions such as HexNAc cannot be ruled out because of facile cone voltage loss from some small glycopeptides.

Defining the glycosylation of the SARS-CoV-2 spike glycoprotein is of great importance to better understand virus infectivity, to develop future vaccines and to improve diagnostics. Of particular interest is the receptor binding domain (RBD), since the glycosylation sites at N331 and N343, which are found on the RBD of the S1 subunit, together with any O-linked structures, could potentially affect binding to the ACE2 receptor. We demonstrate that the N-linked sites are glycosylated with complex type N-glycans with unusual fucosylated LacdiNAc structures, and that Thr323 carries principally disialyl Core 1 and NeuAc2Hex2HexNAc2 Core 2 O-linked structures. It is well established that HEK293 cells have the biosynthetic potential to generate LacdiNAc-containing N-glycans (Do et al. 1997; Dewal et al. 2015). The sites of glycosylation determined in this paper are shown in their relative positions in the RBD amino acid sequence in Supplementary Figure S3.

A great deal of progress in the characterisation of SARS-CoV-2 spike glycoprotein has been made in a relatively short time frame. Shajahan et al. have recently carried out glycoproteomic characterisation of the S1 and S2 subunits of the spike glycoprotein individually expressed in HEK293 cells (Shajahan et al. 2020). Their structural analysis indicated that the N-glycans occupying the N331 and N343 glycosylation sites consisted of high-mannose (Man5-8GlcNAc2) and low molecular weight complex structures.
In the work presented here, even though the RBD of the S1 subunit is also expressed in HEK293 cells, we have detected a different N-glycan repertoire occupying both the N331 and N343 glycosylation sites, with exclusively complex N-glycans and no high-mannose structures. Whilst some of our O-linked assignments are the same (e.g. disialyl Core 1 on T323), others differ, and we find no clear evidence of S325 glycosylation.

In other recent work (Watanabe et al. 2020), glycoproteomic analysis was performed on soluble stabilised trimeric spike glycoprotein (encoding residues 1-1208) expressed in HEK FreeStyle293F cells with substitutions on the furin cleavage site and a C-terminal T4 fibritin trimerisation motif. Interestingly, their proposed N-glycan structural analysis for the N331 and N343 glycosylation sites was more similar to that shown here, in that both glycosylation sites were occupied mainly by complex N-glycans. Some high-mannose N-glycans were also detected, though at minor abundance. More importantly, LacdiNAc-containing N-glycan structures were not reported, even though N-glycan compositions matching these types of glycans were found. This emphasises the need for detailed N-glycan analysis, which is best achieved by an integrated glycomic and glycoproteomic approach. Although Watanabe et al. showed some evidence for trace levels of O-glycosylation on a glycopeptide containing both T323 and S325, they were not able to achieve site-specific characterisation.

It is important to point out, especially to the broader research community, the caveats associated with both our data and the other published S glycoprotein analyses. The actual nature of the analysed recombinant S glycoprotein varies between the three studies.
The RBD region which we analysed is, as its name suggests, believed to be the most relevant to ACE2 binding, but is a small section of the native viral glycoprotein, whereas the stabilized trimeric spike glycoprotein is most similar to the native viral glycoprotein. However, all are expressed as soluble products and therefore may not have undergone the same biosynthetic maturation as native viral spike glycoprotein. This is likely to affect both glycan site occupancy and glycan structures. All the recombinant constructs are expressed in cultured HEK cells, but individual cell clones and culture conditions can affect recombinant glycoprotein glycosylation (Zhang et al. 2016). Whilst these are human-derived cells, it is likely that native viral spike glycoprotein, produced by in vivo infection of, for example, ciliated nasal epithelial cells, will again demonstrate variability in both glycan site occupancy and glycan structures. With the guidance obtained from current studies, including the work presented here, future research efforts need to focus both on the production and characterisation of S glycoprotein from human cells more similar to those that are actually infected in vivo and on the purification and characterisation of virus-derived S glycoprotein. It is sobering to consider that there was a gap of 16 years between the first glycoproteomic characterisation of human immunodeficiency virus (HIV) recombinant envelope glycoprotein gp120 (Zhu et al. 2000) and that of actual virion-derived HIV-1 gp120 (Panico et al. 2016). Given the scale of the current crisis, this cannot be allowed to happen for SARS-CoV-2 spike glycoprotein.

During the review period of our manuscript, work by Zhao et al., which also details the glycomic/glycoproteomic characterisation of trimer-stabilized, soluble SARS-CoV-2 Spike glycoprotein produced in HEK cells, was published.
Their data are consistent with ours in that they confirm the presence of LacdiNAc-containing N-glycan structures and that T323 is the main site of O-glycosylation (Zhao et al., 2020).

substitution with a mono-sialyl version (NeuAcHexHexNAc2) of the assigned Core 2 structure seen in Figure 3 and described in the text. Proline cleavage y" ions at m/z 346, 549 (y3-HexNAc) and 752 (y3-HexNAc-HexNAc) support the interpretation made. The unlabelled higher mass signals were devoid of 13C isotope clusters and not assigned as of glycopeptide origin.

GlycoWorkbench: a tool for the computer-assisted annotation of mass spectra of glycans
A simple and rapid method for the permethylation of carbohydrates
Glycoprotein structure determination by mass spectrometry
XBP1s Links the Unfolded Protein Response to the Molecular Architecture of Mature N-Glycans
Differential expression of LacdiNAc sequences (GalNAc beta 1-4GlcNAc-R) in glycoproteins synthesized by Chinese hamster ovary and human 293 cells
Structural, glycosylation and antigenic variation between 2019 novel coronavirus (2019-nCoV) and SARS coronavirus
Mapping the complete glycoproteome of virion-derived HIV-1 gp120 provides insights into broadly neutralizing antibody binding
Deducing the N- and O-glycosylation profile of the spike protein of novel coronavirus SARS-CoV-2
Cell entry mechanisms of SARS-CoV-2
Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Site-specific glycan analysis of the SARS-CoV-2 spike
Challenges of glycosylation analysis and control: an integrated approach to producing optimal and consistent therapeutic drugs
Virus-Receptor Interactions of Glycosylated SARS-CoV-2 Spike and Human ACE2 Receptor
Mass spectrometric characterization of the glycosylation pattern of HIV-gp120 expressed in CHO cells

Council [grant number BB/V011324/1 to AD and SMH]