key: cord-0830942-6h03i6qp authors: Parker, Robert; Partridge, Thomas; Wormald, Catherine; Kawahara, Rebeca; Stalls, Victoria; Aggelakopoulou, Maria; Parker, Jimmy; Doherty, Rebecca Powell; Morejon, Yoanna Ariosa; Lee, Esther; Saunders, Kevin; Haynes, Barton F.; Acharya, Priyamvada; Thaysen-Andersen, Morten; Borrow, Persephone; Ternette, Nicola title: Mapping the SARS-CoV-2 spike glycoprotein-derived peptidome presented by HLA class II on dendritic cells date: 2021-05-13 journal: Cell Rep DOI: 10.1016/j.celrep.2021.109179 sha: f8847a871783f0020a7695c501458674e53e2bfb doc_id: 830942 cord_uid: 6h03i6qp

Understanding and eliciting protective immune responses to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is an urgent priority. To facilitate these objectives, we profile the repertoire of human leukocyte antigen class II (HLA-II)-bound peptides presented by HLA-DR diverse monocyte-derived dendritic cells pulsed with SARS-CoV-2 spike (S) protein. We identify 209 unique HLA-II-bound peptide sequences, many forming nested sets, which map to sites throughout S including glycosylated regions. Comparison of the glycosylation profile of the S protein to that of the HLA-II-bound S peptides reveals substantial trimming of glycan residues on the latter, likely induced during antigen processing. Our data also highlight the receptor-binding motif in S1 as a HLA-DR-binding peptide-rich region and identify S2-derived peptides with potential for targeting by cross-protective vaccine-elicited responses. Results from this study will aid analysis of CD4+ T cell responses in infected individuals and vaccine recipients and have application in next-generation vaccine design.

HLA-II-peptide sequencing

[...] indicating that immune responses are capable of mediating protection (Chandrashekar et al., 2020).
Passively transferred neutralising antibodies (nAbs) protect against SARS-CoV-2 infection in small animal models, and convalescent sera have been shown to be effective in the treatment of severe disease, suggesting the utility of nAb induction by vaccines (Brouwer et al., 2020), (Zost et al., 2020), (Liu et al., 2020). Notably, the four seasonal common cold-causing human coronaviruses, the zoonotic Middle East respiratory syndrome (MERS), and SARS coronaviruses typically elicit poorly-sustained nAb responses, putatively [...] severe disease on challenge, providing a rationale for vaccine-mediated induction of T cell as well as nAb responses (Gallais et al., 2020), (Libraty et al., 2007), (Yang et al., 2006), (Zhao et al., 2010). More than 230 candidate SARS-CoV-2 vaccines are now in preclinical development or clinical trials (WHO, 2021a), and several vaccines are in widespread clinical use (Creech et al., 2021).

The SARS-CoV-2 spike (S) glycoprotein (comprising S1 and S2 subunits) is the primary target of vaccine development efforts. Homotrimers of the transmembrane S protein on the virion surface mediate virion attachment and entry into host cells, making S a key target for nAbs (Letko et al., 2020). S is also highly immunogenic for T cells, with many studies suggesting that although infected individuals mount CD4+ and CD8+ T cell responses to epitopes throughout the viral proteome, S is often at the top of the antigenic hierarchy (Grifoni et al., 2020b), (Altmann and Boyton, 2020), (Weiskopf et al., 2020).
The relative roles of CD4+ and CD8+ T cells in disease control or pathogenesis and impact of their protein and epitope specificity are unknown; but given the importance of CD4+ T cells (particularly CD4+ T follicular helper (Tfh) cells) in providing help for antibody responses (Crotty, [...]

Prior to purifying class II complexes, we also purified HLA-I ligands and identified 29,309 self-peptides. No MDDC HLA-I cross-presentation of the pulsed S protein was detected (data not shown).

Characteristic of HLA-II-bound peptides, many of the S peptides identified (Table S1) formed distinctive nested sets around a common core. Two of the identified peptides originated from regions that were altered to assist recombinant protein expression and purification (Figure S1C). The location of the identified S peptides in the context of protein region and domain structure and relative frequency with which particular sites are presented in each donor is summarised in Figure 3A [...]

To explore the contribution of individual HLA-DR alleles to presentation of the S protein, we investigated the likely allele to which each peptide was bound using HLA-II-binding prediction (NetMHCIIpan 4.0) (Figure 3B,C). 77-95% of the HLA-DR-bound S peptide sequences identified in each donor were predicted to bind one or more of the donor's HLA-DR alleles (Figure S4A). This stratification demonstrated a within-patient allele usage bias in S presentation that in most donors mirrored the previously observed bias in the percentage of all peptides presented by individual alleles (Figure 3B and Figure 2I).
For example, the majority of both self and S peptides were presented with the DRB1*04:01 allele in donor C459 and C460 MDDCs, although in donor C493 the proportion of S peptides showing the highest predicted affinity of binding to DRB3*01:01 was higher than that observed for self-peptides, and exceeded values for this donor's DRB1 alleles (Figure 2I and Figure 3B [...]

[...] (Table S2). Notably, regions of the S protein containing glycosites were devoid of peptides identified in our initial analysis of the HLA-II-bound peptidome, raising the question of whether S-derived glycopeptides were also presented by MDDCs (Figure 3A). To enable glycopeptide analysis, non-PNGaseF-treated S digests were analysed using a well-established glycoproteomics strategy (Alves et al., 2017). Using this approach, glycopeptides at 19 sites of S were identified to carry oligomannosidic and complex/hybrid-type N-glycans (Figure 4A, Table S3). Most sites displayed extensive glycan microheterogeneity arising from differences in both glycan types and structural features including terminal sialylation and fucosylation, in agreement with the known site-specific glycosylation of S (Watanabe et al., 2020).

Next, we applied the site-specific glycopeptide methodology to the mass spectra acquired from samples eluted from HLA-II (Figure 4B). 80 distinct glycopeptide forms mapping to S were identified; the majority of these (76) were derived from the HLA-DR-bound immunopeptidome (Table S4). These glycopeptide forms mapped to 52 unique peptide sequences that typically formed nested sets, were predominantly observed in datasets generated from donors C459 and C460 (where the highest number of unique HLA-DR-bound non-glycosylated peptides were also detected), and had a similar length distribution to S-derived non-glycopeptides (Figure 4C,D).
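The notion of a nested set (length variants sharing a common core) can be illustrated with a small sketch. This is not the clustering procedure used in the study; it is a simplified heuristic that groups peptides by their positions on the protein. The function name `nested_sets`, the `min_shared` threshold of 9 residues (chosen here because HLA-II binding cores are nine residues long), and the toy protein are all assumptions made for illustration.

```python
from typing import List, Tuple

Peptide = Tuple[str, int]  # (sequence, 1-based start position in the protein)


def nested_sets(peptides: List[Peptide], min_shared: int = 9) -> List[List[Peptide]]:
    """Group peptides that share at least `min_shared` residues with the
    running common core (the intersection of all spans in the current group).

    Simplified heuristic for illustration; `min_shared` is an assumed threshold.
    """
    ordered = sorted(peptides, key=lambda p: (p[1], len(p[0])))
    groups: List[List[Peptide]] = []
    core_start = core_end = 0
    for seq, start in ordered:
        end = start + len(seq) - 1
        if groups:
            shared = min(end, core_end) - max(start, core_start) + 1
            if shared >= min_shared:
                groups[-1].append((seq, start))
                # Shrink the core to the region common to every member.
                core_start, core_end = max(start, core_start), min(end, core_end)
                continue
        groups.append([(seq, start)])
        core_start, core_end = start, end
    return groups


# Toy protein and peptides (hypothetical, not taken from the study's data).
protein = "ACDEFGHIKLMNPQRSTVWY" * 6


def peptide_at(start: int, end: int) -> Peptide:
    """Slice the toy protein using 1-based inclusive coordinates."""
    return (protein[start - 1:end], start)


groups = nested_sets([peptide_at(10, 23), peptide_at(12, 25),
                      peptide_at(11, 22), peptide_at(100, 112)])
for group in groups:
    print([f"{start}-{start + len(seq) - 1}" for seq, start in group])
```

With these toy inputs the three overlapping peptides fall into one set and the distant peptide forms its own. Tracking the shrinking core, rather than any pairwise overlap, prevents long chains of partially overlapping peptides from being merged into a single set.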
75% (66-100%) of all glycopeptide sequences were predicted (using NetMHCIIpan 4.0) to bind to one or more of the donor's HLA-DR alleles (Figure 4C). The largest nested set consisted of glycopeptides from donor C459/C460 MDDCs that mapped to position N801 located directly in the fusion peptide (FP, 788-806), a highly conserved region which facilitates membrane fusion during viral entry (Figure 4E,F). In total, we identified HLA-II-bound glycopeptides bearing glycans derived from 14 of the N-linked glycosylation sites in S (Figure 4F). HLA-II-bound peptides carried predominantly short paucimannosidic-type N-glycans while S carried oligomannosidic- and GlcNAc-capped complex-type N-glycan structures at these sites (Figure 4B,F). The paucimannosylation of the HLA-II bound peptides comprised both core-fucosylated (M1F, M2F, and M3F, i.e. Man1-3GlcNAc2Fuc1) and afucosylated (M2, Man2GlcNAc2) species, as supported by fragment spectra analysis (Figure 4E).

[...]

Within the receptor binding domain (RBD) of S1, the receptor binding motif (RBM), an extended insert on the beta-6/5 strands that contains the contact points with the receptor ACE2 (Lan et al., 2020) and is an important nAb target, contained 2 nested sets of peptides predicted to be presented by 3 different HLA-DR alleles (Figure 5). In the donors analysed, a total of 21 unique peptide sequences derived from this region were identified altogether (Figure 6A,B) and versions of some of these post-translationally modified at residues C480 and C432 were also detected (Figure 6C). At least one peptide within this region was found to be presented in every donor studied. Interestingly, a particularly large nested peptide set, presented in all donors, was predicted to be bound by HLA-DR3, highlighting a potentially [...] (Table S5 and Supplementary Figure S7).
There were also inter-study differences in the sequences defined as T cell epitopes, as these studies were performed in HLA-diverse subjects, and whereas three groups of authors employed T cells from SARS-CoV-2 convalescent patients for peptide screening, Mateus et al. focused on identifying pre-existing cross-reactive T cell responses to SARS-CoV-2 and so performed their peptide screening in individuals exposed only to seasonal coronaviruses. Together, the 93 HLA-II restricted peptides to which CD4+ T cell responses were detected in one or more of these publications spanned 57% of the SARS-CoV-2 S protein sequence used in this study and 74% of the immunopeptidome sequence, indicative of substantial enrichment in the latter. Moreover, 67% of the amino acids contained within T cell targeted peptides were located in peptides in the immunopeptidome (Figure 6E, Supplementary Figure S7). Given that T cell response screening was performed with overlapping sets of long peptides and epitopes were not precisely defined within these, and the more distal regions of the HLA-II associated peptides defined by our immunopeptidome profiling strategy may not be required for T cell recognition, the amino acid overlap between the T cell-recognised and HLA-II-bound peptide sequences provides an under-estimate of the concordance between the epitopes targeted by CD4+ T cell responses and the repertoire of HLA-II bound peptides.
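The three percentages quoted above are residue-level coverage ratios, and the way they relate to each other can be sketched as follows. This is a minimal illustration and not the authors' analysis script: the function names, the representation of peptides as inclusive 1-based (start, end) spans, and the toy numbers are all assumptions.

```python
from typing import Dict, Iterable, Set, Tuple

Span = Tuple[int, int]  # inclusive, 1-based (start, end) on the protein


def covered_positions(spans: Iterable[Span]) -> Set[int]:
    """Return the set of residue positions covered by at least one span."""
    positions: Set[int] = set()
    for start, end in spans:
        positions.update(range(start, end + 1))
    return positions


def ratio(numerator: int, denominator: int) -> float:
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def overlap_summary(protein_length: int,
                    tcell_spans: Iterable[Span],
                    hla_spans: Iterable[Span]) -> Dict[str, float]:
    """Residue-level overlap between T cell-targeted and HLA-II-bound peptides."""
    tcell = covered_positions(tcell_spans)
    hla = covered_positions(hla_spans)
    shared = tcell & hla
    return {
        # Share of the whole protein spanned by T cell-targeted peptides.
        "tcell_fraction_of_protein": ratio(len(tcell), protein_length),
        # Share of immunopeptidome residues also covered by T cell peptides.
        "tcell_fraction_of_immunopeptidome": ratio(len(shared), len(hla)),
        # Share of T cell peptide residues found in the immunopeptidome.
        "immunopeptidome_fraction_of_tcell": ratio(len(shared), len(tcell)),
    }


# Toy spans on a hypothetical 100-residue protein (not the study's data).
summary = overlap_summary(
    protein_length=100,
    tcell_spans=[(1, 20), (41, 60)],
    hla_spans=[(11, 30), (51, 55)],
)
print(summary)
```

In this toy case the T cell peptides span 40% of the protein but 60% of the immunopeptidome residues, which is the kind of enrichment the comparison above describes. Counting unique positions, rather than summing peptide lengths, avoids double-counting residues shared by overlapping or nested peptides.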
Overall, the extensive overlap observed between these datasets (Figure 6E [...]

[...] DR alleles correlated with protein expression levels, but certain alleles may also present a more diverse repertoire of peptides due to a preference for more common amino acids and/or ability to tolerate a greater number of different amino acid residues at key anchor positions, and/or to differences in their association with the peptide editor HLA-DM or the associated HLA-DO protein that modulates its function (Roche and Furuta, 2015).

[...] adaptive responses (by shielding nAb binding sites and impairing antigen processing for T cell recognition); but it is also targeted by host innate immune recognition pathways (Baum and Cobb, 2017). S protein glycosylation is carried out by the host cell glycan processing machinery, resulting in attachment of a range of oligomannosidic, complex or hybrid structures that mimic mature surface glycoproteins of the host. We initially confirmed that these patterns were present in the intact S protein used to pulse MDDCs. Strikingly, we found that the HLA-II-bound S peptides were in contrast glycosylated at the same sites, but with glycans rich in highly processed paucimannosidic-type structures. This observation implies a significant modulation of the glycan phenotype upon internalisation, processing, and presentation of the S glycoprotein in MDDCs. Paucimannosidic glycans are defined as [...]

Interestingly, whilst 78% of the amino acids within S1-derived HLA-II bound peptides overlapped with residues in peptides that were found to be targeted by CD4+ T cell responses in SARS-CoV-2 convalescent patients in the study by Tarke [...] where linkage to exogenous CD4+ T(fh) epitopes such as broadly-presented peptides from tetanus toxoid may be advantageous in future vaccine design.
In the current study, the repertoire of peptides presented with HLA-II by S protein-pulsed MDDCs derived from 5 HLA-DRB1-heterozygous donors expressing a total of 9 different HLA-DRB1 alleles was profiled using a workflow focused primarily on identification of HLA-DR-bound peptides, which also provided some insight into the HLA-DP-bound peptide repertoire. Limitations included the number of donors analysed and diversity of HLA-II alleles they expressed, and the number of MDDCs it was possible to generate from each donor, which restricted the quantity of HLA-II available for peptide profiling. In future, a more comprehensive map of the S protein-derived HLA-II-bound peptide repertoire could be generated by employing MDDCs from a larger number of donors selected to express HLA-DR, -DP and -DQ alleles covering a higher proportion of global HLA-II diversity, and obtaining more cells from each donor so that sufficient HLA-II was available to achieve an in-depth profiling of HLA-DR, -DP and -DQ bound peptides.

[...] Table S1. [...] University, Melbourne, Australia. Briefly, cells were cultured in serum-free Hybridoma-SFM medium (Gibco) supplemented with hybridoma mix (2,800 mg/l of d-glucose, 2,300 mg/l of peptone, 2 mM l-glutamine, 1% penicillin/streptomycin, 1% non-essential amino acids, 0.00017% 2-mercaptoethanol). Supernatant containing antibody was harvested and stored at -20°C. To purify antibodies, supernatants were thawed, centrifuged at 2,500 × g for 25 min at 4°C, filtered (0.2 µm SteriCup Filter (Millipore)) and incubated with protein A resin (PAS) (Expedeon) for 18 hours at 4°C. Antibody-resin complexes were then collected by gravity flow through chromatography columns, washed with 20 ml of PBS, and eluted with 5 ml of 100 mM glycine pH 3.0.
The pH was adjusted to 7.4 using 1 M Tris pH 9.5, and antibodies were buffer exchanged into PBS and concentrated with a 5 kDa molecular weight cut-off ultrafiltration device (Millipore).

Cross-linking of purified antibodies was performed as follows: 3 mg of antibody per 0. [...]

The peptidome of SARS-CoV-2 spike glycoprotein is presented by DCs on HLA-II
Spike glycans are truncated during glycopeptide processing and presentation
The spike receptor binding motif is a HLA-II-binding peptide-rich region
Concordance between the immunopeptidome and T cell immunogenicity data reported elsewhere

In brief
Parker et al. map the HLA-II-bound peptides and glycopeptides presented by SARS-CoV-2 spike protein-pulsed monocyte-derived dendritic cells. They observe that complex glycans on the spike immunogen are trimmed during antigen processing, revealing a signature for HLA-II presentation, and highlight congruence between the HLA-II-bound peptides identified and T cell epitopes.

Figure S1, related to STAR Methods. Production, purification, and sequence variation of the SARS-CoV-2 S ectodomain. A. Size exclusion chromatography profile of the SARS-CoV-2 S protein that was purified using the C-terminal TwinStrep tags. The S protein was run on a Superose 6 Increase 10/300 column. The dotted lines indicate the portion of the peak that was collected and used for this study. B. SDS-PAGE gel of the purified S protein with lanes from left to right showing molecular weight marker, S protein run under reducing conditions and S protein run under non-reducing conditions. C. Pairwise sequence alignment of regions of the S protein sequence (P0DTC2, SPIKE_SARS2, displayed as top sequence) and corresponding regions of the sequence of the recombinant S protein employed in this study (bottom sequence).
Peptides found in the immunopeptidome are indicated in red text, and the sites where the amino acid sequence of the recombinant protein differs from the UniProt S sequence are highlighted in yellow.

Tables S1-5

Table S1, relates to Figure 2 and 3: Supplementary data for immunopeptidomic data analysis, providing peptide identification metrics, nested clusters and NetMHCIIpan predictions for all peptides identified by the Peaks search engine for sequences that map to S protein in all donors.

Table S2, relates to Figure 3 and 4: Supplementary data for elastase digestion of purified S protein and treatment with PNGase F in the presence of H2O18, providing site, modification, peptide identification score and occupancy.

SARS-CoV-2 T cell immunity: Specificity, function, durability, and role in protection
Comprehensive glycoprofiling of the epimastigote and trypomastigote stages of Trypanosoma cruzi
The direct and indirect effects of glycans on immune function
SARS-CoV-2-reactive T cells in healthy donors and patients with COVID-19
Potent neutralizing antibodies from COVID-19 patients define multiple targets of vulnerability
Potent Neutralizing Antibodies against SARS-CoV-2 Identified by High-Throughput Single-Cell Sequencing of Convalescent Patients' B Cells
The convalescent sera option for containing COVID-19
SARS-CoV-2 infection protects against rechallenge in rhesus macaques
Modification of cysteine residues in vitro and in vivo affects the immunogenicity and antigenicity of major histocompatibility complex class I-restricted viral determinants
SARS-CoV-2 Vaccines
T Follicular Helper Cell Biology: A Decade of Discovery and Diseases
Glycan side chains on naturally presented MHC class II ligands
Development of a novel clustering tool for linear peptide sequences
Intrafamilial Exposure to SARS-CoV-2 Induces Cellular Immune Response without Seroconversion. medRxiv
Analysis of the SARS-CoV-2 spike protein glycan shield: implications for immune recognition
A Sequence Homology and Bioinformatic Approach Can Predict Candidate Targets for Immune Responses to SARS-CoV-2
Targets of T Cell Responses to SARS-CoV-2 Coronavirus in Humans with COVID-19 Disease and Unexposed Individuals
Studies in humanized mice and convalescent humans yield a SARS-CoV-2 antibody cocktail
Cysteinylation of MHC class II ligands: peptide endocytosis and reduction within APC influences T cell recognition
InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams
Controlling the SARS-CoV-2 spike glycoprotein conformation
High-resolution longitudinal N- and O-glycoprofiling of human monocyte-to-macrophage transition
N-linked carbohydrates in tyrosinase are required for its recognition by human MHC class II-restricted CD4(+) T cells
An mRNA Vaccine against SARS-CoV-2 - Preliminary Report
Reduction of disulfide bonds during antigen processing: evidence from a thiol-dependent insulin determinant
Humoral and circulating follicular helper T cell responses in recovered patients with COVID-19
The dynamics of humoral immune responses following SARS-CoV-2 infection and the potential for reinfection
The Human Leukocyte Antigen Class II Immunopeptidome of the SARS-CoV-2 Spike Glycoprotein
Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
Dynamics of major histocompatibility complex class II compartments during B cell receptor-mediated cell activation
Toward Automated N-Glycopeptide Identification in Glycoproteomics
Conservation analysis of SARS-CoV-2 spike suggests complicated viral adaptation history from bat to human
Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses
Human CD4(+) memory T-lymphocyte responses to SARS coronavirus infection
Potent Neutralizing Monoclonal Antibodies Directed to Multiple Epitopes on the SARS-CoV-2 Spike
Tandem 18O stable isotope labeling for quantification of N-glycoproteome
The effectiveness of convalescent plasma and hyperimmune immunoglobulin for the treatment of severe acute respiratory infections of viral etiology: a systematic review and exploratory meta-analysis
Identification and Characterization of Complex Glycosylated Peptides Presented by the MHC Class II Processing Pathway in Melanoma
Selective and cross-reactive SARS-CoV-2 T cell epitopes in unexposed humans
The HLA-A*0201-restricted H-Y antigen contains a posttranslationally modified cysteine that significantly affects T cell recognition
Antigen processing and presentation of a naturally glycosylated protein elicits major histocompatibility complex class II-restricted, carbohydrate-specific T cells
Oxidation of peptides during electrospray ionization
SARS-CoV-2-derived peptides define heterologous and COVID-19-induced T cell recognition
Insights into the Role of GILT in HLA Class II Antigen Processing and Presentation by Melanoma
Broad and strong memory CD4+ and CD8+ T cells induced by SARS-CoV-2 in UK convalescent COVID-19 patients
The PRIDE database and related tools and resources in 2019: improving support for quantification data
Improved Prediction of MHC II Antigen Presentation through Integration and Motif Deconvolution of Mass Spectrometry MHC Eluted Ligand Data
The IPD and IMGT/HLA database: allele variant databases
The ins and outs of MHC class II-mediated antigen processing and presentation
Rapid isolation of potent SARS-CoV-2 neutralizing antibodies and protection in a small animal model
Immunogenicity of a DNA vaccine candidate for COVID-19
In silico T cell epitope identification for SARS-CoV-2: Progress and perspectives
Glycopeptide epitope facilitates HIV-1 envelope specific humoral immune responses by eliciting T cell help
Comprehensive analysis of T cell immunodominance and immunoprevalence of SARS-CoV-2
Human neutrophils secrete bioactive paucimannosidic proteins from azurophilic granules into pathogen-infected sputum
Human protein paucimannosylation: cues from the eukaryotic kingdoms
The cellular redox environment alters antigen presentation
Variations in MHC Class II Antigen Processing and Presentation in Health and Disease
ChAdOx1 nCoV-19 vaccination prevents SARS-CoV-2 pneumonia in rhesus macaques
Glycan analysis of human neutrophil granules implicates a maturation-dependent glycosylation machinery
Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Site-specific glycan analysis of the SARS-CoV-2 spike
Jalview Version 2--a multiple sequence alignment editor and analysis workbench
Phenotype and kinetics of SARS-CoV-2-specific T cells in COVID-19 patients with acute respiratory distress syndrome
The COVID-19 candidate vaccine landscape.
WHO (2021b). COVID-19 Weekly Epidemiological Update
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
Capturing Differential Allele-Level Expression and Genotypes of All Classical HLA Loci and Haplotypes by a New Capture RNA-Seq Method
Mass spectrometric analysis of asparagine deamidation and aspartate isomerization in polypeptides
Searching immunodominant epitopes prior to epidemic: HLA class II-restricted SARS-CoV spike protein epitopes in unexposed individuals
Long-lived effector/central memory T-cell responses to severe acute respiratory syndrome coronavirus (SARS-CoV) S antigen in recovered SARS patients
A DNA vaccine induces SARS coronavirus neutralization and protective immunity in mice
T cell responses are required for protection from clinical disease and for virus clearance in severe acute respiratory syndrome coronavirus-infected mice
Potently neutralizing and protective human antibodies against SARS-CoV-2

[...] washed with 15 ml of 0.005% IGEPAL, 50 mM Tris pH 8.0, 150 mM NaCl, 5 mM EDTA; 15 ml of 50 mM Tris pH 8.0, 150 mM NaCl; 15 ml of 50 mM Tris pH 8.0, 450 mM NaCl; and 15 ml of 50 mM Tris pH 8.0. 3 ml of 10% acetic acid was used to elute bound HLA complexes from the PAS-antibody resin, which were then dried by vacuum centrifugation. Eluted peptides were dissolved in loading buffer (0.1% (v/v) trifluoroacetic acid (TFA), 1% (v/v) acetonitrile in water), injected by an Ultimate 3000 HPLC system (Thermo Scientific), and separated across a 4.6 mm × 50 mm ProSwift RP-1S column (Thermo Scientific). Peptides were eluted using a 1 ml/min gradient over 5 min from 1-35% acetonitrile/0.1% TFA and 15 fractions were collected every 30 seconds. Peptide fractions 1-10 were combined into odd and even fractions then dried.

[...] utilised to define binding (rank score cut-off of 10). Sequence logos were generated by Seq2logo2.0.
Venn diagrams and UpSetR plots were created using the online tool InteractiVenn (http://www.interactivenn.net/index2.html) (Heberle et al., 2015) and the UpSetR program in R. Clustering of sequences was done by GibbsCluster2.0 with defaults for MHC class II ligands
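The rank score cut-off of 10 mentioned above can be applied to per-allele prediction output along the following lines. This is a minimal sketch rather than the authors' pipeline: it assumes the percentile ranks have already been parsed into a dictionary, that a lower rank means stronger predicted binding (as in NetMHCIIpan output), and that the cut-off is inclusive. The peptide labels and rank values are hypothetical.

```python
from typing import Dict, Optional, Tuple

RANK_CUTOFF = 10.0  # percentile rank cut-off stated in the text; inclusivity is assumed


def assign_allele(ranks: Dict[str, float],
                  cutoff: float = RANK_CUTOFF) -> Optional[Tuple[str, float]]:
    """Return (allele, rank) for the best-ranked allele, or None if no allele
    reaches the cut-off. Lower percentile rank means stronger predicted binding."""
    if not ranks:
        return None
    allele, rank = min(ranks.items(), key=lambda item: item[1])
    return (allele, rank) if rank <= cutoff else None


def fraction_predicted_binders(predictions: Dict[str, Dict[str, float]],
                               cutoff: float = RANK_CUTOFF) -> float:
    """Fraction of peptides predicted to bind at least one of the donor's alleles."""
    if not predictions:
        return 0.0
    bound = sum(1 for ranks in predictions.values()
                if assign_allele(ranks, cutoff) is not None)
    return bound / len(predictions)


# Hypothetical ranks for one donor carrying two HLA-DR alleles.
donor_predictions = {
    "PEPTIDE_A": {"DRB1*04:01": 2.5, "DRB3*01:01": 40.0},
    "PEPTIDE_B": {"DRB1*04:01": 55.0, "DRB3*01:01": 8.0},
    "PEPTIDE_C": {"DRB1*04:01": 30.0, "DRB3*01:01": 75.0},
}
for peptide, ranks in donor_predictions.items():
    print(peptide, assign_allele(ranks))
print(fraction_predicted_binders(donor_predictions))
```

Assigning each peptide to its single best-ranked allele is one simple convention; a peptide whose ranks fall under the cut-off for several alleles could equally be reported against all of them.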