key: cord-0802584-3982dr4t
authors: Marchan, Jose
title: A vaccine built from potential immunogenic pieces derived from the SARS-CoV-2 spike glycoprotein
date: 2020-09-24
journal: bioRxiv
DOI: 10.1101/2020.09.24.312355
sha: 9faadd502db337db834ec5e19e6bfcf381dcba5c
doc_id: 802584
cord_uid: 3982dr4t

Coronavirus Disease 2019 (COVID-19) represents a new global threat demanding a multidisciplinary effort to fight its etiological agent, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In this regard, immunoinformatics may help to predict prominent immunogenic regions from critical SARS-CoV-2 structural proteins, such as the spike (S) glycoprotein, for use in prophylactic or therapeutic interventions against this rapidly emerging coronavirus. Accordingly, in this study, an integrated immunoinformatics approach was applied to identify cytotoxic T cell (CTC), T helper cell (THC), and linear B cell (BC) epitopes from the S glycoprotein in an attempt to design a high-quality multi-epitope vaccine. The best CTC, THC, and BC epitopes showed high viral antigenicity, lack of allergenic or toxic residues, and suitable HLA-viral peptide interactions. Remarkably, the SARS-CoV-2 receptor-binding domain (RBD) and its receptor-binding motif (RBM) harbour several potential epitopes. The structure prediction, refinement, and validation data indicate that the multi-epitope vaccine has an appropriate conformation and stability. Three conformational epitopes and an efficient binding between Toll-like receptor 4 (TLR4) and the vaccine model were observed. Importantly, the population coverage analysis showed that the multi-epitope vaccine could be used globally. Notably, computer-based simulations suggest that the vaccine model has a robust potential to evoke and maximize both immune effector responses and immunological memory to SARS-CoV-2. Further research is needed to comply with the mandatory international guidelines for human vaccine formulations.
Graphical Abstract

by comparing the IC50 of each predicted peptide against random peptides from the SWISSPROT database. In addition, peptides binding to HLA-II were also chosen for their potential to induce interferon-gamma (IFN-g) (Fig. 1), which was evaluated using the IFNepitope server (http://crdd.osdd.net/raghava/ifnepitope/) [26].

BCPRED (http://ailab.ist.psu.edu/bcpred/) [27] was used to predict linear BC epitopes based on several physicochemical properties: hydrophilicity, flexibility, accessibility, and antigenicity propensity (threshold = 1 for each parameter). Simultaneously, the S glycoprotein amino acid sequence was also submitted to iBCE-EL (http://thegleelab.org/iBCE-EL/) [28] and BepiPred-2.0 (http://www.cbs.dtu.dk/services/BepiPred/) [29] for additional predictions of linear BC epitopes.

Molecular docking

To evaluate the presentation of the best epitopes in the context of HLA molecules, a molecular docking study was conducted (Fig. 1). Because HLA-C*06:02 and HLA-DRB1*01:01 were predicted as common interacting HLA alleles, they were selected for this purpose.

The molecular docking simulation process is summarised as follows. First, the best 3D structure of each epitope (9-mer and 15-mer peptides) was obtained from the PEP-FOLD server (https://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD3/) [30]. Second [45]. Three injections were applied four weeks apart as described previously [46]. Furthermore, 12 injections were applied four weeks apart to simulate repeated exposure to a potential immunogen. The Simpson index D was used to interpret the diversity of the immune response.

A total of 47 T cell epitopes were predicted; however, 8 CTC and 11 THC epitopes were identified as the best (Table 1). These 19 epitopes showed potent viral antigenicity, ranging from 0.63 to 1.52, and lacked allergenic or toxic residues in their sequences (Table 1).
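The selection criteria reported above (an antigenicity score together with the absence of allergenic and toxic residues) amount to a simple filter over the predicted epitopes. The following is a minimal sketch in Python, assuming the outputs of the prediction servers have already been collected into one record per epitope. The `Epitope` class, the `select_epitopes` helper, the 0.5 cut-off, and the example records are hypothetical illustrations and are not taken from the study's pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epitope:
    name: str
    sequence: str
    antigenicity: float  # antigenicity score from a prediction server
    allergenic: bool     # True if flagged as a probable allergen
    toxic: bool          # True if flagged as a probable toxin


def select_epitopes(candidates, min_antigenicity=0.5):
    """Keep epitopes that are antigenic, non-allergenic and non-toxic.

    The default cut-off is an assumption made for this sketch.
    """
    return [
        e for e in candidates
        if e.antigenicity >= min_antigenicity
        and not e.allergenic
        and not e.toxic
    ]


# Hypothetical records (placeholder sequences, invented scores).
candidates = [
    Epitope("EX1", "AAAAAAAAA", 0.63, False, False),  # passes all criteria
    Epitope("EX2", "CCCCCCCCC", 0.30, False, False),  # antigenicity too low
    Epitope("EX3", "DDDDDDDDD", 0.90, False, True),   # flagged as toxic
    Epitope("EX4", "EEEEEEEEE", 1.10, True, False),   # flagged as allergenic
]
selected = select_epitopes(candidates)
print([e.name for e in selected])
```

Keeping the three criteria in a single predicate makes it easy to see why a candidate with a high antigenicity score can still be rejected on toxicity or allergenicity grounds.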
Moreover, THC epitopes were characterized by their potential capability to induce IFN-g (Table 1). Although "EGFNCYFPLQSYGFQ" (E47 in Table 1) could be categorized as a strong potential THC epitope, it was identified as a probable inducer of toxicity. Therefore, this epitope was not included in the amino acid sequence of the multi-epitope vaccine. The selected CTC epitopes (Table 1)

A total of 10 linear BC epitopes of varying amino acid lengths were predicted (Table 2). Most of the epitopes showed robust viral antigenicity (≥0.5) and were identified as non-allergenic and non-toxic (Table 2). However, only 7 epitopes were selected for the vaccine design because they were predicted simultaneously by 3 different web tools (BCPRED, iBCE-EL, and BepiPred-2.0) (Table 2). Interestingly, overlapping residues were observed between some linear BC and T cell epitopes.

HLA-I and HLA-II alleles were docked with CTC and THC epitopes, respectively, using the PatchDock server, which has recently been applied to successfully dock epitopes from SARS-CoV-2 into immune cell receptors [10]. The HLA allele-viral peptide complexes showed high geometric shape complementarity scores (>6000) [52], similar to controls (Table 3). Inspection in VMD software revealed different binding patterns in which viral peptides interact appropriately with the active site residues of the HLA groove, in a similar way to control peptides (Fig. 2). Moreover, several viral peptides (e.g., E15 and E33) formed a bulge that projected from their respective HLA allele (Fig. 2B and 2C), which may suggest a more direct interaction with the T cell receptor [18].

To design the amino acid sequence of the multi-epitope vaccine, a total of 26 epitopes (8 CTC, 11 THC, and 7 linear BC epitopes) were organized using several linkers (Fig. 1). The resulting sequence comprises 437 amino acid residues (Fig. 1).
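Joining epitope groups with linkers, as described above, can be illustrated with a short Python sketch. The text does not state which linkers were used, so the linker strings below (AAY, GPGPG, KK) are assumptions borrowed from common practice in multi-epitope designs. The epitope sequences are placeholders, and `assemble_construct` is a hypothetical helper rather than the author's procedure.

```python
def assemble_construct(groups):
    """Concatenate epitope groups into one amino acid sequence.

    `groups` is a list of (linker, epitopes) pairs. Every epitope except the
    very first one in the construct is preceded by the linker of its group.
    """
    parts = []
    for linker, epitopes in groups:
        for epitope in epitopes:
            if parts:  # no linker before the first epitope
                parts.append(linker)
            parts.append(epitope)
    return "".join(parts)


# Placeholder sequences and assumed linkers, for illustration only.
groups = [
    ("AAY", ["AAAAAAAAA", "CCCCCCCCC"]),  # 9-mer, CTC-like placeholders
    ("GPGPG", ["D" * 15]),                # 15-mer, THC-like placeholder
    ("KK", ["E" * 12]),                   # linear BC-like placeholder
]
construct = assemble_construct(groups)
print(construct, len(construct))
```

With real inputs, the length of the returned string gives a quick sanity check against the residue count reported for the construct, provided that any additional components of the sequence are also accounted for.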
Of particular note, 6 epitopes selected for the vaccine design (E2, E19, E43, E44, and E45 in Table 1; E10 in Table 2) harbour residues that are usually involved in the interaction between the SARS-CoV-2 S glycoprotein and hACE2 [49] [50] [51]. For instance, N501, which is present in the amino acid sequence of E19 (Table 1) and E10 (Table 2), has recently been described as one of the critical hACE2-binding residues in SARS-CoV-2 [51].

The 437 amino acid long vaccine construct was analysed using the PSIPRED server to predict its secondary structure, which identified 316, 70, and 51 amino acids forming coil, helix, and strand regions, respectively (Fig. 3A).

The tertiary structure was subjected to refinement using the GalaxyRefine server. The output showed five potential models. Model 1 (Fig. 3B) analysis. In this regard, the Ramachandran plot (Fig. 3C) showed that 86.4% of residues were located in most favoured regions, whereas the remaining residues were observed in additional allowed (11.7%), generously allowed (1.2%), and disallowed (0.6%) regions. In addition, the Z-score value (-2.35) (Fig. 3D) suggests that the vaccine structure is similar to native proteins of comparable size.

Three conformational BC epitopes (CE) were predicted using ElliPro (Fig. 4). These CE showed high probability scores (CE1: 0.914, CE2: 0.841, and CE3: 0.821), suggesting considerable accessibility for antibodies (Fig. 4). These results also support the immunogenic potential of the multi-epitope vaccine construct.

viruses [18], and macrophages were observed (Fig. 6). Regarding the adaptive immune response, CTC and THC populations showed a proliferative burst, effector cell generation, and a dramatic contraction in cell number (Fig. 6). Importantly, IL-2, which is necessary for T cell activation and optimal proliferation [18], was amplified after each dose (Fig. 6).
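The Simpson index D, which this study uses to interpret the diversity of the simulated immune response, is not defined in the text. The sketch below shows the classical form, D = sum of p_i squared over the relative abundances p_i, together with its complement 1 - D. Which convention the simulation server reports is not stated in the source, so the choice here is an assumption, and the clone counts are invented for illustration.

```python
def simpson_index(counts):
    """Classical Simpson index D = sum(p_i ** 2) for a list of clone counts.

    D is 1.0 when a single clone dominates completely and approaches 1/n
    when n clones are equally abundant.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return sum((c / total) ** 2 for c in counts)


def simpson_diversity(counts):
    """Complement 1 - D, which grows as the population becomes more diverse."""
    return 1.0 - simpson_index(counts)


# Invented clone counts, for illustration only.
print(simpson_index([10, 10]))         # two equally abundant clones
print(simpson_index([5, 0, 0]))        # one dominant clone
print(simpson_diversity([1, 1, 1, 1]))
```

Because D, 1 - D, and 1/D are all in use, any comparison with published simulation plots should first establish which of these the plotted quantity corresponds to.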
Moreover, the vaccine model increased BC and plasma cell populations, particularly immunoglobulin M (IgM) and IgG1 isotypes (Fig. 6). In this regard, titres of IgM, IgG1, and IgG2 were higher in the secondary and tertiary responses than in the primary response (Fig. 6). Of note, immunogen concentrations decreased after the antibody response (Fig. 6). Notably, repeated exposure with 12 injections (given 4 weeks apart) increased the IgG1 levels and stimulated CTC and THC populations (Fig. S1). Taken Spain, Sweden, USA, UK, etc., (Fig. 7).

Immunoinformatics represents a valuable tool whereby the limitations in the selection of appropriate antigens and immunodominant epitopes may be overcome [17]. Previous in silico-based reports have shown that the SARS-CoV-2 S glycoprotein contains potential epitopes [8] [9] [10] [11]. Therefore, researchers have recently attempted to design epitope-based vaccine candidates against SARS-CoV-2 [10, 11]. Despite these relevant contributions, one group used only T cell epitopes and did not include BC epitopes [10], which are fundamental players in the antiviral immune response [18]. The second work, on the other hand, considered several viral membrane proteins, including the S glycoprotein, to identify probable T and BC epitopes [11]. Although the predicted epitopes showed good immunogenic potential, the vaccine does not target the S glycoprotein RBM [11]. In the present study, B and T cell epitopes with high potential were predicted from the SARS-CoV-2 S glycoprotein, and the best were selected to design a high-quality multi-epitope vaccine candidate. Remarkably, this vaccine harbours 2 epitopes (E19 in Table 1 and E10 in Table 2) that could evoke immune responses against the SARS-CoV-2 RBM, the motif mainly responsible for virus entry into human cells [4, 51], whereas 4 epitopes (E43, E44, E45, and E46 in [4].
The T cell epitopes included in the vaccine sequence meet relevant requirements for designing a suitable multi-epitope vaccine candidate. Firstly, they showed marked antigenicity, immunogenicity, and a lack of allergenic or toxic residues. Secondly, the THC epitopes were predicted as potent inducers of IFN-g, a crucial cytokine for CTC activation [18]. Thirdly, both CTC and THC epitopes interacted properly with the groove of HLA-I and HLA-II alleles, respectively, which is in agreement with other computer-based reports [54], thereby suggesting that the T cell epitopes identified and selected in the present study could be successfully presented in the context of HLA molecules.

The purpose of an adjuvant is to make a vaccine "detectable" for antigen-presenting cells such as dendritic cells [55]. In this regard, adjuvants approved or in clinical trials (NCT01609257) for virus-like particle-based vaccines consist of TLR4 agonists [55]. Here, the TLR4 adjuvant known as RS09 [33] was included in the multi-epitope vaccine sequence. The molecular docking simulation showed that the multi-epitope vaccine interacts appropriately with this innate immune receptor, in a similar way to previous works [35].

Notably, this study shows, by immunoinformatics simulations, the induction of both convalescent patients [57]. These immune responses were directed to the SARS-CoV-2 S glycoprotein [57].

This work was limited in that A) the population coverage analysis did not include some countries, particularly from Africa, Central America, Eastern Europe, and Central Asia. This was mainly because data on HLA allele frequencies were not available.

Table 3. PatchDock score based on geometric shape complementarity scores (GSCS) of the HLA allele-epitope complexes.
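The limitation noted above, that population coverage cannot be computed where HLA allele frequencies are unavailable, follows directly from how coverage is estimated. The sketch below is a simplified illustration, not the algorithm of the population coverage tool used in the study. It assumes Hardy-Weinberg proportions within each locus and independence between loci. The function names and the allele frequencies are hypothetical.

```python
def locus_coverage(allele_freqs):
    """Fraction of individuals carrying at least one covered allele at a locus.

    Assumes Hardy-Weinberg proportions: an individual is not covered only if
    both of their alleles fall outside the covered set.
    """
    covered = sum(allele_freqs)
    if not 0.0 <= covered <= 1.0:
        raise ValueError("allele frequencies at one locus must sum to [0, 1]")
    return 1.0 - (1.0 - covered) ** 2


def combined_coverage(loci):
    """Coverage across several loci, assuming the loci are independent."""
    uncovered = 1.0
    for allele_freqs in loci:
        uncovered *= 1.0 - locus_coverage(allele_freqs)
    return 1.0 - uncovered


# Hypothetical frequencies of epitope-binding alleles at two loci.
hla_class_i = [0.25, 0.25]
hla_class_ii = [0.5]
print(locus_coverage(hla_class_i))
print(combined_coverage([hla_class_i, hla_class_ii]))
```

Every term in the calculation is a population-specific allele frequency, so a country without frequency data simply has no inputs from which a coverage value could be derived.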
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Virology, Epidemiology, Pathogenesis, and Control of COVID-19
Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
SARS-CoV-2 Vaccines: Status Report
Therapeutic efficacy of a multi-epitope vaccine against Helicobacter pylori infection in BALB/c mice model, Vaccine
Identification of Potential MHC Class-II-Restricted Epitopes Derived from Leishmania donovani Antigens by Reverse Vaccinology and Evaluation of Their CD4+ T-Cell Responsiveness against Visceral Leishmaniasis
A Sequence Homology and Bioinformatic Approach Can Predict Candidate Targets for Immune Responses to SARS-CoV-2
Epitopes for a 2019-nCoV vaccine
Development of epitope-based peptide vaccine against novel coronavirus 2019 (SARS-COV-2): Immunoinformatics approach
Design of a peptide-based subunit vaccine against novel coronavirus SARS-CoV-2, Microb. Pathog. (2020) 104236 Advance online publication
Biologicals: Nonclinical evaluation of vaccines
AllergenFP: allergenicity prediction by descriptor fingerprints
AlgPred: prediction of allergenic proteins and mapping of IgE epitopes
Open Source Drug Discovery Consortium, G. P. Raghava, In silico approach for predicting toxicity of peptides and proteins
VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines
Moving from Empirical to Rational Vaccine Design in the 'Omics' Era
Kuby Immunology
Immune epitope database analysis resource
Modeling the MHC class I pathway by combining predictions of proteasomal cleavage, TAP transport and MHC class I binding
A consensus epitope prediction approach identifies the breadth of murine T(CD8+)-cell responses to vaccinia virus
NetMHCpan, a method for MHC class I binding prediction beyond humans
A systematic assessment of MHC class II peptide binding predictions and evaluation of a consensus approach
Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan
Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method
Designing of interferon-gamma inducing MHC class-II binders
Prediction of continuous B-cell epitopes in an antigen using recurrent neural network
A New Ensemble Learning Framework for Improved Linear B-Cell Epitope Prediction
BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes
A fast and accurate method for large-scale de novo peptide structure prediction
PatchDock and SymmDock: servers for rigid and symmetric docking
VMD - Visual Molecular Dynamics
Synthetic Toll like receptor-4 (TLR-4) agonist peptides as a novel class of adjuvants
Toll-like receptors in antiviral innate immunity
Novel Immunoinformatics Approaches to Design Multi-epitope Subunit Vaccine for Malaria by Investigating Anopheles Salivary Protein
Protein identification and analysis tools in the ExPASy server
SOLpro: accurate sequence-based prediction of protein solubility
Protein secondary structure prediction based on position-specific scoring matrices
The PSIPRED Protein Analysis Workbench: 20 years on
Prediction of Protein Structure and Interaction by GALAXY protein modeling programs
GalaxyWEB server for protein structure prediction and refinement
PROCHECK - a program to check the stereochemical quality of protein structures
ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins
ElliPro: a new structure-based tool for the prediction of antibody epitopes
Computational immunology meets bioinformatics: the use of prediction tools for molecular binding in the simulation of the immune system
Structural basis and designing of peptide vaccine using PE-PGRS family protein of Mycobacterium ulcerans - An integrated vaccinomics approach
Predicting population coverage of T-cell epitope-based diagnostics and vaccines
RStudio: Integrated Development for
Role of changes in SARS-CoV-2 spike protein in the interaction with the human ACE2 receptor: An in silico analysis
Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
Structural basis of receptor recognition by SARS-CoV-2
Efficient Unbound Docking of Rigid Molecules
World Health Organization. Coronavirus disease (COVID-2019) situation reports
Immunoinformatics-aided identification of T cell and B cell epitopes in the surface glycoprotein of 2019-nCoV
Adjuvant formulations for virus-like particle (VLP) based vaccines
In-silico design of a multi-epitope vaccine candidate against onchocerciasis and related filarial diseases
Targets of T cell responses to SARS-CoV-2 coronavirus in humans with COVID-19 disease and unexposed individuals
Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding