key: cord-0823679-70xvrfbg
authors: Ali, Mohammad Tuhin; Morshed, Mohammed Monzur; Gazi, Md. Amran; Musa, Md. Abu; Kibria, Md Golam; Uddin, Md Jashim; Khan, Md. Anik Ashfaq; Hasan, Shihab
title: Computer aided prediction and identification of potential epitopes in the receptor binding domain (RBD) of spike (S) glycoprotein of MERS-CoV
date: 2014-08-30
journal: Bioinformation
DOI: 10.6026/97320630010533
sha: f898a667de6c7ad8916d2e542498e953589be8d7
doc_id: 823679
cord_uid: 70xvrfbg

Middle East Respiratory Syndrome Coronavirus (MERS-CoV) belongs to the Coronaviridae family. Despite several outbreaks in recent years, no vaccine against this deadly virus has been developed yet. In this study, the receptor binding domain (RBD) of the spike (S) glycoprotein of MERS-CoV was analyzed through a computational immunology approach to identify its antigenic determinants (epitopes). To this end, S glycoprotein sequences from different geographical regions were aligned to assess the conservancy of the MERS-CoV RBD. The immune parameters of this region were determined using different in silico tools and the Immune Epitope Database (IEDB). A molecular docking study was also performed to check the affinity of the potential epitope for the binding cleft of the specific HLA allele. The N-terminal RBD (S367-S606) of the S glycoprotein was found to be conserved among all available strains of MERS-CoV. Based on lower IC50 values, a total of eight potential T-cell epitopes and 19 major histocompatibility complex (MHC) class-I alleles were identified for this conserved region. A 9-mer epitope, CYSSLILDY, interacted with the largest number of MHC class-I molecules and produced the highest peak in the B-cell antigenicity plot, which suggests that it could be a good candidate for designing an epitope-based peptide vaccine against MERS-CoV, provided that it undergoes further in vitro and in vivo experiments.
Moreover, in the molecular docking study, this epitope was found to have a significant binding affinity of -8.5 kcal/mol towards the binding cleft of the HLA-C*12:03 molecule.

According to the World Health Organization (WHO), a total of 536 laboratory-confirmed cases of MERS-CoV infection have been identified to date, 145 of which resulted in death (http://www.who.int/csr/disease/coronavirus_infections/archive_updates/en). Although it has not yet affected the world population on a massive scale, its progressive spread and concerning fatality rate create an urgent demand for therapeutic solutions or vaccination. The field of computational immunology is developing rapidly. Current tools enable us to predict potential epitopes from large protein antigens encoded by viral genomes. Identification of B- and T-cell epitopes and their respective MHC alleles is a crucial step in the initial screening process of an immunoinformatics approach [3]. Coronaviruses possess a non-structural replicase polyprotein and several structural proteins, including the spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins [4, 5]. As in other coronaviruses, host cell recognition by MERS-CoV, here via the CD26 receptor, is mediated by its surface-anchored S glycoprotein [6, 7]. The S glycoprotein has S1 and S2 subunits. The S1 subunit contains the RBD, which mediates initial host cell recognition, whereas the S2 subunit mediates membrane fusion [8, 9]. Several recent studies have revealed that the N-terminal region spanning amino acids 367-606 of the MERS-CoV S glycoprotein binds human CD26 and produces a significant immune response. This suggests that the receptor binding capacity of MERS-CoV lies in this particular region of the S glycoprotein [10, 11]. Considering these facts, we chose this region of the S glycoprotein for the prediction and identification of potential T- and B-cell epitopes in this computational study.
The sequences of the S glycoprotein were retrieved from two databases: NCBI (National Center for Biotechnology Information) (http://www.ncbi.nlm.nih.gov/) and UniProt (http://www.uniprot.org/). These sequences originate from diverse geographical locations, including Saudi Arabia, England, Qatar, Spain, Germany, Jordan and the UAE, and were collected between 2012 and 2013. Clustal Omega (version 1.2.1) was used to generate a multiple sequence alignment of the retrieved sequences, which was the basis for determining the conservancy of the MERS-CoV RBD [12].

Different tools available from the IEDB analysis resource (http://tools.immuneepitope.org/main/index.html) were used to evaluate the immunogenicity of the MERS-CoV RBD. The NetCTL prediction method was used to identify T-cell epitopes from this region [13, 14]. The MHC class-I binding prediction tool, based on the Stabilized Matrix Method (SMM), was used to identify MHC class-I alleles for the final set of T-cell epitopes [15]. The "Proteasomal cleavage/TAP transport/MHC class I combined predictor" was used to determine an overall score for each peptide's intrinsic potential to be a T-cell epitope (http://tools.immuneepitope.org/processing/) [15] [16] [17]. The epitope conservancy analysis tool was used to calculate the conservancy of the final set of epitopes within the S glycoprotein sequences [18]. In addition, the ProPred1 and SYFPEITHI databases were used to identify the anchoring residues of the epitopes [19, 20].

The 3D structure of the MHC class-I molecule (HLA-C*12:03; PDB ID: 2FSE) was retrieved from the Protein Data Bank (PDB) (http://www.rcsb.org/pdb/home/home.do). The 3D structure of the best epitope, CYSSLILDY (see the discussion section), was built using the PEPstr peptide tertiary structure prediction server (http://www.imtech.res.in/raghava/pepstr/).
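The conservancy calculation mentioned above can be illustrated with a minimal sketch. It assumes the simplest case, an exact (100% identity) match of the epitope within each sequence, and is not the IEDB tool itself; the function name and the toy sequences are hypothetical and are not real S glycoprotein data.

```python
def epitope_conservancy(epitope, sequences):
    """Return the fraction of sequences containing the epitope as an exact substring."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    epitope = epitope.upper()
    hits = sum(1 for seq in sequences if epitope in seq.upper())
    return hits / len(sequences)


# Hypothetical toy sequences for illustration only (not real S glycoprotein data).
toy_sequences = [
    "AAACYSSLILDYGGG",
    "TTTCYSSLILDYKKK",
    "AAACYSSLVLDYGGG",  # carries one substitution inside the epitope
]

conservancy = epitope_conservancy("CYSSLILDY", toy_sequences)
print(f"Conservancy at 100% identity: {conservancy:.1%}")
```

A value of 1.0 from such a check would correspond to an epitope that is present, unchanged, in every sequence examined.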
AutoDock Tools (ADT) from the MGL software package (version 1.5.6) and AutoDock Vina (Vina, version 4.2) were used for the molecular docking study [21, 22]. At the beginning of the docking study, the 3D structures of both the protein (HLA-C*12:03) and the ligand (CYSSLILDY) were opened in ADT. To place the epitope at the binding groove of HLA-C*12:03, the grid box centre was set at 22.474, 5.416 and 34.437 Å in the x, y and z directions, respectively, with 1.0 Å spacing. The box size was set at 40, 38 and 52 Å (grid points at 1.0 Å spacing) in the x, y and z directions, respectively. ADT was used to set these parameters, whereas AutoDock Vina was used to run the docking experiment. Output files were visualized with the PyMOL molecular visualization system (version 1.5.0.3). In addition, the immunodominant determinant of human type-2 collagen (PDB ID: 2FSE) was used as the ligand in a positive control run, with all parameters kept the same as in the initial docking.

The B-cell epitope prediction tool from IEDB was used to predict the B-cell antigenicity of the MERS-CoV RBD. The Kolaskar and Tongaonkar method, which can predict antigenic determinants with approximately 75% accuracy, was used for this prediction [23].

An illustration of the MERS-CoV RBD in complex with the human CD26 protein (DPP4) is given in Figure 1. A total of 54 S glycoprotein sequences were subjected to multiple sequence alignment. The alignment showed that the peptide region spanning amino acids 367-606 remained conserved among all the given sequences (Figure 2A). This finding indicates that the MERS-CoV RBD is a well-conserved region. NetCTL and the MHC class-I binding prediction tool predicted 235 overlapping CTL peptides and 79 possible MHC-I alleles, respectively.
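As a sketch of how the docking search box described in the methods could be written down for AutoDock Vina, the snippet below builds a plain-text configuration from the reported centre, grid points and spacing. The keys `receptor`, `ligand`, `center_x/y/z` and `size_x/y/z` are standard Vina configuration options; the file names and the helper function are hypothetical, and the conversion of points to a box size in Å (points × spacing) is an assumption about how the reported values map onto Vina's `size_*` options.

```python
def vina_config(receptor, ligand, center, size):
    """Build the text of an AutoDock Vina configuration file for one search box."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value}")
    return "\n".join(lines) + "\n"


# Values reported in the text: box centre (Å), grid points and spacing (Å).
center = (22.474, 5.416, 34.437)
points = (40, 38, 52)
spacing = 1.0

# Assumption: box edge length in Å is the number of points times the spacing.
size = tuple(n * spacing for n in points)

# Hypothetical file names for the prepared receptor and ligand.
config_text = vina_config("hla_c1203.pdbqt", "cysslildy.pdbqt", center, size)
print(config_text)
```

Keeping the box definition in one place makes it straightforward to reuse identical parameters for the positive control run, as the study describes.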
The final set of epitopes and their respective MHC class-I alleles was generated on the basis of a total score ≥ 1 and a half maximal inhibitory concentration (IC50) value ≤ 100. This final set includes only eight T-cell epitopes and a total of 19 MHC class-I alleles (Figure 2A & Figure 2B). A graphical presentation of the results predicted by the NetCTL prediction tool is given in Figure 3. The immunogenicity of the final epitopes was also determined in terms of proteasome score, TAP score, MHC-I binding score and processing score. All eight T-cell epitopes showed 100% sequence conservancy within the given S glycoprotein sequences, as summarized in Table 1 (see supplementary material).

Figure 3: The T-cell epitopes predicted by the NetCTL prediction tool. Most of the epitopes failed to cross the threshold level (1.0). Green sharp points indicate the epitopes that crossed the threshold level. In this graph, the x-axis represents the residue positions of the predicted epitopes, whereas the y-axis represents their scores.

The designed epitope (CYSSLILDY) and the experimental epitope (immunodominant determinant of human type-2 collagen; PDB ID: 2FSE) are displayed in Figure 3. AutoDock Vina predicted the best conformer of the CYSSLILDY epitope based on the binding energy in kcal/mol. In our study, the best conformer of this epitope had a binding energy of -8.3 kcal/mol, which was comparable to that of the positive control experiment (-7.5 kcal/mol). Among the nine residues of the epitope, only cysteine, serine and tyrosine (the first, fourth and ninth residues, respectively) showed anchoring capacity, while the other six residues remained in the binding groove of the HLA-C*12:03 molecule. A comparative analysis of the binding patterns of the designed and experimental epitopes at the binding groove of HLA-C*12:03 is illustrated in Figure 4.

The graphical depiction of the B-cell antigenicity of the MERS-CoV RBD is shown in Figure 5. The average antigenic propensity for this region is 1.050.
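For reference, the selection rule stated above for the final set of T-cell epitopes (total score ≥ 1 and IC50 ≤ 100) can be expressed as a simple filter. This is a minimal sketch only: the record layout, the function name, the placeholder peptides and alleles, and all numeric values are hypothetical and do not reproduce the study's predictions.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    peptide: str
    allele: str
    total_score: float  # combined processing score for the peptide
    ic50: float         # predicted MHC class-I binding IC50


def select_epitopes(predictions, min_score=1.0, max_ic50=100.0):
    """Keep peptide/allele pairs with total score >= min_score and IC50 <= max_ic50."""
    return [
        p for p in predictions
        if p.total_score >= min_score and p.ic50 <= max_ic50
    ]


# Hypothetical placeholder records, for illustration only.
toy_predictions = [
    Prediction("AAAAAAAAA", "HLA-X*00:01", 1.40, 35.0),   # passes both criteria
    Prediction("CCCCCCCCC", "HLA-X*00:01", 0.80, 20.0),   # total score too low
    Prediction("DDDDDDDDD", "HLA-X*00:02", 1.10, 250.0),  # IC50 too high
]

selected = select_epitopes(toy_predictions)
for p in selected:
    print(p.peptide, p.allele, p.total_score, p.ic50)
```

Both criteria must hold at once, so a peptide with a strong processing score but weak predicted binding is excluded, and vice versa.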
Twelve antigenic determinants were found in this region, as presented in Table 2 (see supplementary material).

To be considered a potent epitope, a peptide must have several key properties, such as high conservancy, T- or B-cell processivity and good binding affinity for MHC alleles. In our study, the values of these parameters were favourable for the finally chosen epitopes. We used the NetCTL prediction tool, which is based on a neural network architecture, for the primary screening of the RBD of the MERS-CoV S glycoprotein for T-cell epitopes. It predicts epitopes on the basis of peptide processing in vivo and gives a total score for each epitope using an algorithm that integrates MHC-I binding, Transporter of Antigenic Peptides (TAP) transport efficiency and proteasomal cleavage prediction [13, 14]. Among the eight T-cell epitopes, CYSSLILDY was the most successful because it had the maximum number of interactions with different MHC class-I alleles, which is a principal prerequisite for a potential epitope. In addition, the peptide region spanning amino acids 408-452 of the MERS-CoV S glycoprotein, "NYNLTKLLSLFSVNDFTCSQISPAAIASNCYSSLILDYFSYPLS", showed the highest B-cell antigenicity peak in the antigenicity plot. Most importantly, the most successful T-cell epitope, CYSSLILDY, is also part of the region with the highest B-cell antigenicity peak, suggesting that it may be a competent choice as a universal vaccine component against MERS-CoV, a rapidly growing health concern.

Conclusion: The identified epitopes cover a significant number of strains with decent population coverage. The computer-aided results show potential for eliciting cellular and humoral immunity and are expected to mount an immune response, subject to further in vivo and in vitro validation.
The step-by-step analysis and progression towards the maximum possible MHC coverage provide a significant primary screening result against this fatal virus.

License statement: This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes.

The authors declare that there is no conflict of interests.