key: cord-0011480-km5sqgj4 authors: Raoufi, Ehsan; Hemmati, Maryam; Eftekhari, Samane; Khaksaran, Kamal; Mahmodi, Zahra; Farajollahi, Mohammad M.; Mohsenzadegan, Monireh title: Epitope Prediction by Novel Immunoinformatics Approach: A State-of-the-art Review date: 2019-08-20 journal: Int J Pept Res Ther DOI: 10.1007/s10989-019-09918-z sha: 048d9f2c4e98bb9a9ddbde97b5d01aa253b24e07 doc_id: 11480 cord_uid: km5sqgj4
Immunoinformatics is a science that helps to generate significant immunological information using bioinformatics software and applications. One of its most important applications is the prediction of specific epitopes for recognition by B cells and, through MHC class I and II molecules, by T cells. This approach reduces cost and time compared to laboratory tests. In this state-of-the-art review, we review about 50 papers to find the latest and most widely used immunoinformatics tools, as well as their applications for predicting viral, bacterial and tumor-specific structural and linear epitopes of B and T cells. In the clinic, the main application of epitope prediction is the design of peptide-based vaccines. Peptide-based vaccines are a promising low-cost alternative that may reduce the risks related to the production of conventional vaccines.
The immune system consists of a complex network of thousands of interconnected molecules. High-throughput techniques can provide a great deal of valuable information about the immune system and the interactions among its components. Therefore, a computational approach is needed to store and analyze this information. Immunology has recently been influenced by resources and software that provide insight into the features of the immune system, and this has given rise to a new approach called immunoinformatics (Brusic and Petrovsky 2005; Gardy et al. 2009; Tomar and De 2010). Immunoinformatics includes the study and design of algorithms for mapping potential B cell and T cell epitopes.
With the help of this knowledge, antigenic regions of the target proteins can be detected (Davies and Flower 2007). Over the past decade, research based on in silico immunoinformatics methods has demonstrated the high potential of this approach for reducing the time and cost of immunogenicity studies, and the reliable results of these methods have attracted researchers to it. Epitopes are the regions of antigens that are recognized by immune system cells and are immunogenic for them. B and T cells are the main components of the acquired immune system that bind to antigen epitopes. For T cells to recognize an antigen, antigenic peptides must be bound in the clefts of MHC (major histocompatibility complex) molecules, also known as HLA (human leukocyte antigen) in humans. MHC molecules expressed on antigen presenting cells (APCs) present epitope peptides to T cells. These peptide epitopes must be linear so that they can fit into the MHC cleft. T cells are phenotypically classified into CD8+ and CD4+ lymphocytes (Martin et al. 2006). Activated CD8+ cells, called cytotoxic T cells (CTLs), bind to peptides presented by class I MHC molecules on APCs and are specific to intracellular antigens. Activated CD4+ cells, called T helper (TH) cells, bind to peptides presented by class II MHC molecules and are specific to extracellular antigens (Moise and De Groot 2006). The HLA system, encoded by 21 genes on chromosome 6, is highly polymorphic. HLA class I comprises three loci, HLA-A, HLA-B and HLA-C, which bind peptide sequences 8-12 amino acids long. HLA class II comprises HLA-DR, HLA-DP and HLA-DQ, which bind longer sequences of 15-24 amino acids. Identifying the different HLA alleles is very important for predicting T cell epitopes (Sette et al. 1990). Epitopes for B cells can be discontinuous (conformational) and/or continuous (linear).
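The peptide length ranges quoted above for HLA class I (8-12 residues) and class II (15-24 residues) suggest a simple first step for T cell epitope prediction: enumerating every candidate window of those lengths in an antigen sequence. The following is a minimal sketch of that enumeration only, not the procedure of any tool named in this review; the function name and the example sequence are hypothetical, and no binding prediction is performed.

```python
from typing import Iterable, Iterator, Tuple

# Length ranges quoted in the text above (in amino acids).
MHC_I_LENGTHS = range(8, 13)    # 8-12 residues
MHC_II_LENGTHS = range(15, 25)  # 15-24 residues


def candidate_peptides(sequence: str, lengths: Iterable[int]) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start position, peptide) for every window of each length."""
    seq = sequence.strip().upper()
    for k in lengths:
        # range() is empty when the sequence is shorter than k, so nothing is yielded.
        for start in range(len(seq) - k + 1):
            yield start + 1, seq[start:start + k]


# Hypothetical 20-residue sequence, for illustration only.
example = "MKTAYIAKQRQISFVKSHFS"
class_i = list(candidate_peptides(example, MHC_I_LENGTHS))
class_ii = list(candidate_peptides(example, MHC_II_LENGTHS))
```

In practice each candidate window would then be passed to an allele-specific predictor, which is where the HLA polymorphism discussed above comes in.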
In linear epitopes, the amino acid sequence of the protein itself serves as the epitope, and various criteria exist for choosing an optimal epitope. In conformational epitopes, the three-dimensional antigen structure plays an important role, and predicting these epitopes requires that structure. The residues of such epitopes can be adjacent to each other in the three-dimensional structure while being distant from each other in the linear sequence (Sharon et al. 2014). Nuclear magnetic resonance (NMR) and X-ray crystallography are two methods for identifying B cell epitopes, but they are very costly and time-consuming. Today, with the advancement of immunoinformatics tools, it is possible to predict B cell epitopes more accurately. Most antigenic epitopes are not linear sequences of amino acids; they are structures formed by different sections of the protein brought together in the three-dimensional structure. Therefore, bioinformatics modelling of these structures is required to properly delineate antigenic regions when predicting B cell epitopes (Potocnakova et al. 2016). Identification and prediction of antigenic regions for T and B cell epitopes are the most important applications of immunoinformatics. Immunoinformatics appears to be a useful tool for identifying new antigenic epitopes that can be used in the design of new vaccines for infectious diseases caused by different pathogens, including bacteria, viruses, fungi and parasites. In addition to microbial vaccines, immunoinformatics can be applied to malignancies of pathogenic origin (e.g. hepatocellular carcinoma, cervical carcinoma, head and neck carcinoma) and to non-pathogen-based cancers, including lung, breast, prostate and colon cancer.
In this review, the major immunoinformatics software tools used in recent years for identifying discontinuous and continuous viral, bacterial and tumor-specific epitopes for B and T cells are reviewed, based on how frequently they have been applied and on the validity of their results. A schematic diagram of the stages of peptide-based vaccine development, together with an overview of the epitope mapping process for peptide-based vaccine design against viral, bacterial and tumor-specific antigens, is shown in Fig. 1.
We conducted a state-of-the-art review to study and evaluate the latest software tools and their applications for epitope mapping in recent publications. The Google Scholar and Scopus databases were searched. All papers matching a search strategy covering in silico immunogenicity prediction, epitope prediction approaches and immunoinformatics tools were collected. Of these, 50 papers covering the years 2006 through 2018 were used as sources for this review. These papers were studied in full and categorized. The classification was based on the software and applications used to predict MHC class I and MHC class II epitopes for T cells, as well as linear and conformational B cell epitopes. The papers and their results were also divided by the nature of the target antigen into three sections: bacterial, viral and tumor-specific antigens. Finally, the quality of the results of the reviewed papers was evaluated, and the frequency of application of each software tool in each section was determined and is discussed in separate sections of this study.
Various bioinformatics tools are available to predict MHC-I epitopes. Among the papers reviewed, 37 studied these epitopes, and the tools used in these studies are as follows: IEDB, NetCTL, MHCPred, NetMHC, nHLAPred, CTL-Pred, SVMHC, RANKPEP, BIMAS, MAPPP, ProPred, SYFPEITHI, PREDEP, MHCPEP.
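The frequency analysis described in the methods above (how often each tool appears across the reviewed papers, and what percentage of papers that represents) can be sketched as a simple tally. This is an illustrative sketch under stated assumptions, not the authors' procedure: the function name is hypothetical and the per-paper tool lists below are made-up placeholders, not data from the reviewed papers.

```python
from collections import Counter
from typing import Dict, List, Tuple


def tool_frequencies(papers: List[List[str]], total_papers: int) -> Dict[str, Tuple[int, float]]:
    """Map each tool to (number of papers using it, percentage of total_papers)."""
    # set(tools) ensures a tool is counted at most once per paper.
    counts = Counter(tool for tools in papers for tool in set(tools))
    return {
        tool: (n, 100.0 * n / total_papers)
        for tool, n in counts.most_common()
    }


# Placeholder input: each inner list holds the tools one hypothetical paper used.
papers = [["IEDB", "NetCTL"], ["IEDB"], ["SVMHC", "IEDB", "IEDB"]]
freq = tool_frequencies(papers, total_papers=4)
```

Passing the total number of reviewed papers explicitly keeps the percentage relative to the whole review rather than to the subset of papers that happened to use any tool.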
The qualitative evaluation and the frequency of software application in the papers confirm the frequent use of IEDB. It is noteworthy, however, that in many studies epitopes are first predicted with one of the tools listed above, and the IEDB tool is then used to verify the result. IEDB can also be used alone for prediction, using the Stabilized Matrix Method (SMM). The frequency of each software tool in the prediction of MHC-I bound epitopes, by the nature of the antigen in the three sections of viral, bacterial and tumor-specific antigens, is shown in Tables 1, 2 and 3. As shown in Tables 1, 2 and 3, NetCTL and IEDB were mostly used for the prediction of MHC-I bound viral epitopes, SVMHC for the prediction of MHC-I bound bacterial epitopes, and ProPred-I, SVMHC and SYFPEITHI for the prediction of MHC-I bound tumor-specific epitopes.
Fig. 1 Schematic diagram of the stages of peptide-based vaccine development using immunoinformatics. In the first stage, candidate antigens are selected. These can be viral, bacterial or tumor-specific antigens. In the next stage (the in silico approach), immunoinformatics software and tools are used to find MHC I, MHC II and B cell epitopes. At this stage, molecular dynamics and molecular docking studies are used to evaluate how the top-ranked epitopes bind to the favourable binding site. Finally, epitopes with high binding affinity for MHCs and B cells are presented as predicted epitopes for the design of peptide-based vaccines.
Table 1 The assortment of articles for epitope prediction of viral antigens (38 articles out of 50 articles). The related web tools are shown in the second column. The third and fourth columns show the frequency and percentage of use of each web tool in the articles.
a Twenty-eight of the 50 articles studied prediction of MHC class I epitopes
b Twenty of the 50 articles studied prediction of MHC class II epitopes
c Six articles studied discontinuous B-cell epitope prediction
d Twenty-one articles studied continuous B-cell epitope prediction
According to the quantitative investigations in this study, approximately 50% of the papers were devoted to predicting MHC-II epitopes, most frequently for viruses. The qualitative studies and the frequency of software application in these studies have shown that IEDB is among the most widely used and most available tools. Tables 1, 2 and 3 show the frequency of each software tool in the prediction of MHC-II bound epitopes according to the nature of the antigens evaluated in the three sections of viral, bacterial and tumor-specific antigens. According to these tables, IEDB was mostly used for the prediction of MHC-II bound viral epitopes, Propred-II for the prediction of MHC-II bound bacterial epitopes, and MHCpred for the prediction of MHC-II bound tumor-specific epitopes.
Because their amino acids are contiguous, linear epitopes are less complex and less problematic to predict than conformational epitopes. Accordingly, 58% of the papers reviewed addressed these epitopes. According to the papers studied, the bioinformatics tools used to predict linear epitopes are as follows: Bepipred, BCpred, ABCpred, Pcipep, BCEpred, BepiTope, PrediTop, PEOPLE, LBtope, SVMTrip, COBEpro, EPMLR and Igpred.
Table 2 The assortment of articles for epitope prediction of bacterial antigens (8 articles out of 50 articles). The related web tools are shown in the second column. The third and fourth columns show the frequency and percentage of use of each web tool in the articles.
a Six of the 50 articles studied prediction of MHC class I epitopes
b Five of the 50 articles studied prediction of MHC class II epitopes
c Five articles studied discontinuous B-cell epitope prediction
d Five articles studied continuous B-cell epitope prediction
The qualitative studies and the frequency of software use in these 29 papers showed that Bepipred is the most widely applied tool for predicting linear epitopes. The prediction of linear epitopes is not limited to a single informatics tool; it also requires the simultaneous evaluation of complementary features, including Kolaskar-Tongaonkar antigenicity, Emini surface accessibility prediction, Chou-Fasman beta-turn prediction, flexibility prediction and Parker hydrophilicity prediction. Tables 1, 2 and 3 show the frequency of software programs in the prediction of linear B cell epitopes according to the nature of the antigens evaluated in the three sections of viral, bacterial and tumor-specific antigens. Based on our review, Bepipred had the highest frequency for predicting linear viral B cell epitopes and BCpred for predicting linear bacterial B cell epitopes. ABCpred had the highest frequency for predicting tumor-specific B cell epitopes.
Among the 50 papers reviewed, 15 addressed conformational epitopes. Previous studies have introduced various bioinformatics tools to assist in predicting discontinuous epitopes; each has its own features and is selected depending on the type of work and the purpose of the user. They are as follows: Discotope, Ellipro, CBTope, Epitope, BEPro, CEP, SEPPA, CED, EPITOME, MAPOTOPE, EPCES, EPSVR and EPMETA.
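The propensity-scale features named above for linear epitopes (for example, Parker hydrophilicity) are commonly applied by averaging a per-residue score over a sliding window and flagging positions whose average exceeds a threshold. The sketch below illustrates only that general averaging step, under loud assumptions: the function names are hypothetical, and the four-residue scale is a made-up toy scale, not the published Parker, Emini, Chou-Fasman or Kolaskar-Tongaonkar values.

```python
from typing import Dict, List, Tuple


def window_scores(sequence: str, scale: Dict[str, float], window: int = 7) -> List[Tuple[int, float]]:
    """Return (1-based centre position, mean scale value) for each full window."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a centre residue")
    half = window // 2
    scores = []
    for i in range(half, len(sequence) - half):
        segment = sequence[i - half:i + half + 1]
        scores.append((i + 1, sum(scale[aa] for aa in segment) / window))
    return scores


def positions_above(scores: List[Tuple[int, float]], threshold: float) -> List[int]:
    """Centre positions whose window average exceeds the threshold."""
    return [pos for pos, value in scores if value > threshold]


# Toy scale with invented values, for illustration only (NOT a published scale).
toy_scale = {"A": 0.0, "K": 1.0, "D": 1.0, "L": -1.0}
scores = window_scores("ALLKDKDLA", toy_scale, window=3)
hits = positions_above(scores, threshold=0.5)
```

With a real published scale substituted for the toy dictionary, runs of consecutive flagged positions would be the candidate linear epitope regions.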
Based on qualitative review and the frequency of application in the reviewed papers, we found that Discotope and Ellipro were the most widely used of these software programs. Tables 1, 2 and 3 show the frequency of each software tool for the prediction of conformational B cell epitopes according to the nature of the antigen evaluated in the three sections of viral, bacterial and tumor-specific antigens. Based on our investigations, Discotope and Ellipro are most widely used for predicting conformational viral B cell epitopes, and Discotope is most widely used for predicting bacterial conformational B cell epitopes and tumor-specific B cell epitopes.
Table 3 The assortment of articles for epitope prediction of tumor-specific antigens (4 articles out of 50 articles). The related web tools are shown in the second column. The third and fourth columns show the frequency and percentage of use of each web tool in the articles. References: Nezafat et al. (2014), Manijeh et al. (2013), Mahdavi et al. (2012) and Mahdavi and Moreau (2016).
a Three of the 50 articles studied prediction of MHC class I epitopes
b Three of the 50 articles studied prediction of MHC class II epitopes
c Four articles studied discontinuous B-cell epitope prediction
d Three articles studied continuous B-cell epitope prediction
Molecular docking is a method for predicting how different conformations of a small molecule, or of a combination of molecules, bind to a suitable target site. The prediction is performed by a computer program that generates the possible conformations of the ligand, places each of them in a cavity at the active site of the target protein, and then scores each resulting pose based on free energy or binding energy.
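The final scoring step described above can be illustrated by ranking docked candidates by their reported binding energy, where a more negative value indicates stronger predicted binding. This is a minimal sketch of the ranking idea only and does not call any docking program; the class name, function name, epitope labels and energy values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DockingResult:
    epitope: str
    binding_energy: float  # kcal/mol; more negative means stronger predicted binding


def rank_by_binding_energy(results: List[DockingResult],
                           cutoff: Optional[float] = None) -> List[DockingResult]:
    """Sort results from strongest to weakest binding, optionally keeping
    only those with binding energy at or below the cutoff."""
    kept = [r for r in results if cutoff is None or r.binding_energy <= cutoff]
    return sorted(kept, key=lambda r: r.binding_energy)


# Invented labels and energies, for illustration only.
results = [
    DockingResult("PEPTIDEA", -6.2),
    DockingResult("PEPTIDEB", -8.9),
    DockingResult("PEPTIDEC", -4.1),
]
ranked = rank_by_binding_energy(results, cutoff=-5.0)
```

A real screening workflow would typically take the energies from the docking program's output and could add further criteria, such as the number of hydrogen bonds, as secondary sort keys.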
In the field of epitope mapping, after the candidate epitopes have been predicted and evaluated, molecular docking allows us to study how the predicted epitopes bind to the binding pocket (cavity binding site) of antibody molecules, MHC-I and MHC-II. The epitopes are then evaluated in terms of the number of hydrogen bonds and the free energy or binding energy, so that candidate epitopes with strong binding potential can be screened in silico (Friesner et al. 2004). Among the 50 papers studied, 19 involved docking. The following software tools for molecular docking were used in these studies: AutoDock Vina, AutoDock 4, PatchDock, Molegro Virtual Docker, MTiAutoDock, ClusPro 2.0 and Python Prescription. The most widely used of these in previous studies was AutoDock, including AutoDock Vina and AutoDock 4.
Epitope identification using high-performance immunoinformatics tools can be very useful in various applications in the field of epitope mapping, including the design of peptide-based vaccines, the identification of immunological processes, the prediction of epitopes used in the diagnosis of diseases, and the determination of the features of antibodies in various diseases (Hemmati et al. 2017; Mohsenzadegan et al. 2015, 2018). Proper and accurate prediction of epitopes can be the basis for the development of therapeutic methods, including the design and production of peptide-based vaccines. Peptide-based vaccines are an attractive alternative strategy for vaccine design: they contain only peptide fragments, 20-30 amino acids long, of the pathogen proteins that have the highest antigenicity and can stimulate and activate the immune response. In fact, the immune response is developed only against immunogenic epitopes, and additional portions of antigens that cause allergic reactions can be removed (Mohsenzadegan et al. 2018).
Therefore, a possible alternative method for immunization is to identify peptide epitopes that stimulate immune responses and to use fully synthetic copies of them as a vaccine. These peptides are good candidates for vaccine production because they are easily produced, do not mutate, are chemically stable and cannot cause infection. These sequences can also be modified; in this way, their chemical stability can be increased and the unwanted side effects seen with native sequences can be minimized. The identification of peptide epitopes can be useful in the production of vaccines against microorganisms that grow very poorly in culture medium, as well as microorganisms whose antigenic regions are not detected by the immune system during normal infection. These methods are also used in the production of vaccines against cancer antigens (Mohsenzadegan et al. 2015; Perrie et al. 2007; Petrovsky and Aguilar 2004; Thompson and Staats 2011; Li et al. 2014). Advances in next-generation sequencing (NGS) techniques, which provide complete and accurate genome sequences of pathogens and even humans, contribute to the advancement of epitope prediction techniques and make it easier to design and develop peptide-based vaccines. These vaccines can be universal, working against all strains of a microorganism, and can be designed to work for all individuals despite the different HLA alleles present in each person (Toussaint et al. 2011; Backert and Kohlbacher 2015). In addition, B and T cell epitopes can be joined together to create a single immunogenic molecule. The use of immunoinformatics methods to predict epitopes, followed by the development of peptide-based vaccines, reduces cost and time and increases accuracy compared to purely laboratory-based tests.
As shown in Tables 1, 2 and 3, our investigations identified the frequency and application of each of the novel software tools used for epitope mapping. The results will be useful and valuable for future research on the epitope regions of viral, bacterial and tumor antigens. We hope that, in the near future, scientists and researchers in this field will facilitate the relationship between research centers and the companies producing immunoinformatics software. The funding source is a college institute, which sponsored the project financially and approved it. The authors declare that they have no conflict of interest.