key: cord-0884538-71940dh7
authors: Kumar, Ashutosh; Harjai, Kusum; Chhibber, Sanjay
title: A multiepitopic theoretical fusion construct based on in-silico epitope screening of known vaccine candidates for protection against wide range of enterobacterial pathogens
date: 2019-02-12
journal: Hum Immunol
DOI: 10.1016/j.humimm.2019.02.008
sha: c40e0bbb4b43900b06fe7717b8a41e181a8607e5
doc_id: 884538
cord_uid: 71940dh7

Enterobacterial pathogens that have acquired antibiotic resistance genes are a leading cause of community- and hospital-acquired infections. In this situation, vaccination is considered a better option for preventing such infections. In the current study, a reverse vaccinology approach was used to select peptides from already known immunogenic proteins to design a chimeric construct. We selected the Yersiniabactin receptor of Escherichia coli UMN026 and the Flagellin of Stenotrophomonas maltophilia. B-cell linear epitopes were predicted using the Bepipred prediction tool. Peptide binding with reference sets of 27 alleles of MHC class I and class II was also analyzed. The predicted peptide-MHC complexes were further validated using simulation dynamics. The in-silico construction of the chimera was done by restriction mapping and codon optimization. The chimera was evaluated using the same immunoinformatic approach as the selected proteins. From the 673 amino acids of the FyuA protein, the region from 1 to 492 was selected because it contained more linear epitopes, and the processing scores obtained were significant for MHC class I and class II binding. Similarly, from Flagellin, the region between amino acids 60 and 328 was selected, and the peptides present in the selected region showed lower percentile ranks for binding with MHC molecules. The simulation studies validated the predictions of peptide-MHC complexes. The selected gene fragments accommodating the maximum part of these peptides were used to design a chimeric construct of 2454 bp.
From the immunoinformatic analysis, the chimera was found to be more immunogenic in terms of an increased number of B-cell and T-cell epitopes, along with increased coverage of global populations with allelic variability.

Communicable diseases caused by members of the Enterobacteriaceae family place a great burden on society by affecting humans and their livestock. These infections become quite severe when they are not controlled in time. Examples include pneumonia, pyogenic liver abscess, pyelonephritis and septicemia [1]. Treatment has become difficult due to the emergence of antibiotic resistance among some of the pathogens. Given the challenge of treating these infections with antibiotics, the research community must look to prophylactic means of preventing them. A large number of vaccine candidates have been proposed by various researchers for specific infections. However, evaluating individual vaccine candidates under in-vivo infection conditions is a time-consuming task. In recent years, "Reverse Vaccinology" (RV) has come to play an important role in screening vaccine candidates by in-silico analysis [2], thereby reducing the time required for ruling out ineffective candidates. This immunoinformatic approach is frequently used by researchers to predict epitopes on viruses. It has led to the identification of epitopes on the nucleocapsid protein and ovarian tumour domain of Crimean-Congo hemorrhagic fever virus [3]. Another study investigated the variability among epitopes of Hepatitis C virus (HCV) identified in genotype 1 and also predicted the immunogenicity of their variants from other genotypes against South African human leukocyte antigen (HLA) backgrounds [4]. Epitopes of the E1 protein isolated from HCV have also been identified using a similar approach [5].
Effective immunogens of MERS-CoV have been discovered through an immunoinformatics-driven genome-wide screening strategy [6]. Goodswen et al. [7] have also used this technique for designing protein-based vaccines against eukaryotic pathogens. Reverse vaccinology has already been used against bacterial pathogens such as Group B streptococcus, where genomic analysis led to the development of a vaccine composed of four proteins giving protection against all serotypes [8]. Another in-silico study found the protein BamA of Acinetobacter baumannii to be a potential immunogen [9]. RV has also been applied to predict potential vaccine candidates from the proteome of Burkholderia pseudomallei [10]. The outer membrane proteins (OMPs) of these Gram-negative bacteria are usually considered potent vaccine candidates as they are exposed to the host immune defenses [11]. These OMPs are not always conserved across different genera of bacteria, but there is a probability that some conserved peptide sequences are present in them. In this study we have considered proteins that have proved to be potential vaccine candidates on the basis of in-vivo research in animal models. The Yersiniabactin receptor FyuA is a highly conserved protein prevalent in various members of Enterobacteriaceae. As per the reports, FyuA mediates the uptake of ferric yersiniabactin [12, 13], confirming its role in the virulence of bacteria, which makes it an important vaccine candidate. Moreover, researchers have found it to be a potential vaccine candidate against pyelonephritis in a murine model of urinary tract infection [14]. It has also been found to be protective in a murine model of pneumonia caused by K. pneumoniae 43816 in our laboratory (manuscript under communication). The Flagellin protein is another potential vaccine candidate included in this study.
Various studies have established the role of flagellin in inducing a systemic inflammatory response via intraperitoneal and intravenous administration [15, 16] and a local inflammatory response with intraintestinal administration [17, 18]. In our lab, flagellin of Stenotrophomonas maltophilia has been shown to induce a non-specific immune response that protected mice against subsequent bacterial challenge [19]. In the current study, both of these vaccine candidates were analyzed using the IEDB server to design a novel in-silico vaccine construct harbouring the properties of both proteins.

The gene sequences of FyuA (accession no. NC_011751.1) and Flagellin (NC_010943.1) were taken from the NCBI database [20]. The gene sequences were translated into protein sequences using the Expasy-Translate tool (Swiss Institute of Bioinformatics).

The protein sequences of the Yersiniabactin receptor of Escherichia coli UMN026 and the Flagellin of Stenotrophomonas maltophilia were analyzed for the presence of linear epitopes using the Bepipred portal of the IEDB server. Both proteins were also analyzed on other algorithms, Parker

Fig. 1. Linear epitope prediction from the protein sequences FyuA (A) and Flagellin (B) using the Bepipred prediction portal of the IEDB server. The yellow peaks show the peptide sequences that are potential epitopes, whereas the green peaks show the peptides that are not epitopic in nature. The encircled area on the graphs shows the region of protein having a higher frequency of epitopic peptides.

Conformational epitopes on the yersiniabactin receptor were predicted using the Discotope portal of the IEDB server by analyzing solvent accessibility.

In the human population, major histocompatibility complex molecules (MHC-I and MHC-II) are encoded by human leukocyte antigen (HLA) alleles and are required for presenting antigen to T cells. Peptides that could bind to MHC I molecules were predicted with the MHC class I binding peptide prediction portal on the IEDB server using the consensus method.
Interactions were evaluated in terms of percentile ranks. Similarly, peptides binding to MHC II molecules were predicted with the MHC class II binding peptide prediction portal on the IEDB server using the consensus method.

3D structures of the alleles were retrieved from the RCSB PDB database [22]. The predicted peptide sequences and the 3D structures of the MHC class I and class II alleles were submitted to the CABS-dock server for docking and simulation studies [23]. Secondary structures of the peptides were generated with PSIPRED [24]. The simulation time was set to 50 cycles. The results were clustered according to the distance between the residues of the peptide and those of the MHC molecules.

Both gene sequences were analyzed on NEBcutter to map restriction sites. Enzymes that cut the DNA at only a single site were identified, and the one common to both sequences was selected. The chimeric construct could be built by ligating gene fragments excised from FyuA and Flagellin. Briefly, both genes could be digested with AatII to produce linear sticky-ended fragments of 7.4 and 6.5 kb, respectively. AatII digests FyuA at position 1475 bp and Flagellin at position 188 bp. These fragments could then be digested with XhoI to generate sticky-ended fragments of 6.9 kb and 0.5 kb from FyuA and 5.5 kb and 0.9 kb from Flagellin. The 0.9 kb fragment from Flagellin and the 6.9 kb fragment from FyuA could be ligated with T4 DNA ligase, followed by transformation, screening and sequencing. The open reading frame of the chimeric gene sequence was analyzed with the Expasy-Translate tool (Swiss Institute of Bioinformatics).

The model structure of the chimera AKSC2 was generated using Modeller 9.20 [25]. The structure was energy minimized and validated for its stereochemical characteristics using the MolProbity server [26].

The protein sequence of AKSC2 was analyzed using the Bepipred portal of the IEDB server.
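The restriction-mapping step described above (finding enzymes that cut each gene exactly once and keeping those common to both) can be illustrated with a minimal sketch. It is not the NEBcutter implementation. The recognition sites used are the standard ones for AatII (GACGT^C) and XhoI (C^TCGAG); the two input sequences are short hypothetical strings, not the real FyuA or Flagellin genes, and the function names are illustrative.

```python
from __future__ import annotations

# Recognition site (5'->3') and cut offset within the site on the top strand.
# AatII: GACGT^C, XhoI: C^TCGAG. Both sites are palindromic, so searching
# one strand is sufficient.
ENZYMES = {
    "AatII": ("GACGTC", 5),
    "XhoI": ("CTCGAG", 1),
}


def cut_positions(seq: str, site: str, offset: int) -> list[int]:
    """Return 1-based positions of the base immediately 5' of each cut."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def single_cutters(seq: str) -> dict[str, int]:
    """Map enzyme name -> cut position for enzymes that cut seq exactly once."""
    result = {}
    for name, (site, offset) in ENZYMES.items():
        hits = cut_positions(seq, site, offset)
        if len(hits) == 1:
            result[name] = hits[0]
    return result


def common_single_cutters(seq_a: str, seq_b: str) -> dict[str, tuple[int, int]]:
    """Enzymes that cut both sequences exactly once, with both cut positions."""
    a, b = single_cutters(seq_a), single_cutters(seq_b)
    return {name: (a[name], b[name]) for name in a if name in b}


# Hypothetical toy sequences (assumption: for illustration only).
toy_a = "ATGGCAGACGTCTTACCCTCGAGGA"   # one AatII site, one XhoI site
toy_b = "ATGGACGTCAAACTCGAGCTCGAGTT"  # one AatII site, two XhoI sites

print(common_single_cutters(toy_a, toy_b))
```

In this toy input only AatII cuts both sequences once, which mirrors the selection logic in the text; with real gene sequences the enzyme table would need to cover the full set of commercially available enzymes.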
The obtained linear epitopes were analyzed for changes in comparison with the epitopes predicted from the individual proteins. The epitopes were also analyzed for the presence of overlapping regions between the individual proteins.

The protein sequence of AKSC2 was analyzed to identify peptides that could bind to MHC class I molecules. These peptides were predicted using the MHC class I binding peptide prediction portal on the IEDB server. Similarly, peptides binding to MHC II molecules were predicted using the MHC class II binding peptide prediction portal on the IEDB server. Interactions were evaluated in terms of percentile ranks.

The amino acid sequences of both proteins were analyzed for the presence of linear epitopes using the Bepipred prediction tool of the IEDB server. For FyuA, the portal generated a threshold score of 0.332 for a window size of 7 amino acids. Results in Fig. 1(A) show that a large number of linear epitopes are present in FyuA and that most of the prominent epitopes were in the region from amino acid position 20 to 520. The maximum score of 2.317 was obtained for the amino acids near position 160. Similarly, linear epitopes were predicted for Flagellin with a Bepipred-generated threshold value of 0.316 for a window size of 7 amino acids. Results in Fig. 1(B) show a large number of linear epitopes on Flagellin, with most of the prominent epitopes in the region from amino acid position 20 to 380. The maximum value of 1.680 was obtained for the amino acids near position 110. The peptides that were significant linear epitopes are shown in Table 1. The Bepipred predictions shown in Fig. 1 revealed the presence of a wide range of linear epitopes on both the FyuA and Flagellin proteins. These results were further verified using other parameters such as hydrophilicity, surface accessibility and antigenicity.
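To make the roles of the window size and threshold concrete, the following sketch shows generic sliding-window averaging followed by thresholding, the scheme used by scale-based predictors such as hydrophilicity plots. It is not the Bepipred algorithm itself. The window of 7 and the threshold of 0.332 are taken from the FyuA analysis above; the per-residue scores are invented toy values and the function names are illustrative.

```python
def window_means(scores, window=7):
    """Centered moving average; positions without a full window get None."""
    half = window // 2
    out = [None] * len(scores)
    for i in range(half, len(scores) - half):
        out[i] = sum(scores[i - half:i + half + 1]) / window
    return out


def regions_above(values, threshold, min_len=1):
    """Return 1-based inclusive (start, end) runs where value > threshold."""
    regions, start = [], None
    for i, v in enumerate(values):
        above = v is not None and v > threshold
        if above and start is None:
            start = i
        elif not above and start is not None:
            if i - start >= min_len:
                regions.append((start + 1, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(values) - start >= min_len:
        regions.append((start + 1, len(values)))
    return regions


# Toy per-residue scores (assumption: not real predictor output).
toy_scores = [0.0] * 5 + [1.0] * 9 + [0.0] * 5

smoothed = window_means(toy_scores, window=7)
print(regions_above(smoothed, threshold=0.332))
```

Smoothing widens the candidate region slightly beyond the block of high raw scores, which is why the choice of window size affects where predicted epitope boundaries fall.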
Similar results were obtained from all of the above predictions, as both proteins were found to possess a large number of linear epitopes (Supplementary information SI.1.). The region accommodating the maximum number of epitopes was then selected for the theoretical construction of the chimera. The other parameters were then evaluated with the selected regions in mind. There is a possibility that the chimeric protein may not fold properly when over-expressed, owing to the physical constraint of its large size, and it may therefore be expressed as inclusion bodies. Misfolding of the protein would not affect the T-cell dependent response, since the generation of a T-cell response depends on the processing and presentation of peptides on MHC molecules, but it may affect the antibody response to conformational epitopes. However, the results in Table 1 show that both proteins contain significant linear epitopes. Therefore, the presence of linear epitopes can be considered a very significant feature of the vaccine candidate protein, as it could help generate an antibody response even if the protein is administered in denatured form.

Conformational epitopes of FyuA were predicted using the Discotope tool on the IEDB server. In Fig. 2(A), the green peaks mark the region from amino acid position 200 to 400, which possesses the most conformational epitopes. These epitopes (yellow) are also shown on the 3-D structural image created by J-mol-PDB (Fig. 2(B)). Since most of the conformational epitopes are present on the exposed surface, they may become targets of antibodies that effectively neutralize the bacteria during infection.

Both proteins could prove to be good vaccine candidates if they are able to generate both B-cell and T-cell responses. Since both proteins qualified for the generation of a B-cell response, predictions were then made for their ability to generate T-cell responses.
This was done by predicting MHC class I and II binding peptides.

MHC class I binding peptides were predicted using the MHC class I binding prediction tool on the IEDB server. Peptides with a percentile rank below 10 were considered significant. Results in Fig. 3(A) show that 2858 significant interactions were obtained for the FyuA protein, of which 2017 were from the selected region of amino acid positions 1 to 490. Similarly, for the Flagellin protein, results in Fig. 3(A) show about 1443 significant interactions, of which 1226 were from the selected region of amino acid positions 61 to 389 (Supplementary information SI.2.). The results in Fig. 3(B) therefore show that 70% of the significant interactions were from the selected region of FyuA and 85% were from the selected region of Flagellin. The top 10 peptides from the selected region with the lowest percentile rank were shortlisted and are shown in Table 2. Data in Table 2 show that both of these proteins possessed a significant number of peptidic regions that could bind to MHC class I molecules. It is also seen that the peptides of the two proteins bind to different alleles, which are present in different parts of the world. This difference in binding would be beneficial if AKSC2 were used as a vaccine, since the greater the number of binding alleles, the greater the coverage of the human population.

Peptides binding to MHC class II molecules were predicted using the MHC class II binding peptide prediction tool on the IEDB server. For MHC class II binding, the IEDB recommended method and a reference set of 27 alleles were used. Results in Fig. 3(A) show that 2018 significant interactions were obtained for the FyuA protein, of which 1479 interactions with the reference set of alleles were from the selected region. Similarly, for the Flagellin protein, results in Fig. 3(A) show that 1353 significant peptide-allele interactions were predicted by the server, of which 1168 were from the selected region (Supplementary information SI.2.). The results in Fig. 3(B) also show that 73% of the significant interactions were from the selected region of FyuA and 86% were from the selected region of Flagellin. Table 3 depicts the top 10 peptides from the selected region having the lowest percentile rank for FyuA and for Flagellin, respectively. Alleles HLA-DRB3*01:01, HLA-DRB1*09:01 and HLA-DRB1*08:02 are present largely in Asian and Russian populations (allelefrequencies.net). Allele HLA-DQA1*01:02/DQB1*06:02 is found in Asian, African, Israeli and French populations, and HLA-DQA1*04:01/DQB1*04:02 is present in German, African and Asian populations.

Results obtained after simulated 3-dimensional docking of the predicted peptides with the predicted MHC molecules are presented in Tables 5 and 6. The data show the average RMSD value of each peptide-MHC complex. Average RMSD values between 3 and 5 are considered to be of medium accuracy, whereas values below 3 are considered highly accurate [27]. Along with the RMSD values, the distance between the interacting amino acid residues of the MHC molecules and those of the interacting peptide was also analyzed. The results in Table 5 and Fig. 4 show that the MHC class I and class II binding peptide predictions made using the IEDB server were also found to be accurate in the 3-D docking simulations, as the average RMSD values were below 5 in the simulated complexes. The interacting residues shown in the table lay at a distance of < 3 Å.

The results of all these in-silico predictions helped us choose the regions from both proteins so that they could be combined to produce a single chimeric construct.
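The percentages quoted for the selected regions follow directly from the interaction counts, and the RMSD bands can be written as a simple rule. The sketch below recomputes both from the figures stated in the text. The function names are illustrative, and the label returned for average RMSD values above 5 is an assumption, since the text does not name that band.

```python
# (interactions inside the selected region, all significant interactions),
# as reported in the text.
COUNTS = {
    ("MHC class I", "FyuA"): (2017, 2858),
    ("MHC class I", "Flagellin"): (1226, 1443),
    ("MHC class II", "FyuA"): (1479, 2018),
    ("MHC class II", "Flagellin"): (1168, 1353),
}

# Percentages as reported in the text.
REPORTED = {
    ("MHC class I", "FyuA"): 70,
    ("MHC class I", "Flagellin"): 85,
    ("MHC class II", "FyuA"): 73,
    ("MHC class II", "Flagellin"): 86,
}


def percent_in_region(in_region: int, total: int) -> float:
    """Share of significant interactions that fall in the selected region."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * in_region / total


def rmsd_quality(avg_rmsd: float) -> str:
    """Accuracy band for an average RMSD, using the cut-offs quoted in the text."""
    if avg_rmsd < 3:
        return "high"
    if avg_rmsd <= 5:
        return "medium"
    # Assumption: the text does not label values above 5.
    return "unclassified"


for key, (inside, total) in COUNTS.items():
    pct = percent_in_region(inside, total)
    print(f"{key[0]:12s} {key[1]:10s} {pct:5.1f}% (reported {REPORTED[key]}%)")
```

The recomputed shares agree with the reported percentages to within one percentage point.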
We therefore propose a strategy that could lead to the formation of the chimeric construct. To design the chimeric gene construct, a cloning strategy based on restriction digestion was adopted. The gene sequences of both proteins were analyzed with the NEBcutter tool of New England Biolabs. A recognition site for the restriction enzyme AatII (Supplementary information SI.1.) was found in both gene sequences; the enzyme cleaves FyuA at position 1475 and Flagellin at position 188 and generates sticky ends. Fig. 5 shows a graphical representation of the cloning strategy. The in-silico chimeric construct consisted of 2454 base pairs (Supplementary information SI.1.); it was translated using the Expasy-Translate tool to give a protein of 817 amino acids with a molecular weight of approximately 90 kDa (Supplementary information SI.1.). This ruled out the presence of any stop codon within the sequence. The main objective of combining large gene fragments from both proteins was to provide the cellular protein-processing machinery with enough proteasome cleavage sites. A single peptide is rarely able to act as a vaccine, whereas a protein can be processed in different ways to create epitopes whose presentation to T cells may add to protection during pathogenesis.

Analysis of the Ramachandran plot of the chimera AKSC2 generated using the MolProbity server suggested that 94.2% of residues lay in the favoured and allowed regions, while only 5.8% were in the outlier region (Supplementary information SI.3.). The obtained values suggest that the model is structurally stable. Further analysis of AKSC2 showed an increase in the average threshold value in linear epitope prediction, which probably resulted from combining the two proteins (Fig. 6). AKSC2 also contained epitopic peptides that lay in the region joining the two proteins.
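Peptides in the joining region exist only in the chimera, because each one contains residues from both parent proteins. A minimal sketch of how such junction-spanning peptides can be enumerated is given below. The peptide lengths (9 for MHC class I, 15 for MHC class II) are commonly used defaults and are an assumption here, as the text does not state the lengths used; the two fragment sequences are hypothetical placeholders, not the real FyuA or Flagellin regions.

```python
from __future__ import annotations


def junction_peptides(n_part: str, c_part: str, k: int) -> list[tuple[int, str]]:
    """List k-mers of the fusion that contain residues from both parts.

    Returns (1-based start position in the fusion, peptide) pairs.
    """
    fusion = n_part + c_part
    j = len(n_part)  # residues contributed by the N-terminal part
    # A k-mer starting at 0-based s covers s..s+k-1; it spans the junction
    # when s <= j-1 and s+k-1 >= j.
    first = max(0, j - k + 1)
    last = min(j - 1, len(fusion) - k)
    return [(s + 1, fusion[s:s + k]) for s in range(first, last + 1)]


# Hypothetical placeholder fragments (assumption: for illustration only).
n_fragment = "ACDEFGHIKL"
c_fragment = "MNPQRSTVWY"

for start, pep in junction_peptides(n_fragment, c_fragment, k=9):
    print(start, pep)
```

For fragments longer than the peptide length, a junction yields k - 1 new peptides of length k, so the number of candidate junction peptides is small and each can be submitted to the same MHC binding predictions as the parent proteins.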
The presence of these peptides was another advantage of the chimera, as the overlapping peptides could increase the population coverage of the vaccine. Table 4 shows the peptides that are able to bind to MHC class I and class II and are present in the region joining the two proteins. These peptides are able to bind to different alleles. Results in Table 7 show the similarity of the overlapping peptides generated by the fusion of the two proteins. The peptide analyzed using BLASTp showed identity with peptides present in microorganisms other than the source of the peptide. This gives hope that the chimeric vaccine candidate may confer protection upon global populations against infectious diseases caused by a wide range of pathogens.

From the in-silico analysis it is concluded that reverse vaccinology can be used to create novel chimeric constructs from already known vaccine candidates, to make them more effective and to confer protection among diverse populations against a wide range of enterobacterial pathogens.

For this research work no specific grant was provided by any funding agency. Mr. Ashutosh Kumar was granted financial support in the form of a Senior Research Fellowship from Indian Council of Medical

Computer-aided biotechnology: from immuno-informatics to reverse vaccinology
Epitope-based immunoinformatics and molecular docking studies of nucleocapsid protein and ovarian tumor domain of Crimean-Congo hemorrhagic fever virus
Sequence-based in silico analysis of well studied hepatitis C virus epitopes and their variants in other genotypes (particularly genotype 5a) against South African human leukocyte antigen backgrounds
Structural analysis and epitope prediction of HCV E1 protein isolated in Pakistan: an in-silico approach
Epitope-based vaccine target screening against highly pathogenic MERS-CoV: an in silico approach applied to emerging infectious diseases
Enhancing in silico protein-based vaccine discovery for eukaryotic pathogens using predicted peptide-MHC binding and peptide conservation scores
Identification of a universal Group B streptococcus vaccine by multiple genome screen
In silico analysis of Acinetobacter baumannii outer membrane protein BamA as a potential immunogen
In-silico analysis of Burkholderia pseudomallei proteome to predict potential vaccine candidate proteins
A multiepitope subunit vaccine conveys protection against extraintestinal pathogenic Escherichia coli in mice
The pesticin receptor of Yersinia enterocolitica: a novel virulence factor with dual function
Reduced synthesis of the Ybt siderophore or production of aberrant Ybt-like molecules activates transcription of yersiniabactin genes in Yersinia pestis
Immunization with the yersiniabactin receptor, FyuA, protects against pyelonephritis in a murine model of urinary tract infection
The innate immune response to bacterial flagellin is mediated by Toll-like receptor 5
Flagellin, a novel mediator of Salmonella-induced epithelial activation and systemic inflammation: IκBα degradation, induction of nitric oxide synthase, induction of proinflammatory mediators, and cardiovascular dysfunction
Role of flagellin in the pathogenesis of shock and acute respiratory distress syndrome: therapeutic opportunities
Humoral immune response to flagellin requires T cells and activation of innate immunity
Stenotrophomonas maltophilia flagellin induces a compartmentalized innate immune response in mouse lung
Induction of hepatitis A virus-neutralizing antibody by a virus-specific synthetic peptide
The protein data bank
CABS-dock web server for the flexible docking of peptides to proteins without prior knowledge of the binding site
Protein secondary structure prediction based on position-specific scoring matrices
Comparative protein modelling by satisfaction of spatial restraints
MolProbity: all-atom structure validation for macromolecular crystallography
Modeling of protein-peptide interactions using the CABS-dock web server for binding site search and flexible docking

None to declare.

Supplementary data to this article can be found online at https://doi.org/10.1016/j.humimm.2019.02.008.