key: cord-0054365-cyjo991k authors: Ma, Junfei; Qiu, Jingxuan; Wang, Shuying; Ji, Qianyu; Xu, Dongpo; Wang, Haiwang; Wu, Zhiguang; Liu, Qing title: A Novel Design of Multi-epitope Vaccine Against Helicobacter pylori by Immunoinformatics Approach date: 2021-01-02 journal: Int J Pept Res Ther DOI: 10.1007/s10989-020-10148-x sha: 57a018785ab2a1a8afc3d50f5c2572719076ce02 doc_id: 54365 cord_uid: cyjo991k Helicobacter pylori (H. pylori) is a gram-negative spiral bacterium that caused infections in half of the world’s population and had been identified as type I carcinogen by the World Health Organization. Compared with antibiotic treatment which could result in drug resistance, the vaccine therapy is becoming a promising immunotherapy option against H. pylori. Further, the multi-epitope vaccine could provoke a wider immune protection to control H. pylori infection. In this study, the in-silico immunogenicity calculations on 381 protein sequences of H. pylori were performed, and the immunogenicity of selected proteins with top-ranked score were tested. The B cell epitopes and T cell epitopes from three well performed proteins UreB, PLA1, and Omp6 were assembled into six constructs of multi-epitope vaccines with random orders. In order to select the optimal constructs, the stability of the vaccine structure and the exposure of B cell epitopes on the vaccine surface were evaluated based on structure prediction and solvent accessible surface area analysis. Finally Construct S1 was selected and molecular docking showed that it had the potential of binding TLR2, TLR4, and TLR9 to stimulate strong immune response. In particular, this study provides good suggestions for epitope assembly in the construction of multi-epitope vaccines and it may be helpful to control H. pylori infection in the future. SUPPLEMENTARY INFORMATION: The online version of this article (10.1007/s10989-020-10148-x) contains supplementary material, which is available to authorized users. 
Helicobacter pylori infection is associated with multiple gastric diseases, such as chronic gastritis, peptic ulcers, and gastric malignancies (Peek and Blaser 2002; Sigal et al. 2015; Suerbaum and Michetti 2002), and with non-gastric diseases including iron deficiency anemia, idiopathic thrombocytopenic purpura, and non-alcoholic fatty liver (Barabino 2002; Emilia et al. 2004; Okushin et al. 2015). Currently, H. pylori infection is usually treated with an antibiotic-based triple regimen, although this causes problems such as increased antimicrobial resistance and disturbance of the intestinal flora (Suzuki et al. 2019; Wang and Huang 2014). Therapeutic vaccination against H. pylori infection could therefore be a more effective and safer immunotherapy that avoids the drawbacks of antibiotics. Screening of antigen targets is key to vaccine development. In previous studies, many antigens such as urease, neutrophil activating protein (NAP), and superoxide dismutase (SOD) have proved to be excellent candidates because they elicit an effective immune response against H. pylori (Corthesy-Theulaz et al. 1998; Every et al. 2011; Peng et al. 2018). In most instances, however, verified antigens have only a limited effect on clearance of H. pylori from the stomach (Sutton and Chionh 2013). It is therefore necessary to predict new candidate antigens through reverse vaccinology, which allows high-throughput screening. Bioinformatics-based reverse vaccinology strategies for predicting antigens have been successfully applied to multiple pathogens, including serogroup B meningococcus, Klebsiella pneumoniae, and Mycobacterium abscessus (Giuliani et al. 2006; Lundberg et al. 2013; Shanmugham and Pan 2013). Reverse vaccinology, which has become an attractive alternative way to identify potential antigen targets, could accelerate antigen discovery and reduce the antigen failure rate compared with traditional methods.
In addition, the emphasis in vaccine design and development has moved to the generation of recombinant multi-epitope vaccines. A multi-epitope vaccine consists of B and T cell epitopes from several different antigens arranged in a certain order and joined by suitable linkers, which can avoid the generation of junctional epitopes and promote antigen presentation (Livingston et al. 2002; Nezafat et al. 2017). Compared with single intact antigen vaccines, a construct of multiple epitopes could activate a broader protective immune response and lead to an effective immunoreaction that blocks multiple pathogenic channels for the control of H. pylori (Meza et al. 2017). Recent studies have reported that a variety of multi-epitope vaccines against H. pylori could induce comparatively high levels of specific antibodies against the multiple antigens from which the epitopes were derived (Guo et al. 2017a, b; Pan et al. 2018). To evaluate the immunogenicity of a target protein, the properties of its epitopes, which are presented by immune cells to trigger an immune response, should be estimated. Epitopes can be divided into T cell epitopes and B cell epitopes: T cell epitopes are presented by MHC molecules and recognized by T cells, while B cell epitopes are recognized by antibodies (Gold and Reth 2019; Neefjes et al. 2011). Based on structural features, conformational epitopes are defined as epitopes composed of segments that are discontinuous in sequence but close together in space, whereas linear epitopes are contiguous peptides in the protein sequence (Van Regenmortel 2009; Zhang et al. 2014). To construct multi-epitope vaccines, linear epitopes, including MHC-binding peptides and B cell linear epitopes, were taken into consideration in this study. Because multi-epitope vaccines have a string-of-beads structure, diverse constructs can be made by changing the order of the epitope segments.
Constructs with different epitope orders differ in the stability of the recombinant antigen, the exposure of the critical domains, and the docking with pattern recognition receptors (PRRs). Proper presentation of antigens, which can efficiently induce the immune system, depends strongly on the structural stability of the vaccine construct (Scheiblhofer et al. 2017). In this approach, computer-aided design can draw on available structural information to engineer and design immunogens (Dormitzer et al. 2012; Kulp and Schief 2013). After evaluation by predicted structure and other physicochemical features, including hydrophilicity/hydrophobicity, solubility, and pI, the construct with the optimal order is selected as the most promising candidate vaccine for further experimental verification (Negahdaripour et al. 2018). In this paper, the candidate antigens PLA1 and Omp6 were screened based on in-silico immunogenicity calculations and experimental antigen verification. Multi-epitope vaccines were then constructed using predicted epitopes of PLA1 and Omp6; known epitopes of UreB, the most successful antigen of H. pylori, were also included in the constructs. Subsequently, six different multi-epitope vaccine constructs were evaluated based on their predicted structures. To select the best multi-epitope antigen construct, the stability of the construct and the exposure of the B cell epitopes on the vaccine surface were considered. Structure validation servers were used to evaluate the stability of each construct, and the exposure of the B cell epitopes on the vaccine surface was assessed by solvent accessible surface area (SASA) analysis. Construct S1, which performed well in both respects, was selected. Molecular docking with TLR2, TLR4, and TLR9 showed that Construct S1 had the potential to bind TLRs.
Finally, codon optimization and in-silico cloning of the construct sequence were used to increase its expression level in E. coli K12. In short, we propose a novel strategy for designing a multi-epitope vaccine against H. pylori based on reverse vaccinology and structural vaccinology.

The mouse-adapted H. pylori strain SS1 and 12 other clinical isolate strains, H. pylori A-L, were obtained from the Shanghai Institute of Digestive Disease and preserved in our laboratory. H. pylori was cultured on Columbia plates containing 7% newborn bovine serum and H. pylori selective supplement SR0147E (Oxoid, England) under microaerophilic conditions at 37 °C for 3-4 days.

To construct the protein dataset of H. pylori for antigen screening, available sequences were collected from the following public protein databases: (1) the Protegen database (Yang et al. 2011) for protective antigens; (2) the AntigenDB database (Ansari et al. 2010) for experimentally validated antigen proteins; (3) the OMPdb database (Tsirigos et al. 2011) for β-barrel outer membrane proteins from Gram-negative bacteria; and (4) the PSORTdb database (Rey et al. 2005) for secreted proteins and outer membrane proteins of H. pylori. In total, 14,702 protein sequences of H. pylori were collected. Sequence redundancy was reduced using CD-HIT (Fu et al. 2012) by selecting a representative sequence for each group of sequences with identity above 90%. In the end, a non-redundant protein dataset of 381 sequences was constructed for H. pylori.

To predict the peptides that could be presented by MHC-I molecules, the NetMHCpan 4.0 server (Jurtz et al. 2017) was used to screen each of the 381 protein sequences. A total of 8 mouse MHC subtypes were selected: H-2-Db, H-2-Dd, H-2-Kb, H-2-Kd, H-2-Kk, H-2-Ld, H-2-Qa1, and H-2-Qa2. All 9-mer peptides in the input protein were screened with a sliding window, and the binding affinity between the MHC molecule and each peptide was ranked.
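The sliding-window screening described above can be illustrated with a minimal sketch. This is not the NetMHCpan implementation; it only shows how every overlapping 9-mer of a protein would be enumerated before being scored by a predictor. The function name `sliding_kmers` and the toy input sequence are hypothetical.

```python
def sliding_kmers(sequence: str, k: int = 9) -> list[tuple[int, str]]:
    """Enumerate all overlapping k-mers of a protein sequence.

    Returns (1-based start position, peptide) pairs. A sequence shorter
    than k yields an empty list.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return [(i + 1, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    # Toy sequence for illustration only (not an H. pylori protein).
    for start, peptide in sliding_kmers("MKTAYIAKQRQ", k=9):
        print(start, peptide)
```

A protein of length L yields L - k + 1 peptides, so each of the 381 sequences contributes roughly as many 9-mers as it has residues.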
Strong binding peptides were defined as peptides with binding affinity ranked within the top 0.5%, and weak binding peptides as peptides ranked within the top 2%. In this model, a peptide was defined as a significant peptide $S_I$ if it could bind to more than half of the H-2 subtypes. For each input protein P in the dataset, the MHC-I binding score $M_I(P)$ was defined as formula (1):

$$M_I(P) = \frac{N(S_I)}{\mathrm{len}(P)} \tag{1}$$

where $N(S_I)$ is the number of significant peptides $S_I$ in protein P and $\mathrm{len}(P)$ is the length of protein P.

Each of the 381 protein sequences was also submitted to the NetMHCIIpan 3.2 server (Jensen et al. 2018) to predict peptides that could bind to MHC-II molecules. The length of screened peptides was set to 15-mer. Peptides whose predicted binding affinity with the MHC-II molecule ranked in the top 2% were defined as strong binding peptides, and peptides ranked in the top 10% as weak binding peptides. All available peptides in each protein were screened, and their binding to the MHC-II subtypes H-2-IAb, H-2-IEb, H-2-IAd, and H-2-IEd was predicted. In this model, a peptide was defined as a significant peptide $S_{II}$ if it could bind to more than half of the selected MHC-II subtypes. For protein P, the MHC-II binding peptide score $M_{II}(P)$ was defined as formula (2):

$$M_{II}(P) = \frac{N(S_{II})}{\mathrm{len}(P)} \tag{2}$$

where $S_{II}$ represents the significant MHC-II binding peptides in protein P, $N(S_{II})$ is the number of $S_{II}$, and $\mathrm{len}(P)$ is the length of protein P.

Apart from T cell epitope prediction, B cell linear epitope peptides or epitope residues were predicted with two tools, the LBtope server (Singh et al. 2013) and the BepiPred 2.0 server (Jespersen et al. 2017). In the LBtope server, all available k-mer peptides obtained from the input protein by a sliding window were screened, and the probability score of each peptide to serve as a linear epitope was calculated. In this model, k was set to the default value of 15. Possible linear epitope peptides were defined using a cutoff of 60% for the percent probability of correct prediction. For protein P, the LBtope score $B_1(P)$ was calculated as formula (3):

$$B_1(P) = \frac{N(\mathrm{epi\_pep})}{\mathrm{len}(P)} \tag{3}$$

where epi_pep refers to a linear epitope peptide predicted by LBtope for protein P, $N(\mathrm{epi\_pep})$ is the number of predicted peptides, and $\mathrm{len}(P)$ is the number of amino acids in protein P.

BepiPred 2.0 predicts whether each residue in the input protein could serve as an epitope residue. Thus, for protein P, the BepiPred score $B_2(P)$ was defined as formula (4):

$$B_2(P) = \frac{N(\mathrm{epi\_rsd})}{\mathrm{len}(P)} \tag{4}$$

where $N(\mathrm{epi\_rsd})$ is the number of epitope residues predicted by BepiPred for protein P and $\mathrm{len}(P)$ is the length of protein P.

To give a comprehensive evaluation of immunogenicity for each protein, an epitope mapping score was also designed to reflect whether the protein contained experimentally validated epitopes. First, a total of 39,879 experimentally determined linear epitopes recognized by mice were collected from the Immune Epitope Database and Analysis Resource (IEDB) (Vita et al. 2019). Then the sequence of each of the 381 H. pylori proteins was compared with these epitopes. The epitope mapping score EM(P) for protein P was set to 1 if the protein contained any of the known epitopes, and to 0 otherwise.

Before the final immunogenicity score was calculated for each protein, each of the above scores was normalized to eliminate the bias caused by their different ranges. Each score was normalized to the range of 0 to 1 based on its maximum and minimum values.
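The per-protein scores and the min-max normalization described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names are hypothetical, the input is assumed to be a per-peptide count of bound subtypes already derived from predictor output, and which binding level (strong or weak) counts as "binding" is left to the caller because the text does not specify it.

```python
def significant_peptide_density(subtypes_bound: list[int],
                                n_subtypes: int,
                                protein_length: int) -> float:
    """Score of the form N(S) / len(P).

    subtypes_bound[i] is the number of MHC subtypes that peptide i is
    predicted to bind (assumed precomputed). A peptide is 'significant'
    if it binds more than half of the subtypes.
    """
    significant = sum(1 for n in subtypes_bound if n > n_subtypes / 2)
    return significant / protein_length


def epitope_mapping_score(protein_seq: str, known_epitopes: list[str]) -> int:
    """EM(P): 1 if the protein contains any known linear epitope, else 0."""
    return 1 if any(e and e in protein_seq for e in known_epitopes) else 0


def min_max_normalize(values: list[float]) -> list[float]:
    """Rescale scores to [0, 1] using the minimum and maximum over all proteins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case (all scores equal): return zeros; this is a choice
        # made for the sketch, not something stated in the text.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

The same density function serves the MHC-I and MHC-II scores; only the number of subtypes (8 and 4, respectively) and the peptide length used upstream differ.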
For protein P, the MHC-I binding score $M_I(P)$ was normalized to $NM_I(P)$ as formula (5):

$$NM_I(P) = \frac{M_I(P) - \min(M_I)}{\max(M_I) - \min(M_I)} \tag{5}$$

where $\max(M_I)$ and $\min(M_I)$ are the maximum and minimum values of $M_I$ over all 381 H. pylori protein sequences. The final immunogenicity score $I(P)$ was defined as the sum of the above normalized scores (formula (6)), where $NM_{II}(P)$, $NB_1(P)$, and $NB_2(P)$ are the normalized scores of $M_{II}(P)$, $B_1(P)$, and $B_2(P)$, respectively. The immunogenicity score was calculated for each of the 381 H. pylori protein sequences, and the scores were ranked in descending order to select potential antigens of H. pylori.

Genomic DNA was extracted from the 13 H. pylori strains and used as PCR template. The six screened antigen genes were then amplified using the primers shown in Table 1 and identified by agarose gel electrophoresis. The three widely distributed genes ureB, HP0499, and HP0229 were cloned into a T-vector after PCR and purification and sent to the Beijing Genomics Institute for sequencing. Based on the sequencing report, phylogenetic trees were built from the gene sequences with MEGA6.

According to the phylogenetic tree analysis, ureB (H. pylori G), HP0499 (H. pylori L), and HP0229 (H. pylori E) were chosen as the final sequences to be cloned into the pET30a plasmid vector. High-fidelity PCR amplification with Phanta Max Super-Fidelity DNA Polymerase (Vazyme, China) was used to obtain accurate sequences, and the ClonExpress II One Step Cloning Kit (Vazyme, China) was used to clone the target genes into the plasmids. The recombinant plasmids were transformed into E. coli DH5α competent cells and screened on LB plates with kanamycin (50 μg/mL). Subsequently, the inserts were confirmed by PCR and sequencing. E. coli BL21(DE3) competent cells were transformed with pET30a-UreB/PLA1/Omp6 and inoculated in LB broth with kanamycin (50 μg/mL).
Expression of all three proteins was induced by addition of isopropyl β-D-1-thiogalactopyranoside (IPTG) after the optical density (OD600) had reached 0.6. A gradient of IPTG concentrations and induction temperatures was explored. After 6 h, whole proteins were extracted from the collected cells with the One Step Bacteria Active Protein Extraction Kit (Sangon Biotech, China) and analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). The soluble histidine-tagged proteins were purified by affinity chromatography (Ni-IDA-Sefinose Column, Sangon Biotech, China).

Purified recombinant proteins were evaluated by western blot probed with a rabbit anti-H. pylori antibody (GeneTex, USA). Briefly, purified UreB, PLA1, and Omp6 were separated by 12% SDS-PAGE under denaturing conditions and electrotransferred onto a nitrocellulose membrane at 100 V for 1.5 h. Nonspecific binding sites were blocked overnight at 4 °C in blocking buffer (5% non-fat milk in TBST, pH 7.4, with 0.05% Tween-20). Membranes were washed three times for 10 min with TBST and further incubated with a 1:5,000 dilution of rabbit polyclonal anti-H. pylori antibody at 37 °C for 2 h. After the same washing way, PBST, the membranes

The fragments containing CD4+ T cell epitopes and B cell epitopes in PLA1 and Omp6 were predicted with T and B cell epitope prediction tools. The B cell epitopes in PLA1 and Omp6 were predicted by BepiPred 2.0 with the epitope threshold set to 0.5. The T cell epitopes in PLA1 and Omp6 were predicted by the NetMHCIIpan 3.2 server. Mouse H-2-IAd and H-2-IEd were selected as the alleles because the design was intended for BALB/c mice; other parameters were left at their defaults. All experimentally verified B and T cell epitopes of UreB were obtained from IEDB. Twenty constructs of tandem copies, each including 7 B cell epitopes, 6 T cell epitopes, and the linkers "KK" and "GGGS", were generated in random order.
The selected linkers can connect the epitopes effectively without changing the immunogenicity of the designed epitopes, and they perform well in improving protein stability, domain-domain orientation, and solubility (Kavoosi et al. 2007; Klement et al. 2015; Nezafat et al. 2017). After a preliminary evaluation with VaxiJen 2.0 (Doytchinova and Flower 2007), the 6 constructs with the highest VaxiJen scores, numbered S1-S6, were retained.

Subsequently, the I-TASSER server was used to predict the 3D structure of each construct sequence based on the degree of similarity between the target protein and available template structures from the PDB. The ProSA-web, RAMPAGE, and ERRAT servers were used to evaluate the six multi-epitope vaccine constructs based on their predicted structures. The Z-score calculated by ProSA-web assesses the overall quality of a predicted structure (Wiederstein and Sippl 2007). The Ramachandran plot from RAMPAGE was used to examine the phi-psi torsion angles of each amino acid in the vaccine structure (Lovell et al. 2003). Finally, the ERRAT server was used to distinguish between correctly and incorrectly determined structure regions based on the statistics of highly refined structures (Colovos and Yeates 1993).

Based on the 3D structures of the multi-epitope constructs, the SASA of the B cell epitopes in each construct was calculated using Tcl commands in VMD. In addition, the SASA of the B cell epitopes in each antigen was displayed on its structure.

Based on the 3D structures of the multi-epitope antigens, molecular docking of the selected multi-epitope construct with TLR2, TLR4, and TLR9 was performed using the ZDOCK server (Pierce et al. 2014), and the best-docked complex among the top 10 was chosen for further analysis. The docking interface area and binding free energy ($\Delta^i G$) were then reported by the PDBePISA server (Krissinel and Henrick 2007) after the complex structures had been submitted.

In order to achieve maximum expression in E.
coli system, JCat (Grote et al. 2005) was used for reverse translation and codon optimization. JCat is a rapid and easy tool that reports the GC content and the Codon Adaptation Index (CAI) of the optimized sequence as indicators of expected expression (Grote et al. 2005). BamHI and KpnI restriction sites were added to the reverse-translated sequence. The optimized sequence was then cloned in silico into the pET30a plasmid using SnapGene software.

In this study, an in-silico model was constructed to screen antigen proteins as vaccine targets among all available proteins of H. pylori. An effective antigen protein should meet the following requirements: (1) the candidate antigen protein should be able to trigger an immune response in the host; and (2) the vaccine target should be accessible to the immune effectors. Based on these assumptions, the immunogenicity scoring model was constructed to evaluate the immunogenicity of selected H. pylori proteins. First, 14,702 proteins, including outer membrane proteins and secreted proteins of H. pylori, were collected from public databases, and a non-redundant dataset of 381 protein sequences was constructed for antigen screening. Then, for each protein, the immunogenicity score was calculated, which reflects its ability to serve as a potential antigen. The epitope prediction scores, comprising MHC-I binding prediction, MHC-II binding prediction, and B cell linear epitope prediction, reflect the ability of the protein to be recognized by MHC molecules or antibodies. The 381 protein sequences were ranked in descending order of immunogenicity score, and six proteins were randomly selected from the top 10% for further validation: HopQ (gene: hopQ), Omp25 (gene: HP1156), peptidylprolyl isomerase (gene: HP0175), Omp6 (gene: HP0229), phospholipase A1 (PLA1, gene: HP0499), and the widely used UreB (gene: ureB).
According to the agarose gel electrophoresis results shown in Figure S1, three genes (hopQ, HP1156, and HP0175) were identified in only a small number of H. pylori strains, while the other three (ureB, HP0499, and HP0229) were identified in all thirteen H. pylori strains. Accordingly, only the three antigens UreB, PLA1, and Omp6 were evaluated further, because of their wide distribution across all thirteen H. pylori strains.

Figure 1 shows the phylogenetic relationships of the genes ureB (Fig. 1a), HP0499 (Fig. 1b), and HP0229 (Fig. 1c) from H. pylori SS1 and the 12 clinical isolates H. pylori A-L. For example, the ureB sequence of H. pylori G is the most conserved among the 13 H. pylori strains according to the phylogenetic tree analysis. Compared with H. pylori G, the ureB sequence similarity of the other 12 H. pylori strains exceeds 96.8%. This indicates that the ureB gene is highly conserved among the 13 H. pylori strains and could be used as a broad-spectrum antigen. By the same analysis, the HP0499 sequence of H. pylori L and the HP0229 sequence of H. pylori E are highly conserved. The sequence similarity reaches 96.2% (HP0499) and 90.7% (HP0229), respectively, which suggests that they could also be regarded as broad-spectrum antigens.

The UreB protein (about 66 kDa) was abundantly expressed in inclusion bodies of E. coli BL21(DE3)/pET30a-UreB induced with 0.2 mM IPTG at 37 °C. The purity of the UreB protein after Ni²⁺-NTA affinity chromatography, as analyzed by SDS-PAGE, is shown in Fig. 2a (lane 1). Similarly, the PLA1 (about 42 kDa) and Omp6 (about 53 kDa) proteins were mainly expressed in inclusion bodies of E. coli BL21(DE3)/pET30a-PLA1 and pET30a-Omp6 induced with 0.5 mM IPTG at 30 °C. The purified PLA1 and Omp6 proteins are shown in Fig. 2a (lanes 2 and 3). The immunoreactivity of UreB, PLA1, and Omp6 was assessed by western blotting. UreB, PLA1 and Omp6 proteins could be recognized by rabbit anti-H.
pylori polyclonal antibody (Fig. 2b), which indicated that they could be used as candidate antigens against H. pylori.

Both the humoral immune response induced by B cell epitopes and the cellular immune response induced by CD4+ T cell epitopes must be considered in the treatment of H. pylori infection (Wilson and Crabtree 2007). The experimentally verified B and T cell epitopes of UreB were retrieved from IEDB, a freely available resource cataloging experimental data on antibody and T cell epitopes. In addition, B and T cell epitopes with high scores were picked based on the predictions of the BepiPred and NetMHCIIpan servers. All B cell epitopes are shown in Table 2 and all T cell epitopes in Table 3.

Six constructs were produced from seven B cell epitopes and six T cell epitopes in random orders, as shown in Fig. 3; their sequences are given in Table S1. Their structures were then predicted by the I-TASSER server, which provides a C-score for assessing model quality. The C-score (Yang et al. 2015) in Table 4 is typically in the range of [−5, 2], where a higher C-score signifies a model with higher confidence and vice versa. The structures of all six constructs, S1-S6, are shown in Fig. 4. To further evaluate the structural reliability of the six constructs, the ProSA-web, RAMPAGE, and ERRAT servers were used to assess the quality of the predicted structures. Taking Construct S1 as an example, the Z-score of the structure calculated by the ProSA-web server was −2.39, which is within the range of scores of native protein conformations (Fig. 5a). The Ramachandran plot from RAMPAGE showed that 74.9% of residues were in favored regions, 15.8% in allowed regions, and 9.3% in outlier regions (Fig. 5b). The ERRAT results showed that the overall quality factor of Construct S1 was 86.250 (Fig. 5c). Together, these results indicated that the predicted structure of Construct S1 was reasonable and reliable.
The structure validation results of the other constructs are given in Table 4 (figures not shown). The structure validations of Constructs S1, S2, S3, and S6 by the ProSA-web, RAMPAGE, and ERRAT servers were reasonable and consistent with the C-score evaluation from I-TASSER, which suggests that the structures of these four constructs are more stable and reasonable. In contrast, the structure validations of Constructs S4 and S5 were poor, which means that their predicted structures were unreasonable and unstable.

Only when the B cell epitopes are located on the surface of the antigen can the host immune system recognize them and generate specific antibodies against them. Among the different multi-epitope antigen constructs, those with highly exposed B cell epitopes are more likely to provoke the specific antibody responses they were designed for. SASA analysis of the B cell epitopes was therefore performed for the six constructs. The SASA displays of the B cell epitopes of Constructs S1-S6 are shown in Fig. 6, and the SASA values are given in Table 4. The SASA of Construct S1 is 9600.67 Å², the largest of all constructs, and its structure quality is high, so Construct S1 was considered the best choice. Further, the binding affinity of the T cell epitopes in Construct S1 was verified again using NetMHCIIpan. The 6 selected T cell epitopes all belonged to the strong binding level in their original antigens. After assembly, 4 T cell epitopes in Construct S1 still belonged to the strong binding level and the other two changed to the weak binding level (Table S2), which showed that the epitope assembly of Construct S1 was feasible.

Fig. 1 The phylogenetic tree of gene ureB (a), HP0499 (b) and HP0229 (c) from H. pylori SS1 and 12 clinical isolates H. pylori A-L
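The strong and weak binding levels used in this re-verification follow the percentile-rank cutoffs given in the methods (top 2% and top 10% for MHC-II; top 0.5% and top 2% for MHC-I). A minimal sketch of that classification is shown below. The function name and the treatment of values exactly at a cutoff (counted as inside the level) are assumptions of this sketch.

```python
def binding_level(percentile_rank: float,
                  strong_cutoff: float = 2.0,
                  weak_cutoff: float = 10.0) -> str:
    """Classify a predicted percentile rank (lower means stronger binding).

    Defaults correspond to the MHC-II cutoffs in the text; pass
    strong_cutoff=0.5 and weak_cutoff=2.0 for the MHC-I cutoffs.
    """
    if percentile_rank <= strong_cutoff:
        return "strong"
    if percentile_rank <= weak_cutoff:
        return "weak"
    return "non-binder"


def count_levels(ranks: list[float], **cutoffs: float) -> dict[str, int]:
    """Tally binding levels over a list of percentile ranks."""
    counts = {"strong": 0, "weak": 0, "non-binder": 0}
    for rank in ranks:
        counts[binding_level(rank, **cutoffs)] += 1
    return counts
```

Comparing the tallies for the epitopes in their native antigens and in the assembled construct gives the kind of before-and-after summary reported in Table S2.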
The innate immune response principally depends on pattern recognition receptors (PRRs) that recognize the pathogen-associated molecular patterns (PAMPs) of the pathogen (Chang et al. 2004). Toll-like receptors (TLRs) are the most studied PRRs; they identify invading microorganisms and activate the immune response (Netea et al. 2004). According to previous studies, H. pylori elicits the expression of pro-inflammatory genes via different receptors (TLR2, TLR4, and TLR9) (Chang et al. 2004; Ding et al. 2005; Uno et al. 2007). The ZDOCK server was used to evaluate the interaction between vaccine Construct S1 and mouse TLR2, TLR4, and TLR9. The complex structures are given in Fig. 7 and the docking data in Table 5. Given reliable $\Delta^i G$ P-values (P < 0.5), the three docking interfaces are all interaction-specific, based on the negative $\Delta^i G$.

JCat was used to improve the expression of the multi-epitope vaccine in E. coli K12. A total of 249 amino acids were submitted. After codon optimization, the Codon Adaptation Index (CAI) reached 1.0 with a GC content of 48.46%, which is close to the GC content of E. coli K12 (50.73%). This indicates that the final multi-epitope vaccine with the optimized sequence is likely to be highly expressed in E. coli K12. The in-silico cloning map of the multi-epitope vaccine construct is shown in Fig. 8.

To avoid the problems of antibiotic regimens against H. pylori infection, including relapses, increased resistance, and flora disturbances, vaccines could be a better alternative in terms of safety and effectiveness. Compared with conventional vaccine approaches, which have low effectiveness and high costs in budget and time, reverse vaccinology has become a new strategy for subunit vaccine development. With information about the genome and proteome of microbes, the screening and evaluation of vaccine antigens have become more practical.
In this study, 381 non-redundant protein sequences derived from 14,702 protein sequences, including outer membrane proteins and secreted proteins of H. pylori, were screened based on the immunogenicity score. MHC-I binding prediction, MHC-II binding prediction, and B cell linear epitope prediction were all considered in the evaluation of protein epitopes. PLA1, Omp6, and UreB were selected by virtue of their immunogenicity scores and distribution in clinical strains. Other untested proteins may also have potential as candidate antigens, since the six screened antigens were selected at random.

The development of subunit vaccines is no longer confined to a single antigen. A multi-epitope vaccine, which could stimulate a broader neutralizing antibody response and block multiple pathogenic channels, shows more effective results than a single antigen. Based on the new antigens PLA1 and Omp6 and the widely used antigen UreB, their epitopes were combined into several constructs with different orders. The constructs, although identical in amino acid composition, differ in antigen stability and in the position of key domains. In this study, 3D structure reliability and SASA analysis of B cell epitopes were used to evaluate the stability of the vaccine construct and the exposure of the B cell epitopes. Compared with homology modeling servers, the I-TASSER server is better suited to predicting the structures of multi-epitope vaccines because of their fragmented composition. The scores provided by I-TASSER and the further structure validation by the ProSA-web, RAMPAGE, and ERRAT servers reflect, to some extent, the rationality of protein folding and the likelihood of protein expression. In addition, because I-TASSER is time-consuming, only the six constructs with high scores in the preliminary VaxiJen screening were carried forward to structure prediction.
In fact, a reasonable number of structures can be predicted and tested, depending on needs and resources, until an acceptable multi-epitope vaccine construct is obtained.

Cholera toxin subunit B (CTB), a widely used mucosal adjuvant in the construction of multi-epitope vaccines against H. pylori, enhanced the immunogenicity and antigenicity of vaccine constructs in animal experiments (Guo et al. 2017b; Pan et al. 2018). It is questionable, however, whether CTB or another adjuvant should be included in the structure prediction of multi-epitope vaccines during the design phase, as was done in some studies (Hugo Urrutia-Baca et al. 2019; Nosrati et al. 2019). The combination of CTB and a multi-epitope antigen is a fusion protein rather than a new independent protein. Since only one resolved protein structure can be used as a template, structure prediction platforms based on resolved structures are not suitable for predicting the structures of fusion proteins. On the one hand, the structure of CTB changes greatly in the predicted structure of the fusion protein of CTB and the multi-epitope antigen, which may result in loss of the adjuvant effect. On the other hand, the addition of CTB may change the terminal structure of the antigen, which affects the accuracy of the prediction to a certain extent.

Fig. 4 The 3D structures S1-S6 predicted by I-TASSER

Fig. 5 Structure validation of Construct S1. a The Z-score of Construct S1 calculated by ProSA-web server was −2.39, which is in a reasonable range. b Ramachandran plot validation of Construct S1. Residues in favoured regions: 74.9%; residues in allowed regions: 15.8%; residues in outlier regions: 9.3%. c ERRAT analysis of Construct S1. The overall quality factor of Construct S1 was 86.250. On the error axis, two lines of 95% and 99% are drawn to indicate the confidence with which it is possible to reject regions that exceed that error value
It may therefore be more appropriate to add adjuvants after the design of the multi-epitope vaccine is complete. In constructing a multi-epitope vaccine, the ideal outcome is that each inserted B cell epitope is located on the surface of the antigen protein. In practice, however, some amino acids of the B cell epitopes may be folded inside the protein. SASA analysis of the B cell epitopes reflects, to some extent, their degree of exposure, so the more exposed constructs can be selected according to the SASA calculation. Notably, this is the first time that SASA analysis has been used to assess the exposure of B cell epitopes on the vaccine surface in multi-epitope vaccine construction. Nevertheless, exposure to water, which is the principle underlying SASA analysis, does not necessarily represent exposure to antibody, so there is a gap between this analysis and the real situation, and the method still needs to be refined. To evaluate interactions between the TLRs (TLR2, TLR4, and TLR9) and the vaccine structure, the ZDOCK server was used for molecular docking. The docking results indicated that the designed multi-epitope vaccine has the potential to bind these TLRs, but a more accurate characterization of the interactions requires further validation by molecular dynamics simulations. Codon optimization with the JCat server could enhance expression of the designed vaccine in E. coli K12. Further animal experiments are also required to verify whether the designed vaccine can trigger a specific immune response that controls H. pylori infection.

Fig. 6 SASA analysis of B cell epitopes in Constructs S1-S6. Exposed areas of B cell epitopes are colored blue and non-exposed areas red; T cell epitopes are colored gray

Vaccination is an effective and safe measure against pathogen infection, and improving the effectiveness of vaccines is key to treating infectious diseases. To develop a vaccine against H.
pylori, reverse vaccinology based on genomics, proteomics, and bioinformatics was applied to screen for the new antigens PLA1 and Omp6. To enhance the immune effect, a new construction strategy was applied to develop multi-epitope vaccines based on epitope analysis of PLA1, Omp6, and UreB. The strategy assesses the stability of the vaccine constructs, the location of key domains, and the probable interaction with TLRs on the basis of the predicted structures. Finally, the best construct was selected. We believe that this strategy has great potential in multi-epitope vaccine construction and is worth considering for vaccines against other pathogens.

Fig. 7 The complex structures of Construct S1 and TLRs. Construct S1 is colored blue and the TLRs are colored red. Amino acids at the interface are displayed with the "Surf" drawing method and the other amino acids with the "NewCartoon" drawing method in VMD

Table 5 The interface list of docking complex structures
a Interface area is calculated as the difference in total accessible surface areas of the isolated and interfacing structures, divided by two
b ΔiG indicates the solvation free energy gain upon formation of the interface. Negative ΔiG corresponds to hydrophobic interfaces, or positive protein affinity
c ΔiG P-value indicates the P-value of the observed solvation free energy gain.
P < 0.5 indicates interfaces with surprising (higher than would-be-average for given structures) hydrophobicity, implying that the interface surface can be interaction-specific
Interface area a (Å²)

Fig. 8 In-silico cloning map of multi-epitope vaccine Construct S1. The sequence of the multi-epitope antigen is highlighted in red

AntigenDB: an immunoinformatics database of pathogen antigens
Helicobacter pylori-related iron deficiency anemia: a review
Induction of cyclooxygenase-2 overexpression in human gastric epithelial cells by Helicobacter pylori involves TLR2/TLR9 and c-Src-dependent nuclear factor-kappa B activation
Verification of protein structures: patterns of nonbonded atomic interactions
Mice are protected from Helicobacter pylori infection by nasal immunization with attenuated Salmonella typhimurium phoP(c) expressing urease A and B subunits
Toll-like receptor 2-mediated gene expression in epithelial cells during Helicobacter pylori infection
Structural vaccinology starts to deliver
VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines
Helicobacter pylori and idiopathic thrombocytopenic purpura
Evaluation of superoxide dismutase from Helicobacter pylori as a protective vaccine antigen
CD-HIT: accelerated for clustering the next-generation sequencing data
A universal vaccine for serogroup B meningococcus
Antigen receptor function in the context of the nanoscale organization of the B cell membrane
Oral immunization with a multivalent epitope-based vaccine, based on NAP, urease, HSP60, and HpaA, provides therapeutic effect on H. pylori infection in Mongolian gerbils
Immunologic properties and therapeutic efficacy of a multivalent epitope-based vaccine against four Helicobacter pylori adhesins (urease, Lpp20, HpaA, and CagL) in Mongolian gerbils
Immunoinformatics approach to design a novel epitope-based oral vaccine against Helicobacter pylori
Improved methods for predicting peptide binding affinity to MHC class II molecules
BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes
NetMHCpan-4.0: improved peptide-MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data
Strategy for selecting and characterizing linker peptides for CBM9-tagged fusion proteins expressed in Escherichia coli
Effect of linker flexibility and length on the functionality of a cytotoxic engineered antibody fragment
Inference of macromolecular assemblies from crystalline state
Advances in structure-based vaccine design
A rational strategy to design multiepitope immunogens based on multiple Th lymphocyte epitopes
Structure validation by C alpha geometry: phi, psi and C beta deviation
Identification and characterization of antigens as vaccine candidates against Klebsiella pneumoniae
A novel design of a multi-antigenic, multistage and multi-epitope vaccine against Helicobacter pylori: an in silico approach
Towards a systems understanding of MHC class I and MHC class II antigen presentation
Structural vaccinology considerations for in silico designing of a multi-epitope vaccine
Toll-like receptors and the host defense against microbial pathogens: bringing specificity to the innate-immune system
Designing an efficient multi-epitope oral vaccine against Helicobacter pylori using immunoinformatics and structural vaccinology approaches
Towards the first multi-epitope recombinant vaccine against Crimean-Congo hemorrhagic fever virus: a computer-aided vaccine design approach
Helicobacter pylori infection is not associated with fatty liver disease including non-alcoholic fatty liver disease: a large-scale cross-sectional study in Japan
Protection against Helicobacter pylori infection in BALB/c mouse model by oral administration of multivalent epitope-based vaccine of cholera toxin B subunit-HUUC
Helicobacter pylori and gastrointestinal tract adenocarcinomas
Production and delivery of Helicobacter pylori NapA in Lactococcus lactis and its protective efficacy and immune modulatory activity
ZDOCK server: interactive docking prediction of protein-protein complexes and symmetric multimers
PSORTdb: a protein subcellular localization database for bacteria
Influence of protein fold stability on immunogenicity and its implications for vaccine design
Identification and characterization of potential therapeutic candidates in emerging human pathogen Mycobacterium abscessus: a novel hierarchical in silico approach
Helicobacter pylori activates and expands Lgr5(+) stem cells through direct colonization of the gastric glands
Improved method for linear B-cell epitope prediction using antigen's primary sequence
Medical progress: Helicobacter pylori infection
Why can't we make an effective vaccine against Helicobacter pylori?
Development of Helicobacter pylori treatment: how do we manage antimicrobial resistance?
OMPdb: a database of beta-barrel outer membrane proteins from Gram-negative bacteria
Toll-like receptor (TLR) 2 induced through TLR4 signaling initiated by Helicobacter pylori cooperatively amplifies iNOS induction in gastric epithelial cells
What is a B-cell epitope?
The immune epitope database (IEDB): 2018 update
Effect of Lactobacillus acidophilus and Bifidobacterium bifidum supplementation to standard triple therapy on Helicobacter pylori eradication and dynamic changes in intestinal flora
ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins
Immunology of Helicobacter pylori: insights into the failure of the immune response and perspectives on vaccine studies
Protegen: a web-based protective antigen database and analysis system
The I-TASSER Suite: protein structure and function prediction
Conformational B-cell epitopes prediction from sequences using cost-sensitive ensemble classifiers and spatial clustering