key: cord-0066177-01idumps
authors: Awan, Furqan; Ali, Muhammad Muddassir; Dong, Yuhao; Yu, Yong; Zeng, Zhenling; Liu, Yongjie
title: In Silico Analysis of Potential Outer Membrane Beta-Barrel Proteins in Aeromonas hydrophila Pangenome
date: 2021-07-26
journal: Int J Pept Res Ther
DOI: 10.1007/s10989-021-10259-z
sha: 04c43860db1c4e5198b79faadaca15c392c7194d
doc_id: 66177
cord_uid: 01idumps
Outer membrane proteins (OMPs) of Aeromonas hydrophila have a variety of functional roles in virulence and pathogenesis and represent promising targets for vaccine development. The main objective of this study was to develop an in silico model of the beta-barrel OMPs present across the valid A. hydrophila genomes (n = 22). With a program named the β-barrel Outer Membrane Protein Predictor (BOMP), a total of 1428 beta-barrel OMPs were predicted across the 22 genomes, with an estimated median of 65 per genome. In the pangenome analysis, only 32 OMPs were found to be conserved. These beta-barrel OMPs also varied by source of isolation and by COG and KEGG class. Among the 32 conserved OMPs, a highly antigenic protein was identified using VaxiJen. With B-cell epitope predictions, two amino acid fragments bearing B-cell binding sites were selected: GLTLGAQFTGNNDPQNADRSN (21-mer) and FKPSLAYLRTDVKDNARGIDDTATEY (26-mer). Further, an epitope (12 amino acids: GLTLGAQFTGNN) that binds the largest number of MHC alleles and has high antigenicity was determined. Analysis of evolutionary forces acting on the identified OMP sequence and epitope indicated that none of the basic amino acid sites showed significantly different substitution ratios. This conserved protein and epitope will be helpful in developing a vaccine that may be effective against all A. hydrophila strains. This study also provides a theoretical basis for vaccine design against other pathogenic bacteria.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10989-021-10259-z.
Aeromonas hydrophila is a zoonotic pathogen with a wide range of virulence factors, antibiotic resistances and biofilm formation capabilities (Dias et al. 2018). This bacterium carries many virulence determinants that cause disease in humans and animals, such as aerolysin, gelatinase, hemolysins and a type III secretion system (Ponnusamy et al. 2016; Pandey et al. 2010). It can act as a reservoir that transmits and shares traits such as antibiotic resistance (Piotrowska and Popowska 2014). It can also grow at a relatively low temperature (4 °C) (Jahid et al. 2014), which makes it a hazard in refrigerated foods. A. hydrophila can form biofilms with the help of adhesion factors such as the S-layer protein, flagella and outer membrane proteins (OMPs) (Qin et al. 2016; Awan et al. 2018a). These virulence factors establish this bacterium as an important pathogen of humans and animals. In Gram-negative bacteria, OMPs comprise 8-24 β-strands arranged in an antiparallel fashion (Fairman et al. 2011). These proteins are directly involved in the interaction with the various environments encountered by pathogenic organisms, and thus play essential roles in bacterial pathogenesis (Ling et al. 2018). As OMPs are a basic component of the bacterial cell surface, they may serve as adhesins and porins. For example, a 43 kDa OMP of A. caviae was found to significantly improve adherence to HEp-2 cells (Rocha-De-Souza et al. 2001). Similarly, Omp48 plays an important role in the adhesion process of A. veronii (Vàzquez-Juárez et al. 2004). Multiple studies have demonstrated that OMPs are extremely immunogenic and represent promising targets for vaccine development. A recent example of such an immunogenic OMP from A. hydrophila is Omp48, which induced a protective response in the Indian major carp, rohu (Labeo rohita) (Khushiramani et al. 2012).
Previously, OmpG was found to be immunogenic against A. hydrophila infection in European eels (Guan et al. 2011). Similarly, a recombinant Omp38 protein could effectively stimulate protection against A. hydrophila (Wang et al. 2013). Although a number of OMPs have been identified in A. hydrophila, little is known about conserved, universal OMPs in this bacterium that could support vaccine development against all the strains identified so far. Screening of OMPs and their in silico modeling for reverse vaccinology are relatively new areas. In silico methods are now proving helpful in predicting the characteristics of microbes, such as virulence factors, antibiotic resistance genes, outer membrane proteins and bacteriophages (Jia et al. 2017; Chen et al. 2016; Zhou et al. 2011; Tsirigos et al. 2011; Berven et al. 2004). A program named the β-barrel Outer Membrane Protein Predictor (BOMP) has been demonstrated to be a valid tool for identifying possible new β-barrel OMPs in Escherichia coli K12 and Salmonella typhimurium (Berven et al. 2004). In the present study, we used BOMP and other in silico approaches to identify putative β-barrel OMPs from 22 available complete A. hydrophila genomes, and analyzed the antigenic epitopes that could serve as a basis for future vaccines or drugs.
All publicly available valid A. hydrophila RefSeq genome sequences (30 full genomes), as in a previous study (Awan et al. 2018b), were obtained from the National Center for Biotechnology Information (NCBI Resource Coordinators 2017). Geographically, these sequenced strains were isolated from the USA (n = 6), China (n = 14), Japan (n = 1) and South Korea (n = 1). The sources of isolation were the environment (n = 5), humans (n = 2), fish (n = 11), snakes (n = 3) and coypus (n = 1). To differentiate two genomes with the same name, WCX23, one of them was renamed WCX23-2.
All strains with acceptable ANI results (ANI > 95.0) were included in the pan-genome (core genome and dispensable genome) analysis. Genome alignment was performed in MAUVE v.20150226 to refine the genome assemblies and scaffolds (Darling et al. 2010). To predict the β-barrel proteins (BBPs) present in the A. hydrophila genomes, the program BOMP (services.cbu.uib.no/tools/bomp) was utilized (Berven et al. 2004). All predicted BBPs were clustered using the UCLUST algorithm (Edgar 2010). BBPs shared by all strains were considered conserved (core) genes, while dispensable genes, present either in two or more strains (accessory proteins) or in only one strain (unique proteins), were also identified. The identified proteins were functionally annotated against the Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases (Kanehisa et al. 2012). For this functional annotation, sequences were aligned against these databases using UBLAST with an e-value of 1 × 10⁻⁶ and an alignment length of 80% (Edgar 2010). All figures were generated with the R package ggplot2 (Ihaka and Gentleman 1996). Furthermore, predicted OMP sequences were translated into protein sequences using the Prodigal software (Hyatt et al. 2010). These amino acid sequences were then concatenated and used to determine the molar percentage of each amino acid with the BioEdit software (Hall 1999). The representative sequences obtained for core and dispensable genes were used for downstream analyses.
The conserved BBPs were subjected to multiple sequence alignment (MSA). For this purpose, all conserved OMP genes were concatenated and aligned using MUSCLE (Edgar 2004). These concatenated, aligned sequences were used in MEGA v6.0 to generate and visualize a neighbor-joining phylogenetic tree (Edgar 2004), which was further refined for display on the iTOL website (itol.embl.de/) (Letunic and Bork 2016).
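The core/accessory/unique split described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the clustering step has already been run and that its output has been parsed into a mapping from cluster ID to (strain, protein) pairs. The function name `partition_pangenome` and the toy IDs are hypothetical.

```python
def partition_pangenome(cluster_members, n_strains):
    """Split protein clusters into core, accessory, and unique sets.

    cluster_members maps a cluster ID to an iterable of (strain, protein_id)
    pairs. A cluster is 'core' if it contains every strain, 'accessory' if it
    contains at least two strains but not all, and 'unique' if it contains
    exactly one strain. Empty clusters are ignored.
    """
    core, accessory, unique = [], [], []
    for cluster_id, members in cluster_members.items():
        strains = {strain for strain, _ in members}
        if not strains:
            continue  # nothing to classify
        if len(strains) == n_strains:
            core.append(cluster_id)
        elif len(strains) >= 2:
            accessory.append(cluster_id)
        else:
            unique.append(cluster_id)
    return {
        "core": sorted(core),
        "accessory": sorted(accessory),
        "unique": sorted(unique),
    }


if __name__ == "__main__":
    # Toy input with three hypothetical strains; all IDs are made up.
    toy = {
        "c1": [("S1", "p1"), ("S2", "p7"), ("S3", "p4")],
        "c2": [("S1", "p2"), ("S3", "p9")],
        "c3": [("S2", "p5")],
    }
    print(partition_pangenome(toy, n_strains=3))
```

Counting distinct strains rather than sequences keeps paralogs within one genome from inflating a cluster into the accessory or core set.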
All conserved OMP sequences were analyzed for antigenicity using the VaxiJen v2.0 server (www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html). Only the OMP with the highest antigenicity value was selected for downstream analyses. The SWISS-MODEL server (https://swissmodel.expasy.org) was used to perform protein homology modeling. The predicted 3D model was refined using the GalaxyRefine server (https://galaxy.seoklab.org), followed by reliability calculation using the ProQ web server (https://proq.bioinfo.se/cgi-bin/ProQ/ProQ.cgi). The ProFunc server (https://www.ebi.ac.uk/thornton-srv/databases/ProFunc) was also used to confirm and calculate the functional parameters of the 3D structure.
The BCPred and AAP modules of the BCPreds server were utilized to identify B-cell binding sites in the selected OMP sequence. The epitopes predicted by the two models varied considerably. Therefore, common epitopes predicted by both models were selected, followed by a refined alignment to build amino acid fragments possessing B-cell epitopes. Furthermore, the TMHMM server v2.0 (https://www.cbs.dtu.dk/services/TMHMM/) was used to confirm the location of these amino acid fragments in the OMP. The ProPred-1 server (https://crdd.osdd.net/raghava/propred1/) was utilized to predict the major histocompatibility complex (MHC) I and II binding sites in the amino acid fragments containing B-cell binding sites. Only those fragments that bound both MHC I and MHC II, with the maximum number of MHC alleles, were selected. Finally, a candidate epitope was selected on the basis of having both B- and T-cell binding sites.
As before, the VaxiJen server v2.0 (antigenicity) and the TMHMM server v2.0 (topology and exo-membrane position) were used to verify each screened epitope. Additionally, to confirm MHC binding activity, the MHCPred v.2 server (https://www.ddg-pharmfac.net/mhcpred/MHCPred/) was utilized with an IC50 value of 1000 nM for HLA-DRB1*0101. To better understand the structural characteristics of the candidate epitope, 3-D modeling and calculation of some physicochemical parameters were also necessary. The DISTILL server (https://distillf.ucd.ie/distill/) was used for 3-D epitope modeling. The molecular weight and isoelectric point (pI) of the epitopes were calculated with the ExPASy server (https://web.expasy.org/compute_pi/).
The DATAMONKEY web server (https://www.datamonkey.org) was employed to recognize specific codons under positive selection in order to understand the influence of evolution on the candidate gene. Results were interpreted in terms of amino acid residues evolving under positive Darwinian selection, i.e. sites with a high probability of ω > 1, where ω is the ratio of the non-synonymous substitution rate (dN) to the synonymous substitution rate (dS). To confirm the codons under positive selection, the aligned codons were uploaded to Selecton version 2.2 (https://selecton.tau.ac.il), which predicts ω ratios with a Bayesian inference method to indicate the mutational shift in codons.
After ANI analysis, 3 genomes had values below the accepted level of 95.0. Among the remaining 27 genomes, 5 showed unusual behaviour and were excluded from downstream analysis to avoid their effects. The total number of beta-barrel OMPs across the 22 genomes was found to be 1428, with an estimated median of 65 per genome (Table 1). Among these, the number of accessory OMPs ranged from 22 to 39, with the maximum found in strain ZYAH72. This difference in accessory OMPs suggests possible horizontal gene transfer and the presence of genes acquired from particular environments. The highest number of unique OMPs (n = 8) was found in strains HX-3 and WCHAH045096.
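The ω ratio used in the selection analysis can be illustrated with a small helper. This is only a sketch of the definition ω = dN/dS and the conventional reading of its value. It is not how DATAMONKEY or Selecton decide significance, since those servers fit codon models and report statistical support rather than applying a point cutoff. The function names and the tolerance `tol` are assumptions made for illustration.

```python
def omega(dn, ds):
    """Return ω = dN/dS, or None when dS is zero and the ratio is undefined."""
    if ds == 0:
        return None
    return dn / ds


def classify_site(dn, ds, tol=1e-9):
    """Label a codon site by the conventional interpretation of ω.

    ω > 1 suggests positive (diversifying) selection, ω < 1 suggests
    purifying selection, and ω close to 1 is consistent with neutrality.
    The tolerance is an illustrative assumption, not a statistical test.
    """
    w = omega(dn, ds)
    if w is None:
        return "undefined"
    if w > 1 + tol:
        return "positive"
    if w < 1 - tol:
        return "purifying"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical per-site rates, chosen only to exercise each branch.
    for dn, ds in [(0.2, 0.5), (1.5, 0.5), (0.5, 0.5), (0.1, 0.0)]:
        print(dn, ds, classify_site(dn, ds))
```

In practice a site would be reported as positively selected only when the model-based support for ω > 1 is high, which is why the text speaks of sites with a high probability of ω > 1.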
Among the COG classes, core OMPs were found to be more enriched and functionally diverse, whereas accessory and unique OMPs were more involved in metabolism, environmental processing and human diseases (Supplementary Fig. 1). Among the KEGG categories, core OMPs include varied functional groups but lack the group related to human diseases. OMPs associated with human diseases fall in the accessory and unique groups (Supplementary Fig. 1). This variation in OMPs may reflect acquired gene transfer that supports bacterial survival and virulence.
MSA was performed on the conserved core OMPs. As shown in Fig. 1, some strains (n = 9) clustered in one clade, while the other strains (n = 13) showed greater variation and more distant relationships with each other.
Among the conserved OMPs, one OMP (NCBI reference sequence: WP_016349559) was selected as a probable antigen with the highest antigenic score of 0.7588. The trans-membrane and exo-membrane regions of this OMP were identified and confirmed with the TMHMM server. The trans-membrane region, ranging from 0 to 33 amino acid residues, was not selected for further analysis (Supplementary Fig. 2). As the 3D structure of this protein was not available in the Protein Data Bank (PDB), 3D homology modeling was carried out using the SWISS-MODEL web server. On the basis of sequence similarity, the structure with PDB ID 5o77, which has 39.44% sequence identity, was accepted as the template (Fig. 2a). The optimized 3D model of this protein consisted of 5 helices, 18 strands and 26 turns (Fig. 2b). Additionally, the molecular weight and isoelectric point (pI) of the model were found to be 39.5 kDa and 4.61, respectively. This 3D model was further refined using the GalaxyRefine algorithm.
The model was further validated with ProQ, which indicated that the protein model was acceptable: the LGscore (3.75) and MaxSub (0.25) values fell in the good and fairly acceptable model ranges, respectively. To assess the stereochemical quality of the model, the Procheck server was used. In the Ramachandran plot, residues were observed in the most favored regions (red; 89.1%), additional allowed regions (yellow; 9.6%), generously allowed regions (light yellow; 0.3%) and disallowed regions (white; 1.0%) (Fig. 2c). This plot was used to assess the geometric acceptability of the model.
Fig. 2 In silico 3-D modeling of the selected OMP (WP_016349559) and its verification. a Homology-based predicted 3-D model of the selected OMP. b Predicted model showing the secondary structure of the protein, revealing the number of beta barrels and alpha chains. c Ramachandran plot describing the stereochemical quality of the predicted model
For prediction of B-cell epitopes, both modules of BCPreds Server 1.0 were used. The resulting epitope sequences were highly overlapping (Supplementary Table 1). Thus, the highly common sequences were selected as B-cell epitopes. These selected B-cell epitopes (more than 20 amino acids) were then aligned to obtain a common length of antigenic epitopes (Fig. 3). As a result, two short sequences that could act as B-cell epitopes were generated: GLTLGAQFTGNNDPQNADRSN (position = 160-181; 21 amino acids; VaxiJen score = 0.8931) and FKPSLAYLRTDVKDNARGIDDTATEY (position = 289-315; 26 amino acids; VaxiJen score = 0.7715). For MHC binding, ProPred 1 (MHC I) and ProPred (MHC II) were used to examine the previously selected B-cell epitopes. The resulting common T-cell epitopes were screened (Supplementary Table 2) for downstream analyses (Fig. 3). The epitope that bound both MHC classes and the largest number of MHC alleles was chosen.
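The selection rule above (bind both MHC classes, bind the most alleles, remain antigenic) can be written as a short filter-and-rank step. The sketch below uses the allele counts and VaxiJen scores that the Results report for the two candidate peptides. The `Candidate` class, the function name `select_epitope`, and the antigenicity cutoff of 0.4 are assumptions made for illustration, since the article does not state the cutoff it applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    peptide: str
    mhc1_alleles: int   # number of MHC I alleles predicted to bind
    mhc2_alleles: int   # number of MHC II alleles predicted to bind
    vaxijen: float      # VaxiJen antigenicity score


def select_epitope(candidates, antigenicity_cutoff=0.4):
    """Pick the antigenic candidate that binds both classes and most alleles.

    The cutoff is an assumed value for this sketch. Ties on total allele
    count are broken by the higher antigenicity score.
    """
    eligible = [
        c for c in candidates
        if c.mhc1_alleles > 0
        and c.mhc2_alleles > 0
        and c.vaxijen >= antigenicity_cutoff
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda c: (c.mhc1_alleles + c.mhc2_alleles, c.vaxijen))


# Values as reported in the Results for the two peptides.
CANDIDATES = [
    Candidate("GLTLGAQFTGNN", mhc1_alleles=9, mhc2_alleles=29, vaxijen=0.8389),
    Candidate("SLAYLRTDVK", mhc1_alleles=1, mhc2_alleles=23, vaxijen=-0.0895),
]

if __name__ == "__main__":
    best = select_epitope(CANDIDATES)
    print(best.peptide if best else "no eligible epitope")
```

With these inputs the second peptide is excluded by its negative antigenicity score regardless of the exact cutoff, so the choice does not hinge on the assumed value.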
Consequently, only one 12-amino-acid sequence (GLTLGAQFTGNN; position = 161-172) was chosen. This epitope was found to bind 9 MHC I and 29 MHC II alleles (Supplementary Table 2). To assess binding of the epitope to the DRB1*0101 allele, MHCPred v.2 analysis was performed (MHCPred IC50 score = 1815.52 nM). The antigenicity of the selected epitope was examined again with the VaxiJen server (VaxiJen score = 0.8389). As for the topology of the predicted epitope, the TMHMM 2.0 server indicated that it was exposed on the outer surface of the protein. Another peptide, SLAYLRTDVK, was found to bind 1 MHC I and 23 MHC II alleles, but it was not antigenic according to the VaxiJen v2.0 antigen prediction server (VaxiJen score = −0.0895). To add further support for the selected epitope as a peptide vaccine candidate, a 3-D model was built. Because the sequence is very short (12-mer: GLTLGAQFTGNN), the DISTILL server was used to predict the 3-D structure of the epitope (Fig. 4). The molecular weight (MW) and pI of the 12-mer epitope were calculated as 1192.29 Da and 5.52, respectively.
To determine whether the selected amino acid sites in the selected OMP sequence are affected by evolutionary selection forces, the BUSTED codon model was applied on the DATAMONKEY web server. There was no indication of adaptive evolution. Across the whole gene sequence, none of the basic amino acid sites showed significantly different substitution ratios (Supplementary Table 3). To confirm the effects of positive selection on the screened epitope, the Selecton server was also used to analyze selection pressure at the codon level with the Mechanistic-Empirical Combination (MEC) model. Adaptive selection pressure was detected at various codons in the selected OMP sequence, but the sites selected and analyzed for the epitope were not recognized as being under positive selection (Fig. 5).
Fig. 3 Graphical depiction of the B-cell and T-cell epitopes. The epitope sequences were predicted with BCPreds, and with ProPred 1 (MHC I) and ProPred (MHC II), respectively. Red rectangles represent highly overlapping sequences identified by BCPreds Server 1.0. These two BCPreds-identified sequences were further examined for MHC allele binding. Underlined sequences were found to bind the maximum number of MHC alleles (Color figure online)
Fig. 4 3-D model of the selected epitope (GLTLGAQFTGNN) generated by using the DISTILL web server. The graphical representation was generated by using PyMOL software
Identification of potential drug targets and vaccine candidates is usually considered the first step in combating a disease. This study was conducted to find potential vaccine candidate genes for A. hydrophila. Along with zoonosis, this bacterium is also notorious for multiple antibiotic resistances, virulence factors and biofilm formation (Dias et al. 2018). As in all Gram-negative bacteria, OMPs are key components in all of these mechanisms, controlling transport and acting as porins or gateways. In this context, OMPs can serve as a universal target for designing a vaccine. In silico and immunoinformatics approaches are usually used to target the OMPs that are most antigenic and are conserved. In silico prediction of beta-barrel OMPs with good accuracy is an area worth exploring. Previously, the most common way to identify integral beta-barrel proteins from predicted proteomes was local alignment against the PSORT database (Gardy et al. 2003). Our study instead used the BOMP server, which provides fast and reliable information for the experimental analysis of beta-barrel OMPs (Berven et al. 2004). Furthermore, with a universal vaccine in mind, a comparative genomics approach was used to identify the core OMPs present in all the genomes.
In previous studies, certain OMPs such as ompK37, OprJ and OprM, which play a key role in antibiotic resistance, were found to be present in only a few strains (Ling et al. 2018; Awan et al. 2018b). A vaccine designed on dispensable OMPs will not give good results in strains lacking those OMPs. A recent study on in silico design of OMP-based epitopes in A. hydrophila indicated a lack of universality, or conservation, of OMP genes (Grassmann et al. 2017). MSA was also performed in this study to reveal the phylogenetic relationships of the strains. The results obtained here are quite similar to those of previous studies performed on whole genomes (Awan et al. 2018b; Vaish et al. 2018). Phylogenetic analysis based on OMP sequence alignments is reliable, but this methodology requires clarification and more evidence (Nahar et al. 2017; Heinz and Lithgow 2014). However, the application of comparative genomics and immunoinformatics strategies, as in this study, holds remarkable significance for the in silico design of potential vaccine targets.
Recent vaccine strategies, such as subunit vaccines, include the targeted and prolonged delivery of peptide antigens to antigen-presenting cells (APCs) (Hos et al. 2018; Devi and Chaitanya 2020; Yang et al. 2021). These APCs present the antigens through MHC and cluster of differentiation (CD) receptors to T and B lymphocytes, which secrete lymphokines and antibodies, respectively. Additionally, B cells differentiate into memory cells that recognize the same antigen in the future. It has been reported that vaccines designed on this strategy provide better results (Azuar et al. 2019). Hence, in this study, it was considered necessary to design an epitope that can stimulate both B-cell- and T-cell-mediated immune responses. In the current study, the OMP sequences were screened on the basis of their antigenicity, followed by B-cell epitope prediction with the BCPreds server.
Both the BCPred and AAP modules were utilized to confirm the predicted epitopes. Interestingly, highly varying epitopes were predicted, which were further screened on the basis of exo-membrane location and antigenicity. These selected B-cell epitopes were then examined for T-cell epitopes using ProPred 1 (MHC I) and ProPred (MHC II). Predicted peptides that bind both MHC classes, and in particular those binding the maximum number of MHC alleles, were ultimately selected. As a result, a single 12-amino-acid sequence (GLTLGAQFTGNN) was identified with a high antigenicity score and exo-membrane localization. The methodology of this study agrees with that of previously designed vaccine targets that gave good results in other bacterial species (Vaish et al. 2018; Azuar et al. 2019; Satyanarayana et al. 2018).
The effects of evolution via selection on the selected OMP sequence were also analyzed. The BUSTED codon model on the DATAMONKEY web server was used because it is suitable for predicting the effects of positive selection on a whole gene (Weaver et al. 2018). As previously reported, conserved sequences from several strains of the same species did not show significantly different substitution ratios. Furthermore, the Selecton server was used in this study to evaluate the effect of adaptive evolution at specific codons. The epitope sites selected and analyzed were not recognized as being under positive selection. Although the whole sequence had already been found to be conserved, it was necessary to examine the epitope sites as well.
In silico prediction of beta-barrel OMPs is helpful in identifying potential candidates for developing a vaccine against A. hydrophila infection. The epitope (GLTLGAQFTGNN) screened through successive strategies has the potential to stimulate both T-cell and B-cell immune responses.
Furthermore, evolutionary selection forces are unlikely to act on the epitope sequence, which suggests that this epitope is conserved among all the strains. Thus, the predicted epitope (GLTLGAQFTGNN) is suitable for further laboratory validation with the aim of developing a universal vaccine against A. hydrophila infection.
The fight for invincibility: environmental stress response mechanisms and Aeromonas hydrophila
Comparative genome analysis provides deep insights into Aeromonas hydrophila taxonomy and virulence-related factors
Recent advances in the development of peptide vaccines and their delivery systems against Group A Streptococcus
BOMP: a program to predict integral β-barrel outer membrane proteins encoded within genomes of Gram-negative bacteria
VFDB 2016: hierarchical and refined dataset for big data analysis-10 years on
progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement
In silico designing of multi-epitope vaccine construct against human coronavirus infections
Biofilm formation and multidrug-resistant Aeromonas spp
MUSCLE: multiple sequence alignment with high accuracy and high throughput
Search and clustering orders of magnitude faster than BLAST
The structural biology of β-barrel membrane proteins: a summary of recent reports
PSORT-B: improving protein subcellular localization prediction for Gram-negative bacteria
Discovery of novel leptospirosis vaccine candidates using reverse and structural vaccinology
Enhancement of protective immunity in European eel (Anguilla anguilla) against Aeromonas hydrophila and Aeromonas sobria by a recombinant Aeromonas outer membrane protein
BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT
A comprehensive analysis of the Omp85/TpsB protein superfamily structural diversity, taxonomic occurrence, and evolution
Approaches to improve chemically defined synthetic peptide vaccines
Prodigal: prokaryotic gene recognition and translation initiation site identification
R: a language for data analysis and graphics
Inactivation kinetics of cold oxygen plasma depend on incubation conditions of Aeromonas hydrophila biofilm on lettuce
2017: expansion and model-centric curation of the comprehensive antibiotic resistance database
KEGG for integration and interpretation of large-scale molecular data sets
Recombinant Aeromonas hydrophila outer membrane protein 48 (Omp48) induces a protective immune response against Aeromonas hydrophila and Edwardsiella tarda
Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees
Proteomic analysis of alterations in Aeromonas hydrophila outer membrane proteins in response to oxytetracycline stress
In silico assessment of the genotypic distribution of virulence and antibiotic resistance genes in Pseudomonas aeruginosa
Hemolysin, protease, and EPS producing pathogenic Aeromonas hydrophila strain An4 shows antibacterial activity against marine bacterial fish pathogens
The prevalence of antibiotic resistance genes among Aeromonas species in aquatic environments
Cross-talk among flesh-eating Aeromonas hydrophila strains in mixed infection leading to necrotizing fasciitis
Flagellar motility is necessary for Aeromonas hydrophila adhesion
Identification of a 43-kDa outer-membrane protein as an adhesin in Aeromonas caviae
In silico structural homology modeling of nif A protein of rhizobial strains in selective legume plants
OMPdb: a database of β-barrel outer membrane proteins from Gram-negative bacteria
In silico genome-wide identification and characterization of the glutathione S-transferase gene family in Vigna radiata
Adhesive properties of a LamB-like outer-membrane protein and its contribution to Aeromonas veronii adhesion
Identification of Omp38 by immunoproteomic analysis and evaluation as a potential vaccine antigen against Aeromonas hydrophila in Chinese breams
Evaluation and comparison of newly built linear B-cell epitope prediction software from a users' perspective
Datamonkey 2.0: a modern web application for characterizing selective and other evolutionary processes
An in silico deep learning approach to multi-epitope vaccine design: a SARS-CoV-2 case study
PHAST: a fast phage search tool