key: cord-0728591-uaqkpdmb authors: Sundar, Shobana; Thangamani, Lokesh; Piramanayagam, Shanmughavel; Natarajan, Jeyakumar title: Discovering mycobacterial lectins as potential drug targets and vaccine candidates for tuberculosis treatment: a theoretical approach date: 2021-05-18 journal: J Proteins Proteom DOI: 10.1007/s42485-021-00065-y sha: b6110582aef429cb45e4aeeedb7410e7cb230ccc doc_id: 728591 cord_uid: uaqkpdmb

M. tuberculosis (Mtb) proliferates within macrophages during infection, and its cell wall carries carbohydrate-binding proteins called lectins. Despite their surface localization, the exact functions of these lectins remain unexplored. Hence, in our study, 11 potential lectins of Mtb were explored as potential drug targets and vaccine candidates using in silico approaches. Initially, a gene interaction network was constructed for the 11 potential lectins and their functional partners were identified. A gene ontology analysis was also performed for the 11 mycobacterial lectins along with their functional partners; it showed that most of the proteins are present in the extracellular region of the bacterium and belong to the PE/PPE family of proteins. Further, molecular docking studies were performed for two of the potential lectins (Rv2075c and Rv1917c). A series of quinoxalinone and fucoidan derivatives was docked against these selected lectins. The molecular docking study revealed that quinoxalinone derivatives showed better affinity for Rv2075c, whereas fucoidan derivatives showed good binding affinity for Rv1917c. Moreover, because mycobacterial lectins can interact with the host, they are considered potential vaccine candidates. Hence, an immunoinformatics study was carried out for all 11 potential lectins. B-cell and T-cell binding epitopes were predicted using in silico tools. Further, an immunodominant epitope (1062)SIPAIPLSVEV(1072) of Rv1917c was identified, which was predicted to bind B-cell and most of the MHC alleles.
Thus, the study indicates that mycobacterial lectins could potentially be used as drug targets and vaccine candidates for tuberculosis treatment.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s42485-021-00065-y.

Tuberculosis (TB) is an infectious disease caused by the rod-shaped bacillus Mycobacterium tuberculosis (Mtb) and affects millions of people worldwide. According to the WHO, in 2018 nearly 10 million people around the world developed TB and 1.5 million people died from the disease. Many new chemotherapeutic drugs and vaccines are being used to treat TB; however, the emergence of drug-resistant strains increases the treatment burden. The Mtb proteome has an abundance of hypothetical, putative or uncharacterized proteins whose functions remain unexplored. These include the mycobacterial lectins, whose functions are also unknown.

Lectins are glycan- or sugar-binding proteins that bind a diverse range of sugar molecules. They are involved in many biological processes, such as cell-cell interactions and host-pathogen interactions. Lectins were initially found mainly in plants and animals (Vijayan and Chandra 1999). In bacteria, lectins occur as fimbriae and pili, which are filamentous protein appendages projecting from the bacterial surface. The most extensively studied bacterial lectins are the mannose-specific FimH of type 1 fimbriae and the galabiose-specific PapG of P fimbriae of Escherichia coli. FimH is associated with urinary tract infections (Chen et al. 2009) and PapG has been suggested to have a role in renal colonization (Lane and Mobley 2007).

In 1989, Kundu et al. isolated mycotin (a lectin) from Mycobacterium smegmatis and found that it agglutinates human erythrocytes (Kundu et al. 1989). Later, in 2007, Singh et al. identified eleven potential lectins in the Mtb genome by comprehensive bioinformatics analysis (Singh et al. 2007). In 2012, Abhinav et al.
performed a bioinformatics-based homology search of lectin-encoding gene regions in 30 fully or partially sequenced mycobacterial genomes and identified 94 potential glycan-binding proteins (Abhinav et al. 2013). Three potential lectins of the Mtb genome identified by Singh et al. (2007) were also found in this study. This finding further supports the importance of mycobacterial lectins in the pathogenesis of the organism. Therefore, in this study, the 11 mycobacterial lectins found by Singh et al. (2007) were explored through in silico approaches as potential drug targets and vaccine candidates that can be used for the treatment of TB.

The protein-protein interaction network for all 11 putative mycobacterial lectins was retrieved using the stringApp tool in Cytoscape 3.7.2 (Smoot et al. 2011). The confidence score cutoff was set at 0.4 and the maximum number of additional interactors was set to 100. The functional enrichment data for all the nodes were also retrieved using the functional enrichment tool in Cytoscape. Clustering analysis was performed using the MCODE app in Cytoscape with default parameters.

The sequences of two mycobacterial lectins, Rv1917c and Rv2075c, which have 1459 and 487 amino acids, respectively, were retrieved from UniProtKB in FASTA format. The three-dimensional models were generated using the I-TASSER server (Zhang 2008), which generates a 3D model of a query sequence by multiple threading alignments and iterative structural assembly simulation. The best 3D model (model 1) for each of the two selected lectin proteins was chosen for further studies, and the homology models were validated using the Ramachandran plot and ERRAT scores (Colovos and Yeates 1993). PyMOL was used to visualize the modelled structures.

The AutoDock Vina program was used for the molecular docking studies (Trott and Olson 2010). Fifty-four fucoidan derivatives were retrieved from the PubChem database (Kim et al.
2016) and docked with the Rv1917c homology model, whereas 233 quinoxalinone derivatives were retrieved from the PubChem database and docked with the Rv2075c homology model. The retrieved ligand structures were converted into PDB file format using the Open Babel command-line tool (obabel). The grid box was laid over the whole protein and the grid spacing was fixed at 1 Å. The grid box dimensions were set to 126 × 126 × 126 for Rv1917c and 62 × 60 × 64 for Rv2075c. The ligand docking scores were noted and the interactions were obtained using Discovery Studio Visualizer.

The ABCpred server (Saha and Raghava 2007) was used to predict the B-cell epitopes from the target protein sequence. The window length was fixed at 20 and the default threshold of 0.5 was used for the prediction. The retrieved epitopes were filtered by their score.

IEDB tools were used for T-cell epitope predictions, and the 10 predominantly occurring MHC (major histocompatibility complex) alleles in the human population for HLA class I (A*01:01, A*02:01, A*03:01, A*11:01, A*24:02, B*07:02, B*08:01, B*35:0, B*40, B*44) were considered for the predictions (Agrewala and Wilkinson 1999; Chodisetti et al. 2012; Verma et al. 2018). The stabilized matrix method (SMM) (Peters and Sette 2005) was used for the prediction and the peptide length was fixed at 11 amino acids. The promising MHC I T-cell epitopes were selected based on the predicted IC50 values of the peptides (< 50 nM). Proteasomal tools available at IEDB were also employed to predict T-cell epitopes; these consider several important stages of the MHC class I processing pathway: proteasome cleavage, TAP binding, MHC binding and epitope selection.
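The MHC I selection rule described above (11-residue peptides, predicted IC50 below 50 nM) can be illustrated with a short sketch. It does not call IEDB or the SMM predictor; the `Prediction` record, the function names and all example values are hypothetical and stand in for a table of predictions exported from a prediction tool.

```python
from dataclasses import dataclass

IC50_CUTOFF_NM = 50.0   # threshold stated in the text
PEPTIDE_LENGTH = 11     # MHC I peptide length stated in the text


@dataclass(frozen=True)
class Prediction:
    """One predicted peptide-allele pair (hypothetical record layout)."""
    allele: str
    start: int        # 1-based start position in the protein
    peptide: str
    ic50_nm: float    # predicted IC50 in nM


def sliding_peptides(sequence, length=PEPTIDE_LENGTH):
    """Yield (1-based start, peptide) for every window of the given length."""
    for i in range(len(sequence) - length + 1):
        yield i + 1, sequence[i:i + length]


def promising_binders(predictions, cutoff_nm=IC50_CUTOFF_NM):
    """Keep predictions with IC50 strictly below the cutoff, strongest first."""
    kept = [p for p in predictions if p.ic50_nm < cutoff_nm]
    return sorted(kept, key=lambda p: p.ic50_nm)


# Made-up peptides and IC50 values, not results from the study.
example = [
    Prediction("HLA-A*02:01", 5, "AAAAAAAAAAA", 12.5),
    Prediction("HLA-A*02:01", 6, "CCCCCCCCCCC", 480.0),
    Prediction("HLA-B*07:02", 9, "DDDDDDDDDDD", 49.9),
    Prediction("HLA-B*07:02", 10, "EEEEEEEEEEE", 50.0),
]
selected = promising_binders(example)
windows = list(sliding_peptides("A" * 15))
```

In this sketch the comparison is strict, so a peptide predicted at exactly 50 nM is not retained; whether the boundary value should be included is an assumption that the text does not settle.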
MHC II binding epitopes were predicted using IEDB tools, and the predominantly occurring HLA class II alleles (DRB1*01:01, DRB1*03:01, DRB1*04:01, DRB1*07:01, DRB1*08:02, DRB1*11:01, DRB1*13:02, DRB1*15:01, HLA-DQA1*05:01|DQB1*02:01, HLA-DPA1*02:01|DPB1*01:01, HLA-DQA1*01:02|DQB1*06:02, HLA-DQA1*03:01|DQB1*03:02) were considered for the predictions. The SMM method was used for the prediction and the peptide length was set at 15 amino acids. The promising MHC II T-cell epitopes were selected based on the predicted IC50 values of the peptides (< 50 nM).

The antigenicity of the epitopes was assessed using the VaxiJen server (Doytchinova and Flower 2007) with default parameters. Population coverage of the predicted epitopes was analysed using the IEDB population coverage tool for the populations of South Asia, India and the entire globe. To avoid cross-reactivity with the human host, the predicted epitopes were also checked for similarity to the human proteome by BLAST analysis (Altschul et al. 1997). The predicted epitopes were then scanned for allergenicity using the ANTIGENpro (Magnan et al. 2010) and SORTALLER (Zhang et al. 2012) tools. Structural properties of the predicted epitopes, including transmembrane topology and solubility upon overexpression, were further predicted by ABTMpro and SOLpro, respectively, which are available at the SCRATCH protein prediction server (http://scratch.proteomics.ics.uci.edu/).

An immunodominant epitope (IDE) can bind both B-cell and T-cell effectively, and such regions can induce strong immune responses. Hence, epitopes predicted to bind both B-cell and the majority of the MHC alleles were selected as IDEs (Verma et al. 2018).

The protein-protein interaction network for the 11 putative mycobacterial lectins was obtained using the stringApp tool of Cytoscape. A network of 104 nodes and 1095 edges was retrieved and is given in Fig. 1.
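To make the retrieval settings concrete, the confidence cutoff of 0.4 can be thought of as an edge filter applied before the nodes and edges of the network are counted. The sketch below is illustrative only: it does not call stringApp or Cytoscape, the function names are hypothetical, the edge scores are made up, and treating the cutoff as inclusive is an assumption.

```python
CONFIDENCE_CUTOFF = 0.4  # cutoff stated in the text


def filter_edges(edges, cutoff=CONFIDENCE_CUTOFF):
    """Keep undirected edges scoring at or above the cutoff, deduplicated.

    `edges` is an iterable of (node_a, node_b, score) tuples. Self-loops are
    dropped, and when the same pair appears twice the higher score is kept.
    """
    kept = {}
    for a, b, score in edges:
        if a == b or score < cutoff:
            continue
        key = tuple(sorted((a, b)))
        kept[key] = max(score, kept.get(key, 0.0))
    return kept


def network_size(kept_edges):
    """Return (number of nodes, number of edges) of the filtered network."""
    nodes = {node for pair in kept_edges for node in pair}
    return len(nodes), len(kept_edges)


# Made-up scores between locus tags named in the text, not data from the study.
toy_edges = [
    ("Rv1917c", "Rv2075c", 0.55),
    ("Rv2075c", "Rv1917c", 0.62),   # same pair listed in reverse order
    ("Rv1917c", "Rv0355", 0.39),    # below the cutoff, dropped
    ("Rv3343", "Rv3350", 0.40),     # exactly at the cutoff, kept here
]
kept = filter_edges(toy_edges)
n_nodes, n_edges = network_size(kept)
```

Counting nodes only after filtering means that a protein whose every edge falls below the cutoff disappears from the network, which is why the node count depends on the chosen confidence threshold.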
The functional enrichment data for all 104 nodes were retrieved and classified by gene ontology (GO) component, GO process, and protein domain databases such as INTERPRO and PFAM. Figure 2 illustrates the functional enrichment data for all 104 nodes used in this analysis. Classification based on GO component reveals that the majority of the mycobacterial lectin network comprises genes whose products are present in the extracellular region, followed by the cell surface and the cell envelope. Classification based on GO process shows that most of the genes are involved in interspecies interaction. Classification based on protein domains shows that most of the genes belong to the PE/PPE family of proteins.

Clustering analysis was performed on the mycobacterial lectin network and is depicted in Fig. 3. The network was divided into eight clusters, the details of which are given in Table 1. The first cluster has a score of 28.933 and contains 31 nodes and 434 edges, and most of its genes encode uncharacterized proteins (Table S1). The exact roles of these uncharacterized proteins in TB pathogenesis remain unexplored.

Interfering with the interaction between mycobacterial lectins and host-cell surface carbohydrates using carbohydrate ligand mimics (or anti-adhesive drugs) is a recent therapeutic approach that can help to address drug resistance in TB patients. In this vein, Mydock-McGrane et al. (2016) have developed mannosides as FimH antagonists, which aid in the treatment of urinary tract infections in humans. Recently, Kuhaudomlarp et al. (2020) identified certain non-carbohydrate glycomimetics as inhibitors of calcium(II)-binding lectins, namely LecA, for the treatment of biofilm-associated P. aeruginosa infections. Similarly, Šmak et al. (2021), through in silico virtual screening studies, identified potential inhibitors of selectins, which are cell surface lectins, for the treatment of COVID-19.
Fucose-containing glycans were identified as specific ligands for the N-terminal part of the Als1 protein from Candida albicans (Donohue et al. 2011). However, few fucose-containing ligands were found in chemical compound databases. Therefore, in our study, we did not perform molecular docking studies for Rv2082 and Rv1753, which belong to the agglutinin-like sequence lectin family. Quinoxalinone inhibitors have been used against the C-type lectin dendritic cell-specific intercellular adhesion molecule 3-grabbing non-integrin (DC-SIGN) (Mangold et al. 2012). Rv2075c of Mtb is believed to be a C-type lectin. Sulfated glycolipids, such as fucoidan, were found to strongly inhibit filamentous hemagglutinins like laminin and thrombospondin, suggesting an anti-adhesive role (Roberts and Ginsburg 1988). Rv0355, Rv1917c, Rv3343 and Rv3350 of Mtb are characterised as filamentous hemagglutinins. Therefore, mycobacterial lectins could function as potential drug targets, and lead-like molecules that act against lectin-carbohydrate interactions can be identified and used in TB treatment.

The homology model of Rv2075c (Fig. 4a) was obtained using the I-TASSER server. Model 1 was selected as the best predicted model, with a C-score of -3.19, a TM-score of 0.36 ± 0.12 and an RMSD of 15.2 ± 3.4 Å. The model was validated using the Ramachandran plot (Supplementary Fig. S1), which shows 59.0% of residues in most favoured regions and 25.6% of residues in additional allowed regions, i.e., a total of 84.6% of residues in allowed regions. Further, the homology model of Rv2075c has an ERRAT score of 78.7056%, which indicates that the obtained model is a good model (Fig. 4b).

The homology model of Rv1917c (Fig. 4c) was obtained using the I-TASSER server. Model 1 was selected as the best predicted model, with a C-score of 0.55, a TM-score of 0.79 ± 0.09 and an RMSD of 8.5 ± 4.5 Å. The model was validated using the Ramachandran plot (Supplementary Fig.
S2), which shows 75.4% of residues in most favoured regions and 18.2% of residues in additional allowed regions, i.e., a total of 93.6% of residues in allowed regions, which indicates a good-quality model. Further, the homology model of Rv1917c has an ERRAT score of 76.8116%, which indicates that the obtained model is a good model that can be used for further studies. Fifty-four fucoidan derivatives were retrieved from the PubChem database and docked against the modelled Rv1917c. The three compounds with the best docking scores are listed in Table 3. The interaction diagram of the top-scoring ligand is depicted in Fig. 4d.

As lectins are present on the mycobacterial cell surface, they can be easily recognized by host immune cells and can therefore serve as potential vaccine candidates. FimH, an E. coli lectin, has been identified as a potential vaccine candidate for the treatment of urinary tract infection in humans (Zandi et al. 2020). Similarly, Pseudomonas lectins have also been used as potential vaccine candidates for treating cystic fibrosis patients (Day et al. 2019). The potential MHC-I and MHC-II T-cell binding epitopes for the mycobacterial lectins are given in Tables 4 and 5. Peptides with IC50 > 50 nM, a VaxiJen score < 0.4 or > 80% identity to the human proteome were excluded from further analysis. B-cell binding epitopes were predicted for all 11 putative mycobacterial lectins and are tabulated in Supplementary Table S2.

Further, we sought to identify immunodominant regions of the mycobacterial lectins that contain both B-cell and T-cell binding epitopes. Through this analysis, we found that Rv1917c has an IDE, 1062 SIPAIPLSVEV 1072, which is predicted to bind both B-cell and T-cell effectively. Moreover, previous studies suggest that Rv1917c can interact with human Toll-like receptor 2 (TLR2) and elicit functional maturation of human dendritic cells (DCs), which leads to secretion of interleukins from CD4+ T cells and induction of a Th2 immune response (Bansal et al.
2010). Further, we modelled the IDE and performed a molecular docking study with the TLR2 receptor using the ClusPro protein-protein docking server. The docked IDE with the TLR2 receptor, along with its interactions obtained using the LigPlot software available at the PDBsum server, is depicted in Fig. 5. In vitro and in vivo studies need to be performed to prove the efficacy of the predicted peptide as a potential vaccine candidate.

In this study, a gene interaction network of the 11 potential mycobacterial lectins was retrieved, their functional partners were identified and a gene ontology analysis was performed. Out of 520 functional interactions, about 30 genes were found to be responsible for cellular activity. Mycobacterial lectins can be used as potential drug targets to help address antimycobacterial drug resistance. Further, molecular docking studies were performed for Rv2075c and Rv1917c with quinoxalinone and fucoidan derivatives, respectively. Good binding affinity was observed for quinoxalinone derivatives against Rv2075c and for fucoidan-based derivatives against Rv1917c. Mycobacterial lectins can also be used as potential vaccine candidates. Therefore, potential B-cell and T-cell binding epitopes were predicted for all 11 potential mycobacterial lectins. Further, an epitope of Rv1917c, 1062 SIPAIPLSVEV 1072, which is predicted to bind B-cell and the majority of the MHC alleles, was identified in our study. In conclusion, we have theoretically explored, through in silico approaches, that mycobacterial lectins can be used as potential drug targets and vaccine candidates for TB treatment.

The online version contains supplementary material available at https://doi.org/10.1007/s42485-021-00065-y.
Identification of mycobacterial lectins from genomic data
Influence of HLA-DR on the phenotype of CD4+ T lymphocytes specific for an epitope of the 16-kDa α-crystallin antigen of Mycobacterium tuberculosis
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
Src homology 3-interacting domain of Rv1917c of Mycobacterium tuberculosis induces selective maturation of human dendritic cells by regulating PI3K-MAPK-NF-κB signaling and drives Th2 immune responses
Positive selection identifies an in vivo role for FimH during urinary tract infection in addition to mannose binding
Potential T cell epitopes of Mycobacterium tuberculosis that can instigate molecular mimicry against host: implications in autoimmune pathogenesis
Verification of protein structures: patterns of nonbonded atomic interactions
Lectin activity of Pseudomonas aeruginosa vaccine candidates PSE17-1, PSE41-5 and PSE54
The N-terminal part of Als1 protein from Candida albicans specifically binds fucose-containing glycans
VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines
PubChem substance and compound databases
Non-carbohydrate glycomimetics as inhibitors of calcium(II)-binding lectins
Purification and characterization of an extracellular lectin from Mycobacterium smegmatis
Role of P-fimbrial-mediated adherence in pyelonephritis and persistence of uropathogenic Escherichia coli (UPEC) in the mammalian kidney
High-throughput prediction of protein antigenicity using protein microarray data
Quinoxalinone inhibitors of the lectin DC-SIGN
Mannose-derived FimH antagonists: a promising anti-virulence therapeutic strategy for urinary tract infections and Crohn's disease
Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method
Sulfated glycolipids and cell adhesion
Prediction methods for B-cell epitopes
Scanning the genome of Mycobacterium tuberculosis to identify potential lectins
Pan-selectin inhibitors as potential therapeutics for COVID-19 treatment: in silico screening study
AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading
Multi-epitope DnaK peptide vaccine against S. Typhi: an in silico approach
Construction and development of FimH lectin domain for rising immune response after injection by uropathogenic E. coli
(2008) I-TASSER server for protein 3D structure prediction
SORTALLER: predicting allergens using substantially optimized algorithm on allergen family featured peptides
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations