key: cord-0865288-4slkmu42
authors: Siqueira, Andrei Santos; Lima, Alex Ranieri Jerônimo; Aguiar, Delia Cristina Figueira; Santos, Alberdan Silva; Vianez Júnior, João Lídio da Silva Gonçalves; Gonçalves, Evonnildo Costa
title: Genomic screening of new putative antiviral lectins from Amazonian cyanobacteria based on a bioinformatics approach
date: 2018-09-25
journal: Proteins
DOI: 10.1002/prot.25577
sha: b62cb8b15ead1c29a8a4a3b39605dd0a5bbe2f33
doc_id: 865288
cord_uid: 4slkmu42

Lectins are proteins of nonimmune origin that are capable of recognizing and binding to glycoconjugate moieties. Some of them can block the interaction of viral glycoproteins with host cell receptors, acting as antiviral agents. Although cyanobacterial lectins have shown broad biotechnological potential, little research has been directed to Amazonian cyanobacterial diversity. In order to identify new antiviral lectins, we performed genomic analysis of seven cyanobacterial strains from the Coleção Amazônica de Cianobactérias e Microalgas (CACIAM). We found 75 unique CDS presenting one or more lectin domains. Since almost all were annotated as hypothetical proteins, we used homology modeling and molecular dynamics simulations to evaluate the structural and functional properties of the three CDS that were most similar to known antiviral lectins. The Nostoc sp. CACIAM 19 and Tolypothrix sp. CACIAM 22 strains presented cyanovirin-N homologues whose function was confirmed by binding free energy calculations. Asn, Glu, Thr, Lys, Leu, and Gly, which were described as binding residues for cyanovirin, were also observed on those structures. As for other known cyanovirins, those residues in both our models also made favorable interactions with dimannose. Finally, Alkalinema sp. CACIAM 70d presented one CDS, which was identified as a seven-bladed beta-propeller structure with binding sites predicted for sialic acid and N-acetylglucosamine.
Despite its singular structure, our analysis suggested this molecule as a new putative antiviral lectin. Overall, the identification and characterization of new lectins and their homologues is a promising area in antiviral research, and Amazonian cyanobacteria present biotechnological potential to be explored in this regard.

The phylum Cyanobacteria has been shown to be a promising source of antiviral lectins, mainly because of their anti-HIV activity. Currently, three lectins have gained prominence: (i) cyanovirin-N, isolated from Nostoc ellipsosporum, which may inactivate HIV strains even at low nanomolar concentrations 8,9; (ii) microvirin, isolated from Microcystis aeruginosa, which shares 33% identity with cyanovirin but is 50 times less cytotoxic 10; and (iii) scytovirin, isolated from Scytonema varium, which acts against Zaire Ebola virus, coronavirus, and Cryptococcus fungi in addition to having anti-HIV activity. 4, 11, 12 In this sense, screening for new cyanobacterial lectins, as well as their structural improvement, is a reasonable strategy in the search for new microbicide candidates. 13-15

Despite the great biological diversity and ecological importance of the region, the first Amazonian cyanobacterial genome was published only in 2014. 16 Since then, other genomic studies have been performed with the aim of investigating the genetic potential and diversity of cyanobacteria from this region. 17-19 However, few studies have addressed biotechnological applications for these cyanobacteria. Genomic screening has been used successfully to assess the genetic potential of an individual organism or a group of organisms. 20-22 Thus, the aim of this study was to investigate, by genomic analysis, homology modeling, and molecular dynamics simulations, the presence of potential antiviral lectins in the genomes of cyanobacteria isolated from Amazonian environments.
All of the strains analyzed in this study belong to the CACIAM collection.

In order to identify putative lectin sequences in the seven CACIAM strain genomes, a conserved domain search approach was applied. First, amino acid sequences from NCBI that were annotated as lectins and have solved structures were downloaded. Next, the CD-HIT standalone version 23 was used to cluster sequences with >90% identity, making the data nonredundant. The final dataset was submitted to the Conserved Domain Database (CDD) webserver 24 for domain identification. The domains obtained were stored to be used as a positive indication of a lectin in the next step. All coding sequences (CDS) annotated in the seven CACIAM strain genomes were extracted and translated with Geneious R9, 25 using the bacterial genetic code. For each genome, the translated CDS were submitted to CDD. A custom Perl script was applied to parse the CDD results, retrieving only sequences that had the same domains identified in the lectin sequences.

To perform the homology modeling, two strategies were used. Sequences having enough identity with templates in the Protein Data Bank 26 (PDB) were modeled with the Modeller 9.16 27 software. PROMALS3D 28 performed the sequence alignment of the template and the target. A total of 100 models were generated based on the target-template alignment, considering different conformations, and ranked by molecular probability density function (Molpdf) and DOPE score. Automatic loop refinement was used after model building, and the models were generated satisfying spatial restraints such as bond lengths, bond angles, dihedral angles, and interactions between nonbonded atoms, and then subjected to validation. Sequences that had no identity with PDB structures were modeled with the I-TASSER server 29, and the best model was selected according to C-score and alignment quality with the templates. All homology models generated here were submitted to a molecular dynamics (MD) refinement simulation of 100 ns.
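The domain-based filtering step described above (done by the authors with a custom Perl script that is not shown) can be illustrated with a minimal Python sketch. It assumes a simplified, hypothetical two-column tab-separated table (`seq_id<TAB>domain_accession`) rather than the actual CDD output format, uses placeholder domain names such as `DOM_A`, and assumes that a CDS is retained when it shares at least one domain with the reference lectins.

```python
from collections import defaultdict


def parse_domain_hits(lines):
    """Map each sequence ID to the set of domain accessions it hits.

    Assumption: a simplified two-column, tab-separated table
    (seq_id<TAB>domain_accession); real CDD output has more columns.
    """
    hits = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        seq_id, domain = line.split("\t")[:2]
        hits[seq_id].add(domain)
    return dict(hits)


def filter_putative_lectins(cds_hits, lectin_domains):
    """Keep CDS that share at least one domain with the reference lectins."""
    return {
        seq_id: sorted(domains & lectin_domains)
        for seq_id, domains in cds_hits.items()
        if domains & lectin_domains
    }


# Toy, made-up inputs (placeholder IDs and domain names, not real data).
reference_table = [
    "# seq_id\tdomain",
    "ref_lectin_1\tDOM_A",
    "ref_lectin_2\tDOM_B",
    "ref_lectin_2\tDOM_A",
]
cds_table = [
    "cds_001\tDOM_A",
    "cds_001\tDOM_X",
    "cds_002\tDOM_X",
    "cds_003\tDOM_B",
]

# Domains seen in known lectins act as the positive indication.
reference_hits = parse_domain_hits(reference_table)
lectin_domains = set().union(*reference_hits.values())

putative = filter_putative_lectins(parse_domain_hits(cds_table), lectin_domains)
for seq_id, shared in putative.items():
    print(seq_id, ",".join(shared))
```

Using sets keeps the comparison independent of hit order and of repeated hits for the same domain within one sequence.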
After that, the proteins with known ligands were complexed with them for a new MD simulation of 210 ns and binding free energy calculations. To perform these MD simulations, the PDB2PQR server (http://nbcr-222.ucsd.edu/pdb2pqr_2.0.0/) was used to determine the protonation state of the protein at pH 7.0. All preparation and production steps of the MD simulations were carried out with the AMBER 16 software package 33. The force fields applied were GLYCAM_06j 34 and FF14SB 35 for the ligand and the protein, respectively. Na+ or Cl− counterions were added to neutralize the charges, and the systems were solvated with TIP3P 36 water molecules in an octagonal box extending 10 Å from the protein in each direction. Energy minimization was performed in five steps, four of these using 3000 cycles of steepest descent and 5000 cycles of conjugate gradients each; the heavy atoms were restrained by a harmonic potential of 1000 kcal/mol·Å². In the last step, we used [...]

[...] classified this sequence as an integrin-like fungal protein that adopts a seven-bladed beta-propeller structure and interacts with monosaccharides and calcium.

A 100-ns MD simulation was performed to refine the models. After that, they were validated, and the results are presented in Table 2. Conformational changes observed in these simulations were fundamental for improving the validation tests of the models, which were constructed based on crystallographic data. The Alkalinema sp. CACIAM 70d lectin showed the highest RMSD values, but its Ramachandran evaluation went up from 74% to 96.5% after the 100 ns simulation (Figure 3(A)). The CDD identification, the validated model, and the binding site prediction suggest that this ORF (OUC12179.1), annotated as a hypothetical protein in the Alkalinema sp. CACIAM 70d genome, is a new putative antiviral lectin, the first described for this new genus. 17

The structural coordinates of crystallographic dimannose (MAN-MAN) were obtained from cyanovirin 2RDK deposited in the PDB 47 and were employed as a template to complex this ligand with the cyanovirin models and with the respective templates used in the homology modeling step. MD simulations of 210 ns were produced for these four systems, and the structural stability of the models is presented in Figure 3(B) and (C). All complexes remained stable during the simulation, and the MAN-MAN ligand showed RMSD values of less than 1 Å in the four cases. The final state of the MD simulation for each system is presented in Figure 4.

The binding site affinity and the individual contribution of residues were estimated through binding free energy calculations. The last 5000 MD frames were used for calculating the binding free energy by the MM-GBSA, MM-PBSA, and SIE methods. The results are presented in Table 3. According to these results, the Tolypothrix sp. CACIAM 22 cyanovirin and its template, the Cyanothece sp. PCC 7424 cyanovirin (5K79), presented higher affinity for the MAN-MAN ligand. The individual contribution of residues was evaluated by MM-GBSA energy decomposition, and the results are presented in Figure 5. Asn, Glu, Thr, Lys, Leu, and Gly, which were described as binding residues for cyanovirin, were also observed on those structures. As for other known cyanovirins, those residues in both of our models also made favorable interactions with dimannose. Despite the structural differences among the four cyanovirins evaluated here, it was possible to observe the structural conservation of a group of residues. The aspartate, asparagine, and threonine triad appeared in three of the four systems, and the Nostoc sp. CACIAM 19 cyanovirin system, which had not conserved the aspartate residue, showed the worst results in the binding free energy calculations (Figure 4 and Table 3).
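As a rough illustration of how an end-state binding free energy of this kind is usually summarized, the sketch below averages the per-frame difference ΔG = G(complex) − G(receptor) − G(ligand) over a set of frames and reports a naive standard error. It is not the authors' workflow: the function name and the three-frame input are hypothetical, the numbers are made up for demonstration only, and the per-frame energies are assumed to come from an external MM-GBSA or MM-PBSA calculation.

```python
from math import sqrt
from statistics import mean, stdev


def binding_free_energy(g_complex, g_receptor, g_ligand):
    """Average per-frame binding free energy and its naive standard error.

    Each argument is a list of per-frame free energies (kcal/mol) for the
    same frames, assumed to be computed elsewhere (e.g., by MM-GBSA).
    """
    if not (len(g_complex) == len(g_receptor) == len(g_ligand)):
        raise ValueError("all three series must cover the same frames")
    if not g_complex:
        raise ValueError("at least one frame is required")

    # Per-frame difference: complex minus the separated partners.
    per_frame = [c - r - l for c, r, l in zip(g_complex, g_receptor, g_ligand)]

    avg = mean(per_frame)
    # The standard error treats frames as independent; consecutive MD frames
    # are correlated, so this value tends to underestimate the uncertainty.
    sem = stdev(per_frame) / sqrt(len(per_frame)) if len(per_frame) > 1 else 0.0
    return avg, sem


# Made-up energies for three frames (illustrative values, not results).
toy_complex = [-110.0, -112.0, -111.0]
toy_receptor = [-80.0, -81.0, -80.5]
toy_ligand = [-10.0, -10.0, -10.0]

avg, sem = binding_free_energy(toy_complex, toy_receptor, toy_ligand)
print(f"dG_bind = {avg:.2f} +/- {sem:.2f} kcal/mol")
```

More negative averages indicate more favorable binding, which is the sense in which the systems above are compared.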
Besides that, this triad seems to be fundamental to the complexation of dimannose with microvirin, another cyanobacterial lectin. 48 Arginine residues were also conserved at the binding site and [...] 49-52 In this sense, the search for new forms of cyanovirin obtained from other cyanobacteria could reveal new applications for this protein, including the reduction of adverse reactions caused by some protein variants. In fact, microvirin, which shares approximately 33% identity with cyanovirin, is 50-fold less cytotoxic while maintaining antiviral activity. 10 Cyanovirin is more active than microvirin due to its bivalent interactions with the viral envelope 53, so new cyanovirin forms may combine potent inhibition with lower cytotoxicity than current variants. Additionally, the detailed study of residue conservation and binding interactions helps to select the best candidates for antiviral applications.

1. Lectins as cell recognition molecules
2. Structural studies of algal lectins with anti-HIV activity
3. Algal lectins as potential HIV microbicide candidates
4. The cyanobacterial lectin scytovirin displays potent in vitro and in vivo activity against Zaire Ebola virus
5. Cyanovirin-N inhibits hepatitis C virus entry by binding to envelope protein glycans
6. Potent anti-influenza activity of Cyanovirin-N and interactions with viral hemagglutinin
7. Multiple antiviral activities of cyanovirin-N: blocking of human immunodeficiency virus type 1 gp120 interaction with CD4 and coreceptor and inhibition of diverse enveloped viruses
8. Discovery of cyanovirin-N, a novel human immunodeficiency virus-inactivating protein that binds viral surface envelope glycoprotein gp120: potential applications to microbicide development
9. Cyanovirin-N, a potent human immunodeficiency virus-inactivating protein, blocks both CD4-dependent and CD4-independent binding of soluble gp120 (sgp120) to target cells, inhibits sCD4-induced binding of sgp120 to cell-associated CXCR4, and dissociates bound sgp120 from target cells
10. Microvirin, a novel alpha(1,2)-mannose-specific lectin isolated from Microcystis aeruginosa, has anti-HIV-1 activity comparable with that of cyanovirin-N but a much higher safety profile
11. A potent novel anti-HIV protein from the cultured cyanobacterium Scytonema varium
12. Novel antifungal activity for the lectin Scytovirin: inhibition of Cryptococcus neoformans and Cryptococcus gattii
13. Scytovirin engineering improves carbohydrate affinity and HIV-1 entry inhibition
14. Cyanobacterial lectins characteristics and their role as antiviral agents
15. New cell surface lectins with complex carbohydrate specificity from cyanobacteria
16. Draft genome sequence of the Brazilian Cyanobium sp. strain CACIAM 14
17. Draft genome sequence of Alkalinema sp. strain CACIAM 70d, a cyanobacterium isolated from an Amazonian freshwater environment
18. Draft genome sequence of Microcystis aeruginosa CACIAM 03, a cyanobacterium isolated from an Amazonian freshwater environment
19. Characterization of nitrogen-fixing cyanobacteria in the Brazilian Amazon floodplain
20. C-type lectin-like domains in Caenorhabditis elegans: predictions from the complete genome sequence
21. Archeal lectins: an identification through a genomic search
22. Identification of mycobacterial lectins from genomic data
23. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences
24. CDD: NCBI's conserved domain database
25. Geneious basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data
26. The RCSB protein data bank: integrative view of protein, gene and 3D structural information
27. Comparative protein structure modeling using MODELLER
28. PROMALS3D web server for accurate multiple protein sequence and structure alignments
29. The I-TASSER suite: protein structure and function prediction
30. MolProbity: all-atom contacts and structure validation for proteins and nucleic acids
31. Assessment of protein models with three-dimensional profiles
32. QMEAN: a comprehensive scoring function for model quality assessment
33. The Amber Molecular Dynamics Package
34. Extension of the GLYCAM06 biomolecular force field to lipids, lipid bilayers and glycolipids
35. ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB
36. A modified TIP3P water potential for simulation with Ewald summation
37. The MM/PBSA and MM/GBSA methods to estimate ligand-binding affinities
38. Assessment of solvated interaction energy function for ranking antibody-antigen binding affinities
39. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
40. SignalP 4.0: discriminating signal peptides from transmembrane regions
41. Structure and glycan binding of a new Cyanovirin-N homolog
42. Solution structure of a circular-permuted variant of the potent HIV-inactivating protein cyanovirin-N: structural basis for protein stability and oligosaccharide interaction
43. Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment
44. Catabolism of N-glycoproteins in mammalian cells: molecular mechanisms and genetic disorders related to the processes
45. Direct and label-free influenza virus detection based on multisite binding to sialic acid receptors
46. Molecular insight into dengue virus pathogenesis and its implications for disease control
47. Conformational gating of dimannose binding to the antiviral protein cyanovirin revealed from the crystal structure at 1.35 Å resolution
48. Investigating the effects of point mutations on the affinity between the cyanobacterial lectin microvirin and high mannose-type glycans present on the HIV envelope glycoprotein
49. Functional homologs of cyanovirin-N amenable to mass production in prokaryotic and eukaryotic hosts
50. Optimization of the expression of the HIV fusion inhibitor cyanovirin-N from the tobacco plastid genome
51. Engineering soya bean seeds as a scalable platform to produce cyanovirin-N, a non-ARV microbicide against HIV
52. Transformation of Althaea officinalis L. by Agrobacterium rhizogenes for the production of transgenic roots expressing the anti-HIV microbicide cyanovirin-N
53. Solution structure of the monovalent lectin Microvirin in complex with Manα(1-2)Man provides a basis for anti-HIV activity with low toxicity

The authors declare that there is no conflict of interest.

Andrei Santos Siqueira https://orcid.org/0000-0002-2397-7119