key: cord-0980447-6zg31xw5
authors: Gomes, João Pedro Agra; Rocha, Larissa de Oliveira; Leal, Cíntia Emi Yanaguibashi; Filho, Edilson Beserra de Alencar
title: Virtual screening of molecular databases for potential inhibitors of the NSP16/NSP10 methyltransferase from SARS-CoV-2
date: 2022-03-28
journal: J Mol Struct
DOI: 10.1016/j.molstruc.2022.132951
sha: 491b5bf04f774d9e95ac8bf5c02bed3c15dade4e
doc_id: 980447
cord_uid: 6zg31xw5
COVID-19 is a disease caused by the SARS-CoV-2 virus and represents one of the greatest health problems that humanity currently faces. Efforts have therefore been made to find therapies that could be effective in combating this problem. In the search for ligands, computational chemistry plays an essential role, since it allows thousands of molecules to be screened against a given target, saving time and money at the in vitro or in vivo pharmacological stage. In this paper, we perform a docking-based virtual screening for potential inhibitors of the NSP16-NSP10 protein dimer from SARS-CoV-2. We evaluate an in-house database of molecules found in plants from the Brazilian Caatinga biome, compounds from the ZINC online molecular databank, and structural analogues, retrieved from the PubChem database, of the enzyme cofactor S-adenosylmethionine (SAM) and of sinefungin (SFG), an inhibitor known in the literature. All databases provided molecules that deserve attention, with four ZINC compounds standing out as the most promising. These results contribute to the discovery of new molecular hits in the search for potential agents against SARS-CoV-2, and point to a pathway that could be exploited in combination therapies.
An outbreak of a new Severe Acute Respiratory Syndrome hit the city of Wuhan, China, in late 2019.
Caused by SARS-CoV-2, COVID-19 has reached global proportions. It differs from respiratory epidemics caused by other coronaviruses, such as SARS-CoV (Severe Acute Respiratory Syndrome) and MERS-CoV (Middle East Respiratory Syndrome), in its high transmission potential, which has so far resulted in the largest pandemic of the 21st century, with more than 4.4 million deaths [1]-[3]. SARS-CoV-2 is a positive-sense, single-stranded RNA virus of the family Coronaviridae [4]. Seven coronaviruses are pathogenic to humans: 229E and NL63 (alphacoronaviruses), and OC43, HKU1, SARS-CoV, MERS-CoV and SARS-CoV-2 (betacoronaviruses) [5]-[7]. SARS-CoV-2 contains 4 major structural proteins (the nucleocapsid protein, the transmembrane protein, the envelope protein, and the spike protein), 16 nonstructural proteins (NSPs), and 9 open reading frame (ORF) sequences [8]. Among the nonstructural proteins, NSP16 is one of the most studied. NSP16 is a methyltransferase dependent on the cofactor S-adenosylmethionine (SAM), and it shows enhanced activity and greater stability when associated with nonstructural protein 10 (NSP10). NSP16 consists of a seven-stranded β-sheet surrounded by α-helices and loops [9], whereas NSP10 can be broadly divided into three regions: a helical domain at the N terminus, followed by an irregular β-sheet region, and a C-terminal loop region. NSP10 also has a positively charged, hydrophobic surface that interacts with a hydrophobic pocket and a negatively charged surface of NSP16, which helps stabilize the SAM binding site [10]. The two proteins interact through salt bridges, hydrophobic interactions, and an extensive network of hydrogen bonds, resulting in a stable complex [9]. The NSP16-NSP10 dimer is responsible for one step in the capping of messenger RNA (mRNA) [11], a mechanism by which the virus evades the innate immune response.
Capping involves the addition of three phosphate groups, a guanosine and a methyl group (the methylation being carried out by the NSP16-NSP10 dimer) to the 5' end of the RNA. This modification prevents RNases from acting on the mRNA strand [11], [12]. Thus, interfering with the activity of this dimer using molecules that have affinity for the SAM binding site can inhibit the methylation process, thereby making the virus susceptible to the body's immune system. Although SAM is important for various processes in the human body, the dimer is still considered an interesting target by several research groups [7], [13]-[15] because of its virus-fighting potential. The strategy still needs refinement and testing, but the selection of molecules by theoretical methods represents a first and important step in drug development. Several studies have already been conducted to understand the activity of the 29 viral proteins [16]. Understanding the structures, functions and interactions of these viral proteins enables the development of targeted strategies [7]. Computational chemistry has a key role in this process, since it offers tools capable of evaluating the pharmacological potential of thousands of molecules through in silico strategies, using the structures of several protein targets of SARS-CoV [17]. Molecular modeling methods aim to build representative models of complex molecular systems using computational mathematical models [18]. They save money and time relative to experimental studies, since future investigations can be directed only at ligands that have a greater chance of success in in vitro and in vivo tests. Molecular docking is a computational strategy that determines the best conformational pose of a ligand in a region of a target macromolecule, through algorithms and graphical interfaces implemented in computer programs.
Screening molecules against a molecular target of interest is one of the major applications of this type of study [19]. Thus, given the structure of the NSP10-NSP16 protein dimer and the knowledge that its function depends on a cofactor (SAM), a virtual screening can be performed to find potential inhibitors among the structures available in databases. This approach allows graphical visualization of the various ligand-target complexes, identification of interactions, and an understanding of the structure-affinity relationship [9], [11], [19], [20].
One of the best-characterized drug targets among coronaviruses is the main protease (Mpro, also called 3CLpro), because this enzyme is essential for processing the polyproteins that are translated from the viral RNA [5]. However, single-drug treatments for viral infections have not been successful, as exemplified by human immunodeficiency virus (HIV) and hepatitis C virus (HCV) infections. Combination therapy has led to improved clinical outcomes, since it can enhance therapeutic efficacy through additive, and ideally synergistic, effects [21].
Natural products are a promising source for the discovery of new drugs. Besides their great structural variability, they generally have good efficacy with tolerable toxicity levels [22]. The Caatinga is an exclusively Brazilian phytogeographic domain, located in the Northeast region of the country. It has great diversity, with 744 endemic species, and is thus an important source of new drug candidate molecules [23]. Ethnobotanical and ethnopharmacological studies reflect this diversity: reports in the literature range from use against gastrointestinal problems such as diarrhea, to respiratory problems such as asthma, to antiproliferative activity [24]-[26].
Moreover, in silico studies were recently conducted using a database containing only natural products from the semi-arid region of Bahia (NatProDB), and revealed promising scaffolds for inhibition of the SARS-CoV-2 Mpro (secotrachylobanoic acid, hexahydronaphthalen-1-yl, indolinedione, and benzopyran-4-one) [27]. This study therefore aims to search for new ligands that can be potential inhibitors of the NSP16-NSP10 protein dimer, using virtual screening by docking to evaluate an in-house database of molecules from Caatinga plants, compounds from the ZINC database, and structural analogs based on the scaffold of the enzyme cofactor (SAM). We hope to contribute to studies proposing potential chemotherapeutic alternatives for future testing and treatment of COVID-19.
Initially, a search for the 3D structure of the NSP10-NSP16 protein dimer of SARS-CoV-2 was performed on the Protein Data Bank platform, yielding two crystallographic structures of the same dimer: 6W75 (resolution 1.95 Å), bound to its natural ligand, S-adenosylmethionine (SAM); and 6WKQ (resolution 1.98 Å), bound to sinefungin (SFG), an inhibitor already known in the literature [15, 20]. The co-crystallized ligands were extracted from the original structures and prepared for redocking with AutoDock Tools, after their conformations and coordinates had been randomized to avoid biased results. The proteins were then prepared for docking using Chimera 1.15rc [21] and the APBS web platform for Poisson-Boltzmann electrostatics calculations, with the pH set to 7.4 [22]. The grid box parameters were determined with AutoDock Tools. Redocking calculations were then performed for both structures using AutoDock Vina [23]. RMSD calculations were performed on Zhang Lab's DockRMSD web platform [24].
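The redocking check above compares the redocked pose with the crystallographic pose through their RMSD. The following is a minimal sketch of that quantity, not the DockRMSD implementation: it assumes both poses list the same atoms in the same order and in the same coordinate frame, whereas DockRMSD additionally searches for the best atom mapping to account for ligand symmetry. The function name `rmsd` and the coordinates are hypothetical and serve only as illustration.

```python
import math


def rmsd(reference, pose):
    """Root-mean-square deviation between two equally ordered coordinate lists.

    Each argument is a list of (x, y, z) tuples in angstroms. No superposition
    and no symmetry correction are applied (an assumption of this sketch).
    """
    if len(reference) != len(pose) or not reference:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(reference, pose):
        squared += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(squared / len(reference))


# Hypothetical three-atom example, not taken from 6W75 or 6WKQ.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
redocked = [(0.1, 0.0, 0.0), (1.5, 0.1, 0.0), (1.5, 1.5, 0.1)]
print(round(rmsd(crystal, redocked), 3))
```

In this toy case every atom is displaced by 0.1 Å, so the RMSD is also 0.1 Å. For real ligands, heavy-atom coordinates would be read from the crystal and docked structure files.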
To validate the method, a ROC (receiver operating characteristic) curve was constructed using an online platform (http://stats.drugdesign.fr/) [28]. Since this is still a recent topic, few ligands are known to bind at the site of interest on the NSP16-NSP10 dimer. Therefore, molecules tested in vitro against the SAM binding site of NSP14 from SARS-CoV-2 were also considered. The library of decoys was generated using DUD-E [29], and a total of 12 true positive ligands were used [30]-[33]. Fifty decoys were generated for each active ligand, and 10 decoys per active ligand were randomly chosen. ROC curve calculations were then performed using 132 molecules (12 actives and 120 decoys).
An in-house Caatinga databank of 248 molecules had been generated previously by searching the online platform SciFinder® for secondary metabolites isolated from plants of this phytogeographic domain. The search was based on the scientific names of 40 species cataloged in the book "Caatinga" [34], filtering for articles matching the keywords "LC-MS", "GC-MS", "secondary metabolites" and "chemical composition", so that only studies with identified and/or isolated molecules were selected. The compounds were drawn with ChemSketch and their geometries were optimized with Hyperchem at the RM1 semiempirical level. After geometry optimization, the molecules were converted to '.pdbqt' with OpenBabel GUI for docking with AutoDock Vina [35], [36].
Using the ZINC database, a search was performed for molecules whose molecular weight and LogP were similar to those of the endogenous coenzyme of the NSP10-NSP16 protein dimer, S-adenosylmethionine (399.4 g/mol and -5.3, respectively), yielding 28,926 molecules. The compounds were downloaded as '.sdf' and converted to '.pdbqt' with OpenBabel GUI [36]. Docking was performed with AutoDock Vina using the same parameters as before [35].
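The ROC validation described earlier in this part of the methods can be sketched as follows. This is a hedged illustration and not the Screening Explorer computation: it uses the standard rank-based definition of the area under the ROC curve (the probability that a randomly chosen active receives a better docking energy than a randomly chosen decoy). The energies below are hypothetical. Only the set sizes (12 actives, 10 decoys per active) come from the text.

```python
def roc_auc(active_scores, decoy_scores):
    """Rank-based ROC AUC for docking energies.

    A lower (more negative) energy is treated as a better score, so an
    active "wins" against a decoy when its energy is lower; ties count half.
    """
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Size of the validation set described in the text.
n_actives = 12
decoys_per_active = 10
n_total = n_actives + n_actives * decoys_per_active
print(n_total)  # 132

# Hypothetical energies in kcal/mol, for illustration only.
actives = [-8.9, -8.8, -7.5]
decoys = [-9.0, -7.0, -6.5, -6.0]
print(roc_auc(actives, decoys))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of actives from decoys. Early-recognition metrics such as BEDROC weight the top of the ranked list more heavily and are not covered by this sketch.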
A search for compounds analogous to S-adenosylmethionine was performed in the PubChem database, using Lipinski's rule of five as a filter [37], and returned 711 molecules. A similar search for analogs of sinefungin returned 358 molecules. The molecules of both groups were then downloaded as an '.sdf' file. Because these structures were in 2D format, they were converted to three-dimensional form using ChemSketch. After this adjustment, the molecules were converted to '.pdbqt' with Open Babel GUI 2.3 for docking [36]. Docking calculations were performed with AutoDock Vina [35].
Intermolecular interactions between ligand and macromolecule were visualized with Discovery Studio, which creates a 2D diagram of the possible interactions between the ligand and the amino acids in the binding site. The software uses a different distance range from the ligand for each type of interaction, as shown in Table 1.
In this study, we performed a virtual screening using AutoDock Vina to filter thousands of molecules and find those that bind to NSP10-NSP16. The area under the curve (AUC) is a widely used metric for analyzing the overall performance of a virtual screening, because it summarizes in a single number whether the scoring methodology can distinguish active from inactive compounds [38], [39]. In our study, AUC = 0.886, indicating that AutoDock Vina ranked the compounds satisfactorily; that is, experimentally active molecules with good IC50 values (for example, IC50 = 0.3 ± 0.05 and 0.008 ± 0.0006) [31] had good energies (-8.9 and -8.8 kcal/mol, respectively). The global metrics of this curve also allow us to infer that the early recognition of active compounds was very satisfactory (BEDROC = 0.576). As a second part of the docking validation, redocking analyses were performed with S-adenosylmethionine (SAM) and sinefungin (SFG).
The energy and RMSD values indicate that the algorithm is suitable for studying this system (Table 2).
known as Catingueira [34], [40]. Table 3 lists the docking results. It is noteworthy that some of the interactions of molecules 15 and 98 with the NSP10-NSP16 dimer involve the same amino acids that interact with the crystallographic coenzyme (Figure 2C, Figure 5A and Figure 5B), an indication that both are promising for further experimental study. The same is seen in comparison with the known antagonist, SFG (Figure 2C, Figure 5A, and Figure 5B). It is also noticeable that there was no sudden variation in the conformations of the structures, which shows a pattern in the way the molecules stabilize in the site (Figure 7).
For the ZINC database, since the number of molecules was too large, docking was first performed only on the A subunit of the NSP10-NSP16 dimer crystal structure 6W75. After screening the 28,926 molecules, only those with an energy better than or equal to -10.0 kcal/mol were analyzed at the C-subunit site of 6W75, as well as at both sites of 6WKQ. The results are shown in Table 4 and the top 5 molecules are shown in Figure 8. It is noteworthy that the arrangement of the top 3 molecules in the binding site was comparable to that of the coenzyme (Figure 9), with the possibility of hydrogen bond formation at the same amino acids (Figure 2C, Figure 10A-D), an indication that they are promising candidates for further pharmacological testing. The same is seen in comparison with the known antagonist, SFG (Figure 3C, Figure 10A-D). Despite the good interaction profile, most of the best molecules have more than 10 hydrogen bond acceptors in their structure, which violates Lipinski's rule of five [37]. However, bioisosteric strategies can be used to exploit the molecular scaffolds found, by modifying them through substitutions that do not interfere with the protein affinity profile.
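The two-stage ZINC selection described above (an energy cutoff of -10.0 kcal/mol, followed by a check on the number of hydrogen bond acceptors) can be sketched as below. This is an illustration under stated assumptions rather than the authors' script: the `Hit` record, the function names, and the three example molecules are hypothetical. Only the cutoff and the limit of 10 acceptors are taken from the text.

```python
from dataclasses import dataclass

ENERGY_CUTOFF = -10.0  # kcal/mol, cutoff quoted in the text
MAX_HBA = 10           # Lipinski's limit on hydrogen bond acceptors


@dataclass
class Hit:
    name: str          # hypothetical identifier
    energy: float      # docking energy in kcal/mol
    hb_acceptors: int  # number of hydrogen bond acceptors


def select_hits(hits, cutoff=ENERGY_CUTOFF):
    """Keep hits whose energy is better than or equal to the cutoff, best first."""
    kept = [h for h in hits if h.energy <= cutoff]
    return sorted(kept, key=lambda h: h.energy)


def violates_hba_rule(hit, max_hba=MAX_HBA):
    """True when the hit has more acceptors than Lipinski's rule allows."""
    return hit.hb_acceptors > max_hba


# Hypothetical screening output, not values from Table 4.
screened = [
    Hit("mol_A", -10.4, 12),
    Hit("mol_B", -9.7, 8),
    Hit("mol_C", -10.0, 9),
]
selected = select_hits(screened)
for hit in selected:
    print(hit.name, hit.energy, violates_hba_rule(hit))
```

Because docking energies are negative, "better than or equal to -10.0 kcal/mol" translates to `energy <= -10.0`. Hits flagged by `violates_hba_rule` are the ones for which the bioisosteric modifications mentioned in the text would be considered.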
Regarding the search for SAM and SFG analogs in PubChem, after screening the 1069 molecules, those with an energy better than or equal to the energy obtained in the redocking (-8.7 kcal/mol) were selected. For SAM, the selected analogs were 20592073 and 45268044, both with a best energy of -8.8 kcal/mol. For the SFG analogs, 25 results were initially selected. All molecules whose isomeric form was not explicit in the database were excluded, leaving 17 results (2 SAM analogs and 15 SFG analogs, illustrated in 2D in Figure 12). Table 5 lists the results obtained after filtering. The analogs highlighted with an asterisk are those whose energies at all sites were equal to or better than -8.7 kcal/mol; molecules 152457420, 52921643, and 132561820 had the best results within this group. Molecules with nitrogen-substituted heterocycles have a wide range of health applications: they have shown anti-cancer [41] and anti-neurodegenerative [42], [43] activities and are commonly used against viruses and bacteria [44]-[47]. The presence of the fluorine atom in molecule 152457420 probably contributed to its better energy compared with the other molecules, which have a similar structure. Some studies show that fluorine substitution is increasingly used to improve the interaction profile of a ligand [48], [49]. The arrangement of the 3 best molecules in the binding site was comparable to that of the native ligand (Figure 13), with hydrogen bonds to the same amino acids (Figure 2C), an indication that they are promising analogs for further pharmacological testing. The same is observed in comparison with the known antagonist, SFG (Figure 3C).
For the best molecules in this study, ADMET calculations were conducted using the SwissADME online platform (http://www.swissadme.ch), and some parameters were analyzed.
Log P is a classic descriptor of the lipophilicity of a substance, and the consensus Log P is the arithmetic mean of the values obtained by 5 methods (XLOGP3, WLOGP, MLOGP, SILICOS-IT, iLOGP). Averaging several predictors is a common practice for increasing the accuracy of the estimate [50]. Two other characteristics that are crucial to understanding the probable pharmacokinetic behavior, the ability to be passively absorbed from the gastrointestinal (GI) tract and the ability to cross the blood-brain barrier (BBB), were also evaluated. PAINS (pan assay interference compounds) refers to the presence of structural alerts (frequent hitters or promiscuous compounds) [51]. Water solubility is an important property, since it facilitates and guides drug development activities, so it was also evaluated. SwissADME uses three predictors for this calculation [50]. Finally, synthetic accessibility is a score that indicates how easily a compound can be obtained by synthesis [52]. As shown in Table 6, the compounds with the best energies are polar, can be synthesized with some ease, have no structural alerts, and do not cross the blood-brain barrier, characteristics that are considered positive for the purpose for which the molecules are being studied. Because of the low level of gastrointestinal absorption, studies to test an adequate drug delivery system should be performed, or molecular modifications should be made that do not interfere significantly with the interaction with the protein. In addition, the compounds have good water solubility according to all three predictors used by the platform. One can therefore infer that, besides having good affinity for the target, the molecules also have some ADMET parameters that favor future studies for drug development.
The screening procedure allowed the identification of molecules analogous to S-adenosylmethionine (the enzyme cofactor) and sinefungin (an inhibitor described in the literature) with high affinity for the NSP16-NSP10 protein dimer of SARS-CoV-2.
Future pharmacological assays are encouraged, starting with the compounds from the ZINC database, which gave the best results: they showed the highest affinity for the macromolecule, are not predicted to cross the blood-brain barrier, and can be easily synthesized. This study therefore contributes to the discovery of potential hits for the control of infectious processes caused by SARS-CoV-2, including their use in combination therapies with other compounds already reported to date.
As the corresponding author, and on behalf of the other authors, I declare that all authors agree to the publication of this article, "Virtual screening of molecular databases for potential inhibitors of the NSP16/NSP10 methyltransferase from SARS-CoV-2". We have no conflict of interest to declare.
Impact of the coronavirus (COVID-19) pandemic on scientific research and implications for clinical academic training -A review Brief History of Pandemics (Pandemics Throughout History) COVID-19 Map -Johns Hopkins Coronavirus Resource Center Coronaviruses and sars-cov-2 Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors Coronavirus Pathogenesis and the Emerging Pathogen Severe Acute Respiratory Syndrome Coronavirus SARS-CoV-2 Nsp16 activation mechanism and a cryptic pocket with pan-coronavirus antiviral potential Neuropsychiatric Symptoms of COVID-19 Explained by SARS-CoV-2 Proteins' Mimicry of Human Protein Interactions Biochemical and Structural Insights into the Mechanisms of SARS Coronavirus RNA Ribose 2′-O-Methylation by nsp16/nsp10 Protein Complex The crystal structure of nsp10-nsp16 heterodimer from SARS-CoV-2 in complex with S-adenosylmethionine A High-Throughput RNA Displacement Assay for Screening SARS-CoV-2 nsp10-nsp16 Complex toward Developing Therapeutics for COVID-19 Conventional and unconventional mechanisms for capping viral mRNA Tackle the COVID-19 Disease Structural screening into the recognition of a potent inhibitor against
nonstructural protein 16 : a molecular simulation to inhibit SARS-CoV-2 infection In silico validation of coumarin derivatives as potential inhibitors against Main Protease , NSP10 / NSP16-Methyltransferase , Phosphatase and Endoribonuclease of SARS CoV-2 Structural and functional properties of SARS-CoV-2 spike protein: potential antivirus drug development for COVID-19 Emerging strategies on in silico drug development against COVID-19: challenges and opportunities Planejamento de fármacos, biotecnologia e química medicinal: aplicações em doenças infecciosas Docking and scoring in virtual screening for drug discovery: Methods and applications Sinefungin, a potent inhibitor of S-adenosylmethionine: Protein O-methyltransferase Drug combination therapy for emerging viral diseases The Traditional Medicine and Modern Medicine from Natural Products Instituto de Pesquisa Jardim Botânico do Rio de Janeiro Antiproliferative Activity, Antioxidant Capacity and Tannin Content in Plants of Semi-Arid Northeastern Brazil Uso e diversidade de plantas medicinais da Caatinga na comunidade rural de Laginhas , município de Caicó , Rio Grande do Norte ( nordeste do Brasil ) Levantamento etnobotânico de plantas medicinais em área de Caatinga na comunidade do Sítio Nazaré , município de Milagres Natural Products-Based Drug Design against SARS-CoV-2 Mpro 3CLpro Screening Explorer -An interactive tool for the analysis of screening results Screening Explorer -An Interactive Tool for the Analysis of Screening Results Directory of Useful Decoys, Enhanced (DUD-E): Better Ligands and Decoys for Better Benchmarking In Vitro Reconstitution of SARS-Coronavirus mRNA Cap Methylation Critical Reviews and Perspectives Coronaviral RNAmethyltransferases : function , structure and inhibition The Structure-Based Design of SARS-CoV -2 nsp14 Methyltransferase Ligands Yields Nanomolar Inhibitors Potent SARS-CoV -2 mRNA Cap Methyltransferase Inhibitors by Bioisosteric Replacement of Methionine in SAM 
Cosubstrate Caatinga : árvores e arbustos e suas utilidades., 2a ed. Fortaleza: Printcolor Gráfica e Editora AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading Open Babel: An Open chemical toolbox Lead-and drug-like compounds: The rule-of-five revolution Structure-Based Virtual Screening : From Classical to Artificial Intelligence Virtual Screening Workflow Development Guided by the ' Receiver Operating Characteristic ' Curve Approach . Application to High-Throughput Docking on Metabotropic Glutamate Receptor Subtype 4 Occurrence of biflavones in leaves of Caesalpinia pyramidalis specimens Design, click synthesis, anticancer screening and docking studies of novel benzothiazole-1,2,3-triazoles appended with some bioactive benzofused heterocycles Drugs Against Neurodegenerative Diseases: Design and Synthesis of 6-Amino-substituted Imidazo[1,2-b]pyridazines as Acetylcholinesterase Inhibitors Imidazopyridazine Acetylcholinesterase Inhibitors Display Molecular docking and simulation studies of 3-(1-chloropiperidin-4-yl)-6-fluoro benzisoxazole 2 against VP26 and VP28 proteins of white spot syndrome virus Recent advances in the synthesis of organic chloramines and their insights into health care Antiviral activity of 3-(1-chloropiperidin-4-yl)-6-fluoro benzisoxazole 2 against White spot syndrome virus in Freshwater crab, Paratelphusa hydrodomous Synthesis of quinoline acetohydrazide-hydrazone derivatives evaluated as DNA gyrase inhibitors and potent antimicrobial agents Applications of Fluorine in Medicinal Chemistry The Many Roles for Fluorine in Medicinal Chemistry SwissADME : a free web tool to evaluate pharmacokinetics , drug-likeness and medicinal chemistry friendliness of small molecules New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays based on molecular complexity and fragment contributions