key: cord-0700684-3x53iela authors: Gupta, Shiv Shankar; Kumar, Ashwani; Shankar, Ravi; Sharma, Upendra title: In Silico Approach for Identifying Natural Lead Molecules Against SARS-COV-2 date: 2021-04-13 journal: J Mol Graph Model DOI: 10.1016/j.jmgm.2021.107916 sha: bbf6a435f0db68c92c60eba11ea70ac47d9cdbb7 doc_id: 700684 cord_uid: 3x53iela The life challenging COVID-19 disease caused by the SARS-CoV-2 virus has greatly impacted smooth survival worldwide since its discovery in December 2019. Currently, it is one of the major threats to humanity. Moreover, any specific drug or vaccine's unavailability against COVID-19 forces discovering a new drug on an urgent basis. Viral cycle inhibition could be one possible way to prevent the further genesis of this viral disease, which can be contributed by drug repurposing techniques or screening of small bioactive natural molecules against already validated targets of COVID-19. The main protease (M(pro)) responsible for producing functional proteins from polyprotein is an important key step for SARS-CoV-2 virion replication. Natural product or herbal based formulations are an important platform for potential therapeutics and lead compounds in the drug discovery process. Therefore, here we have screened >53,500 bioactive natural molecules from six different natural product databases against M(pro) (PDB ID: 6LU7) of COVID-19 through computational study. Further, the top three molecules were subjected to pharmacokinetics evaluation, which is an important factor that reduces the drug failure rate. Moreover, the top three screened molecules (C00014803, C00006660, ANLT0001) were further validated by a molecular dynamics study under a condition similar to the physiological one. Relative binding energy analysis of three lead molecules indicated that C00014803 possess highest binding affinity among all three hits. These extensive studies can be a significant foundation for developing a therapeutic agent against COVID-19 through animal studies. Severe acute respiratory syndrome-coronavirus-2, a positive-sense single-stranded RNA virus with 27-31 kb RNA genome size, largest one known to date, belongs to the β genus of coronaviridae family [1] . COVID-19 is a major global health emergency in the 21st century emanated from Wuhan city of China in December 2019, which is responsible for lead pandemic with high morbidity and 4-5% mortality rate [2 ,3] . It leads to over 2.6 million death & over 119 million cases until 16 th March 2021 globally, as per the WHO report and still counting [4] . The genesis of coronavirus is over 55 million years earlier, which was concurred with bats, and its most recent familiar progenitor [5] . -9.2 % fatality [6] . Primary infection site by a coronavirus (CoV) in humans and other animals includes severe respiratory and gastrointestinal tract infections [7] . The scientific community worldwide is trying to develop an effective vaccine as well small drug molecules against COVID-19. Several druggable targets in order to inhibit threatning viral infection cycle can be targeted including viral entry inhibitor to cell, viral replication inhibitor, polymerase inhibitor, viral protein synthesis inhibitor, protease inhibitor, RNA-dependent RNA polymerase (RdRp), viral exit inhibitor etc. to develop a remedy [8] . Among them, viral main protease is one of the attractive target as it is responsible for producing polyprotein, which cleaved to different functional proteins of replicase and polymerase, responsible for viral RNA replication process [9, 10] . M pro is comprised of three domains (I-III) featured by active His-Cys catalytic dyad [11] . Main protease possesses several substrate binding pockets such as P1, P1`, P2, P3, P4 and P5 along with S1, S1`, S2 and S4 subsites [11] . Global research status is a vital step for catching an effective cure against COVID- 19 . Remdesivir has been conceded as a propitious antiviral agent against SARS/MERS-CoV virus in cultured cells and non-human primate models [12] . Remdesivir has shown it's potential by treating first US case of COVID-19 successfully [13] . Apart from synthetic drug, there are natural products based active agents which can't be ignored, especially for the treatment of infectious diseases [14] . Many, nature derived anti-SARS-CoV molecule identified experimentally include glycyrrhizin, lycorin, luteolin, emodin, hesperetin, quercetin, sinigrin, savinin, myricetin and scutellarein, which could contribute significantly toward the discovery of a therapeutic agent against COVID-19 [15] [16] [17] [18] [19] [20] . Wrapp et al. disclosed the cryo-electron microscopic structure of the COVID-19 spike protein target for antibody-mediated neutralization [21] and knowledge of spike protein structure at the atomic-level could help to discover drug molecules and vaccine design. Additionally, Liu et al. disclosed the structure of nCOV main protease with internal ligand (PDB ID: 6LU7), which greatly helped to identify new drug molecules against COVID-19 virus in-silico study [22] . Based on in silico study, Wang J. repurposed five neutral drugs, including eravacycline, lopinavir, carfilzomib, elbasvir, and valrubicin as potential inhibitor of COVID-19 main protease [23] . Similarly, Contini A. also identified indinavir and cobicistat as potential binder of M pro of COVID-19 [24] . In this series, Wu group contributed, few indispensable natural molecules e.g. series of andrographolide derivatives, isodecortinol, cerevisterol & hesperidin against M pro of nCoV through a structure based virtual ligand screening [25] . Alqahtani et. al. has also proposed few natural lead molecules such as chrysophanol 8-(6-galloylglucoside), 3,4,5-tri-O-galloylquinic acid, mulberrofuran G, withanolide A, isocodonocarpine and calonysterone as efficient binder of protease enzyme [26] . Mendanha et. al. has screened ten anti-HIV drug molecules which have shown nelfinavir with better molecular dynamics parameters as compare to reference N3 [27] . Additionally, Prasad et. al. has screened two hundred fifty-six FDA approved antimicrobial agents to revel cefiderocol, plazomicin and vanganciclovir as active agent against main protease [28] . Further, Purohit et. al. has identified oolonghomobisflavan-A lead molecule against COVID-19 protease via virtual screening of sixty-five molecules derived from tea plant [29] . Further virtual screening by AboulMagd et. al. inferenced chromene-2-one based active molecule against main protease [30] . Inspired by these studies along with the fact that most of the natural origin molecules possess high biocompatibility and low toxicity, high throughput virtual screening (Glide module of Schrodinger) of natural bioactive molecule databases (six different natural database) against the main protease of SARS-CoV-2 virus is performed. Subsequent filtration of a potent molecule through standard & extra precision algorithm afforded hit molecules. Moreover, top-ranked molecules were further validated through molecular dynamics study along with N3 (reference molecule) to propose a potent lead molecule against M pro of COVID-19. Insilico pharmacokinetic studies were done for top candidates in order to reduce the drug failure rate. Through this study we have aimed to find lead molecules, which have potential to act as therapeutic candidate against SARS-CoV-2. The solved structure of nCoV-19 main protease was retrieved from Protein Data Bank (PDB ID: 6LU7). Mentioned protein has resolution of 2.16 possess R-value free and work of 0.235 and 203 respectively. Internal co-crystallized ligand namely benzyl(3S,6S,9S,12R,Z)-9-isobutyl-6-isopropyl-3-methyl-1-(5-methylisoxazol-3-yl)-1,4,7,10-tetraoxo-12-(((R)-2oxopyrrolidin-3-yl)methyl)-2,5,8,11-tetraazapentadec-13-en-15-oate or simply N3 was defined to generate receptor grid. The protein preparation wizard was utilized for optimization and minimization process for downloaded raw protein (PDB ID: 6LU7) with default option. Maestro operating environment was used to visualize the protein structure in Maestro software Optimization, ionization and energy minimization action were performed by keeping default option in protein preparation wizard of Maestro [31] . An extensive literature survey has been carried out in search of safe naturally bioactive candidates. Bioactive natural molecules to be screened were extracted in sdf format from different natural product database such as NPASS (Natural product activity and species source database), TIPdb (Taiwan indigenous plants database), Analyticon, ChemDiv, ChEBI (Chemical Entities of Biological Interest) and Life Chemicals. Reference molecule i.e. cocrystallized ligand molecule (N3) were drawn through ChemDraw 15.1 and saved in sdf format. All ligands were prepared into 3D structure by utilizing LigPrep module keeping 32 stereoisomer for each ligand as default. OPLS_2005 force field were utilized to generate all possible 3D-conformer of ligands [32] . The active site of main protease was identified for binding of natural ligands and internal ligand of the receptor was chosen for grid box generation. The glide module of Maestro 2017-4 was utilized for receptor grid generation by selecting the atom of co-crystallized ligand of the proteins (Figure 1 ). The size of the grid box was centroid of internal ligand (N3) of the protein, and the docked ligands were similar in size to the N3 molecule. The atoms of protease were fixed within the default parameters of the radii of Vander W l' l of 1Å along with partial charge cut-off of 0.25Å using OPLS_2005 force field. Maestro program was utilized with default parameters in order to evaluate binding affinities of natural candidates within active site of main protease [31] . Receptor grid was produced by selecting atom of co-crystallized ligands. Docking score reflects binding affinity of ligands with the protein; therefore, docking score has been utilized to afford top hits [33] . Initially, HTVS (High Throughput Virtual Screening) was performed for retrieved molecules, eliminating most of the molecules against the main protease of COVID-19 based on their docking score. Molecules with considerable dock score were selected for SP (Standard Precision) & XP (Extra Precision) algorithm of docking, in order to get hit candidates. Topranked candidates docking pose were compared with reference molecule N3 (co-crystallized ligand). In order to know dock pose and ligand interaction for apex molecule, Ligplot option of Maestro software was used [33] . Ligand interactions such as H-bonding, pi-pi interaction & hydrophobic interactions can be seen through this tool. These interactions are important criteria for any ligand candidates to be a lead molecule 3-D interaction diagram for apex hits that were executed directly from Maestro suite workspace visualizer. Drug likeliness parameters were estimated as stipulated in QikProp tool [34] . ToxiM web server was utilized in order to predict toxicity of hit molecules [34, 35] The ADME-T (Absorption, distribution, metabolism, excretion-Toxicity) profile is important aspect for a candidate to behave like drug molecule. Bioavailability and bioactivity greatly depends upon pharmacokinetic properties. Toxicity is another huge factor, therefore safety of hit candidates in human physiology do matter for further experimental validation. Molecular dynamics (MD) study was performed for four protein-ligand complexes, including three-hit ligand and one reference molecule, i.e. N3, G OM (GROningen MAchine for Chemical Simulation) 1 .1. M l l w l z p l p w p l p l l . CHARMM General Force Field (CgenFF) server was utilized to produce desirable ligand topology, while 'pdb2gmx' script was implied for protein topology preparation. Further, a complex of both topologies i.e. ligands and protein, were subjected under Charmm36-July-2017 force field to produce the energy minimized conformation of all the processed complexes [36] . After that, ligand-protein complexes were solvated in a cubic simulation box through one-point water model (SPC216) [37] . The particle mesh Ewald method was taken into account for specifying long-range electrostatic interactions. Steepest decent function was incorporated for the energy minimization of the complexes (50,000 steps) [38] . The system was equilibrated under NVT ensemble with a constant number of particles, volume and temperature for 2ns, followed by NPT ensemble with a constant number of particles, pressure and temperature for 10 ns. In terms of geometry and solvent orientation, the well-equipped and solvated structures were subjected to further simulation. Sequentially, production phase was executed for all equilibrated complexes without any restriction for 200 ns and system coordinates were saved after every 2 ps [39] . The trajectories generated by the simulation were inspected using in-built GROMACS scripts in the form of quantitative parameters such as the root mean square deviation (RMSD), root mean square fluctuation (RMSF), gyration radius (Rg) and hydrogen bond contact profile [40] . Availability of nCoV main protease crystal structure was the main driving force for screening natural molecule databases toward finding lead molecule against COVID-19. Molecular docking represent comparable affinity parameters of tested chemical entities with validated inhibitor molecule or drugs toward target receptor [32] . and Figure 2 ). Molecular docking allows us to determine the partial static bonding stability of the complex, but to determine full descriptors responsible for dynamic stability of complex in its biological environments; molecular dynamic simulation is required. The static picture of the complexes does not provide full descriptions of all the factors responsible for providing stability to the protein-ligand complexes [42] . RMSD is an important parameter for measuring the dynamic stability and equilibration of protein-ligand complex. To validate the docking study, we further analyzed basic MD parameters like RMSD and RMSF of C-α pl x protein and fluctuation range of protein residues, respectively. Three top scored natural candidates based on virtual screening were subjected for simulation by using GROMACS- Additionally, test candidates were also investigated for H-bonding interaction (% occupancy) against amino acid of protease enzyme for the last 50ns time-period trajectory ( Table 3) . Another important simulation parameter viz. Radius of gyration of protease were calculated to measure its consistency [43] . Further, these molecules were subjected for ADME-T and molecular dynamics study in order to validate it's drug likeness properties. ADME-T study is an important criteria for knowing whether a molecule can act as drug or not. Here, ADME parameter for top three hits found to be approximately in safe range. Toxicity behaviour of these hits were evaluated through ToxiM web server, and all parameters found in safe range ( Table 2 ) [35] . Table 3 : H-bonding contact of molecular dynamics with significant percentage occupancy of test and reference compounds against residues of nCOV protease during last 50 ns simulation. Participating amino acid residues (% occupancies) The binding affinity of protein-ligand complexes is also defined by the binding free energy measured using g_MM-PBSA software. The results of the MM-PBSA studies are shown in the table below. For the measurement of binding free energy, only 10ns MD trajectories were used. The binding energy calculations in this method were calculated by adding all type of . . W l' l p l l ( l l surface area) energy. The binding affinities of all three expected hits were outstanding, but they were lower than the p N3 (− 35. 4 kJ/mol). C00014803 has the highest binding energy (−143.63 kJ/ l), while C00006660 and ANLT0001 possess low binding affinities i.e. −141.33 kJ/ l −1 .56 kJ/mol respectively). Furthermore, the electrostatic interactions, non-polar solvation energy, and Vander Waals interactions all contributed negatively to the overall interaction energy, while the polar solvation energy enriched the binding energy positively. Based on these observations, the best compound -complexes among the four possible inhibitors are N3, C00014803 and C00006660. These findings corroborate our previous MD simulation study. Computational tool is an emerging technique in drug design and discovery process [44] . In order to search a new antiviral drug protease enzyme is most validated target, which is responsible for producing protein involved in viral replication [45] . In our study we have subjected thousands of natural molecules again main protease of COVID-19, in order to identify lead molecules, which can act as antiviral agent. Docking analysis afforded top three hits based on their binding affinity toward viral protease. First hit i.e. C00014803 has docking score of -12.76 kcal/mol possess nine hydrogen bonding with THR24, THR26, SER46 (2 Hbonding), ASN142. CYS145, HID163, GLU166, GLN189 with no hydrophobic interaction ( Table 1 and Figure 3) . Second hit molecule i.e. C00006660 has docking score of -11.59 kcal/mol possess six H-bonding with THR26, LEU141, ASN142, CYS145, THR190 (2 Hbonding) (Table 1 and Figure 4) . Third ranked molecule i.e. ANLT0001 has docking score of -11.09 kcal/mol possess seven H-bonding with THR26 (2 H-bonding), ASN142 (2 Hbonding), HID163, GLU166 (2 H-bonding) (Table 1 and Figure 5 ). Figure 6 ). All top three hit molecules were found to have similar pose in active binding pocket of protein as reference molecule N3 (Supporting information, Figure S3 ).The three top-hits of docked protein-ligand complexes were studied further for understanding structural details, dynamic behaviour and stability by using MD simulation. When we simulated the protein with each ligands and compared against without ligand individually for 200 ns, we observed their rmsd graph and noticed the variation associated w . l α-backbone atom of simple protein shows less fluctuation (initial stage). RMSD initially swinged slight for 4-12 ns for protein bound with ligand C00014803 followed by minute brink seen at 60 ns and eventually resulted in significant uniformity for the last 50 ns, suggesting stability of the complex. In contrast, as compared to C00014803 and reference N3 molecules, C00006660 and ANLT0001 showed more RMSD fluctuations for the entire MD run. RMSF was also evaluated to measure the atomic mobility of structural fluctuating backbone atoms over a span of 200 ns. Overall, it was noted that, relative to other top docked hits, molecule specifically C00014803 docked protein residues showed less fluctuation as compared to remaining residues of protein. Initially, without any checked ligands, it shown alike reasonable spike generation at 50-90 and 150-200 residue index numbers as mentioned in the 6LU7 (backbone). The RMSF captured, for each atom and alteration about its average position, which gives insight into the pliability in regions of the protease. Specifically, by looking at the RMSF plots for protease with 306 amino acids and three possible drug candidates and one reference molecule, RMSF was calculated; the values confirmed that the residues of the binding site showed less fluctuation. For first hit (C00014803), second hit (C00006660), third hit (ANLT0001) and N3 (reference) the average RMSF values were 0.13, 0.14, 0.11 and 0.03 nm respectively. The active site residue PRO108 and GLN127 reflects minute variation in almost all the structures. This residues are responsible for catalytic action of protein against all computationally simulated ligands. H-bonding contact were also evaluated for last 50 ns. We observed from the table 3 that there was high occurrence of some residues like PRO which play important role in molecular recognition and structurally provide more conformational rigidity. Along with PRO, VAL and GLN were other two most Disclosure l p l l . Coronaviruses as DNA Wannabes: A New Model for the Regulation of RNA Virus Replication Fidelity Network-based drug repurposing for novel coronavirus 2019-nCoV/SARS-CoV-2 Pathogenicity and transmissibility of 2019-nCoV-a quick overview and comparison with other emerging viruses A Case for the Ancient Origin of Coronaviruses Summary of probable SARS cases with onset of illness from 1 Biochemical aspects of coronavirus replication and virus-host interaction Antiviral replication agents, Viral Replication Viral RNA replication in association with cellular membranes, Membrane Trafficking in Viral Replication An overview of severe acute respiratory syndrome-coronavirus (SARS-CoV) 3CL protease inhibitors: Peptidomimetics and small molecule chemotherapy Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors Broad-spectrum antiviral GS-5734 inhibits both epidemic and zoonotic coronaviruses Remdesivir for severe acute respiratory syndrome coronavirus 2 causing COVID-19: An evaluation of the evidence Natural Products as a Foundation for Drug Discovery Glycyrrhizin, an active component of liquorice roots, and replication of SARSassociated coronavirus Small Molecules Blocking the Entry of Severe Acute Respiratory Syndrome Coronavirus into Host Cells Identification of natural compounds with antiviral activities against SARS-associated coronavirus Anti-SARS coronavirus 3C-like protease effects of Isatis indigotica root and plant-derived phenolic compounds Specific Plant Terpenoids and Lignoids Possess Potent Antiviral Activities against Severe Acute Respiratory Syndrome Coronavirus Identification of myricetin and scutellarein as novel chemical inhibitors of the SARS coronavirus helicase, nsP13 Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors Fast Identification of Possible Drug Treatment of Coronavirus Disease-19 (COVID-19) through Computational Drug Repurposing Study Virtual screening of an FDA approved drugs database on two COVID-19 coronavirus proteins Analysis of therapeutic targets for SARS-CoV-2 and discovery of potential drugs by computational methods Structure-based virtual screening and molecular dynamics of phytochemicals derived from Saudi medicinal plants to identify potential COVID-19 therapeutics Molecular dynamics simulation of docking structures of SARS-CoV-2 main protease and HIV protease inhibitors Docking of fda approved drugs targeting nsp-16, n-protein and main protease of sars-cov-2 as dual inhibitors Identification of bioactive molecules from tea plant as SARS-CoV-2 main protease inhibitors Ligand-based design, molecular dynamics and ADMET studies of suggested SARS-CoV-2 M pro inhibitors Docking Techniques in Pharmacology: How Much Promising? Glide: A New Approach for Rapid, Accurate Docking and Scoring. 1. Method and Assessment of Docking Accuracy Benchmarking the reliability of QikProp. correlation between experimental and predicted values, QSAR and COMB SCI ToxiM: A toxicity prediction tool for small molecules developed using machine learning and chemoinformatics approaches CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields Structure and dynamics of the TIP3P, SPC, and SPC/E water models at 298 K Continuous matrix product states for quantum fields: An energy minimization algorithm GROMACS 4.5: a highthroughput and highly parallel open source molecular simulation toolkit g_mmpbsa-A GROMACS tool for high-throughput MM-PBSA calculations Molecular docking and dynamic simulations for antiviral compounds against SARS-CoV-2: A computational study Radius of gyration as an indicator of protein structure compactness Computational methods in drug discovery Three decades of the hepatitis C virus from the discovery to the potential global elimination: the success of translational researches occurred amino acids which also provide structural stability. Radius of gyration were checked in order to see consistency of protein ligand complex, which have shown top hit i.e. 14 3 p l p w p . l l w (C00014803) p p w ll w l w p l l ( l 4). P k l l w p l w l l x p l w l l k p p . In current study, a library of natural compounds retrieved from natural product databases was screened against nCOV-protease in order to propose possible therapeutic candidates for current crisis i.e. COVID-19. Our studies provide detailed information about the inhibitory ability of top-ranked three natural candidates. Molecular docking algorithms against main protease afforded hit molecules, which were further subjected under different pharmacokinetic and toxicity parameters, in order to check their drug likeliness behavior. ADME-T studies have shown acceptable pharmacokinetic descriptor with safe profile. Further, these top hits were subjected to MD simulation study in order to validate stability behavior of top natural candidates against COVID-19 protease. MD simulation studies confirmed the stability of screened hits with protease of nCoV, which were evaluated on the basis of the RMSD, RMSF and Rg trajectories of dynamic simulation for 200 ns. Hydrogen bonding contact were further evaluated to confirm their stability. Current studies afforded top 3 candidates in form of C00014803, C00006660, ANLT0001 (natural candidates) which possess potential against COVID-19 main protease. These top three hits also possess decent free binding energy with main protease. The current findings conclude that the design of novel class of COVID-19 protease inhibitors and the need for more experimental validation can be considered for these three top hits.