key: cord-347684-qzswojwp authors: Majumder, Ranabir; Mandal, Mahitosh title: Screening of plant-based natural compounds as a potential COVID-19 main protease inhibitor: an in silico docking and molecular dynamics simulation approach date: 2020-09-08 journal: Journal of biomolecular structure & dynamics DOI: 10.1080/07391102.2020.1817787 sha: doc_id: 347684 cord_uid: qzswojwp A new strain of coronavirus (CoV) has been identified as SARS-CoV-2, which is responsible for the recent COVID-19 pandemic. Currently, there is no approved vaccine or drug available to combat the pandemic. COVID-19 main protease (M(pro)) is a key CoV enzyme, which plays an important role in triggering viral replication and transcription, turns it into an attractive target. Therefore, we aim to screen natural products library to find out potential COVID-19 M(pro) inhibitors. Plant-based natural compounds from Sigma-Aldrich plant profiler chemical library have been screened through virtual molecular docking and molecular dynamics simulation to identify potential inhibitors of COVID M(pro). Our virtual molecular docking results have shown that there are twenty-eight natural compounds with a greater binding affinity toward the COVID-19 M(pro) inhibition site as compared to the co-crystal native ligand Inhibitor N3 (-7.9 kcal/mol). Also, molecular dynamics simulation results have confirmed that Peonidin 3-O-glucoside, Kaempferol 3-O-β–rutinoside, 4-(3,4-Dihydroxyphenyl)-7-methoxy-5-[(6-O-β-D-xylopyranosyl-β-D-glucopyranosyl)oxy]-2H-1-benzopyran-2-one, Quercetin-3-D-xyloside, and Quercetin 3-O-α-L-arabinopyranoside (selected based on the docking score) possess a significant amount of dynamic properties such as stability, flexibility and binding energy. Our In silco results suggests that all the above mention natural compounds have the potential to be developed as a COVID-19 M(pro) inhibitor. But before that, it must go through under the proper preclinical and clinical trials for further scientific validation. Communicated by Ramaswamy H. Sarma Last year, in December 2019 rapidly spreading viral Pneumonia cases were found in the city of Wuhan (China) (Wu et al., 2020; Zhou et al., 2020) . Later on, a new strain of coronavirus was identified responsible for the outbreak, named SARS-CoV-2 (Gorbalenya et al., 2020) , because the RNA genome of this new virus is 82% identical to the SARS coronavirus (SARS-CoV) and both the viruses belong to clade b of the genus Betacoronavirus (Wu et al., 2020; Zhou et al., 2020) . On February 11, 2020, the World Health Organization (WHO) officially named the disease COVID-19 (coronavirus disease 2019). As human to human transmission of this virus headed to exponential growth globally. The World Health Organization (WHO) declared the outbreak a pandemic on 11 th March 2020. According to the current situation report (WHO) on Aug 14 th , 2020 there are 20,730,456 cumulative confirmed cases globally with a 3.62% death rate (World Health Organisation (WHO), 2020). Currently, there is no available therapy to treat COVID-19. Therefore, drugs are needed which can inhibit the SARS-CoV-2. One of the best drug targets to combat the coronaviruses is the main protease (M pro ) (Anand et al., 2003) (Figure 1 ). As M pro plays a pivotal role in the translation of polyproteins from viral RNA. The functional polypeptides are release from two overlapping polyproteins, pp1a and pp1ab through an expanded proteolytic process, majorly by the M pro . It operates at no less than 11 cleavage sites on the large polyprotein 1ab (replicase 1ab, $790 kDa); the recognition sequence at most sites is Leu-Gln #(Ser, Ala, Gly) (# marks the cleavage site) . Therefore, blocking the activity of this enzyme would inhibit viral replication and transcription. Also, no proteases with a similar cleavage specificity are known to be found in human, therefore inhibitors are more likely to be non-toxic . Regarding that, target-based screening of bioactive compounds could be an option to identify potential M pro inhibitor for SARS-CoV-2. For this purpose, computation (In silico) methods like virtual molecular docking and molecular dynamic simulation could be utilized efficiently. Later, screened potential compounds can be validated through In vitro/In vivo experiments. Thus, we can speed up the process of drug discovery and development. As plant-based natural compounds have a large range of structural diversity, we have tried to screen (In silico) natural compounds from Sigma-Aldrich plant profiler chemical library against the target (COVID-19 M pro ), and found a natural compound Rutin as a hit from garlic (Allium sativum) with a basic structure of anthocyanin. Later on, we have tried to find structurally similar natural compounds to Rutin and also screened (In silico) them against the target. The 3 D crystal structure of COVID-19 main protease (M pro ) has been obtained from RCBS Protein Data Bank (PDB) (https:// www.rcsb.org/) using PDB ID: 6LU7 (Jin et al., 2020) . The crystal structure has two chains: A and C. Chain A of the macromolecule has been selected as the target receptor. The native co-crystal ligand present in the crystal structure is n-[(5-methylisoxazol-3-yl)carbonyl]alanyl-l-valyl-n$1$-((1r,2z)-4-(benzyloxy)-4-oxo-1-f[(3r)-2-oxopyrrolidin-3-yl] methylgbut-2-enyl)-l-leucinamide (Inhibitor N3). Then, the selected target receptor has been prepared (add hydrogen and charge: AMBER ff14SB) for docking using Dock Prep tool of UCSF Chimera software. COVID-19 main protease (M pro ) 3 D crystal structure (PDB ID: 6LU7) is a complex structure with an inhibitor N3. In Discovery Studio Visualizer 2017 R2, it has been observed that this protease crystal structure has five cavities site and the Inhibitor N3 is bound to the protease at site 1 ( Figure S9 ). Therefore, we have selected site 1 as an inhibition site ( Initially, we have screened various natural compounds present in Sigma-Aldrich plant profiler chemical library (https:// www.sigmaaldrich.com/life-science/nutrition-research/learningcenter/plant-profiler.html). Performing virtual molecular docking, we found that Rutin from garlic (Allium sativum) possesses a higher binding affinity than native co-crystal ligand inhibitor N3 toward COVID-19 main protease. After that, structurally similar compounds of rutin has been searched using SwissSimilarity (http://www.swisssimilarity.ch/) web tool (Zoete et al., 2016) . From here, 40 compounds have been successfully docked to COVID-19 (M pro ) inhibition site (Table 1) . All the ligands have been obtained from PubChem chemical library in SDF format. Later on, it have been converted into mol2 file format for docking purposes using Discovery Studio Visualizer 2017 R2. Before docking, all the ligands (mol2 format) have been prepared (add hydrogen and charge: Gasteiger) using Dock Prep tool of UCSF Chimera software. Virtual molecular docking and analysis have been performed according to the previously described method (Majumder et al., 2019) . All the small molecules have been docked at COVID-19 M pro inhibition site by Autodock vina (Trott & Olson, 2009 ) using UCSF Chimera gui. Explicit solvent molecular dynamics (MD) simulations have been executed for the further investigation of virtual molecular docking results. GROMACS v.2019.4 package has been used to run 100 ns MD simulation of protease-ligand complexes according to the previously described method (Das et al., 2020; Panda et al., 2020) . GROMOS96 43a1 in single point charge (SPC) water models (Berendsen et al., 1995) and PRODRG server (van Aalten et al., 1996) have been used to generate force field and parameter files for protease and ligands. The receptor-ligand complexes have been to be placed in a periodic cubic solvated box (x ¼ 9.631, y ¼ 9.631, z ¼ 9.631) with at least 10 Å distance from the edge of the box. Periodic boundary conditions have been employed using the particle mesh Ewald (PME) method for long-range electrostatics interactions. All the receptor-ligand complex systems have been neutralized by adding 4 NA þ counter ions. Prior to the running of real dynamics, energy minimization and equilibration of all systems has been executed through following three steps: (i) Energy minimization of all systems containing ions, solvent, receptor, and ligand has been executed through using the steepest descent minimization algorithm with 5000 steps to achieve stable system with maximum force < 1000 kJ mol À1 nm À1 . (ii) Position restrains have been applied to receptor and ligand of the each systems for 100 ns throughout heating (300 K) utilizing NVT (No. of atoms, Volume, Temperature) ensemble with leap-frog integrator, a time step of 2 fs and LINCS holonomic constraints. (iii) NPT (No. of atoms, Pressure, Temperature) ensemble has been applied at a constant pressure (1 bar) and temperature (300 K) for 100 ps using a time step of 2 fs for NPT equilibration phase. After the energy minimization and equilibration of all systems, MD production run has been executed without any restrain for 100 ns with a time step of 2 fs, and after every 10 ps coordinates of the structure have been saved. After the completion of 100 ns MD simulation, the trajectories have been used for various dynamics analysis such as root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), number of hydrogen bonds, Gibbs free landscape 3-hydroxy-7,3 0 ,4 0 ,5 0 -tetramethoxyflavone 21721930 and secondary structural analysis by different inbuilt scripts of GROMACS. Also, the MM-PBSA binding free energy of receptor-ligand complexes have been calculated by utilizing the Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) method which utilized the MD simulation trajectories and is one of the best-used methods in this kind of analysis (Kumari et al., 2014) . Initially, native co-crystal ligand: Inhibitor N3 of the COVID-19 main protease (M pro ) complex (crystal structure) has been redocked to legitimize the virtual molecular docking protocol for its accuracy. The root mean square deviation (RMSD) value between the co-crystal and redocked native ligand: Inhibitor N3 pose is 1.094 Å. The RMSD value within 2 Å is considered as a successful docking protocol (Rao et al., 2007) . It has been observed that the interaction of co-crystal and redocked native ligand: Inhibitor N3 with COVID-19 M pro inhibition site residues are mostly identical ( Figure 2 ). After that, forty natural compounds have been screened as a potential inhibitor (Table 1) . Redocking of native co-crystal ligand: Inhibitor N3 shows that it has a high amount of binding affinity (-7.9 kcal/mol) toward inhibition site. Therefore, we have analysed the interaction of those small molecules which have lesser binding energy that means higher binding affinity toward inhibition site as compared to native co-crystal ligand: Inhibitor N3. We have found twenty-eight small molecules that possess higher binding affinity as compared to native co-crystal ligand: Inhibitor N3 ( Figure S1 -S8). Also, these twenty-eight small molecules and native co-crystal ligand: Inhibitor N3 interactions with inhibition site amino residues of COVID-19 (M pro ) have been depicted in Table 2 . For the further investigation of virtual molecular docking results, top six docked ligand (binding energy À9.0 kcal/ mol) complex (COVID-19 M PRO -ligand) and crystal protease complex with the Inhibitor N3 (COVID-19 M PRO , PDB ID: 6LU7), have been run for 100 ns molecular dynamics simulation. From that MD simulation trajectory, we have analysed root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), number of hydrogen bond, Gibbs free energy landscape and secondary structural to check receptor-ligand conformational properties such as stability, and flexibility. Besides that, the binding energy of receptor-ligand complexes throughout the whole MD simulation has also calculated using MMPBSA methods. Root mean square deviation (RMSD) value is an indicator of the stability of the receptor-ligand complex. Here, the RMSD values of all six docked COVID-19 M pro -ligand complexes (ligands docked at inhibition site) have been compared with COVID-19 M pro -Inhibitor N3 complex (crystal structure) (Figure 3a) . It is visible that RMSD of COVID-19 M pro complexes docked with Peonidin 3-O-glucoside (0.303 nm), Quercetin-3-D-xyloside (0.363 nm), 4-(3,4-Dihydroxyphenyl)-7methoxy-5-[(6-O-ß-D-xylopyranosyl-ß-D-glucopyranosyl)oxy]-2H-1-benzopyran-2-one (0.434 nm), Quercetin 3-O-a-L-arabinopyranoside (0.436 nm), and Kaempferol 3-O-ß -rutinoside (0.586 nm) respectively are significantly stable as compared to the native co-crystal ligand Inhibitor N3 (0.617 nm). It indicates that all of these bioactive molecules have formed stable complexes with COVID-19 M pro . We have found that RMSD value of Rutin (0.822 nm) is higher when it binds with COVID-19 M pro as compared to the native co-crystal ligand Inhibitor N3. It suggests that COVID-19 M pro -Rutin complex is slightly less stable. Rg is the indicator of compactness, stability, and folding of structure. We have calculated the Rg based on the intrinsic dynamics of protease-ligand complexes (Figure 3c ). All the complexes have a steady average Rg of $2.10 nm over the 100 ns MD simulation. Such results indicate that majority of ligands have formed compact and stable complexes with COVID-19 M PRO as compared to the native co-crystal ligand Inhibitor N3. The number of Hydrogen bonds (H-bond), which are the main stabilizing interaction factor between two molecules, have been calculated to investigate the nature of H-bond at the inhibition site of COVID-19 M pro . The H-bonds have been recorded throughout the 100 ns of the receptor-ligand MD simulations (Figure 3d) . Consequently, it has been observed that native co-crystal ligand Inhibitor N3 has formed an average of 1.06 H-bonds, while other docked ligands such as Gibbs free energy landscape (FEL) plot for COVID-19 M PROligand complexes have been generated by calculating the first two principal components (PC1 and PC2) as reaction coordinates, using GROMACS inbuilt scripts (g_covar, g_anaeig, and g_sham). The primary function of g_sham is to plot Gibbs free energy landscapes by Bolzmann inverting multi-dimensional histograms using PC1 and PC2 as input. The FEL can depict the global minima energy conformation of a structure (receptor-ligand complex). If the receptorlignad interaction is very weak or unstable, it can achieve multiple minimum energy clusters; whereas a strong and stable interaction can bring almost a single conformation cluster in the potential energy distribution. All the contour 2 D and 3 D plots of receptor-ligand complexes during 100 ns MD simulation have been depicted in Figure 4 . Each receptor-ligand complex has shown a distinct pattern for FEL. Dark violet/blue spots reflect the energy minima and energetically favoured structural conformation and red/yellow spots indicate the unfavourable structural confirmation. The shallow and narrow energy basin indicates the low stability of structural conformation. It can be observed from the Figure (Figure 4a ). It suggests these two natural compounds have formed more strong, stable, and energetically favourable structural conformation when they bind with COVID-19 M pro as compared to native ligand Inhibitor N3. To calculate a more accurate binding free energy between COVID-19 main protease (COVID-19 M pro ) and selected ligands at receptor inhibition site, the MM/PBSA based method has been used. Here, the binding free energy defines the total of non-bonded interaction energies (van der Wall, electrostatic, polar solvation, and SASA energy) between receptor and ligand throughout the whole MD simulation (100 ns). Lower binding free energy indicates a better binding between receptor and ligand. We have estimated binding free energy of docked ligands from 100 ns MD simulation trajectories and it suggests that Kaempferol 3 Table 3 ). We have found that Rutin (-164.299 þ/-38.698 kJ/mol) has a lower binding affinity towards COVID-19 M pro inhibition site as compared to co-crystal ligand inhibitor N3. It may be due to the higher RMSD value of Rutin at the receptor binding/inhibition site as compared to other ligands. Later on, we have analyzed the binding contribution energy of each amino acid residue of the receptor (COVID-19 M pro ) in the binding of ligands at the inhibition site (Figure 5b ). There are several amino acid residues (THR-24, HIS-41, , which have shown favorable contribution energy when they bind with natural ligands as compared to native ligand Inhibitor N3. We have found that amino acid residue GLU-166 has involved in highest contribution energy during the receptor-ligand binding. It has been reported that amino acid residue GLU-166 involves in the construction of a biologically active main protease (M pro ) dimer form (Anand et al., 2003) . It suggests bioactive natural ligands like Kaempferol 3-O-b -rutinoside, Peonidin 3-O-glucoside, Quercetin-3-D-xyloside, etc. could effectively bind to COVID-19 M pro amino acid residue GLU-166 and inhibit the dimerization process of COVID-19 M pro . It could lead to higher efficacy in the inhibition of the COVID-19 M pro . An extensive study has been done to understand the structural evolution of the COVID-19 M pro -ligand during the 100 ns simulation. Moreover, how the ligand is sticking to the inhibition site of COVID-19 M pro is a crucial point to understand the stability of the ligands. Five snapshots of the COVID-19 M pro -ligand complexes have been taken at every 25 ns interval (25 ns, 50 ns, 75 ns, and 100 ns). It is observed that the secondary structure elements (helix and beta-sheets) remained conserved throughout the simulation process ( Figure 5) , which highlights the stability and reliability of COVID-19 M pro after binding to the ligands. The N-terminal and the C-terminal coils are observed to be fluctuating a bit, but structurally, no significant changes have been seen. It has been also observed that the ligands are constantly attached to the inhibition site without any structural modification, which implies that the ligands are highly stable. The Secondary structural analysis with gmx_do_dssp has shown no significant level of overall conformational change in ligand-bound COVID-19 M pro complexes as compared to the crystal structure of COVID-19 M PRO in complex with an inhibitor N3 ( Figure 6 and Table 4 ). Our In silico studies reveals that natural compounds like Rutin and its structurally similar compounds with a basic structure of anthocyanin (Peonidin 3-O-glucoside, Kaempferol 3-O-b-rutinoside, Quercetin-3-D-xyloside, Quercetin 3-O-a-Larabinopyranoside, etc.) may inhibit the COVID-19 main protease (M pro ), which is essential to block the replication and growth of the novel coronavirus (SARS-CoV-2). Our virtual molecular docking score suggests that the top twenty-eight compounds (Table 1 ) have a higher amount of binding affinity toward inhibition site of COVID-19 M pro as compared to the native co-crystal ligand: Inhibitor N3. Furthermore, molecular dynamic simulation studies of the top six compounds which have a very high amount of binding affinity (binding energy À9 kcal/mol from docking score) toward inhibition site reveal that compounds like Kaempferol 3-O-brutinoside, Peonidin 3-O-glucoside, Quercetin-3-D-xyloside, 4-(3,4-Dihydroxyphenyl)-7-methoxy-5-[(6-O-ß-D-xylopyranosyl-ß-Dglucopyranosyl)oxy]-2H-1-benzopyran-2-one, and Quercetin 3-O-a-L-arabinopyranoside have a highly favorable conformational stability, flexibility, and binding energy when they have docked into the inhibition site of COVID-19 M pro compared to the crystal structure of COVID-19 M PRO in complex with an Inhibitor N3. Overall, our In silico results indicate that the above-mentioned compounds may have the potential to be developed as an anti-COVID-19 main protease drug to combat the novel coronavirus but before that, it must go through under the proper preclinical and clinical trials for further scientific validation. Inhibitor N3 54 30 24 1 14 7 22 1 0 Peonidin 3-O-glucoside 59 26 26 2 12 9 23 0 2 Kaempferol 3-O-b -rutinoside 59 28 26 1 12 9 23 0 2 Rutin 60 27 25 1 12 9 24 0 2 Coronavirus main proteinase (3CLpro) structure: Basis for design of anti-SARS drugs GROMACS: A message-passing parallel molecular dynamics implementation Flavonoids from Mimosa xanthocentra (Leguminosae: Mimosoideae) and molecular modeling studies for isovitexin-2 00 -O-a-lrhamnopyranoside rotamers Hypoglycemic activity of extracts and compounds from the leaves of hintonia standleyana and H. latiflora: Potential alternatives to the use of the stem bark of these species In-Silico approach for identification of effective and stable inhibitors for COVID-19 main protease (M pro ) from flavonoid based phytochemical constituents of Calendula officinalis Phenolic derivatives from ruprechtia polystachya and their inhibitory activities on the glucose-6-phosphatase system Cytotoxic activities of flavonoid glycoside acetates from Consolida oliveriana Inhibition of thyroid peroxidase by Myrcia uniflora flavonoids The species Severe acute respiratory syndrome-related coronavirus: Classifying 2019-nCoV and naming it SARS-CoV-2 In vitro cultures of Scutellaria alpina as a source of pharmacologically active metabolites A study of apiin from the parsley seeds and plant Chapter 26 -Antidiabetic herbal medicines rebranded as dietary supplements Evolution of juice anthocyanins during ripening of new selected pomegranate (Punica granatum) clones Human Metabolome Database: Showing metabocard for Rhamnetin 3-sophoroside (HMDB0038303). (n.d.) Chapter 2 -Visual perceptions Characterization and quantification of flavonoid glycosides in the Prunus genus by UPLC-DAD-QTOF/MS Structure of M pro from COVID-19 virus and discovery of its inhibitors Qualitative and quantitative analysis of flavonoids from 12 species of Korean mulberry leaves Effects of anthocyanins on the AhR-CYP1A1 signaling pathway in human hepatocytes and human cancer cell lines g_mmpbsa-a GROMACS tool for high-throughput MM-PBSA calculations Antioxidant capacities and phenolics levels of French wines from different varieties and vintages Antifungal activity of camptothecin, trifolin, and hyperoside isolated from Camptotheca acuminata Quantitative analysis of phenolic compounds in Chinese hawthorn (Crataegus spp.) fruits by high performance liquid chromatography-electrospray ionisation mass spectrometry In vitro and in silico study of Aloe vera leaf extract against human breast cancer A phenylstyrene from Hintonia latiflora Bioavailability of Apigenin from Apiin-Rich Parsley in Humans Bioavailability of pelargonidin-3-O-glucoside and its metabolites in humans following the ingestion of strawberries with and without cream Structure-based drug designing and immunoinformatics approach for SARS-CoV-2 Chapter 26 -The beneficial role of rutin, a naturally occurring flavonoid in health promotion and disease prevention: A systematic review and update Two new glycosides from Dryopteris fragrans with anti-inflammatory activities Validation studies of the site-directed docking program libdock Rhoifolin: A review of sources and biological activities Trifolin from Euphorbia condylocarpa Isolation and spectroscopic identification of some constituents of bioactive fractions of aerial parts of mirabilis Jalapa AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading PRODRG, a program for generating molecular topologies and unique molecular descriptors from coordinates of small molecules Antioxidant activity of compounds isolated from Salvia plebeia Coronavirus disease 2019 (COVID-19) situation report -77 A new coronavirus associated with human respiratory disease in China Separation and purification of flavones from Nelumbo nucifera Gaertn. by silica gel chromatography and high-speed counter-current chromatography Flavonoid glycosides from the seeds of litchi chinensis Antitcoagulant and antiplatelet activities of scolymoside Chapter 76 -Cardiovascular effects of hesperidin: A flavanone glycoside Polyphenols in human health and disease Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved a-ketoamide inhibitors A pneumonia outbreak associated with a new coronavirus of probable bat origin SwissSimilarity: A web tool for low to ultra high throughput ligand-based virtual screening We would like to acknowledge the Indian Institute of Technology Kharagpur, India, Indian Council of Medical Research, Council of Scientific and Industrial Research, and J C Bose Fellowship (SERB, India) for providing structural and financial support. No potential conflict of interest was reported by the authors. http://orcid.org/0000-0003-3585-3276