key: cord-0709764-cibg80zy authors: Khan, Shama; Fakhar, Zeynab; Hussain, Afzal; Ahmad, Aijaz; Jairajpuri, Deeba Shamim; Alajmi, Mohamed F.; Hassan, Md. Imtaiyaz title: Structure-based identification of potential SARS-CoV-2 main protease inhibitors date: 2020-11-19 journal: Journal of biomolecular structure & dynamics DOI: 10.1080/07391102.2020.1848634 sha: b9f7fbaaa20ff595853d9ded9183f50ed52a81b9 doc_id: 709764 cord_uid: cibg80zy To address coronavirus disease (COVID-19), currently, no effective drug or vaccine is available. In this regard, molecular modeling approaches are highly useful to discover potential inhibitors of the main protease (M(pro)) enzyme of SARS-CoV-2. Since, the M(pro) enzyme plays key roles in mediating viral replication and transcription; therefore, it is considered as an attractive drug target to control SARS-CoV-2 infection. By using structure-based drug design, pharmacophore modeling, and virtual high throughput drug screening combined with docking and all-atom molecular dynamics simulation approach, we have identified five potential inhibitors of SARS-CoV-2 M(pro). MD simulation studies revealed that compound 54035018 binds to the M(pro) with high affinity (ΔG(bind) −37.40 kcal/mol), and the complex is more stable in comparison with other protein-ligand complexes. We have identified promising leads to fight COVID-19 infection as these compounds fulfill all drug-likeness properties. However, experimental and clinical validations are required for COVID-19 therapy. Communicated by Ramaswamy H. Sarma The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has emerged as a global pandemic as it has affected the entire population (Pang et al., 2020) . The genome sequence of SARS-CoV-2 revealed a close similarity to the coronavirus (CoV) that caused an outbreak of severe acute respiratory syndrome (SARS) in 2003 Lai et al., 2020) . World Health Organization (WHO) declared COVID-19 as pandemic because more than 70,000 people died and about 2.5 million infected (Chang et al., 2020) . After H1N1 (2009 H1N1 ( ), Polio (2014 , Ebola in West Africa (2014), Zika (2016) and again Ebola in the Democratic Republic of Congo (2019); WHO has declared COVID-19 as the sixth public health emergency of global distress (Lai et al., 2020) . The current situation is rapidly advancing; thus, the ultimate extent and severity of this pandemic remain to be critical. The mortality rate of SARS-CoV-2 is about 3.8% which has been reported as the lowest than other SARS-CoV (10%) and Middle East respiratory syndrome coronavirus (MERS-CoV) (37.1%). However, its infection rate is more than 10 times higher than these CoVs (Ahn et al., 2020) . The most characteristic symptom of COVID-19 patients is a high level of respiratory distress, which needs immediate intensive care facility (Lai et al., 2020; Zou et al., 2020) . So far, there has been no effective and accurate treatment to cure COVID-19 . Hence, there is an urgent need to make significant efforts to develop therapeutic interventions and diagnostic methods to control coronavirus infections . CoVs belong to the family of Coronaviridae containing a single-strand of positive-sense RNA viruses. These viruses can be classified into four genera: alpha, beta, gamma, and delta. The current SARS-CoV-2 belongs to the beta genus and is commonly known to infect humans (Menachery et al., 2015; Ahmed et al., 2020) . The genome length of this virus is about 27-32 kb encoding both structural and non-structural proteins (Zhang & Holmes, 2020) . Among the structural proteins, membrane, envelope, nucleocapsid, and spike proteins contribute significantly to virus transmission and its replication in the host cells (Naqvi et al., 2020; Shanmugaraj et al., 2020) . PLpro and 3CLpro proteases are vital in virus replication and hence considered a promising drug target. Structural and mechanistic information will help to design potent and selective inhibitors of 3CLpro and PLpro that will eventually be implicated to address COVID-19 as these proteases are indispensable for virus assembly and replication . The unexpected emergence of COVID-19, have underscored the urgent need for effective preventive and therapeutic measures for the development of antiviral therapy (Kumari et al., 2020) . To address the critical health challenge of COVID-19, many therapeutic targets for the development of vaccines and drugs are under the explorative stage Wrapp et al., 2020) . COVID-19 is difficult to control because of heterogeneous member's periodical cycle and significant overlap with human and wild animal ecologies. Currently, there are no approved specific antiviral therapies available for SARS-CoV-2. Several attempts have been made using approved antivirals drugs (ribavirin and lopinavir-ritonavir) and immunomodulators (corticosteroids, interferons, etc.) (Grein et al. 2020; Li et al., 2020; Morse et al., 2020; Wang et al., 2020) . However, these approaches are not much effective in the case of COVID-19 because of significant differences in the surface, structural, and enzymatic proteins of SARS-COV-2 in comparison to the SARS-CoV and MERS-CoV (Grein et al. 2020; Kumari et al., 2020) . Design and development of SARS-COV-2-specific direct-acting antiviral drugs could be potentially obtained by targeting conserved enzymes such as 3 C-like protease, papain-like protease, non-structural protein 12 (nsp12), and RNA-dependent RNA polymerase (Zumla et al., 2016; Tang et al., 2020) . The main protease (M pro )is a key enzyme that plays a critical role in the SARS-CoV-2 life cycle as this enzyme influence the process of viral replication and transcription (Jin et al.; Naqvi et al., 2020) , making it an attractive target for antiviral drugs discovery (Ahn et al., 2020) . Here, we have used SARS-CoV-2 M pro as a target enzyme to facilitate the rapid search of antiviral compounds with clinical potential in the therapeutics of COVID-19. Recently, the three-dimensional structure of SARS-CoV-2 M pro has been determined which is highly similar to that of the SARS-CoV M pro , with a 96% sequence identity with the RMS deviation of 0.53 Å ). Although, a minor variation in the sequence and structure of M pro from both the viruses, a significant difference was noticed in the binding pattern and inhibition of M pro by inhibitors designed for SARS-CoV-2. Hence, approved protease inhibitors such as disulfiram, lopinavir, and ritonavir have been reported to be active against SARS and MERS CoVs, but are ineffective in the case of SARS-CoV-2 (Kandeel & Al-Nazawi, 2020) . The bound inhibitor showed the formation of several interactions with the binding pocket residues. The N atom of lactam moiety, N atom of amide functional group of the neighboring His164, and O atom of benzyloxy group formed hydrogen bonds with His163, His164, and Gly143, respectively. The alkyl moiety of leucinamide subunit of N3-ILP is surrounded by the hydrophobic side chain of His41, Met49, Tyr54, Met165, and Asp187 of M pro . The solvent-exposed area of the Val side chain of the inhibitor indicated that this site could be substituted by a wide range of functional groups. The Ala side chain N3-ILP was surrounded by the side chains of Met165, Leu167, Phe185, Asp192, and Asp189, which formed a hydrophobic pocket. P5 makes van der Waals contacts with P168-A and the backbone of residues 190-191, Figure 1 . Here, the structural features of already known peptide-like N3-ILP inhibitor of SARS-COV-2 M pro enzyme have been employed to generate a small library of similar compounds to gain a short-term and specific solution to treat COVID-19 patients Ton et al., 2020) . To address this general challenge, we developed an integrated approach of drug discovery using pharmacophore modelling, screening of PubChem library, molecular docking, and molecular dynamics (MD) simulation to find potential preclinical leads for the therapeutic management of COVID-19 . This strategy will provide a route-map for tailoring the antiviral inhibitors that might help in assisting and developing novel SARS-CoV-2 M pro lead compounds to be implicated in the treatment of COVID-19 patients. The crystal structure of SARS-CoV-2 M pro in complex with a peptide-like inhibitor was retrieved from the Protein Data Bank (PDB ID: 6LU7) (Jin et al., 2020) . The structure of the enzyme was pre-processed, minimized, and refined using the Protein Preparation Wizard implemented (Madhavi Sastry et al., 2013) in Schr€ odinger suite (Schr€ odinger Release 2020-1: Protein Preparation Wizard; Epik). This involved eliminating crystallographic waters, missing hydrogens/side-chain atoms were added to assign appropriate charge and protonation state to relieve the steric clashes among the residues. The OPLS-2005 force-field was used for the energy minimization using a root mean square deviation (RMSD) cut-off value of 0.30 Å. The preparation of the ligand, N3 peptide-like inhibitor (N3-IPL) and the studied compounds were performed using LigPrep (Schr€ odinger Release 2020-1: LigPrep) module of Schrodinger Suite which performs addition of hydrogens, adjusting realistic bond lengths and angles, ionization states, correct chiralities, stereo chemistries, tautomers, and ring conformations . Partial charges were assigned to the structures using the OPLS-2005 (Harder et al., 2016) force-field, and the resulting structures were subjected to energy minimization until their average RMSD reached 0.001 Å. Epik ionization tool (Schr€ odinger Release 2020-1: Epik) was used to set the ionization state at physiological pH ¼ 7.4. We have filtered and retrieved 409 compounds from the PubChem database considering Lipinski's rule of five (Lipinski, 2004) . All the compounds were selected based on 80% structural similarity and the main scaffold of N3-inhibitor and subsequently considered for further virtual screening analysis. For the structure-based pharmacophore modeling, PHASE module (Dixon et al., 2006) implemented in Maestro 11.6 was used with the default set of six chemical features: hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobic contacts (H), negative ionizable (N), positive ionizable (P), and aromatic ring (R) to construct the most representative features of the M pro active sites. The five 3 D-features were generated using Hypothesis Generation for Energy-Optimized structure-based pharmacophores considering the excluded volumes within 5 Å of refined ligand for the enzyme (Loving et al., 2009; Salam et al., 2009) . Pharmacophore features were selected based on essential interaction contacts with the key residues of the enzyme accommodated the inhibitor. The resulted pharmacophore features contain the functional groups involved in their bioactivity of a targeted enzyme. The Excluded volumes include all atoms within 5 Å of the refined ligand for the target. The obtained five 3 D-pharmacophore features were exported and set as a reference for PHASE-based virtual screening to screen the library of 409 compounds from the PubChem database which was retrieved and filtered with 80% structural similarity to the known N3-Inhibitor of M pro enzyme (Dixon et al., 2006) . Out of 409 compounds, 171 were generated based on the highest PHASE screen score and matched ligand sites. Both the quantity and quality of a similar feature was taken into account in the Phase-Screen-Score factor. Molecular-docking-based virtual screening was performed using the Glide-Based virtual screening workflow of Maestro 11.6 to identify suitable compounds that strongly bind to M pro enzyme (Halgren et al., 2004) . The receptor grid was generated as center coordinates (X ¼ À10.81 Y ¼ 12.41 Z ¼ 68.93) using two cubical boxes having a common centroid to organize the calculations: a larger enclosing and a smaller binding box with dimensions of 24 Â 24 Â 24 Å and 32 Â 32 Â 32 Å, respectively. The grid box was centered on the centroid of the ligand in the complex, which was sufficiently large to explore a larger region of the enzyme structure. The ligands were docked using three docking protocols which starts with "High throughput Virtual Screening" (HTVS) followed by "Standard Precision" (SP) and then by "Extra-Precision" mode (XP). Finally, 171 input compounds were evaluated using Docking-Based Virtual Screening and filtered to the final 20 compounds based on the docking scores and XP-GScores. The QikProp 5.6 module (QikProp) of Schrodinger was used to predict absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of the considered compounds to generate the ADMET related descriptors. This protocol predicts significant physicochemical and pharmacokinetic-based descriptors of the compounds based on Lipinski's rule of five (Lipinski, 2004) . ADMET properties of the top five compounds and N3-Inhibitor were assessed and analyzed using QikProp 5.6 module and these compounds were designated for the final molecular dynamic (MD) simulations analysis. To understand the physical basis of the structure and function of biological macromolecules, MD simulation substantiates to be an essential approach (Amir et al., 2020; Beg et al., 2019; Dahiya et al., 2019; Gulzar et al., 2019; Gupta et al., 2019; Gupta et al., 2019) . This technique assists in discovering the structural dynamics and how it is coupled to the biomolecular function of the enzyme (McCammon et al., 1977; Karplus & McCammon, 2002; Arnittali et al., 2019) . AMBER 18 package was used to execute 100 ns MD simulations on all the prepared complexes using (Graphics Processing Unit) GPU accelerated version of Partial Mesh Ewald Molecular Dynamics (PMEMD) simulations . The ff99SB (Hornak et al., 2006) and the general AMBER force fields (GAFF) (Wang et al., 2006) were employed to parametrize the enzyme and the considered ligands using LEaP module implemented in Amber 18 (D.A. Case). The ANTECHAMBER (Wang et al., 2001) module was used to assign atomic partial charges for the ligands employed in General Amber Force-Field (GAFF) (Wang et al., 2004) . The system was solvated using the TIP3P (Harrach & Drossel, 2014 ) explicit water in a cubic box with an 8 Å box edge. The Na þ counter ions were added randomly to neutralize the complex. The partial Mesh Ewald (PME) (Harvey & De Fabritiis, 2009 ) method was used to account for the long-range electrostatic forces using a cutoff of 12 Å, and the SHAKE algorithm (Ryckaert et al., 1977) was used to constrain all the hydrogen atom bonds. Energy minimizations were performed in two stages with 2500 steps of steepest descent minimization followed by 2500 of the conjugated gradient to remove the bad contacts. The first stage was followed with a harmonic restraint of 500 kcal mol-1 A-2 on the solute molecule whereas, ions and water molecules were relaxed. In the second stage of minimization, the restraints were removed and the whole system was relaxed. Each minimized complex was then gradually heated up from 0 K to 300 K for 200 ps to keep the solute using a weak harmonic restraint of 10 kcalmol-1 A-2. The 50 ps density equilibration with weak restraints followed by the 500 ps constant pressure equilibration at 300 K were performed at constant pressure using Berendsen barostat (Lin et al., 2017) . Ultimately, the production phase of 100 ns MD simulation was performed on all the complexes at a constant temperature of 300 K and constant pressure at 1 atm. The atomic coordinates of enzymes bound with the inhibitors were further saved after every 1 ps and the trajectory curves were calculated using the CPPTRAJ module integrated into AMBER 18 package (Roe & Cheatham, 2013) . The root means square deviation (RMSD) of C a atoms, root means square fluctuation (RMSF) of each residue in the complex, a radius of gyration (R g ), solvent accessible surface area (SASA), intramolecular and intermolecular hydrogen bond formation and thermodynamic calculations of all the systems were calculated. Origin software was used for MD trajectories analysis (Janert, 2009 ). The relative binding free energies were calculated using Molecular Mechanics/Generalized Born Surface Area (MM/ GBSA) binding free energy . All water molecules and counterions were stripped using the CPPTRAJ module. The binding free energies (DG bind ) were calculated with the MM/GBSA method for each system as below: The free energy term, DG bind is computed using the following equations: Where, The gas-phase energy (DE gas ) is the sum of the internal (E int ), van der Waals (E vdW ) and Coulombic (E elec ) energies, (Eq. 4). The solvation free energy is the combination of polar (G GB ) and nonpolar (G nonpolar , solvation) contributions (Eq. 5). The polar solvation G GB contribution was calculated using the Generalized Born (GB) solvation model with the dielectric constant 1 for solute and 80.0 for the solvent (Onufriev & Case, 2019) . However, the nonpolar free energy contribution was estimated using Eq. 6, where the surface tension proportionality constant, c, and the free energy of nonpolar solvation of a point solute, b, were set to 0.00542 kcal mol À1 Å À2 and 0 kcal mol À1 , respectively. SASA was calculated by a linear combination of the pairwise overlap (LCPO) model. The crystal structure of SARS-CoV-2 M pro (PDB ID: 6LU7) was extensively analyzed to understand their binding affinity, mode of binding and interacting residues (Figure 1 ). Structure analysis revealed the existence of 8 hydrogen bonds offered by Gly143, His163, His164, Glu166, Gln189 and Thr190 of M pro with the N3-Inhibitor-Like-Peptide (N3-ILP) . The structure of N3-ILP was selected to screen the PubChem database. 409 compounds were filtered and collected based on 80% structural similarity to N3-ILP and Lipinski's rule of five (Lipinski, 2004) from the PubChem database (Kim et al. 2016) . The screened 409 compounds are listed in Table S1 . Structure-based pharmacophores derived from the threedimensional structure of a target protein provide detailed and accurate information on ligand binding (Langer, 2010) . The commonly used descriptors in the pharmacophore modeling are H-bond acceptors, H-bond donors, positive and negative ionizable groups, lipophilic regions and aromatic rings. The best 3 D structure-based e-pharmacophores (Loving et al., 2009; Salam et al., 2009) were generated using the receptor-ligand pharmacophore generation protocol implemented in PHASE (Dixon et al., 2006) , based on the crystal ligand inside the active site and residues involved in ligand binding. The generated e-pharmacophore of the considered enzyme showed five main 3 D-features including, Hbond acceptor, H-bond donor, and aromatic rings. In each pharmacophore model, the red arrows represent hydrogen bond acceptor, the blue arrow represents hydrogen bond donor and the orange spheres represent an aromatic ring. The 3 D pharmacophore features and 2 D-chemical structure of N3-ILP are illustrated in Figure 2 showing five 3-D pharmacophore features as three hydrogen-bond donor, one hydrogen-bond acceptor and one aromatic ring sphere. The obtained structure-based pharmacophore hypotheses were used to screen the PubChem database to explore new compounds. These compounds were screened considering the PHASE screen score and matched ligand site factors. A total of 409 compounds passed this filter and subsequently, 171 compounds were retrieved based on the created pharmacophore hypothesis. Molecules that have satisfied all the features of the pharmacophore model were considered as potential hits. The virtual screened compounds are presented in Table S2 . A total of 171 compounds obtained from virtual screening were docked using the Glide module of the Schr€ odinger package (Schr€ odinger Release 2020-1: Maestro) into the active site of M pro . A stepwise filtering protocol was used, in the first stage, compounds were docked using HTVS where a total of 157 hits were obtained, Table S3 . These compounds were further docked with Glide SP where a total of 81 hits were obtained (Table S4) . Afterward, the hits from the previous step were subjected to Glide XP docking and only one pose per ligand was retained. Finally, a total of 20 hits were obtained as shown in Table S5 . Among the 20 hit compounds, the known inhibitor N3-ILP (EC 50 : 16.77 mM) ( Figure S1 ) and top five compounds with PubChem IDs 54456426, 54152887, 54035018, 91366909, 57076946 with Glide GSCORE (XP SCORE) was estimated as À4.65 kcal/mol, À9.05 kcal/mol, À9.01 kcal/mol, À8.95 kcal/ mol, À8.80 kcal/mol and À8.70 kcal/mol indicating a strong binding affinity. The docking based RMSD values for N3-ILP and other top-five compounds were 2.77 Å, 2.64 Å (54456426), 2.68 Å (54152887), 2.02 Å (54035018), 2.72 Å (91366909) and 2.81 Å (57076946) respectively. The criteria for selecting these top five compounds were based on the binding scores (above À8.50 kcal/mol), lowest RMSD values and most promising binding interactions generated after molecular docking calculations. The theoretical binding affinities of identified hits are better than well-known inhibitor N3-ILP, indicating potential future clinical use of these preclinical leads. The number of hydrogen-bonded interactions is a key factor for tighter and specific binding affinity. The interacting binding residues of M pro enzyme forming noncovalent interactions with compounds 54035018, 54152887, 54456426, 57076946, and 91366909 are illustrated in Figure 3A and B. Compound 54035018 interacts with the binding site residues, Ser144, Glu166, Thr190 and Gln192 through four hydrogen bonds. Compound 54152887-M pro formed similar four hydrogen bonds to Ser144, Cys145, Glu166, and Thr190. It also forms a pi-pi stacking with His41. The complexes 54456426-M pro , 57076946-M pro , 91366909-M pro are stabilized by H-bonded interactions offered by His41, Leu141, Asn142, Ser144, Glu166 and His164. Pharmacokinetic and toxicity properties were predicted using the QikProp module of Schrodinger (QikProp) for N3-ILP, 54035018, 54152887, 54456426, 57076946 and 91366909 compounds. The results of pharmacokinetic and toxicity properties analysis are presented in Table 1 . The selected properties are representatives of influence metabolism, cell permeation, bioavailability, and toxicity. The predicted central nervous system activity (CNS) of all the molecules depicted as inactive. The recommended binding to human serum albumin (QPlogKhsa), all compounds showed in the acceptable range, although the N3-ILP showed slight out of range. The estimated total solvent accessible surface area (SASA) of all PubChem compounds met the acceptable range: 300-1000 whereas the known N3-ILP showed out of range. The predicted octanol/water partition coefficient (QPlogPo/w) showed in the acceptable range from À2 to 6.5. The predicted aqueous solubility (QPlogS) and predicted brain/blood partition coefficient (QPlogBB) for all these compounds showed in the acceptable ranges whereas the known inhibitor was out of range. The percentage of human oral absorption for all the compounds met in the recommended range. Some violations of Lipinski's rule of five for all the PubChem compounds significantly satisfied this rule with zero value than known inhibitor with the rule of five equal to two. Alternations inside the enzyme structure are directly associated with their biological activities. Any changes or disruption on enzymes' structural integrity might have a significant influence on its activity . The binding of small molecule inhibitors affects the mechanism of action of enzymes that are involved in disease pathways, thus there is a necessity to calculate the structural dynamics and conformational changes linked with the inhibitory activity of these inhibitors (Skjaerven et al., 2011) . The calculation of a time variable concerning an RMSD of C a atoms from produced trajectories was performed to discover the constancy and efficacy of the simulated M pro in complex with N3-ILP and along with the top five hits. The 2 D structure of the top five PubChem compounds and N3-ILP-inhibitor are presented in ( Figure S2 ). The perturbations in the RMSD values as indicated in the plot ( Figure 4A) during simulation time revealed a probable conformational deviation in the structure of the enzyme upon inhibitor binding. As Figure 4A revealed, all the systems were stabilized and achieved convergence after almost 30 ns of the simulation run. 54035018-M pro exhibited the lowest average RMSD of 2.03 Å, while 54152887-M pro and 54456426-M pro demonstrated the average RMSD of 2.45 Å and 2.33 Å respectively. The N3-ILP-M pro indicates a perturbation of 2.50 Å as shown in the plot. This analysis suggests that any other investigations made on the generated trajectories of all models were reliable. The RMSD plots suggest that 54035018-M pro , 54152887-M pro , and 54456426-M pro complexes exhibit the lowest divergence of C a backbone atoms indicating that binding of these three compounds imposed higher stability on M pro enzyme relative to N3-ILP-M pro complex. This could be implicated that the overall inhibitory activity of M pro by 54035018, 54152887, and 54456426 compounds was illustrated by the stabilization of its structural conformation. The stability of compounds 54035018, 54152887, and 54456426 was further retrieved during the simulation time as their corresponding activities might inform the respective interactions that they cause with active site residues. To further validate our results, we have calculated R g value, a parameter directly associated with the overall conformational changes in the structure of the enzyme upon ligand binding. It also reveals the structure stability, compactness and folding behavior (Fakhar et al., 2017) . We measured the compactness of all the selected compounds and the reference complex by calculating their R g values. The average values of R g for 54035018-M pro , 54152887-M pro , and 54456426-M pro complexes were noted to be 38.87 Å, 40.39 Å, and 43.68 Å, respectively. Figure 4B plots revealed a slight alteration in the compactness of the three compounds. The compound showed the lowest R g as compared to the other two complexes and also with the control N3-ILP-M pro complex (41.26 Å), thus suggesting increased compactness and better binding with the M pro enzyme. All these patterns of conformational evaluation are indicating greater stability, flexibility, and compactness of compound 54035018 with the M pro enzyme. After analyzing the conformational binding, we have also explored SASA to determine the function of hydrophobic and hydrophilic residues and forces exposed to the solvent during the simulation time (Khan et al., 2018) . The fast and precise calculation of SASA is very beneficial in the energetic Table1. In silico ADMET predictions of the selected compounds. À2 0.13 668.87 4.14 À5.10 À1.50 96.62 0 a Predicted central nervous system activity from -2 (inactive) to þ2 (active). b Prediction of binding to human serum albumin (acceptable range: -1.5-1.5). c Total Solvent Accessible Surface Area: SASA (acceptable range: 300-1000). d Predicted octanol/water partition coefficient (acceptable range: À2-6.5). e Predicted aqueous solubility, S in mol/dm À 3 (acceptable range: À6.5-0.5). f Predicted brain/blood partition coefficient (acceptable range: -3.0 -1.2). g Predicted percentage human oral absorption (<25% is poor and >80% is high). h Number of violations of Lipinski's rule of five, Compounds that satisfy these rules are considered druglike (maximum 4). evaluation of biological molecules. The interactions between the hydrophobic native contacts within the enzyme structure is a significant intermolecular interaction that influences enzyme inhibition. Hydrophobic interaction produced among the non-polar residues confirms the stability of the enzyme structure in solution by protecting the non-polar residues inside the hydrophobic core distant from an aqueous solution (Chen & Panagiotopoulos, 2019; Gupta et al., 2020) . As shown in Figure 5A , an average SASA for all the selected compounds has been calculated during the 100 ns MD simulation run. The average value of SASA for the compound 54035018-M pro complex was 13745 Å 2 which was exposed to the solvent system. A total mean of SASA of 14283 Å 2 and 14439 Å 2 was noted by 54152887-M pro complex and 54456426-M pro complexes, respectively. The variations in SASA values for all the complexes during the simulation time resemble the folding and unfolding of an enzyme. The overall SASA values in the control complex were 14009 Å 2 , higher than the 54035018-M pro complex. The SASA estimation observed in compound 54035018 bound complex further confirmed that this compound has increased exposure to solvent and subsequently preferred the increased inhibitory activity of compound 54035018 over other complexes. The flexible or rigid residues of a protein contribute specifically to the biological function of the target enzyme in different biological pathways. Thus, the binding of inhibitors to the enzyme may be analyzed through the change in flexibility in terms of RMSF values (Mart ınez, 2015) . To discover the rigidity and flexibility of residues in M pro upon binding of the selected compounds, RMSF values for C a atoms were calculated from trajectories generated over 100 ns of MD simulations production run. As shown in Figure 5B and C, the 54035018-M pro complex exhibited the least fluctuations in the residues with 11.12 Å. An average RMSF of 15.06 Å and 15.29 Å were noticed in complex 54152887-M pro and 54456426-M pro , respectively. The complex, N3-ILP-M pro revealed an average of 11.25 Å that is slightly greater than 54035018-M pro complex, suggesting a better binding in comparison to the N3-ILP-M pro complex. This significant decrease could be associated with structural inactivation that is ensured as an outcome of the influential binding of this compound in the catalytic site of the M pro enzyme. The For the overall conformation and stability of enzyme structure, we have assessed intramolecular and intermolecular hydrogen bond analysis ( Figure 6 ). This analysis gives deeper insights into the binding mechanism of enzyme-ligand with specific consideration. An average number of intramolecular hydrogen bonds in 54035018-M pro and 54152887-M pro complexes were observed to be 135 and 141, respectively as shown in Figure 6A . As we have mentioned in the residue interaction map analysis these complexes contributed greatly to the binding throughout the simulation period. The number of intermolecular hydrogen bonds formed in the active site of the M pro enzyme noted to be 2 and 3 hydrogen bonds in 54035018-M pro and 54152887-M pro complexes and between 3-4 hydrogen bonds in 54456426-M pro and 57076946-M pro complexes with higher fluctuations Figure 6B . To understand the intermolecular interactions between ligand with the active site residues, we have calculated the ligand interaction maps ( Figure 7A and B). Compound 54035018 forms two direct hydrogen bonds with the polar residues Thr26 and Asn142 with OH and NH group suggestive of a stable interaction. However, hydrogen bond formation of Asparagine was noted to be absent in other complexes. Another polar residue Gln192 formed two hydrogen bonds with the hydroxyl functional group of compound 54152887. Gln166 also formed a hydrogen bond with the hydroxyl group of 54152887 compound and residue His41 formed a p-p stacking within the complex. Compounds 54456426 and 57076946 also contributed significantly to the hydrogen bond network by forming major hydrogen bonds between the polar (Gln19, Thr26, Thr190), negatively charged (Gln166) and hydrophobic (Cys145) amino acid residues indicative of a stabilized binding with the enzyme. His41 forms a p-p stacking in compound 54456426 and 91366909. Compound 91366909 formed only one hydrogen bond with the negatively charged Gln166 residue, thus suggestive of weak binding. The hydrogen bond network plays an important role in the drug-binding mechanism, thus Thr amino acid residue contributed notably in the binding of the M pro enzyme as shown in Figure 7 . Based on the conformational behavior of bound complexes noted in RMSD, RMSF, and R g , we have calculated the secondary structure of 54035018-M pro complex and apo form of M pro enzyme as it is essential to observe the changes in enzyme structure without the inhibitor ( Figure 8 ). As shown in Table 2 , the average number of structural components are higher in the inhibitor bound M pro complex as compared to the apo form of the M pro enzyme. A slight increase was observed in the a-helix of the M pro apoenzyme. The b-strands and 3 10 -helix are marginally high in 54035018-M pro complex indicative of a stable binding. However, no major change was observed in the secondary structure of the M pro enzyme upon binding of the compound which shows strong stability, flexibility, and compactness of 54035018-M pro complex. The secondary structural analysis characterized here might provide useful conformational information of M pro which leads to the development of SARS-CoV-2 M pro inhibitors. We presume that the design and development of selective inhibitors of M pro enzyme using rational approaches may pave new routes in antiviral therapeutics. The thermodynamic energy contribution of an inhibitor to the overall binding free energy of the complex corresponds to the structural stability of the inhibitor in the catalytic site of the enzyme. The intermolecular interactions in the catalytic/allosteric site residues participate considerably in the stability, binding affinity, and selectivity of the inhibitor. Thus, it was necessary to examine the binding affinity of all the selected compounds towards M pro enzyme using MM/ GBSA technique to establish the impact of the studied compounds. The results are presented in Table 3 . As estimated free binding energy (DG bind ) of M pro complex was calculated as the highest energy with an average value of À37.40 kcal/mol relative to 54152887-M pro and 54456426-M pro complexes with total mean values of À37.18 and À24.79 kcal/mol. The overall binding energy of control complex N3-ILP-M pro appeared to be lower (-30.89 kcal/mol) than the 54035018-M pro complex suggesting a better and improved binding of compound 54035018 to its target enzyme. We further estimated other components of the free binding energy associated with enzyme-inhibitor binding (Table 3) . We observed that the intermolecular van der Waals and electrostatic energies in complex 54035018-M pro were favorable with average values of À44.79 and À31.73 kcal/mol whereas the gas-phase energy was higher (-56.21 kcal/mol) in complex 54152887-M pro . There is a slight difference of 0.22 kcal/mol among the DG bind energies of 54035018-M pro and 54152887-M pro complexes. The DG solvation contributed to the unfavorable binding of compound 54035018 as it is the least energy with a value of 25.01 kcal/mol among all complexes. Although the systematic movement of compound 54035018 and 54152887 from the solvent phase to the active site of M pro stimulated van der Waals and electrostatic interactions with the catalytic residues, these interactions are not enough to contribute fully in the binding of M pro enzyme as DG bind contributes to enhanced binding of these compounds. This analysis suggests that compound 54035018 bounds to M pro enzyme with greater affinity in comparison with other complexes. The necessity to control the emerging COVID-19 pandemic made us develop, a strategy to discover lead compounds that might be used in clinical trials to provide rapid success. Despite major efforts in the design and development of specific drugs or vaccines, not much proven to be efficacious against this SARS-CoV-2 infection. This curiosity leaves us to explore the drug designing approaches that could serve valuable to combat this disease. In this report, we have performed structure-based drug designing, virtual drug screening, and all-atom MD simulation approaches to discover highly selective compounds that possess significant binding affinity and presumably inhibition of the SARS-CoV-2 M pro enzyme. We have identified the five best compounds, however, only three have been discussed in detail as they have shown favorable binding energy against M pro enzyme. 54035018-M pro complex exhibited most stable, flexible, and compact in all the complexes with the highest notable secondary structure elements and binding energy (DG bind À37.40 kcal/mol). Based on our overall observations, compound 54035018 could be recommended as a potential lead for the therapeutic management of COVID-19 patients after required clinical validations. Disclosure statement Preliminary Identification of Potential Vaccine Targets for the COVID-19 SARS-CoV-2) Based on SARS-CoV Immunological Studies Current Status of Epidemiology, Diagnosis, Therapeutics, and Vaccines for Novel Coronavirus Disease 2019 (COVID-19) Virtual highthroughput screening of natural compounds in-search of potential inhibitors for protection of telomeres 1 (POT1) Structure Of Biomolecules Through Molecular Dynamics Simulations. Procedia Computer Science SARS-CoV-2 mediated lung inflammatory responses in host: Targeting the cytokine storm for therapeutic interventions Diagnostic approaches in COVID-19: Clinical updates Molecular Basis of Pathogenesis of Coronaviruses: A Comparative Genomics Approach to Planetary Health to Prevent Zoonotic Outbreaks in the 21st Century High throughput screening, docking, and molecular dynamics studies to identify potential inhibitors of human calcium/calmodulin-dependent protein kinase IV Coronavirus Disease 2019: Coronaviruses and Blood Safety Molecular Modeling of Surfactant Micellization Using Solvent-Accessible Surface Area Molecular interaction studies on ellagic acid for its anticancer potential targeting pyruvate dehydrogenase kinase 3 PHASE: A new engine for pharmacophore perception, 3D QSAR model development, and 3D database screening: 1. Methodology and preliminary results Anthocyanin derivatives as potent inhibitors of SARS-CoV-2 main protease: An insilico perspective of therapeutic targets against COVID-19 pandemic Differential flap dynamics in l,d-transpeptidase2 from mycobacterium tuberculosis revealed by molecular dynamics Impact of Hydroxychloroquine/Chloroquine in COVID-19 Therapy: Two Sides of the Coin Extra Precision Glide: Docking and Scoring Incorporating a Model of Hydrophobic Enclosure for Protein À Ligand Complexes Compassionate Use of Remdesivir for Patients with Severe Covid-19 Binding mechanism of caffeic acid and simvastatin to the integrin linked kinase for therapeutic implications: A comparative docking and MD simulation studies 2020). Identification of Potential Inhibitors of Calcium/Calmodulin-Dependent Protein Kinase IV from Bioactive Phytoconstituents Evaluation of binding and inhibition mechanism of dietary phytochemicals with sphingosine kinase 1: Towards targeted anticancer therapy Glide: A New Approach for Rapid, Accurate Docking and Scoring. 2. Enrichment Factors in Database screening OPLS3: A Force Field Providing Broad Coverage of Drug-like Small Molecules and Proteins Structure and dynamics of TIP3P, TIP4P, and TIP5P water near smooth and atomistic walls of different hydroaffinity An Implementation of the Smooth Particle Mesh Ewald Method on GPU Hardware Comparison of multiple Amber force fields and development of improved protein backbone parameters Gnuplot in Action: Understanding Data with Graphs Structure of M(pro) from COVID-19 virus and discovery of its inhibitors Virtual screening and repurposing of FDA approved drugs against COVID-19 main protease Molecular dynamics simulations of biomolecules Distort the Binding of PTP1B Enzyme: A Novel Therapeutic Approach for Cancer Treatment Reversible versus irreversible inhibition modes of ERK2: A comparative analysis for ERK2 protein kinase in cancer therapy PubChem Substance and Compound databases Potential diagnostics and therapeutic approaches in COVID-19 Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and corona virus disease-2019 (COVID-19): The epidemic and the challenges Pharmacophores in Drug Research GPU-Accelerated Molecular Dynamics and Free Energy Methods in Amber18: Performance Enhancements and New Features Updated approaches against SARS-CoV-2 Application of Berendsen barostat in dissipative particle dynamics for nonequilibrium dynamic simulation Lead-and drug-like compounds: the rule-of-five revolution Energetic analysis of fragment docking and application to structure-based pharmacophore hypothesis generation Protein and ligand preparation: Parameters, protocols, and influence on virtual screening enrichments Automatic Identification of Mobile and Rigid Substructures in Molecular Dynamics Simulations and Fractional Structural Fluctuation Analysis Dynamics of folded proteins A SARS-like cluster of circulating bat coronaviruses shows potential for human emergence Identification of high-affinity inhibitors of SARS-CoV-2 main protease: Towards the development of effective COVID-19 therapy Learning from the Past: Possible Urgent Prevention and Treatment Options for Severe Acute Respiratory Infections Caused by 2019-nCoV Insights into SARS-CoV-2 genome, structure, evolution, pathogenesis and therapies: Structural genomics approach Generalized Born Implicit Solvent Models for Biomolecules Potential Rapid Diagnostics, Vaccine and Therapeutics for 2019 Novel Coronavirus PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data Numerical integration of the cartesian equations of motion of a system with constraints: Molecular dynamics of n-alkanes Novel Method for Generating Structure-Based Pharmacophores Using Energetic Analysis Schr€ odinger Release Schr€ odinger Release 2020-1: Protein Preparation Wizard Glecaprevir and Maraviroc are highaffinity inhibitors of SARS-CoV-2 main protease: Possible implication in COVID-19 therapy Perspectives on monoclonal antibody therapy as potential therapeutic intervention for Coronavirus disease-19 (COVID-19) Dynamics, flexibility and ligand-induced conformational changes in biological macromolecules: A computational approach Coronavirus membrane fusion mechanism offers as a potential target for antiviral development Rapid Identification of Potential Inhibitors of SARS-CoV-2 Main Protease by Deep Docking of 1.3 Billion Compounds End-Point Binding Free Energy Calculation with MM/PBSA and MM/ GBSA: Strategies and Applications in Drug Design Antechamber: An accessory software package for molecular mechanical calculations Automatic atom type and bond type perception in molecular mechanical calculations Development and testing of a general amber force field Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved alpha-ketoamide inhibitors A Genomic Perspective on the Origin and Emergence of SARS-CoV-2 SARS-CoV-2 viral load in upper respiratory specimens of infected patients Coronaviruses -drug discovery and therapeutic options