key: cord-0712065-qidvjbwu authors: Mehmood, Aamir; Nawab, Sadia; Wang, Yanjing; Chandra Kaushik, Aman; Wei, Dong-Qing title: Discovering potent inhibitors against the Mpro of the SARS-CoV-2. A medicinal chemistry approach date: 2022-01-26 journal: Comput Biol Med DOI: 10.1016/j.compbiomed.2022.105235 sha: 899916f7f61d041cbfdaed48f07e0dce788637b0 doc_id: 712065 cord_uid: qidvjbwu The global pandemic caused by a single-stranded RNA (ssRNA) virus known as the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is still at its peak, with new cases being reported daily. Although the vaccines have been administered on a massive scale, the frequent mutations in the viral gene and resilience of the future strains could be more problematic. Therefore, new compounds are always needed to be available for therapeutic approaches. We carried out the present study to discover potential drug compounds against the SARS-CoV-2 main protease (Mpro). A total of 16,000 drug-like small molecules from the ChemBridge database were virtually screened to obtain the top hits. As a result, 1032 hits were selected based on their docking scores. Next, these structures were prepared for molecular docking, and each small molecule was docked into the active site of the Mpro. Only compounds with solid interactions with the active site residues and the highest docking score were subjected to molecular dynamics (MD) simulation. The post-simulation analyses were carried out using the in-built GROMACS tools to gauge the stability, flexibility, and compactness. Principal component analysis (PCA) and hydrogen bonding were also calculated to observe trends and affinity of the drugs towards the target. Among the five top compounds, C1, C3, and C6 revealed strong interaction with the target's active site and remained highly stable throughout the simulation. We believe the predicted compounds in this study could be potential inhibitors in the natural system and can be utilized in designing therapeutic strategies against the SARS-CoV-2. The current outbreak of the severe acute respiratory syndrome known as coronavirus 2 (SARS-CoV-2) has been declared as a pandemic by the World Health Organization (WHO) and is still a severe threat to public health [1, 2] . It is observed phylogenetically that the SARS-CoV-2 is highly associated with the previous coronaviruses SARS-CoV and MERS-CoV, infecting millions of people all-inclusive [3, 4] . As of November 20, 2021 , the global metrics have confirmed 257,090,259 infections and 5,158,478 deaths caused by the SARS-CoV-2, which is an alarming upsurge in the number of infections if compared with the previous data [5] . Furthermore, the novel coronavirus has a lower mortality rate, but the transmission rate is relatively high when equated with the earlier coronaviruses [6] . The SARS-CoV-2 infections could be asymptomatic, mildly symptomatic, or highly symptomatic, depending on the physiological conditions of an individual. Usually, the symptoms involve high temperature, shortness of breath/difficulty in breathing, myalgia, coughing, and radiological signs of ground-glass lung opacity well-matched with anomalous pneumonia in most patients [7] [8] [9] . Upon infection, the person remains asymptomatic but could be a carrier and might infect another individual. Therefore, a 14-16 days isolation is advised upon contact with the infected area or person. The SARS-CoV-2 has a genome of ~30 kb that encodes the entire viral proteome, including structural, non-structural, and additional proteins. The structural proteins are nucleocapsid (N), matrix (M), spike (S), and small envelope (E) protein. Together with additional proteins, they are translated by 33.33% of the viral genome, while the remaining 66.66% of the genome, known as the ORF1a/b region, corresponds to the non-structural proteins [7] [8] [9] . The ORF1 (a and b) are the two non-structural proteins (polypeptides pp1a and pp1ab) encoding genes that form the RTC (replication transcription complex). The pp1a and pp1ab polypeptides, once translated, are then proteolytically cleaved by other two viral proteases, which are knowns as PLpro (papain-like protease) and 3CLpro (3-chymotrypsin-like protease) or main protease (Mpro). PLpro is responsible for cleaving non-structural proteins (nsp 1-3). At the same time, the 3CLpro bisects the polyprotein at 11 specific sites downstream of nsp4 to produce various non-structural proteins, which are vital for the viral cycle [10] . There have been announcements of the SARS-CoV-2 vaccines, while other researchers have reported drugs that could be proven effective. The vaccines announced could be effective in some regions and the drugs reported may or not reach the clinical trial. This is because such viral infections are pretty complicated, and they can quickly develop resistance to the drugs because of the fast mutation rate. Therefore, a new drug that could strongly bind with the virus and aid in the inhibition process is always needed. Computer-aided drug design has proven to be quite fruitful in designing therapeutic strategies. It has helped us construct or discover molecules that could effectively cope with the desired target. In the current COVID-19 pandemic, medicinal chemists and drug designers have attempted to target several of the above structural proteins for inhibition purposes. Among these structural targets, one of the proteins is the main protease (Mpro) [11, 12] . As mentioned above, it digests the pp1a and pp1b (L-Q* (S, A, G) (* is the cleave shows the cleavage spot)), which are essential polyproteins for the SARS-CoV-2 transcription and translation [13] . Therefore, the functional activity inhibition of the Mpro provides a reasonable chance of producing an J o u r n a l P r e -p r o o f effective drug against the SARS-CoV-2 infection [14] . The Mpro crystallographic structure has already been deposited in the protein databank in its apo form (PDB: 6M03). Its structural characterization reveals 96.1 % of sequence similarity to the previous SARS-CoV, containing a decidedly conserved manner of the catalytic binding spot. This protease has an active site arranged in the form of a homodimer where the binding grove exists in the two domains (domain one and domain two) [15] . Besides, several crystallographic structures have been reported for the COVID-19 Mpro coupled with various inhibitors, which ultimately provided a clue of hotspots and communicating residues as part of protease catalytic role in its leading protease spot and the inhibition process. The computer-aided drug design (CADD) approaches have shortened the timeline and lowered the expense to design drugs that could reach the clinical trial compared to those developed via traditional techniques [16] [17] [18] . Hence, we benefited from the reported crystal structure for the SARS-CoV-2 Mpro to carry out vanguard computational analysis for discovering ways of Mpro inhibition. The current work practices a multi-step in silico pipeline to discover drug-like compounds against the SARS-CoV-2 Mpro via virtual screening, molecular docking, and re-docking tailed by molecular dynamics simulation and free energy estimations. The outcomes obtained from these analyses could be of high importance for sure. They would be beneficial in designing an effective therapeutic strategy against the ongoing pandemic, aiming to save millions of lives. The SARS-CoV-2 crystal structure resolved via X-Ray diffraction approach at a resolution of 2.00 Å in its apo form (PDB ID: 6M03) was retrieved from the protein databank. We used the structure preparation tool in the Molecular Operating Environment (MOE_2018) for water striping, charge adjustments, and 3D protonation [19] [20] [21] . The structure was minimized as well via the minimization algorithm in MOE. The overall conformation was keenly observed to assure the lack of any unintended structural defects. The overall pipeline of this study is presented as a flowchart in There are billions of compounds, each having its features, which may or may not be similar to another compound in its physiochemical features and targets. Some might follow Lipinski's rule of five while others may not; however, every chemical compound has its own significance in medicinal chemistry. Computational molecular docking is an effective technique that enables us to see if a particular molecule will bind to the desired target or not in real-time. In the current work, we considered scanning ~16,000 drug-like small molecules from the ChemBridge database being docked into the Mpro active site that involves Thr24, Leu27, His41, Phe140, Cys145, His163, Met165, Pro168, and His172 as reported by Wu et al., 2020 (Fig 2) [22] . The docking parameters were slightly tweaked as the refinement was set to Induced Fit, and both the scoring algorithms were set to be London dG. Once the screening was completed, the top 1032 compounds were selected, having the docking score ranging from -11.99 to -6.30 kcal/mol. Next, to filter out the best possible hits, a criterion was set for each molecule to be selected for further evaluation. This standard was that the molecule must have a docking score of more than -10 kcal/mol and have a minimum of three interactions with the active site residues. Any additional interaction with the neighboring residue was considered a plus for the compound. Thus, a total of ten compounds were selected, having strong interaction with the Mpro active site residues. Once the top-ranked ten compounds were shortlisted, they were further subjected to the induced-fit docking via MOE using the same parameters [23] . However, the conformations were set to fifty because we wanted to observe the compound, which is more likely to interact with most of its conformations. Human physiological conditions are highly dynamic, and thus, compounds that can establish interactions with the target in most of its poses are unquestionably considered to be the more effective ones. The bioactivity of each top ligand was predicted by the Molinspiration Cheminformatics tool [24] , whereas SwissADME [25] was used for the ADMET analysis. Approximately 4500 studies predicted their bioactivity results via Molinspiration. We further examined the druglikeness of the selected compounds through OSIRIS Property Explorer [26] . Molecular dynamics (MD) simulation is a practical approach with numerous applications. In the present work, we employed MD simulation to examine the stability of our shortlisted compounds in the active site of the Mpro at different time intervals. Hence, the shortlisted compounds individually in a complex with the Mpro were subjected to MD simulation via GROMACS v.5.1 [27] . The All-atom OPLS force field [28] was employed in this case. The ligands topology is not defined in the GROMCS forcefield; therefore, a freely accessible server LigParGen [29] was used to generate the ligands' topology. This server takes the compounds' simplified molecular-input line-entry system (SMILES) as an input and provides both the coordinates (.gro) and topology (.itp) files for the ligand. A total of five holo simulations were conducted by the addition of an explicit flexible SPC water molecule fixed in a cubic box [30] , whereas the margins were placed ≥10 Å from all the protein atoms. The size and vectors of the box were set to be 4.256 × 4.061 × 4.142 nm, and 6.7 × 6.7 × 6.7 nm, respectively, whereas the box angles were equal to 90° on each side. Further we added Na+ to maintain a neutral system. Next, for the neutralization of the system, we minimized the solvated structures for 50,000 steps of steepest descent minimization that dismisses when the overall maximum force is <1000 KJ/mol/nm. A stable temperature and pressure of 300 K and 1 bar, respectively, with a timestep of 2 fs was kept to attain the equilibrium. For the heavy atoms, the LINCS (LINear Constraint SolVer) [31] constraints and non-bonded pair list were updated every ten steps under the position restraint conditions for the heavy atoms. Electrostatic interactions were calculated using the particle mesh Ewald method. The v-rescale (modified Berendsen thermostat) temperature coupling method [32] was used to maintain a constant temperature inside the box. All the simulations were carried out for a duration of 100 nanoseconds (ns). Upon successful completion of the simulation, each system's trajectory was subjected to post-simulation analysis, the stability calculation, residual level flexibility, and overall structural compactness. All the analyses were carried out using the built-in GROMCAS functions such as g_rms, g_rmsf, and g_gyrate. The protein structures were rendered in PyMOL [33, 34] , and the graphs were constructed and visualized in Origin [35] . The g_mmpbsa tool was used to use to compute the binding energy for all the proteinligand complexes [36] . The g_mmpbsa tool functions on the GROMACS generated datatypes (.tpr and .xtc), and estimates mainly three components of the binding energy that involves molecular mechanical energy, polar solvation energy, and apolar solvation energy. Molecular mechanical (MM) energy consisted of electrostatic contribution (Elec) and van der Waals (EvdW) contributions. To calculate polar solvation energy, g_mmpbsa relies on the Assisted Poison Boltzmann Solver (APBS) program. In case of the apolar solvation energy estimation, the Solvent Accessible Surface Area (SASA) model is used. Binding free energy estimation depends on the following equation. where indicates the total free energy of the binding complex , denotes the total free energies of the individual protein and shows the total free energies of the individual ligand. In addition, for each individual entity, the free energy can be given by: where " " represents the protein or ligand or protein-ligand complex. Indicates the average potential energy in a vacuum. The entropic contribution to the free energy in a vacuum is jointly denoted by temperature ( ) and entropy ( ), and the Gsolvation stands for the free energy of solvation. The sum of bonded as well as non-bonded interactions imply the vacuum potential energy ( ) , which is calculated on the base of forcefield parameters. = is bonded interactions comprising of bond, angle, dihedral, as well as improper interactions. Both electrostatic ( ) and van der Waals ( ) interactions encompass the non-bonded interactions − and are demonstrated using a Coulomb and Lennard-Jones (LJ) potential function, respectively. The free energy of solvation is well defined as the energy vital for transmitting a solute from a vacuum into a solvent. In the method of MM-GPBSA, an implicit solvent model is used for its calculation. The solvation free energy ( ) is defined as: where represents the electrostatic and − denotes the non-electrostatic contributions to the solvation-free energy. For the analysis of all the complexes' binding free energy, all frames that cover the period of 10 ns of the constant MD trajectories were utilized. J o u r n a l P r e -p r o o f The top ten compounds redocked into the active site of Mpro were carefully examined via the ligand interaction tool in MOE ( Table 1) . The objective was to search for compounds that show interactions with the active site residues in most of its poses. Compound 1 observed stronger interactions with the target, making two arene-H bonds with His41 and Pro168 and two polar side chain donors with His163 and Gln189. Compound 2 established three arene-H bonds with Leu141, Pro168, and Gln189 while side-chain donor interactions were observed for His41 and Met165. This compound also made an arene-H bond with the threonine residue at position 25 attached to its Cyclopentane ring in several of its poses. This compound was seen to be having good interactions in every orientation. Interactions formed by Compound 3 include two arene-H (Glu166, , and Pro168), polar side chain donor (Cys145, and Gln189), and a greasy side-chain acceptor (Met165) took part in the interaction. In the case of Compound 4, few poses were observed to be interacting. However, most of the conformations made zero interaction with the target, and thus this compound does not qualify our simulation standard. Interactions of Compounds 5, 7, 8 were confined to a particular site, such as residues in the 140s or 190s or just single irregular contacts; however, we noticed stronger affinity in the case of Compound 6 as it made five direct contacts with the key residues as well as one weaker interaction with the neighboring key amino acid Gln189. Compounds 9 and 10 had a presentable affinity towards the target, but we were concerned with selecting those with diverse interactions. Therefore, Compound 10 was chosen because it made three arene-H bonds with His41, Met165 and Gln189 while one greasy side-chain acceptor with Cys145. To conclude, Compounds 1, 2, 3, 6, and 10 (now called C1, C2, C3, C4, and C5, respectively) were finally selected as the top five compounds as a result of molecular docking analysis. The chemical structures, accession numbers and docking scores along with the RMSDs and binding energies of the top five (all the ten compounds) selected compounds are listed in Table 1, while the hydrogen bond lengths and energies made by the selected top five compounds are tabulated in Table 2 . The interactions established by these selected compounds are given in Fig 3, while the remaining five compounds that failed to qualify the selection criteria are visually produced in the additional information (Fig S1) , showing their interactions in 2D space. For validation purposes, the selected top compounds were subjected to various biophysical and chemical properties checkups. The bioactivity values of these compounds are tabulated in Table 3 , while physiochemical properties, such as lipophilicity, solubility, pharmacokinetics, and drug-like features are provided in the supporting information (Fig S2) . We observed that the selected compounds are safe as they fulfill all the criteria to be an active drug, and no rules are being violated by these compounds. Table 3 Bioactivity score of the top five compounds. Higher values represent the more active nature of a compound and vice versa. Additionally, we used the OSIRIS Property Explorer further to examine the selected compounds for their drug-likeness. We see that C2 has a higher mutagenic effect, but the rest of all the compounds are in the optimal range and safe to be used, including C2 (Table 4 ). The sub-structure fragments and their drug-likeness score for the selected compounds are given in the additional information (Fig S3) . Molecular dynamics (MD) simulation is a practical computational approach for understanding the stability and dynamics of an entity inside a system in real-time. We analyzed the last frame simulation trajectories of all the five hits using the built-in GROMACS commands to calculate the stability (RMSD) and flexibility (RMSF) of the Mpro and our proposed hits complexes. The compactness and unstable folding were evaluated by calculating the radius of gyration (Rg). The stability tests of the docked hits into the active site and overall systems stability were observed to be under control with acceptable deviations (Fig 4) . The average RMSD for all the systems ranged from 0.10 to 0.30 nm (1.0 to 3.0 Å). The RMSD for Compound 1 starts from 0.11 nm and goes up to 0.25 nm, experiencing minor drops till 20 ns from where it keeps on fluctuating till 50 ns reaching up to 0.25 nm. Here it shows the sudden decline and rise reaching 0.20 nm and maintains this stable position till the end of the simulation. Minor fluctuations can be observed between 20 to 40 and 40 to 60 ns, but the overall system remains stable, and no significant escalation can be seen. In the case of Compound 2, the RMSD keeps on elevating till 8.00 ns reach an altitude of 0.25 nm from where it rises and falls till 40 ns where it maintains the RMSD value greater than 0.15 nm and keeps on fluctuating with sharp peaks till 70 ns from where it remains relatively stable till the last frame. The highest and lowest RMSD value in the case of this compound is observed to be 0.27 and 0.10 nm. The Mpro-Compound 3 complex is observed to be fluctuating at the start of the simulation, reaching its highest RMSD value of 0.27 and 0.30 in the first 15 and 25 ns followed by a gradual decline after 25 ns and keeps on falling, reaching its lowest value of 0.12 nm. However, few significant stunts can be observed at particular time intervals, such as at 25, 42, and 60 ns. The overall stability hovers around 0.22 nm, which is believed to be a highly stable range. Similarly, the RMSD for Compound 4 starts from 0.12 nm and keeps on elevating till 20 ns reaching its maximum altitude of 0.24, from where it drops down back to 0.17 nm. Here it keeps on elevating till 60 ns reaching an altitude of 0.23 nm. From 60 ns, the RMSD remains stable with minor hops till 90 ns. Here it rises, reaching the altitude of 0.25 nm, and drops down to 0.16 nm. For a complex Mpro-Compound 5, the RMSD stays highly uniform till the last frame. No major fluctuation has been encountered, and the overall value ranges from 0.10 to 0.27 nm. Besides the ligand-Mpro complex, we also calculated the RMSD of the ligands only (Fig. 5) to examine whether they maintain the initial docking pose or they reset to a different position along the time. We can observe that Compound 1 remains slightly unstable in the first 15 ns and then attains a stable position till 67 ns with an RMSD value of 0.49 nm, but it drops down to 0.35 nm after 70 ns and remains there with minor fluctuations as the rest of the compounds. Compound 3, 4, and 5 remains relatively stable and while Compound 5 is observed to have minor fluctuations throughout the system. Interestingly, all the five compounds stabilize after 67 ns. To conclude, Compound 1 gains the highest RMSD value, Compound 3, 4, and 5 are the most stable ones, and Compound 5 has the highest fluctuation rate. Accordingly, these outcomes reveal the stable intrinsic motions and insignificant fluctuations during the 100 ns simulation. Stable RMSD values confirm that the ligand remains intact, while differences in the RMSD show the ligand's attachment and release at various time points. It is important to note that the small molecule attachment directly impacts the system contrarily as the ligand-binding positioning alters over the simulation period. The residual level flexibility is understood in terms of the RMSF as it depicts the flexibility at the residues level. The RMSF of all the five complexes shows a highly stable nature, excluding the loop regions that show the highest fluctuation compared to the rest of the residues. The overall flexibility ranges from 0.05 to 0.25 nm, excluding a few leaps by Compounds 3 and 5 (Fig 4) . The active site residues did not show any significant fluctuation, confirming ligand recognition in the ligand-binding domains. All the five complexes show stable behavior throughout simulation except for Compound 3 and Compound 5, which were observed to cause major fluctuation from residues 30 to 60. However, the rest of the residues remain relatively low, and the average RMSF for all the five complexes did not cross the scale of 0.25 nm. We also plotted the RMSF for solo Mpro (Fig 6) to analyze the amino acids' flexibility without the ligand, observing that the solo protein is slightly unstable compared to the Mpro-ligand complex. This proves that our proposed ligands firmly attach to the protein throughout the system and stabilize the protein. Minor fluctuations can be observed throughout the system, which is acceptable because the ligand attachment is not a rigid phenomenon. Thus, upon its attachment, the respective bonds residues fluctuate slightly to best orient for the attachment, which results in minor crests. These analyses confirm the stability of the target protein by the attachment of all the five proposed drug compounds. Hence, the ligand binding significantly impacts the residual fluctuation because of the internal residues experiencing effects caused by various ligands' attachment. The Rg values determine a structure's behavior under compression along an axis. A higher Rg value means less compactness (additional unfolding) and vice versa. Therefore, the Rg plots were plotted against time to examine the structural compactness of the system (Fig 4) . All the simulated complexes exhibit the Rg scores ranging from 2.20 to 2.26 nm except for Compound 2 that exhibited significant fluctuations around 40000, 60000, and 70000 ps, reaching the highest Rg value of 2.30 nm. The Rg values in the case of the Mpro-Compound 1 complex were recorded to be ranging from 2.20 nm to 2.20 with minor leaps reaching up to 2.25 and 2.27 nm at 30000 and 15000 ps, respectively. The average Rg values for Compound 3 stays between 2.20 and 2.25 nm with minor ups and downs throughout the simulation. A large leap is observed at the five ns, but it drops down back to its optimal range. The complexes of Compound 4, and 5 range from 2.20 to 2.24 nm. In the case of the Rg plots for solo Mpro, no significant difference can be observed except Compound 1 that shows a slightly higher Rg value in the first quarter of the simulation, but it stabilizes as the rest of the compounds (Fig 6) . Gradual elevations and declines can be observed at different periods, but the average plots remain pretty stable. The compactness of the complexes was greatly affected by the attachment and release of ligands. For the calculation of the binding energies, the polar and apolar solvation conditions were estimated. This analysis set up the energies associated with the binding of Mpro-Compound 1/2/3/4/5 in the course of the 100 ns MD simulations. The subsequent protein-ligand energies have been calculated, such as vdW interaction energy, electrostatic energy, SASA, and average binding energy ( Table 5) . Compound 4 is observed to have the highest binding energy in all the complexes, while the lowest value was recorded for Compound 2. The obtained energy values validate the docking results as more bonds resulted in higher binding energy. Analyzing protein-drug communication The interaction between the top five drug-like compounds was further examined through the hydrogen bonding analysis and PCA. All five compounds show strong hydrogen bonding with the protein through the course of simulation (Fig 7 and Table S1 ). It can be seen that compound 1 has the highest number of hydrogen bonds, down to which the second-highest is compound 4 and then compound 3. Hydrogen bonding in the case of compounds 3 and 5 is lower than the rest of the compounds but is more consistent throughout the simulation. In order to understand the principal structural variations revealed by each Mpro-Compound (1/2/3/4/5) complex, we plotted the PCA graphs (Fig 8) . All the compounds exhibit a clear color transition at various points that signifies the switching behavior from one conformation to another caused by the inhibitor attachment. Each dot represents an individual frame. It is observed that in most of the frames, the system remains compact throughout the 100 ns simulation with slight dispersions that could be linked to the loss of hydrogen bonds for short intervals. These analyses reveal the substantial flexibility of protein structure docked with the shortlisted compounds during primary simulation phases that ultimately reduce with the simulation interlude. Additionally, the contribution percentage in eigenmodes serially decreased too, suggesting the local instabilities in the target's structure for every complex inclined to gain stability. In the case of every protein-ligand complex, these motions are meant to be contributed by docked compounds in the viral protease active site, directing to the establishment of a strong complex. Previous studies have witnessed such observations, for instance, the G-protein coupled receptor and complexes of Fructose transporter GLUT5. The given figure (Fig 8) shows three eigenvectors or PCA for the SARS-CoV-2 main protease being docked with the top five potential ligands based on their extracted trajectories and exposed in clusters. Examining these eigenvectors supports the solid and clustered motions in the target's corresponding complexes during the simulation. The Mpro complexes formed clusters ranging from -30 to 30 on PC1, and -20 to 20 on PC2 as well as PC3. The 2D plots determine the changes in the ensemble dispersal regarding each conformation during the course of 100 ns, while the color transition from blue to red denotes the episodic hops among different conformational positions of the target's structure. Overall, interrelated motion in SARS-CoV-2 conformation expressed the rigidity and substantial variations brought at the active site due to ligand binding during the simulation. Since these analyses suggest selected compounds' stability in the SARS-CoV-2 active site, it restricts the protein indispensable motion needed for enzymatic action and thus results in the inhibition; maintained by RMSD, RMSF, Rg, protein-ligand communication, and PCA study along with many molecular docking scores. There is a need to discover potent compounds that could bind with the SARS-CoV-2 targets and facilitate the inhibition of this virus. Therefore, we evaluated 16,000 drug-like compounds obtained from the ChemBridge database, and the top hits were docked into the active site of the Mpro of the SARS-CoV-2. Additionally, the molecular dynamics simulation confirmed the stability of the selected compounds in the system. Considering the outcomes of molecular docking, MD simulation, and free energy calculations, we believe that the proposed compounds could act as an effective drug against the SARS-CoV-2. However, this whole study is founded on a in silico pipeline, and thus in vitro assays are recommended to confirm their validity further. J o u r n a l P r e -p r o o f All the interactions made by the additional five compounds are provided in Fig S1, while the physiochemical and pharmacokinetic properties for each selected compound are provided in the supplementary information as Fig S2. The number of hydrogen bonds formed by each selected compound is tabulated in Table S1 . The authors declare no competing financial interest. Fig.1 This figure shows the entire workflow of our study. The structure of Mpro (3CLpro) monomer is comprised of three domains: domain I, domain II, and domain III consisting of residues 8-101, residues 102-184, and residues 201-303, respectively, as well as a long loop (encompassing residues 185-200) that connects domain II and III [22] . In the gap between domains I and II, the active site (Cys145 and His41) is located, while in the pocket, a hydrophobic surrounding is also formed by hydrophobic amino acids, i.e., T24, L27, H41, F140, C145, H163, M165, P168, and H172. The RMSD and RMSF, and Rg for each compound are plotted as a whole for the entire simulation to analyze the stability, residual fluctuation, and compactness. The RMSD graphs are considerably stable and remain less than 0.30 nm throughout the simulation. In the RMSF analysis, few active site residues fluctuate from 25 to 70 but not those in the range of 140 to 175. The top selected compounds are mainly interacting with the residues from 140 to 175. Moreover, slight fluctuation occurs when there is an attachment between two chemical entities because of their affinity. The overall RMSF remains within the optimum range with no drastic changes. For the Rg, the x-axis shows the total number of frames in ps, while the y-axis represents the Rg value in nm. In the first 10,000 frames, Compound 2 has the highest Rg value and takes three major leaps but comes back in line with the rest of the compounds' Rg values. As a whole, the Rg are markedly stable, and no radical changes are experienced, proving the stable nature of our predicted compounds in the system. Discovery of a Natural Product with Potent Efficacy Against SARS-CoV-2 by Drug Screening HD5 and LL-37 Inhibit SARS-CoV and SARS-CoV-2 Binding to Human ACE2 by Molecular Simulation, Interdisciplinary Sciences: Computational Life Sciences COVID-19 infection: Origin, transmission, and characteristics of human coronaviruses Phylogenetic analysis and structural modeling of SARS-CoV-2 spike protein reveals an evolutionary distinct and proteolytically sensitive activation loop Ensemble Molecular Docking, and 3D-RISM Solvation Studies Expose Potential of FDA-Approved Marine Drugs as SARS-CoVID-2 Main Protease Inhibitors Comparing SARS-CoV-2 with SARS-CoV and influenza pandemics Coronavirus genome structure and replication, Coronavirus replication and reverse genetics Commentary: origin and evolution of pathogenic coronaviruses The origin, transmission and clinical therapies on coronavirus disease 2019 (COVID-19) outbreak-an update on the status In silico and in vitro evaluation of kaempferol as a potential inhibitor of the SARS-CoV-2 main protease (3CLpro) Structure of coronavirus main proteinase reveals combination of a chymotrypsin fold with an extra αhelical domain Structure of M pro from SARS-CoV-2 and discovery of its inhibitors An investigation into the identification of potential inhibitors of SARS-CoV-2 main protease using molecular docking study Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors Role of computer-aided drug design in modern drug discovery Molecular docking and structure-based drug design strategies Protein structure-based drug design: from docking to molecular dynamics, Current opinion in structural biology Bringing Structural Implications and Deep Learning-Based Drug Identification for KRAS Mutants Structural dynamics behind clinical mutants of PncA-Asp12Ala, Pro54Leu, and His57Pro of Mycobacterium tuberculosis associated with pyrazinamide resistance, Frontiers in bioengineering and biotechnology Abu-Izneid, Epitopes based drug design for dengue virus envelope protein: a computational approach Analysis of therapeutic targets for SARS-CoV-2 and discovery of potential drugs by computational methods Prediction and validation of potent peptides against herpes simplex virus type 1 via immunoinformatic and systems biology approach Unveiling novel 2-cyclopropyl-3-ethynyl-4-(4-fluorophenyl) quinolines as GPCR ligands via PI3-kinase/PAR-1 antagonism and platelet aggregation valuations; development of a new class of anticancer drugs with thrombolytic effects SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules Effectiveness of Bioactive Compound as Antibacterial and Anti-Quorum Sensing Agent from Myrmecodia pendans: An In Silico Study Computational screening and analysis of lung cancer related non-synonymous single nucleotide polymorphisms on the human kirsten rat sarcoma gene Topologies, structures and parameter files for lipid simulations in GROMACS with the OPLS-aa force field: DPPC, POPC, DOPC, PEPC, and cholesterol LigParGen web server: J o u r n a l P r e -p r o o f an automatic OPLS-AA parameter generator for organic ligands An efficient, box shape independent non-bonded force and virial algorithm for molecular dynamics LINCS: a linear constraint solver for molecular simulations Effect of a dirhamnolipid biosurfactant on the structure and phase behaviour of dimyristoylphosphatidylserine model membranes Pymol: An open-source molecular graphics tool, CCP4 Newsletter on protein crystallography Computational identification, characterization and validation of potential antigenic peptide vaccines from hrHPVs E6 proteins using immunoinformatics and computational systems biology approaches 1: Scientific Data Analysis and Graphing Software Software Review Identifying novel sphingosine kinase 1 inhibitors as therapeutics against breast cancer The heavy-duty computational operations were partially performed at the Peng Cheng Lab and the Center for High-Performance Computing, Shanghai Jiao Tong University.