key: cord-0817697-v2h0ig8v authors: Muhtar, Eldar; Wang, Mengyang; Zhu, Haimei title: In silico discovery of SARS-CoV-2 main protease inhibitors from the carboline and quinoline database date: 2021-07-20 journal: Future Virology DOI: 10.2217/fvl-2021-0099 sha: 4e1b13c43495aebf28716dce2173a6316d83da4c doc_id: 817697 cord_uid: v2h0ig8v Aim: SARS-CoV-2 caused more than 3.8 million deaths according to the WHO. In this urgent circumstance, we aimed at screening out potential inhibitors targeting the main protease of SARS-CoV-2. Materials & methods: An in-house carboline and quinoline database including carboline, quinoline and their derivatives was established. A virtual screening in carboline and quinoline database, 50 ns molecular dynamics simulations and molecular mechanics Poisson−Boltzmann surface area calculations were carried out. Results: The top 12 molecules were screened out preliminarily. The molecular mechanics Poisson−Boltzmann surface area ranking showed that p59_7m, p12_7e, p59_7k stood out with the lowest binding energies of -24.20, -17.98, -17.67 kcal/mol, respectively. Conclusion: The study provides powerful in silico results that indicate the selected molecules are valuable for further evaluation as SARS-CoV-2 main protease inhibitors. For this study, we established a small molecule database named "CQDB". CQDB includes 2598 carboline and quinoline derivatives without duplicated molecules. 1117 molecules were synthesized and patented in our previous work. Another 1481 molecules were obtained from the publications of other pharmaceutical researches. 2D structures of these molecules were sketched in ChemBioDraw and converted to 3D structures with Open Babel v3.1.0 as ligands [17] . Afterward, optimizations were done in 250 steps of a steepest-descent geometry optimization with the MMFF94 forcefield in Open Babel. The crystal structure of SARS-CoV-2 M pro (PDB code: 6LU7) was obtained from the PDB with a resolution of 2.16 angstrom. The protease consisted of one chain with 306 amino acids. 6LU7 was prepared at a pH level of 7.4 for protonation using the 'prepare protein' protocol in BIOVIA Discovery Studio 2016 (Dassault Systèmes, Vélizy-Villacoublay, France). The protonation state of crucial residuals such as HIS 41, HIS 164 and GLU 166 in binding-site was checked carefully again. The solvent was stripped off. Ligands were obtained from the CQDB database. Docking-based virtual screening, which is one of the most promising methods in silico for the drug-like molecule discovery, is useful to predict the best interaction state between a ligand and a protein. AutoDock Vina (Scripps Research, CA, USA) was chosen to perform the virtual screening. Since AutoDock Vina provides the maximum accuracy and the minimum computer time, which refers to the empirical and knowledge-based scoring functions [18] . Ligands were assigned with the gasteiger charges. The grid box size in three dimensions was 40 × 40 × 40Å with a center coordinate of -7.857, 11.856 and 67.687, which was the center of the ligand N3 in the crystal structure. The exhaustiveness of the global search was increased to 12. Based on the binding affinity ranking, the top 12 molecules that satisfied a threshold ( G ≤−9.8 kcal/mol) were screened out for more detailed analysis. In the docking analysis, a theoretical method to identify the appropriate configuration of ligand in enzyme activesite is very important [19, 20] . To evaluate the docking and selection strategy, N3 was fetched out from the crystal structure and redocked into M pro as a reference. Covalent docking was performed using AutoDock 4.2.6 [21] . The grid box dimensions were 40 × 40 × 40Å. The grid spacing was set as 0.375Å. Lamarkian genetic algorithm was utilized to find the appropriate configurations of ligands. The global optimization was performed with parameters of 300 randomly positioned individuals. The maximum number of energy evaluations was enhanced to 2.5 × 10 7 , and the maximum number of generations in lamarkian genetic algorithm was enhanced to 2.7 × 10 5 . The Solis and Wets local search was executed with a maximum number of 3000. During docking experiments 200 runs were carried out. The resulted 200 conformations of each were ranked by the lowest binding energy and clustered with an all-atom root mean square deviation tolerance of 2.0Å. The lowest binding energy, the population of the configuration in the cluster analysis and the proper binding mode were considered comprehensively for selecting the promising configuration of N3. Subsequently, the top 12 molecules were redocked toward the M pro in AutoDock 4.2.6 to select the promising configurations using the same selection strategy. The selected promising configurations of the 12 molecules were submitted as the start configurations for the MD simulations. MD simulation analysis MD simulation is a decision-making procedure for the evaluation of complex stability [22] . It is useful in the investigation of the dynamic behavior at an atomic level of biological systems, which is hard to process in the laboratory [23] . In the present study, the most promising configurations of the 12 M pro -ligand complexes were chosen as the starting point for MD simulations. MD simulations for 12 M pro -ligand complexes and an apo form of the M pro were performed on a 50 ns time scale. The CHARMM General Force Field web-based tool (https://cgenff .umaryland.edu/) was used to generate ligand parameter files. Charge of ligands was assigned by the extended bond-charge increment scheme [24] . Gromacs 2020.1 was utilized to perform the simulation with the CHARMM36 all-atom force field [25, 26] . All of the 13 systems were solvated with an extended simple point charge model (SPC/E) and neutralized via adding Na + ions [27] . In the following process, the energy of the system was minimized by the steepest descent algorithm at a threshold of 1000 kJ·mol -1 ·nm -1 . Then two-part equilibration, conserved moles, volume and temperature (NVT) and conserved moles, pressure and temperature (NPT) ensembles were done for 0.5 ns. Long-range electrostatics was calculated by the particle Mesh Ewald method [28] . About 50 ns MD simulations were performed at a time step of 2 fs. Then root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg) and the number of hydrogen bonds were calculated to analyze the MD trajectories in GROMACS utilities. MM-PBSA is used in study of biomolecular interactions and the computational drug design. MM-PBSA binding energy of chosen molecules was calculated via g mmpbsa [29] . This tool calculates the enthalpic components of the MM-PBSA interaction using GROMACS and the APBS packages. The total binding free energy is calculated as follows: Dihedral angle principal component analysis The principal component analysis (PCA) method was used to calculate eigenvectors and eigenvalues and their projection along with the first two principal components during MD simulation. The dihedral angle principal component analysis (dPCA), which is based on the Gromacs protocol, uses backbone dihedral angles to analyze while PCA uses Cartesian coordinates [25] . dPCA can readily be characterized by the corresponding conformational changes of peptides in a protein [30] . dPCA was calculated from the MD backbone trajectories. Via diagonalizing the matrix, a bunch of eigenvectors and eigenvalues were generated and plotted in the 2D projection to evaluate the motion of trajectory. To evaluate the potential of molecules in our database to become inhibitors of M pro , AutoDock Vina was used to screen in the database and rank molecules according to their binding affinities. The name, structure and molecular weight of the top 12 molecules are listed in Table 1 . The proper docked configuration of N3 was selected and shown in Figure 2 . The selected configuration of N3 is well overlapped with N3 in the crystal structure. The binding free energy of the selected configuration of N3 to the M pro is -15.40 kcal/mol in AutoDock 4.2.6. Compared with the crystal N3, the selected configuration of N3 The unit of all parameters is kcal/mol. Table 1 . The top 12 molecules were reranked according to the MM-PBSA binding energies shown in Table 1 Table 2 shows contributions in MM-PBSA binding energy of the top four molecules. To evaluate the binding stability of the top 12 molecules at the binding site of M pro , the protein backbone RMSD, RMSF, the Rg and the number of hydrogen bonds, the protein dihedral principal component (dPCA) and the ligand binding mode during 50 ns MD simulations were carefully inspected. It was found that p63 9h exhibited unstable binding mode at the active site of the M pro . Thus, p59 7m, p12 7e and p59 7k were suggested as the top three molecules with the potential of inhibiting M pro . The average protein backbone RMSD of apo form is 0.26 nm while holo forms with p59 7m, p12 7e, p59 7k are 0.18, 0.20, 0.19 nm, respectively ( Figure 4A ). This result suggests the reduction of overall protein flexibility upon binding of the three selected molecules. The protein backbone RMSF shown in Figure 4B represents lower fluctuations of protein residues during MD simulation upon binding p59 7m and p59 7k than binding p12 7e. The Rg shown in Figure 4C hydrogen bonds reflects one of the crucial interactions between the protein and the corresponding ligand. As seen in Figure 4D , average number of hydrogen bonds upon binding of p59 7m, p12 7e, p59 7k are 2.0, 1.2 and 1.8, respectively. This suggests p59 7m forms stronger hydrogen bond interactions than p59 7k and p12 7e. The dihedral principal component analysis (dPCA) method was employed to reveal the dynamical behavior in the space of SARS-CoV-2 M pro when combined with the top three molecules. The first two principal components were selected to analyze the projection of apo form phase space and holo form phase spaces with top three molecules during the 50 ns MD simulations. Figure 5 clearly shows that apo protein and the M pro -p59 7k complex covered a larger region of phase space while the M pro -p59 7m complex and M pro -p12 7e complex covered smaller ones. The result suggests that binding of p59 7m and p12 7e in the active site limits large dynamic behaviors of SARS CoV-2 M pro , which is in accordance with the Rg results. Discussion SARS-CoV-2 invades human respiratory system, spreads rapidly and causes a severe health crisis all over the world. Vaccines can help to prevent infection and severe symptoms caused by SARS-CoV-2. However, drugs treating SARS-CoV-2 infection are still in urgent need. The RNA polymerase inhibitor, remdesivir, is the only drug approved by FDA up to now. Nowadays, The M pro of SARS-CoV-2 has become one of the most promising targets for new drugs. The M pro inhibitor PF-07321332 from Pfizer has entered a Phase I clinical study [31] . Carboline and quinoline molecules were reported to possess powerful antiviral bioactivities. An in-house database CQDB containing 2598 carboline and quinoline molecules was established for the discovery of M pro inhibitors. Dockingbased virtual screening and following MD simulations revealed three potential M pro inhibitors p59 7m, p12 7e and p59 7k. Among the three molecules, p59 7m exhibits the lowest binding free energy of -24.20 kcal/mol. p59 7m forms the most extensive and stable hydrogen bond and hydrophobic interactions with M pro active site residues THR 26, HIS 41, MET 49, ASN 142, CYS 145 and ARG 188, which are shown to interact with peptidomimetic inhibitors in the crystal structures [7, 32] . Different from the peptidomimetic inhibitors in the crystal structures which spans S1 to S4 pocket, p59 7m binds at S2 to S2 pocket. The carboline derivative p59 7m is worthy of further investigation. SARS-CoV-2 may predispose to both venous and arterial thromboembolic disease due to excessive inflammation, hypoxia, immobilization and diffuse intravascular coagulation [33] . Previous works indicate carboline and quinoline molecules decrease both arterial and venous thrombus in vivo [34, 35] . Also, carboline and quinoline molecules may have anti-inflammatory activity [34, 36] . Therefore, the selected carboline derivatives may also have the potential of reducing thrombus and inflammatory syndromes of SARS-CoV-2 in the future perspective. Among 2598 molecules in the database we established, the top 12 molecules were selected through the docking-based virtual screening using AutoDock Vina. Then MD simulations were performed on the apo form and the 12 docked complexes of SARS-CoV2 M pro and molecules to confirm their system stabilities. Based on the MD simulations, binding affinities of the 12 molecules with SARS-CoV-2 M pro were calculated using MM-PBSA method. The top three molecules p59 7m, p12 7e and p59 7k bind SARS-CoV-2 M pro with the lowest MM-PBSA binding free energies of -24.20, -17.98, -17.67 kcal/mol, respectively. They form extensive hydrogen bonds with the active site residues and obviously decreased the flexibility of SARS-CoV-2 M pro . The selected three molecules are worthy of further bioactivity studies against SARS-CoV-2 M pro . This result also encourages further exploration of bioactive carboline and quinoline derivatives against SARS-CoV-2. Previous works indicate carboline and quinoline molecules decrease both arterial and venous thrombus in vivo. Also, carboline and quinoline molecules have the anti-inflammatory activity in vivo from the previous research. Carboline, quinoline and their derivatives have the potential to be explored as antiviral molecules with anti-thrombosis and anti-inflammatory activities in the future. • SARS-CoV-2 causes more than 3.8 million deaths according to the WHO. • Drugs treating SARS-CoV-2 infection are in urgent need. • The main protease (M pro ) of SARS-CoV-2 has become one of the most promising targets for new drugs. • Carboline and quinoline molecules possess powerful antiviral bioactivities. • An in-house database carboline and quinoline database containing 2598 carboline and quinoline molecules was established for the discovery of M pro inhibitors. • Docking-based virtual screening and following molecular dynamics simulations revealed three potential M pro inhibitors p59 7m, p12 7e and p59 7k. • Among the three molecules, p59 7m exhibits the lowest binding free energy of -24.20 kcal/mol. COVID-19) Dashboard. World Health Organization Compounds with therapeutic potential against novel respiratory 2019 coronavirus COVID-19 vaccination & dialysis patients: why the variable response Remdesivir approved to treat COVID-19 amid controversy • g Mmpbsa tools that performs more accurate calculation was used to calculate the free binding energy based on the data of molecular dynamics Novel 2019 coronavirus structure, mechanism of action, antiviral drug promises and rule out against its treatment • g Mmpbsa tools that performs more accurate calculation was used to calculate the free binding energy based on the data of MD Using integrated computational approaches to identify safe and rapid treatment for SARS-CoV-2 In-silico drug repurposing and molecular dynamics puzzled out potential SARS-CoV-2 main protease inhibitors Rutin and flavone analogs as prospective SARS-CoV-2 main protease inhibitors: in silico drug discovery study In silico mining of terpenes from red-sea invertebrates for SARS-CoV-2 main protease (M-pro) inhibitors In silico evaluation of prospective anti-COVID-19 drug candidates as potential SARS-CoV-2 main protease inhibitors Natural-like products as potential SARS-CoV-2 M(pro)inhibitors: in-silicodrug discovery In silico drug discovery of major metabolites from spices as SARS-CoV-2 main protease inhibitors • Some of plant secondary metabolites, carboline and quinoline derivatives show powerful antiviral bioactivity which encourage us to explore carboline and quinoline in our in-house database against SARS-CoV-2 virus Structure, inhibition and regulation of two-pore channel TPC1 from Arabidopsis thaliana • Some of plant secondary metabolites, carboline and quinoline derivatives show powerful antiviral bioactivity which encourage us to explore carboline and quinoline in our in-house database against SARS-CoV-2 virus Open Babel: an open chemical toolbox Drug repurposing for coronavirus (COVID-19): in silico screening of known drugs against coronavirus 3CL hydrolase and protease enzymes A newly developed oxime K203 is the most effective reactivator of tabun-inhibited acetylcholinesterase Flexibility in the molecular design of acetylcholinesterase reactivators: probing representative conformations by chemometric techniques and docking/QM calculations Covalent docking using AutoDock: two-point attractor and flexible side chain methods Understanding the mechanism of amygdalin's multifunctional anti-cancer action using computational approach Identification of novel small molecules against GSK3 beta for Alzheimer's disease using chemoinformatics approach CHARMM general force field: a force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers CHARMM36 all-atom additive protein force field: validation based on comparison to NMR data The spatial structure in liquid water Particle mesh Ewald -an N.log(N) method for Ewald sums in large systems Open Source Drug Discovery C. g mmpbsa-A GROMACS tool for high-throughput MM-PBSA calculations • g Mmpbsa tools that performs more accurate calculation was used to calculate the free binding energy based on the data of MD Dihedral angle principal component analysis of molecular dynamics simulations Considerations for the discovery and development of 3-chymotrypsin-like cysteine protease inhibitors targeting SARS-CoV-2 infection Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved alpha-ketoamide inhibitors Incidence of thrombotic complications in critically ill ICU patients with COVID-19 Docking of THPDTPI: to explore P-selectin as a common target of anti-tumor, anti-thrombotic and anti-inflammatory agent • As a carboline derivative, THPDTPI shows anti-thrombosis and anti-inflammation activity in animal models 3S)-N-(L-Aminoacyl)-1,2,3,4-tetrahydroisoquinolines, a class of novel antithrombotic agents: synthesis, bioassay, 3D QSAR, and ADME analysis As quinoline derivatives, (3S)-N-(L-aminoacyl)-1,2,3,4-tetrahydroisoquinolines show anti-arterial thrombosis activity in a rat model Design and synthesis of nanoscaled IQCA-TAVV as a delivery system capable of antiplatelet activation, targeting arterial thrombus and releasing IQCA • As a quinoline derivative, IQCA-TAVV shows anti-arterial thrombosis activity in both rat and mouse models The authors thank Engineering Research Center of Endogenous Prophylactic of Ministry of Education of China for technical support. This work was financially supported by Scientific Research Program of Beijing Municipal Education Commission (KM202110025024). The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.No writing assistance was utilized in the production of this manuscript. The authors contributed equally to this work.