key: cord-0759410-sk2dvrrl authors: De Vita, Simona; Chini, Maria Giovanna; Lauro, Gianluigi; Bifulco, Giuseppe title: Accelerating the repurposing of FDA-approved drugs against coronavirus disease-19 (COVID-19) date: 2020-11-10 journal: RSC advances DOI: 10.1039/d0ra09010g sha: 663ec38e977cd99952aee8805ef4aa659a627059 doc_id: 759410 cord_uid: sk2dvrrl
The recent release of the main protein structures belonging to SARS-CoV-2, the virus responsible for coronavirus disease-19 (COVID-19), strongly pushed for the identification of valuable drug treatments. With this aim, we present a repurposing study on FDA-approved drugs that applies a new computational protocol and introduces a novel parameter called IVS ratio. Starting with a virtual screening against three SARS-CoV-2 targets (main protease, papain-like protease, spike protein), the top-ranked molecules were reassessed by combining the Inverse Virtual Screening approach with MM-GBSA calculations. Applying this protocol, a list of drugs was identified against the three investigated targets. In addition, the top-ranked compound on each target (rutin vs. main protease, velpatasvir vs. papain-like protease, lomitapide vs. spike protein) was further tested with molecular dynamics simulations to confirm the promising binding modes, obtaining encouraging results such as high stability of the complex during the simulation and a good protein–ligand interaction network involving some important residues of each target. Moreover, recent findings on the inhibitory activity of quercetin, a natural compound closely related to rutin, against the SARS-CoV-2 main protease strengthen the applicability of the proposed workflow.
At the end of 2019, several cases of pneumonia caused by a novel virus were reported in Wuhan, China.
This pathogen is a member of the Coronaviridae virus family, which is characterized by a positive-sense single-stranded RNA genome, 1 and shares 79.5% sequence identity with another well-known coronavirus, SARS-CoV; [2] [3] [4] [5] for this reason, the virus was named SARS-CoV-2. [6] [7] [8] [9] The failure of the containment measures and the high infectivity of the virus transformed a local problem into a pandemic (World Health Organization, March 11th, 2020). This outbreak currently counts over 33 million cases and 1 million deaths worldwide, mainly due to severe respiratory syndrome and its complications. Scientists all over the world are joining forces to find a cure for this disease, whether a small molecule or an antibody, and to improve the clinical picture of the patients. This has resulted in around 3400 ongoing clinical trials, according to the National Institutes of Health (data from https://clinicaltrials.gov/ct2/results?cond=COVID-19, accessed on September 30th). The current therapeutic protocols include known antiviral drugs, [10] [11] [12] as a specific molecule has not yet been found, though many studies are focusing on the repurposing of existing drugs against either the infection itself or the symptoms and complications related to it. 13 Several ongoing trials involving the anti-interleukin-6-receptor (IL6R) drug tocilizumab are of great importance. 14, 15 In this framework, the structural elucidation of viral proteins, mainly the spike glycoprotein (S), small envelope protein (E), matrix glycoprotein (M), and nucleocapsid protein (N), 16 is extremely useful for designing specific ligands for these targets. At the moment, thanks to the massive effort of the scientific community, over 400 protein structures of SARS-CoV-2, including the S protein, are available in the Protein Data Bank (https://www.rcsb.org/).
17, 18 Computational techniques are essential in these early stages of research, as they can provide useful data in a short period and boost the subsequent phases of research. [19] [20] [21] [22] [23] [24] [25] For instance, the accurate prediction of binding affinity is one of the most interesting fields of investigation. [26] [27] [28] Moreover, to date no vaccine is available, and computational repurposing campaigns are a privileged approach because they deal with already approved drugs, which do not need to pass through the entire pre-clinical phase. 29, 30 In this framework, several new studies published in the last few months concern the use of existing drugs (alone or in combination) in fighting COVID-19. [31] [32] [33] Unsurprisingly, some of these papers focused on existing antiviral drugs (lopinavir, saquinavir, etc.) or on molecules that showed good activity against SARS-CoV in past years. [34] [35] [36] The main strategy remains to prevent the virus from entering human cells by inhibiting one of the many components responsible for viral uptake. 34
Therefore, starting from the three main protein categories available in the database (the main protease, the spike protein, and the papain-like protease), we decided to carry out a drug repurposing study using FDA-approved drugs to highlight new possible drug candidates. We used a new approach that involved the application of direct virtual screening to select the most valuable candidates for each target. These molecules were then tested using an Inverse Virtual Screening (IVS) approach on a panel of viral proteins, including the three initial ones, to confirm the robustness of the obtained results. Moreover, we introduced a new reliability parameter, called IVS ratio, to compare the binding affinity of the selected compounds on the specific SARS-CoV-2 target with their affinity for other "decoy" receptors.
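The two selection ideas introduced above can be illustrated with a small sketch. It assumes the definition given later in Materials and methods (the IVS ratio is the ligand's binding affinity on the target divided by the best affinity obtained on the whole panel) and the 0.5 kcal/mol docking window reported there. The function names, ligand labels, and score values are hypothetical and are not taken from the study; docking scores are assumed to be negative, with more negative meaning stronger predicted binding.

```python
from typing import Dict, Sequence


def within_window(scores: Dict[str, float], window: float = 0.5) -> Dict[str, float]:
    """Keep ligands scoring within `window` kcal/mol of the best (most negative) value."""
    best = min(scores.values())
    return {name: s for name, s in scores.items() if s - best <= window}


def ivs_ratio(ba_ligand: float, panel: Sequence[float]) -> float:
    """Return BA_ligand / BA_max, with BA_max the best affinity over the whole panel.

    The panel is assumed to include the score on the target itself,
    so the ratio is 1.0 when the target is the best hit of the panel.
    """
    ba_max = min(panel)  # most negative = strongest predicted binding
    return ba_ligand / ba_max


# Hypothetical docking scores (kcal/mol) of three ligands on one target.
docking = {"ligand_A": -10.0, "ligand_B": -9.7, "ligand_C": -9.0}
shortlist = within_window(docking)

# Hypothetical scores of one ligand across a small panel; it scores -9.0 on the target.
panel_scores = [-10.0, -9.0, -7.5, -6.0]
ratio = ivs_ratio(-9.0, panel_scores)

print(sorted(shortlist), round(ratio, 2))
```

A ratio close to 1 means the SARS-CoV-2 target ranks near the top of the panel for that ligand, whereas a low ratio suggests that the ligand binds many other proteins at least as well and may be a false positive.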
The IVS ratio indicates whether the calculated binding affinity ranks among the top values when compared with those obtained for other targets.
The three-dimensional structures of the SARS-CoV-2 main protease bound to an inhibitor (PDB: 6LU7), 37 the spike protein receptor-binding domain (PDB: 6M0J), 38 and the papain-like protease (PDB: 6W9C) 39 were downloaded from the Protein Data Bank. The targets were cleaned and prepared following the automated protocol previously reported by us (available at http://computorgchem.unisa.it/cloe/). 40 In detail, the solvent molecules, ions, and unnecessary protein chains were removed; then, structural features such as bond orders and protonation states were adjusted using the Protein Preparation Wizard. 41, 42 Because an inhibitor is present in the crystal structure of the main protease, the molecular docking grid was built taking the centroid of the ligand as the center of the box (x: -10.83, y: 12.57, z: 68.68) and extending the box by 25 Å, 27 Å, and 25 Å along the x, y, and z axes, respectively. For the other two proteins, which lack an indication of the pharmacological site of interest, the putative binding pockets were detected using SiteMap [43] [44] [45] and the centroid of the top-ranked pocket was used as the center of the docking grid (x: -32.31, y: 16.72, z: 33.31 for the papain-like protease and x: -32.31, y: 12.72, z: 30.31 for the spike protein). The dimensions of the two boxes are 21 × 21 × 25 Å and 22 × 21 × 28 Å for the papain-like protease and the spike protein, respectively. The viral proteins composing the target panel for the Inverse Virtual Screening were prepared likewise.
Concerning the ligands, the chemical structures of FDA-approved drugs were downloaded from the Selleckchem website (https://www.selleckchem.com/screening/fda-approved-druglibrary.html) in SDF format and prepared with LigPrep 46 to assign the correct protonation state and regularize the geometries.
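The docking boxes listed above map naturally onto an AutoDock Vina configuration file, and the following sketch shows one way to generate such a file for each target. The box centers and dimensions are copied from the text; the exhaustiveness of 64 is the value reported for the screening. Two points are assumptions rather than details from the study: the reported box dimensions are treated as full edge lengths (which is how Vina interprets `size_x`, `size_y`, `size_z`), and the file names are placeholders.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GridBox:
    center: Tuple[float, float, float]  # box center in Å
    size: Tuple[float, float, float]    # edge lengths in Å (assumed to be full lengths)


# Centers and dimensions as reported in the text.
BOXES = {
    "main_protease": GridBox((-10.83, 12.57, 68.68), (25.0, 27.0, 25.0)),
    "papain_like_protease": GridBox((-32.31, 16.72, 33.31), (21.0, 21.0, 25.0)),
    "spike_rbd": GridBox((-32.31, 12.72, 30.31), (22.0, 21.0, 28.0)),
}


def vina_config(receptor: str, ligand: str, box: GridBox, exhaustiveness: int = 64) -> str:
    """Build the text of a Vina configuration file for one receptor/ligand pair."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, center, size in zip("xyz", box.center, box.size):
        lines.append(f"center_{axis} = {center:.2f}")
        lines.append(f"size_{axis} = {size:.2f}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    return "\n".join(lines) + "\n"


# Placeholder file names; any prepared PDBQT files would be used here.
print(vina_config("6LU7_prepared.pdbqt", "ligand.pdbqt", BOXES["main_protease"]))
```

Keeping the boxes in a single dictionary makes it straightforward to loop over the three targets and write one configuration file per receptor.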
The congurations were not altered in the nal structure. Both ligands and targets were then converted into the PDBQT format using OpenBabel. 47 Direct and Inverse Virtual Screening The library of approved drugs was screened against the three viral targets using the soware AutoDock Vina 48 at the exhaustiveness of 64. From the results obtained, the best molecules were selected by setting an energetic cut-off of 0.5 kcal mol À1 from the best value of each protein. Such molecules were, then, tested against a panel composed of viral proteins already reported elsewhere 40 using the IVS approach. 49, 50 We calculated the IVS ratio with the following equation: where BA ligand is the calculated binding affinity of the ligand against the selected target and BA max is the best binding affinity obtained against the whole panel. The efficacy of the binding was further evaluated with MM-GBSA. The protein-ligand complexes with an IVS ratio within 0.1 from the best result were tested using the soware Prime with the VSGB solvent model and OPLS-2005 force eld. The residues surrounding the ligand (6.0 A) were allowed to move during the minimization to reduce steric clashes. The protein-ligand complexes selected for each target were prepared for the molecular dynamics simulations. Each complex was inserted in an orthorhombic box with a buffer distance of 10 A in each space direction and solvated with TIP3P water molecules. The system charge was neutralized by adding Na + or Cl À ions and then the physiological cell environment was mimicked by adding 0.15 M of NaCl. Once ready, the systems were simulated for 100 ns in NPT ensemble using the soware Desmond 36,37 aer a relation phase carried out with an internal 5-step protocol. At rst, the system was adjusted in NVT ensemble at 10 K with Brownian Dynamics for 100 ns, followed by two 12 ns steps at 10 K, rst in NVT, and then in NPT ensemble, with restrains on solute heavy atoms. 
Finally, two steps (12 ns and 24 ns, respectively) were carried out at 300 K in the NPT ensemble, with and without restraints. Each trajectory was analyzed to extract qualitative information. In particular, the backbone RMSD, the RMSF, and the radius of gyration (Rg) were calculated for each frame, taking the initial structure as reference.
In order to identify repurposed FDA-approved drugs for treating COVID-19, we introduced a new computational protocol based on the application of virtual screening, our recently introduced and implemented Inverse Virtual Screening approach, MM-GBSA calculations, and molecular dynamics simulations (Fig. 1).
Fig. 1 Schematic view of the computational approach used. In each step, the targets and the cutoff parameters are listed.
In detail, the three-dimensional structure of each ligand belonging to the FDA-approved drugs library was adjusted using LigPrep, 46 which regularizes the geometries and determines the correct protonation state at physiological pH. Three of the most important SARS-CoV-2 proteins were selected as targets: the virus main protease (PDB: 6LU7), 37 the spike protein receptor-binding domain (PDB: 6M0J), 38 and the papain-like protease (PDB: 6W9C). 39 To narrow down the putative ligands for each protein, a virtual screening campaign was carried out using the software AutoDock Vina. 48 We then selected the molecules within a binding affinity range of 0.5 kcal mol⁻¹ from the best result for each protein target. This first group of 27 molecules (see ESI†) was used for an Inverse Virtual Screening campaign against a previously prepared panel of viral proteins, 40 to which we added the three SARS-CoV-2 proteins (1027 total proteins). In this way, we wanted to relate the predicted binding affinities on the investigated SARS-CoV-2 targets with those obtained testing
the same selected molecules on a large panel of other viral proteins, with the final aim of identifying the most robust results and excluding putative false positives. Indeed, for each molecule, we calculated the IVS ratio by dividing the calculated binding affinity of the molecule against each of the SARS-CoV-2 proteins by the best binding affinity calculated on the whole panel (see Materials and methods). In this way, we were able to estimate the discrepancy between the considered binding affinity and the best one, evaluating whether it falls above the average value. Using this parameter, we were confident that the protein-ligand complex was among the top-ranked ones and was not a false positive. A final additional refinement was then carried out by calculating the ΔG bind energy through MM-GBSA for the molecules that had an IVS ratio within 0.1 from the best result. Out of the 27 ligands tested on each protein target, 20, 24, and 13 ligands were initially found within the selected range for the main protease, the papain-like protease, and the spike protein, respectively (Fig. 1). Some of them were redundant results and were deleted from the final ranking (Table 1).
From the analysis of these data (see ESI Tables S1-S12†), some interesting results emerged (Table 1). Among them, two antiviral agents (dolutegravir, an anti-HIV integrase inhibitor, and velpatasvir, which acts against Hepatitis C NS5A) were identified and represent interesting outcomes. Specifically, the best ligands for the main protease, the papain-like protease, and the spike protein, according to the MM-GBSA calculations, were rutin, velpatasvir, and lomitapide, respectively (Fig. 2). Rutin is a derivative of quercetin containing a glucose/rhamnose disaccharide (rutinose) at C-3. It is a flavonoid with antioxidant, antiproliferative, and anti-inflammatory properties that can be found in different plant species and is very common in Chinese traditional medicine.
Recently, this natural compound was predicted to be active on the SARS-CoV-2 main protease by Abd El-Mordy et al., 51 Al-Zahrani, 52 and in other studies, [53] [54] [55] giving a further indication of the robustness of the presented method. Very interestingly, an important milestone was recently reached by Abian et al., 56 who identified and tested in vitro quercetin, the precursor of rutin, against SARS-CoV-2, finding a very promising inhibitory activity on the main protease (Ki ~ 7 μM). This experimental result indirectly corroborates the applicability of the proposed workflow for accelerating the repurposing of FDA-approved drugs for COVID-19 treatment, thus encouraging the future experimental evaluation of the other drugs identified here. In this context, velpatasvir is used in the treatment of Hepatitis C, 57 another disease caused by a single-stranded, positive-sense RNA virus, and its use against SARS-CoV-2 has already been suggested, 58-60 though with a focus on the main protease. This represents a further indication of the robustness of the methodology. Lomitapide, on the other hand, is a drug used in homozygous familial hypercholesterolemia, 61,62 and the outcomes of the proposed methodology suggest an off-label application of this molecule. The three-dimensional representations of the docking poses are shown in Fig. 3, and they represent the starting point for the subsequent experiments and evaluations. It can be noticed from Fig. 3, especially for the main protease, that the molecular docking results are in line with what has been reported, 52, 63 showing interactions with Glu166, Gly143, and Thr45.
Finally, the ligand-protein complexes with the best ΔG bind were simulated to test their stability over time. To assess the stability of the complexes, three different parameters were computed: RMSD, RMSF, and the radius of gyration.
These parameters reect the overall mobility of the structures and can be used to conrm the initial hypothesis of a privileged binding between the three targets and the selected ligands. On a general note, for each complex, the displacement of the backbone atoms, side chains, and ligand structure from the initial position fell in the range of the ordinary biomolecules RMSD (Fig. 4) . Interestingly, the papain-like protease/velpatasvir showed the lowest RMSD values both for the protein and the ligand, indicating a highly stable complex, likely due to the strong protein-ligand contacts and interactions, despite its complex and bulky structure. If we consider the RMSF trend, calculated averaging the backbone and the side chain values for each residue, it appeared small and within the normal range (Fig. 5 ). In detail, the main protease had no residue with mobility above 4 A, indicating a high degree of stability. Concerning the papain-like protease, some of the residues (Cys192, Thr225, Cys226, and Gly227) featured a high degree of mobility, but none of them was involved in the ligand binding. The remaining amino acids showed very low RMSF values, corroborating the results obtained with the MM-GBSA and the RMSD analysis. Eventually, the spike protein showed a highly unstable fragment, from Ser477 to Glu484, which was not part of the binding site. We moved on considering the radius of gyration (R g ) distribution for each protein, which represents the change in protein ductility during the simulation (Fig. 6) . The spike protein showed a peculiar trend with a curve characterized by two peaks, probably caused by the reduced dimensions of this system when compared to the other two. On the other hand, the distribution curve for the two proteases chains was narrow, indicating a uniform distribution of the measures around the mean value. 
The standard deviation reected the behavior previously highlighted, with the main protease and the papain-like protease showing the most compact results. These data, thus, are in line with what previously stated on the relative stability of the complexes. Concerning the protein-ligand interactions, they were evaluated throughout time ( Fig. 7 and 8) . The strength and the number of protein-ligand interactions followed the trend showed by the RMSD, RMSD, and R g of the complexes. In detail, the main protease/rutin complex showed a high percentage of hydrogen bonds and water bridges due to its polar nature ( Fig. 2 and 7) for almost the entire simulation; some residues, thus, were involved in interactions with the ligand for almost the 80% This journal is © The Royal Society of Chemistry 2020 RSC Adv., 2020, 10, 40867-40875 | 40873 of the time (interactions fraction of $0.8). Velpatasvir established powerful interactions with Asp164 and Tyr264 that was kept stable during the simulation with an interactions fraction $ 1.0, due to the cumulative effect of each interaction type and the presence of two hydrogen bond acceptor on Asp164. These two residues were found to be relevant in stabilizing inhibitors in the SARS-CoV papain-like protease 64 (Asp165 and Tyr265 in that paper), corroborating the idea of an interaction between the ligand and the target. The spike protein/lomitapide complex showed fewer and/or weaker interactions if compared to the other two systems, but, despite that, the residues involved in the binding are considered crucial for the interaction with the known human counterpart (ACE2 receptor), 38, 65 supporting the hypothesis of a good protein-ligand binding formulated based on the RMSD values. Overall, the MD simulations helped us conrming the data obtained with molecular docking, IVS experiments, and MM-GBSA, highlighting positive interactions between the selected molecules and the protein counterpart. 
In summary, this work proposes a new computational protocol that combines classic Virtual Screening, Inverse Virtual Screening, MM-GBSA predictions, and molecular dynamics simulations to carry out a repurposing campaign using FDA-approved drugs against COVID-19. We introduced a new parameter called IVS ratio that can help to discriminate false-positive results and highlight new interacting compounds. The novelty lies in the insertion of Inverse Virtual Screening, a new and powerful method, to cross-check the results obtained from virtual screening. This step helps to limit the number of false-positive results and highlights the most promising bindings. In this way, we verified, using the IVS ratio parameter, that the protein-ligand binding was still among the most favorable ones even when the ligand was tested on multiple proteins. In this case, we focused our attention on three of the most important SARS-CoV-2 proteins (the main protease, the papain-like protease, and the spike protein), but this approach can be quickly repeated on further targets of interest. Among the protein-ligand complexes selected downstream, some results (e.g., rutin vs. main protease) are in line with recent outcomes, which makes the proposed screening scheme encouraging and provides a valuable suggestion for further evaluations. Moreover, we would like to highlight the crucial role played by computational techniques in this emergency, as they can provide valuable information in a short time, helping the scientific community in the fight against this pandemic.
There are no conflicts to declare.
Coronaviruses: Methods and Protocols
Schrödinger Release 2020-1: Protein Preparation Wizard, Epik, Schrödinger, LLC
The research leading to these results has received funding from AIRC under IG 2018-ID. 21397 project -P.I. Bifulco Giuseppe and MFAG 2017-ID. 20160 project -P.I. Lauro Gianluigi.