key: cord-0982326-nvyypxua authors: Power, Helen; Wu, Jiadai; Turville, Stuart; Aggarwal, Anupriya; Valtchev, Peter; Schindeler, Aaron; Dehghani, Fariba title: Virtual screening and in vitro validation of natural compound inhibitors against SARS-CoV-2 spike protein date: 2021-12-21 journal: Bioorg Chem DOI: 10.1016/j.bioorg.2021.105574 sha: 21f8d3ade4547e7b07ca8650268fcdddc278a53f doc_id: 982326 cord_uid: nvyypxua The COVID-19 pandemic caused by the SARS-CoV-2 virus has led to a major public health burden and has resulted in millions of deaths worldwide. As effective treatments are limited, there is a significant requirement for high-throughput, low resource methods for the discovery of novel antivirals. The SARS-CoV-2 spike protein plays a key role in viral entry and has been identified as a therapeutic target. Using the available spike crystal structure, we performed a virtual screen with a library of 527 209 natural compounds against the receptor binding domain of this protein. Top hits from this screen were subjected to a second, more comprehensive molecular docking experiment and filtered for favourable ADMET properties. The in vitro activity of 10 highly ranked compounds was assessed using a virus neutralisation assay designed to facilitate viral entry in a physiologically relevant manner via the plasma membrane route. Subsequently, four compounds ZINC02111387, ZINC02122196, SN00074072 and ZINC04090608 were identified to possess antiviral activity in the µM range. These findings validate the virtual screening method as a tool for identifying novel antivirals and provide a basis for future drug development against SARS-CoV-2. Since its emergence in Wuhan in December 2019, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has become a global health concern. As of July 2021, it has caused over 186 million infections and 4 million deaths [1] . The wild-type virus and its variants are highly transmissible with the potential to rapidly overwhelm healthcare systems [2, 3] . The associated illness, known as coronavirus disease (COVID-19) has a wide range of symptoms but is typically associated with fever, cough, fatigue, loss of smell and taste, loss of appetite and muscle pain [4, 5] . Serious complications have also been associated with the disease, including clotting disorders, cardiac injury, stroke, and seizures [6] [7] [8] . Moreover, long-term symptoms persisting from 3 weeks to months after disease onset occur in 10-30% of patients [9] [10] [11] , including in patients no longer testing positive for the virus [12] . Since the detection of the initial Wuhan strain, several variants of concern have emerged. In particular, the variants known as alpha, beta, gamma, and delta have spread rapidly and feature mutations associated with increased transmissibility and virulence and the ability to evade the host immune response [13, 14] . Currently, only one drug (remdesivir) has been approved by the FDA for non-emergency treatment of COVID-19 [15] . Remdesivir is a repurposed broad-spectrum antiviral that targets viral RNA production [16] . However, results from multiple clinical trials suggest that this drug may not improve mortality or other relevant health outcomes [17] . Monoclonal antibodies against SARS-CoV-2 have also been explored as a treatment option and are currently FDA approved for emergency use only [15] . These treatments including bamlanivimab, etesevimab, casirivimab, imdevimab and sotrovimab, are most effective as an early intervention and are not authorised for patients hospitalised for COVID-19. Moreover, supply limitations and the requirement for mAbs to be administered systemically, prevents the widespread use of these therapies [18, 19] . There is also a risk of resistance to common circulating variants, as observed with bamlanivimab, etesevimab and casirivimab [20] . There is a strong interest in finding rapid and economical methods to identify new and effective anti-coronavirus treatments. Virtual screening involves computational estimation of the optimal binding of a library of potential drugs against targets of interest. This procedure can greatly reduce the time and cost of the drug development compared to in vitro screening alone and has been frequently used to identify compounds with therapeutic potential [21] [22] [23] [24] [25] . Natural compounds are a valuable source of molecules for drug screening. These represent a diverse range of chemical structures, many with known therapeutic properties. In fact, of the 1,328 drugs approved by the US Food & Drug Administration between 1981 and 2016, 41% were natural or derived from natural products [26] . Natural compound libraries offer advantages for drug discovery due to their high structural diversity, a feature necessary for selective and specific interactions with protein targets [27] . For SARS-CoV-2, the viral spike protein represents a key therapeutic target as it mediates viral entry into host cells. The spike protein binds to the cellular receptor angiotensinconverting enzyme 2 (ACE2), leading to spike cleavage by the host enzyme transmembrane serine protease (TMPRSS2) [28] . The structure of the receptor binding domain of the Wuhan strain complexed with ACE2 has since been solved by Lan et al. and Shang et al. using X-ray crystallography [29, 30] . Contacting residues have been identified, including two key regions involved in the stabilisation of the binding interaction. These two regions, known as the Lys31 and Lys353 hotspots, provide a basis for the design of targeted treatments that could prevent the early stages of infection. Using this information, we aimed to identify natural compounds with anti-SARS-CoV-2 activity using a combined molecular docking and in vitro approach. Most docking studies screening natural compounds against the spike protein have screened only a small number of ligands (<200) and focus on the computational aspect alone [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] . Even larger-scale studies do not include experimental validation [41] [42] [43] . To address these limitations, we performed a structure-based docking study targeted at the regions of the spike receptor binding domain (RBD) involved in stabilisation of the Lys31 and Lys353 hotspots. A virtual screen against this site was performed with a total of 527 209 molecules from five natural product databases: phenol explorer [44] , marine natural products [45] , ZINC [46] , Super Natural II [47] and the Human Metabolome Database [48] . After filtering top hits for favourable druglike characteristics, commercially available compounds were validated using a cell model designed to replicate infection in humans. This model uses human embryonic kidney (HEK) cells genetically modified to express ACE2 and TMPRSS2, thus facilitating viral entry in a physiologically relevant manner via the plasma membrane route [28] . Subsequently, we identified four compounds with anti-SARS-CoV-2 activity in the µg/ml range. We also discussed the relevancy of this model in relation to the emerging variants of concern. These findings highlight the capacity of virtual screening to successfully identify inhibitory molecules and provide a basis for further studies regarding viral mechanisms and drug optimisation. Docking was performed using the AutoDock Vina software with a rigid docking protocol [49] . To prepare the protein target for docking, the X-ray derived crystal structure of the SARS-CoV-2 spike receptor-binding domain bound with ACE2 (PDB ID: 6M0J) was downloaded in PDB format from the Research Collaboratory for Structural Bioinformatics Protein Data Bank (https://www.rcsb.org/). PyMOL was used to remove ACE2, water molecules and ions and to add polar hydrogens. The file was then converted to PDBQT format using the PyRx virtual screening tool [50] . The search space of spike was defined to include key regions of the protein involved in the stabilisation of binding to ACE2 (centre: -36.402, 24.771, 6.459; dimensions: 18.703, 29.999, 13.617). The docking protocol was conducted in two stages. The primary round of screening was performed with all 527 209 ligands. To prepare the ligands for the initial screen, all files were downloaded in SDF format and edited to include polar hydrogens using PyMOL [51] . 3D coordinates were generated by conversion to MOL2 with Marvin Suite 6.0 from ChemAxon [52] . Ligands were then energy minimised and converted to PDBQT format with Open Babel software v 3.1.1 [53] . For the docking of each ligand, the number of docked conformations (modes) and the number of independent runs (exhaustiveness) was set to 10. To increase the probability of identifying binding conformations with the lowest energy, a secondary round of virtual screening was conducted using the top ligands from the primary screen ranked with the leading 30 scores (n=12,322). Ligands screened in the second round were energy minimised and converted to PDBQT format using the PyRx virtual screening tool [50] . For this stage, the number of runs was increased to 40. The binding conformation of the ligands complexed with spike protein were visualised using PyMOL [51] . Residues involved in hydrogen-bonding and hydrophobic interactions were analysed and plotted using the LigPlot + software [54] . To filter compounds with favourable drug-like characteristics, the OSIRIS property explorer was used to calculate the toxicity risk, hydrophilicity (cLogP), solubility (logS), molecular weight (MW), topological surface area (TPSA), drug-likeness and overall drug-score of the top ranked ligands [55] . Compounds with no predicted toxic fragments (mutagenic, tumorigenic, irritant or reproductive effects) and favourable drug-like properties were retained. Ten commercially available compounds with docking scores ≤ -7.9 and favourable drug-like properties were purchased from Vitas-M Laboratory for in vitro testing. All compounds were dissolved in DMSO at a concentration of 10mg/ml and stored at -20ºC. The cytotoxicity of the purchased compounds was initially assessed in VERO cells using the water-soluble tetrazolium salt (WST-1) assay. Twenty-four hours prior to the assay, VERO cells were seeded in 96-well plates at a density of 10,000 cells per well. Cells were then treated with test compounds serially diluted 10-fold to final concentrations of 100 µg/ml, 10 µg/ml, 1 µg/ml, and 0.1 µg/ml. Vehicle controls were prepared by diluting DMSO to final concentrations corresponding to the percentage of DMSO in the prepared chemicals. Untreated cells were prepared as positive controls and wells containing DMEM and DMSO only were used as blanks. Following treatment for 24 hours at 37 ºC, cells were washed once with DMEM and incubated with 10% WST-1 for 2 hours. To determine cell viability, absorbance was measured at 450 nm and 620 nm as a reference wavelength. The relative cell viability was calculated using the following equation: The highest concentration with a cell viability over 75% was selected as the safe concentration for downstream antiviral assays. The human cell line HEK clone 24 used by this study was genetically modified to express ACE2 and TMPRSS2 (PMID:34228725) [56] . Successful genetic modification was assessed based on the ability of this cell line to be infected with a strain of SARS-CoV-2 that is genetically the same as that observed in Wuhan in late 2019 (Clade A2.2). On the day of the experiment, live HEK-24 cells were stained using NucBlue dye and seeded Where D represents the percentage of cell death in the absence of test compounds and is calculated as: = 1 - All statistical analyses were performed using the GraphPad Prism 9 software (GraphPad Software, Inc.). Non-linear regression analysis was used to generate dose-response curves and calculate EC 50 values. Values are represented as the mean ± standard deviation (SD). The docking scores for these top 10 compounds ranged from (-8.4)-(-7.9). One ligand had the top docking score of -8.4 (ZINC02726715), two ligands had a docking score of -8.3 (ZINC02122196, ZINC96115460), three ligands had a docking score of -8.1 (SN00074072, ZINC02111387, ZINC96114431) two ligands had a docking score of -8.0 (SN00092464, ZINC04090608) and two ligands had a docking score of -7.9 (ZINC06624435, ZINC85878555). The number of receptor residues involved in the binding interaction ranged from 7-11 and the number of hydrogen bonds ranged from 0-4 (Table 2) . Additionally, all selected compounds contained 4-7 ring structures ( Figure 2 ). This feature is particularly notable for inhibitors of protein-protein interactions (iPPI), as iPPIs are associated with a greater number of aromatic rings than other drugs [57] . The search space used in the docking study was defined to include the spike residues involved in binding to two key regions of the ACE2 protein known as the Lys31 and Lys353 hotspots. Neutralisation of these lysine residues is required to stabilise the binding interaction between the viral RBD and its receptor. Within the spike protein, Leu455 and Gln493 are required for stabilisation of the Lys31 hotspot and Gly496 and Asn501 are required for stabilisation of the Lys353 hotspot [30] . Lys417 forms two salt bridges with ACE2 and is also important for receptor binding [29] . Therefore, the interaction of small molecules with these residues may interfere with stable receptor binding and reduce viral entry. Notably, the docking conformation of all 10 ligands included interactions with both Gly496 and Asn501. Three ligands interacted with Leu455, five ligands interacted with Gln493 and one ligand interacted with Lys417. However, of these 29 interactions with key residues, only three involved hydrogens bonds, with the remaining 26 forming hydrophobic contacts. One compound, ZINC02111387 interacted with all five key residues ( Figure 3 , Table 2 ). Bolded residues are involved in hydrogen bonding. An advantage of molecular docking is the ability of this method to predict the effect of viral mutations on the binding conformation of lead compounds. Changes to residues in the RBD may decrease or increase binding affinity and may also change the structural conformation of the entire domain. The four main variants of concern: alpha, beta, gamma, and delta, contain mutations in five amino acid residues found in the spike RBD ( Figure 4 ) [14] . The N501Y mutation has evolved independently in the alpha, beta and gamma variants and has been shown to affect the structural conformation of the RBD and increase binding affinity to ACE2. Notably, Asn501 has been identified in stabilisation of the Lys353 hotspot and as mentioned above, it is predicted to interact with all 10 lead compounds. All variants of concern carry a mutation in Glu484, which is only predicted to interact with ZINC96115460 and two variants, alpha and beta contain a mutation in Lys417, which is only predicted to interact with ZINC02111387. The two mutations unique to the delta variant, L452R and T478K, are not present in any of the interactions. To provide a better prediction of how these mutations may affect binding to specific compounds, further modelling is required, although this depends on the availability of high-resolution protein structures. Currently, molecular structures of the spike protein have been solved by cryo-electron microscopy for the alpha, beta and gamma variants [58] . However, the delta variant spike structure has not yet been solved and high resolution (<2.5 Å) X-ray crystallography structures are currently not available for any variants. Several approaches may be considered to expand the number of candidate compounds identifiable using computational methods. Firstly, screening larger compound libraries has been shown to increase the number of true-positive hits within the top ranked ligands [59] . However, this requires access to large amounts of computational resources and relies on the availability of extensive compound libraries. When focusing on natural products, even the largest collections contain only hundreds of thousands of molecules [60] , compared to other synthetic libraries with over one billion compounds [61] . Another approach is to reduce stringency when filtering for drug-like properties or to eliminate this step entirely. This approach, however, may increase the presence of undesirable properties in lead compounds including poor bioavailability and high cytotoxicity and hence may require further research into appropriate delivery methods or modification of functional groups using techniques such as bioisosterism. The top 10 ranked ligands from the virtual screen that were both commercially available and had favourable ADMET profiles were purchased for in vitro assessment of their anti-SARS-CoV-2 activity. The concentration of these compounds employed in antiviral assays was guided by the outcomes of a cytotoxicity assay; this was defined as 100µg/ml or the highest concentration in which cell viability was over 75%. The highest tested concentration of 100µg/ml showed >75% viability for six compounds, 10µg/ml showed >75% viability for three compounds, and 0.1µg/ml showed >75% viability for one compound (Table 3) . A SARS-CoV-2 infection assay in HEK-24 cells genetically modified to express ACE2 and TMPRSS2 was used to assess the candidate natural compounds. Due to the expression of the TMPRSS2 protease in the human respiratory tract, viral particles largely utilise the plasma membrane route for entry into these tissues. Some alternative cell lines such as VeroE6, do not express the required proteases and entry is predominantly endosomal [28, 62] . Therefore, based on physiological relevance, the genetically modified HEK-24 cell line was selected for experimental validation. Using this in vitro model, anti-SARS-CoV-2 activity was exhibited by ZINC02111387, ZINC02122196, SN00074072 (maximal dose 100µg/ml) and ZINC04090608 (maximal dose 10µg/ml) ( Figure 5 , For the four compounds that were inhibitory for SARS-CoV-2, docking scores ranged from (-8.3)-(-8.0), the number of predicted interacting residues in the spike protein ranged from 7-11, and the number of hydrogen bonds formed ranged from 1-2. Notably, the most potent compound ZINC02111387, was predicted to interact with all five key spike residues identified as important for binding to the native ACE2 receptor (Lys417, Leu455, Gln493, Gly496 and Asn501). The remaining three compounds were predicted to interact with two key residues, Gly496 and Asn501. Although there was no calculable association between docking score and virus neutralisation, the two lowest scoring compounds ZINC06624435 and ZINC85878555 did not exhibit any antiviral activity. The modest number of compounds tested in vitro reflects the low off-the-shelf availability of natural compounds with drug-like ADMET profiles. Natural products are advantageous for drug discovery due to their known ability to carry out diverse functions in biological systems and their exceptional structural diversity which is often not observed in large chemical databases [63] . However, many natural compounds are not readily available and if not manufactured synthetically, may require complex extraction and purification procedures. and solvent-exposed regions [64] . Though the antiviral activity of four compounds was confirmed experimentally, validation of the binding mechanism cannot be determined from the virus neutralisation assay. Whilst molecular docking provides a prediction of the binding conformation, viral inhibition may potentially occur through allosteric effects or interactions with biomolecules other than spike. Additional techniques are required to validate the docked conformation, such as SPR for investigating binding kinetics and cryo-electron microscopy or X-ray co-crystallography for analysis of the interaction at an atomic level. From a treatment perspective, however, viral neutralisation assays provide a better indication of drug efficacy in a biological system. Although beyond the scope of this study, further research into the binding mechanism of the anti-SARS-CoV-2 compounds may help to identify key spike residues for drug targeting and to serve as the basis for designing chemical modifications to improve the activity of these compounds. This study demonstrates the ability of a targeted structure-based molecular docking approach combined with in vitro validation for the identification of novel antiviral drugs. Despite testing a limited number of compounds in vitro (n=10), we were able to identify four compounds with previously unknown activity against SARS-CoV-2. These active compounds showed 18-40% viral inhibition using an established live virus infection assay. While their activity is low when compared to neutralising antibodies (which have been shown to inhibit 100% of viral particles in the nM range [65] ), it nonetheless supports our modelling approach and its capacity to identify compounds with binding affinity to the target site. It may be possible to further augment the binding affinity and antiviral activity of these candidates by targeted chemical modification (bioisosterism), although this is beyond the scope of this modelling project.  A live virus neutralization assay verified the antiviral activity of selected compounds  ZINC02111387, ZINC02122196, SN00074072 and ZINC04090608 were identified as active compounds Declarations of interest: none Emergency Situational Updates. 2021, World Health Organisation SARS-CoV-2 Infectivity and Severity of COVID-19 According to SARS-CoV-2 Variants: Current Evidence High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2. Emerg Infect Dis The prevalence of symptoms in 24,410 adults infected by the novel coronavirus (SARS-CoV-2; COVID-19): A systematic review and meta-analysis of 148 studies from 9 countries Real-time tracking of self-reported symptoms to predict potential COVID-19 Cardiac and arrhythmic complications in patients with COVID-19 Incidence of thrombotic complications in critically ill ICU patients with COVID-19 Neurological complications of coronavirus infection; a comparative review and lessons learned during the COVID-19 pandemic Post-COVID-19 Syndrome: The Persistent Symptoms at the Post-viral Stage of the Disease. A Systematic Review of the Current Data Acute and persistent symptoms in non-hospitalized PCR-confirmed COVID-19 patients Sequelae in Adults at 6 Months After COVID-19 Infection Persistent Symptoms in Patients After Acute COVID-19 Emerging SARS-CoV-2 variants of concern and potential intervention approaches Structural Evaluation of the Spike Glycoprotein Variants on SARS-CoV-2 Transmission and Immune Evasion COVID-19) | Drugs. Emergency Preparedness | Drugs Broad spectrum antiviral remdesivir inhibits human endemic and zoonotic deltacoronaviruses with a highly divergent RNA dependent RNA polymerase WHO. Therapeutics and COVID-19: living guideline Neutralizing monoclonal antibodies for treatment of COVID-19 Passive antibody therapy for infectious diseases Antibody resistance of SARS-CoV-2 variants B.1.351 and B.1.1.7. Nature Identification of new GLUT2-selective inhibitors through in silico ligand screening and validation in eukaryotic expression systems Alzheimer's disease: An overview of amyloid beta dependent pathogenesis and its therapeutic implications along with in silico approaches emphasizing the role of natural products In silico screening for human norovirus antivirals reveals a novel nonnucleoside inhibitor of the viral polymerase Structure-based discovery of novel chemotypes for adenosine A(2A) receptor antagonists Natural Products as Sources of New Drugs from 1981 to 2014 SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor Structural basis of receptor recognition by SARS-CoV-2 Potential of Plant Bioactive Compounds as SARS-CoV-2 Main Protease (M(pro)) and Spike (S) Glycoprotein Inhibitors: A Molecular Docking Study. Scientifica (Cairo) A phytochemical-based medication search for the SARS-CoV-2 infection by molecular docking models towards spike glycoproteins and main proteases Molecular Docking and Molecular Dynamics Aided Virtual Search of OliveNet Directory for Secoiridoids to Combat SARS-CoV-2 Infection and Associated Hyperinflammatory Responses An in-silico evaluation of different Saikosaponins for their potency against SARS-CoV-2 using NSP15 and fusion spike glycoprotein as targets Molecular docking and dynamics studies of curcumin with COVID-19 proteins. Netw Model Anal Health Inform Bioinform In silico Molecular Docking Analysis Targeting SARS-CoV-2 Spike Protein and Selected Herbal Constituents Epigallocatechin gallate and theaflavin gallate interaction in SARS-CoV-2 spike-protein central channel with reference to the hydroxychloroquine interaction: Bioinformatics and molecular docking study Molecular docking study of potential phytochemicals and their effects on the complex of SARS-CoV2 spike protein and human ACE2 Molecular Docking Analysis Of Some Phytochemicals On Two SARS-CoV-2 Targets: Potential Lead Compounds Against Two Target Sites of SARS-CoV-2 Obtained from Plants Pinpointing the potential hits for hindering interaction of SARS-CoV-2 Sprotein with ACE2 from the pool of antiviral phytochemicals utilizing molecular docking and molecular dynamics (MD) simulations The discovery of potential natural products for targeting SARS-CoV-2 spike protein by virtual screening Natural Products Database Screening for the Discovery of Naturally Occurring SARS-Cov-2 Spike Glycoprotein Blockers. ChemistrySelect Screening of Natural Products Targeting SARS-CoV-2-ACE2 Receptor Interface -A MixMD Based HTVS Pipeline. Frontiers in Chemistry Phenol-Explorer 3.0: a major update of the Phenol-Explorer database to incorporate data on the effects of food processing on polyphenol content An Updated Review on Marine Anticancer Compounds: The Use of Virtual Screening for the Discovery of Small-Molecule Cancer Drugs ZINC 15--Ligand Discovery for Everyone Super Natural II--a database of natural products HMDB: a knowledgebase for the human metabolome AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading Small-molecule library screening by docking with PyRx The PyMOL Molecular Graphics System Version 2.3.5. 2019, Schrödinger, LLC Open Babel: An open chemical toolbox LigPlot+: multiple ligand-protein interaction diagrams for drug discovery SARS-CoV-2 neutralizing antibodies: Longevity, breadth, and evasion by emerging viral variants Rationalizing the chemical space of protein-protein interaction inhibitors Effect of natural mutations of SARS-CoV-2 on spike structure, conformation, and antigenicity Ultra-large library docking for discovering new chemotypes Review on natural products databases: where to find data in 2020 An open-source drug discovery platform enables ultra-large virtual screens SARS-CoV-2 tropism, entry, replication, and propagation: Considerations for drug discovery and development Computational drug design targeting protein-protein interactions A practical guide to large-scale docking A SARS-CoV-2 neutralizing antibody with extensive Spike binding coverage and modified for optimal therapeutic outcomes The authors acknowledge the technical assistance of Nathaniel Butterworth of the Sydney Informatics Hub, a Core Research Facility of the University of Sydney and the use of University of Sydney's high performance computing cluster, Artemis.This research/project was also undertaken with the assistance of the high-performance computing system Gadi from the National Computational Infrastructure (NCI), which is supported by the Australian Government.Investigators J Wu and H Power received salary support from JAT Energy (Toorak, VIC, Australia). The authors declare no competing interests.