key: cord-0720991-31ngknq4
authors: Sarma, Phulen; Shekhar, Nishant; Prajapat, Manisha; Avti, Pramod; Kaur, Hardeep; Kumar, Subodh; Singh, Sanjay; Kumar, Harish; Prakash, Ajay; Dhibar, Deba Prasad; Medhi, Bikash
title: In-silico homology assisted identification of inhibitor of RNA binding against 2019-nCoV N-protein (N terminal domain)
date: 2020-05-18
journal: J Biomol Struct Dyn
DOI: 10.1080/07391102.2020.1753580
sha: 50bc439439b15ffc3b8599e0c80c427daccff41b
doc_id: 720991
cord_uid: 31ngknq4

The N terminal domain (NTD) of the nucleocapsid protein (N protein) of coronavirus (CoV) binds to the viral (+) sense RNA and forms the CoV ribonucleoprotein (CoV RNP) complex, which is essential for virus replication. In this study, the RNA-binding NTD of the N protein was targeted to identify possible inhibitors of RNA binding. Two NTD structures of N proteins were selected (2OFZ and 1SSK, 92% homology) for virtual screening of 56,079 compounds from the Asinex and Maybridge libraries to identify the top 15 hits for each target based on docking score. These top hits were further screened for MM-GBSA binding free energy, pharmacokinetic properties (QikProp) and drug-likeness (SwissADME), and subjected to molecular dynamics (MD) studies. Two suitable binders (ZINC00003118440 and ZINC0000146942) against the target 2OFZ were identified. ZINC00003118440 is a theophylline derivative belonging to the drug class 'bronchodilators', so approved bronchodilators were also screened for their ability to bind to the RNA binding region of the N protein. The other identified top hit, ZINC0000146942, is a 3,4-dihydropyrimidone class molecule. Hence this study suggests two important classes of compounds, theophylline and pyrimidone derivatives, as possible inhibitors of RNA binding to the N terminal domain of the N protein of coronavirus, thus opening new avenues for in vitro validation. Communicated by Ramaswamy H. Sarma

Coronaviruses are enveloped, spherical or pleomorphic, single-stranded RNA viruses that bear characteristic club-shaped glycoproteins on their surface. Seven strains of human CoVs are documented: 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV and 2019-nCoV. The two previous coronavirus epidemics, SARS-CoV and MERS-CoV, with mortality of 10% for SARS-CoV and 37% for MERS-CoV, affected approximately 10,000 individuals. The re-emergence and severe outbreak of coronavirus, now as COVID-19, has caused a global health emergency. Originating in Wuhan, China, the virus has spread as far as Thailand, Japan, Korea, the USA, Vietnam, Singapore, India and European nations. WHO has declared COVID-19 a pandemic (Coronavirus Disease (COVID-19)-Events as they happen, n.d.). The 2019-nCoV is reported to be the seventh and newest coronavirus known to infect humans (Zhu et al., 2020).

The CoV has several conserved structural proteins, e.g. the matrix protein (M protein), small envelope protein (E protein), trimeric spike (S) glycoprotein and nucleocapsid protein (N protein), as well as proteases (papain-like and the main protease). The nucleocapsid protein is typically located inside the virus and is one of the most abundant structural proteins in coronaviruses. The N protein binds the viral RNA genome to form a virion core, which is vital for replication and transcription. In addition, the N protein takes an essential part in viral RNA synthesis (Chang et al., 2016). Recent studies suggest that the N proteins of coronaviruses can be useful antiviral drug targets because of their importance in the replication initiation machinery (Chenavas et al., 2013; Lin et al., 2014). One interesting feature is that the N protein has the least variable structure across all CoVs (Chang et al., 2006, 2016).
The N terminal domain (NTD) of the nucleocapsid protein (N protein) plays a significant role in binding to genomic and sub-genomic RNAs in MHV and IBV virions. The N protein of CoV plays a major role in the viral replication cycle by forming the ribonucleoprotein complex through interactions between the viral RNA and the NTD of the N protein (Lin et al., 2014). Because of this activity, a previous study suggested that the SARS-CoV N protein acts as an RNA chaperone and is a stepping stone in viral genomic RNA replication. Few studies have attempted to design new drugs that disrupt the interaction between the coronavirus N protein NTD and CoV RNA. A molecule, 'PJ34', designed to target the N protein RNA binding domain of HCoV-OC43 N-NTD (PDB 4KXJ) has been reported (Lin et al., 2014); however, the similarity between HCoV-OC43 N-NTD (PDB 4KXJ) and the 2019-nCoV N protein is only 46%. Hence there is a need to design new inhibitors against the RNA binding region of the N protein, in order to inhibit RNA binding and, in turn, viral replication in 2019-nCoV.

The computations were carried out on an Acer Predator Helios 300 laptop running Linux Ubuntu 18.04.02 LTS, with Maestro release 2019-3 from Schrödinger. The GPU used for the molecular modeling and dynamics simulations of the molecules was an Nvidia GeForce GTX 1660 Ti (6 GB).

In this study we aimed to design an inhibitor of RNA binding. Our target of interest was the NTD of the nucleocapsid (N) protein, and within it the RNA binding domain.
However, as no structure of the 2019-nCoV nucleocapsid protein was available, the structures closest to the 2019-nCoV nucleocapsid protein N terminal domain (RNA binding domain) were 2OFZ (Saikatendu et al., 2007) and 1SSK (92% sequence homology with the Wuhan seafood market pneumonia virus nucleocapsid phosphoprotein, accession number QHN73802). Although both structures represent the RNA binding domain (RBD) of the N protein, neither is bound to any reported inhibitor of RNA binding. Of the selected structures, 1SSK is an NMR structure (no resolution is given) and 2OFZ is an ultrahigh-resolution X-ray crystal structure (1.17 Å).

The target structures (PDB IDs 2OFZ and 1SSK) were retrieved from the RCSB database (PDB Database, n.d.). The protein preparation wizard of Maestro version 10.2 was used to prepare the protein structures. Missing side chains and backbone segments were added. The protein was minimized with default settings and the OPLS3 force field (Desmond, n.d.; Meng et al., 2011). Water molecules beyond 5.0 Å were removed to avoid hindrance in the binding region.

Ligand structure retrieval and preparation

A total of 36,750 compounds from the Maybridge library and 19,329 compounds from the Asinex library were obtained from the ZINC database and prepared in the LigPrep module of Schrödinger. This step is crucial for processing the input structures. Optimization was carried out with the OPLS 2005 force field, which produced low-energy isomers in ready-to-dock poses (LigPrep, n.d.).

SiteMap was used to determine the binding pocket. A pictorial representation of the estimated binding pocket is shown in Figure 1. Ribonucleoside 5′-monophosphates (AMP and UMP) were then docked at the predicted pocket coordinates to gain insight into the residues most active in nucleotide binding (Lin et al., 2014).
An important observation from this docking is the marked difference between the binding coordinates of 2OFZ and 1SSK, which are the crystal and solution structures of the SARS-CoV N protein, respectively. The spatial UMP-AMP docked pose RMSDs for 2OFZ and 1SSK were 3.74 Å and 3.85 Å, respectively. The interacting residues are listed in Table 1 and illustrated in Figure 2.

The Glide tool of Maestro was used for virtual screening. A total of 56,079 compounds were screened through the three-step virtual screening workflow (VSW) in a tandem HTVS, SP and XP progression. Enhanced sampling and flexible docking were enabled in XP to screen hits for the N protein NTD models (2OFZ and 1SSK) (Desmond, n.d.; Meng et al., 2011).

Prime MM-GBSA was used to predict the free energy of binding between the receptor and the set of ligands. The binding free energy (ΔGbind) was calculated using the default parameters of the Prime module in Maestro (Schrödinger, LLC, New York, NY, USA) (Singh & Muthusamy, 2013). Prime uses a surface generalized Born (SGB) model that employs a Gaussian surface instead of a van der Waals surface for better representation of the solvent accessible surface area. MM-GBSA has been reported to give better statistical correlation with experimental binding data than previously reported comparable approaches (Nain et al., 2019; Adasme-Carreño et al., 2014). The pose-viewer files of the docked complexes were used as input, with the implicit VSGB solvation model, the OPLS3 force field and all other Prime MM-GBSA settings left at their defaults (GlideScore/Docking Score doesn't correlate with my known activities. What is wrong? | Schrödinger, 2020; What do all the Prime MM-GBSA energy properties mean? | Schrödinger, 2020).
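For orientation, the quantity reported as ΔGbind can be written in the generic end-point form below. This is the conventional textbook decomposition of an MM-GBSA estimate, given here as an assumption about what the Prime output represents; it is not a formula stated in this paper, and the symbols G_complex, G_receptor, G_ligand, E_MM, G_GB and G_SA are labels chosen for this sketch.

```latex
\[
\Delta G_{\mathrm{bind}}
  = G_{\mathrm{complex}} - \bigl( G_{\mathrm{receptor}} + G_{\mathrm{ligand}} \bigr),
\qquad
G_{X} = E_{\mathrm{MM}}(X) + G_{\mathrm{GB}}(X) + G_{\mathrm{SA}}(X)
\]
```

Here E_MM is the molecular mechanics (force field) energy of species X, G_GB the polar solvation term from the generalized Born model and G_SA the nonpolar, surface-area-dependent term. Under this convention a more negative ΔGbind indicates stronger predicted binding, which is how the MM-GBSA scores in the following sections are ranked.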
The docked N protein NTD complexes with the highest docking scores (15 molecules each for 2OFZ and 1SSK) were submitted for evaluation of physicochemical and pharmacokinetic properties in the QikProp module of Schrödinger. Drug-likeness of the hits was evaluated using values obtained from SwissADME (SwissADME, n.d.).

MD simulations can provide in-depth knowledge of the dynamic behavior of biomolecules and can describe a free-energy landscape that is close to the native state of the protein inside the body. Hence, subjecting the docked complexes to MD simulations provides a more accurate interaction profile. Molecular dynamics studies of the selected ligands were carried out against the chosen target structures using the Desmond module. To study the interactions of the targeted proteins with the ligands, the optimized potentials for liquid simulations (OPLS)-2005 force field was employed. Firstly, a position restraint of 6000 ps was selected for the drug-target complex, which allows water molecules in the system. Secondly, various frames were used to minimize the complex to which the force field was applied. Subsequently, the root mean square deviation (RMSD) of the Cα atoms, the ligand root mean square fluctuation (RMSF) and the ligand contacts were obtained from the data to monitor the stability of the protein in its dynamic form along the simulated trajectory (Desmond, n.d.).

We used two PDB structures for virtual screening (2OFZ and 1SSK). Fifteen hits for each were selected on the basis of docking score. The details of the screened molecules are shown in Table 2 (for 2OFZ) and Table 3 (for 1SSK).

MM-GBSA: In the case of target 2OFZ, the top five ligands by MM-GBSA score were ligands 1 > 7 > 3 > 11 > 9. The best (most negative) MM-GBSA score was −54.798.
In the case of the target 1SSK, poor binding affinity was seen for all ligands, with the best MM-GBSA score (ΔGbind) being −44. The ligands with the best MM-GBSA scores were ligands 8 > 5 > 12.

Pharmacokinetics and physicochemical properties and drug likeness of the ligands and selection of hits for further molecular dynamic simulation studies

A log P value of 1-3 may indicate optimal physicochemical properties, whereas a log P value above five results in rapid turnover, low solubility and poor absorption. A log P not greater than 5 is a component of Lipinski's rule (20). In our study, as the target is intracellular (nucleocapsid protein), the lipophilic nature of the ligand is very important, and we kept the range at 1 to <5.

In the case of 2OFZ, among the 15 molecules selected (2OFZ-ligand complexes), 8 ligands had a log P value between 1 and <5 (ligands 1, 4, 5, 7, 8, 11, 14 and 15). Among these, taking an arbitrary MM-GBSA cutoff of −45, binding affinity was weak (weaker than −45) for ligands 4, 8, 14 and 15. In addition, ligand 14 showed strong HERG inhibition. So, on the basis of lipophilicity and binding affinity, the best performing ligands were ligands 1, 5, 7 and 11. Complexes were advanced to MD simulation on two separate grounds: docking score (ligands 1, 2 and 3), and lipophilicity (log P) together with binding affinity (MM-GBSA score) (ligands 1, 5, 7 and 11). We therefore conducted molecular dynamics studies of a total of 6 ligands (ligands 1, 2, 3, 5, 7 and 11).

In the case of 1SSK, among the 15 molecules selected (1SSK-ligand complexes), 9 ligands had a log P value between 1 and <5 (ligands 2, 3, 4, 6, 7, 8, 9, 12 and 14). Among these, ligand 6 showed strong potential for HERG inhibition and ligand 3 showed additional rule of three violations. The remaining molecules can be arranged in the order 8 > 12 > 9 > 2 > 4 > 14 > 7 on the basis of MM-GBSA.
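The selection logic described above (a log P window of 1 to <5, an arbitrary MM-GBSA cutoff of −45, and exclusion of ligands flagged for HERG inhibition) can be sketched as a simple filter. This is a minimal illustration and not the authors' workflow: the `Ligand` record, the function names, the treatment of the cutoff as a strict inequality and the use of the QPlogHERG "concern below −5" threshold as the HERG criterion are all assumptions made for this example, and it is meant to be run on values read from Tables 2 and 3.

```python
from typing import List, NamedTuple


class Ligand(NamedTuple):
    """Hypothetical record; field names are chosen for this sketch only."""
    serial: int       # serial number used to refer to the molecule in the text
    logp: float       # QPlogPo/w
    mmgbsa: float     # MM-GBSA binding free energy (more negative = stronger)
    log_herg: float   # QPlogHERG (values below -5 are a concern)


def passes_filters(lig: Ligand,
                   logp_min: float = 1.0,
                   logp_max: float = 5.0,
                   dg_cutoff: float = -45.0,
                   herg_concern: float = -5.0) -> bool:
    """Return True if the ligand satisfies all three selection criteria."""
    lipophilicity_ok = logp_min <= lig.logp < logp_max   # range 1 to <5
    binding_ok = lig.mmgbsa < dg_cutoff                  # assumed strict cutoff
    herg_ok = lig.log_herg >= herg_concern               # assumed HERG criterion
    return lipophilicity_ok and binding_ok and herg_ok


def select_hits(ligands: List[Ligand], **thresholds: float) -> List[Ligand]:
    """Keep ligands that pass the filters, ranked from strongest to weakest binding."""
    kept = [lig for lig in ligands if passes_filters(lig, **thresholds)]
    return sorted(kept, key=lambda lig: lig.mmgbsa)
```

Calling `select_hits` on a list of `Ligand` records returns the surviving molecules ordered by MM-GBSA score, which mirrors the ranked orderings quoted in the text; the thresholds are keyword arguments so that a different cutoff can be tried without editing the function.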
The top four molecules (8, 12, 9 and 2) were selected for MD simulation studies, as the rest had poor MM-GBSA scores.

Table notes: QPlogPo/w - Predicted octanol/water partition coefficient (−2.0 to 6.5). QPlogHERG - Predicted IC50 value for the blockage of HERG K+ channels (concern below −5). QPPCaco - Predicted apparent Caco-2 cell permeability in nm/sec (<25 poor, >500 great). QPlogBB - Predicted brain/blood partition coefficient (−3.0 to 1.2). QPlogKhsa - Prediction of binding to human serum albumin (−1.5 to 1.5). For lead-likeness and drug-likeness (rule of five and rule of three): 0 = YES, 1 or above. For convenience, molecules are referred to in the text by the serial numbers used in the tables.

Molecular dynamics simulations of the ligand complexes of the nucleocapsid protein N-terminal domain (PDB structures 2OFZ and 1SSK) were used to assess the dynamic profile of the N-protein interaction with the selected ligands (selected on the basis of MM-GBSA, log P, HERG inhibition potential, rule of 5, rule of 3 and drug-likeness rules). Given the low structural stability of the protein model itself, the 1SSK N-protein simulation model was excluded from further analysis, as it was highly unstable, with average RMSDs ranging between 10 and 12 Å (Supplementary Figure 1). In contrast, the selected molecules from the N-protein (PDB ID: 2OFZ) complexes revealed a few excellent binders. The lead molecules ZINC00003118440 and ZINC0000146942 showed high interaction ratio and strength with the active binding residues of the N-protein (Figure 3), with average RMSDs of 2.8 Å and 4.0 Å. The protein side-chain RMSF remained stable in the two complexes, with local maxima at the ligand-contact residues of 1.8 Å and 2.2 Å for ZINC00003118440 and ZINC0000146942, respectively. Interestingly, residues Gly 70, Val 73 and Pro 74 are the prime binding residues of the N-protein shared by the top dynamically stable hit complexes and the nucleotides AMP and UMP. The in-place docking poses of the top hits ZINC00003118440 (left) and ZINC0000146942 (right) with the SARS-CoV N protein NTD (2OFZ) are shown in Figure 4. The MD simulation profiles of the rejected complexes can be found in Supplementary Figures 1 and 2, which accompany the movie file of the 2OFZ-ZINC00003118440 complex (Supplementary material Movie 1). One essential feature of compound ZINC00003118440 in the molecular dynamics is that it remained in the binding site throughout the trajectory, with protein fluctuations staying within range.

The hit molecule ZINC000003118440 [8-(2-hydroxyethyl) aminophylline] is a synthetic derivative of the drug theophylline (detailed structure shown in Figure 5). Theophylline is a bronchodilator of the methylxanthine class. A few studies have reported antiviral properties of methylxanthines (Yamazaki and Tagaya, 1980). However, for theophylline or its derivatives specifically, only one study has reported antiviral activity, against the hepatitis B virus (Zheng et al., 2011), and no further data are available for other types of viruses. In the management of COVID-19, the interim guidelines encourage the use of selective M1/M3 receptor blockers (anticholinergic agents) to reduce the secretion of lung glands, reduce spasm and associated wheeze, and improve respiratory function (Jin et al., 2020).
As the best hit was a derivative of theophylline, which is a bronchodilator, and bronchodilators are often used in the management of viral lung diseases, we further screened all approved bronchodilator drugs (Tripathi, 2013) for their binding to the N protein NTD (RNA binding domain). The detailed data are shown in Table 4. Our study thus also demonstrated the binding affinity of the different bronchodilators against the NTD (RNA binding region) of the N protein of coronavirus. The binding affinity is in the order Formoterol > Terbutaline > Ipratropium bromide > Tiotropium bromide > Theophylline > Salbutamol. Interestingly, formoterol also showed binding against the 2019-nCoV PL protease (Zhavoronkov et al., 2020). A preliminary in vitro assay may provide further details, may pave the way for a drug with dual action (antiviral and bronchodilator) and may guide the choice of bronchodilator in COVID-19. However, toxicity and the presence of comorbidities will also govern the choice of bronchodilator.

The second compound, ZINC000000146942 (ethyl (4S)-4-methyl-2-oxo-6-[(1S)-1-phenylethyl]-3,4-dihydro-1H-pyrimidine-5-carboxylate), is a derivative of 3,4-dihydropyrimidone. 3,4-Dihydropyrimidinones (DHPMs) are implicated in a wide range of biological activities. Pyrimidone derivatives are already being used against viral infections (Phucho et al., 2009; Wierenga et al., 1985; Skulnick et al., 1985; Sharma et al., 2014; Seley-Radtke et al., 2018). The pyrimidone nucleus is a core component of many anti-retroviral drugs (Sharma et al., 2014), and the pyrimidone scaffold is the backbone of many approved anti-retrovirals, e.g. Zidovudine, Didanosine and Zalcitabine (Seley-Radtke & Yates, 2018). Other pyrimidone derivatives, 5-iododeoxyuridine and 5-iodo-2′-deoxyuridine, are extensively used against viral infections (Sharma et al., 2014).
Thus it is plausible to find such a derivative in a screening study targeting SARS-CoV.

We identified two potential hits, ZINC000003118440 and ZINC000000146942, as inhibitors of RNA binding to the NTD of the N protein. The first hit is a theophylline derivative and the second is a pyrimidone derivative. Both classes have been reported to have antiviral effects against other viruses. However, this is the first time the binding potential of these two hits has been reported against the N protein NTD (RNA binding region) of SARS-CoV-2. Interestingly, the first compound is a derivative of theophylline, a commonly used bronchodilator. Hence, we screened all the approved bronchodilators against the N protein RNA binding site, which showed binding affinity (MM-GBSA) in the order Formoterol > Terbutaline > Ipratropium bromide > Tiotropium bromide > Theophylline > Salbutamol. These findings may help in the choice of bronchodilator in COVID-19. However, these findings need in vitro validation.

Performance of the MM/GBSA scoring using a binding site hydrogen bond network-based frame selection: The protein kinase case
Recent insights into the development of therapeutics against coronavirus diseases by targeting N protein
Modular organization of SARS coronavirus nucleocapsid protein
Influenza virus nucleoprotein: Structure, RNA binding, oligomerization and antiviral drug target
Coronavirus Disease (COVID-19)-Events as they happen. (n.d.).
Desmond | Schrödinger
Health Benefits of Methylxanthines in Cacao and Chocolate
Structure of the N-Terminal RNA-Binding Domain of the SARS CoV Nucleocapsid Protein
GlideScore/Docking Score doesn't correlate with my known activities. What is wrong? | Schrödinger
Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. The Lancet
Structure of the N-terminal RNA-binding domain of the SARS CoV nucleocapsid protein
A rapid advice guideline for the diagnosis and treatment of 2019 novel coronavirus (2019-nCoV) infected pneumonia (standard version)
LigPrep | Schrödinger
Structural basis for the identification of the N-terminal domain of coronavirus nucleocapsid protein as an antiviral target
Molecular docking: A powerful approach for structure-based drug discovery
Energy-optimized pharmacophore coupled virtual screening in the discovery of quorum sensing inhibitors of LasR protein of Pseudomonas aeruginosa
PDB Database. (n.d.). RCSB PDB: Homepage. Retrieved
Recent progress in the chemistry of dihydropyrimidinones
Drug targets for corona virus: A systematic review
Ribonucleocapsid formation of severe acute respiratory syndrome coronavirus through molecular action of the N-terminal domain of N protein
Therapeutic options for the treatment of 2019-novel coronavirus: An evidence-based approach
The evolution of nucleoside analogue antivirals: A review for chemists and non-chemists. Part 1: Early structural modifications to the nucleoside scaffold
Molecular modeling, quantum polarized ligand docking and structure-based 3D-QSAR analysis of the imidazole series as dual AT(1) and ET(A) receptor antagonists
Significance and biological importance of pyrimidine in the microbial world
Pyrimidinones. 1. 2-amino-5-halo-6-aryl-4(3H)-pyrimidinones. Interferon-inducing antiviral agents
Essentials of medical pharmacology
A novel coronavirus outbreak of global health concern
What do all the Prime MM-GBSA energy properties mean? | Schrödinger
Antiviral and other bioactivities of pyrimidinones
Antiviral Effects of Atropine and Caffeine
Potential 2019-nCoV 3C-like protease inhibitors designed using generative deep learning approaches
Inhibition of HBV replication by theophylline
A novel coronavirus from patients with pneumonia in China

The authors thank Dr. Prajwal Nandekar and Mr. Vinod Devaraji of Schrodinger Corporation for their kind help and Mr. Nripendra Bhatta for his logistic help.

No potential conflict of interest was reported by the authors.

http://orcid.org/0000-0003-1926-7605
Pramod Avti http://orcid.org/0000-0001-5603-4523
Bikash Medhi http://orcid.org/0000-0002-4017-641X