key: cord-0773669-8c05w1ix
authors: Yap, YeeLeng; Zhang, XueWu; Andonov, Anton; He, RunTao
title: Structural analysis of inhibition mechanisms of Aurintricarboxylic Acid on SARS-CoV polymerase and other proteins
date: 2005-06-23
journal: Comput Biol Chem
DOI: 10.1016/j.compbiolchem.2005.04.006
sha: c2039ed5d9bde2d634fdf12edc9a559bf46fa2a7
doc_id: 773669
cord_uid: 8c05w1ix

We recently published experimental results indicating that Aurintricarboxylic Acid (ATA) could selectively inhibit SARS-CoV replication inside host cells by more than 1000-fold. This inhibition suggested that ATA could be developed as a potent anti-viral drug. Here, to extend our experimental observation, we have incorporated protein structural studies (with positive/negative controls) to investigate the potential binding modes/sites of ATA onto the RNA-dependent RNA polymerase (RdRp) from SARS-CoV and other pathogenic positive-strand RNA viruses, as well as other proteins in SARS-CoV. The study builds on the fact that ATA binds to Ca(2+)-activated neutral protease (m-calpain), the protein tyrosine phosphatase (PTP) and HIV integrase, all of which have existing crystal structures. Eight regions with homologous 3D conformation were derived for 10 proteins of interest. One of these regions, R(binding) (754–766 in SARS-CoV's RdRp), is located in the palm sub-domain and is mainly constituted of anti-parallel β-strand-turn-β-strand hairpin structures that cover two of the three RdRp catalytic sites (Asp760, Asp761). Molecular docking (based on the free energy of binding, ΔG) also predicted this region to be an important binding motif recognized by ATA. The existence of this strictly conserved region that incorporates catalytic residues, coupled with the homologous ATA binding pockets and their consistent ΔG values, strongly suggests that ATA may inhibit SARS-CoV's RdRp through a mechanism analogous to that in m-calpain, PTP and HIV integrase.
Severe Acute Respiratory Syndrome (SARS) is an infectious disease that appeared in November 2002 (Marra et al., 2003; Rota et al., 2003) and imperilled human health in more than 30 nations during its outbreak. It claimed over 770 lives and infected more than 8050 people (∼9.6% death rate) around the globe. On May 15, 2003, the primary etiological agent of SARS, a member of the coronavirus family, was found to fulfil Koch's postulates through experimental infection of cynomolgus macaques (Macaca fascicularis). By May 18, 2004, the last human case of infection had been successfully contained in China. A detailed chronicle of the discovery of the SARS coronavirus (SARS-CoV) can be found on websites (James, 2003). Despite almost two years of intensive SARS-CoV research around the world, the key question of the original reservoir of SARS-CoV remains speculative and unanswered. This raises the concern that another outbreak could be imminent. Given this concern, and given that the preventive vaccine projects now underway (Marshall and Enserink, 2004) are expected to take years to develop and may carry adverse side effects (Glansbeek et al., 2002; Marshall and Enserink, 2004), anti-viral agents such as effective protein inhibitors must be studied exhaustively to define alternative options. In the past, some studies have reported potent inhibitors that block key proteins from coronaviruses, including those from SARS-CoV (e.g. 3C-like protease, S proteins), which are directly involved in the viral entry machinery (Bermejo Martin et al., 2003; Chou et al., 2003; Koren et al., 2003; Pace et al., 2004; Takeda-Shitaka et al., 2004; Wu et al., 2004; Zhang and Yap, 2004).
The successful inhibitory binding of these anti-viral agents to the targeted proteins disrupts the normal viral cell cycle and therefore renders the viral infection of the human host cell unsuccessful (Bermejo Martin et al., 2003; Chou et al., 2003; Koren et al., 2003; Pace et al., 2004; Takeda-Shitaka et al., 2004; Wu et al., 2004; Zhang and Yap, 2004). In contrast to the widely studied proteins mentioned above, few reports document the inhibition of the RNA-dependent RNA polymerase (RdRp) from any coronavirus by a potent inhibitor, even though this attractive protein target is crucial for overall viral RNA replication inside the host cells (Dhanak et al., 2002; Sarisky, 2004). Recently, we showed by experiment that Aurintricarboxylic Acid (ATA) could selectively inhibit SARS-CoV replication inside host cells, with viral mRNA transcript production 1000-fold lower than in the untreated control (He et al., 2004). This inhibition of viral replication is possibly related to SARS-CoV's RdRp. However, more extensive experiments will be needed to prove that ATA does bind to RdRp and inhibit viral replication. A MedLine search for the term "ATA" showed that ATA is closely associated with polymerase activity (Akiyama et al., 1977; Givens and Manly, 1976; Swennen et al., 1978; Swennen et al., 1979). Additionally, when ATA was compared with interferons (IFNs) alpha (α) and beta (β), which are the current standard treatment for SARS patients, ATA was approximately 10 times more potent than IFN alpha and 100 times more potent than IFN beta at the highest concentrations used in our previous experiment. All these observations suggest that further exploration of ATA might be scientifically rewarding, and that this compound can possibly be developed into an effective anti-SARS drug with high selectivity.
Here, to extend our experimental observations, we have incorporated multiple bioinformatic tools to investigate the structural relationships between ATA and RdRps from SARS-CoV and other pathogenic positive-strand RNA viruses (with positive controls). Such a study attempts to reveal the potential binding modes/sites of ATA onto RdRps from these viruses. The positive controls consisted of three known protein targets of ATA: (1) the Ca2+-activated neutral protease (m-calpain) (Posner et al., 1995), (2) the protein tyrosine phosphatase (PTP) of Yersinia pestis (Liang et al., 2003; Sun et al., 2003) and (3) HIV integrase (Cushman and Sherman, 1992), all of which have existing 3D crystal structures.

Fig. 1. Structural alignment between proteins from SARS-CoV and other RNA viruses. The structural alignment, based on the 3D atomic coordinate files, was performed using Dali and LGA with manual alignment. Solvent-inaccessible residues are represented in upper case and solvent-accessible residues in lower case. Alpha helix residues are shown in red (3-10 helix residues specifically in maroon) and beta strand residues in blue. Residues that have a hydrogen bond to a main chain amide are in bold, and residues that have a hydrogen bond to a main chain carbonyl are underlined. Finally, residues that are joined by disulphide bonds are represented by a cedilla (ç). (1C2PA, Hepatitis C virus RNA-dependent RNA polymerase; 1DF0A, m-calpain; 1HHSA, bacteriophage phi6 RNA-dependent RNA polymerase; 1KHVA, Rabbit hemorrhagic disease virus RNA-dependent RNA polymerase; 1O5SA, SARS-CoV RNA-dependent RNA polymerase; 1QZ0A, Yersinia pestis phosphatase YopH; 1RDR0, Poliovirus RNA-dependent RNA polymerase; 1S4FA, Bovine viral diarrhea virus RNA-dependent RNA polymerase; 1SH0A, Norwalk virus RNA-dependent RNA polymerase; 3HVTA, HIV Type 1 RNA-dependent RNA polymerase.)
Ultimately, the current study attempts to elucidate plausible conformations for the inhibition mode of ATA, and hence to provide clues for rational anti-SARS drug design based on ATA.

In the current study, the proteins examined for interaction with the small molecule ATA comprised theoretically determined (six) and experimentally determined (13) structures, as follows. The three-dimensional theoretical models for the RNA-dependent RNA polymerase (RdRp, ID = 1O5S) (Xu et al., 2003), spike protein subunit 1 (S1, ID = 1Q4Z) (Spiga et al., 2003; Zhang and Yap, 2004) and spike protein subunit 2 (S2, ID = 1Q4Y) (Spiga et al., 2003; Zhang and Yap, 2004) of SARS-CoV; the three-dimensional crystal structures for the nucleocapsid protein (N protein, ID = 1SSK), non-structural protein 9 (Nsp9, ID = 1QZ8) (Egloff et al., 2004) and 3C-like main protease (3CLpro, ID = 1Q2W) of SARS-CoV; and the crystal structures for YopH from Yersinia pestis (ID = 1QZ0), m-calpain from Rattus norvegicus (ID = 1DF0) (Strobl et al., 2000), the RNA-dependent RNA polymerases from dsRNA bacteriophage phi6 (ID = 1HHS), Rabbit hemorrhagic disease virus (ID = 1KHW), Poliovirus (ID = 1RDR), Bovine viral diarrhea virus (BVDV, ID = 1S4F), Norwalk virus (ID = 1SH0) and Hepatitis C virus (ID = 1C2PA), and HIV reverse transcriptase (ID = 3HVT) were retrieved from the Protein Data Bank. The three-dimensional geometry of all the molecules was validated for restraint violations using PROCHECK (Laskowski et al., 1996). The three-dimensional coordinates for the small molecule Aurintricarboxylic Acid (ATA, C22H14O9) were derived using PRODRG2 (Schuttelkopf and Aalten, 2004) and the JME editor (http://www.cem.msu.edu/~reusch/VirtualText/Questions/MOLEDITOR/jme window.html). The VAST and DALI programs were used to locate similar structural patterns among the structures of all molecules. Sequential structural alignment was done with CE (Shindyalov and Bourne, 1998) and COMPARER (Sali and Blundell, 1990).
Finally, three-dimensional structural comparative analysis was performed with LGA (Zemla, 2003). Preparation of the macromolecules and the ligand prior to molecular docking was done using the WhatIF software (Vriend, 1990). Molecular docking to determine the best conformation in terms of lowest Gibbs free energy and shape complementarity was performed using Autodock 3.0 (Morris et al., 1998). The three-dimensional structural data were visualised with Rasmol (Bernstein, 2000).

Protein 3D structural alignments were performed on 10 proteins of interest (mainly RdRps), namely: (1) Hepatitis C virus RNA-dependent RNA polymerase; (2) m-calpain; (3) bacteriophage phi6 RNA-dependent RNA polymerase; (4) Rabbit hemorrhagic disease virus RNA-dependent RNA polymerase; (5) SARS-CoV RNA-dependent RNA polymerase; (6) Yersinia pestis phosphatase YopH; (7) Poliovirus RNA-dependent RNA polymerase; (8) Bovine viral diarrhea virus RNA-dependent RNA polymerase; (9) Norwalk virus RNA-dependent RNA polymerase; (10) HIV Type 1 RNA-dependent RNA polymerase. Regions with homologous 3D conformation were identified together with their conserved secondary structures (shown underneath the amino acid alignment in Fig. 1). In total, there are eight structurally conserved motif blocks (CMBs), each extending over at least eight amino acid residues. The secondary structures of the CMBs include six α-helix and two β-strand regions. The exact positions of all CMBs in each protein are provided in Fig. 1. Molecular docking based on the free energy of ligand binding, ΔG, between ATA and all proteins (Figs. 2 and 3) revealed that ATA binds favorably to one structurally conserved region (R(binding)) among all proteins (Fig. 1, red box). As shown in Fig. 4, the corresponding R(binding) region in SARS-CoV's polymerase (Ser754-Tyr766) overlaps with one CMB. R(binding) is located in the palm sub-domain and is constituted mainly of anti-parallel β-strand-turn-β-strand hairpin structures.
This conserved region is similar to that in the majority of the remaining nine proteins in terms of secondary structure. Surprisingly, this R(binding) region also contains a highly conserved 'XSDD' amino acid motif that is especially prominent among viral RdRps, in which the two highly conserved aspartic acid (D) residues form the catalytic center important for polymerase activity (Xu et al., 2003). The free energies of ligand binding (final intermolecular energy + torsional free energy), ΔG, between ATA and all proteins were compared (Table 1). The proteins documented to be inhibited by ATA were assigned as positive controls (HIV integrase, YopH and m-calpain); their estimated free energies of binding were −11.88 kcal/mol, −7.79 kcal/mol and −7.67 kcal/mol, respectively. Any estimated free energy around or lower than −7.67 kcal/mol potentially suggests a similar inhibitory binding mechanism by ATA, provided that the corresponding binding motif incorporates catalytic sites of that specific protein. When we studied the binding of ATA onto various RNA-dependent RNA polymerases (RdRps) from other genomes, the free energies of binding for most RdRps were significantly higher than −7.67 kcal/mol, suggesting a lower strength of inhibition upon binding by ATA (Bovine viral diarrhea virus, dsRNA bacteriophage, Feline calicivirus, Hepatitis C virus, HIV, Poliovirus and Rabbit hemorrhagic disease virus). Only the RdRps from SARS-CoV (ΔG = −7.68 kcal/mol) and Norwalk virus (ΔG = −14.92 kcal/mol) were estimated to have a ΔG at or below this value (≤ −7.67 kcal/mol). This raised the possibility that ATA could strongly inhibit the polymerase activity of these two RdRps, as it does HIV integrase, YopH and m-calpain.

In summary, we first identified the homologous amino acid regions (CMBs) shared by 10 proteins using 3D structural alignments. Of these 10 proteins, three were documented to be strongly inhibited by ATA. In parallel, we employed molecular docking (based on the free energy of ligand binding) to predict the most energetically favorable binding conformation of ATA onto these 10 proteins and eventually determined one binding pocket (R(binding)) recognized by ATA that also overlapped with the CMBs.

Fig. 2. Calculated structure (using Autodock) for the interaction of ATA with YopH. YopH is a protein tyrosine phosphatase that is essential for virulence in Yersinia pestis. Its function is known to be strongly inhibited by ATA (Liang et al., 2003). This figure shows the 10 most probable conformations of YopH-ATA complexes. The border residues in contact with ATA are highlighted in red. This region is structurally conserved between YopH, m-calpain and SARS-CoV RdRp, and is constituted mainly of anti-parallel β-strand-turn-β-strand hairpin structures. The two-dimensional chemical structure of ATA (C22H14O9) is shown below.

Fig. 3. Calculated structure (using Autodock) for the interaction of ATA with m-calpain. The neutral protease (calpain) is a class of cytosolic enzyme that is activated during apoptosis. It is known to be strongly inhibited by ATA (Posner et al., 1995). This figure shows the 10 most probable conformations of m-calpain-ATA complexes. The border residues in contact with ATA are highlighted in red. This region is structurally conserved between YopH, m-calpain and SARS-CoV RdRp, and is constituted mainly of anti-parallel β-strand-turn-β-strand hairpin structures.

Table 1 (only the following rows survive in this copy; columns are protein, structure type, and estimated free energy of ATA binding in kcal/mol)
[...]    Crystal structure    −7.64
S1 (Spiga et al., 2003)    Theoretical model    −18.59
S2 (Spiga et al., 2003)    Theoretical model    −7.66
S1 (Zhang and Yap, 2004)    Theoretical model    −14.79
S2 (Zhang and Yap, 2004)    Theoretical model    −15.22
a Cushman and Sherman (1992), Liang et al. (2003), Posner et al. (1995).
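The free-energy comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ΔG values are the ones quoted in the text, while the dictionary and function names are hypothetical, and the strict `<=` cutoff simplifies the "around or lower than −7.67 kcal/mol" criterion. The sketch does not check the second condition named in the text, namely that the binding motif must also incorporate catalytic sites.

```python
# Estimated free energies of ATA binding (kcal/mol) as quoted in the text.
# The variable and function names below are hypothetical, for illustration only.
POSITIVE_CONTROLS = {
    "HIV integrase": -11.88,
    "YopH": -7.79,
    "m-calpain": -7.67,
}
CANDIDATE_RDRPS = {
    "SARS-CoV RdRp": -7.68,
    "Norwalk virus RdRp": -14.92,
}


def control_threshold(controls):
    """Return the weakest (least negative) free energy among the known ATA targets."""
    return max(controls.values())


def meets_threshold(delta_g, threshold):
    """True when binding is at least as favorable as the weakest positive control."""
    return delta_g <= threshold


def rank_by_binding(proteins):
    """Return protein names sorted from most to least favorable free energy."""
    return sorted(proteins, key=proteins.get)


if __name__ == "__main__":
    threshold = control_threshold(POSITIVE_CONTROLS)
    for name in rank_by_binding(CANDIDATE_RDRPS):
        delta_g = CANDIDATE_RDRPS[name]
        verdict = meets_threshold(delta_g, threshold)
        print(f"{name}: {delta_g:.2f} kcal/mol, meets threshold: {verdict}")
```

With a strict cutoff, a value such as −7.66 kcal/mol would fall just outside the threshold even though it is "around" −7.67 kcal/mol, so any real use of such a rule would need an explicit tolerance.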
This binding pocket (R(binding)), corresponding to residues 754–766 in SARS-CoV's RdRp, not only shares a highly conserved secondary structure with other RdRps, but also incorporates two of the three predicted RdRp catalytic residues (Asp760, Asp761) identified by other researchers (Xu et al., 2003). These residues are expected to be important for metal ion chelation (Beese and Steitz, 1991; Bressanelli et al., 2002; Huang et al., 1998). Structurally, this binding pocket is located in the palm sub-domain and is constituted mainly of anti-parallel β-strand-turn-β-strand hairpin structures. As reported in Section 3, there are altogether eight structurally conserved motif blocks (CMBs), and their secondary structures include six α-helix and two β-strand regions. Among these structurally conserved regions, we subsequently identified one common region recognized by ATA. The binding of ATA to this region also corresponded to the lowest free energy of ligand binding. This free energy of ligand binding allowed us to quantify the binding strength between a macromolecule and a ligand. Therefore, if a ligand binds strongly (with a lower free energy of ligand binding, ΔG) to the active domains of a specific protein, it will presumably inhibit the activity of that protein. On the other hand, if ATA does not bind to the active domains of the protein, ATA will not be able to inhibit the function of the protein regardless of the strength of binding (e.g. 3CL main protease and ATA). Based on this rationale, it is likely that ATA inhibits the function of the RdRp of SARS-CoV, because the binding pocket for ATA on RdRp includes two strictly conserved aspartate residues (D) involved in binding the divalent metal ions required for RdRp catalysis (Beese and Steitz, 1991; Bressanelli et al., 2002; Huang et al., 1998).
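To make the link between free energy and binding strength concrete, the sketch below converts a ΔG value into an approximate dissociation constant using the textbook relation ΔG = RT ln(Kd). This relation and the temperature of 298.15 K are assumptions introduced here for illustration; the text does not state the temperature or report Kd values, and docking free energies are estimates, so the resulting numbers are only indicative. The function names are hypothetical.

```python
import math

# Molar gas constant in kcal/(mol*K).
GAS_CONSTANT_KCAL = 1.98720425864083e-3
# Assumption: room temperature; the text does not state a temperature.
ASSUMED_TEMPERATURE_K = 298.15


def dissociation_constant(delta_g_kcal, temperature_k=ASSUMED_TEMPERATURE_K):
    """Return Kd in mol/L from the relation delta_G = R * T * ln(Kd)."""
    return math.exp(delta_g_kcal / (GAS_CONSTANT_KCAL * temperature_k))


def fold_tighter(delta_g_a, delta_g_b, temperature_k=ASSUMED_TEMPERATURE_K):
    """Return Kd_b / Kd_a, i.e. how many times tighter ligand binding is in a than in b."""
    rt = GAS_CONSTANT_KCAL * temperature_k
    return math.exp((delta_g_b - delta_g_a) / rt)


if __name__ == "__main__":
    # Free energies (kcal/mol) quoted in the text for two of the docked proteins.
    sars_rdrp = -7.68
    m_calpain = -7.67
    print(f"Kd (SARS-CoV RdRp): {dissociation_constant(sars_rdrp):.2e} M")
    print(f"Kd (m-calpain):     {dissociation_constant(m_calpain):.2e} M")
    print(f"Fold tighter (RdRp vs m-calpain): {fold_tighter(sars_rdrp, m_calpain):.3f}")
```

Because the relation is exponential, a fixed difference in ΔG corresponds to a multiplicative change in Kd, which is why two nearly equal free energies imply nearly equal binding strengths.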
When ATA binds to this β-strand-turn-β-strand hairpin structure (R(binding)), it covers the two aspartate residues and therefore inhibits the metal ion chelation process crucial for the function of this protein during viral replication. In contrast, although we found that ATA binds to the 3CL main protease with a much stronger binding strength (Table 1), we do not anticipate a similar inhibitory binding, because ATA does not bind to the active domains of the 3CL main protease. Furthermore, we also predicted the ATA binding regions for the Ca2+-activated neutral protease of Rattus norvegicus (m-calpain), the protein tyrosine phosphatase (PTP) of Yersinia pestis and the HIV integrase of Human immunodeficiency virus 1, all of which have been shown experimentally to be inhibited by ATA and have existing crystal structures. However, to the best of our knowledge, no published work attempts to explain explicitly how ATA inhibits the proteins mentioned above. Surprisingly, our molecular docking analysis predicted that ATA binds to the region of these three proteins (Fig. 1) that corresponds to R(binding) in SARS-CoV and other viral RdRps. This region (Fig. 1, boxed) was identified to be highly conserved among viral RdRps in terms of both 3D structure and secondary structure. In essence, ATA selectively binds to the β-strand-turn-β-strand hairpin structures. Such strong conservation signifies that this region might be evolutionarily crucial and important for the function of the proteins. Therefore, it remains plausible that ATA inhibits the function of SARS-CoV's RdRp by binding to the homologous region, as in m-calpain, PTP and HIV integrase. Finally, we also analyzed the binding of ATA onto viral RdRps from the standpoint of the free energy of ligand binding (ΔG).
The ΔG values between ATA and all viral RdRps, as well as the three proteins known to be readily inhibited by ATA (Table 1), suggested that ATA binds to the RdRp from SARS-CoV with a binding strength similar to that for YopH and m-calpain. On the other hand, ATA was predicted to bind to the remaining viral RdRps (except for Norwalk virus) with a much lower binding strength, whereas ATA binds to the RdRp from Norwalk virus with a significantly higher binding strength (approx. two times). If our hypothesis that ATA inhibits the function of RdRps by binding to their catalytic sites is indeed true, these ΔG values would likely reflect how well ATA inhibits viral replication inside virus-infected cell culture. To test our hypothesis, we also carried out an experimental study of how well ATA interrupts viral replication in SARS-infected cell culture and found that the viral load (e.g. HCV, West Nile virus and Feline calicivirus), with the addition of ATA at the highest concentration, corresponds to the ranking of the estimated free energy of ligand binding (experimental data to be published).

The following conclusions can be drawn from this structural study on the binding of ATA to proteins from SARS-CoV and other positive-strand RNA viruses:
1. ATA could bind to SARS-CoV's RdRp and other SARS-CoV proteins. The strongest binding was predicted for the S1 protein. By binding to a protein, ATA could inhibit its activity if the ATA binding motif incorporates catalytic domains of that specific protein; for example, ATA binds to the catalytic sites of SARS-CoV's RdRp and inhibits its function.
2. Derived from the inhibitory binding of ATA on SARS-CoV's RdRp, ATA was also predicted to inhibit the polymerase activity of RdRps from other RNA viruses in a weaker manner (Bovine viral diarrhea virus, dsRNA bacteriophage, Feline calicivirus, Hepatitis C virus, HIV, Poliovirus, Rabbit hemorrhagic disease virus). However, stronger inhibition (approx. two times) was predicted for the RdRp from Norwalk virus (experimental data to be published).
3. This inhibitory binding mechanism, especially for RdRps, could explain why mRNA transcripts decreased by >1000 times when SARS-infected cell culture was treated with ATA in our previous study. It could also explain why the inhibition of replication of other RNA viruses is milder when virus-infected cells are treated with ATA (experimental data to be published).
4. ATA could serve as a template for rational anti-SARS drug design, targeting the replication machinery possibly across large groups of positive-strand RNA viruses.

PRODRG: a tool for high-throughput crystallography of protein-ligand complexes
The effect of aurintricarboxylic acid on RNA polymerase from rat liver
Coronavirus main proteinase (3CLpro) structure: basis for design of anti-SARS drugs
Structural basis for the 3′-5′ exonuclease activity of Escherichia coli DNA polymerase I: a two metal ion mechanism
Pentoxifylline and severe acute respiratory syndrome (SARS): a drug to be considered
Recent changes to RasMol, recombining the variants
Structural analysis of the hepatitis C virus RNA polymerase in complex with ribonucleotides
Binding mechanism of coronavirus main proteinase with ligands and its implication to drug design against SARS
Inhibition of HIV-1 integration protein by aurintricarboxylic acid monomers, monomer analogs, and polymer fractions
Identification and biological characterization of heterocyclic inhibitors of the hepatitis C virus RNA-dependent RNA polymerase
Identification of a novel coronavirus in patients with severe acute respiratory syndrome
The severe acute respiratory syndrome-coronavirus replicative protein nsp9 is a single-stranded RNA-binding subunit unique in the RNA virus world
Inhibition of RNA-directed DNA polymerase by aurintricarboxylic acid
Adverse effects of feline IL-12 during DNA vaccination against feline infectious peritonitis virus
Potent and selective inhibition of SARS coronavirus replication by aurintricarboxylic acid
Structure of a covalently trapped catalytic complex of HIV-1 reverse transcriptase: implications for drug resistance
Structure of the N-terminal RNA-binding domain of the SARS CoV nucleocapsid protein
SARS Web information
Ribavirin in the treatment of SARS: a new trick for an old drug
AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR
Aurintricarboxylic acid blocks in vitro and in vivo activity of YopH, an essential virulent factor of Yersinia pestis, the agent of plague
Medicine. Caution urged on SARS vaccines
Automated docking using a Lamarckian genetic algorithm and empirical binding free energy function
The monoethyl ester of meconic acid is an active site inhibitor of HCV NS5B RNA-dependent RNA polymerase
Aurintricarboxylic acid is an inhibitor of mu- and m-calpain
Definition of general topological equivalence in protein structures. A procedure involving comparison of properties and relationships through simulated annealing and dynamic programming
Non-nucleoside inhibitors of the HCV polymerase
Protein structure alignment by incremental combinatorial extension (CE) of the optimal path
Molecular modelling of S1 and S2 subunits of SARS coronavirus spike glycoprotein
The crystal structure of calcium-free human m-calpain suggests an electrostatic switch mechanism for activation by calcium
Crystal structure of the Yersinia protein-tyrosine phosphatase YopH complexed with a specific small molecule inhibitor
Aurintricarboxylic acid stimulates specifically DNA-dependent RNA polymerase II-directed RNA synthesis in isolated larval nuclei of Artemia salina
Mechanism of preferential stimulation of DNA-dependent RNA polymerase II by aurintricarboxylic acid in isolated Artemia larval nuclei
Evaluation of homology modeling of the severe acute respiratory syndrome (SARS) coronavirus main protease for structure based drug design
WHAT IF: a molecular modeling and drug design program
Small molecules targeting severe acute respiratory syndrome human coronavirus
Molecular model of SARS coronavirus polymerase: implications for biochemical functions and drug design
The crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor
LGA: a method for finding 3D similarities in protein structures
The 3D structure analysis of SARS-CoV S1 protein reveals a link to influenza virus neuraminidase and implications for drug and antibody discovery
Exploring the binding mechanism of the main proteinase in SARS-associated coronavirus and its implication to anti-SARS drug design
Old drugs as lead compounds for a new disease? Binding analysis of SARS coronavirus main proteinase with HIV, psychotic and parasite drugs
Structural similarity between HIV-1 gp41 and SARS-CoV S2 proteins suggests an analogous membrane fusion mechanism

The stay of Daniel Yap in the Unit was supported by the Hong Kong Innovation and Technology Fund BIOSUPPORT program, and the work was supported by the European Union's 6th Framework Program: BioSapiens-Network of Excellence, Grant LSHG CT-2003-503265, section 66010, and by the French Ministry of Research ACI IMPBIO (program Blastsets). We thank Antoine Danchin for supporting this collaboration and for his helpful discussions.