key: cord-1045817-02q9y011 authors: Frick, David N.; Virdi, Rajdeep S.; Vuksanovic, Nemanja; Dahal, Narayan; Silvaggi, Nicholas R. title: Molecular Basis for ADP-ribose Binding to the Macro-X Domain of SARS-CoV-2 Nsp3 date: 2020-04-29 journal: bioRxiv DOI: 10.1101/2020.03.31.014639 sha: 4fbe5534e81f596da4da7795e2070b8a9ea34b60 doc_id: 1045817 cord_uid: 02q9y011 The virus that causes COVID-19, SARS-CoV-2, has a large RNA genome that encodes numerous proteins that might be targets for antiviral drugs. Some of these proteins, such as the RNA-dependent RNA polymers, helicase and main protease, are well conserved between SARS-CoV-2 and the original SARS virus, but several others are not. This study examines one of the proteins encoded by SARS-CoV-2 that is most different, a macrodomain of nonstructural protein 3 (nsp3). Although 26% of the amino acids in this SARS-CoV-2 macrodomain differ from those seen in other coronaviruses, biochemical and structural data reveal that the protein retains the ability to bind ADP-ribose, which is an important characteristic of beta coronaviruses, and potential therapeutic target. The development of antivirals targeting Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the causative agent of the present COVID-19 pandemic, 1 will most likely focus on viral proteins and enzymes needed for replication. 2 Like other coronaviruses, SARS-CoV-2 has a large positive sense (+)RNA genome over 30,000 nucleotides long with several open reading frames. Most of the proteins that form the viral replicase are encoded by the "rep 1ab" reading frame, which codes for a 7,096 amino acid-long polyprotein that is ultimately processed into at least 15 functional peptides, five of which are only produced by a translational frameshift event occurring after nsp10 (Fig. 1) . Parts of the SARS-CoV-2 rep 1ab polyprotein are very similar to the rep 1ab protein of the coronavirus that caused the SARS epidemic in 2003 (which will be referred to here as SARS-CoV-1), suggesting the that drugs targeting the SARS-CoV-1 nsp5-14 might be effective against SARS-CoV-2. However, some portions of the SARS rep 1ab polyproteins are quite different. In contrast to the well-conserved SARS nsp5 protease, nsp12 polymerase and nsp13 helicase enzymes, significantly more differences exist between the nsp3 proteins encoded by SARS-CoV-1 and SARS-CoV-2. The most variation occurs in a domain of Nsp3 domain suspected to bind ADP-ribose, that will be referred to here as the "macro X" domain, to differentiate it from the two downstream "SARS unique macrodomains (SUDs)," which do not bind ADP-ribose. 3 The macro X domain of SARS-CoV-1 is also able to catalyze the hydrolysis of ADP-ribose 1'' phosphate, albeit at a slow rate. 4 All the above differences preclude the use of the SARS-CoV-1 macro X domain structures as scaffolds to design compounds that might target this nsp3 region in SARS-CoV-2, especially in light of the observation that the same nsp3 domain from gamma coronaviruses does not bind ADP-ribose in vitro. 5 Because ADP-ribose binding is a property that could be used to identify possible antiviral agents, the ability of the SARS-CoV-2 macro X domain to bind ADP-ribose was examined using a recombinant purified protein and isothermal titration calorimetry (ITC). We also determined the structure of the SARS-CoV-2 macro X domain in order to examine the biochemical context of ADP-ribose binding and to provide data for rational inhibitor design or in silico screening. 1 mM isopropyl-β-D-thiogalactoside. After growing 16 h at 23 °C, the cells were harvested by centrifugation at 4,000 rpm, 4°C. The resulting cell pellet was suspended in 25 mL of IMAC buffer (20 mM Tris pH 8, 0.5 M NaCl), sonicated on ice for five 1 min bursts, with 2 min rests between, and clarified by centrifugation at 10,000g for 30 min. The supernatant was loaded onto a 5 ml Ni-NTA column and the fractions were eluted with a step gradient from 5 to 500 mM imidazole. Fractions containing the macro X domain protein (5 ml total) were loaded on a 250 ml Sephacryl S300 gel filtration column and eluted with 10 mM MOPS, 150 mM NaCl. Concentration of the purified protein was determined by measuring absorbance at 260 nm using a molar extinction coefficient of 10,555 M -1 cm -1 , which was calculated with the ProtParam tool (https://web.expasy.org/protparam/). Isothermal Titration Calorimetry (ITC)-Binding of ADP-ribose to the SARS-CoV-2 macro X domain was measured using a Nano ITC (TA Instruments). Before starting the measurement, samples of both ligand and protein were diluted in 10 mM MOPS, 150 mM NaCl (pH 7) and were degassed at 400 mmHg for 30 minutes. Measurements were taken at 20 °C by injecting 2.0 µ l aliquots of 500 µM ADP-Ribose (Sigma) to 50 µM protein (175 µl initial volume) with 250 rpm stirring rate. Using NanoAnalyze Software (v. 3.11.0), data were fitted by non-linear regression to an independent binding model. Briefly, after baseline correction, background heats from ligand-to-buffer titrations were subtracted, and the corrected heats from the binding reaction were used to find best fit parameters for the stoichiometry of the binding (n), free energy of binding (ΔG), apparent enthalpy of binding (ΔH), and entropy change (ΔS). Dissociation constants (K d ) were calculated from the ΔG. Crystallization and Structure Determination-In preparation for crystallization experiments, the purified SARS-CoV-2 macro X domain protein was cleaved with tobacco etch virus (TEV) protease to remove the Nterminal His 6 -tag and passed back through the Ni-NTA column. The flow-through fractions were desalted into 10 mM HEPES, pH 7.2 using a 2x5ml HiTrap desalting column (GE Life Sciences) and concentrated to 10 mg/mL in a centrifugal concentrator. This preparation of the protein was mixed 1 μ L:1 μ L with the Morpheus HT screen reagents (Molecular Dimensions) in a 96-well SwissSci MRC UV-transmissible sitting drop plate. Large, diffraction-quality crystals grew directly from a number of the screen conditions. The crystal ultimately used for structure determination grew from condition D9: 0.12 M alcohols [0.02 M each 1,6-hexanediol, 1butanol, 1,2-propanediol, 2-propanol, 1,4-butanediol, and 1,3-propanediol], 0.1 M buffer system 3, pH 8.5 [0.05 M each TRIS and bicine], and 30% precipitant mix 1 [20% poly(ethylene glycol) (PEG) 500 monomethylether, 10% PEG 20,000]. Large, thick plates grew within 1 week at 22 °C. Owing to the high concentration of PEG 500 MME, the crystal did not require additional cryo-protection and was flash-cooled by looping it directly from the sitting drop and plunging it into liquid nitrogen. Diffraction data were collected on Life Sciences Collaborative Access Team (LS-CAT) beamline 21-ID-F at the Advanced Photon Source of Argonne National Laboratory. The wavelength at this station is fixed at 0.9787 Å; the detector is a MarMosaic M300 chargecoupled device. The data were collected with an oscillation width of 0.5° per image for a total oscillation of 180°. The data were indexed and integrated with DIALS 7, 8 as implemented in version 7.2 of the CCP4 software suite. 9, 10 Data scaling and reduction was performed using AIMLESS. [11] [12] [13] Data collection statistics are provided in Table 1 . The structure was determined by molecular replacement in PHASER 14 using the model of the SARS-CoV-1 macro X domain as the search model (PDB ID 2FAV 6 ). The model underwent iterative rounds of (re)building in COOT 15 and refinement in PHENIX.refine. 16, 17 The very high resolution of the data justified a full anisotropic treatment of the protein and solvent temperature factors. Model refinement and validation statistics are provided in Table 1 . The coordinates were deposited in the Protein Data Bank with accession code 6WEY. Hypervariability in the nsp3 Macro X domain-The structures of most of the soluble portions of the SARS-CoV-1 nsp proteins have been examined at atomic resolution to help understand coronavirus replication and facilitate antiviral drug discovery. The amino acid sequences of each of these proteins were compared with the homologous regions of the rep 1ab protein encoded by SARS-CoV-2 (GenPept Accession YP_009724389). The most similar proteins were the RNA helicases (nsp13) which are identical in all but one of their 603 amino acids: a conservative Val to Ile substitution near their C-termini). The RNA-dependent RNA polymerases (nsp12) are also well-conserved, sharing all but 34 of 955 amino acids. The primary protease that cleaves the polyprotein (nsp5) is also similar in SARS-CoV-1 and SARS-CoV-2, with only 13 amino acids that differ among 306 (4.2% different) ( Fig. 1) . At the other end of the spectrum are the nsp3 proteins, which are notably more different in the two SARS viruses. Nsp3 is a large multidomain membrane-bound protein, 18 and its clearest role in viral replication is cleaving the rep polyprotein. Over 17% of the amino acids in the nsp3 protease domain differ between SARS-CoV-1 and SARS-CoV-2. Other parts of nsp3 are even more variable, like the macrodomains that lie N-terminal to the nsp3 protease domain. Macrodomains consist of four helices that surround a mixed beta sheet. A ligand-binding pocket that typically binds ADP-ribose or related compounds lies between the helices and the sheet. 19 SARS-CoV-1 has three macrodomains in tandem, but only the first binds ADP ribose. The amino acid sequences of this macro X domain differ by 26% between SARS-CoV-1 and SARS-CoV-2 ( Fig. 1) . Many of the 47 variant residues in the 180 amino acid long SARS macro X domain are clustered near its Nterminus in a region that is particularly variable in the three Coronaviridae genera (Fig. 2) . Macro X domains from coronaviruses that cause the common cold (alpha coronaviruses) 5 and the beta coronaviruses like SARS-CoV-1 and Middle East respiratory syndrome coronavirus (MERS-CoV) 20 all bind ADP ribose. However, the same domain in gamma coronaviruses does not bind ADP-ribose. 5 Expression and Purification of the SARS-CoV-2 Macro X domain-An E. coli expression vector for the macro X domain was built to express and N-terminally Histagged protein similar to a SARS-CoV-1 protein studied by Egloff et al. 6 Upon induction, a one liter culture of BL21(DE3) cells harboring the vector expresses 50-100 mg of the macro X domain protein that can be purified in one step to apparent homogeneity using immobilized metal affinity chromatography (Fig. 3A) . The protein was polished further with gel filtration chromatography and concentrated before analysis and crystallization. Nucleotide Binding by the SARS-CoV-2 Macro X domain-Repeated ITC experiments revealed that the purified recombinant protein bound ADP-ribose ( Fig. 3C ) with a dissociation constant of 10 ± 4 µM (uncertainty is the standard deviation of K d 's from independent titrations). To examine binding specificity, similar titrations were repeated with related nucleotides. The SARS-CoV-2 protein bound ADP, cAMP, ATP and ADP-glucose ( Fig. 3D ). All nucleotides lacking the ribose moiety bound with similar high affinities, but none bound with an enthalpy change like that seen with ADP-ribose, suggesting specific contacts form between ADP ribose the SARS-CoV-2 protein. Based on what is seen in the SARS-CoV-2 structures below, these contacts likely occur with the conserved D226 and N244 (positions 30 and 43 in the numbering above the alignment in Fig. 2) . None of the other nucleotides bind with an entropic penalty as seen with ADP-ribose, either, suggesting that the ribose moiety becomes structured when bound to the macrodomain. The energetics of ADP-ribose binding to the SARS-CoV protein are similar to those seen with the same protein from SARS-CoV-1, 6 MERS-CoV. 20 Enthalpy and entropy of binding were also very similar for all three protein (Fig. 3D ). Unlike what is seen with the macro X protein from an alpha coronavirus, 5 enthalpy appears to drive ADP-ribose binding to the macro X domains of the three beta coronaviruses. Structure of the SARS-CoV-2 Macro X Domain-The SARS-CoV-2 macro X domain (NSP3 residues 207 to 277) crystallized in space group P2 1 2 1 2 1 with 1 molecule per asymmetric unit. These crystals had a solvent content of 43% and diffracted extremely well. The final resolution limit of the data was set at 0.95 Å ( Table 1 ). The quality of the electron density maps is correspondingly excellent (Fig. 4A ). The section of the structure depicted in this image is located on the surface of the protein and the B-factors of these residues are close to the average B-factor of the protein (7.2 vs 10.2 Å 2 ), indicating that this sample accurately represents the overall quality of the maps. The final model contains the entire sequence from V207 to S377 of nsp3, an N-terminal glycine residue that was left from the TEV-protease cleavage, and 374 solvent molecules. The R cryst and R free values of the final model were 0.119 and 0.137, respectively ( Table 1) . The tertiary structure, as expected, ranges from nearly identical to very similar to those of other coronavirus macro domains, including: SARS CoV-1 (2FAV 6 ; 74.7% sequence identity) with a root mean square deviation (RMSD) value for 162 of 172 Cα atoms of 0.6 Å; MERS CoV (5DUS 20 ; 42.2% identical) with a 1.2 Å RMSD for 161 of 172 Cα atoms; human alpha coronavirus 229E (3EWR 21 ; 32.5% identical) with a 1.5 Å RMSD for 154 of 172 Cα atoms; feline coronavirus (FCoV; 3JZT 22 26.8% identical) with a 1.5 Å RMSD for 153 of 172 Cα atoms; and the gamma CoV infectious bronchitis virus (IBV; 3EWP 21 26.7% identical) with a 2.1 Å RMSD for 150 of 172 Cα atoms. The regions of high sequence conservation are not clustered in any particular region(s) of the molecule (Fig. 4B ). This makes sense in light of the fact that the protein atoms involved in hydrogen bonding interactions with the ligands in these structures are more often part of the main chain; there are relatively few interactions of side chains with the ligands. At the time of writing, we discovered that Michalska et al. of the Center for Structural Genomics of Infectious Diseases (CSGID) deposited coordinates for a very similar construct of the SARS-CoV-2 macro X domain including from E206 to E275 of the nsp3 protein, plus an additional 4 residues at the N-terminus (6VSX; unpublished). Their crystals also allowed binding of ADPribose (6W02) and AMP (6W6Y), while ours seemed to be packed too tightly to permit ligands to access the binding site (data not shown). We compared our ultrahigh-resolution model of the unliganded protein to the ADP-ribose-bound form. The RMSD values for the fitting, done by secondary structure matching (SSM) 23 as implemented in COOT, is 0.59 Å for 165 of 172 Cα atoms. This is very similar to the RMSD values of the free protein (6VXS; 0.66 Å) and the AMP-bound form (6W6Y; 0.50 Å), indicating that there are no large conformational changes that occur upon ligand binding. In fact, the only notable conformational changes occur in three surface-exposed loops in or near the ligand-binding pocket (Fig. 4C) . These loops connect strand β 2 with helix α 2 (the β 2-α2 loop), strand β 4 with helix α 4 (β4-α4 loop), and strand β 5 with helix α 5 (β5-α5 loop). The subtle change in conformation of the β 4-α4 loop (purple in Fig. 4C ) appears to be the result of crystal contacts and not the direct influence of ADP-ribose binding. The other two loops are more intimately involved in ligand binding. The main chain of the α 2-β2 loop rotates 180° in order to allow the amide N atom of G252 to participate in a hydrogen bonding interaction with the 1'-hydroxyl of the ribose moiety of ADPribose. This loop also carries N244, which directly interacts with the ribose. The phenyl ring of F336 in the β 5α 5 loop occupies the portion of the binding pocket in the unliganded structure that is occupied by the β -phosphate of ADP-ribose. Thus, without a rearrangement of the β 5α 5 loop, ADP-ribose would not be able to bind. Conclusion-The significance of the study stems mainly from the demonstration that the SARS-CoV-2 macro X domain binds ADP ribose. This is the first step needed to justify screens for potential antivirals that bind in place of ADP ribose. More work needs to be done, however, to understand the antiviral potential of such compounds because the biological role for ADP-ribose binding is still not fully understood. Some work with alpha coronaviruses suggest that ADP-ribose binding by the macro X domain is not needed for viral replication. 24 However, studies with other (+)RNA viruses suggest that macrodomains are essential for virulence. 25 This work is also noteworthy because the synthetic codonoptimized plasmid reported here produces up to 100 mg of soluble macro X domain protein per liter of E. coli culture, and this protein retains a high affinity for ADP ribose. The protein could be used for structural studies and screening campaigns. Screening assays with the SARS-CoV-2 protein might actually be more efficient, since the SARS-CoV-2 protein binds ADP-ribose somewhat more tightly (K d = 10 µM) than the SARS-CoV-1 protein (K d = 24 µM). The recombinant protein reported here, together with the detailed structural information, might also be useful to others developing SARS-CoV-2 diagnostics and/or therapeutics. The SARS-Co rep 1ab peptide sequence was aligned with each of the PDB files listed, which describe an atomic structure of a homolog region of the SARS-CoV-1 rep 1ab polyprotein. Nsp's are shown in sequence as black arrows (note: there is no "nsp11", the translational frameshift occurs after nsp10). The percent of amino that differ in each protein is plotted. Nsp1 is and i feron antagonist. 26 The Nsp3 X domain is studied here, the Nsp3 SUD consists of tandem macrodomains that bind quadruplex structures, 3 Nsp dC is the C-terminus of the SUD, 27 Nsp3pro is a papain-like protease, 28 and Nsp3 RBD is an er possible RNA binding domain. 29 Nsp5 is the main viral protease. 30 Nsp7 and Nsp8 are polymerase cofactors. 31 Nsp9 i RNA binding protein. 32 Nsp10 is a zinc-binding cofactor for Nsp14 and Nsp16. 33 Nsp12 is the RNA polymerase. 31 Nsp13 helicase. 34 Nsp14 is a 3'-5' exonuclease and a 7-methyltransferase. 33 Nsp15 is an RNA endonuclease. 35 Nsp16 is an RNA 2'-O-Methyltransferase. 36 7 CoV-2 logous 1", and d interind Ganoth-9 is an 13 is a A cap How good are my data and what is the resolution Scaling and assessment of data quality An introduction to data reduction: space-group determination, scaling and intensity statistics Phaser crystallographic software Features and development of Coot Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix Towards automated crystallographic structure refinement with phenix.refine Nsp3 of coronaviruses: Structures and functions of a large multi-domain protein Macrodomain ADP-ribosylhydrolase and the pathogenesis of infectious diseases Macro Domain from Middle East Respiratory Syndrome Coronavirus (MERS-CoV) Is an Efficient ADP-ribose Binding Module: CRYSTAL STRUCTURE AND BIOCHEMICAL STUDIES Crystal structures of two coronavirus ADP-ribose-1''-monophosphatases and their complexes with ADP-Ribose: a systematic structural analysis of the viral ADRP domain Structure of the X (ADRP) domain of nsp3 from feline coronavirus Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions The ADRP domain from a virulent strain of infectious bronchitis virus is not sufficient to confer a pathogenic phenotype to the attenuated Beaudette strain Novel beta-barrel fold in the nuclear magnetic resonance structure of the replicase nonstructural protein 1 from the severe acute respiratory syndrome coronavirus X-ray structural and biological evaluation of a series of potent and highly selective inhibitors of human coronavirus papain-like proteases Nuclear magnetic resonance structure of the nucleic acidbinding domain of severe acute respiratory syndrome coronavirus nonstructural protein 3 Structure-based design, synthesis, and evaluation of peptide-mimetic SARS 3CL protease inhibitors Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors The severe acute respiratory syndrome-coronavirus replicative protein nsp9 is a single-stranded RNA-binding subunit unique in the RNA virus world Structural basis and functional analysis of the SARS coronavirus nsp14-nsp10 complex Crystal structure and mechanistic determinants of SARS coronavirus nonstructural protein 15 define an endoribonuclease family Crystal structure and functional analysis of the SARS-coronavirus RNA cap 2'-Omethyltransferase nsp10/nsp16 complex