key: cord-1055254-78g2s9ge authors: Moeller, Nicholas H.; Shi, Ke; Demir, Özlem; Banerjee, Surajit; Yin, Lulu; Belica, Christopher; Durfee, Cameron; Amaro, Rommie E.; Aihara, Hideki title: Structure and dynamics of SARS-CoV-2 proofreading exoribonuclease ExoN date: 2021-04-04 journal: bioRxiv DOI: 10.1101/2021.04.02.438274 sha: a593871722bd875ec4eb1e77b27ddb0772d1785c doc_id: 1055254 cord_uid: 78g2s9ge

High-fidelity replication of the large RNA genome of coronaviruses (CoVs) is mediated by a 3′-to-5′ exoribonuclease (ExoN) in non-structural protein 14 (nsp14), which excises nucleotides including antiviral drugs mis-incorporated by the low-fidelity viral RNA-dependent RNA polymerase (RdRp) and has also been implicated in viral RNA recombination and resistance to innate immunity. Here we determined a 1.6-Å resolution crystal structure of SARS-CoV-2 ExoN in complex with its essential co-factor, nsp10. The structure shows a highly basic and concave surface flanking the active site, comprising several Lys residues of nsp14 and the N-terminal amino group of nsp10. Modeling suggests that this basic patch binds to the template strand of double-stranded RNA substrates to position the 3′ end of the nascent strand in the ExoN active site, which is corroborated by mutational and computational analyses. Molecular dynamics simulations further show remarkable flexibility of multi-domain nsp14 and suggest that nsp10 stabilizes ExoN for substrate RNA-binding to support its exoribonuclease activity. Our high-resolution structure of the SARS-CoV-2 ExoN-nsp10 complex serves as a platform for future development of anti-coronaviral drugs or strategies to attenuate the viral virulence.

The 29.9 kb single-stranded RNA genome of SARS-CoV-2, the causative agent of the global COVID-19 pandemic, is replicated and transcribed by the viral RNA-dependent RNA polymerase (RdRp, nsp12) (1-3).
Unlike the high-fidelity cellular replicative DNA polymerases, viral RdRp enzymes including the coronavirus (CoV) RdRp do not contain a proofreading exonuclease domain to ensure high fidelity. The resulting higher mutation rate (10⁻⁴ to 10⁻⁶ substitutions/nucleotide/round of replication) is generally thought to promote rapid viral adaptation in response to selective pressure (4-6). However, the lack of proofreading activity in RdRp poses a particular challenge for the replication of coronaviruses, which feature the largest known RNA virus genomes (27 to 32 kb, up to twice the length of the next-largest non-segmented RNA viral genomes) (7, 8). It has been reported that SARS-CoV nsp12 is the fastest viral RdRp known but with an error rate more than one order of magnitude higher than the generally accepted error rate of viral RdRps (9), clearly necessitating a unique proofreading mechanism.

To mitigate the low fidelity of RdRp, all coronaviruses encode a 3'-to-5' exoribonuclease (ExoN) in nsp14 (10-12). Mutations of SARS-CoV-2 nsp14 exhibit strong association with increased genome-wide mutation load (13, 14), and genetic inactivation of ExoN in engineered SARS-CoV and murine hepatitis virus (MHV) leads to 15 to 20-fold increases in mutation rates (7, 15, 16). Furthermore, in a mouse model, SARS-CoV with inactivated ExoN shows a mutator phenotype with decreased fitness and lower virulence over serial passage, suggesting a potential strategy for generating a live, impaired-fidelity coronavirus vaccine (17). By contrast, recent studies show that ExoN inactivation is lethal for SARS-CoV-2 and Middle East Respiratory Syndrome (MERS)-CoV (18), hinting at additional functions for ExoN in viral replication.
Indeed, the ExoN activity has been reported to mediate extensive viral RNA recombination required for subgenomic mRNA synthesis during normal replication of CoVs including SARS- , and it was shown to be required for resistance to the antiviral innate immune response for MHV (20). ExoN inactivation also significantly increases the sensitivity of CoVs to nucleoside analogs that target RdRp, which is consistent with the biochemical activity of ExoN to excise mutagenic or chain-terminating nucleotides mis-incorporated by RdRp. These observations combine to suggest that chemical inhibition of ExoN could be an effective antiviral strategy against CoVs. In this study, we determined a high-resolution crystal structure of the SARS-CoV-2 ExoN-nsp10 complex and studied its biochemical activities. Furthermore, we used molecular dynamics (MD) simulations to better understand the dynamics of nsp14, nsp10, and their interaction with RNA.

The multifunctional SARS-CoV-2 nsp14 consists of the N-terminal ExoN domain involved in proofreading and the C-terminal guanine-N7 methyltransferase (N7-MTase) domain that functions in mRNA capping. We co-expressed in bacteria the full-length 527-residue SARS-CoV-2 nsp14 or its N-terminal fragment (residues 1 to 289) containing only the ExoN domain, with full-length 139-residue nsp10 in both cases, and purified the heterodimeric complexes. The nsp14-nsp10 and ExoN-nsp10 complexes both showed the expected 3'-to-5' exonuclease activity on a 5'-fluorescently labeled 20-nucleotide (nt) RNA (LS2U: 5'-GUCAUUCUCCUAAGAAGCUU; similar to 'LS2' used previously in SARS-CoV ExoN studies (21)) (Fig. 1A, B). Although LS2U RNA by itself served as a substrate, more extensive degradation was observed when it was annealed to an unlabeled 40-nt template strand (LS15A_RNA; Table 1) to generate a double-stranded (ds) RNA with a 20-nt 5'-overhang.
Introducing a base mismatch at the 3' end of the degradable strand by using an alternative bottom strand (LS15_RNA; Table 1) had no discernible effect on the processing by either complex (Fig. 1A, B). When DNA was used as the template strand (LS15_DNA; Table 1) to generate an RNA/DNA heteroduplex substrate that is expected to take the A-form conformation similarly to dsRNA, activity was observed but was weaker than for dsRNA. No nuclease activity was observed on a 5'-fluorescently labeled 20-nt DNA (LS2_DNA; showed similar activities to SARS-CoV-2 ExoN-nsp10 (Fig. 1C, Supplementary Fig. 1).

Previous X-ray crystallographic studies have provided the structure of the SARS-CoV nsp14-nsp10 complex at resolutions ranging from 3.2 to 3.4 Å (21, 24). To obtain a higher-resolution view of a CoV exoribonuclease complex and to reveal possible structural differences between SARS-CoV and SARS-CoV-2 ExoN, we have crystallized the SARS-CoV-2 ExoN-nsp10 complex. An ExoN variant with a nuclease-inactivating mutation (E191Q) (Fig. 1C, Supplementary Fig. 1) was used in our crystallographic studies as it was expressed more robustly and generated a more stable complex with nsp10 than wild-type ExoN. We obtained crystals under two different conditions, one containing ammonium tartrate and the other containing magnesium chloride (MgCl2), albeit in the same crystal form. The structures were determined by molecular replacement phasing and refined to 1.64 and 2.10-Å resolution for the tartrate and magnesium-bound crystals, respectively (Fig. 2A, Table 2). The final models consist of nsp14 residues Asn3 to Arg289 (Val287 for the lower resolution structure) and nsp10 residues Ala1 to Cys130, with two zinc ions bound to each polypeptide chain.
As expected from high sequence conservation, the SARS-CoV-2 ExoN-nsp10 complex shows high structural similarity to its counterpart from SARS-CoV (root-mean-square deviation of 0.95 Å for all main chain atoms against 5C8T (24)), whose shape was previously described to resemble 'hand (ExoN) over a fist (nsp10)' (21) (Fig. 2B). A superposition between the SARS-CoV and SARS-CoV-2 ExoN-nsp10 structures shows only relatively small (3.0 Å or less) deviations in several regions of the complex, including the tip of the 'fingers' region of ExoN comprising nsp14 residues 40 to 50, and surface-exposed loops of nsp10 (Supplementary Fig. 2).

While our structures of SARS-CoV-2 ExoN-nsp10 obtained in the two different crystallization conditions are highly similar to each other, they show notable differences in the exonuclease active site located around the 'knuckles' of ExoN. In the crystal grown in the presence of MgCl2, we observed a magnesium ion octahedrally coordinated by Asp90, Glu92, Asp273, and three water molecules (Fig. 2C, Supplementary Fig. 3d). Another magnesium ion required for the conserved two-metal ion mechanism of 3'-5' editing exonucleases (25, 26) was not observed. The previously reported SARS-CoV nsp14-nsp10 structures also showed only one metal ion, bound at an alternative site between Asp90 and Glu191 (21, 24). This site is unoccupied in our structure, presumably due to the E191Q mutation. In contrast, the higher resolution tartrate-bound structure shows a unique configuration of a metal-free active site (Fig. 2D, Supplementary Fig. 3c). Without the magnesium ion, Asp90 takes two distinct conformers with its carboxylate group in orthogonal orientations. Glu92 is pointed away from Asp90/Asp273 and hydrogen-bonded to the Gln108 side chain, whereas His268 in turn is flipped away from Glu92.
A comparison between the Mg²⁺-bound and free structures shows a significant rearrangement for residues Gly265 to Val269 including the main chain atoms, accompanying an inward movement of His268 upon Mg²⁺-binding (Fig. 2E). These observations demonstrate high flexibility of the ExoN active site in the absence of divalent metal co-factors.

To obtain an idea about how the SARS-CoV-2 ExoN-nsp10 complex engages RNA substrates, we modeled an RNA-bound ExoN-nsp10 structure based on the double-stranded (ds) RNA-bound structures of the Lassa virus nucleoprotein (NP) exonuclease domain, which is another DEDDh-family 3'-to-5' exoribonuclease. A superposition of the Lassa NP-RNA complex (27, 28) on ExoN-nsp10 based on their conserved catalytic residues (Lassa NP: D389/E391/D466/H528/D533 according to the numbering in 4FVU (27), vs. SARS-CoV-2 ExoN: D90/E92/E191/H268/D273) places the A-form dsRNA in a shallow groove on the ExoN surface adjacent to the active site, with remarkable shape complementarity (Fig. 3 B, C). In this model, the sugar-phosphate backbone of the non-degradable (template) RNA strand tracks a positively charged patch on the ExoN surface including Lys9 and Lys61, whereas the 3' end of its complementary (degradable) strand is presented to the active site. The extensive protein contacts made by the non-degradable strand in a dsRNA substrate are consistent with the preference for dsRNA substrates by SARS-CoV-2 ExoN as shown above (Fig. 1) and by SARS-CoV ExoN reported earlier (29). Notably, we observed ordered tartrate ions from the crystallization condition bound to this basic patch in our crystal structure, potentially mimicking RNA backbone phosphate interactions (Supplementary Figs. 3a, 3b, and 4).

Our hypothetical model described above suggests that the basic patch of ExoN helps position the substrate RNA for exonucleolytic degradation.
Lys9 and Lys61 are involved in the RNA backbone interaction in our model. In addition, Lys139 is located farther down along the basic patch toward the direction of the 5'-overhang of the template strand (Fig. 3A). Thus, we tested the activities of SARS-CoV-2 ExoN with single amino acid substitutions, K9A, K61A, and K139A. These ExoN mutants were co-expressed with nsp10 and purified as heterodimeric complexes. In the exoribonuclease assay using the RNA substrates described above, all three lysine-to-alanine mutants showed lower activity than wild-type ExoN (Fig. 4). In particular, the K9A and K61A substitutions caused more severe defects than K139A, consistent with our dsRNA-binding model (Fig. 3 B, C). While the precise conformation of LS2U RNA in the absence of a complementary strand is unknown, its binding to ExoN must also depend on these Lys residues, underscoring the importance of electrostatic interactions between the mutated lysine residues and RNA for ExoN activity.

Previous studies showed that the exoribonuclease activity of nsp14 is strongly stimulated by nsp10 for both SARS-CoV and SARS-CoV-2 (29-32). In our crystal structure, the N-terminal residues of ExoN and those of nsp10 are wrapped around each other in a 'criss-cross' arrangement, forming several hydrogen-bond contacts, including one between nsp14 Lys9 and nsp10 Ala1 (Supplementary Fig. 3a). In addition, the first α-helix of nsp10 interacts with the ExoN loop harboring nsp14 Lys61, where the main chain amide group of Lys61 is hydrogen-bonded to the side chain of nsp10 Ser15 (Fig. 3A). In the absence of nsp10 supporting the RNA-binding groove from the back (Fig. 3C,

To obtain further insights into the role of nsp10 and to support our RNA-binding model, we performed explicitly solvated, all-atom molecular dynamics (MD) simulations of full-length SARS-CoV-2 nsp14, constructed from our ExoN-nsp10 co-crystal structure and a homology model of the C-terminal N7-MTase domain. Three independent copies of MD simulations totaling 2.6 µs were performed for each of nsp14 alone, the nsp14-nsp10 complex, and the nsp14-nsp10-RNA complex based on our docking model described above. In addition, three independent copies of Gaussian-accelerated MD simulations (GAMD) totaling 0.6 µs were performed for each system to enhance conformational sampling. Comparing trajectories of these simulations for the 3 systems, the most noticeable difference is an extreme flexibility of the 'fingers' region of ExoN primarily comprising its N-terminal residues (nsp14 residues 1-60), which showed large deviations from the starting model and eventually became highly disordered in the absence of nsp10. A principal component analysis for the 3 systems shows that the conformational space sampled by nsp14 is significantly larger in the absence of nsp10 (Fig. 5A, Supplementary Fig. 7).

The first principal component (PC1), which is broadly sampled by all 3 systems, corresponds to a large hinge motion of the N7-MTase domain (~50 Å translocation at the distal end, Supplementary Fig. 8, Supplementary animation 1). In the conformation with minimal PC1 (Fig. 5B, left), the substrates (S-adenosyl methionine [SAM] and GpppA)-binding cleft of the N7-MTase domain abuts against the ExoN domain, leading to occlusion of the substrates. At the other extreme, with maximal PC1, the cleft is more open to the solvent (Fig. 5B, right).
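A minimal sketch of a principal component analysis of the kind described above, assuming the trajectory frames have already been superposed on a common reference. It uses NumPy only and a synthetic one-dimensional "hinge" motion. The function name `pca_of_trajectory` and the toy data are illustrative assumptions, not the authors' analysis pipeline.

```python
import numpy as np


def pca_of_trajectory(coords, n_components=2):
    """PCA of pre-aligned coordinates with shape (n_frames, n_atoms, 3).

    Returns per-frame projections, the principal axes, and the fraction
    of total variance explained by each returned component.
    """
    n_frames = coords.shape[0]
    flat = coords.reshape(n_frames, -1)      # one row per frame
    centered = flat - flat.mean(axis=0)      # remove the average structure
    # SVD of the centered data: rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    projections = centered @ components.T    # PC coordinates of each frame
    explained = (s ** 2) / np.sum(s ** 2)
    return projections, components, explained[:n_components]


# Synthetic toy trajectory (hypothetical data): one dominant collective
# displacement plus small random noise.
rng = np.random.default_rng(0)
n_frames, n_atoms = 200, 10
base = rng.normal(size=(n_atoms, 3))
direction = rng.normal(size=(n_atoms, 3))
amplitude = np.linspace(-5.0, 5.0, n_frames)
noise = 0.05 * rng.normal(size=(n_frames, n_atoms, 3))
toy = base + amplitude[:, None, None] * direction + noise

proj, comps, frac = pca_of_trajectory(toy)
print("variance fraction of PC1:", frac[0])
```

In this toy case PC1 recovers the imposed collective displacement, in the same way that PC1 in the text is interpreted as a hinge motion. Plotting `proj[:, 0]` against `proj[:, 1]` for each system would give a conformational-space map comparable in spirit to the one described for Fig. 5A.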
The second principal component (PC2) corresponds to an ordered-to-disordered transition of the 'fingers' region of ExoN, which shows a large population of disordered conformations only for the nsp14-alone system as mentioned above (Fig. 5C,

further support for our model for dsRNA-binding (Fig. 5 F, G, Supplementary Fig. 10). An ionic interaction between Ala1 of nsp10 and RNA backbone phosphate was particularly persistent and observed for 97% of the time during the simulations (3.2 Å distance cutoff), which led to a significant stabilization of this residue in the presence of RNA (Table 3)

Our X-ray crystallographic, biochemical, and computational analyses shed light on the substrate preference, structure, and dynamics of the SARS-CoV-2 ExoN-nsp10 exoribonuclease complex and further identified important roles of nsp10 in RNA substrate binding. It is particularly notable that the ExoN-nsp10 complex preferentially degrades dsRNA substrates. This is in contrast to the proofreading exonuclease domain of high-fidelity DNA polymerases, whose active site engages the single-stranded DNA 3' end in partially melted double-stranded substrates (25, 33), and suggests a unique mechanism of proofreading. The extensive ExoN/nsp10 interface buries a total of 2203 Å² of surfaces from both proteins, spanning both the 'fingers' and 'palm' regions of ExoN. Folding of the fingers region depends on its interaction with nsp10, which involves several critical residues including nsp10 Tyr96 (31) (Fig. 2A, Supplementary Fig. 6). On the other hand, an interesting feature of the interaction in the palm region is the insertion of Phe16 and Phe19 from the first α-helix of nsp10 into a deep hydrophobic pocket of ExoN, which is essential for stable complex formation (31).
Notably, this hydrophobic pocket is located on the backside from the ExoN active site, where the nsp10 Phe19 side chain makes van der Waals contacts with the main chain of an ExoN α-helix harboring one of the catalytic residues, Glu191 (Supplementary Fig. 6). Thus, targeting this pocket of ExoN with small molecules to block its interaction with nsp10, or potentially to allosterically modulate its catalytic activity, could be a possible strategy of inhibition.

MD simulations revealed remarkable flexibility in full-length nsp14 (Supplementary Figs. 7, 8, and Supplementary animations 1, 2), which affects solvent accessibility of the SAM/GpppA-binding cleft and may play an important role in the catalytic cycle of N7-MTase (Fig. 5B). Similar conformational variation, albeit with a much smaller magnitude, was previously observed between two SARS-CoV nsp14 molecules in the asymmetric unit of a crystal (Supplementary Fig. 8) (21). Although this hinge motion was observed for all 3 systems (nsp14-alone, nsp14-nsp10, and nsp14-nsp10-RNA) in our simulations, they showed different distributions of the PC1 value (Supplementary Fig. 7). In addition, conformational sampling in the nsp14-alone system shows several clusters with distinct combinations of PC1 and PC2 values (Fig. 5A, left)

The 5'-fluorescein labeled oligonucleotides (Table 1)

performed with NAMD2.14 program (43), while Gaussian-accelerated MD simulations (GAMD) were performed with Amber20 program (44). First, each system was minimized in 4 consecutive steps by gradually decreasing restraints. Subsequently, each system was heated slowly from 0 to 310 K, and then equilibrated for about 1 ns by gradually decreasing restraints in 3 consecutive steps. For cMD, three independent copies (2 × 1 µs and 1 × 0.6 µs) of simulation were run for each system.
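As a quick consistency check on the simulation lengths given in this section, the short sketch below adds up the per-system run lengths and converts them to a frame count. The GAMD run length (3 × 0.2 µs), the 0.1 ns sampling interval, and the 32,000-point count are the values stated in the sentences that follow. Reading the 32,000 data points as cMD and GAMD frames pooled together is our assumption, and the variable names are illustrative.

```python
# Run lengths per system in nanoseconds, taken from the text.
CMD_RUNS_NS = [1000, 1000, 600]   # 2 x 1 us and 1 x 0.6 us
GAMD_RUNS_NS = [200, 200, 200]    # 3 x 0.2 us
FRAMES_PER_NS = 10                # one frame every 0.1 ns

cmd_total_ns = sum(CMD_RUNS_NS)
gamd_total_ns = sum(GAMD_RUNS_NS)

# Assumption: the reported data points pool cMD and GAMD frames.
total_frames = (cmd_total_ns + gamd_total_ns) * FRAMES_PER_NS

print(f"cMD total:  {cmd_total_ns / 1000:.1f} us")
print(f"GAMD total: {gamd_total_ns / 1000:.1f} us")
print(f"frames per system at 0.1 ns spacing: {total_frames}")
```

Under that assumption the totals agree with the 2.6 µs and 0.6 µs figures quoted in the results and with the stated 32,000 data points per system.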
For GAMD, three independent copies of 0.2 µs of simulation were run for each system using the dual-boost method following a 20-ns MD run to calculate parameters for GAMD production runs. All cMD and GAMD simulations were performed at 310 K and 1 atm with a 2 fs timestep. For each system, 32,000 data points with 0.1 ns intervals were collected from simulations and analyzed. Stability of the MD simulations is shown with RMSD plots of the nsp14 ExoN domain (Supplementary Fig. 13). MDTraj (45)

Table 1 for the substrate sequences.

The image on the right is the same as Fig. 3C.

A live, impaired-fidelity coronavirus vaccine protects in an aged, immunocompromised mouse model of lethal disease
The Enzymatic Activity of the nsp14 Exoribonuclease Is Critical for Replication of MERS-CoV and SARS-CoV-2
The coronavirus proofreading exoribonuclease mediates extensive viral recombination
Murine Hepatitis Virus nsp14 Exoribonuclease Activity Is Required for Resistance to Innate Immunity
Structural and molecular basis of mismatch correction and ribavirin excision from coronavirus RNA
Coronavirus Susceptibility to the Antiviral Remdesivir (GS-5734) Is Mediated by the Viral Polymerase and the Proofreading Exoribonuclease
Coronaviruses lacking exoribonuclease activity are susceptible to lethal mutagenesis: evidence for proofreading and potential therapeutics
Structural basis and functional analysis of the SARS coronavirus nsp14-nsp10 complex
Structural basis for the 3'-5' exonuclease activity of Escherichia coli DNA polymerase I: a two metal ion mechanism
Biochemical characterization of exoribonuclease encoded by SARS coronavirus
Structural basis for the dsRNA specificity of the Lassa virus NP exonuclease
Structures of arenaviral nucleoproteins with triphosphate dsRNA reveal a unique mechanism of immune suppression
RNA 3'-end mismatch excision by the severe acute respiratory syndrome coronavirus nonstructural protein nsp10/nsp14 exoribonuclease complex
Coronavirus Nsp10, a critical co-factor for activation of multiple replicative enzymes
New targets for drug design: Importance of nsp14/nsp10 complex formation for the 3'-5' exoribonucleolytic activity on SARS-CoV-2
Characterisation of the SARS-CoV-2 ExoN (nsp14ExoN-nsp10) complex: implications for its role in viral genome stability and inhibitor identification
Structure of DNA polymerase I Klenow fragment bound to duplex DNA
Structure-function analysis of severe acute respiratory syndrome coronavirus RNA cap guanine-N7-methyltransferase
Phaser crystallographic software
Features and development of Coot
Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix
Schrödinger Release 2021-1: Prime, Schrödinger, LLC
PDB2PQR: an automated pipeline for the setup of Poisson-Boltzmann electrostatics calculations
42. Pang YP (1999) Novel Zinc Protein Molecular Dynamics Simulations: Steps Toward Antiangiogenesis for Cancer Treatment
Scalable molecular dynamics on CPU and GPU architectures with NAMD
MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories
Electrostatics of nanosystems: application to microtubules and the ribosome

24). C, Structures that correspond to PC2 minimum and maximum values for the nsp14-alone system. The N-terminal region (residues 1-71) of nsp14 is depicted in purple ribbons while the rest of nsp14 is depicted in blue ribbons. Transparent yellow spheres represent the Cα atoms of nsp14 residues that constitute the nsp10 binding site. D, ExoN domain in the nsp14-alone system with root-mean-square fluctuations (RMSF) of Cα atoms depicted on the structure with varying tube thickness and color (low in blue to high in red). The view is similar to system with Cα RMSF depicted on the structure with varying tube thickness and color. F, RNA after Nsp14 and nsp10 are depicted as blue and green ribbons, respectively. Dark purple spheres represent two Mg ions in the active site. G, RNA after 1 µs MD simulation of the nsp14-nsp10-RNA system, with nsp14 ExoN domain (cyan) or nsp10 (green) residues making persistent hydrogen-bond or salt bridge interactions with RNA in MD simulations shown as sticks. The active site residues of ExoN are also shown

Supplementary Fig. 12 | SDS-PAGE of purified SARS-CoV-2 ExoN(E191Q)-nsp10 complex. This protein complex was used in the crystallographic studies
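The "persistent" interactions referred to in the Fig. 5G legend, and the 97% figure quoted in the text for the nsp10 Ala1 to RNA phosphate contact (3.2 Å distance cutoff), can be illustrated with a small sketch of how such a persistence value may be computed from per-frame distances. The function name `contact_persistence`, the inclusive cutoff, and the distance values are hypothetical. The toy array is constructed to give 97% and is not simulation data.

```python
import numpy as np


def contact_persistence(distances, cutoff=3.2):
    """Fraction of frames in which a distance (in Å) is within the cutoff."""
    distances = np.asarray(distances, dtype=float)
    if distances.size == 0:
        raise ValueError("at least one frame is required")
    # Assumption: a frame counts as 'in contact' when distance <= cutoff.
    return float(np.mean(distances <= cutoff))


# Hypothetical per-frame donor-acceptor distances: 97 frames in contact,
# 3 frames with the contact broken.
toy_distances = np.array([2.8] * 97 + [4.0] * 3)
persistence = contact_persistence(toy_distances)
print(f"contact present in {persistence:.0%} of frames")
```

With real trajectories, the distance array would come from the atom pair of interest in every saved frame, and the same function could be applied to each candidate hydrogen bond or salt bridge to rank interactions by persistence.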