key: cord-0902170-f72t6gy5
authors: Perry, Jason K.; Appleby, Todd C.; Bilello, John P.; Feng, Joy Y.; Schmitz, Uli; Campbell, Elizabeth A.
title: An atomistic model of the coronavirus replication-transcription complex as a hexamer assembled around nsp15
date: 2021-06-08
journal: bioRxiv
DOI: 10.1101/2021.06.08.447516
sha: 2ec1c849ba3e9aae0daf1f97835911c2c876d6c3
doc_id: 902170
cord_uid: f72t6gy5

Using available cryo-EM and x-ray crystal structures of the nonstructural proteins that are responsible for SARS-CoV-2 viral RNA replication and transcription, we have constructed an atomistic model of how the proteins assemble into a functioning superstructure. Our principal finding is that the complex is hexameric, centered around nsp15. The nsp15 hexamer is capped on two faces by trimers of nsp14/nsp16/(nsp10)2, where nsp14 is seen to undergo a large conformational change between its two domains. This conformational change facilitates binding of six nsp12/nsp7/(nsp8)2 polymerase subunits to the complex. To this, six subunits of nsp13 are arranged around the superstructure, but not evenly distributed. Two of the six polymerase subunits are each proposed to carry dimers of nsp13, while two others are proposed to carry monomers. The polymerase subunits that coordinate nsp13 dimers also bind the nucleocapsid, which positions the 5’-UTR TRS-L RNA over the polymerase active site, a state distinguishing transcription from replication. Analyzing the path of the viral RNA indicates the dsRNA that exits the polymerase passes over the nsp14 exonuclease and nsp15 endonuclease sites before being unwound by a convergence of zinc fingers from nsp10 and nsp14. The template strand is then directed away from the complex, while the nascent strand is directed to the sites responsible for mRNA capping (the nsp12 NiRAN and the nsp14 and nsp16 methyltransferases).
The model presents a cohesive picture of the multiple functions of the coronavirus replication-transcription complex and addresses fundamental questions related to proofreading, template switching, mRNA capping and the role of the endonuclease. It provides a platform to guide biochemical and structural research to address the stoichiometric and spatial configuration of the replication-transcription complex.

Author Summary

The replication of the coronavirus genome and the synthesis of subgenomic mRNA is a complex process involving multiple viral proteins. Despite a fairly complete structural picture of the individual proteins that are believed to coalesce into a larger replication-transcription complex, there is no clear model of how these proteins interact. Here we present the first detailed atomistic model of a complete replication-transcription complex for SARS-CoV-2, made up of the non-structural proteins nsp7-nsp16, as well as the nucleocapsid. Forming a large, hexameric superstructure centered around nsp15, the model provides new perspective on the function of its individual components, including the exonuclease, the endonuclease, the NiRAN site, the helicase, the multiple zinc fingers, and the nucleocapsid. It offers a cohesive view of replication, proofreading, template switching and mRNA capping, which should serve as a guide for future experimental exploration.

Introduction

nsp7, nsp8, nsp9, nsp10 and the nucleocapsid (N) have been shown to be involved in replication and transcription as well. [10-13]

Unlike many other viral families, structures have been determined for all key proteins that are presumed to make up the coronavirus replication-transcription complex (RTC), several of which are shown in Figure S1. To date, there are x-ray crystal structures of coronavirus nsp13, [14, 15] heterodimeric nsp14/nsp10, [6, 16, 17] heterodimeric nsp16/nsp10, [11, 12, 18-21] hexameric nsp15, [22-27] dimeric nsp9, [28-30] and the N protein NTD bound to both dsRNA and the specific viral RNA oligo known as the transcription regulatory sequence (TRS), which is critical to the unusual template switching process that occurs during transcription. [31] Cryo-EM has been especially successful in illuminating the structure of the polymerase complex, made up of nsp12, nsp7, two subunits of nsp8, and up to two subunits of nsp13. [2, 5, 10, 32-36] In this complex, nsp7 and the two nsp8 subunits sit atop the nsp12 Pol active site, coordinating to the thumb and fingers domains. The long and flexible N-terminal (N-term) nsp8 helices extend out over the exiting dsRNA when the complex is captured in its replicating state. Coordinated to these nsp8 helices, two subunits of nsp13 sit above the polymerase complex, where one of them has been observed engaging the downstream RNA template overhang.
Yet despite this wealth of structural information, there is no clear picture of how the polymerase complex interacts with the remaining proteins to form the complete RTC, leaving major questions of viral RNA processing unanswered. Functionally, it is inferred that the nsp14 ExoN must have some interaction with nsp12 in order to gain access to the 3' end of the nascent strand to carry out proofreading during RNA synthesis. The dsRNA that emerges from the polymerase must eventually be unwound. To produce transcripts for protein translation, the resulting 5' end of the nascent positive-strand must be directed to the mRNA capping sites: presumably the NiRAN site of nsp12 and the two MTase sites of nsp14 and nsp16. The complicated process of template switching during transcription

However, two exceptions emerged. The nsp9 dimer, a putative RNA binding protein, demonstrated a clear preference for binding to the 2A domain of nsp13 (Figure S2). This proved to be a consistent finding, as we followed up with docking of the nsp9 dimer to several available structures of nsp13, capturing a variety of conformations of this protein. Notably, the position of nsp9 on nsp13 is such that it is ideally situated to interact with the 5' end of the ssRNA as it exits the helicase RNA binding groove.

Additionally, we found that the NTD of the N protein, as bound to a 10-nt oligo corresponding to the TRS-L of the SARS-CoV-2 genome, binds robustly between the two nsp13 subunits of the polymerase
is also well exposed to solvent on the exit side of the polymerase, implying that full-length N protein could bind unimpeded. Interestingly, we found that structures of the N protein that are either apo or dsRNA binding do not dock to this site. Only the structure with the co-crystalized TRS-L segment binds here, which is likely due to its more spherical shape.
Nsp15-dsRNA interaction - Having completed this initial survey of potential binary protein-protein interactions, we considered the possibility that nsp15 retained its hexameric form in the RTC. Docking of the nsp15 hexamer to the polymerase complex did not identify any direct interaction between the proteins, but instead revealed a specific interaction between nsp15 and dsRNA that was common to all of the top poses. We followed up with docking of isolated dsRNA to the nsp15 hexamer and again found this particular binding mode was dominant (Figure 2c). The RNA runs from one face of the nsp15 hexamer to the other and is in contact with three subunits (A1, B1 and B2). As shown in Figure S3

We optimized the structure of ExoN/nsp10 interacting with nsp15 and dsRNA as described above (Figure 3a) and arranged six subunits symmetrically around the hexamer. We then docked the MTase domain to this complex and observed a compelling binding mode in which the MTase rotated approximately 180° with respect to its x-ray structure conformation (Figure 3b

three dsRNA helices, while also positioning clusters of zinc fingers around the three antiparallel dsRNA helices just above the EndoN sites (Figure 3d).

The significance of this placement of nsp16/nsp10.16 with respect to nsp15/nsp14/nsp10.14 is that it puts both MTase sites near each other, an outcome consistent with their roles in capping. It also adds an additional pair of zinc fingers to interact with the dsRNA just past the EndoN site. In total there are six of these in close proximity (two from nsp10.14, two from nsp10.16, one from nsp14 ExoN, and one from nsp14 MTase). Interestingly, a narrow channel lined with basic residues is carved out by this arrangement of the proteins that could accommodate ssRNA. It runs from the EndoN site to the trigonal face accommodating both MTase sites. Given the dsRNA steric blockade also created by these proteins, we concluded this was the site of strand separation. Without the need for a helicase, the model suggests strand separation is facilitated by the zinc fingers, three of which (two from nsp10.14 and one from nsp14 MTase) act to direct the 3' strand away from the complex, while two others (one from nsp10.16 and one from nsp14 ExoN) direct the 5' strand into the basic channel. From there, the 5' strand is funneled to the mRNA capping sites (nsp14 MTase and nsp16 MTase). We built a model of RNA following these two paths to illustrate the point.

Nsp12/(nsp8)2-nsp14/nsp15 interaction - As discussed above, the most likely site for the ExoN and the

the two nsp14 domains (Figure 4b). Notably, D825 on this beta hairpin is in proximity to another nsp14 zinc finger, which sits below the ExoN active site. This complex was optimized, allowing additional flexibility for the poorly resolved nsp12 C-term residues 917-929.

When arranging additional polymerase complexes around the hexamer, we then focused on the position of the nsp8 N-term helices which extend over the dsRNA. These helices are composed of a short helix (residues 12-28) and a long helix (residues 32-97), which are seen to fold back into a bundle via a short connecting loop (residues 29-31).

But this area, rich in both the zinc fingers and basic residues, creates well defined pathways to separate the template and nascent strands (Figure 6d). The nsp14 C-term residues 519-526 sit in the major groove of the dsRNA, just above the EndoN site. Just beyond this point, the nsp14 CTD zinc finger and one of the nsp10.14 zinc fingers sit above the nascent strand, distorting its path. The template strand is sterically prevented from continuing in its dsRNA form and is directed between the two nsp10.14 zinc fingers away from the complex.
The nascent strand is then directed into a channel lined with basic residues primarily coming from nsp14 ExoN. It encounters two additional zinc fingers (the nsp14 ExoN

TRS-L and completes synthesis of the 5' leader, skipping over everything in between. The N protein has been thought to be involved in this process of recoupling, in that its NTD has been shown to specifically bind to the TRS sequence and is essential to transcription. [13, 56] The N protein CTD is a dimerization domain, such that binding of an N dimer to both the TRS-L and TRS-B would bring the two templates into close proximity. Yet a detailed picture of the mechanics of template switching is largely a mystery.

Here we showed via protein-protein docking that the N protein, when bound to the TRS oligo, positions itself between two nsp13 subunits over the polymerase active site. The orientation is such that once the complementary sequence is synthesized, the nascent strand could potentially recouple to this parallel template. We envision several factors that would allow this template switch to happen.

oligo is positioned over the polymerase active site, parallel to the template, with its 5' end exposed on the entrance side of the polymerase and its 3' end exposed on the exit side. The C-term S180 residue of the N NTD is also exposed on the exit side of the polymerase, indicating full length N could bind to the

Coronavirus biology and replication: implications for SARS-CoV-2. Nat Rev Microbiol
Structure of replicating SARS-CoV-2 polymerase
Discovery of an essential nucleotidylating activity associated with a newly delineated conserved domain in the RNA polymerase-containing protein of all nidoviruses
Coronavirus replication-transcription complex: Vital and selective NMPylation of a conserved site in nsp9 by the NiRAN-RdRp subunit
Cryo-EM Structure of an Extended SARS-CoV-2 Replication and Transcription Complex Reveals an Intermediate State in Cap Synthesis
Structural and molecular basis of mismatch correction and ribavirin excision from coronavirus RNA
Coronavirus nonstructural protein 16 is a cap-0 binding enzyme possessing (nucleoside-2'O)-methyltransferase activity
Multiple enzymatic activities associated with severe acute respiratory syndrome coronavirus helicase
An "Old" protein with a new story: Coronavirus endoribonuclease is important for evading host antiviral defenses
Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors
RNA 3'-end mismatch excision by the severe acute respiratory syndrome coronavirus nonstructural protein nsp10/nsp14 exoribonuclease complex
Crystal structure and functional analysis of the SARS-coronavirus RNA cap 2'-O-methyltransferase nsp10/nsp16 complex
Interactions between coronavirus nucleocapsid protein and viral RNAs: implications for viral transcription
Crystal structure of Middle East respiratory syndrome coronavirus helicase. PLoS Pathog
Delicate structural coordination of the Severe Acute Respiratory Syndrome coronavirus Nsp13 upon ATP hydrolysis
Structural basis and functional analysis of the SARS coronavirus nsp14-nsp10 complex
Structure and dynamics of SARS-CoV-2 proofreading exoribonuclease ExoN. bioRxiv
Biochemical and structural insights into the mechanisms of SARS coronavirus RNA ribose 2'-O-methylation by nsp16/nsp10 protein complex
Structural analysis of the SARS-CoV-2 methyltransferase complex involved in RNA cap creation bound to sinefungin
Crystal structure of SARS-CoV-2 nsp10/nsp16 2'-O-methylase and its implication on antiviral drug design
High-resolution structures of the SARS-CoV-2 2'-O-methyltransferase reveal strategies for structure-based inhibitor design
Structural and functional analyses of the severe acute respiratory syndrome coronavirus endoribonuclease Nsp15
Crystal structure of a monomeric form of severe acute respiratory syndrome coronavirus endonuclease nsp15 suggests a role for hexamerization as an allosteric switch
Crystal structure of Nsp15 endoribonuclease NendoU from SARS-CoV-2. Protein Sci
Tipiracil binds to uridine site and inhibits Nsp15 endoribonuclease NendoU from SARS-CoV-2
Crystal structure and mechanistic determinants of SARS coronavirus nonstructural protein 15 define an endoribonuclease family
Structural and Biochemical Characterization of Endoribonuclease Nsp15 Encoded by Middle East Respiratory Syndrome Coronavirus
The severe acute respiratory syndrome-coronavirus replicative protein nsp9 is a single-stranded RNA-binding subunit unique in the RNA virus world
Variable oligomerization modes in coronavirus non-structural protein 9
The nsp9 replicase protein of SARS-coronavirus, structure and functional insights
Structural basis of RNA recognition by the SARS-CoV-2 nucleocapsid phosphoprotein
Structural Basis for Helicase-Polymerase Coupling in the SARS-CoV-2 Replication-Transcription Complex
Structure of the RNA-dependent RNA polymerase from COVID-19 virus
Structural and Biochemical Characterization of the nsp12-nsp7-nsp8 Core Polymerase Complex from SARS-CoV-2
Structural Basis for RNA Replication by the SARS-CoV-2 Polymerase
Architecture of a SARS-CoV-2 mini replication and transcription complex
Continuous and Discontinuous RNA Synthesis in Coronaviruses
Biochemical and genetic analyses of murine hepatitis virus Nsp15 endoribonuclease
In Situ Tagged nsp15 Reveals Interactions with Coronavirus Replication/Transcription Complex-Associated Proteins
Structure of DNA polymerase I Klenow fragment bound to duplex DNA
Probing the structural and molecular basis of nucleotide selectivity by human mitochondrial DNA polymerase gamma
Structure of the processive human Pol delta holoenzyme
Structural basis for the dsRNA specificity of the Lassa virus NP exonuclease. PLoS One
Structural basis for backtracking by the SARS-CoV-2 replication-transcription complex
Crystal structures of the BsPif1 helicase reveal that a major movement of the 2B SH3 domain is required for DNA unwinding
Major genetic marker of nidoviruses encodes a replicative endoribonuclease
Coronavirus nonstructural protein 15 mediates evasion of dsRNA sensors and limits apoptosis in macrophages
Coronavirus endoribonuclease targets viral polyuridine sequences to evade activating host sensors
The 5' end of coronavirus minus-strand RNAs contains a short poly(U) tract
Structural snapshots of actively transcribing influenza polymerase
Molecular mechanisms of coronavirus RNA capping and methylation. Virol Sin
AT-527, a Double Prodrug of a Guanosine Nucleotide Analog, Is a Potent Inhibitor of SARS-CoV-2 In Vitro and a Promising Oral Antiviral for
Coronavirus nucleocapsid protein facilitates template switching and is required for efficient transcription
A U-turn motif-containing stem-loop in the coronavirus 5' untranslated region plays a functional role in replication. RNA
Mouse hepatitis virus stem-loop 4 functions as a spacer element required to drive subgenomic RNA synthesis
Structural lability in stem-loop 1 drives a 5' UTR-3' UTR interaction in coronavirus replication
Structure of the SARS coronavirus nucleocapsid protein RNA-binding dimerization domain suggests a mechanism for helical packaging of viral RNA
PIPER: an FFT-based protein docking program with pairwise potentials. Proteins

Binding of the polymerase is largely through nsp12 interactions with the conformationally altered face of nsp14. Much of this binding comes from the C-term helices of nsp12 (residues 855-923) interacting with the MTase domain of nsp14. (b) The nsp12 beta hairpin (residues 815-831) sits in the cleft between the two domains of nsp14, in close proximity to a zinc finger. (c) The short N-term helix of nsp8.2 (residues 12-28) extends to interact with nsp15, with the N-term residues (1-11) sitting under the dsRNA just ahead of the EndoN site. (d) View of a pair of polymerases bound to the nsp15/nsp14/nsp16 complex

12 nsp10 and 2 N proteins. The six nsp13 subunits are arranged across nsp12 pairs in 2/0 (A1/B1), 1/1 (A2/B2), and 0/2 (A3/B3) stoichiometries. The polymerase complexes with two associated nsp13 subunits
The two polymerase complexes with a single nsp13 subunit (A2 and B2) are responsible for replication

It is separated into template (blue) and nascent (red) strands at nsp10, and the nascent strand is directed to the NiRAN and two MTase sites. (b) Detail of dsRNA exiting the polymerase and passing over the ExoN. The dsRNA is expected to shift into the ExoN active site when encountering a prematurely terminated nascent strand. (c) Detail of the dsRNA passing over the EndoN site, where the template strand would be the substrate. (d) Detail of strand separation occurring at the convergence of two zinc fingers from nsp10.14 and one from nsp14 CTD.
The template strand is directed away from the complex

Model of template switching (a) The 5'-UTR coordinates to the nsp13 dimer, with TRS-L bound N protein positioned above the polymerase active site. RNA synthesis begins on the 3' end of the template. (b) Synthesis continues until the N protein dimerizes with another N protein bound to TRS-B on the template. (c) The N proteins release the RNA. (d) The complementary TRS-B of the nascent strand recouples with TRS-L. (e) The shift in RNA position triggers nsp13 template backtracking, unwinding the dsRNA. (f) Once fully unwound

Key components of the coronavirus RTC. (a) SARS-CoV-2 polymerase complex of nsp12 (green), nsp13 (orange), nsp7 (white) and nsp8 (yellow) (PDB: 6XEZ). (b) Homology model of
(c) SARS-CoV-2 2'O-MTase nsp16 (pink) with nsp10 cofactor (PDB: 6WVN). (d) Hexamer of SARS-CoV-2 EndoN nsp15 (cyan) (PDB: 6X1B). Subunits of the nsp15 hexamer are labeled, reflecting two trigonal faces

Detail of dsRNA path over hexameric nsp15. The RNA interacts with eight basic residues from three different subunits and passes over the EndoN site

The hexameric RTC is divided into six separate PDB files
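The subunit counts stated in the abstract and the figure captions can be tallied as a quick consistency check. The following is a minimal sketch in Python and is not part of the published model or its PDB files: the names `PER_PROTOMER`, `NSP13_PER_POLYMERASE` and `tally` are illustrative assumptions. The per-protomer counts are read from the text (one nsp15, one nsp14, one nsp16, two nsp10 and one nsp12/nsp7/(nsp8)2 polymerase in each of six protomers), as are the nsp13 distribution (2/0, 1/1, 0/2 across the A/B polymerase pairs) and the statement that polymerases carrying an nsp13 dimer also bind the nucleocapsid.

```python
from collections import Counter

# Copies of each protein in one of the six protomers, as read from the text.
PER_PROTOMER = {"nsp15": 1, "nsp14": 1, "nsp16": 1, "nsp10": 2,
                "nsp12": 1, "nsp7": 1, "nsp8": 2}

# nsp13 copies carried by each polymerase: 2/0 (A1/B1), 1/1 (A2/B2), 0/2 (A3/B3).
NSP13_PER_POLYMERASE = {"A1": 2, "B1": 0, "A2": 1, "B2": 1, "A3": 0, "B3": 2}


def tally():
    """Return the total copy number of each protein in the hexameric complex."""
    counts = Counter()
    for _ in NSP13_PER_POLYMERASE:  # one pass per protomer (six in total)
        counts.update(PER_PROTOMER)
    counts["nsp13"] = sum(NSP13_PER_POLYMERASE.values())
    # One N protein on each polymerase that carries an nsp13 dimer.
    counts["N"] = sum(1 for n in NSP13_PER_POLYMERASE.values() if n == 2)
    return dict(counts)


if __name__ == "__main__":
    for name, n in sorted(tally().items()):
        print(name, n)
```

With these inputs the tally gives 12 nsp10, 6 nsp13 and 2 N proteins, which agrees with the counts quoted in the captions above.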