key: cord-0944157-yi27j0hx authors: Treviño, Miguel Á.; Pantoja-Uceda, David; Laurents, Douglas V.; Mompeán, Miguel title: SARS-CoV-2 Nsp8 N-terminal domain dimerizes and harbors autonomously folded elements date: 2021-03-14 journal: bioRxiv DOI: 10.1101/2021.03.12.435186 sha: 18bd8a5fcdc7238c98eae712451ac6cdeb0f0394 doc_id: 944157 cord_uid: yi27j0hx

The SARS-CoV-2 Nsp8 protein is a critical component of the RNA replicase, as its N-terminal domain (NTD) anchors Nsp12, the RNA, and Nsp13. Whereas its C-terminal domain (CTD) structure is well resolved, the conformation adopted by the NTD remains under debate: it is predicted to be disordered, yet it is found in a variety of complex-dependent conformations or is missing from many other structures. Using NMR spectroscopy, we show that the SARS-CoV-2 Nsp8 NTD features both well-folded secondary structure and disordered segments. Our results suggest that while the part of this domain corresponding to two long α-helices forms autonomously, the folding of other segments would require interaction with other replicase components. When isolated, the α-helix population progressively declines towards the C-terminus, and dynamics measurements indicate that the Nsp8 NTD behaves as a dimer under our conditions.

As of March 12th, 2021, the COVID-19 pandemic has affected over 118 million persons and caused over 2.6 million deaths worldwide (https://coronavirus.jhu.edu/map.html). The virus responsible for the disease, SARS-CoV-2, is a coronavirus whose unusually long (30 kb) RNA genome is replicated by a rather sophisticated RNA polymerase. This polymerase complex is composed of several non-structural proteins (Nsp) including the RNA-dependent RNA polymerase, Nsp12, as well as Nsp7 and Nsp8, which embrace the RNA to promote processivity 1.
Nsp8 serves as the platform onto which Nsp7 and Nsp12 bind, and cryo-EM studies have shown that it also contains binding regions that recruit the helicase Nsp13 into the replicase-transcription complex (RTC) 2,3. Due to this important role, a score of cryo-EM and crystallographic studies, detailed in Sup. 15 ). In all of these, the Nsp8 CTD appears well structured, adopting the same globular fold, which interacts with Nsp7. Conversely, the NTD of Nsp8 shows a high degree of plasticity, with different conformations present in distinct contexts. Interestingly, bioinformatic analyses predict disorder in half of the Nsp8 NTD 16, which may correlate with the absence of large portions of NTD residues in many crystallographic and cryo-EM structures (Sup. Table 1). This, together with the number of distinct non-structural proteins that anchor to Nsp8 to create the replicase, suggests that the Nsp8 NTD behaves as an intrinsically disordered protein region (IDPR) that folds upon complexation. Indeed, an early crystallographic study of SARS Nsp8, whose NTD is identical to that of SARS-CoV-2 Nsp8 except for the conservative Y15→F substitution, showed that the NTD can adopt two strikingly different conformations when it combines with Nsp7 to form a hollow PCNA-like complex 1. Given that the Nsp8 NTD integrates Nsp12, Nsp13 and RNA 12, and that its predicted intrinsic disorder and observed conformational plasticity prevent direct observation by crystallography or cryo-EM, we sought to characterize its conformation and dynamics using solution NMR spectroscopy. Our results also provide a framework to rationalize the role of Nsp8 dimerization in the assembly of the replicase.

SARS-CoV-2 Nsp8 NTD production and purification: Following purification by Ni²⁺-NTA affinity chromatography, His tag cleavage and polishing by anion exchange chromatography (Fig. S1B), sample homogeneity was confirmed by SDS-PAGE (Fig. S1A).
The average yields were 5.0 mg/L from LB broth and 2.8 mg/L from minimal media for labeled samples.

Nsp8 NTD solution secondary structure: For most signals, the 2D ¹H-¹⁵N HSQC spectrum of the Nsp8 NTD shows the excellent dispersion that is a hallmark of well-folded proteins (Fig. 1A). Closer inspection reveals a subset of narrower, more intense resonances corresponding to residues at the N- and C-termini. Spectral analysis led to the assignment of over 95% of the backbone ¹³Cα, ¹³ [...] The assigned chemical shifts have been deposited in the BMRB under accession code 50788. The ¹³Cα conformational chemical shifts (Δδ) reveal two highly populated α-helical structures spanning residues 11-27 and 32-50 (Fig. 1B). Following residue 50, the helix continues but its population gradually declines, with the last stretch of residues (74-84) being chiefly disordered. The first ten residues are also disordered. The position and high population of the two α-helical segments were corroborated by ¹Hα and ¹³CO conformational chemical shifts as well as ¹HN-¹Hα coupling constants (³JHNHA) (Fig. S2). This secondary structure, observed at 5 °C and quasi-physiological pH, is maintained at 25 °C and 37 °C (Fig. S3). Regarding the residues linking the two α-helices, N28-G29-D30-S31, their ¹³Cα, ¹³Cβ, ¹³CO, ¹HN and ¹Hα chemical shifts suggest that they adopt a type II′ tight turn 17,18.

The per-residue {¹H}-¹⁵N NOE ratios, which are sensitive to dynamics on fast ps/ns timescales, are plotted in Fig. 1C. The residues composing the two α-helical segments show relatively high values averaging about 0.75. This indicates considerable stiffness, although the value is significantly below the theoretical ratio of 0.86 expected for complete rigidity, which is often observed in well-folded proteins. Ratios are lower in the first ten residues and decrease progressively beyond residue 60, reflecting increased mobility.
The values for residues 60-80 are in the neighborhood of 0.5, meaning that this segment is moderately mobile but significantly more rigid than a statistical coil, which typically has values close to zero or even negative. Residues 11-50 also show reduced longitudinal (R1) and elevated transverse (R2) relaxation rates, which reflect dampened mobility on μs/ms timescales (Fig. 1D,1E [...]

Nsp8 is an essential anchoring scaffold for the replication-transcription complex (RTC) of SARS-CoV-2, and its NTD binds the other critical components, particularly the RNA-dependent RNA polymerase Nsp12, the helicase Nsp13 and RNA. In these assemblies, the Nsp8 NTD adopts well-folded structures that contrast with bioinformatics analyses that predict this domain to be largely disordered 16. In many other crystallographic or cryo-EM structures, large portions of the NTD are missing (Table S1) or adopt dramatically distinct conformations (Fig. 2), which evinces structural plasticity. In particular, the Nsp8 NTD of SARS-CoV, which is almost identical to the SARS-CoV-2 Nsp8 NTD, adopts two strikingly different conformations when eight Nsp8 monomers combine with eight Nsp7 monomers to form a hexadecameric ring 1 (Fig. 2A). In one conformation, most of the Nsp8 NTD is absent whereas in the other, three α-helices are adopted ( [...]

One possibility is a coupled folding and binding event wherein an isolated, disordered Nsp8 NTD folds upon complexation. In line with this scenario, Nsp8 forms distinct dimers with Nsp7, and the Nsp8 NTD folds in one type of dimer 1 (Fig. 2A). In this structure and in larger complexes, Nsp7 interacts with the CTD of Nsp8 2,3,8-15 (Fig. 2B), which might suggest that the Nsp8 CTD then coaxes its NTD to fold. By contrast, here we show by NMR spectroscopy that the Nsp8 NTD, in the absence of other subunits and of its CTD, exists as a folded dimer.
This autonomous structure, consisting of two rather long, rigid α-helices spanning residues 11-50 followed by a moderately-to-fully disordered segment (residues 51-84), might fold first and thus provide a foundation for anchoring Nsp7, Nsp12, Nsp13 and RNA to build up the RTC (Fig. 2B). Within this working model, the segment comprising residues 51-84 could become fully helical as the complex grows (Fig. 2B).

One puzzling detail is the autonomous dimerization of the Nsp8 NTD, which could contribute to the formation of a hexadecameric ring structure 1 (Fig. 2A). However, the two Nsp8 NTDs do not contact each other in recently reported assemblies of the RTC 2,3,8-15 (Fig. 2B). We can envision two possible scenarios. First, eight copies each of Nsp8 and Nsp7 could indeed combine to form a hexadecameric ring, which would afford PCNA-like processivity to the RTC as proposed by Zhai et al. 1 (Fig. 2A). By contrast, if the physiological RTC were like the recent cryo-EM structures 2,3,8-15 (Fig. 2B), then the dimerization of the Nsp8 NTD might initially serve to self-chaperone hydrophobic patches that later form interactions with Nsp12, Nsp13 and RNA. The NMR data generated in this study represent valuable tools to test these scenarios by following changes in the assigned ¹H-¹⁵N HSQC spectrum of the Nsp8 NTD upon titration with other RTC components. As Nsp8 is the foundation of the replicase, these assignments should also be key for identifying drug-like inhibitors that block Nsp8 NTD dimerization or its associations with other replicase subunits, with a view to developing improved therapeutics. Beyond the replicase, these data can be used to map interactions with SARS-CoV-2 Orf6, a known partner 22, which plays a key role in blocking the interferon response.

Sample production and isotopic labeling: The gene coding for the Nsp8 NTD, whose sequence is [...]

The domain was purified by three chromatography steps.
Briefly, after lysis by sonication, the supernatant was loaded on a HisTrap FF crude 5 mL column (Cytiva, Marlborough, MA) and eluted using an imidazole gradient whose initial and final concentrations were 10 and 500 mM, respectively. The eluted fusion protein was cleaved overnight at room temperature with TEV protease and dialyzed to eliminate imidazole. The cleaved sample was reloaded on the same column, and the flowthrough was collected and applied to a Bio-Scale™ Mini Bio-Gel® P-6 desalting cartridge (Bio-Rad, Hercules, CA) for desalting. Finally, the sample was loaded on a 5 mL HiTrap Q HP anion exchange column (Cytiva, Marlborough, MA) at pH 8, and the nonretained fraction was collected. Homogeneity following purification was confirmed by gradient gel (4-20% acrylamide) SDS-PAGE and NMR spectroscopy. Prior to NMR spectroscopy, the sample was transferred to a buffer containing 50 mM NaCl, 10 mM KH2PO4, pH 6.1. The spectra were referenced using DSS as the internal chemical shift standard. Spectra were recorded at 5.0 °C, except 2D ¹H-¹⁵ [...]

Two strikingly different conformations are adopted in Nsp8 NTD dimers in the hexadecameric ring formed by the combination of Nsp7 and Nsp8 dimers in SARS-CoV (PDB 2AHM) 1. The zoomed box illustrates that one of the two monomers in the Nsp8 NTD dimers contains three helices, while in the second monomer part of the NTD is missing (see Table S1) but the remaining C-terminal part of the NTD preserves the helix. B. Two different structures of the RTC of SARS-CoV-2. The NTD of Nsp8 harbors interaction sites to bind RNA, Nsp12 (left, PDB 6YYT) 2 and Nsp13 (right, PDB 6XEZ) 3, and adopts a helix-loop-helix conformation (zoomed regions). The bottom box illustrates an NMR-based 2D representation of the NTD in the absence of RNA, Nsp12 and Nsp13. Based on data shown in Fig. 1, a small yet rigid helix-loop-helix core forms autonomously, and is flanked by two N- and C-terminal segments.
Whereas the N-terminal stretch is fully disordered, the C-terminal part possesses both nascent helical structure and a disordered segment. Nsp12 and Nsp13 bind to these C-terminal regions of the NTD, which become fully helical upon complexation.

A. Chromatogram of the elution of ¹³C,¹⁵N Thioredoxin-His6-TEV cleavage sequence-Nsp8 NTD from the Ni²⁺-NTA column. The blue trace corresponds to absorbance at 280 nm, the green trace marks the imidazole gradient, which ranged from 10 mM to 500 mM, the red lines mark the collected fractions, the magenta line indicates the injection point and the light blue line represents the conductivity. B. Gradient (4-20% acrylamide) SDS-PAGE gel of the final purified ¹³C,¹⁵N Nsp8 NTD. The numbers on the right indicate the size, in kDa, of the molecular weight markers.

Supporting Figure 3. The Nsp8 NTD preserves its autonomous structure at physiological temperature. A. 2D ¹H-¹⁵N HSQC spectra of ¹³C,¹⁵N-labeled Nsp8 NTD at 5.0 °C (left panel), 25.0 °C (middle panel) and 37.0 °C (right panel). Although some peaks decrease in intensity, putatively due to exchange with ¹H2O, native signals are generally preserved at 37 °C. B. ¹³Cα conformational chemical shifts measured on the basis of 3D HNCA spectra at 5.0 °C (blue bars), 25.0 °C (gold bars) and 37.0 °C (red bars). The very small changes indicate that the α-helix content scarcely changes over this temperature range.

Insights into SARS-CoV Transcription and Replication from the Structure of the Nsp7-Nsp8 Hexadecamer
Structure of Replicating SARS-CoV-2 Polymerase
Structural Basis for Helicase-Polymerase Coupling in the SARS-CoV-2 Replication-Transcription Complex
Structural Analysis of the Putative SARS-CoV-2 Primase Complex
Crystal Structure of 2019-NCoV Nsp7-Nsp8c Complex. PDB: 6M5I
The 1.95 Å Crystal Structure of the Co-Factor Complex of Nsp7 and the C-Terminal Domain of Nsp8 from SARS CoV-2. PDB: 6WQD
The 1.5 Å Crystal Structure of the Co-Factor Complex of Nsp7 and the C-Terminal Domain of Nsp8 from SARS CoV-2. PDB: 6XIP
Structure of the RNA-Dependent RNA Polymerase from COVID-19 Virus
Structure of SARS-CoV-2 RDRp/RNA Complex at 3.4 Å Resolution. PDB: 6XQB
Structural and Biochemical Characterization of the Nsp12-Nsp7-Nsp8 Core Polymerase Complex from SARS-CoV-2
Mechanism of SARS-CoV-2 Polymerase Stalling by Remdesivir
Structural Basis for Inhibition of the RNA-Dependent RNA Polymerase from SARS-CoV-2 by Remdesivir
Structural Basis for RNA Replication by the SARS-CoV-2 Polymerase
Architecture of a SARS-CoV-2 Mini Replication and Transcription Complex
Cryo-EM Structure of an Extended SARS-CoV-2 Replication and Transcription Complex Reveals an Intermediate State in Cap Synthesis

Δδ ¹³CO (panel A), Δδ ¹H (panel B), and ³JHNHA coupling constant (panel C) values are plotted versus sequence number. The lines mark the values expected for α-helix in panels A and B; in panel C, the red, green and lines represent the values expected for β-sheet
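
As a rough illustration of how ¹³Cα conformational chemical shifts such as those plotted in Fig. 1B and Fig. S3B can be read as approximate helix populations, the sketch below subtracts a random-coil reference shift from each observed shift and scales the difference by the value expected for a fully formed α-helix. The full-helix value of 3.1 ppm, the coil reference shifts, the example measurements and all function names are illustrative assumptions; none of them is taken from this study, and this is not necessarily the procedure used to produce the figures.

```python
from typing import Dict

# ASSUMPTION: approximate 13C-alpha conformational shift (ppm) of a fully
# populated alpha-helix. A commonly quoted value, not a number from this study.
FULL_HELIX_DELTA_CA_PPM = 3.1


def conformational_shift(observed_ppm: float, coil_ppm: float) -> float:
    """Return the conformational shift: observed minus random-coil reference."""
    return observed_ppm - coil_ppm


def helix_population(delta_ppm: float,
                     full_helix_ppm: float = FULL_HELIX_DELTA_CA_PPM) -> float:
    """Convert a 13C-alpha conformational shift into a percent helix estimate.

    The estimate is linear in the shift and clamped to the 0-100 % range.
    """
    fraction = delta_ppm / full_helix_ppm
    return 100.0 * min(max(fraction, 0.0), 1.0)


def helix_profile(observed: Dict[int, float],
                  coil: Dict[int, float]) -> Dict[int, float]:
    """Per-residue helix estimate for residues present in both inputs."""
    return {
        res: helix_population(conformational_shift(observed[res], coil[res]))
        for res in sorted(observed)
        if res in coil
    }


# HYPOTHETICAL example values (ppm), chosen only to exercise the functions.
observed = {20: 59.0, 40: 58.4, 80: 56.5}
coil = {20: 56.2, 40: 56.3, 80: 56.4}

profile = helix_profile(observed, coil)
for res, pct in profile.items():
    print(f"residue {res}: {pct:.0f}% helix")
```

A linear, clamped estimate of this kind only indicates trends along the sequence; the result depends on the random-coil reference set, which should match the temperature and pH of the measurement.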