key: cord-1048957-6di518oq authors: Wang, Bing; Svetlov, Vladimir; Wolf, Yuri I.; Koonin, Eugene V.; Nudler, Evgeny; Artsimovitch, Irina title: Allosteric Activation of SARS-CoV-2 RNA-Dependent RNA Polymerase by Remdesivir Triphosphate and Other Phosphorylated Nucleotides date: 2021-06-22 journal: mBio DOI: 10.1128/mbio.01423-21 sha: cfdd2742ff7d00b3afbc9562288100a2f03e6b8f doc_id: 1048957 cord_uid: 6di518oq
The catalytic subunit of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA-dependent RNA polymerase (RdRp) Nsp12 has a unique nidovirus RdRp-associated nucleotidyltransferase (NiRAN) domain that transfers nucleoside monophosphates to the Nsp9 protein and the nascent RNA. The NiRAN and RdRp modules form a dynamic interface distant from their catalytic sites, and both activities are essential for viral replication. We report that codon-optimized (for the pause-free translation in bacterial cells) Nsp12 exists in an inactive state in which NiRAN-RdRp interactions are broken, whereas translation by slow ribosomes and incubation with accessory Nsp7/8 subunits or nucleoside triphosphates (NTPs) partially rescue RdRp activity. Our data show that adenosine and remdesivir triphosphates promote the synthesis of A-less RNAs, as does ppGpp, while amino acid substitutions at the NiRAN-RdRp interface augment activation, suggesting that ligand binding to the NiRAN catalytic site modulates RdRp activity. The existence of allosterically linked nucleotidyl transferase sites that utilize the same substrates has important implications for understanding the mechanism of SARS-CoV-2 replication and the design of its inhibitors.
Middle East respiratory syndrome (MERS), and coronavirus (CoV) disease 2019 (COVID-19) epidemics have been caused by respiratory RNA viruses, the last three by betacoronaviruses (betaCoVs), which belong to the family Coronaviridae in the order Nidovirales (2).
The ongoing COVID-19 pandemic led to a dramatic loss of human life and devastating economic and social disruptions around the world. The zoonotic origin of its causative agent, the SARS-CoV-2 clade of Severe acute respiratory syndrome-related coronavirus species, the rapid rise of mutant strains within the infected human population, and numerous instances of retransmission to zoonotic hosts speak to its resilience as a persistent human pathogen and the likelihood of the emergence of new betaCoV variants with pandemic potential (3-6). Adequate pandemic response measures, from the development of effective antivirals to genomic surveillance, require a detailed understanding of SARS-CoV-2's molecular and structural biology. However, CoVs have been studied less thoroughly than other viral pathogens, in part owing to their extraordinarily large genome size (by far the largest among the known RNA viruses) and complex biology (7). Upon infection of human cells, the CoV plus-strand RNA genome is translated to produce a long polyprotein that is cleaved by CoV-encoded protease into several nonstructural proteins (Nsps), which are required for viral replication and gene expression (8). Among these, SARS-CoV-2 Nsp12 plays a central role as a catalytic subunit of RNA-dependent RNA polymerase (RdRp). The RdRp is the only protein that is universally conserved among RNA viruses (9) and therefore is an attractive target for broad-spectrum antivirals. Many nucleoside analogs identified as RNA synthesis inhibitors in other viruses have been actively pursued for retargeting against SARS-CoV-2 (10). The transcription machinery of CoV is unique among RNA viruses in its complexity; the transcribing RdRp associates with the replicative helicase Nsp13, proofreading exonuclease Nsp14/10, and several other viral proteins in a large membrane-bound replication-transcription complex (RTC) (11). The RTC components are highly conserved among CoVs (Fig. 1).
Furthermore, unlike many well-studied single-subunit viral RdRps (9), a minimally active SARS-CoV-2 RdRp consists of Nsp12 and three accessory subunits: Nsp7 and two copies of Nsp8 (7·8₂·12) (12-14) (Fig. 2A). Nsp12 is a large (932-residue) multidomain protein. In addition to containing the RdRp module, composed of finger, palm, and thumb domains, Nsp12 contains a large nidovirus RdRp-associated nucleotidyl transferase (NiRAN) domain, which is connected to the finger domain through an interface domain (Fig. 2A). The NiRAN domain is unique to Nidovirales and has been suggested to perform a range of activities, from RNA capping to protein-primed initiation of RNA synthesis (15). A recent report identified the accessory RNA-binding protein Nsp9 as the physiological target of NiRAN NMPylase and showed that this activity is critical for viral replication (16). Consistent with these findings, the NiRAN domain active site was observed to bind Nsp9 in a single-particle cryogenic electron microscopy (cryoEM) study (17), apparently in a catalytically inactive arrangement. Thus, Nsp12 is a bifunctional enzyme with two active sites, one of which transfers an NMP moiety to the 3′ end of the nascent RNA (active site 1 [AS1] in the RdRp domain) and the other to the N terminus of Nsp9 (AS2, in the NiRAN domain). AS1 and AS2 utilize standard nucleoside triphosphates (NTPs) as the substrates but can also accommodate a variety of nucleotide derivatives as ligands. In particular, AS1 readily incorporates remdesivir (18) and favipiravir (19) monophosphates into RNA, and structural evidence suggests that AS2 is similarly promiscuous (14, 17, 20). Thus, the effects of NTPs and nucleoside analogs on both catalytic activities must be taken into account when interpreting experimental data and evaluating the antiviral potential of lead molecules. SARS-CoV-2 RdRp contains intrinsically disordered regions (IDRs) that undergo large context-dependent conformational changes, e.g., upon interaction with the product RNA (13) or upon binding to ligands in the AS2 of NiRAN (14, 17, 20). These inherent dynamic properties suggest that RdRp's activity can be modulated, positively or negatively, by factors that control the folding of the enzyme. Here, we report that the RdRp used in several structural and functional studies is largely inactive because synonymous codon substitutions in Nsp12 designed to maximize its expression instead trigger its misfolding. We identify a region containing a cluster of rare codons that plays a critical role in the proper folding of Nsp12 and show that Nsp12 expression in a bacterial strain with slow ribosomes and/or incubation with the accessory Nsp7/8 subunits increases RdRp activity. We further show that nucleoside analogs that cannot be incorporated into RNA can nonetheless activate RNA chain extension, presumably through binding to AS2. Our findings have immediate implications for functional studies and identification of novel inhibitors of SARS-CoV-2 RdRp and highlight the need for improved mRNA-recoding algorithms during the rational design of other biotechnologically and medically important expression systems.
FIG 1 Conservation of amino acid residues in genomes of alpha-, beta-, gamma-, and deltacoronavirus genera; only those proteins that are present in all Coronaviridae are shown (see Data Set S1 in the supplemental material). The Nsps are indicated by numbers; Nsp7 to -16 (shown in gray), which comprise the replication-transcription complex (RTC), are more conserved than structural (E, M, N, S) proteins and other Nsps.
Nsp12s expressed from different coding sequences differ in activity and conformation.
A rapidly growing collection of cryoEM structures of RdRp bound to different partners provides an excellent framework for understanding the mechanism of RNA synthesis and for the identification of novel RdRp inhibitors (12-14, 17, 21, 22). As is the case with other systems, structural models require validation by functional studies that critically depend on the availability of robust expression systems and highly active RdRp preparations. Given that the structures obtained for RdRp produced in Escherichia coli (12, 14) and in insect cells (13) are closely similar, we used the E. coli expression platform (see Fig. S1 in the supplemental material) to initiate mechanistic studies of SARS-CoV-2 RdRp. For the sake of expediency, we used an Nsp12 expression vector described in reference 12 (we refer to Nsp12 produced from this vector as Nsp12 R, where R indicates the laboratory where this plasmid was constructed) and Nsp7- and Nsp8-producing vectors that we constructed for this study. Nsp12 R contains a noncleavable C-terminal His10 tag, is soluble when produced in E. coli, and is easily purified under "native" (nondenaturing) conditions.
FIG 2 (legend, partial) The 29-nt RNA hairpin scaffold is extended by RdRp to produce a 40-nt product; additional extension is thought to be mediated by Nsp8 after the completion of RNA synthesis (40). Cy 5.5, cyanine 5.5. (C) RNA extension by RdRp at 37°C under the indicated conditions; 15 mM KCl is a permissive condition. Removal of the His tag (ΔHis) does not increase Nsp12 R activity, but Nsp12 A expressed from an mRNA that retains rare codons is more active. Fractions of the extended RNA (% Ext.) at 10 min are shown (means ± SEM; n = 3). (D) Interactions with the RNA hairpin scaffold analyzed by electrophoretic mobility shift assays. RdRps at the indicated concentrations were incubated with 100 nM RNA.
We found that the 7·8₂·12 R enzyme exhibited negligible activity on a number of different templates, including the optimal hairpin scaffold (Fig. 2B) used by Hillen et al. (13), which could be extended only at a very low concentration of salt. An extensive experimental survey of different combinations of purification schemes, RNA scaffolds, and reaction conditions failed to identify conditions that would support efficient primer extension, and the removal of the His tag, which has been proposed to interfere with RdRp activity (23), did not increase activity under permissive (15 mM KCl) conditions (Fig. 2C). In their follow-up study, Wang et al. reported similar results (24), prompting us to conclude that further attempts to boost the activity of the 7·8₂·12 R enzyme produced under these conditions would be futile. Our survey of published reports failed to reveal an obvious reason for the observed low activity of the 7·8₂·12 R enzyme. Under similar reaction conditions, some RdRps were able to completely extend the RNA primer in minutes (13, 19, 23, 25), whereas others failed to do so in an hour (21, 24), regardless of the expression system. Idiosyncratic but reproducible variations in activity can arise from recombinant protein misfolding; indeed, coexpression of Nsp12 with cellular chaperones has been shown to enhance its activity (23, 26). A likely source of this variability may lie in the coding mRNA itself; whereas all recombinant Nsp12s have the same amino acid sequence (ignoring the tags), their coding sequences (CDSs) have been altered to match the codon usage of their respective hosts to maximize protein expression. Codon optimization is routinely used for protein expression in heterologous systems (27), yet protein function can be compromised even by a single synonymous codon substitution (28, 29).
Furthermore, protein expression in a BL21 RIL strain, which alleviates codon imbalance by supplying a subset of rare tRNAs and is thus commonly used to express heterologous proteins in E. coli, can hinder proper folding (30). The abrogation of ribosome pausing at rare codons is thought to uncouple nascent peptide synthesis from its folding, giving rise to misfolded proteins (29, 31, 32). Although robust viral gene expression may promote host takeover, SARS-CoV-2 mRNAs, including nsp12, are not efficiently translated in human cells (33). The viral nsp12 mRNA contains clusters of codons that are rare in humans (Fig. S2), yet the resulting enzyme is active and able to sustain efficient infection. This suggests that pause-prone translation may facilitate the proper folding of Nsp12. In contrast, the nsp12 R codon usage matches that of highly expressed E. coli genes, raising the possibility that the nsp12 R CDS has been optimized for the maximum expression of soluble protein in a bacterial host but not for enzymatic activity and/or acquisition of a native structure. There is no a priori reason to believe that an overexpressed soluble protein retains all of its activity, and the abundance of Nsp12 in the heterologous host may be made possible by its diminished NTP binding and/or condensation activity, defects in RNA binding, or loss of other functionalities. To evaluate this possibility, we designed an nsp12 A variant (where A indicates that it was constructed by I. Artsimovitch) that contains more rare codons (Fig. S2), including in the regions that bear rare codons in the viral mRNA. Interestingly, the nsp12 T expression vector (constructed in T. Tuschl's lab) also gives rise to active RdRp (14); in this study, the viral nsp12 mRNA was reverse transcribed and expressed in the E. coli BL21 RIL strain, which contains extra copies of the argU, ileY, and leuW rare tRNA genes.
While comparable codon frequency measurements are not available for the RIL strain, it does not carry all rare tRNAs required for the efficient translation of the viral nsp12 mRNA, suggesting that nsp12 T codon usage is suboptimal. We found that RdRp assembled with Nsp12 A had a much higher activity on the hairpin scaffold (Fig. 2C). We also noted that Nsp12 A and Nsp12 T copurified with nucleic acids (Fig. S3A); subsequent gel shift assays revealed that 7·8₂·12 A readily bound the RNA hairpin, whereas 7·8₂·12 R did not (Fig. 2D). We found that the 7·8₂·12 T enzyme behaved similarly to 7·8₂·12 A (Fig. S3B), but since the nsp12 T expression vector lacks restriction sites required for protein engineering, we used nsp12 A in all subsequent experiments. To test whether Nsp12 R was misfolded, we used several approaches. First, we assessed the Nsp12 thermal stability using differential scanning fluorimetry (34). We recorded melting temperatures (Tm) of 41.3°C for Nsp12 R and 47.3°C for Nsp12 A (Fig. S4A); for another E. coli-expressed Nsp12, a Tm of 43.6°C was reported (35). Second, we compared the intrinsic fluorescence spectra of Nsp12, which contains nine tryptophan residues that are expected to be sensitive to the microenvironment (36). Nsp12 A and Nsp12 R exhibited similar emission peaks, but the Nsp12 A intensity was 2-fold higher (Fig. 3A), suggesting that at least one Trp was more buried; the derivative spectra (Fig. S4B) did not reveal any additional differences. These results show that Nsp12 A and Nsp12 R are structurally distinct, but they do not identify the regions of altered structure. We next used a carboxyl- and amine-reactive reagent, EDC [1-ethyl-3-(3-dimethylaminopropyl)carbodiimide], to map solvent-accessible (surface) residues and intraprotein cross-links by mass spectrometry. We observed substantial differences in accessibility of several regions centered at residues 150 (NiRAN domain), 415 (fingers), 600 (palm), and 850 (thumb) and in cross-linking, particularly of the NiRAN domain (Fig. 3B).
FIG 3 (legend, partial) ... and cross-links (inside) mapped onto the Nsp12 schematic, with the domains colored as in panel A. Colors indicate differences in reactivity; residues in red were reactive only in Nsp12 R, those in blue were reactive in Nsp12 A, and those in black were reactive in both proteins. Only high-confidence monolinks (<10⁻⁵) and cross-links (<10⁻³) are shown (see Data Set S2). (C) Conservation of the NiRAN-RdRp interaction surfaces mapped on the transcription complex structure (PDB accession no. 6XEZ). Amino acid residues are colored according to their conservation. Key residues in AS1 (D760), in AS2 (D218), and at the NiRAN-palm interface (Y129 and S709) are shown as spheres; ADP bound to AS2 is shown as sticks and the Mg²⁺ ion as a purple sphere.
Attenuated translation and accessory subunits promote an active Nsp12 conformation. Although an overall excess of underrepresented codons can slow down translation, in many cases, the ribosome has to pause at one or more specific rare codons to ensure proper protein folding at key junctures (37, 38). The differences in codon frequencies between the 2.8-kb mRNAs encoding Nsp12 A and Nsp12 R are extensive (Fig. S2). The produced proteins also differ in their N and C termini (Fig. S1), but based on available structural data, an extra N-terminal glycine would not be expected to account for the observed dramatic differences in EDC reactivity (Fig. 3B and Data Set S2). Comparative analysis identified two regions that contained rare codon clusters in the native SARS-CoV-2 RNA and in Nsp12 A mRNAs, but not in Nsp12 R (Fig. 3A and Fig. S5). We constructed chimeric proteins in which these Nsp12 A segments were replaced with corresponding segments from Nsp12 R, generating proteins with identical amino acid sequences (Fig. 4A).
We found that whereas swapping of codons 143 to 346 between the mRNAs producing active and inactive Nsp12 variants did not alter the RdRp activity, a chimeric protein containing codons 350 to 435 derived from the Nsp12 R CDS was defective (Fig. 4B). Together with the EDC modification patterns (Fig. 3B), this result suggests that controlled translation of the codon 350-435 region is important for Nsp12 folding and that changes in contacts with Nsp7 (Fig. S6), which are critical for RdRp activity (13), may be partially responsible for the low activity of Nsp12 R. During expression of the viral genome, Nsp7, Nsp8, and Nsp12 are cotranslated with other Nsps as giant precursors that are later processed into individual polypeptides, and the RdRp may be assembled concurrently with protein synthesis. Analysis of Nsp7/Nsp12 interactions by Trp fluorescence reveals that Nsp7 binds to both Nsp12 variants and might favor a similar Nsp12 A-like state (Fig. S6). In support of the "scaffolding" function of the accessory subunits (35), we found that preincubation of Nsp12 R with Nsp7 and Nsp8 led to an increased activity (Fig. 4C). We next tested if slowing translation during protein expression would promote Nsp12 folding. We constructed a BL21 strain with a K42T mutant of the ribosomal protein S12, which causes an approximately 2-fold reduction in the translation rate (39), and compared the Nsp12 R protein purified from this "slow" BL21 variant to the protein purified from wild-type BL21. We found that Nsp12 R purified from the mutant BL21 was approximately 2-fold more active (Fig. 4D), consistent with the favorable effect of attenuated translation.
FIG 4 (legend, partial) Reactivation of Nsp12 R via 37°C preincubation with the accessory Nsp7 and Nsp8 subunits to form the RdRp holoenzyme. (D) Translation by slow ribosomes yields a more active Nsp12. RNA extension is shown as means ± SEM, and the P value was calculated by an unpaired two-tailed t test. n.s., not significant; **, P < 0.01.
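The comparison of rare-codon clusters between synonymous coding sequences described in this section can be sketched in Python. This is a minimal illustration under stated assumptions, not the analysis pipeline used in this study: the function names, the sliding-window settings (5 codons, at least 3 of them rare), the rarity cutoff, and the usage values in the toy example are all hypothetical placeholders. A real analysis would substitute a measured codon usage table for the expression host and the actual nsp12 coding sequences.

```python
from typing import Dict, List, Tuple


def split_codons(cds: str) -> List[str]:
    """Split an in-frame coding sequence (DNA or RNA) into DNA codons."""
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def rare_codon_clusters(
    cds: str,
    usage: Dict[str, float],
    rare_cutoff: float = 0.10,  # hypothetical threshold on relative usage
    window: int = 5,            # hypothetical window size, in codons
    min_rare: int = 3,          # hypothetical minimum rare codons per window
) -> List[Tuple[int, int]]:
    """Return merged 1-based codon ranges (first rare, last rare) for every
    window that holds at least `min_rare` rare codons. Codons absent from
    `usage` are treated as not rare."""
    codons = split_codons(cds)
    is_rare = [usage.get(c, 1.0) < rare_cutoff for c in codons]
    clusters: List[Tuple[int, int]] = []
    for start in range(len(codons) - window + 1):
        rare_pos = [i for i in range(start, start + window) if is_rare[i]]
        if len(rare_pos) < min_rare:
            continue
        lo, hi = rare_pos[0] + 1, rare_pos[-1] + 1
        if clusters and lo <= clusters[-1][1] + 1:
            # Overlapping or adjacent to the previous cluster: extend it.
            clusters[-1] = (clusters[-1][0], max(clusters[-1][1], hi))
        else:
            clusters.append((lo, hi))
    return clusters


def lost_clusters(
    reference_cds: str, recoded_cds: str, usage: Dict[str, float]
) -> List[Tuple[int, int]]:
    """Clusters found in the reference CDS that overlap no cluster in the
    recoded CDS, i.e., candidate pause sites removed by codon optimization."""
    ref = rare_codon_clusters(reference_cds, usage)
    rec = rare_codon_clusters(recoded_cds, usage)
    return [
        c for c in ref
        if not any(c[0] <= r[1] and r[0] <= c[1] for r in rec)
    ]


# Toy example with placeholder usage values (not measured frequencies).
toy_usage = {"CTG": 0.50, "CGT": 0.36, "AGG": 0.02, "AGA": 0.04}
native_like = "CTG" * 3 + "AGGAGAAGG" + "CTG" * 4  # Leu3-Arg3-Leu4, rare Arg codons
recoded = "CTG" * 3 + "CGT" * 3 + "CTG" * 4        # same peptide, common Arg codons

print(lost_clusters(native_like, recoded, toy_usage))  # prints [(4, 6)]
```

Reporting ranges in codon coordinates keeps the output directly comparable to the codon intervals used for the chimeras above (for example, 143 to 346 and 350 to 435); the window size and cutoff would need tuning against a real usage table before any biological interpretation.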
Allosteric RdRp activation by nucleotides. Our results show that Nsp12 A and Nsp12 R differ dramatically in the conformations and interactions of their NiRAN domains (Fig. 3B). Although the NiRAN domain is not known to affect RNA chain extension directly, it interacts with the catalytic palm domain (13, 14, 21) and may modulate catalysis allosterically. The NiRAN domain is partially disordered in most unliganded structures of RdRp and transcription complexes but becomes ordered upon binding of ADP-Mg²⁺, GDP-Mg²⁺, and PPi-Mg²⁺ to AS2 (Fig. 5A) (14, 17, 20). We hypothesized that, upon binding to nucleotides, the NiRAN domain would become more rigid, favoring an active RdRp conformation, thus leading to more efficient RNA elongation. First, we compared rates of RNA synthesis under standard conditions, in which RdRp is bound to the RNA scaffold prior to the addition of the NTP substrates, with those in the "NTP-primed" reaction mixture, in which the order of reagent addition was reversed (Fig. 5B). The results show that preincubation with NTPs strongly potentiates Nsp12 R activity, an effect that may be mediated by the NiRAN domain. Given that nucleotide binding to AS2 has been shown to remodel the NiRAN domain in the active Nsp12 (14), we surmised that NTP-mediated activation should also occur in Nsp12 A. To separate the direct and allosteric effects of NTPs, we used a CU template, which contains only purines in the transcribed region; as expected, the RNA was extended in the presence of CTP and UTP (Fig. 5C). In addition to detecting the runoff RNA (40 nucleotides [nt]), we detected a longer product that likely results from the terminal transferase activity of Nsp8, which prefers blunt over 3′-recessed ends and ATP as a substrate (40). To assay the hypothetical allosteric activation of the RdRp by NTP bound to the NiRAN domain, we chose conditions under which less than 50% of the scaffold was extended.
Consistent with the allosteric effects of nontemplated nucleotides, transcription was activated in the presence of 1 mM ATP (>4-fold) or GTP (>10-fold) (Fig. 5C). The apparent promiscuity of AS2 suggests that other nucleotides might be able to substitute for ATP and GTP. To test this idea, we used a pause-promoting ATP analog, remdesivir triphosphate (RTP), and an allosteric effector of E. coli RNAP, guanosine tetraphosphate (ppGpp). We found that the effects of RTP and ppGpp mimicked those of ATP and GTP, respectively, on the CU template (see Materials and Methods). Activation was also observed with the 4N template, on which ATP, GTP, and RTP but not ppGpp can be utilized as the substrates; as expected, RMP incorporation led to RdRp stalling before reaching the end of the template (18, 22). Consistent with the reported preference of the Nsp8 terminal transferase activity for ATP (40), the fraction of the extended RNA is reduced in the presence of GTP and ppGpp compared to that in the presence of ATP (Fig. 5C). We hypothesized that RdRp-activating nucleotides act via binding to AS2 and stabilizing the RdRp-NiRAN interface. To test this hypothesis, we replaced two conserved residues at the interface. Tyr129 in the NiRAN domain is nearly invariant among all CoVs, whereas only small residues (Ser, Ala, Gly) are found at position 709 in the palm domain (Fig. 3C). Given this evolutionary conservation, we suspected that replacements of these amino acids might compromise the interdomain contacts, making RNA synthesis more dependent on the state of the NiRAN domain. Consistent with this expectation, we found that Y129A and S709R substitutions reduced RNA synthesis activity while potentiating activation by 0.5 mM GTP (Fig. 5D). The catalytic activity of the NiRAN domain has been shown to be independent of the RdRp function; Nsp9 modification occurs normally in an enzyme with substitutions in AS1 that inactivate the RdRp (16).
To determine if the converse is true, we replaced Asp218, which coordinates the Mg²⁺ ion in the AS2 (14) and is critical for Nsp9 NMPylation and viral replication (16), with Ala. This substitution did not compromise RNA synthesis, confirming that AS1 and AS2 are functionally independent, but it modestly reduced GTP-dependent activation (Fig. 5D), suggesting that, if the allosteric GTP binds to the NiRAN AS2, Asp218 does not measurably contribute to nucleotide affinity. This observation is not entirely surprising because Asp residues are critical for substrate positioning but make lesser contributions to substrate binding in other viral polymerases (41).
Our results lead to two principal conclusions. First, SARS-CoV-2 Nsp12 depends on cotranslational folding, facilitated by ribosome pausing, and on interactions with the accessory subunits to attain the active conformation. Second, the two nucleotidyl transfer catalytic sites in Nsp12, a unique property of Nidovirales, appear to be connected allosterically, with nucleotides, including various analogs, binding to NiRAN AS2 and activating RNA chain extension in RdRp AS1.
Pause-free translation yields inactive RdRp. Our results demonstrate that overoptimized Nsp12 R mRNA produces a soluble but misfolded protein in which RNA binding and catalytic activity (Fig. 2C and D) are compromised. Notably, despite the dramatic differences in their activities, all structures of SARS-CoV-2 transcription complexes reported so far are closely similar (12-14), reflecting the bias introduced during cryoEM analysis, in which only a small fraction of "good" particles is selected based on image analysis (e.g., about 1% in a study of RdRp inhibition by remdesivir [42]). A preparation composed of largely inactive enzymes remains amenable to cryoEM analysis but would compromise biochemical experiments; to rephrase the fourth commandment of enzymology (43), thou shalt not waste clean thinking on dead enzymes.
For example, a conclusion that SARS RdRp is more active than the SARS-CoV-2 enzyme (35) is predicated on the assumption that both RdRps are properly folded. Even more critically, inactive RdRps cannot be used to screen potential inhibitors. While recoding is routinely used to optimize heterologous protein expression (27), the existence and frequent clustering of rare codons in mRNAs encoding many essential proteins, especially large, multidomain ones, indicate their crucial role as regulators of protein folding. For example, native nonoptimal codons in intrinsically disordered regions (IDRs) are essential for the function of circadian clock oscillators (44, 45). IDRs often serve as platforms for protein-protein interactions (46) but can become trapped in unproductive states in the absence of their interaction partners. Our analysis supports this scenario by showing that an unstructured region that binds Nsp7 displays substantial differential sensitivity to EDC (Fig. 3B) and that interaction with Nsp7 locks Nsp12 in an active conformation (Fig. S6). When added to misfolded Nsp12, Nsp7/8 only modestly increases its activity (Fig. 4C). However, because all Nsps are produced as a giant precursor in coronavirus-infected cells (8), the accessory subunits may aid Nsp12 folding cotranslationally, as apparently happens during their coexpression in E. coli (23). Likewise, coexpression of E. coli RNA polymerase subunits suppresses assembly defects conferred by deletions in the catalytic subunits (47). More broadly, our findings have implications for the heterologous expression of countless other proteins. Although examples of deleterious synonymous substitutions have been reported, these cases have been generally perceived as outliers. In retrospect, optimization-induced misfolding is likely to be far more prevalent than previously thought, with different recoding approaches impacting the structure and activity of the resulting protein in substantially different ways.
The importance of cotranslational folding, particularly for large and dynamic proteins that contain essential mobile regions, emphasizes the need to integrate diverse approaches, from ribosome profiling to machine learning, into the rational design of coding sequences to avoid misfolding traps. Another important implication of the impact of codon usage on SARS-CoV-2 protein folding, structure, and activity lies in the interpretation of genomic surveillance data. So far, analysis of the genetic variability of SARS-CoV-2, including the characterization of variants of concern and the designation of evolutionary lineages, has focused on nonsynonymous changes, i.e., amino acid substitutions. Many of those amino acid substitutions show little to no impact on the properties of the proteins in which they appear (48, 49). We posit that many synonymous mutations, and even some nonsynonymous ones, may manifest their effects primarily at the level of cotranslational folding, rather than in the properties of the folded protein in vitro, or may affect those properties by altering the ratio of folded to misfolded proteins during the infection.
Crosstalk between two catalytic sites of SARS-CoV-2 Nsp12. Decades of studies of viral RdRps focused on the mechanism of RNA synthesis and identification of nucleoside analogs that inhibit viral replication. During the COVID-19 pandemic, repurposing of the existing drugs targeting RdRp, justified by structural similarity among RdRp active sites (9), became an urgent priority. Among these drugs, remdesivir received the most attention, even though the estimates of its clinical effectiveness range from moderate to insignificant (50, 51). The CoV RdRp readily uses RTP as a substrate in place of ATP and temporarily stalls downstream of the site of RMP incorporation (18, 22). However, the proposed mechanisms of the inhibitory effect of RTP vary widely, from RdRp stalling to RNA chain termination to disassembly of the RdRp (52-54).
It is presently unclear whether antiviral effects of remdesivir are due to delays in RNA synthesis or to errors in the product RNAs, as is the case with another purine analog, favipiravir (19). Although efforts aimed at the identification of nucleoside analog inhibitors of RdRp are focused on AS1, it is clear that effects of nucleoside analogs on SARS-CoV-2 replication may be multifaceted. Nsp12 contains two active sites separated by more than 80 Å (Fig. 6), both of which can bind NTPs and nucleoside analogs. The functions of AS1 and AS2 are largely independent; Nsp12 containing double substitutions in AS1 that abolish elongation is fully competent for Nsp9 NMPylation (16), whereas the D218A substitution in Nsp12 that abolishes NMPylation blocks viral replication (16) but does not compromise RNA extension (Fig. 5D). In addition, each subunit of the Nsp8 dimer can also bind NTPs (40), and although there is no structural evidence of nucleotide binding to Nsp8 and its terminal transferase activity might be posttranscriptional, SARS-CoV-2 RdRp contains, altogether, four nucleotide-binding sites. Thus, one cannot assume that the observed effect of a nucleotide is mediated via the "primary" nucleotide binding to AS1; indeed, we show here that RTP promotes RNA synthesis when it cannot be incorporated into RNA, and this effect is even more pronounced with ppGpp (Fig. 5C). Competitive inhibitors binding in AS2 or transferring noncognate ligands to Nsp9 are likely to inhibit replication. In the latter case, misincorporation may have more lasting effects because errors in the nascent RNA can be corrected by the SARS-CoV-2 proofreading exonuclease Nsp14 (26). We hypothesize that AS1 and AS2 are allosterically linked, enabling coordinated control of the RdRp activity. The NiRAN and palm domains form an extensive interface composed of highly conserved residues, including Tyr129 (Fig. 3C).
Upon binding to AS2, nucleotides induce NiRAN folding and lead to subtle changes at the domain interface (14, 17, 20). We show that binding of nucleotides that cannot be incorporated into RNA potentiates RdRp activity (Fig. 5C). There is currently no direct evidence that this effect is triggered through their binding to AS2, but the effects of substitutions in the NiRAN domain (Fig. 5D) and structural data (14, 17, 20) support this model. Although rigorous computational, structural, and biochemical analyses will be required to test this hypothesis, it is already clear that, when considering the effects of various nucleotide analogs on viral RNA synthesis, their binding to AS2 (and, perhaps, AS3 and AS4 as well) cannot be ignored. The open active sites of viral RdRp can accommodate highly diverse substrates, some of which have been developed into therapeutics (10). Our finding that ppGpp activates RdRp similarly to GTP (Fig. 5C) suggests that other nucleotide-binding sites in Nsp12 are also promiscuous. Furthermore, the interplay between the binding of RTP to both catalytic and allosteric (relative to RNA synthesis) sites and competition therein with cellular NTPs call for a nuanced interpretation of the remdesivir inhibition mechanism, as does potential competition with ppGpp binding to the allosteric site (AS2). The biological activity of ppGpp has long been considered to be limited to bacteria and plastids, but its action as an alarmone has recently been demonstrated in human cells (55), raising the possibility that ppGpp and other nontemplating nucleotides impact SARS-CoV-2 replication in host cells. Allosteric control of SARS-CoV-2 RdRp invites interesting parallels with E. coli Qβ replicase, which also consists of four subunits: the phage-encoded RdRp (β-subunit) and three host RNA-binding proteins, the translation elongation GTPases EF-Tu and EF-Ts and a ribosomal protein, S1 (56).
Similarly to Nsp7/8, EF-Tu and EF-Ts aid in the cotranslational assembly of Qβ RdRp (57); EF-Tu also forms a part of the single-stranded RNA exit channel, assisting in RNA strand separation during elongation, whereas S1 acts as an initiation factor (56). EF-Tu and EF-Ts binding to ppGpp modulates host translation (58) and RNA synthesis by Qβ (59), suggesting that RNA viruses from bacteria to humans may employ nucleotide analogs as sensors of cellular metabolism. The SARS-CoV-2 RdRp subunit composition and dynamics resemble those of structurally unrelated bacterial RNA polymerases (RNAPs). Bacterial enzymes are composed of 4 to 7 subunits and are elaborately controlled by regulatory nucleic acid signals and proteins that induce conformational changes in the transcription complex, as revealed by many recent cryoEM studies (60-62). Notably, most natural and synthetic products that inhibit bacterial RNAPs alter protein interfaces or trap transient intermediates rather than block nucleotide addition, and they bind to many different sites (63). Compared to simpler RdRps, the SARS-CoV-2 enzyme, with several active sites and many conserved protein interfaces, may be an easier target for diverse small molecules that inhibit subunit or domain interactions or interrupt allosteric signals. Given the outsized importance of coronaviruses to human health, efforts to identify diverse inhibitors of RdRp, beyond nucleotide analogs, should be prioritized. Given the broad utilization of nucleotides by host enzymes, such as polymerases, kinases, and lyases, the chances of off-target side effects during therapeutic administration of nucleotide analogs are greatly elevated (64), whereas a druggable target unique to the pathogen, such as the SARS-CoV-2 NiRAN domain, bears inherently lower risks of nonspecific interactions.
The potential for the allosteric regulation of NTP condensation by SARS-CoV-2 RdRp has recently been highlighted by a computational study that identified several motifs under allosteric control (65). Notably, the NiRAN domain makes extensive contacts with allosteric motif D and fewer contacts with motifs A and B (65), consistent with our finding that binding of nontemplating phosphorylated nucleotides activates RdRp, solidifying its potential as a drug target.
Construction of expression vectors. Plasmids used in this study are shown in Fig. S1 in the supplemental material. The SARS-CoV-2 nsp7/8/12 A genes were codon optimized for expression in E. coli, synthesized by GenScript, and subcloned into standard pET-derived expression vectors under the control of the T7 gene 10 promoter and lac repressor. The derivative plasmids were constructed by standard molecular biology approaches with restriction and modification enzymes from New England Biolabs, taking advantage of the existing or silent restriction sites engineered into the Nsp12 coding sequence. DNA oligonucleotides for vector construction and sequencing were obtained from Millipore Sigma. The sequences of all plasmids, including pET22a-Nsp12, were confirmed by Sanger sequencing at the Genomics Shared Resource Facility (The Ohio State University) and are available upon request.
Protein expression and purification. Nsp7/8 were overexpressed in E. coli XJB(DE3) cells (Zymo Research; catalog no. T5051). Nsp12 variants were overexpressed in E. coli BL21(DE3) cells (Novagen; catalog no. 69450). Strains were grown in lysogenic broth (LB) with appropriate antibiotics: kanamycin (50 µg/ml), carbenicillin (100 µg/ml), and chloramphenicol (25 µg/ml). All protein purification steps were carried out at 4°C. For Nsp7/8, cells were cultured at 37°C to an optical density at 600 nm (OD600) of 0.6 to 0.8, and the temperature was lowered to 16°C.
Expression was induced with 0.2 mM isopropyl-1-thio-β-D-galactopyranoside (IPTG; GoldBio; catalog no. I2481C25) for 18 h. Induced cells were harvested by centrifugation (6,000 × g), resuspended in lysis buffer A (100 mM HEPES, pH 7.5, 300 mM NaCl, 5% glycerol [vol/vol], 1 mM phenylmethylsulfonyl fluoride [PMSF; ACROS Organics; catalog no. 329-98-6], 5 mM β-mercaptoethanol [β-ME], 10 mM imidazole), and lysed by sonication. The lysate was cleared by centrifugation (10,000 × g). The soluble protein was purified by adsorption to Ni2+-nitrilotriacetic acid (NTA) resin (Cytiva; catalog no. 17531801), washed with Ni-buffer A (20 mM HEPES, pH 7.5, 300 mM NaCl, 5% glycerol, 5 mM β-ME, 50 mM imidazole), and eluted with Ni-buffer B (20 mM HEPES, pH 7.5, 50 mM NaCl, 5% glycerol, 5 mM β-ME, 300 mM imidazole). The eluted protein was then loaded onto a Resource Q ion-exchange column (Cytiva; catalog no. 17117701) in Q buffer A (20 mM HEPES, pH 7.5, 5% glycerol, 5 mM β-ME) and eluted with a gradient of Q buffer B (20 mM HEPES, pH 7.5, 1 M NaCl, 5% glycerol, 5 mM β-ME). The fusion protein, supplemented with 20 mM imidazole, was treated with tobacco etch virus (TEV) protease at 4°C overnight and was passed through Ni2+-NTA resin. The untagged, cleaved protein was loaded onto a Superdex 75 10/300 GL column (Cytiva; catalog no. 29148721) in Ni-buffer A. Peak fractions were assessed by SDS-PAGE and Coomassie blue staining. Purified protein was dialyzed into storage buffer A (20 mM HEPES, pH 7.5, 150 mM NaCl, 45% glycerol, 2.5 mM β-ME), aliquoted, and stored at −80°C. For Nsp12, cells were cultured at 37°C to an OD600 of 0.6 to 0.8, and the temperature was lowered to 16°C. Expression was induced with 0.1 mM IPTG for 18 h.
Expression by slow ribosomes. To test the effect of slow translation on Nsp12 activity, a derivative of BL21 (IA659) containing a K42T substitution in the ribosomal protein S12 was constructed by P1 transduction from the DEV3 E.
coli strain (KL16 lac5 strA2; obtained from Kurt Fredrick, The Ohio State University) and selection on streptomycin (50 mg/liter). This substitution reduces the translation rate ~2-fold (39). Following sequencing of the rpsL gene to confirm the substitution, the slow BL21 strain was transformed with the plasmid encoding Nsp12 R. The protein was purified as described above.
RNA extension assays. An RNA oligonucleotide (5′-UUUUCAUGCUACGCGUAGUUUUCUACGCG-3′; 4N) with cyanine 5.5 at the 5′ end was obtained from Millipore Sigma (USA). Prior to the reaction, the RNA was annealed in 20 mM HEPES, pH 7.5, 50 mM KCl by heating the mixture to 75°C and then gradually cooling it to 4°C. Reactions were carried out at 37°C with 500 nM Nsp12 variant, 1 µM Nsp7, 1.5 µM Nsp8, 200 nM RNA, and 250 µM NTPs (Cytiva; catalog no. 27202501) in the transcription buffer (20 mM HEPES, pH 7.5, 15 mM KCl, 5% glycerol, 2 mM MgCl2, 1 mM DTT). RNA extension reactions were stopped at the desired times by adding 2× stop buffer (8 M urea, 20 mM EDTA, 1× Tris-borate-EDTA [TBE], 0.2% bromophenol blue). Samples were heated for 2 min at 95°C and separated by electrophoresis in denaturing 9% acrylamide (19:1) gels (7 M urea, 0.5× TBE). The RNA products were visualized and quantified using Typhoon FLA9000 (GE Healthcare) and ImageQuant software. RNA extension assays were carried out in triplicate. Means and standard errors of the means (SEM) were calculated by OriginPro 2021 (OriginLab), and an unpaired two-tailed t test was performed using Excel (Microsoft).
Electrophoretic mobility shift assays. RdRp (Nsp12:Nsp7:Nsp8 = 1:2:3; indicated concentrations in Fig. 2D represent the Nsp12 concentration) in 20 mM HEPES, pH 7.5, 65/15 mM KCl, 5% glycerol, 2 mM MgCl2, 1 mM DTT was incubated with 100 nM 4N RNA at 37°C for 5 min. Reactions were then mixed with 10× loading buffer (30% glycerol, 0.2% Orange G) and run on a 3% agarose gel in 1× TBE on ice. The gel was visualized by Typhoon FLA9000.
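The replicate statistics described for the RNA extension assays (means, SEM, and an unpaired t test) can be sketched in a few lines of Python. This is an illustration only: the authors used OriginPro and Excel, the replicate values below are hypothetical placeholders rather than data from this study, and the equal-variance (Student) form of the t statistic is an assumption, because the text does not state which variant of the unpaired test was applied.

```python
import math
import statistics


def mean_and_sem(values):
    """Return the mean and the standard error of the mean (SEM)."""
    mean = statistics.mean(values)
    # SEM = sample standard deviation / sqrt(n)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def unpaired_t(a, b):
    """Return (t, df) for an unpaired t test.

    ASSUMPTION: equal-variance (pooled) Student's form; the source does not
    say whether equal or unequal variances were assumed.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, df


# Hypothetical percent-extension values from triplicate reactions.
activated = [40.0, 50.0, 60.0]
control = [10.0, 20.0, 30.0]

mean_act, sem_act = mean_and_sem(activated)
t_stat, dof = unpaired_t(activated, control)
```

The two-tailed P value follows from the t distribution with `dof` degrees of freedom; if SciPy is available, `scipy.stats.ttest_ind(activated, control)` performs the same equal-variance test by default and reports the P value directly.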
Activation of Nsp12 R. To test the effect of holo RdRp formation, 5 µM Nsp12 R mixed with Nsp7/8 (10 µM/15 µM) in storage buffer B was incubated at 0°C or 37°C for 15 min and then stored at −20°C. RNA extension was performed as described above, and the reaction was stopped at 8 min. To test the effect of NTPs, RdRp was first incubated with NTPs for 10 min at 37°C, and then 200 nM RNA was added to initiate the reaction; the final concentrations of RdRp (500 nM), RNA (250 nM), and NTPs (250 µM) were identical to those used in assays with the simultaneous addition of the RNA scaffold and substrates. The reaction was stopped by adding 2× stop buffer at the indicated times.
Allosteric activation by nucleotides. A CU RNA hairpin (5′-AAAAGAAAAGACGCGUAGUUUUCUACGCG-3′; CU) labeled with cyanine 5.5 at the 5′ end (Millipore Sigma) was annealed in 20 mM HEPES, pH 7.5, 50 mM KCl by heating the mixture to 75°C and then gradually cooling it to 4°C. RdRp holoenzymes (500 nM wild-type or mutant Nsp12 A, 1 µM Nsp7, 1.5 µM Nsp8 [final concentrations]) were mixed with ATP, GTP, remdesivir triphosphate (RTP; MedChemExpress; catalog no. GS443902), or ppGpp (TriLink BioTechnologies; catalog no. N-6001) at concentrations indicated in Fig. 5 in 20 mM HEPES, pH 7.5, 15 mM KCl, 5% glycerol, 1 mM DTT and either 2 or 1 mM MgCl2 (with 1 or 0.5 mM activating nucleotide, respectively). Reaction mixtures were incubated for 5 min at 37°C, and RNA chain extension was initiated by the addition of 200 nM RNA and 100 µM CTP and UTP. Following 15 min of incubation at 37°C, reactions were stopped at the desired times by adding 2× stop buffer (8 M urea, 20 mM EDTA, 1× TBE, 0.2% bromophenol blue).
Tryptophan fluorescence. Tryptophan fluorescence spectroscopy was performed using a model F-7000 fluorescence spectrophotometer (Hitachi). The excitation wavelength was set at 280 nm, and the emission spectra were recorded from 310 to 370 nm, with a 5-nm slit width for both excitation and emission.
The scan speed was 240 nm/min. The temperature was maintained at 37°C by a thermostatic water circulator (NESLAB RTE-7; Thermo Scientific). The samples were prepared in 20 mM HEPES, pH 7.5, 65 mM KCl, 5% glycerol, 2 mM MgCl2, 1 mM DTT. One micromolar Nsp12 and 2 µM Nsp7 were used to record the spectra of Nsp12 and Nsp7, respectively. To record the spectra of Nsp7·12, 1 µM Nsp12 was incubated with 2 µM Nsp7 at 37°C for 15 min. To collect the spectra of denatured proteins, 1 µM Nsp12 was incubated in 8 M urea at room temperature for 1 h. Three independent measurements, each in three technical replicates, were performed. The same results were obtained with proteins purified 3 months apart. Means, SEM, and second derivatives of the emission spectra were calculated by OriginPro 2021.
Conservation analysis. To assess the relative conservation of coronavirus proteins, 3,309 diverse coronavirus genomes, representing the alpha-, beta-, gamma-, and deltacoronavirus genera, were downloaded from GenBank in May 2020. High-quality CDSs (containing no more than 32 contiguous codons with ambiguous bases) were translated into five (poly)proteins conserved across all coronaviruses: orf1ab, S, E, M, and N. Alignments of the five open reading frames (ORFs) were produced using the MUSCLE program (66). For each alignment, column homogeneity and the weighted fraction of nongap characters (both ranging from 0 to 1) were calculated as described previously (67). The product of these two values was used as the conservation index (ranging from 0 to 1). For the whole-genome pan-Coronaviridae conservation map, only consensus positions (those with a fraction of gaps below 0.5) were used.
EDC modification and mass spectrometry. Approximately 0.5 mg/ml of Nsp12 in 20 mM HEPES, pH 7.5, 50 mM KCl, 2 mM MgCl2, 1 mM DTT was mixed with freshly prepared EDC [N-(3-dimethylaminopropyl)-N′-ethyl carbodiimide hydrochloride; Sigma, catalog no. 03449].
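As a rough illustration of the conservation index described under "Conservation analysis" (the product of column homogeneity and the nongap fraction, restricted to consensus positions), the following Python sketch scores a toy alignment. Two stand-ins are assumptions rather than the published procedure of reference 67: homogeneity is approximated as the share of the most common residue among nongap characters, and all sequences are weighted equally. The function names and the toy alignment are hypothetical.

```python
from collections import Counter


def column_scores(column, gap="-"):
    """Return (homogeneity, nongap_fraction) for one alignment column.

    ASSUMPTION: homogeneity is the share of the most common residue among
    nongap characters, and sequences are unweighted; the study used the
    measures of its reference 67 instead.
    """
    residues = [c for c in column if c != gap]
    nongap_fraction = len(residues) / len(column)
    if not residues:
        return 0.0, 0.0
    homogeneity = Counter(residues).most_common(1)[0][1] / len(residues)
    return homogeneity, nongap_fraction


def conservation_index(alignment, max_gap_fraction=0.5):
    """Return [(position, index)] for consensus columns of an alignment."""
    results = []
    for pos, column in enumerate(zip(*alignment)):
        homogeneity, nongap = column_scores(column)
        if 1.0 - nongap >= max_gap_fraction:
            continue  # keep only positions with a gap fraction below 0.5
        results.append((pos, homogeneity * nongap))
    return results


# Toy alignment of four equal-length sequences (hypothetical).
toy_alignment = ["MKV-A", "MKI-A", "MRV-A", "MKVGA"]
profile = conservation_index(toy_alignment)
```

In this toy case the fourth column is dropped because three of its four characters are gaps, while fully conserved columns score 1.0.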
EDC was added to a final concentration of 2 mM, and the reaction was performed at room temperature for 30 min. The reaction was quenched with a 50× molar excess of Tris-HCl (pH 7.5) for 5 min. Cross-linked protein samples were separated using SDS-PAGE, the protein bands were stained with GelCode Blue, and tryptic peptides were generated using an in-gel tryptic digestion kit (Thermo Scientific; catalog no. 89871); peptides were purified using Pierce 10-µl C18 tips (Thermo Fisher; catalog no. PI87782). Peptides were analyzed in the Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific) coupled to an EASY-nLC (Thermo Scientific) liquid chromatography system, with a 2-µm, 500-mm EASY-Spray column. The peptides were eluted over a 180-min linear gradient from 96% buffer A (water) to 40% buffer B (acetonitrile) and then continued to 98% buffer B over 20 min with a flow rate of 200 nl/min. Each full mass spectrometry (MS) scan (R = 60,000) was followed by 20 data-dependent MS2 scans (R = 15,000) with high-energy collisional dissociation (HCD) and an isolation window of 2.0 m/z. The normalized collision energy was set to 35. Precursors of charge states 2 to 6 and 4 to 6 were collected for MS2 scans; monoisotopic precursor selection was enabled, and a dynamic exclusion window was set to 30.0 s. The resulting raw files were searched in enumerative mode with pFind3 (68) in open search mode against the Nsp12 sequence; the inferred modifications over a 1% cutoff were used as "variable" modifications in the subsequent pLink2 search. The same files were then searched in cross-link discovery mode using pLink2 (69) against the Nsp12 sequence, using [EDC] as the cross-linking reagent, trypsin as the enzyme generating the peptides, and variable modifications set as inferred by pFind3.
Data availability. Mass spectrometry data sets have been deposited into MassIVE (accession no.
MSV000086827, available for download at ftp://massive.ucsd.edu/MSV000086827/), and processed data are presented in Data Set S2. Supplemental material is available online only.
An overview of current knowledge of deadly CoVs and their interface with innate immunity
Characteristics of SARS-CoV-2 and COVID-19
On the road to ending the COVID-19 pandemic: are we there yet?
The variant gambit: COVID-19's next move
Zoonotic and reverse zoonotic events of SARS-CoV-2 and their impact on global health
A nidovirus perspective on SARS-CoV-2
The nonstructural proteins directing coronavirus RNA synthesis and processing
A comprehensive superposition of viral polymerase structures
RNA-dependent RNA polymerase: structure, mechanism, and drug discovery for COVID-19
Double-membrane vesicles as platforms for viral replication
Structure of the RNA-dependent RNA polymerase from COVID-19 virus
Structure of replicating SARS-CoV-2 polymerase
Structural basis for helicase-polymerase coupling in the SARS-CoV-2 replication-transcription complex
Discovery of an essential nucleotidylating activity associated with a newly delineated conserved domain in the RNA polymerase-containing protein of all nidoviruses
Coronavirus replication-transcription complex: vital and selective NMPylation of a conserved site in nsp9 by the NiRAN-RdRp subunit
Cryo-EM structure of an extended SARS-CoV-2 replication and transcription complex reveals an intermediate state in cap synthesis
Remdesivir is a direct-acting antiviral that inhibits RNA-dependent RNA polymerase from severe acute respiratory syndrome coronavirus 2 with high potency
Rapid incorporation of favipiravir by the fast and permissive viral RNA polymerase complex results in SARS-CoV-2 lethal mutagenesis
Structure of the SARS-CoV-2 RNA-dependent RNA polymerase in the presence of favipiravir-RTP
Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors
Mechanism of SARS-CoV-2 polymerase stalling by remdesivir
Remdesivir is effective in combating COVID-19 because it is a better substrate than ATP for the viral RNA-dependent RNA polymerase
Structural basis for RNA replication by the SARS-CoV-2 polymerase
The nucleotide addition cycle of the SARS-CoV-2 polymerase
Remdesivir and SARS-CoV-2: structural requirements at both nsp12 RdRp and nsp14 exonuclease active sites
Codon bias and heterologous protein expression
A "silent" polymorphism in the MDR1 gene changes substrate specificity
A code within the genetic code: codon usage regulates cotranslational protein folding
Transient ribosomal attenuation coordinates protein synthesis and co-translational folding
Cotranslational folding of proteins on the ribosome
Evolutionary conservation of codon optimality reveals hidden signatures of cotranslational folding
The coding capacity of SARS-CoV-2
Characterization of aminoacyl-tRNA synthetase stability and substrate interaction by differential scanning fluorimetry
Structural and biochemical characterization of the nsp12-nsp7-nsp8 core polymerase complex from SARS-CoV-2
Physical biology of GPCR signalling dynamics inferred from fluorescence spectroscopy and imaging
Codon usage influences the local rate of translation elongation to regulate co-translational protein folding
Nonoptimal codon usage influences protein structure in intrinsically disordered regions
Hyper-accurate ribosomes inhibit growth
Identification and characterization of a human coronavirus 229E nonstructural protein 8-associated RNA 3′-terminal adenylyltransferase activity
X-ray crystal structures elucidate the nucleotidyl transfer reaction of transcript initiation using two nucleotides
Structural basis for inhibition of the RNA-dependent RNA polymerase from SARS-CoV-2 by remdesivir
Ten commandments: lessons from the enzymology of DNA replication
Non-optimal codon usage is a mechanism to achieve circadian clock conditionality
Non-optimal codon usage affects expression, structure and function of clock protein FRQ
Co-overexpression of Escherichia coli RNA polymerase subunits allows isolation and analysis of mutant enzymes lacking lineage-specific sequence insertions
SARS-CoV-2 lineages and sub-lineages circulating worldwide: a dynamic overview
Biochemical features and mutations of key proteins in SARS-CoV-2 and their impacts on RNA therapeutics
Remdesivir for the treatment of COVID-19: a systematic review and meta-analysis of randomized controlled trials
Major update: remdesivir for adults with COVID-19: a living systematic review and meta-analysis for the American College of Physicians practice points
Piece of the puzzle: remdesivir disassembles the multimeric SARS-CoV-2 RNA-dependent RNA polymerase complex
Template-dependent inhibition of coronavirus RNA-dependent RNA polymerase by remdesivir reveals a second mechanism of action
Remdesivir is a delayed translocation inhibitor of SARS-CoV-2 replication
ppGpp functions as an alarmone in metazoa
Molecular insights into replication initiation by Qβ replicase using ribosomal protein S1
Assembly of Qβ viral RNA polymerase with host translational elongation factors EF-Tu and -Ts
ppGpp inhibition of elongation factors Tu, G and Ts during polypeptide synthesis
Interaction of Qβ RNA replicase with guanine nucleotides. Different modes of inhibition and inactivation
Structural basis for transcript elongation control by NusG family universal regulators
Steps toward translocation-independent RNA polymerase inactivation by terminator ATPase rho
Structural basis for NusA stabilized transcriptional pausing
Diverse and unified mechanisms of transcription initiation in bacteria
The mutational footprints of cancer therapies
Exploring the allosteric territory of protein function
MUSCLE: a multiple sequence alignment method with reduced time and space complexity
Evolution of DNA packaging in gene transfer agents
Comprehensive identification of peptides in tandem mass spectra using an efficient open search engine
A high-speed search engine pLink 2 with systematic evaluation for proteome-scale identification of cross-linked peptides