key: cord-0285369-lf8ij7di
authors: Carlson, Christopher R.; Adly, Armin N.; Bi, Maxine; Cheng, Yifan; Morgan, David O.
title: Reconstitution of the SARS-CoV-2 ribonucleosome provides insights into genomic RNA packaging and regulation by phosphorylation
date: 2022-05-24
journal: bioRxiv
DOI: 10.1101/2022.05.23.493138
sha: fc010da3199aab50bf4e9bdd08c7372432703ef9
doc_id: 285369
cord_uid: lf8ij7di

The nucleocapsid (N) protein of coronaviruses is responsible for compaction of the ∼30-kb RNA genome in the ∼100-nm virion. Cryo-electron tomography suggests that each virion contains 35-40 viral ribonucleoprotein (vRNP) complexes, or ribonucleosomes, arrayed along the genome. There is, however, little mechanistic understanding of the vRNP complex. Here, we show that N protein, when combined with viral RNA fragments in vitro, forms cylindrical 15-nm particles similar to the vRNP structures observed within coronavirus virions. These vRNPs form in the presence of stem-loop-containing RNA and depend on regions of N protein that promote protein-RNA and protein-protein interactions. Phosphorylation of N protein in its disordered serine/arginine (SR) region weakens these interactions and disrupts vRNP assembly. We propose that unmodified N binds stem-loop-rich regions in genomic RNA to form compact vRNP complexes within the nucleocapsid, while phosphorylated N maintains uncompacted viral RNA to promote the protein’s transcriptional function.

disordered region also contains sequences that interact with the Nsp3 protein on double-membrane vesicles (11, 32-34). N protein undergoes liquid-liquid phase separation in the presence of viral RNA to form biomolecular condensates (35-38). Phosphorylated N combines with RNA to form a liquid-like condensate that might serve as a compartment at the RTC to help protect viral replication and transcriptional machinery from the host cell's innate immune response (35, 38, 39).
Unmodified N protein, however, combines with RNA to form more rigid condensates that contain discrete substructures. These gel-like condensates may help to package viral genomic RNA in the nucleocapsid (35, 37, 38) . During viral assembly, hypo-phosphorylated N protein binds genomic RNA to form the compact nucleocapsid structure, which is then engulfed by ER membranes containing the S, E, and M proteins to form a mature virus (4, 5, 16, 30) . Early electron microscopy studies of coronavirus nucleocapsids demonstrated the existence of viral ribonucleoprotein (vRNP) complexes aligned helically along an RNA strand (40) (41) (42) . Recent cryo-electron tomography studies of intact SARS-CoV-2 virions revealed that each virus contains 35-40 discrete, cylindrical nucleosome-like vRNP complexes (43, 44) . These vRNPs, or ribonucleosomes, are ~15 nm in diameter and, through low resolution modeling efforts, are speculated to contain twelve N proteins in complex with up to 800 nt of RNA (30,000 nt ÷ 38 vRNPs = 800 nt). A 'beads-on-a-string' model has been proposed as a general mechanism of coronavirus packaging: vRNPs (the beads) locally compact RNA within the long genomic RNA strand (the string). Unlike string, however, the SARS-CoV-2 genomic RNA is highly structured, containing an elaborate array of heterogeneous secondary and tertiary structural elements that are present in both infected cells and in the virion (45, 46) . Thus, N protein must accommodate a variety of RNA structural elements to form the compact vRNPs of the nucleocapsid. Mechanistic insight into this model and overall vRNP architecture is lacking. In our previous work, we observed that purified N protein and a 400-nt viral RNA fragment assemble into vRNP particles similar to those seen inside the intact virus, suggesting that N protein and RNA alone are sufficient to form the vRNP (38) . Here, we explore the biochemical properties, composition, and regulation of these particles. 
We find that vRNPs form in the presence of stem-loop-containing RNA through a multitude of protein-protein and protein-RNA interactions. Phosphorylation of N protein weakens these interactions and inhibits the formation of ribonucleosomes.

We previously observed vRNP complexes in vitro when N protein was mixed with a 400-nt viral RNA from the 5' end of the genome, while cryo-electron tomography studies of intact viruses suggest that the vRNP packages up to 800 nt of RNA (38, 43, 44). To further investigate the impact of RNA length on vRNP assembly, we mixed N protein with 400-, 600-, and 800-nt RNA fragments from the 5' end of the genome (5'-400, 5'-600, and 5'-800, respectively) and analyzed the resulting complexes by electrophoresis on a native TBE gel. All three RNAs shifted to a larger species in the presence of N (Fig. 1B), indicating that N protein bound the RNAs and retarded their electrophoretic mobility. N protein in complex with 5'-600 RNA resulted in a particularly discrete, intense band, suggesting that it forms a stable RNA-N protein complex.

We used mass photometry to better characterize these RNA-N protein complexes. Mass photometry uses light scattering to measure the mass of single molecules in solution, resulting in a histogram of mass measurements centered around the average molecular mass of the protein complex. N protein in complex with 5'-400 RNA resulted in two mass peaks that were smaller than the single broad peak of N protein bound to 5'-600 RNA, suggesting the 5'-400 vRNP was not fully assembled and contained subcomplexes (Fig. 1C). N protein mixed with 5'-800 RNA formed two broad peaks: one smaller peak that appears similar in size to the 5'-600 species (both ~750-800 kDa), and a second larger peak roughly twice as large as the first (~1400 kDa). This suggests that one (~750 kDa) or two vRNPs (~1400 kDa) can form on a single 5'-800 RNA molecule (Fig. 1C).
The 5'-600 RNA was therefore chosen as a representative viral RNA to further study the ribonucleosome. To purify the vRNP complex for more detailed analysis, N protein was mixed with 5'-600 RNA and separated by centrifugation on a 10-40% glycerol gradient. Individual fractions were analyzed by native gel electrophoresis (Fig. 1D, top) , after which fractions 7 and 8 were combined for analysis by mass photometry. We observed three major peaks centered at 97 ± 2 kDa, 207 ± 6 kDa, and 766 ± 6 kDa (Fig. 1E , top and table S1). These peaks likely correspond to free N protein dimer (predicted mass 91.2 kDa; see Fig. 2D , top), unbound 5'-600 RNA (predicted mass 192.5 kDa), and the vRNP complex, respectively. The presence of free N protein dimer and unbound RNA suggests that the vRNP complex dissociated upon dilution for mass photometry analysis. To stabilize the complex, a crosslinker (0.1% glutaraldehyde) was added to the 40% glycerol buffer, creating a gradient of glutaraldehyde throughout the glycerol to crosslink the protein complex during centrifugation (a technique known as gradient fixation, or GraFix) (47) . Analysis of the GraFix-purified fractions by native gel electrophoresis revealed sharper, more discrete bands compared to the non-crosslinked sample (Fig. 1D, bottom) . The distribution of vRNP complexes across the gradient was similar between the two conditions. The GraFix purified sample (fractions 7 + 8) was analyzed by mass photometry, revealing one peak with an approximate mass of 727 ± 4 kDa (Fig. 1E , bottom and table S1). This is consistent with the idea that the noncrosslinked sample dissociates upon dilution for mass photometry and suggests a likely stoichiometry of 12 N proteins (547.5 kDa) bound to one 5'-600 RNA (192.5 kDa; total predicted mass: 740 kDa). Alternatively, a complex of 8 N proteins (365 kDa) with two 5'-600 RNAs (385 kDa; total predicted mass: 750 kDa) is also consistent with these results. 
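The stoichiometry argument above can be checked with simple arithmetic. The following Python sketch is illustrative and not part of the original analysis. It enumerates candidate complexes of N protein dimers and 5'-600 RNAs and ranks them by how close their predicted mass is to the 727 kDa GraFix peak. The predicted masses (91.2 kDa per N dimer, 192.5 kDa per 5'-600 RNA) are taken from the text; the function names and the search range (up to 8 dimers and 2 RNAs) are assumptions made for illustration, and any mass contributed by glutaraldehyde crosslinks is ignored.

```python
# Predicted masses quoted in the text (kDa).
N_DIMER_KDA = 91.2      # free N protein dimer
RNA_5P600_KDA = 192.5   # one 5'-600 RNA
MEASURED_KDA = 727.0    # GraFix-purified vRNP peak (Fig. 1E, bottom)


def predicted_mass(n_proteins: int, n_rnas: int) -> float:
    """Predicted mass (kDa) of n_proteins N monomers plus n_rnas 5'-600 RNAs."""
    return n_proteins * N_DIMER_KDA / 2 + n_rnas * RNA_5P600_KDA


def rank_candidates(max_dimers: int = 8, max_rnas: int = 2):
    """Return (abs_error, n_proteins, n_rnas, mass) tuples, best match first.

    The search range is an illustrative assumption, not a claim about
    which complexes are physically possible.
    """
    rows = []
    for dimers in range(1, max_dimers + 1):
        for rnas in range(1, max_rnas + 1):
            mass = predicted_mass(2 * dimers, rnas)
            rows.append((abs(mass - MEASURED_KDA), 2 * dimers, rnas, mass))
    return sorted(rows)


if __name__ == "__main__":
    # Show the three candidates closest to the measured peak.
    for error, n_prot, n_rna, mass in rank_candidates()[:3]:
        print(f"{n_prot:2d} N + {n_rna} RNA: {mass:6.1f} kDa (off by {error:5.1f} kDa)")
```

Under these assumptions the two closest candidates are 12 N proteins with one RNA (about 740 kDa) and 8 N proteins with two RNAs (about 750 kDa), the same two possibilities raised in the text. Mass alone does not distinguish between them.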
Negative stain electron microscopy (EM) of the GraFix-purified sample revealed discrete 15-nm particles with an electron-dense center surrounded by an outer ring (Fig. 1F). Two-dimensional classification revealed particles with variable composition and conformation, suggesting inherent structural heterogeneity within the vRNP complex that may reflect the diverse RNA stem-loop structures in the 600-nt RNA strand (see Fig. 2A). While these averages are heterogeneous, they are similar in size and shape to vRNP complexes previously observed within SARS-CoV-2 virions by cryo-electron tomography (43, 44).

We next tested whether specific RNA sequences or regions of the genome promote formation of the vRNP. Four 600-nt genomic regions were transcribed in vitro, individually mixed with N protein, and analyzed by native gel electrophoresis: (1) 5'-600 (nucleotides 1-600), (2) Nsp3 (nucleotides 7,800-8,400), (3) Nsp8/9 (nucleotides 12,250-12,850), and (4) Nsp10 (nucleotides 13,200-13,800). All RNAs appeared to form appropriately sized vRNPs, although the Nsp3 RNA appeared less effective (Fig. 1G). These results suggest that the vRNP can accommodate a variety of viral RNA and does not require specific sequences to form, although certain sequences may form more stable ribonucleosomes.

We then sought to explore the relationship between RNA structure and vRNP formation by dissecting the structures required for vRNP assembly within the 5'-600 RNA. This highly structured 600-nt genomic region contains several well-characterized stem-loops varying in size from 20 to ~150 nt (Fig. 2A) (45, 48). Three stem-loop RNAs (SL4a, 56 nt; SL7, 46 nt; SL8, 72 nt) were individually mixed with N protein, crosslinked (with 0.1% glutaraldehyde) to stabilize the resulting complexes, and assessed for vRNP formation by native gel electrophoresis. Appropriately sized vRNP complexes formed in the presence of all three stem-loops (Fig. 2B and fig. S1A).
Each crosslinked complex was analyzed by mass photometry. SL4a mixtures contained three broad peaks of 515 ± 19 kDa, 739 ± 5 kDa, and 876 ± 1 kDa (fig. S1B, top, and table S1). SL7 also generated three mass peaks at 502 ± 26 kDa, 615 ± 22 kDa, and 713 ± 13 kDa (fig. S1B, middle, and table S1). SL8 generated two peaks at 737 ± 11 kDa and 840 ± 12 kDa (fig. S1B, bottom, and table S1) and was chosen for further analysis due to less heterogeneity in the composition of the complex. SL8-containing vRNPs were purified by GraFix ( fig. S1C ). Analysis of peak fractions (7 + 8) by mass photometry (fig. S1D) indicated that SL8 vRNPs were similar in mass to vRNPs assembled with 5'-600 RNA (Fig. 1E ). Negative stain EM (Fig. 2C ) and two-dimensional class averages revealed ring structures that resemble vRNPs assembled with the 5'-600 RNA (Fig. 1F) . These data suggest that ribonucleosome formation does not require 600 continuous bases of RNA but can be achieved with multiple copies of a relatively short and simple stem-loop structure. Unlike the 5'-600 RNA, the short stem-loop RNA is unlikely to serve as a platform to recruit multiple copies of N protein to assemble a vRNP. We speculate that the binding of a stem-loop RNA to N protein induces a conformational change that promotes protein-protein interactions required for vRNP formation. In the more physiologically relevant context of long RNAs, these weak protein-protein interactions are likely stabilized by multivalent interactions with an RNA molecule. To test the requirement for secondary structure in vRNP formation, we analyzed a mutant SL8 (mSL8) carrying 12 mutations predicted to abolish the stem-loop structure. vRNP formation was reduced in the presence of mSL8, suggesting that ribonucleosomes form more readily with double-stranded stem-loop structures ( fig. S1E ). Analysis of non-crosslinked SL8-N protein complexes shed further light on vRNP assembly. 
Mass photometry of the SL8-N sample revealed a major species at ~110 kDa, with five evenly spaced complexes every 120-130 kDa thereafter up to ~755 kDa (108 ± 4 kDa, 225 ± 10 kDa, 360 ± 2 kDa, 468 ± 28 kDa, 600 ± 31 kDa, 736 ± 27 kDa) (Fig. 2D, middle, and table S1). N protein alone exists primarily as a ~96 kDa dimer (97 ± 1 kDa) at the low concentration used for mass photometry (Fig. 2D, top, and table S1; predicted mass 91.2 kDa), so the ~110 kDa peak likely represents one N protein dimer bound to one SL8 RNA (predicted mass of RNA: 23.1 kDa; predicted mass of complex: 114.4 kDa). The stepwise ~120-130 kDa increases in molecular mass are consistent with the addition of an N dimer bound to either one or two SL8 RNA molecules (predicted mass: 114.4 kDa or 137.5 kDa, respectively). These results support a potential assembly mechanism in which N protein dimers, bound to one or two stem-loops, iteratively assemble to form a full ribonucleosome containing twelve N proteins and six to twelve stem-loop RNAs (Fig. 2E). These data support the possibility that the vRNP assembled with 5'-600 RNA (Fig. 1E) contains 12 N proteins bound to one RNA.

In some crosslinked vRNP preparations, we observed an additional large peak in mass photometry that is likely to contain more than 12 N proteins. As mentioned above, crosslinked SL8 vRNPs contain a broad peak of 840 ± 12 kDa in addition to the 737 ± 11 kDa peak (fig. S1B, bottom; also shown in Fig. 2D, bottom). Based on the similar molecular mass of the smaller peak in the crosslinked sample (737 ± 11 kDa) to the non-crosslinked sample (736 ± 27 kDa), we suspect that the larger crosslinked complex of 840 ± 12 kDa contains 14 N proteins. These results suggest that the ribonucleosome defaults to a stable complex of 12 N proteins bound to a variable number of RNA stem-loops but can adapt to accommodate fewer or more N protein dimers bound to additional RNA.
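The stepwise assembly reasoning above can be made concrete with a short calculation. The Python sketch below is illustrative only. It takes the six mean peak masses reported for the non-crosslinked SL8-N sample, computes the spacing between neighbouring peaks, and compares the mean spacing with the mass of one N dimer carrying one or two SL8 RNAs. The helper names are hypothetical. The assignment of six N dimers to the largest peak follows the interpretation proposed in the text (Fig. 2E) and is an assumption here, not an independent measurement. Reported measurement errors are ignored.

```python
# Mean peak masses (kDa) of non-crosslinked SL8-N complexes (Fig. 2D, middle).
LADDER_KDA = [108, 225, 360, 468, 600, 736]
N_DIMER_KDA = 91.2  # predicted mass of one N dimer
SL8_KDA = 23.1      # predicted mass of one SL8 RNA


def step_sizes(peaks):
    """Mass difference between each pair of neighbouring peaks."""
    return [b - a for a, b in zip(peaks, peaks[1:])]


def mean_step(peaks):
    """Average spacing between neighbouring peaks."""
    return (peaks[-1] - peaks[0]) / (len(peaks) - 1)


def unit_mass(n_rnas):
    """Mass of one N dimer bound to n_rnas SL8 RNAs."""
    return N_DIMER_KDA + n_rnas * SL8_KDA


def implied_rna_count(peak_kda, n_dimers):
    """SL8 copies implied by a peak mass, ASSUMING it holds n_dimers N dimers."""
    return (peak_kda - n_dimers * N_DIMER_KDA) / SL8_KDA


if __name__ == "__main__":
    print("steps (kDa):", step_sizes(LADDER_KDA))
    print(f"mean step: {mean_step(LADDER_KDA):.1f} kDa")
    print(f"dimer + 1 SL8: {unit_mass(1):.1f} kDa; dimer + 2 SL8: {unit_mass(2):.1f} kDa")
    print(f"SL8 copies implied for the largest peak with 6 dimers: "
          f"{implied_rna_count(LADDER_KDA[-1], 6):.1f}")
```

With these inputs the mean spacing falls between the mass of a dimer with one SL8 and a dimer with two SL8 RNAs, and the largest peak implies between six and twelve SL8 copies if it contains six dimers. Both results are in line with the proposed range of six to twelve stem-loop RNAs per ribonucleosome.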
Next, we sought to explore the regions of the N protein required for vRNP formation. We analyzed mutant proteins lacking the following regions: (1) the 44-aa N-terminal extension (NTE), a poorly conserved prion-like sequence that promotes RNA-induced liquid-liquid phase separation of N protein (38, 49); (2) the highly conserved 31-aa serine/arginine (SR) region that has been implicated in RNA binding, oligomerization, and phosphorylation (15, 16, 29-31, 50-52)

Mutant N proteins were mixed with 5'-600 RNA and analyzed by native gel electrophoresis (Fig. 3B). All mutant N proteins, with the exception of the CTE deletion, appeared to form fully assembled vRNPs. Most mutants contained varying amounts of lower bands beneath the fully shifted vRNP. These lower bands might represent subcomplexes in which the 5'-600 RNA is bound to fewer N proteins, presumably due to defects in vRNP assembly or stability. Deletion of the CTE resulted in a small shift that was considerably lower than the fully shifted vRNP. These results suggest that the ∆CTE N protein binds RNA but fails to form the fully assembled vRNP, hinting at an important role for the CTE in vRNP formation.

Studies of deletion mutants in complex with SL8 RNA, which minimizes the contribution of multivalent RNA binding, allowed us to investigate the critical protein-protein interactions that contribute to ribonucleosome formation. Mutant N proteins were mixed with SL8 RNA, crosslinked, and analyzed by native gel electrophoresis and mass photometry (Fig. 3C, D, and table S1). Deletion of the NTE had little effect, other than to decrease the size of the vRNP complexes by ~30-40 kDa, suggesting that the NTE is not required for vRNP formation. All other deletion mutants had major defects in vRNP assembly. Deletion of the CTE and LH resulted in almost complete disappearance of the vRNP when analyzed by native gel electrophoresis (Fig. 3C).
These mutants appeared cloudy, and turbidity analysis revealed a higher absorbance at 340 nm compared to wild-type, suggesting formation of biomolecular condensates (fig. S2B). Mass photometry analysis of the LH deletion showed a dominant peak at ~110 kDa, with two minor peaks at ~230 kDa and ~345 kDa (Fig. 3D and table S1). The smallest peak represents an N dimer bound to one SL8 RNA, with the next two representing stepwise additions of one or two N dimers bound to an RNA. Thus, protein-protein interactions mediated by the LH are required for vRNP formation. Deletion of the CTE resulted in no discernable peaks above background on the mass photometer (Fig. 3D), further confirming the essential role of the CTE in vRNP formation and suggesting that tetramerization driven by the CTE is required for ribonucleosome formation or stability.

Deletion of the SR and CBP regions also resulted in defects in vRNP assembly; both mutants exhibited a laddering of ribonucleoprotein subcomplexes when analyzed by native gel electrophoresis, as well as stepwise 120-130 kDa increases in molecular mass revealed by mass photometry (Fig. 3C, D, and table S1). These data suggest the SR and CBP regions are required for complete assembly of the ribonucleosome. LH and CBP deletions resulted in a minimal ribonucleoprotein complex of ~110 kDa, consistent with one N protein dimer bound to one SL8 RNA molecule. Interestingly, the SR deletion resulted in a minimal ribonucleoprotein complex of ~230 kDa, consistent with one N protein tetramer bound to two SL8 RNAs.

Native gel analysis revealed an increase in free SL8 RNA in the various mutant N protein samples compared to wild-type, suggesting defects in RNA binding (Fig. 3C). We performed fluorescence anisotropy to quantitatively measure the affinity of a 10-nt RNA (of random sequence) for mutant N proteins, which likely reflects RNA binding to the high-affinity RNA-binding site on the NTD (fig. S2C) (51).
Wild-type N protein had an affinity of 28 ± 6 nM for the 10-nt RNA oligo, which is consistent with previous measurements of RNA binding to the NTD (51). All mutant N proteins, except the SR deletion, had similar affinities for RNA. Deletion of the SR region resulted in a modest ~5-fold decrease in affinity, consistent with previous reports that the SR region makes a slight contribution to RNA binding at the NTD (51). The increase in free SL8 RNA in native gel electrophoresis is therefore likely caused by defects in lower-affinity RNA binding at other sites in the N protein, which might be necessary for proper ribonucleosome formation.

The SR region of N protein is heavily phosphorylated in cells infected by SARS-CoV-2, and this modification is required for the protein's role in viral transcription (15, 16, 29-31). In contrast, N protein in the virion is thought to be poorly phosphorylated (16, 30). We previously observed defects in vRNP formation when the 5'-400 RNA was mixed with a phosphomimetic N protein (the 10D mutant, in which 10 serines and threonines in the SR region are replaced with aspartic acid) (38), and here we sought to further explore phosphoregulation of the ribonucleosome. We mixed 5'-600 RNA with the 10D mutant

Negative stain EM and two-dimensional class average analysis of the GraFix-purified 10D ribonucleoprotein complex, however, revealed a markedly different structure compared to the wild-type vRNPs (Fig. 4C and Fig. 1F). The 10D complex appears extended and heterogeneous, unlike the compact structure of the wild-type vRNP, and does not average into discrete, recognizable two-dimensional classes. We therefore speculate that the 600-nt RNA provides sufficient binding sites for twelve 10D N proteins, but the 10D mutant is unable to condense into the ring structure observed with the wild-type N protein. We note that there was no defect in RNA binding to the NTD of the 10D mutant (Kd = 31 ± 7 nM) (fig. S2C).
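To give a sense of what these affinities mean in practice, the Python sketch below evaluates a simple one-site binding model, fraction bound = [N] / (Kd + [N]), at a few N protein concentrations. It is an illustration under stated assumptions, not the fitting procedure used in the study. The wild-type and 10D Kd values are those reported above. The SR deletion value is approximated here as five times the wild-type Kd, based on the "~5-fold decrease" in the text, and is not a reported number. The model also assumes the RNA probe concentration is low enough that depletion of free protein can be neglected. The function and variable names are hypothetical.

```python
def fraction_bound(protein_nm: float, kd_nm: float) -> float:
    """One-site binding: fraction of RNA probe bound at a given protein level.

    Assumes free protein is approximately equal to total protein
    (no ligand depletion), which is a simplification.
    """
    if protein_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_nm / (kd_nm + protein_nm)


# Kd values in nM. The SR deletion entry is an approximation derived from
# the ~5-fold decrease described in the text, not a measured value.
KD_NM = {
    "wild-type": 28.0,
    "10D": 31.0,
    "SR deletion (approx.)": 5 * 28.0,
}

if __name__ == "__main__":
    concentrations_nm = [10, 30, 100, 300, 1000]
    for name, kd in KD_NM.items():
        row = ", ".join(
            f"{c} nM: {fraction_bound(c, kd):.2f}" for c in concentrations_nm
        )
        print(f"{name:<22} {row}")
```

In this model the wild-type and 10D curves are nearly indistinguishable, while the approximated SR deletion curve needs several-fold more protein to reach the same occupancy. This matches the text's conclusion that NTD binding is intact in the 10D mutant and only modestly weakened by the SR deletion.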
Ribonucleosome formation by the 10D mutant with the SL8 RNA was severely reduced when analyzed by native gel electrophoresis and mass photometry (Fig. 4A , right, and Fig. 4D ). Both assays revealed a laddering of vRNP complexes, consistent with an inability of the 10D mutant to form a stable, fully assembled vRNP. Purification of the 10D + SL8 complex by GraFix revealed a clear shift toward lower molecular mass species when compared to wild-type N (Fig. 4E compared to fig. S1C ). This result was confirmed by mass photometry analysis of fractions 19 + 20 ( fig. S3B) . Interestingly, the minimal unit of vRNP complex assembly with the 10D mutant (like the SR deletion) is ~230 kDa, which is consistent with an N protein tetramer bound to two SL8 RNAs. Negative stain EM and two-dimensional class averages of the GraFix-purified complex (fractions 19 + 20) revealed a smaller overall structure with an electron density distribution clearly distinct from vRNP complexes formed by wild-type N (Fig. 4F compared to Fig. 2C ). We next tested vRNP assembly with N protein that had been phosphorylated in vitro. In recent work, Yaron et al. (29) elegantly demonstrated a multi-kinase cascade that results in maximally phosphorylated N protein: SRPK phosphorylates S188 and S206, which primes the protein for subsequent phosphorylation of 8 more sites within the SR by GSK3, which then primes a final 4 sites for phosphorylation by CK1 (Fig. 5A) . Consistent with this model, we observed maximal phosphorylation of N in the presence of all three kinases (Fig. 5B) . Phosphorylation was greatly reduced when both SRPK priming sites were mutated to alanine (S188A + S206A mutant) (Fig. 5B ). We mixed kinase-treated wild-type or S188A + S206A N proteins with SL8 RNA and purified the resulting vRNP complexes by GraFix (Fig. 5C ). Wild-type phosphorylated N protein migrated as a low molecular weight ribonucleoprotein complex across the gradient, similar to the 10D mutant. 
The poorly phosphorylated S188A + S206A mutant, however, formed an appropriately sized vRNP across the gradient, similar to wild type unphosphorylated N protein (Fig. 5C ). Mass photometry of the GraFix-purified samples further substantiated the defect in wild-type phospho-N vRNP assembly (Fig. 5D, top) , which is rescued by mutation of the two priming phosphorylation sites (the S188A + S206A mutant) (Fig. 5D , bottom). The 'beads-on-a-string' model for coronavirus genome packaging lacks mechanistic detail. Here, we demonstrate that the N protein of SARS-CoV-2 assembles with viral RNA in vitro to form ribonucleosomes. These structures, which have been observed previously in intact SARS-CoV-2 virions by cryo-electron microscopy (43, 44) , likely contain twelve N proteins (6 dimers) and a variable number of stem-loop RNA structures. Short stem-loop RNAs appear to induce conformational changes in N protein that promote protein-protein interactions necessary for ribonucleosome assembly. These interactions might involve SR binding to the CTD (52), LH binding to other regions of the N protein, helical stacking of the CBP (22), and tetramerization driven by the CTE (20, 50, 54, 55) . All of these binding interfaces contribute to the stability of the vRNP, but the CTE seems particularly critical for ribonucleosome formation. Coronavirus genomic RNA is structurally heterogeneous (45, 46) , and it remains unclear how ribonucleosomes accommodate variable RNA sequence and structure to package RNA in the virion. We find that the vRNP assembles in the presence of 600-nt RNA fragments from multiple genomic regions, suggesting that no specific sequences are required for vRNP formation. Furthermore, the ability of short stem-loop RNAs to trigger vRNP formation suggests that ribonucleosome formation does not require 600 continuous bases of RNA. 
Inside the virion, it is not known whether each ribonucleosome forms on a continuous stretch of RNA in a nucleosome-like fashion or instead acts as a hub that binds stem-loops distributed across the genome, creating a web of condensed, interlinked protein-RNA interactions with 'nodes' at the ~38 vRNPs.

Studies of ribonucleosome assembly with a small stem-loop RNA demonstrate that the vRNP is compositionally adaptive; that is, it can contain a variable number of N protein dimers bound to a variable number of stem-loop RNAs and assembles by iterative additions of N protein dimers bound to stem-loop RNAs. Our data suggest that the most stable form of the vRNP is 12 N proteins in complex with ~600 nt of RNA, but we also observed complexes that contain fewer or more N protein dimers. Given the iterative assembly of the vRNP, the multitude of protein-protein and protein-RNA interactions, and the high concentrations of N protein and RNA in the nucleocapsid, it seems reasonable to expect that the vRNP can expand to expose binding sites that allow additional N protein dimers to insert themselves into, or dissociate from, the cylindrical vRNP complex.

Our results, together with data from other studies, provide insights into the general architecture of coronavirus RNA packaging. There are ~38 vRNPs per virus (43, 44), with each vRNP likely containing ~12 N proteins in complex with ~600 bases of viral RNA. This suggests that within a virus, the vRNPs contain ~500 N proteins bound to ~23,000 nt of RNA. It has been estimated that there are ~1000 N proteins per virus (56), while the viral genome is 30,000 nt in length, suggesting that some N proteins and RNA in the virion are not incorporated into vRNPs. Cryo-electron tomography studies indicate that most vRNPs are associated with the inner face of the membrane envelope, with a structure-free center in every virus (40, 43, 57).
Based on previous studies from our lab and others, this central region in the virion might contain a gel-like condensate of N protein bound heterogeneously to viral RNA (35, 37, 38) . N protein is highly phosphorylated in infected cells, and numerous kinases have been implicated in this phosphorylation (15, 16, 29, 30, 38) . (66) . Thus, inhibition of N protein phosphorylation represents a promising target for therapeutic intervention that has the potential to reduce mortality in individuals infected with SARS-CoV-2 and could serve as an early treatment in the event of future coronavirus outbreaks. Wild-type and mutant N proteins were produced as described previously (38) . Sequences of all RNAs used in this study are provided in table S2. The template for in vitro transcription of 5'-600 RNA was a synthetic DNA (IDT), inserted by Gibson assembly into a pUC18 vector with a 5' T7 promoter sequence. The 5'-600 insert, including the 5' T7 sequence, was excised by EcoR1 digestion and purified by size exclusion chromatography on a Sephacryl 1000 column equilibrated in TE buffer (10 mM Tris pH 8, 1 mM EDTA). Peak fractions of the purified DNA insert were pooled and stored at -4°C. Templates for all other long RNAs (5'-400, 5'-800, Nsp3, Nsp8/9, and Nsp10) were amplified by PCR of a plasmid containing the SARS-CoV-2 genome (a gift from Hiten Madhani, UCSF). All forward primers included a 5' T7 promoter sequence. The SL8 and mSL8 templates were generated by PCR of synthetic DNA (IDT). The sequence for mutant SL8 (mSL8) was designed manually and checked for predicted secondary structure by RNAfold (http://rna.tbi.univie.ac.at/). PCR-amplified DNA was purified and concentrated by spin column (Zymo Research #D4004) before being used to generate RNA. RNA synthesis was performed using the HiScribe T7 High Yield RNA synthesis kit (NEB #E2040S) according to the manufacturer's protocol. 
Following incubation at 37°C for 3 h, in vitro synthesized RNA was purified and concentrated by spin column (Zymo Research #R1018). To promote formation of proper RNA secondary structure, all purified RNAs were heat denatured at 95°C for 2 min in a pre-heated metal heat block, and then removed from heat and allowed to cool slowly to room temperature over the course of ~1 h. RNA concentration (A260) was quantified by nanodrop. The day before each experiment, N protein was dialyzed into reaction buffer (25 mM HEPES pH 7.5, 70 mM KCl) overnight. RNA was transcribed in vitro the day of analysis, heat-denatured and cooled slowly to allow for proper secondary structure. To assemble vRNP complexes, RNA was mixed with N protein (256 ng/µl RNA and 15 µM N, unless otherwise indicated) in a total volume of 10 µl and incubated for 10 min at 25°C. Samples containing stem-loop RNAs (SL4a, SL7, SL8, SL8m) were crosslinked by addition of 0.1% glutaraldehyde for 10 min at 25°C and then quenched with 100 mM Tris pH 7.5. Samples containing longer RNAs (5'-400, 5'-600, 5'-800, Nsp3, Nsp8/9, Nsp10) were not crosslinked. After assembly, vRNP complexes were analyzed as described below. After assembly (and crosslinking in the case of stem-loop RNAs), 10 µl vRNP mixtures were diluted 1:10 in dilution buffer (25 mM HEPES pH 7.5, 70 mM KCl, 10% glycerol). 2 µl of diluted vRNP mixtures was loaded onto a 5% polyacrylamide native TBE gel (Bio-Rad) and run at 125 V for 80 min at 4°C. 1 µl of the diluted samples was then denatured by addition of 4 M urea and Proteinase K (40 U/ml; New England Biolabs #P8107S), incubated for 5 min at 65°C, loaded onto a 6% polyacrylamide TBE-Urea Gel (Thermo Fisher), and run at 160 V for 50 min at room temperature. Gels were stained with SYBR Gold (Invitrogen) and imaged on a Typhoon FLA9500 Multimode imager set to detect Cy3. Mass photometry experiments were performed using a OneMP instrument (Refeyn). 
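As a rough check on the assembly conditions given in this section (256 ng/µl RNA and 15 µM N), the Python sketch below converts the RNA mass concentration to a molar concentration and estimates the N-to-RNA ratio in the mixture. It is illustrative only. The RNA masses (192.5 kDa for 5'-600 and 23.1 kDa for SL8) are the predicted values quoted in the Results. The helper names are hypothetical, and the sketch assumes that 15 µM refers to N protein monomers, which the text does not state explicitly.

```python
# Assembly conditions from the text.
RNA_NG_PER_UL = 256.0
N_PROTEIN_UM = 15.0  # ASSUMED to be the monomer concentration

# Predicted RNA masses quoted in the Results (kDa).
RNA_MASS_KDA = {"5'-600": 192.5, "SL8": 23.1}


def rna_micromolar(ng_per_ul: float, mass_kda: float) -> float:
    """Convert an RNA mass concentration to micromolar.

    1 ng/µl equals 1 mg/L, and a mass in kDa equals the molar mass in
    kg/mol, so (ng/µl) / kDa gives µmol/L directly.
    """
    return ng_per_ul / mass_kda


def n_per_rna(mass_kda: float) -> float:
    """N protein monomers available per RNA molecule in the mixture."""
    return N_PROTEIN_UM / rna_micromolar(RNA_NG_PER_UL, mass_kda)


if __name__ == "__main__":
    for name, mass in RNA_MASS_KDA.items():
        conc = rna_micromolar(RNA_NG_PER_UL, mass)
        print(f"{name}: {conc:.2f} µM RNA, {n_per_rna(mass):.2f} N per RNA")
```

Under the monomer assumption, the 5'-600 mixture contains roughly 11 N proteins per RNA, close to the proposed 12:1 stoichiometry, and the SL8 mixture contains roughly 1.5 SL8 RNAs per N dimer, within the one-to-two range suggested by the mass photometry ladder.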
Freshly prepared and renatured RNA was mixed with dialyzed N protein and incubated for 2 min at room temperature. Absorbance was measured at 260 nm and 340 nm using the Nanodrop Micro-UV/Vis Spectrophotometer. Turbidity was calculated by normalization of the 340 nm measurements to the absorbance value at 260 nm.

For negative-stain EM, 2.5 μl of vRNP samples were applied to a glow-discharged Cu grid covered by continuous carbon film and stained with 0.75% (w/v) uranyl formate. A Tecnai T12 microscope (ThermoFisher FEI Company) operated at 120 kV was used to image these negatively stained grids. Micrographs were recorded at a nominal magnification of 52,000X using a Gatan Rio 16 camera, corresponding to a pixel size of 1.34 Å on the specimen. All images were processed using cryoSPARC. Micrographs were processed with Patch-Based CTF Estimation, and particles were picked using the blob picker followed by the template picker. Iterations of 2D classification generated final 2D averages.

Fluorescent RNA was ordered from IDT as a 10-nt degenerate sequence (random nucleotide at every position) with a 3'-FAM modification. N protein constructs were serially diluted in dialysis buffer, mixed with 10 nM fluorescent RNA, and incubated at room temperature for 5 min. Fluorescence was measured on a K2 Multifrequency Fluorometer. RNA was excited with polarized light at 488 nm and emission was recorded at 520 nm. Data from three independent N protein titrations were fit to a one-site binding curve using GraphPad Prism to determine KD.

Glycerol gradients were assembled as previously described, with slight modifications (67). Briefly, 10-40% glycerol gradients (dialysis buffer containing 10% or 40% glycerol) were poured and mixed with the Gradient Master (BioComp). For GraFix purification, fresh 0.1% glutaraldehyde was added to the 40% glycerol buffer prior to gradient assembly.
vRNP samples (generally 75 µl of 15 µM N with 256 ng/µl RNA) were gently added on top of the assembled 5 ml gradients and samples were centrifuged in a prechilled Ti55 rotor at 35,000 rpm for 17 h. Gradient fractions were collected by puncturing the bottom of the tube with a butterfly needle and collecting two drops per well. For analysis by negative stain electron microscopy and mass photometry, peak fractions were combined and buffer exchanged using centrifugal concentrators (Millipore Sigma #UFC510024). Concentrated samples were then re-diluted 1:10 with dialysis buffer (0% glycerol) and re-concentrated. Samples were diluted and re-concentrated three times. Kinases were purchased from Promega (SRPK1: #VA7558, GSK-3β: #V1991, CK1ε: (table S1) . (E) Predictions of N protein and RNA stoichiometry, based on measured masses of N protein in complex with SL8 RNA without crosslinker (D, middle panel). Measured masses are means ± standard deviation in two independent experiments (table S1). Below the table is a schematic of a proposed assembly mechanism in which N protein dimers, bound to one or two stem-loop RNAs, iteratively assemble to the full vRNP. The proposed mechanism of sequential phosphorylation (29) is initiated by SRPK at S188 and S206 (orange), which leads to downstream phosphorylation of eight sites by GSK3 (green), allowing for final phosphorylation of four additional sites by CK1 (purple). In the phosphomimetic 10D mutant used in Fig. 4 , the SRPK and GSK3 sites are changed to aspartic acid. (B) Wild-type (WT) and mutant N protein constructs were incubated with the indicated kinases in the presence of radiolabeled ATP and analyzed by SDS-PAGE and autoradiography. Phosphorylated N is indicated. Asterisk denotes autophosphorylation of CK1. Molecular mass marker shown on right (kDa). (C) N protein (WT or S188A + S206A) was phosphorylated by SRPK, GSK3, and CK1, and then mixed with SL8 RNA. 
The resulting ribonucleoprotein complexes were separated by glycerol gradient centrifugation in the presence of crosslinker (GraFix) and analyzed by native gel electrophoresis. (D) Peak fractions from the GraFix analyses in C were analyzed by mass photometry. Top: fractions 19 + 20 of wild-type N; bottom: fractions 7 + 8 of S188A + S206A mutant N. Representative of two independent experiments (table S1).
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Human Coronavirus: Host-Pathogen Interaction
The molecular biology of coronaviruses
SARS-CoV-2 nucleocapsid protein adheres to replication organelles before viral assembly at the Golgi/ERGIC and lysosome-mediated egress
Coronavirus biology and replication: implications for SARS-CoV-2
A unifying structural and functional model of the coronavirus replication organelle: Tracking down RNA synthesis
Ultrastructure and origin of membrane vesicles associated with the severe acute respiratory syndrome coronavirus replication complex
The Architecture of SARS-CoV-2 Transcriptome
Continuous and Discontinuous RNA Synthesis in Coronaviruses
A contemporary view of coronavirus transcription
Nucleocapsid Protein Recruitment to Replication-Transcription Complexes Plays a Crucial Role in Coronaviral Life Cycle
Determination of host proteins composing the microenvironment of coronavirus replicase complexes by proximity-labeling
The intracellular sites of early replication and budding of SARS-coronavirus
Four proteins processed from the replicase gene polyprotein of mouse hepatitis virus colocalize in the cell periphery and adjacent to sites of virion assembly
The Global Phosphorylation Landscape of SARS-CoV-2 Infection
Nucleocapsid phosphorylation and RNA helicase DDX1 recruitment enables coronavirus transition from discontinuous to continuous transcription
Coronavirus nucleocapsid protein facilitates template switching and is required for efficient transcription
The nucleoprotein is required for efficient coronavirus genome replication
The SARS coronavirus nucleocapsid protein--forms and functions
Architecture and self-assembly of the SARS-CoV-2 nucleocapsid protein
A molecular pore spans the double membrane of the coronavirus replication organelle
The SARS-CoV-2 nucleocapsid phosphoprotein forms mutually exclusive condensates with RNA and the membrane-associated M protein
SARS-CoV-2 nucleocapsid protein phase-separates with RNA and with human hnRNPs
Genomic RNA Elements Drive Phase Separation of the SARS-CoV-2 Nucleocapsid
Phosphoregulation of Phase Separation by the SARS-CoV-2 N Protein Suggests a Biophysical Basis for its Dual Functions
The coronavirus nucleocapsid protein is dynamically associated with the replication-transcription complexes
Cryo-electron tomography of mouse hepatitis virus: Insights into the structure of the coronavirion
Ribonucleoprotein of avian infectious bronchitis virus
Ribonucleoprotein-like structures from coronavirus particles
Molecular Architecture of the SARS-CoV-2 Virus
SARS-CoV-2 structure and replication characterized by in situ cryoelectron tomography
Comprehensive in vivo secondary structure of the SARS-CoV-2 genome reveals novel regulatory motifs and mechanisms
The architecture of the SARS-CoV-2 RNA genome inside virion
GraFix: sample preparation for single-particle electron cryomicroscopy
The structure and functions of coronavirus genomic 3' and 5' ends
Genetic evidence for a structural interaction between the carboxy termini of the membrane and nucleocapsid proteins of mouse hepatitis virus
Characterization of the coronavirus M protein and nucleocapsid interaction in infected cells
A key role for the carboxy-terminal tail of the murine coronavirus nucleocapsid protein in coordination of genome packaging
Recognition of the murine coronavirus genomic RNA packaging signal depends on the second RNA-binding domain of the nucleocapsid protein
Targeting the coronavirus nucleocapsid protein through GSK-3 inhibition
Analysis of Nucleosome Sliding by ATP-Dependent Chromatin Remodeling Enzymes
We thank the members of the Morgan laboratory for discussions and comments on the manuscript, and Conor Howard, Elise Muñoz, Hayden Saunders, and R. Das for discussions and technical assistance.
Competing interests: The authors declare no competing interests.
Data and materials availability: All data needed to evaluate the conclusions of the paper are included in the paper or supplementary materials.