key: cord-0878633-pfrusy8i
authors: Zhang, Kaiming; Zheludev, Ivan N.; Hagey, Rachel J.; Wu, Marie Teng-Pei; Haslecker, Raphael; Hou, Yixuan J.; Kretsch, Rachael; Pintilie, Grigore D.; Rangan, Ramya; Kladwang, Wipapat; Li, Shanshan; Pham, Edward A.; Bernardin-Souibgui, Claire; Baric, Ralph S.; Sheahan, Timothy P.; D′Souza, Victoria; Glenn, Jeffrey S.; Chiu, Wah; Das, Rhiju
title: Cryo-electron Microscopy and Exploratory Antisense Targeting of the 28-kDa Frameshift Stimulation Element from the SARS-CoV-2 RNA Genome
date: 2020-07-20
journal: bioRxiv
DOI: 10.1101/2020.07.18.209270
sha: 65b1a2cfb2ae79097bf15941df97093752c9be53
doc_id: 878633
cord_uid: pfrusy8i

Drug discovery campaigns against Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) are beginning to target the viral RNA genome 1, 2. The frameshift stimulation element (FSE) of the SARS-CoV-2 genome is required for balanced expression of essential viral proteins and is highly conserved, making it a potential candidate for antiviral targeting by small molecules and oligonucleotides 3–6. To aid global efforts focusing on SARS-CoV-2 frameshifting, we report exploratory results from frameshifting and cellular replication experiments with locked nucleic acid (LNA) antisense oligonucleotides (ASOs), which support the FSE as a therapeutic target but highlight difficulties in achieving strong inactivation. To understand current limitations, we applied cryogenic electron microscopy (cryo-EM) and the Ribosolve pipeline 7 to determine a three-dimensional structure of the SARS-CoV-2 FSE, validated through an RNA nanostructure tagging method. This is the smallest macromolecule (88 nt; 28 kDa) resolved by single-particle cryo-EM at subnanometer resolution to date. The tertiary structure model, defined to an estimated accuracy of 5.9 Å, presents a topologically complex fold in which the 5′ end threads through a ring formed inside a three-stem pseudoknot.
Our results suggest an updated model for SARS-CoV-2 frameshifting as well as binding sites that may be targeted by next-generation ASOs and small molecules.

To better understand the FSE tertiary structure, we sought to model RNA coordinates into the 6.9-Å map using Ribosolve, a hybrid pipeline recently developed for automatically modeling RNA 3D structures based on secondary structure information from mutate-and-map guided by next-generation sequencing (M2-seq), cryo-EM maps, and computer modeling with autoDRRAFTER 7. M2-seq secondary structure analysis recovered the three-stem pseudoknot in Fig. 1a that has been validated by NMR and compensatory mutagenesis for the SARS-CoV-1 FSE 21. The same three-stem secondary structure was also observed in a separate SHAPE-directed modeling study as well as an independent DMS-MaPseq study 27, with minor variations in edge base pairs (Extended Data Table 1). At 5.9-Å estimated accuracy, individual atomic positions and non-canonical base pairs cannot be confidently assigned. Nevertheless, the tertiary arrangement of the helical segments and non-helical linkers of the SARS-CoV-2 FSE can be traced (Fig. 2), is consistent across members of the model ensemble, and is further supported by alternative autoDRRAFTER runs making different assumptions about secondary structure and initial 3D helix placements (Extended Data Fig. 7).

The architecture of the FSE involves several interlocking elements (Fig. 2e-f and Extended Data Movie 1). Starting from the 5′ end and proceeding to the 3′ end, the molecule begins with a 5′ region that includes the heptanucleotide slippery site. The Ribosolve modeling folds this region into a loose hairpin-like shape closed by G•U wobble pairs. This 5′ end is followed by the first strand of Stem 1, a long helix in all coronavirus FSEs. The loop of Stem 1 is also the first strand of the Stem 2 pseudoknot, which forms a 5-bp helix that hybridizes with its complement at the 3′ end of the FSE.
The RNA strand continues from this region to complete the second strand of Stem 1 and doubles back to form a hairpin, Stem 3. After an unpaired segment J3/2, the RNA completes Stem 2 to close the Stem 1-Stem 2 pseudoknot. Lastly, unstructured terminal nucleotides form a 3′ tail. Despite the absence of base pairings or direct stacking between Stem 3 and the Stem 1-Stem 2 pseudoknot, Stem 3 exhibits a distinct tertiary conformation in relation to the pseudoknot which, along with the conformationally heterogeneous hairpin at the 5′ end, results in the legs of the "λ"-shaped map (Fig. 2e-f, Extended Data Fig. 6 and Extended Data Movie 1). Overall, Stems 1, 2, and 3 form a circular ring with a visually apparent hole (Fig. 2e-f). The 5′ end of the FSE is connected to the pseudoknot by a linker that passes through this ring. This complex topology was predicted as a possible FSE fold in two recent, independent 3D computer modeling studies that include an extended 5′ end 44, 45. We emphasize that while the fold of the 5′ end appears poorly resolved and thus may have multiple conformations, the connection point of the 5′ end to the rest of the structure and its threading through the Stem 1-Stem 2-Stem 3 ring are consistent features in all models in the autoDRRAFTER ensemble (Extended Data Movie 1) as well as in alternative modeling runs based on alternative FSE secondary structures and different autoDRRAFTER modeling assumptions (Extended Data Fig. 7). In terms of structural requirements, the 5′ end ring-threading requires formation of a ~10 base pair helical turn in Stem 1. With fewer base pairs, the 5′ strand of Stem 1 cannot turn fully inside and through the ring.
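The strand path described above, in which Stem 2 pairs the Stem 1 loop with the 3′ end while Stem 3 sits in between, can be made concrete with extended dot-bracket notation, where a pseudoknot shows up as base pairs that cross one another. The following is a minimal Python sketch under stated assumptions: the `TOY` string is a hypothetical three-stem layout chosen for readability, not the FSE sequence or its M2-seq-derived secondary structure, and the function names `parse_pairs` and `crossing_pairs` are illustrative only.

```python
from itertools import combinations

# Bracket families of extended dot-bracket notation; a second family is
# needed whenever two helices cross, i.e. for a pseudoknot.
BRACKETS = {"(": ")", "[": "]", "{": "}", "<": ">"}


def parse_pairs(structure):
    """Return sorted (i, j) base pairs (0-based) from a dot-bracket string."""
    openers = {close: open_ for open_, close in BRACKETS.items()}
    stacks = {open_: [] for open_ in BRACKETS}
    pairs = []
    for pos, char in enumerate(structure):
        if char in BRACKETS:
            stacks[char].append(pos)
        elif char in openers:
            stack = stacks[openers[char]]
            if not stack:
                raise ValueError(f"unmatched {char!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")
    if any(stacks.values()):
        raise ValueError("unclosed bracket in structure")
    return sorted(pairs)


def crossing_pairs(pairs):
    """Return every two base pairs (i, j), (k, l) with i < k < j < l."""
    return [
        (p, q)
        for p, q in combinations(sorted(pairs), 2)
        if p[0] < q[0] < p[1] < q[1]
    ]


# Hypothetical toy layout (NOT the FSE sequence or its measured structure):
# Stem 1 "((((", Stem 2 "[[[" opening inside the Stem 1 loop, a Stem 3
# hairpin, an unpaired J3/2-like segment, then "]]]" closing Stem 2.
TOY = "((((..[[[..))))..(((....)))..]]]"

if __name__ == "__main__":
    pairs = parse_pairs(TOY)
    crossings = crossing_pairs(pairs)
    print(f"{len(pairs)} base pairs, {len(crossings)} crossing combinations")
    print("pseudoknotted:", bool(crossings))
```

In this toy layout only the Stem 1 and Stem 2 pairs cross; the Stem 3 hairpin is nested inside Stem 2. Crossing pairs identify a pseudoknot at the secondary-structure level only, and say nothing about 3D features such as threading of the 5′ end through the ring.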
Supporting the general relevance of 5′-end ring-threading, the length of Stem 1 is ~10 base pairs or larger in all proposed coronavirus FSE elements 22, and de novo modeling of FSEs from bovine coronavirus, murine hepatitis virus, human coronaviruses OC43 and HKU1, SARS-CoV-1, and MERS all give 5′ ring-threaded models (Extended Data Figure 9).

The SARS-CoV-2 FSE RNA represents an extreme case for cryo-EM. It is the smallest macromolecule (28 kDa) so far resolved by cryo-EM single particle analysis at subnanometer resolution. We therefore sought further independent validation of the map and the Ribosolve model, particularly to test our inference that Stem 3 comprises a 'leg' of the λ shape, perpendicular to the Stem 1-Stem 2 pseudoknot 44, 45. We rationally designed a variant of the SARS-CoV-2 FSE termed FSE-ATP-TTR3, which contains an insertion of a clothespin-like nanostructure whose visualization would test the assignment and orientation of Stem 3 in the FSE map 46 (Fig. 3a). For the tag, we chose the rationally designed ATP-TTR3 RNA 47 based on its known amenability to cryo-EM imaging 7 and on modeling suggesting that its insertion into the FSE would not perturb either RNA's secondary structure, a prediction verified through chemical mapping (Extended Data Fig. 10). A dataset containing ~12,000 micrographs of FSE-ATP-TTR3 was collected and resulted in a 6.4-Å map of FSE-ATP-TTR3 exhibiting the ATP-TTR3 clothespin-like shape. Additional density at the end of the clothespin is visible, with a λ-like shape (Fig. 3b-c, Extended Data Fig. 11 and Extended Data Movie 2). The FSE-only map fits into this additional density (Fig. 3d, Extended Data Movie 2). To evaluate the relative orientation of the FSE and the ATP-TTR tag, we adopted an unbiased method to align the two maps.
The orientation of the FSE map was determined by conducting a rigid-body exhaustive search to maximize the correlation between the FSE density and difference mapping between FSE-ATP-TTR3 and ATP-TTR3 densities independently using

The urgency of the COVID-19 pandemic, recent advances in targeting RNA 3D structures with ASOs and small molecules, and the identification of the FSE as a potentially well-defined RNA 3D structure in the SARS-CoV-2 genome have generated significant interest in understanding and targeting SARS-CoV-2 ribosomal frameshifting 3, 23, 27, 28, 44, 45, 50, 51. Here, we confirmed the ability of ASOs to invade the SARS-CoV-2 FSE structure and to reduce frameshifting efficiencies in cell-free assays, and we explored the potential for these ASOs to reduce viral replication in cells at submicromolar concentrations. Incomplete inactivation of coronavirus frameshifting and replication in these and prior studies motivated structural analysis to better understand the FSE. Despite prior expectations of structural heterogeneity and propensity for multimerization 4, 45, 52, and despite having a size slightly under that of the previous smallest macromolecule imaged by cryo-EM (28 vs. 30 kDa 35), the SARS-CoV-2 FSE gave a 6.9-Å resolution monomer map with recognizably helix-like elements arranged into a λ-like tertiary arrangement. Automated secondary structure determination and 3D coordinate building through the recently developed Ribosolve pipeline led to a model with an intricate fold: the Stem 1-Stem 2 pseudoknot interconnected by a J3/2 linker and Stem 3 to form a closed ring. Remarkably, the linker connecting the 5′ end of the FSE to Stem 1 threads through this ring. This overall 3D arrangement was tested through a second cryo-EM analysis of an FSE molecule extended by a rationally designed clothespin-shaped nanostructure at Stem 3.

We note two caveats regarding our cryo-EM analysis.
First, the 6.9 Å resolution of the cryo-EM map is not sufficient to directly resolve base interactions, much less atom positions. It is likely that hydrogen bonds and base pairs beyond our modeled secondary structure stabilize the FSE 3D structure, and higher resolution structures will be needed to resolve those details. Second, cryo-EM data processing can filter out alternative, less well-defined conformations while focusing on particle images that contribute to the most well-defined structures. De novo computer modeling, in addition to predicting the ring-threaded structure we observe, has suggested alternative conformations in which the 5′ end is not threaded through the ring (Extended Data Figure 9 ). Such alternative states of FSE may exist in our preparations but escape computational detection due to the extra conformational heterogeneity at the 5′ ends. Within our nanostructure-tagged map ( Fig. 3b) , alternative states may contribute to the lower resolution of the FSE segment compared to the nanostructure segment. Our cryo-EM-guided data and modeling, along with recent results from several groups, suggest explanations for why FSE targeting efforts by us and others have so far yielded only modest inhibition of frameshifting and viral replication [4] [5] [6] 34 . There is accumulating evidence that, in the full SARS-CoV-2 genome context, the FSE forms alternative structures involving genomic segments upstream of the element (Extended Data Fig. 13 and refs 27,44,45,50 ). As the human ribosome approaches the FSE, it must unfold these upstream structures through its helicase activity, and the FSE will refold. The position of the 5' end of the FSE will be influenced by structures that existed prior to ribosome-induced refolding and could end up either threaded through the final FSE Stem1-Stem2-Stem3 ring, as captured by our cryo-EM analysis , or remain unthreaded (Fig. 4) . The 5'-end threaded structure (Fig. 
4a ) appears poised to lead to ribosomal pausing and -1 PRF through a torsional restraint mechanism 21 , perhaps working in concert with specific interactions of the pseudoknot with the ribosome 53 . Other structures (Fig. 4b shows one model) might be resolved by the ribosome without frameshifting. Further work, potentially involving design of topologically trapped FSE variants, single molecule FRET, and time-resolved cryo-EM of the ribosome 54-57 , will be necessary to test and refine this model. The structural complexity of FSE 3D folding has implications for therapeutic targeting efforts. First, for ASO targeting, the intricately threaded FSE tertiary structure (Fig. 4c ) may hinder strand invasion, even into regions typically considered unpaired, thus explaining the requirement we and others have observed for near-micromolar ASO concentrations to achieve inhibition of frameshifting. Second, for either ASO or small molecule inhibitors, if the molecule does not significantly alter the partitioning in how the FSE 5'-end leads into the pseudoknot (Fig. 4a vs. 4b), it would not alter frameshifting efficiencies to extremely low levels, as is expected to be needed to disrupt viral replication 51 . If this explanation is correct, design of anti-frameshifting therapeutics may need to focus on binding to alternative structures that are distinct from our cryo-EM-guided model. Alternatively, SARS-CoV-2 replication might be disrupted by maximizing rather than inhibiting frameshifting, which would still perturb the stoichiometry of viral proteins 17, 18 . In this largely unexplored strategy, pro-frameshifting drugs could be designed to stabilize binding pockets that consistently appear in all members of the cryo-EM-guided model ensemble (Extended Data Figure 14 ). Three such sites, which we term the 'ring site', the 'J3/2 site', and the 'slippery hairpin site', involve nucleotides that are highly conserved across a diverse range of coronaviruses (blue in Fig. 
4c) and hence might be targeted with a reduced chance of viral escape.

We have provided the first 3D structural data of a functionally obligate segment of the SARS-CoV-2 genome and have described implications for mechanism and targeting of -1 programmed ribosomal frameshifting, an intricate and critical genomic process. Dozens of other segments of the 30 kb RNA genome are highly conserved and have been predicted to be structured 27, 44, 45, 50, 58, 59, many in the size range explored here. Applying antisense targeting and cryo-EM to these segments may yield additional information that sheds light on poorly understood SARS-CoV-2 RNA biology and, hopefully, accelerates design of genome-disrupting therapeutic agents.

LNAs are oligonucleotides with a 2′ sugar modification that locks the ribose in a C3′-endo conformation by a 2′-O, 4′-C methylene bridge, which in turn locks the LNA into the A-form helical conformation 60. Oligonucleotides containing LNAs were custom synthesized by Integrated

The JL4 and RL4 ASOs, inspired by ref. 61, contained short regions of hybridization to Stem 1 and Stem 2, but were not expected to be able to displace intramolecular pairings in those stems.

To test binding of each LNA to the SARS-CoV-2 FSE, we initially planned to carry out SHAPE and DMS chemical mapping but discovered in preliminary experiments that the bound S2D LNA interfered with reverse transcription. We therefore used this interference itself to test for binding. Binding assays were performed in 20 μL reactions containing either 1.

Tail2 primer for 0.2 pmol RNA, and 3.25 μL ddH2O, was added. Lastly, a premixed volume containing 1.0 μL of 100 mM DTT, 0.5 μL of SuperScript III reverse transcriptase (Thermo Fisher Scientific), 1.6 μL of 10 mM dNTPs and 1.9 μL of ddH2O was added to each reaction and reverse transcription was carried out at 48°C for 40 min.
The resulting products were then treated equivalently to the standard 1D SHAPE/DMS chemical mapping protocol (described below).

Frameshifting levels for SARS-CoV-2 were determined using the p2luc bicistronic dual-luciferase

Frameshifting levels were calculated by taking the ratio:

The resulting mean fold-change frameshifting data were then fitted using the MATLAB 2019b (MathWorks) "Levenberg-Marquardt" algorithm to a standard binding isotherm (Slp2 was fitted with the IC50 constrained to 130 nM):

chemical mapping construct's assembled dsDNA was size-purified on a 2% agarose gel using blue-light transillumination with SYBR Safe (Invitrogen™) and then purified using a Qiagen MinElute Gel Extraction kit prior to IVT.

The RNA samples were folded prior to plunge freezing as previously described 7. Briefly, the RNA sample in a buffer containing 50 mM Na-HEPES (pH 8.0) was denatured at 90 °C for 3 min and

movie stacks for FSE and 12,380 movie stacks for FSE-ATP-TTR3 were collected.

All micrographs were first imported into Relion for image processing. Motion correction was performed using MotionCor2 68 and the contrast transfer function (CTF) was determined using CTFFIND4 69. All particles were autopicked using the NeuralNet option in EMAN2. Then, particle coordinates were imported to Relion, where the poor 2D class averages were removed by several rounds of 2D classification. The initial models for both datasets were built in cryoSPARC using the ab-initio reconstruction option. For the FSE, 1,063,711 particles were picked and 445,707 were selected after 2D classification in Relion. After removing classes with poorly connected density by 3D classification, the final 3D refinement was performed using 109,137 particles in Relion, and a 6.9-Å map was obtained. For the FSE-ATP-TTR3, 1,103,091 particles were picked and 902,309 were selected after 2D classification in Relion.
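The particle counts reported so far can be summarized as the fraction of autopicked particles retained at each processing stage. The following Python sketch is bookkeeping only, using the counts stated in the text; the helper name `retention` and the stage labels are illustrative and not part of Relion, EMAN2 or cryoSPARC.

```python
def retention(stages):
    """Return (label, count, fraction of the first stage) for each stage.

    `stages` is an ordered list of (label, particle_count) tuples, with the
    autopicked count first.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    picked = stages[0][1]
    return [(label, count, count / picked) for label, count in stages]


# Counts as reported in the text for the untagged FSE dataset.
FSE = [
    ("autopicked", 1_063_711),
    ("after 2D classification", 445_707),
    ("final 3D refinement", 109_137),
]

# Counts reported up to this point for the tagged construct.
FSE_ATP_TTR3 = [
    ("autopicked", 1_103_091),
    ("after 2D classification", 902_309),
]

if __name__ == "__main__":
    for name, stages in (("FSE", FSE), ("FSE-ATP-TTR3", FSE_ATP_TTR3)):
        print(name)
        for label, count, fraction in retention(stages):
            print(f"  {label:<26}{count:>10,}  {fraction:6.1%}")
```

For the untagged FSE, roughly one in ten autopicked particles contributes to the final map, which is relevant to the caveat that processing concentrates on the best-defined conformations.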
After removing classes with poorly connected density by 3D classification in Relion, two rounds of heterogeneous refinement were performed in cryoSPARC to further remove contaminant particles. The final 3D homogeneous refinement was performed using 257,558 particles, and a 6.4-Å map was obtained. Resolution for the final maps was estimated with the 0.143 criterion of the Fourier shell correlation curve without or with mask. A Gaussian low-pass filter was applied to the final 3D maps displayed in the UCSF Chimera software package 70.

Chemical mapping was conducted on FSE, plusFSE and FSE-ATP-TTR3 constructs (Extended Data

Following one-dimensional chemical mapping, the same constructs were also subjected to two-dimensional (2D) chemical mapping, which uses DMS modification on folded RNA but leverages information on structural perturbations from sequence mutations to more directly infer helices 72. The resulting signals are read out using Illumina short-read sequencing, exploiting the mutational readthrough of DMS-modified bases by the reverse transcriptase TGIRT-III (InGex). In brief, the same DNA encoding the 1D chemical mapping constructs was first subjected to error-prone PCR (epPCR), in which 2 ng/μL of PCR-assembled (see above) dsDNA (1.6 μg total dsDNA used per construct, 8× epPCRs) was amplified using the epPCR forward and reverse primers at 100 μM (Extended Data

shown on log scale; same data as main text Fig. 1f in all top ten results for (g). j-k. autoDRRAFTER modelling forcing an initial placement of Stem 3 in an alternative position of the map (red vs. yellow in (j)) initially gives poorly converged models that are not well-placed in the density, but after iterative refinement, helix positions shift and lead to well-converged models (k), indistinguishable from results from unbiased autoDRRAFTER modeling (b,e,h).

with all ten models from the autoDRRAFTER modelling (c). Structures are colored by phyloP scores so that blue vs. red highlight regions of higher vs.
lower conservation than the average genomic phyloP score, respectively.

Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia
Flooded by the torrent: the COVID-19 drug pipeline. The Lancet vol
RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses: a first look
Structural and functional conservation of the programmed -1 ribosomal frameshift signal of SARS coronavirus 2 (SARS-CoV-2)
Interference of ribosomal frameshifting by antisense peptide nucleic acids suppresses SARS coronavirus replication
Identification of RNA Pseudoknot-Binding Ligand That Inhibits the −1 Ribosomal Frameshifting of SARS-Coronavirus by Structure-Based Virtual Screening
Accelerated cryo-EM-guided determination of three-dimensional RNA-only structures
Structure-Based Drug Discovery Paradigm
Antifungal drugs: Small molecules targeting a tertiary RNA structure fight fungi
RNA-based therapeutics: current progress and future prospects
Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
Structure of the RNA-dependent RNA polymerase from COVID-19 virus
Structure of M from COVID-19 virus and discovery of its inhibitors
Structural basis for inhibition of the RNA-dependent RNA polymerase from SARS-CoV-2 by remdesivir
SARS and MERS: recent insights into emerging coronaviruses
Achieving a golden mean: mechanisms by which coronaviruses ensure synthesis of the correct stoichiometric ratios of viral proteins
Mechanisms and implications of programmed translational frameshifting
A three-stemmed mRNA pseudoknot in the SARS coronavirus frameshift signal
An in silico map of the SARS-CoV-2 RNA Structurome
Locked nucleic acid: modality, diversity, and drug discovery
Torsional restraint: a new twist on frameshifting pseudoknots
A dual-luciferase reporter system for studying recoding signals
A mouse-adapted SARS-CoV-2 model for the evaluation of COVID-19 medical countermeasures
A new coronavirus associated with human respiratory disease in China
Superior 5' homogeneity of RNA from ATP-initiated transcription under the T7 phi 2.5 promoter
Primerize: automated primer assembly for transcribing non-coding RNA domains
Ultraviolet shadowing of RNA can cause significant chemical damage in seconds
MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy
CTFFIND4: Fast and accurate defocus estimation from electron micrographs
UCSF Chimera--a visualization system for exploratory research and analysis
HiTRACE: high-throughput robust analysis for capillary electrophoresis
RNA structure inference through chemical mapping after accidental or intentional mutations
Specific viral RNA drives the SARS CoV-2 nucleocapsid to phase separate
RNA secondary structure prediction without physics-based models
FARFAR2: Improved De Novo Rosetta Prediction of Complex Global RNA Folds
The open science grid
Detection of nonneutral substitution rates on mammalian phylogenies
Standardization of RNA chemical mapping experiments

151 movie stacks (low centration samples with high noisy background

Extended Data Fig. 10. The secondary structure for FSE-ATP-TTR3, the SARS-CoV-2 frameshift stimulation element tagged by the ATP-TTR nanostructure as determined by 1D chemical SHAPE mapping, as computed in RNAstructure allowing for pseudoknots. Bootstrapping (100 iterations) support for each continuous helix is shown as an underlined percentage. Nucleotides are colored by (a) SHAPE reactivity or (b) dimethyl sulfate (DMS) reactivity. Bootstrapping probabilities are shown for the DMS case for helices that are also predicted with RNAstructure guided by DMS reactivity.

Reactivity data are a. SHAPE reactivities collected in this work, b. SHAPE reactivities from Iserman, et al. 73, c. SHAPE reactivities from Manfredonia, et al. 28, and d. DMS reactivities from Lan et al. 27. Bootstrapping (100 iterations) support for each helix is shown as an underlined percentage. Base pairs are shown in blue if they are not present in all four models.