key: cord-0699553-itqwhi6a authors: Haddad, Christina; Davila-Calderon, Jesse; Tolbert, Blanton S. title: Integrated Approaches to Reveal Mechanisms by which RNA Viruses Reprogram the Cellular Environment date: 2020-07-02 journal: Methods DOI: 10.1016/j.ymeth.2020.06.013 sha: f42123265a3bc026cbd80dcc70fd38a99eb9e27c doc_id: 699553 cord_uid: itqwhi6a
RNA viruses are major threats to global society, and mass outbreaks can cause long-lasting damage to international economies. RNA viruses and related retroviruses represent a large and diverse family that contributes to the onset of human diseases such as AIDS; certain cancers like T cell lymphoma; severe acute respiratory illnesses as seen with COVID-19; and others. The hallmark of this viral family is the storage of genetic material in the form of RNA, and upon infecting host cells, their RNA genomes reprogram the cellular environment to favor productive viral replication. RNA is a multifunctional biomolecule that not only stores and transmits heritable information but also has the capacity to catalyze complex biochemical reactions. It is therefore no surprise that RNA viruses use this functional diversity to their advantage to sustain chronic or lifelong infections. Efforts to subvert RNA viruses therefore require a deep understanding of the mechanisms by which these pathogens usurp cellular machinery. Here, we briefly summarize several experimental techniques that individually inform on key physicochemical features of viral RNA genomes and their interactions with proteins. Each of these techniques provides an important vantage point from which to understand the complexities of virus-host interactions, but we attempt to make the case that by integrating these and similar methods, more vivid descriptions of how viruses reprogram the cellular environment emerge. These vivid descriptions should expedite the identification of novel therapeutic targets.
Mammalian ribonucleic acid (RNA) viruses continue to pose serious threats to human health and global economies. As this article is being prepared, the world is living through a viral pandemic.

RNA is a diverse, multifunctional biomolecule that is involved in both the transfer and storage of genetic information as well as the modulation of a myriad of biological processes by virtue of its capacity to fold into complex structures and to catalyze biochemical reactions [1]. It is therefore no surprise that RNA viruses take advantage of the unique physicochemical properties of their viral genomes to assemble functional complexes, which in turn drive almost every aspect of their replication cycles within host cells (Figure 1). Many of these complexes are formed through the recruitment of cognate RNA-binding proteins (RBPs) to specific genomic (or sub-genomic) loci, and the nature of these interactions manifests as signals that regulate each step of viral gene expression [2, 3]. Thus, studying viral RNA structures and their interactions with cognate RBPs is essential to understanding the pathogenesis of RNA viruses and to assisting the design of novel antivirals. In this article, we attempt to describe how integrating methods that probe RNA structures and their interactions can inform on mechanisms that regulate viral gene expression for two representative positive-sense RNA viral families, namely Enteroviruses and Coronaviruses. Members of both families have caused widespread outbreaks in recent history.

Non-polio human enteroviruses (EV) are persistent pathogens that cause millions of infections in the United States and globally each year [4, 5]. Infections typically manifest as mild illness; however, protracted infections in the immunocompromised (mostly infants, children and teenagers) can lead to severe neurological disorders, morbidity, paralysis, respiratory failure and death [6, 7].
The National Institute of Allergy and Infectious Diseases identified EV-A71 and EV-D68 as emerging infectious pathogens [8], and the World Health Organization discussed including both viruses in its Blueprint List of Priority Diseases [9], emphasizing the serious threat that these viruses represent to public health. In a 2018 EV-A71 outbreak in Vietnam, 53,000 children were hospitalized and six died [10]. Similar cases with significant mortality rates have been reported in Taiwan and other parts of Asia-Pacific, reiterating the urgency to develop antivirals or vaccines and the necessity to better understand the molecular mechanisms involved in host-virus interactions [11].

EV-A71 is a non-enveloped single-stranded RNA virus that contains a 7,500 nucleotide (nt) positive-sense genome, a dual-purpose RNA element that must serve as template for both viral translation and genome replication [12, 13]. Cellular entry is initiated through interactions between the viral capsid and host membrane receptors such as the scavenger receptor B2 (SCARB2), P-selectin glycoprotein ligand-1 (PSGL-1), heparan sulfate, annexin II (Anx2) and sialic acid-linked glycan [14]. Upon cellular entry, the single-stranded positive-sense viral RNA genome is released into the host cytoplasm. Given its limited coding capacity, EV-A71 uses multiple strategies to usurp host factors and modulate viral protein synthesis and replication. In particular, the virus takes advantage of its highly structured 5' untranslated region (5' UTR) to initiate translation through a cap-independent pathway [13]. The 5' UTR is predicted to fold into six stem loops. Stem loop (SL) I adopts a 'cloverleaf' structure known to interact with the viral 3C protease to promote genome replication, whereas stem loops II-VI form the active type I internal ribosome entry site (IRES) involved in the recruitment of the ribosome [15].
The viral genome is translated through a cap-independent pathway, such that a single polyprotein is synthesized and then processed by the viral-encoded proteases 2A and 3C into structural and non-structural proteins [12]. Additionally, the viral-encoded proteases facilitate the shutdown of the host translation and transcription machinery, which produces the ideal environment for viral translation and replication, and ultimately apoptosis [16]. The EV-A71 genome cannot undergo translation and replication concurrently on the same genomic RNA, as the ribosome blocks the 3'-5' progression of the elongating viral RNA polymerase [17]. Thus, the virus coordinates complex processes to transition between these two stages of its replication cycle. Specifically, genomes undergoing replication are translocated to virus-induced vesicles, allowing for spatial separation from those undergoing translation in the cytoplasm [18]. The EV-A71 genome is subsequently replicated through a negative-strand intermediate and packaged into the viral capsid [12] ( signaling changes to normal cellular functions [15, 22].

Competition between the positive and negative regulators suggests a mechanism by which the virus fine-tunes its protein synthesis while simultaneously coordinating the reduction of the host cell's translation levels and overall physiological homeostasis [19, 24]. Interestingly, almost all of these ITAFs are known to interact with the 5' UTR of other picornaviruses to regulate IRES activity and replication, suggesting an evolutionary preference to conserve RNA structural features that drive specific RBP recognition [15, 19]. The current dogma supports a model in which ITAFs cycle the IRES through different conformational states to modulate ribosome assembly; however, the molecular mechanisms by which ITAFs interact with the IRES to regulate this process remain poorly understood.
Therefore, knowledge of the IRES structure and of how ITAFs remodel it to assemble functional complexes is essential to understand how EV-A71 promotes viral protein synthesis to produce progeny virions.

In December of 2019, the city of Wuhan in China witnessed an outbreak of the novel coronavirus disease (COVID-19), which spread to 185 countries and regions within months [25]. According to the Johns Hopkins Coronavirus Resource Center, more than 7 million cases and more than 400,000 deaths had been recorded globally as of June 12, 2020. The rapid spread of the infection is attributed to the ability of the virus to target the respiratory system [26]. The infection presents with symptoms similar to those of previously known coronaviruses, such as dry cough, dyspnea, and fever; however, the virus can infect the lower respiratory airways, leading to multiple organ failure in severe cases [27]. No treatment or vaccine is available to date.

COVID-19 is an infection caused by the severe acute respiratory syndrome-related coronavirus SARS-CoV-2, formerly known as 2019-nCoV [25, 28]. SARS-CoV-2 belongs to the family Coronaviridae, subfamily Orthocoronavirinae, and genus Betacoronavirus, as do MERS-CoV and SARS-CoV [25, 28, 29]. The SARS-CoV-2 genome displays 79.6% sequence identity to SARS-CoV, and 96% similarity to the bat-related coronavirus [30]. The enveloped coronavirus (CoV) genome is a positive-sense, single-stranded RNA, which varies in size from 27 to 32 kb and is 29.9 kb in SARS-CoV-2 [29, 31]. The enveloped virion carries a surface spike protein that binds to the host cell surface receptor, angiotensin converting enzyme 2 (ACE2) [32]. This interaction promotes fusion of the viral and cellular membranes, subsequently releasing the viral contents into the host cytoplasm. Upon cellular entry, the viral RNA genome is uncoated and released into the cytoplasm, where it serves as a template for cap-dependent viral protein synthesis [33, 34] (Figure 1).
The open reading frames ORF1a and ORF1b cover two-thirds of the viral genome and encode two large polyproteins (pp1a and pp1ab), which are post-translationally cleaved into 16 nonstructural proteins (nsps) [29, 34]. Polyprotein translation utilizes a ribosomal frameshifting mechanism that requires 5' cap formation and 3' polyadenylation of the viral genome [35, 36]. Nsp3 (papain-like protease) and, primarily, nsp5 (3C-like protease, 3CLpro) are responsible for processing the polyproteins into mature nsps [34, 37, 38]. Structural and accessory proteins are encoded by the remaining one-third of the viral genome [33].

In CoVs, the 5' and 3' untranslated regions (UTRs), which consist of phylogenetically conserved stem loops, are required for replication and transcription [37, 39]. Multiple nsps and a number of host factors assemble to form the replicase-transcriptase complex (RTC) at the 3' UTR in order to synthesize genomic and subgenomic RNA (sgRNA) through negative-strand template intermediates [34, 39] (Figure 1). The RNA-dependent RNA polymerase (RdRp), or nsp12, replicates genomic RNA and transcribes sgRNA. Transcriptional regulatory sequences (TRSs) guide nsp12 and are present at the 5' leader sequence (TRS-Leader or TRS-L) and at the genes encoding accessory and structural proteins (TRS-Body or TRS-B) [39]. RdRp pauses transcription upon encountering a TRS-B sequence and switches to the TRS-L to transcribe the 5' leader sequence; however, this mechanism is poorly understood [40]. Because sgRNAs share common 5' ends, only the 5' segments of their ORFs are translated into accessory and structural proteins, and the rest of the sequence is treated as an untranslated region [33, 41]. As in other coronaviruses, nsp1 promotes degradation of host mRNA and inhibits host cell translation [42], a common strategy by which positive-sense RNA viruses reprogram the cellular environment [3].
In Murine Hepatitis Virus (MHV), a betacoronavirus, hnRNP A1 and polypyrimidine tract-binding protein (PTB) bind to the 5' UTR, specifically at the TRS, and play a role in RNA synthesis [43, 44]. In addition, the eukaryotic initiation factors (eIFs) 3i, 3f, and 3e, along with other host proteins, assemble in the microenvironment of the replicase-transcriptase complex (RTC) [45]. siRNA-mediated knockdown of these initiation factors reduced RNA replication, indicating their involvement in virus-dependent reprogramming of the cellular environment [45]. These virus-host interactions and their mechanisms of action are yet to be understood. The novelty of SARS-CoV-2 raises many questions concerning the mechanisms by which the virus regulates its gene expression. Obtaining structural details of the UTRs and identifying functional binding sites of RBPs will be deeply insightful in elucidating how this virus replicates within host cells. Focusing on essential RNA-RNA and RNA-protein interactions, such as those involving RdRp, 3CLpro [46, 47], or cellular RBPs, will inform on novel targets to therapeutically inhibit SARS-CoV-2, while simultaneously shedding light on the cellular pathways hijacked by the virus. Despite differences in the life cycles of entero- and coronaviruses, sufficient similarities allow for the parallel discussion of shared features of their biology (Figure 1).

implemented to identify functional RNA structural motifs unlike traditional methods, such as align-and-fold or fold-and-align, which rely on previous sequence alignments [53]. ScanFold decouples these steps, which minimizes computational time and aids researchers studying systems with poor sequence alignments [54]. Potential functional regions on the RNA are identified by analyzing the thermodynamic z-score, from which a single base pair arrangement is assigned to each nucleotide in the input sequence and a structural model is built [54].
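The thermodynamic z-score used in this kind of scanning-window analysis can be illustrated with a minimal sketch. It assumes the usual definition, z = (E_native - mean(E_shuffled)) / sd(E_shuffled), where the shuffled sequences preserve nucleotide composition. The function names (`toy_energy`, `z_score`, `scan`), the window size, and the shuffle count are hypothetical choices for illustration. The energy function is a deliberately crude stand-in (negative count of nested base pairs) and is not the energy model used by ScanFold; a real analysis would substitute a proper minimum free energy calculation.

```python
import random
import statistics

# Canonical and G-U wobble pairs for an RNA alphabet (A, C, G, U).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_energy(seq, min_loop=3):
    """Stand-in energy: minus the maximum number of nested base pairs.

    This is a Nussinov-style dynamic program, NOT a thermodynamic model.
    It only serves to make the z-score calculation runnable.
    """
    n = len(seq)
    if n == 0:
        return 0.0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):  # case: k pairs with j
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return -float(best[0][n - 1])


def z_score(seq, energy_fn=toy_energy, n_shuffles=100, seed=0):
    """Compare the native energy with composition-preserving shuffles."""
    rng = random.Random(seed)
    native = energy_fn(seq)
    shuffled = []
    for _ in range(n_shuffles):
        letters = list(seq)
        rng.shuffle(letters)
        shuffled.append(energy_fn("".join(letters)))
    sd = statistics.pstdev(shuffled)
    if sd == 0:
        return 0.0  # all shuffles identical: no basis for a z-score
    return (native - statistics.mean(shuffled)) / sd


def scan(seq, window=30, step=1, **kwargs):
    """Return (window_start, z_score) for each window along the sequence."""
    return [
        (start, z_score(seq[start:start + window], **kwargs))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    for start, z in scan("GGGAAAACCCAAAAGGGAAAACCC", window=10, step=2):
        print(start, round(z, 2))
```

Windows with strongly negative z-scores are more stable than expected for their nucleotide composition, which is the signal such a scan uses to flag candidate structured regions.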
The ScanFold pipeline was previously benchmarked against experimentally supported models of the well-studied HIV-1 genome [53], and it has been used recently to identify thermodynamically stable RNA structures throughout the SARS-CoV-2 genome (https://www.biorxiv.org/content/10.1101/2020.04.17.045161v1). A similar approach can be employed to identify probable functional regions along the genomes of other RNA viruses, which are thought to coordinate multiple aspects of their replication cycles by co-opting RBPs (Figure 1). Furthermore, ScanFold results can be complemented with sequence alignment data to identify functional regions that have been evolutionarily conserved in RNA viruses. Although ScanFold can identify RNA sequences with the potential to form stable structures, the obvious "limitation" is that the structures are predicted and therefore need to be validated experimentally by the other structural methods described in this article (Figure 2). In particular, these structures predicted in silico can be further confirmed experimentally [55].

DMS modifies non-base-paired adenosine and cytosine nucleotides present in bulges, loops, and other regions where the Watson-Crick edges of these bases are exposed. Using thermostable group II intron reverse transcriptase (TGIR-II), the modified RNA is reverse transcribed, and the enzyme introduces a mutation wherever it encounters a methylated nucleotide. High-throughput sequencing of the products will generate DMS driven RNA secondary

identify the accessibility profile of RNA regions that are exposed, and this information can aid in the determination of the structural changes that accompany RNA-RNA and/or RNA-protein interactions [56]. To study RNA structural accessibility, antisense RNA probing uses a Structural Sensing System (iRS3), a previously designed in vivo sensor in which an antisense RNA probe hybridizes to the RNA of interest.
This probe consists of a 9 to 16 nucleotide RNA complementary to its target, a stem loop structure that blocks the ribosome binding site (RBS) by binding to the cis-acting region, and a GFP-encoding region that generates a fluorescent output signal [56].

The CLIP-seq framework was used in a recent study to identify binding sites for the splicing regulators hnRNP A1 and hnRNP H1 along the HIV-1 genome [58]. That study revealed RBP binding sites proximal to the splice acceptor and donor signals that control HIV splicing. Mutations of select binding sites changed HIV splicing patterns and, in turn, affected viral replication. NMR spectroscopy experiments carried out on protein-RNA interactions identified from CLIP-seq offered additional mechanistic insights into the sequence-specific recognition of HIV targets by the hnRNP H protein. Given the large number of RBPs known to interact with genomic and subgenomic viral RNAs to modulate translation, replication and the shift between these two stages, CLIP-seq can be employed to understand virology at the molecular level. Altogether, this methodology provides the experimental scaffold to study host-virus interactions more comprehensively and to devise novel strategies for antiviral design. As demonstrated by the HIV CLIP-seq study, it is useful to couple CLIP-seq with other structural approaches to provide more mechanistic details on the RNA physicochemical features that contribute to the formation of functional RNP complexes (Figure 2

complementary information on all nucleotide types exposed within unpaired regions and those likely involved in long-range tertiary interactions. Once secondary structural models are known, high-resolution 3D NMR structures can be obtained in association with computational methods (https://doi.org/10.1101/2020.04.14.041962), producing models based on integrated data sets.
By adapting this integrated approach, cross-validated viral RNA structural models can be determined for the 3' UTR of SARS-CoV-2 as well as for other viral RNAs. In addition to structure, the RNA-protein interactions in the SARS-CoV-2 RTC and their effects on RNA folding are yet to be studied. CLIP-seq can identify the RNA targets of RBPs at the nucleotide level in vivo. With prior knowledge of the binding sites, antisense RNA probing can confirm these interactions and identify their effects on RNA folding. When these data are coupled with NMR, binding sites for sub-domains of the RTC can be mapped onto the RNA structure. Since NMR assignments will be available, it should be straightforward to assess the extent of binding-induced conformational changes. Although the RTC is used as one example, these integrated technologies and others like them should provide mechanistic insights into a wide range of host-virus pathways (Figure 2).

For instance, RdRp plays a primary role in viral transcription and replication [40]. Uncovering its binding site on the 3' UTR can aid in designing a site-specific drug to prevent this interaction. Also, monitoring how the structural interactions of the 3' pseudoknot change in the presence of RdRp can give insight into the molecular switching mechanism for RNA synthesis. This can be tested under the influence of different viral or cellular factors and under different environmental conditions. In addition, RdRp transcribes sgRNA under the guidance of TRSs and host proteins [40]. Identifying the binding sites of specific host proteins on the TRS will help in investigating the mechanism of protein recruitment for transcription. Several other RNA-dependent processes can be studied under the effect of viral or host factors. Like SARS-CoV-2, EV-A71 and other viruses can be effectively studied by integrating structural techniques. This blueprint can be adopted for any RNA virus.
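One way to picture the integration step described above is to overlay a chemical-probing readout onto a protein binding site. The sketch below is a simplified illustration under stated assumptions, not a published pipeline: reads are assumed to be ungapped alignments given as (start, sequence) pairs, sequences use the RNA alphabet, binding sites are half-open (start, end) intervals such as a CLIP-seq peak might provide, and the 0.05 mutation-rate threshold is a hypothetical cutoff. All function names are invented for this example.

```python
def dms_mutation_rates(reference, reads):
    """Per-position mismatch rate from ungapped (start, sequence) alignments.

    Positions with no coverage are reported as None.
    """
    coverage = [0] * len(reference)
    mismatches = [0] * len(reference)
    for start, read in reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if 0 <= pos < len(reference):
                coverage[pos] += 1
                if base != reference[pos]:
                    mismatches[pos] += 1
    return [m / c if c else None for m, c in zip(mismatches, coverage)]


def accessible_positions(reference, rates, threshold=0.05):
    """A and C positions whose mutation rate meets the (assumed) threshold.

    Only A and C are considered because DMS reports on unpaired A and C.
    """
    return [
        i
        for i, (base, rate) in enumerate(zip(reference, rates))
        if base in "AC" and rate is not None and rate >= threshold
    ]


def site_accessibility(reference, rates, site, threshold=0.05):
    """Fraction of A/C positions inside a half-open site that look accessible."""
    start, end = site
    probed = [i for i in range(start, end) if reference[i] in "AC"]
    if not probed:
        return None  # no informative nucleotides in this site
    hits = set(accessible_positions(reference, rates, threshold))
    return sum(1 for i in probed if i in hits) / len(probed)


if __name__ == "__main__":
    reference = "GACUACGA"
    reads = [(0, "GACUACGA"), (0, "GUCUACGA"), (2, "CUGCGA")]
    rates = dms_mutation_rates(reference, reads)
    print(accessible_positions(reference, rates))
    print(site_accessibility(reference, rates, (0, 4)))
```

Comparing `site_accessibility` for the same site in the free RNA and in the presence of a binding partner would give a rough indication of protection or remodeling at that site, which could then be followed up with NMR.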
Studying RNA structural interactions and the effects of viral and host RBPs on RNA structure and function is essential for understanding translation, replication, and transcription, and thus how viruses reprogram the cellular environment. Integrating these approaches can help in designing and testing more effective and site-specific small molecules. As we deal with the COVID-19 pandemic and prepare for the next viral outbreak, we hope that this article will encourage the adoption of integrative (collaborative) approaches that will enhance our understanding of viral RNA structures, their interactions with cognate proteins, and in turn the mechanisms by which viruses reprogram the cellular environment. With this comprehensive level of knowledge, we expect that the discovery of novel therapeutic agents will be accelerated.

Insights into RNA structure and function from genome-wide studies
Diverse roles of host RNA binding proteins in RNA virus replication
Virus and Human Disease
An Apparently New Enterovirus Isolated from Patients with Disease of the Central Nervous System
Neurotropic enterovirus infections in the central nervous system
Understanding enterovirus 71 neuropathogenesis and its impact on other neurotropic enteroviruses
NIAID Emerging Infectious Diseases/Pathogens
List of Blueprint priority diseases
Severe enterovirus A71 associated hand, foot and mouth disease
Recent advances in the molecular epidemiology and control of human enterovirus 71 infection. Current Opinion in Virology
Host and virus determinants of picornavirus pathogenesis and tropism
Enterovirus 71 contains a type I IRES element that functions when eukaryotic initiation factor eIF4G is cleaved
Receptors for enterovirus 71. Emerging Microbes & Infections
Host Factors in Enterovirus 71 Replication
EV71 3C protease induces apoptosis by cleavage of hnRNP A1 to promote apaf-1 translation
Viral and host proteins involved in picornavirus life cycle
Host Factors in Positive-Strand RNA Virus Genome Replication
Far upstream element binding protein 2 interacts with enterovirus 71 internal ribosomal entry site and negatively regulates viral translation
Heterogeneous nuclear ribonuclear protein K interacts with the enterovirus 71 5′ untranslated region and participates in virus replication
HnRNP A1 Alters the Structure of a Conserved Enterovirus IRES Domain to Stimulate Viral Translation
Regulation Mechanisms of Viral IRES-Driven Translation
HuR and Ago2 Bind the Internal Ribosome Entry Site of Enterovirus 71 and Promote Virus Translation and Replication
Far upstream element binding protein 1 binds the internal ribosomal entry site of enterovirus 71 and enhances viral translation and viral growth
Research and Development on Therapeutic Agents and Vaccines for COVID-19 and Related Human Coronavirus Diseases
COVID-19 and the cardiovascular system
The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak
The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2
Bat coronaviruses in China. Viruses
A pneumonia outbreak associated with a new coronavirus of probable bat origin
The origin, transmission and clinical therapies on coronavirus disease 2019 (COVID-19) outbreak-an update on the status
Cell entry mechanisms of SARS-CoV-2
Coronaviruses: methods and protocols
SARS and MERS: recent insights into emerging coronaviruses
Direct RNA sequencing and early evolution of SARS-CoV-2. bioRxiv
Inhibition of the main protease 3cl-pro of the coronavirus disease 19 via structure-based ligand design and molecular modeling
Recombinant SARS-CoV nsp12 and the use of thereof and the method for producing it
Potential therapeutic agents for COVID-19 based on the analysis of protease and RNA polymerase docking
Coronaviruses: an overview of their replication and pathogenesis
Continuous and discontinuous RNA synthesis in coronaviruses. Annual Review of Virology
Viral and cellular mRNA translation in coronavirus-infected cells
A two-pronged strategy to suppress host protein synthesis by SARS coronavirus Nsp1 protein. Nature Structural & Molecular Biology
Polypyrimidine tract-binding protein binds to the leader RNA of mouse hepatitis virus and serves as a regulator of viral transcription
Heterogeneous nuclear ribonucleoprotein A1 binds to the transcription-regulatory region of mouse hepatitis virus RNA
Determination of host proteins composing the microenvironment of coronavirus replicase complexes by proximity-labeling. eLife
Structural elucidation of SARS-CoV-2 vital proteins: computational methods reveal potential drug candidates against Main protease, Nsp12 RNA-dependent RNA polymerase and Nsp13 helicase
Coronavirus 3CLpro proteinase cleavage sites: Possible relevance to SARS virus pathology
Ribosolve: rapid determination of three-dimensional RNA-only structures. bioRxiv
Progress and challenges for chemical probing of RNA structure inside living cells
Advances in CLIP Technologies for Studies of Protein-RNA Interactions
High-throughput determination of RNA structures
Integrated structural biology to unravel molecular mechanisms of protein-RNA recognition
ScanFold: an approach for genomewide discovery of local RNA structural elements-applications to Zika virus and HIV
Mapping the RNA structural landscape of viral genomes
Viral RNA structure analysis using DMS-MaPseq
Antisense probing of dynamic RNA structures
Identifying the RNA targets of RNA binding proteins with CLIP
Genome-Wide Analysis of Heterogeneous Nuclear Ribonucleoprotein (hnRNP) Binding to HIV-1 RNA Reveals a Key Role for hnRNP H1 in Alternative Viral mRNA Splicing
Applications of NMR to structure determination of RNAs large and small

This work was funded by National Institutes of Health grants U54AI50470 (the Center for HIV RNA Studies) and R01GM126833 (BST).