key: cord-349839-s32d3di2
authors: Westhof, Eric; Jaeger, Luc
title: RNA pseudoknots
date: 1992-06-30
journal: Current Opinion in Structural Biology
DOI: 10.1016/0959-440x(92)90221-r
sha:
doc_id: 349839
cord_uid: s32d3di2
Abstract
RNA pseudoknots result from Watson-Crick base pairing involving a stretch of bases located between paired strands and a distal single-stranded region. Recently, significant advances in our understanding of their structural and functional aspects have been accomplished. At the structural level, modelling and NMR studies have shown that a defined subset of pseudoknots may be considered as tertiary motifs in RNA foldings. At the functional level, there is evidence that the realm of functions encompassed by RNA pseudoknots extends from the control of translation in prokaryotes, retroviruses and coronaviruses to the control of catalytic activity in ribozymes and the control of replication in some plant viruses.
The possibility of pseudoknot formation has been suggested ever since the early thinking on the folding of single-stranded RNA molecules [1]. It was not until the experimental and theoretical work of Pleij and his collaborators [2], however, that the existence of pseudoknots in RNA molecules became recognized. Recently, a variety of biological functions have been attributed to pseudoknotting in RNA single strands. In the first part of this review, we will examine the problems related to the definition of pseudoknots and will try to convey the message that pseudoknots with at least some approximate co-axial stacking between two adjacent helices are of particular value for understanding and modelling RNA three-dimensional foldings. In the second part, we will discuss the various functional aspects of pseudoknotting and the advances made during the past year. Finally, we will comment on the difficulties in predicting and recognizing pseudoknots in RNA sequences.
Current Opinion in Structural Biology 1992, 2:327-333
In the folding of a single-stranded RNA molecule, there are only three ways in which two base-paired segments can be related to each other: two consecutive hairpins; two helices separated by an internal bulge; and pseudoknots [1]. The first two motifs can be represented as two-dimensional graphs without self-intersections whereas pseudoknots cannot, as they are fundamentally three-dimensional structures in which the four base-paired strands alternate along the sequence of the RNA molecule. This general definition of pseudoknotting is illustrated in Fig. 1. In other words, pseudoknots can be most generally defined as standard Watson-Crick base pairing involving a stretch of bases located between paired strands and an outlying partner (e.g. Watson-Crick interactions between two hairpin loops or between a loop and a bulge) [2,3,4]. In an important piece of work, Pleij and his collaborators [2] demonstrated that because of the special geometry of RNA helices, it is possible for the two helical stems in the pseudoknot to be co-axially stacked. This arrangement constitutes a very important subset of pseudoknots for biological functions.
The definition portrayed in Fig. 1 is general and does not set any constraint on either the lengths of the connecting segments or the helices. A full turn of RNA double helix requires 11 bp. When each of the helices forming a pseudoknot makes a full turn, a topologically knotted structure is obtained. In all proposed pseudoknot structures, however, the helical stems are shorter than a full helical turn. Further, the minimum length of the connecting segments bridging the two grooves depends clearly on the length of S1 and S2, i.e. on the number of base pairs that each contains. Thermodynamic studies have been carried out only on a pseudoknot of type (ii) which contained 3 bp in helix S1 and 5 bp in helix S2 [4,8].
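The general definition given in this section, in which the four base-paired strands alternate along the sequence, can be sketched in code. The following is a minimal illustration, not a method from the review: it assumes a structure is supplied as a list of base pairs given as 0-based sequence positions, and it flags two pairs (i, j) and (k, l) as pseudoknotted when i < k < j < l. The function names and the toy structures are hypothetical.

```python
from itertools import combinations


def crossing_pairs(pairs):
    """Return all couples of base pairs (i, j), (k, l) with i < k < j < l.

    Such couples cannot be drawn as a two-dimensional graph without
    self-intersection, which is the general definition of a pseudoknot.
    """
    # Normalize each pair so that the 5' position comes first, then sort.
    normalized = sorted(tuple(sorted(p)) for p in pairs)
    crossings = []
    for (i, j), (k, l) in combinations(normalized, 2):
        # Sorting guarantees i <= k; the pairs cross when the second pair
        # opens inside the first one and closes outside it.
        if i < k < j < l:
            crossings.append(((i, j), (k, l)))
    return crossings


def is_pseudoknotted(pairs):
    """True if at least two base pairs alternate along the sequence."""
    return bool(crossing_pairs(pairs))


# Hypothetical toy structures (positions are illustrative only).
hairpins = [(0, 8), (1, 7), (10, 18), (11, 17)]    # two consecutive hairpins
nested = [(0, 20), (1, 19), (4, 15), (5, 14)]      # two helices around an internal bulge
pseudoknot = [(0, 12), (1, 11), (2, 10),           # helix S1
              (6, 18), (7, 17), (8, 16)]           # helix S2

for name, structure in [("hairpins", hairpins), ("nested", nested),
                        ("pseudoknot", pseudoknot)]:
    print(name, is_pseudoknotted(structure))
```

Only the third structure is reported as pseudoknotted, because every pair of its first helix alternates with every pair of its second helix. The check says nothing about co-axial stacking or loop lengths, which depend on three-dimensional geometry.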
In the presence of Mg²⁺, a stable pseudoknot structure is obtained when L1 is equal to or larger than three nucleotides and L3 is equal to or larger than four nucleotides [8]. With six base pairs in S2, however, one could expect that a single nucleotide in L1 is sufficient to obtain a stable pseudoknot structure. When the connecting segments contain a large number of nucleotides, they may be structured and partially base-paired. Several situations can be envisaged. If L1 or L3 is large, a type (ii) pseudoknot becomes equivalent to either a type (i) or type (iii), respectively. Types (i) and (iii) could still be described as motifs even if the segment crossing the whole helical stem is long and structured. Where there is no short loop over either groove, however, the notion of structural motif is obviously lost. Figure 3 illustrates schematically another type of RNA tertiary motif, which also is the outcome of the particular geometry of RNA helices and which exhibits stacking of two helices. It results from the formation of base triples between a single strand and a double-stranded helix (either in the deep groove or in the shallow groove of the helix) and can lead to a segment of small triple helix.
Fig. 2. The three types of 'classic' pseudoknots with co-axial stacking of the two base-paired helices and with single-stranded segments crossing the RNA grooves. RNA segments forming double-stranded helices are labelled S1 and S2; the connecting RNA segments are labelled L1, L2 and L3 (same colour code as in Fig. 1). Right-handed RNA helices (similar to A-form DNA helices) are characterized by a deep but narrow groove (equivalent to the major groove of B-DNA) and a shallow but wide groove (equivalent to the minor groove of B-DNA). In the central drawing of a hypothetical circular RNA, the connecting segment (i) crosses the deep groove whereas segment (iii) crosses the shallow groove. Segment (ii) bridges the whole length of the co-axial helices.
Fig. 3.
Two-dimensional and three-dimensional diagrams illustrating how a pseudoknot can be related to a triple helix. Instead of crossing the deep groove, the single-stranded 3' end of helix H1 (in black) interacts with the side of the bases exposed in the deep groove of helix H2 (in grey). Similarly, instead of crossing the shallow groove, the single-stranded 5' end of helix H2 (in grey) interacts with the shallow groove of helix H1 (in black). A three-dimensional example of such a double triple-helix motif has been proposed in the core of the catalytic introns of group I [25].
The formation of classic pseudoknots leads to a compact structure, the stability of which can be affected by the size of the loops and the number of base pairs. Thermodynamic data on these aspects are still scarce. Recent work has indicated that pseudoknots are only marginally more stable than their constituent hairpins (by less than 2 kcal mol⁻¹) [4,9]. If upheld by further evidence, this observation suggests a role for pseudoknots as conformational switches or control elements in several biological functions [10].
In molecules that lack an overall three-dimensional fold, pseudoknots fold locally and their positions along the sequence reflect their function [11]. For example, pseudoknots that are folded at the 5' end of mRNAs tend to be involved in translational control whereas those at the 3' end maintain intact signals for replication. In molecules with catalytic and mechanistic activities, pseudoknots are located at the core of the tertiary fold and involve nucleotides that are far apart in the sequence.
Pseudoknots appear to adopt two roles in the control of mRNA translation: either specific recognition of a pseudoknot by some protein is required for control, as described for the 5' end of mRNAs in some prokaryotic systems
The presence of three pseudoknots in 16S rRNA has been suggested on the basis of comparative sequence analysis (reviewed in [2]).
The first pseudoknot lies between the 5' end and the region around nucleotide 915. The second is located at a phylogenetically conserved tertiary interaction between positions 570 and 866, also in the 5' domain [2]. The third pseudoknot involves the 530 stem-loop structure [19], which is known to be important for the binding of tRNA to the ribosomal A site [20], and was recently shown to be essential for ribosomal function [21]. Interestingly, some mismatches such as A.C or C.A in the third pseudoknot were lethal, whereas the wobble base pairs (G.U, U.G) conferred resistance to streptomycin, an antibiotic that perturbs the control of translational accuracy [21]. The same 530 region interacts with protein S12 [22]. Mutations in S12 lead to streptomycin resistance and give rise to hyperaccurate ribosomes [21]. These observations are particularly interesting in view of the suggested conformational switch that involves two pseudoknots, one at the 5' end and one in the region around nucleotide 915 [23]. The equilibrium between the two states would be controlled by S12 and S4 (a protein which has effects opposite to those of S12 with respect to translational accuracy) [23]. Clearly, further experiments are required to prove this model. There has, however, been a suggestion of an alternative conformational switch between the pseudoknot at position 915 and a new one that involves the anti-Shine-Dalgarno sequence [24].
Like the pseudoknot at the 5' end of the 16S rRNA, the pseudoknots at the core of the catalytic RNAs of group I introns [25] and ribonuclease P [26,27,28] are highly conserved. A pseudoknot in the satellite RNA of barley yellow dwarf virus, which belongs to the 'hammerhead' class, has been characterized [29]. It has been suggested that pseudoknotting stabilizes the structure required for the self-cleavage of the hepatitis delta virus RNA [30].
Secondary structures for genomic and antigenomic sequences that share a common stem-loop axehead motif could be developed without a pseudoknot [31], indicating that pseudoknotting might not be needed for the catalytic action. In support of this, a trans-active ribozyme was engineered from the normally cis-active molecule [31,32]. Though not capable of catalytic activity in the absence of protein [33], the mitochondrial-RNA-processing RNase could, like the RNA component of RNase P, possess a cage-shaped structure centered around a pseudoknot [26].
The existence of two pseudoknots in eubacterial ribonuclease P RNA is attested by phylogenetic and mutational analyses [28]. The second pseudoknot (which base pairs nucleotides 82-85 with 276-279, using Escherichia coli nomenclature) is absent in Bacillus subtilis and it was suggested, on the basis of molecular models, that an additional helix present in Bacillus (and absent in E. coli) could hold an equivalent architectural function [28]. Although these two pseudoknots impose topological constraints on the three-dimensional fold, neither is considered absolutely essential for catalytic activity in vitro [26].
Similarly, in the self-splicing introns of group I, a new pseudoknot, P11, was shown to be not essential for self-splicing activity in vitro; however, mutants with disrupted P11 required a higher concentration of MgCl₂ for self-splicing to occur [3]. By contrast, mutants that strengthen base pairing in P11 self-splice more efficiently than the wild type at high temperature [3]. This peripheral pseudoknot, which is present only in the subgroup IA, therefore assists the formation and the maintenance of an active conformation without itself being necessary for catalysis. Two similar pseudoknots, involving one terminal loop and a distal section of the molecule, have already been proposed for intron-exon interactions in group II introns [34, 35].
The ribozyme in the satellite RNA of barley yellow dwarf virus [29] undergoes self-cleavage at a low rate which was attributed to the presence of a pseudoknot. Base substitutions that prevented folding of the pseudoknot increased the self-cleavage rate up to 400-fold, whereas compensatory mutations reduced the self-cleavage rate in proportion to the helical stability of the helix responsible for pseudoknot formation [29].
Pseudoknotting underlies the tRNA-like motifs at the 3' end of several groups of plant viral RNA genomes (for a review, see [11]). This structural similarity is paralleled in biological function as the tRNA-like motifs are recognized by many tRNA-specific enzymes such as aminoacyl-tRNA synthetases, nucleotidyl transferase, or RNase P [11]. The tRNA-like structure has been shown to be necessary for the initiation of replication of a positive-sense RNA virus, brome mosaic virus (BMV) [11]. A telomeric function of the tRNA-like structure of BMV was also demonstrated in vivo [36], in agreement with the 'genomic tag model' associated with such 3'-terminal tRNA-like motifs [37]. The dependence of replication on aminoacylation appears to be variable. The turnip yellow mosaic virus (TYMV) depends on valylation for replication [38] whereas BMV, which is normally tyrosylated, does not [39]. On the basis of this observation, transcripts bearing mutations in the tRNA-like domain on one of the three genomic RNAs of BMV (which lead to decreased tyrosylation function in vitro) are able to interfere in trans with the replication of the other viral RNAs when inoculated in barley protoplasts [39]. A mutational analysis of the pseudoknot in TYMV concluded that mutations in the loop bridging the shallow groove have stronger effects on valylation efficiency than those in the segment crossing the deep groove [9].
Recently, the stretch of three pseudoknots preceding the tRNA-like structure in tobacco mosaic virus was shown to act as the functional equivalent of a poly(A) tail, stabilizing a reporter mRNA and increasing gene expression up to 100-fold [40].
The formation of pseudoknots is a three-dimensional process and is central to the RNA folding pathway. Yet, at present, pseudoknot formation is incompatible with the dynamic programming method used in the present algorithms for predicting RNA secondary structures [41, 42]. Nevertheless, two programs for the prediction of pseudoknot formation have been developed recently [5, 43]. Both simulate the RNA fold by a sequential selection around the most stable stems [5, 43] and, further, by sequential addition of folding domains (200-400 bp) during RNA synthesis [43]. An implicit assumption in the prediction of secondary structures is that all structural elements co-exist in the final folded form; however, the possibility of alternative or transient pairings during the folding pathway or the function of the RNA should be kept in mind.
Generally, it is difficult to prove the existence of pseudoknots merely on the basis of phylogenetic comparisons of only a small number of sequences. For example, a phylogenetically conserved sequence for pseudoknots does not necessarily mean that it is an important structural signal for the bicoid mRNA [44], or the RNA template in telomerase [45, 46], or the RNA in the signal recognition particle [47]. In these instances, systematically directed mutagenesis (with double mutants) coupled to chemical probing and functional tests would be required in order to reach consistent and biologically relevant structures.
In contrast, with a large number of sequences, phylogenetic comparisons can be the most powerful approach for determining both the two-dimensional structure and the possible pseudoknots, which are then considered as those Watson-Crick pairings that are not contained in the secondary pairings [19, 25, 27, 35]. Achieving these goals requires rigorous investigations on the frequency of compensatory base changes and the frequency of the underlying mutational events in the helical stems. Additionally, it is necessary to develop a new method, such as that described in [25], with which to distinguish those coordinated changes that result from authentic molecular constraints from those that arise from historical contingencies.
In addition to demonstrating the requirements of both a pseudoknot and a heptameric shift site for frameshifting, the authors investigate the influence of the sequence at the shift site and especially of the seventh nucleotide of the generalized X XXY YYZ shift sequence (where X and Y are often A or U, and YYZ is usually AAC or UUA). Thus, the human immunodeficiency virus (HIV) shift site has the same frameshifting efficiencies as MMTV and IBV shift sites in the MMTV context (20%), whereas it is sixfold more efficient in the MMTV context than in its native HIV context. Also, changing the seventh nucleotide of the shift site from a C to U, A, or G provokes a 10-fold decrease in frameshifting efficiency. This last result points to a possible role of modified bases at position 34 in the anticodon loop of the tRNA binding to the YYZ loop.
A compelling demonstration that a pseudoknot located eight nucleotides 3' of the UAG stop codon is required for translational read-through in MuLV. Although the nucleotide sequence in the helical stems of the pseudoknot is inconsequential for read-through, base substitutions in L3 diminish or stimulate read-through whereas insertion of three nucleotides in L1 or deletion of three nucleotides in L3 abolish read-through. Interestingly, replacement of the MuLV pseudoknot by the MMTV pseudoknot eight nucleotides from the UAG stop codon resulted in detectable (10%) read-through.
Evidence for Several Higher Order Structural Elements in Ribosomal RNA
Transfer RNA Shields Specific Nucleotides in 16S Ribosomal RNA from Attack by Chemical Probes
A Functional Pseudoknot in 16S Ribosomal RNA
Interaction of Ribosomal Proteins S5, S6, S11, S12, S18, and S21 with 16S rRNA
Gingras L: A Conformational Switch Involving the 915 Region of Escherichia coli 16S
Alternative Base Pairing Between 5'- and 3'-Terminal Sequences of Small Subunit RNA May Provide the Basis of a Conformational Switch of the Small Ribosomal Subunit
Similar Cage-shaped Structures for the RNA Component of All Ribonuclease P and Ribonuclease MRP Enzymes
Structure and Evolution of Ribonuclease P RNA
NR: Long-range Structure in Ribonuclease P RNA
A first attempt at finding equivalent three-dimensional motifs in RNA.
29. Alternative Tertiary Structure Attenuates Self-cleavage of the Ribozyme in the Satellite RNA of Barley Yellow Dwarf Virus
A Pseudoknot-like Structure Required for Efficient Self-cleavage of Hepatitis Delta Virus RNA
Efficient trans-Cleavage and a Common Structural Motif for the Ribozymes of the Human Hepatitis Delta Agent
Analysis of RNA secondary structures, together with in vitro RNA transcripts, leads to the production of a trans-active ribozyme which could be used as a therapeutic agent. The results suggest also that some RNA structures facilitate the folding process to an active ribozyme without themselves being involved in the catalytic reactions.
Sequence and Structure of the Catalytic RNA of Hepatitis Delta Virus Genomic RNA
Enzymatic and chemical probing indicate that at least two stem-loop structures are required for catalysis. The structure of the region involved in pseudoknotting by [30] is uncertain but its tertiary structure affects the efficiency of cleavage.
Secondary Structure of the RNA Component of a Nuclear Mitochondrial Ribonucleoprotein
Multiple Exon-binding Sites in Class II Self-splicing Introns
Comparative and Functional Anatomy of Group II Catalytic Introns: a Review
Telomeric Function of the tRNA-like Structure of Brome Mosaic Virus RNA
tRNA-like Structures Tag the 3' Ends of Genomic RNA Molecules for Replication: Implications for the Origin of Protein Synthesis
Turnip Yellow Mosaic Virus RNAs with Anticodon Loop Substitutions that Result in Decreased Valylation Fail to Replicate Efficiently
Hall TC: Interference in trans with Brome Mo
Each of the three genomic RNAs of BMV carries a 3'-end tRNA-like domain. RNA-2 mutants with a deficiency in tyrosylation activity of the 3'-end tRNA-like domain are able to interfere in trans with the synthesis and accumulation of the viral RNAs when inoculated in barley protoplasts.
Tobacco Mosaic Virus tRNA-like Structure in Cytoplasmic Gene Regulation
On Finding All Suboptimal Folding of an RNA Molecule
The Equilibrium Partition Function and Base Pair Binding Probabilities for RNA Secondary Structure
The Computer Simulation of RNA Folding Involving Pseudoknot Formation
Bicoid mRNA Localization Signal: Phylogenetic Conservation of Function and RNA Secondary Structure
A Conserved Secondary Structure for Telomerase RNA
A Conserved Pseudoknot in Telomerase RNA
SRP-RNA Sequence Alignment and Secondary Structure
15 rue R. Descartes, F-67084
We thank our colleagues for sharing reprints with us, and we especially thank F Michel and Y-M Hou for their constructive and critical reading of the manuscript.
References and recommended reading
A state-of-the-art mutational analysis of a functional pseudoknot. The authors investigate the frameshift signal which is comprised of a heptameric shift site and a downstream RNA pseudoknot. They show that base pair formation at the junctions between the two helices forming the pseudoknot is not a pre-requisite for efficient frameshifting. The primary structure is not determinant as long as the overall pseudoknot structure is maintained. Although small insertions or deletions in the loops of the pseudoknot have no effect on frameshifting, insertion or deletion of three nucleotides in the six nucleotides separating the end of the shift site from the first helix of the pseudoknot abolishes frameshifting. It is further shown that the pseudoknot cannot be replaced by an equivalent stem-loop structure.
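The frameshift signal described in these annotations, a heptameric shift site followed after a short spacer by a pseudoknot, lends itself to a simple sequence scan. The sketch below is illustrative only and is not taken from the cited work. It assumes the heptamer is read as three identical nucleotides X, three identical nucleotides Y and any nucleotide Z (X XXY YYZ), and it uses the six-nucleotide spacer reported above as a default. The names `SHIFT_SITE`, `find_shift_sites` and `downstream_helix_start`, as well as the toy sequences, are hypothetical.

```python
import re

# Assumed reading of the heptameric shift site: XXX YYY Z, where X and Y are
# each one repeated nucleotide and Z is any nucleotide. The lookahead lets
# overlapping candidates be reported.
SHIFT_SITE = re.compile(r"(?=(([ACGU])\2\2([ACGU])\3\3[ACGU]))")


def find_shift_sites(rna):
    """Return (start, heptamer) for every candidate shift site (0-based start)."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in SHIFT_SITE.finditer(rna)]


def downstream_helix_start(site_start, spacer=6):
    """Position where the first pseudoknot helix would begin.

    The default spacer of six nucleotides is the value quoted in the
    annotation above; other signals may use a different distance.
    """
    return site_start + 7 + spacer


# Toy sequences, not real viral sequences.
print(find_shift_sites("GCAAAAAACGGG"))   # -> [(2, 'AAAAAAC')]
print(find_shift_sites("ccTTTAAACgg"))    # -> [(2, 'UUUAAAC')]
print(downstream_helix_start(2))          # -> 15
```

A match only marks a candidate site. As the annotations stress, efficient frameshifting also requires the downstream pseudoknot at the correct distance, which a pattern search of this kind cannot establish.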