key: cord-0307290-pur7k5oc
authors: Rouse, Warren B.; Gart, Jessica; Peysakhova, Lauren; Moss, Walter N.
title: Analysis of RNA sequence and structure in key genes of Mycobacterium ulcerans reveals conserved structural motifs and regions with apparent pressure to remain unstructured
date: 2021-11-23
journal: bioRxiv
DOI: 10.1101/2021.11.23.469657
sha: 5f23e057ce3c7280fc9ffc210ba29e8900d97e80
doc_id: 307290
cord_uid: pur7k5oc

Buruli Ulcer is a neglected tropical disease that results in disfiguring and potentially dangerous lesions in affected persons across a wide geographic area, which includes much of West Africa. The causative agent of Buruli Ulcer is Mycobacterium ulcerans, a relative of the bacteria that cause tuberculosis and leprosy. Few therapeutic options exist for the treatment of this disease beyond the main approach, surgical removal, which is frequently ineffective. In this study we analyze six genes in Mycobacterium ulcerans that have high potential for therapeutic targeting. We focus our analysis on a combined in silico and comparative sequence study of potential RNA secondary structure across these genes. The end result of this work is a comprehensive local RNA structural landscape across each of these significant genes. This revealed multiple sites of ordered and evolved RNA structure interspersed between sequences that either have no bias for structure or, indeed, appear to be ordered to be unstructured and (potentially) accessible. In addition to providing data that could be of interest to basic biology, our results provide guides for efforts aimed at targeting this pathogen at the RNA level. We explore this latter possibility through the in silico analysis of antisense oligonucleotides that could be used to target pathogen RNA.

Author Summary

Buruli Ulcer is a neglected tropical necrotizing skin disease endemic to West Africa and several other developing countries.
The disease is known to be caused by Mycobacterium ulcerans, but the mode of transmission is not well understood. Here, we present findings on the RNA secondary structural landscape of key genes found in its genome and virulence plasmid. We also suggest potential therapeutic strategies to treat this disease that leverage a better understanding of RNA secondary structure. In our analysis we have predicted regions within these genes that are potentially ordered by evolution to have unusual structural stability and likely functionality, as well as regions that lack stable structure and may be unordered for accessibility. The structured regions can act as potential targets of both antisense oligonucleotide and small molecule therapeutics, while the unstructured regions may be best suited to antisense oligonucleotides alone. Both strategies have proven effective against other bacterial and viral pathogens; therefore, adapting them to this neglected disease may help in developing more effective and efficient treatment options. Through our analysis of the RNA secondary structure landscape of key genes in M. ulcerans, we hope to provide other researchers with new avenues for development of novel therapeutic strategies to treat this devastating and neglected disease.

z-score ranged from -0.82 to -0.03, while the average ED ranged from 20.64 to 26.85. Here we see that the gene with the lowest (most favorable) ΔG and highest GC% (Mul_RS01615) has the highest (least favorable) average z-score and highest ED. The gene with the lowest average z-score (Mul_RS04730) also showed the highest percent of windows with z-score < -1 (39.55%). This did not hold true for z-scores < -2 (two standard deviations more stable than random), but the gene with the second lowest average z-score (Mul_RS01365) showed the highest percent of windows with z-scores < -2 (14.76%).
The gene with the smallest fraction of its nucleotides spanned by low z-score windows was Mul_RS01615 (19.0% and 4.4% for the -1 and -2 z-score cutoffs, respectively).

Table 1. Summary of ScanFold data across each target gene.*
*The ScanFold data as found across all windows spanning each gene. From left to right, the columns correspond to percent GC content, average MFE, average ED, average z-score, number of windows generated, percent of windows with z-score < -1, percent of windows with z-score < -2, total number of possible base pairs, percent of base pairs with z-score < -1, percent of base pairs with z-score < -2, and number of motifs with a z-score < -2.

In the prediction of the MFE ΔG for each analysis window, a model secondary structure is also generated. Across each gene this resulted in many predicted base pairs, where a specific nucleotide can be paired differently across several overlapping windows. As a result, many potential base pairing partners may be predicted per nucleotide. ScanFold-Fold predicts a single structural context (paired or unpaired) for each nucleotide based on its contributions to low z-score windows (indicating ordered stability and, potentially, function). The genes that had the greatest percentages of low z-score base pairs were Mul_RS04730 and Mul_RS01365; Mul_RS04730 had 25.90% of its base pairs predicted with average z-score < -1 and Mul_RS01365 had 10.12% of its base pairs predicted with average z-score < -2 (Table 1). Additionally, ScanFold-Fold extracts discrete structural motifs (i.e., single hairpins or multi-branched stem loops) containing at least one base pair with an average z-score < -2. This results in a list of motifs for each gene, where the longest gene, Mul_RS00210, had the greatest number of motifs (8 motifs) and the much shorter gene Mul_RS01365 had the second greatest number of motifs (7 motifs).
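The idea of reducing many competing per-window pairings to one structural context per nucleotide can be illustrated with a deliberately simplified sketch. It is not the ScanFold-Fold algorithm: it assumes each nucleotide simply adopts whichever partner (or the unpaired state) has the lowest mean z-score across the windows in which it was observed. The function name `consensus_partners`, the `Observation` tuple layout, and the toy values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# One observation: (nucleotide index, partner index or None if unpaired,
# z-score of the window in which this context was predicted).
Observation = Tuple[int, Optional[int], float]


def consensus_partners(
    observations: Iterable[Observation],
) -> Dict[int, Tuple[Optional[int], float]]:
    """Pick, per nucleotide, the context with the lowest mean window z-score.

    Simplifying assumption (not taken from the paper): contexts compete only
    on their mean z-score; ties go to the context seen first.
    """
    per_nt: Dict[int, Dict[Optional[int], list]] = defaultdict(
        lambda: defaultdict(list)
    )
    for nt, partner, z in observations:
        per_nt[nt][partner].append(z)

    result: Dict[int, Tuple[Optional[int], float]] = {}
    for nt, contexts in per_nt.items():
        means = {p: sum(zs) / len(zs) for p, zs in contexts.items()}
        best = min(means, key=means.get)  # lowest (most negative) mean z-score
        result[nt] = (best, means[best])
    return result


# Toy data (illustrative only, not results from this study): nucleotide 10 is
# paired with 30 in two low z-score windows, with 25 in one weak window, and
# unpaired in one window with a positive z-score.
toy = [(10, 30, -2.5), (10, 30, -2.0), (10, 25, -0.5), (10, None, 0.3)]
chosen = consensus_partners(toy)
```

In this toy case nucleotide 10 is assigned partner 30, because that pairing recurs in the windows with the most negative z-scores.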
In summary, all predictions indicate a particular importance for functional RNA secondary structures encoded within the genes for virulence factor production and cell wall biosynthesis. The ScanFold-Fold results also give us a means of generating motifs of interest for further analysis.

Fig. 1, where blue, green, and yellow arcs indicate base pairing with average z-scores ≤ -2, ≤ -1, and < 0, respectively. Interesting trends are seen in the ScanFold-Scan folding metrics partitioned per nucleotide of the transcript. The overall thermodynamic stability remains flat across Mul_RS01365 until scans reach the 3′ end of this RNA, where stability increases (more negative MFE ΔG° in windows overlapping these nucleotides). Despite fairly monotonous trends in ΔG°, evidence for regions of ordered RNA stability (negative z-scores) is clustered into two distinct regions at the 5′ and 3′ ends of the coding region, respectively. Interestingly, the core coding region is spanned by a region with positive z-scores, which indicates that the ordered/evolved sequence is less stable than predicted based on nucleotide content: i.e., the evolved sequence may be ordered to be unstructured or accessible. This region of unusual instability correlates with spikes in the ensemble diversity, which indicates a volatile ensemble of potential RNA secondary structures or a lack of stable structure. Conversely, the low z-score clusters overlap regions of low ensemble diversity, indicating one (or a few similar) conformation(s).

When the low z-score windows were evaluated by ScanFold-Fold, seven distinct motifs (M1-7) were identified with exceptionally low (< -2) z-score-weighted base pairs (Fig. 1). These motifs comprise two upstream hairpins (M1 and M2) and five downstream hairpins that span the start and stop sites of translation, respectively.
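The arc color scheme described for the base pair diagrams amounts to a three-level binning of the average z-score. A minimal sketch follows; the function name `arc_color` is hypothetical, and returning `None` for non-negative values is an assumption about how unshaded pairs would be handled.

```python
from typing import Optional


def arc_color(avg_z: float) -> Optional[str]:
    """Map a base pair's average z-score to the arc color used in the figures.

    Thresholds follow the text: blue for <= -2, green for <= -1, yellow for < 0.
    Pairs with a non-negative average z-score get no color here (assumption).
    """
    if avg_z <= -2:
        return "blue"
    if avg_z <= -1:
        return "green"
    if avg_z < 0:
        return "yellow"
    return None


# Toy values (illustrative only, not data from this study).
colors = [arc_color(z) for z in (-2.3, -2.0, -1.5, -0.1, 0.0)]
```

The bins are checked from most to least stringent so that a pair with an average z-score of exactly -2 is colored blue rather than green.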
M1 is notable for containing the start codon embedded within a stable helix and for being the longest thermodynamically stable hairpin predicted for this gene. When evaluated for their conservation across mycobacterial species, none of the proposed base pairs were found to have statistically significant covariation (S6 File). The proposed structures were, however, found to be present/conserved across a wide array of species.

In M1, for example, the hairpin structure is preserved across pathogenic species of mycobacteria (Fig. 2). Conservation is highest in the terminal hairpin loop region that contains the start codon, and trails off toward the basal stem, where deletions and inconsistent mutations (that ablate canonical base pairing potential) would be predicted to weaken the basal stem. The core hairpin is best preserved (100% preservation of base pairing) in the medically significant species, M. gordonae, kansasii, tuberculosis and bovis, while inconsistent mutations begin accumulating in M. fortuitum, where a U>C mutation disrupts a single AU pair (Fig. 2)

Fig. 3 following the same color scheme mentioned previously. Trends are seen in the ScanFold-Scan folding metrics partitioned per nucleotide of the transcript. The overall thermodynamic stability (ΔG°) remains flat across the entirety of the transcript. Despite the flatline trend in ΔG°, evidence for regions of ordered RNA stability (negative z-scores) is clustered into one distinct region just upstream of and at the 5′ end of the coding region. The coding region is predominantly spanned by structures with z-scores of 0 to -1 with a few structures of ≤ -2 and ≤ -1 interspersed. This indicates that the ordered/evolved sequence is overall only slightly more stable than predicted based on nucleotide content: i.e., the evolved sequence may be ordered to be structured.
The regions of only slightly increased stability correlate with spikes in the ensemble diversity when compared to those of the much more stable region just upstream of and at the 5′ end of the coding region. This indicates a volatile ensemble of potential RNA secondary structures or a lack of stable structure across the core coding sequence, whereas the region nearer the 5′ end shows a less volatile ensemble or more stable structures that may be functional.

When the low z-score windows were evaluated by ScanFold-Fold, two distinct motifs (M1-2) were identified with exceptionally low (< -2) z-score-weighted base pairs (Fig. 3). These motifs comprise one upstream multi-branch helix (M1) and one downstream hairpin (M2) that span the start site of translation and lie within the coding sequence, respectively. M1 is notable for containing the start codon in a single stranded region between two hairpins of the multi-branch helix. When evaluated for their conservation across mycobacterial species, two of the proposed base pairs were found to have statistically significant covariation (S6 File and Fig. 3). In addition, part of the M1 structure and all of the M2 structure were found to be present/conserved across a wide array of species.

In M1, for example, the hairpin structure is preserved across pathogenic species of mycobacteria (Fig. 4). Conservation is highest in the central hairpin that shows evidence of significant covariation and the downstream hairpin containing the start codon. However, toward the 5′ end of the structure, conservation drops off, as homologous sequences were not found here. The small hairpin containing the last nucleotide of the start codon is best conserved (100% preservation of base pairing) in the medically significant species, M.
gordonae, kansasii, fortuitum, phlei, abscessus, and avium; while inconsistent mutations begin accumulating in M. tuberculosis, bovis, and leprae, where a U>A mutation disrupts a single AU pair (Fig. 4)

polyketide has been shown to act as the virulence factor essential for infection and the painless nature of the ulcers (49). Rather than falling within the bacterial genome, the Mul_RS00210 gene occurs within the 174 kb virulence plasmid pMUM001 that encodes a cluster of giant polyketide synthases (49). Until recently this was an uncharacterized example of plasmid-mediated virulence in a Mycobacterium, and it is believed that the pathogenicity of M. ulcerans is due to the acquisition of pMUM001 by horizontal transfer (49).

A summary of ScanFold data is shown in Fig. 5. The overall thermodynamic stability (ΔG°) of the Mul_RS00210 transcript is much more variable than that of all other genes analyzed in this study. Evidence for regions of ordered RNA stability (negative z-scores) is clustered into eight distinct regions throughout the coding region. Here, there are eight structured regions with z-scores < -2 and five small regions of positive z-scores interspersed (the first region is in the intergenic region). This indicates that the ordered/evolved sequences making up these eight structures are much more stable than predicted based on nucleotide content: i.e., the evolved sequence may be ordered to have structure. These regions of high ordered stability (low z-score) correlate with dips in the ensemble diversity when compared to those found in regions of much lower ordered stability, indicating that they are more stable and have a less volatile ensemble of potential RNA secondary structures that can form. The few regions of positive z-score may be indicative of regions that are evolved to be unstructured and therefore accessible to regulatory molecules.

(Fig. 6).
This motif falls in a somewhat diffuse region of low z-scores that overlaps a cluster of moderate ensemble diversity; thus, while the base pairs and nucleotides comprising M2 have significantly low z-scores (< -2), the predicted conformational ensemble does not appear to be particularly tight (i.e., potential for dynamics). It is worth noting that the transcript annotations for M. ulcerans are not sufficient to determine if this motif falls within the 3′ UTR of Mul_RS01615. While both motifs in Mul_RS01615 were found to have potentially homologous sequences and structures in other mycobacteria (S6 File), significant covariation was not observed.

To explore the potential value of ScanFold data in identifying binding sites for short antisense oligonucleotides (ASOs), we partitioned the ScanFold-Scan and -Fold results by averaging predicted metrics across short (18 nt) windows that approximate the size of potential ASO binding sites. This was complemented by predicting duplex binding affinities with the program OligoWalk (48) while considering the effects of ScanFold-predicted local intramolecular structure on intermolecular duplex formation (all data available in S3 File). To facilitate analysis, these results were also plotted vs. ScanFold-Fold predicted base pairs (S2 File). Numerous short stretches of sequence across each gene of interest, including potentially accessible regions with positive z-scores, overlap strong predicted duplex binding sites. While the results in toto are potentially valuable for aiding in the identification and design of ASOs vs. M. ulcerans, we focus our attention on the two genes that have predicted ordered structures that encompass the start sites of translation, which are particularly attractive sites for ASO therapeutics.

The ASO accessibility results for the desA2 homolog, Mul_RS01365, are summarized in Fig. 7.
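The partitioning of per-nucleotide ScanFold metrics into 18-mer averages can be sketched as follows. This is not the authors' in-house script: the function name `kmer_averages` is hypothetical, and the sketch assumes that the 18-nt windows slide one nucleotide at a time, as an oligo walk across the target would.

```python
from typing import List, Sequence


def kmer_averages(per_nt_values: Sequence[float], k: int = 18) -> List[float]:
    """Average a per-nucleotide metric over every full-length k-nt window.

    Element i of the result is the mean of per_nt_values[i:i + k], i.e. the
    metric averaged over the candidate ASO site starting at position i.
    Sequences shorter than k yield an empty list.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return [
        sum(per_nt_values[start:start + k]) / k
        for start in range(len(per_nt_values) - k + 1)
    ]


# Toy per-nucleotide metric (illustrative only, not data from this study):
# 20 positions give three overlapping 18-mers.
toy_metric = [float(i) for i in range(20)]
averages = kmer_averages(toy_metric)
```

The same function could be applied separately to per-nucleotide MFE, z-score, and ensemble diversity tracks so that each candidate binding site carries one averaged value per metric.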
The trend toward enhanced thermodynamic stability of local RNA secondary structure in this gene is starkly illustrated, where significant dips in the partitioned averaged MFE and z-

Additionally, using the sequence fasta file, final partners file, and log files from IGV-ScanFold an in-house python script

All data used to create these graphs and analyze the results can be found in S3 File. Following the work of Matveeva et al., oligonucleotides that had intra kcal/mol, inter-oligo values greater than -1.1 kcal/mol, duplex ΔG values of less than -15

These were subjected to the full ScanFold pipeline as integrated in IGV-ScanFold. In the first step ScanFold-Scan was used to define the local RNA folding landscape of each gene (S4 File). Here, a scanning analysis window of 120 nucleotides (nt) was moved one base at a time across each gene sequence while predicting several folding metrics: the minimum free energy of folding (MFE, ΔG°), its associated base pairing model (secondary structure), and a thermodynamic z-score that compares the MFE of the natively ordered RNA to that of randomized versions to identify propensity for ordered stability as indicated by negative values; details on all metrics in the Methods Section

40 kcal/mol to -33.23 kcal/mol. This difference in predicted RNA folding stability is directly correlated with GC%, as expected, which ranged from 68.14% to 61.32%. The average z-score and ensemble diversity (ED), however, do not follow trends for ΔG or GC%. The z-score quantifies how much greater-than

of the transcript (S1 Fig.). The overall thermodynamic stability (ΔG°) remains flat across the entirety of the Mul_RS04200 gene. Despite the flatline trend in ΔG°, there is evidence for regions of ordered RNA stability (negative z-scores). Z-scores remained relatively negative across the entire gene, but regions of lower z-scores were noted.
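The scanning procedure and the thermodynamic z-score described for ScanFold-Scan can be sketched as below. This is a minimal illustration, not the ScanFold implementation: the function names are hypothetical, the z-score is written in its usual form (native MFE minus the mean MFE of randomized sequences, divided by their standard deviation), and the use of the population standard deviation is an assumption. MFE values would come from an RNA folding program, which is outside this sketch.

```python
from statistics import mean, pstdev
from typing import Iterator, Sequence, Tuple


def sliding_windows(
    seq: str, size: int = 120, step: int = 1
) -> Iterator[Tuple[int, str]]:
    """Yield (0-based start, subsequence) for every full-length window."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def thermodynamic_z_score(
    native_mfe: float, shuffled_mfes: Sequence[float]
) -> float:
    """Return (native MFE - mean of shuffled MFEs) / std of shuffled MFEs.

    Negative values mean the native order folds more stably than randomized
    versions of the same sequence.
    """
    mu = mean(shuffled_mfes)
    sigma = pstdev(shuffled_mfes)  # population std: an assumption of this sketch
    if sigma == 0:
        raise ValueError("shuffled MFEs have zero spread; z-score undefined")
    return (native_mfe - mu) / sigma


# Toy inputs (illustrative only, not data from this study).
windows = list(sliding_windows("ACGU" * 40))  # 160 nt -> 41 windows of 120 nt
z = thermodynamic_z_score(-34.0, [-28.0, -32.0])
```

With these toy numbers the native MFE sits two standard deviations below the randomized mean, which corresponds to the z-score cutoff of -2 used throughout the text.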
One region in the middle of the gene did show a significant decrease in z-score below -2, while the majority of the gene's 3′ end displayed lower than average z-scores

The ScanFold-Fold motif found in the lowest z-score region (M1; S1 Fig.) had base pairs with significantly low z-scores (< -2), which increased upstream and downstream of the hairpin. Notably, as base pairing extended out from the basal stem, the z-scores steadily increased until the final two bulges and terminal loop became only slightly negative

The overall thermodynamic stability (ΔG°) remains uniform across the entirety of the gene. Mul_RS09540 also had the fewest base pairs with z-scores < -2. Near the 3′ end, there are two distinct clusters of z-score values that increase into the positive range, thus indicating a region that may be evolved to have reduced structural propensity. This same unstable region correlates with high ensemble diversity, further indicating the lack of ordered and stable structure at the 3′ end. Conversely, low z-score regions overlapped with regions of low ensemble diversity

The ScanFold-Fold motif found for this region

gene cartoon, base pair arc diagram, and 2D models of -2 ΔG z-score structures. The base pair arc diagram is annotated with colored boxes (correlated to the colors in panel A) to show the location of M1-8 across the gene

When the low z-score windows were evaluated by ScanFold-Fold, eight distinct motifs (M1-8) were identified with exceptionally low (< -2) z-score-weighted base pairs

Here, all the hairpins were found to have base pairs with significantly lower than average z-scores (< -2), whereas the multi-branch helix was found to have significantly low z-scores only in the basal stem. When evaluated for conservation across mycobacterial species, two of the proposed base pairs, found in structures M1-2, were found to have statistically significant covariation (Fig. 5 and S6 File).
In addition, all structures were found to be

of mycolactone (Fig. 5)

A distinct cluster of low z-score nucleotides occurs in the core coding region of the gene, which overlaps a cluster of distinctly low ensemble diversity values. The ScanFold-Fold motif built for this region, M1, contains a multibranch loop structure formed of two hairpin loops with low, but not significantly negative, z-score nucleotides and base pairs. The multibranch loop structure sits atop a long stem formed by significantly low (< -1) z-score base pairs and nucleotides, where the basal stem (composed of six base pairs) had the lowest (< -2) z-scores. Though the ensemble diversity was low across M1, it was somewhat higher in the region spanning the two hairpins of the multibranch loop (Fig. 6), suggesting potential conformational dynamics for the two hairpins. Notably, the region immediately downstream of M1 was spanned by positive z-scores, indicating a potential for ordered instability of secondary structure. A second structural motif, M2, was predicted immediately downstream of the annotated open reading

score overlap the translational stop site and ScanFold-Fold modeled low z-score structures

MFEs) local structure across the gene.
Perhaps counterintuitively, there are dips in the average z-score and ensemble diversity that indicate ordered structure; indeed

Put another way, while the thermodynamic stability of this region is predicted to be low, it is still higher than expected given the nucleotide content

Data generated using OligoWalk and in-house script to partition ScanFold data into 18-mer averages for Mul_RS01365

Average MFE per 18-mer (red), average z-score per 18-mer (blue), overall duplex ΔG (pink), base pair diagram

Favorable overall ASO duplex ΔGs span the start codon, despite it being contained in ordered structure

taking into account disruption of target structure) ASO predicted to occlude the start site was identified (Fig. 8). Here, an 18-mer ASO is predicted to bind to the mRNA with an overall ΔG of -9.8 kcal/mol, which takes into account the significant energy barrier (+15.8 kcal/mol) needed to disrupt 12 base pairs in the M1 hairpin structure. This disruption is predicted to totally ablate the terminal hairpin structure. Notably, other favorable ASO binding sites flank this optimal one (Fig. 8 and S3 File) and thus, the ASO sequence can be

Metrics in green indicate favorable binding, and the nucleotides outlined in green indicate the position

MFE varies across the mRNA; however, it is predicted to be less stable in the 18-mers that span the start codon. Z-score is lowest toward the 5′ end of the RNA and steadily increases toward the 3′ end, indicating 18-mers are less likely to be engaged in ordered RNA structures as one moves along the sequence. Importantly, the overall predicted ASO duplex stabilities were most favorable in the region spanning the start codon: indeed, the most stable predicted duplex across the entire mRNA spans the start site (Fig. 10).
The overall predicted ASO duplex ΔG is -15.0 kcal/mol, which takes into account the relatively low barrier (+7.4 kcal/mol) needed to invade flanking hairpin structures in the target RNA (2 base pairs in each structure)

Data generated using OligoWalk and in-house script to partition ScanFold data into 18-mer averages for Mul_RS04730

Average MFE per 18-mer (red), average z-score per 18-mer (blue), overall duplex ΔG (pink), base pair diagram, and gene cartoon

The Mul_RS04730 Motif 1 2D model, ASO of interest with OligoWalk data, and the predicted structure after strand invasion (left to right). Metrics in green text indicate favorable binding and the nucleotides outlined in green indicate the

we see thermodynamic stability (favorable/low MFE predictions) and ordered stability (low z-scores) at the 3′ end spanning the stop codon. This suggests potential roles for RNA structure and its thermodynamic stability in the termination of translation. Conversely, the coding region of this gene appears to be unstructured (as evidenced by mediocre MFEs and positive z-scores), perhaps to facilitate interactions with regulatory molecules or to promote rapid translation of this gene. Indeed, this latter idea is gaining traction as a mode for affecting protein folding (50): i.e

While none of the analyzed genes had global biases for ordered structure (average z-score < -1; Table 1), clusters of ordered stability were present in each, where at least one defined structural motif could be predicted with exceptional (z-score < -2) base pairs, yielding 19 motifs in total across the six target genes. These sequences have been (presumably) ordered

significant covarying base pairs. The functional roles of conserved, ordered RNA secondary structures in M. ulcerans and species with identified homologs can be diverse.
For example, as noted above

Structures may also be playing roles in modulating accessibility to regulatory molecules present in the bacterial cell or in mRNA turnover (e.g., modulating sensitivity to endogenous

Significantly, two motifs span the translational start site: in Mul_RS01365 and Mul_RS04730 the start codons are modeled to lie within a helix and loop

For Mul_RS01365 and Mul_RS04730, we predict oligos that can invade target intramolecular structure and, for Mul_RS01365, have the potential to greatly disrupt the ordered folding that may itself be functionally significant. That is to say, an ASO targeting the start site of Mul_RS01365 may have two potential modes of action: obscuring the start site to impede

Our results are made public to advance a basic understanding of the RNA biology of this pathogen by providing conserved structural motifs of high likelihood of function (which may, themselves, serve as potential therapeutic targets). As well

This research was supported by National Institute of General Medical Sciences R01GM133810 to WNM and F31CA257090 to WBR. We would also like to thank the Science and Engineering Research Program (SERP) at Staten Island Technical High School led by Dr

This file contains the M. ulcerans bacterial genome fasta, virulence plasmid fasta, and their

ScanFold-Scan output data such as per nucleotide MFE, ED, z-score, input, and output fasta files, and out file.

S5 File: All ScanFold-Fold data
ScanFold-Fold output data such as the log file, base pair track, final partners data, all dot bracket files, all CT files, extracted structures gff3 file, and the global VARNA 2D model.
S6 File: All cm-builder covariation data
This file contains all the data required to run cm-builder and all the output files generated by INFERNAL

All Mul_RS04200 ScanFold results
Figure showing ScanFold results for Mul_RS04200 including z-score, MFE, ED, base pair diagram, gene cartoon, and 2D model of the structure with a z-score < -2.

S2 Fig: All Mul_RS09540 ScanFold results
Figure showing ScanFold results for Mul_RS09540 including z-score

1. Buruli ulcer: Advances in understanding Mycobacterium ulcerans infection and
2. Analysis of 2014 WHO Programmatic Targets
3. Ecology and transmission of Buruli ulcer disease: a systematic review
4. In vivo and in vitro growth of Mycobacterium marinum at homoeothermic temperatures
5. Effect of Environmental Temperatures on Infection with Balnei) of Mice and a Number of Poikilothermic Species
6. Mapping the global distribution of Buruli ulcer: a systematic review with evidence consensus
7. Buruli Ulcer: a Review of the Current Knowledge
8. Mycolactone: a polyketide toxin from Mycobacterium ulcerans required for virulence
9. Buruli ulcer: reductive evolution enhances pathogenicity of Mycobacterium ulcerans
10. Pleiotropic molecular effects of the Mycobacterium ulcerans virulence factor mycolactone underlying the cell death and immunosuppression seen in Buruli ulcer
11. Mycobacterial toxin induces analgesia in buruli ulcer by targeting the angiotensin pathways
12. Mycolactone is responsible for the painlessness of Mycobacterium ulcerans infection (buruli ulcer) in a murine study
13. All-oral antibiotic treatment for buruli ulcer: a report of four patients
14. Treating Mycobacterium ulcerans disease (Buruli ulcer): from surgery to antibiotics, is the pill mightier than the knife?
15. Targeting RNA with small molecules: from fundamental principles towards the clinic
16. Principles for targeting RNA with drug-like small molecules
17. Design of small molecules targeting RNA structure from sequence
18. RNA to the rescue: RNA is one of the most promising targets for drug development given its wide variety of uses
19. Antisense antimicrobial therapeutics
20. Efficiency of antisense oligonucleotide drug discovery
21. The cost of getting personal
22. Advances in oligonucleotide drug delivery
23. Progress in the Use of Antisense Oligonucleotides for Vaccine Improvement
24. A map of the SARS-CoV-2 RNA structurome
25. A survey of RNA secondary structural propensity encoded within human herpesvirus genomes: global comparisons and local motifs
26. Translation of the intrinsically disordered protein alpha-synuclein is inhibited by a small molecule targeting its structured mRNA
27. RNA structural analysis of the MYC mRNA reveals conserved motifs that affect gene expression
28. The role of mRNA structure in bacterial translational regulation
29. ScanFold: an approach for genome-wide discovery of local RNA structural elements-applications to Zika virus and HIV
30. Targeting the SARS-CoV-2 RNA Genome with Small Molecule Binders and Ribonuclease Targeting Chimera (RIBOTAC) Degraders
31. Secondary structure determination of conserved SARS-CoV-2 RNA elements by NMR spectroscopy
32. De novo 3D models of SARS-CoV-2 RNA elements from consensus experimental secondary structures
33. Targeting the Conserved Stem Loop 2 Motif in the SARS-CoV-2 Genome
34. Conserved Genomic Terminals of SARS-CoV-2 as Coevolving Functional Elements and Potential Therapeutic Targets
35. Comparative genomics analysis of Mycobacterium ulcerans for the identification of putative essential genes and therapeutic candidates
36. Structural RNA has lower folding energy than random RNA of the same dinucleotide frequency
37. A comparison of RNA folding measures
38. Using an RNA secondary structure partition function to determine confidence in base pairs predicted by free energy minimization
39. The ensemble diversity of non-coding RNA structure is lower than random sequence
40. Mapping the RNA structural landscape of viral genomes
41. Genome-wide mapping of SARS-CoV-2 RNA structures identifies therapeutically-relevant elements
42. A statistical test for conserved RNA structure shows lack of evidence for structure in lncRNAs
43. Evolutionary conservation of RNA sequence and structure
44. Infernal 1.1: 100-fold faster RNA homology searches
45. BLAST: at the core of a powerful and diverse set of sequence analysis tools
46. The log likelihood ratio test (the G-test); methods and tables for tests of heterogeneity in contingency tables
47. VARNA: Interactive drawing and editing of the RNA secondary structure
48. OligoWalk: an online siRNA design tool utilizing hybridization thermodynamics
49. Giant plasmid-encoded polyketide synthases produce the macrolide toxin of Mycobacterium ulcerans
50. Role of mRNA structure in the control of protein folding
51. Advances in therapeutic bacterial antisense biotechnology