Summary of your 'study carrel'
==============================

This is a summary of your Distant Reader 'study carrel'. The Distant Reader harvested & cached your content into a collection/corpus. It then applied sets of natural language processing and text mining techniques to the collection. The results of this process were reduced to a database file -- a 'study carrel'. The study carrel can then be queried, thus bringing to light specific characteristics of your collection. These characteristics can help you summarize the collection as well as enumerate things you might want to investigate more closely.

This report is a terse narrative report, and when processing is complete you will be linked to a more complete narrative report.

Eric Lease Morgan

Number of items in the collection; 'How big is my corpus?'
----------------------------------------------------------
62

Average length of all items measured in words; "More or less, how big is each item?"
------------------------------------------------------------------------------------
6475

Average readability score of all items (0 = difficult; 100 = easy)
------------------------------------------------------------------
51

Top 50 statistically significant keywords; "What is my collection about?"
-------------------------------------------------------------------------
33 RNA
29 figure
7 dna
6 SARS
6 HIV-1
5 sequence
4 PCR
3 UGA
3 PRF
2 site
2 probe
2 gene
2 NTD
2 LTR
2 LNA
1 ÀPMO
1 toxin
1 target
1 structure
1 specific
1 small
1 siRNA
1 set
1 rig
1 ribosome
1 recombination
1 recode
1 pseudoknot
1 protein
1 peptide
1 mutation
1 motif
1 molecule
1 molecular
1 loop
1 lod
1 limd1
1 hash
1 hairpin
1 frameshift
1 exosome
1 domain
1 datum
1 database
1 cpv
1 copy
1 codon
1 chemical
1 cd200
1 beacon

Top 50 lemmatized nouns; "What is discussed?"
---------------------------------------------
2043 sequence
1726 protein
1506 figure
1399 cell
1285 structure
1219 virus
1090 gene
1020 site
857 genome
844 %
741 activity
681 analysis
612 dna
596 datum
557 sample
552 domain
537 codon
527 type
522 expression
518 assay
517 loop
511 acid
510 frameshift
501 ribosome
497 motif
497 interaction
483 product
477 region
470 result
464 mutation
461 study
453 signal
453 rna
453 level
436 system
425 target
425 c
420 number
389 time
388 residue
386 efficiency
381 replication
377 effect
372 polymerase
363 stem
362 amplification
354 probe
353 pseudoknot
350 frame
349 reaction

Top 50 proper nouns; "What are the names of persons or places?"
--------------------------------------------------------------
2568 RNA
523 Figure
433 C
381 À1
345 Supplementary
345 PCR
332 HIV-1
295 tRNA
294 SARS
287 G4
280 A
242 mRNA
240 DNA
231 G
224 PRF
216 RT
190 DnaG
189 siRNA
184 Table
174 U
157 CoV
155 −1
152 k
150 IFN
142 al
142 LNA
141 SOX
138 M
131 FS
129 pseudoknot
129 et
127 PMO
120 Bst
119 USA
119 T
118 PCAF
117 pH
116 NTD
115 D
112 RTSL
110 MADP1
109 rRNA
109 G4s
108 N
106 IBV
102 miRNA
100 WT
100 S.
100 ADP
98 NMR

Top 50 personal pronouns; "To whom are things referred?"
--------------------------------------------------------
1584 we
735 it
187 they
126 i
90 them
56 us
29 itself
21 one
9 themselves
5 nsp10
4 nsp7
4 mrnas
2 you
1 pyrrole−probe
1 pip6a
1 parp10
1 ours
1 nthash
1 mine
1 me
1 imagej
1 ibv14931-f
1 i-
1 hsp60
1 his
1 he
1 genemarkhmm
1 az1c
1 -interferon
1 --they

Top 50 lemmatized verbs; "What do things do?"
---------------------------------------------
12114 be
1926 use
1471 have
769 show
692 bind
684 contain
638 frameshifte
458 base
437 observe
431 include
406 find
378 identify
346 suggest
343 indicate
340 perform
335 do
329 induce
329 increase
322 describe
315 require
315 determine
314 follow
302 provide
276 generate
273 form
260 compare
256 involve
250 occur
249 predict
240 target
239 result
227 see
226 reveal
220 demonstrate
214 allow
210 encode
204 mediate
198 conserve
196 detect
194 express
193 produce
193 give
192 test
191 lead
187 correspond
181 reduce
178 label
173 know
170 make
169 associate

Top 50 lemmatized adjectives and adverbs; "How are things described?"
---------------------------------------------------------------------
976 not
973 -
791 viral
735 also
726 specific
598 high
530 other
526 human
460 non
416 different
411 more
407 ribosomal
399 only
381 however
367 such
358 single
352 structural
343 well
336 first
335 low
324 small
315 large
289 thus
288 new
271 nucleic
268 similar
267 further
266 previously
257 then
255 nt
252 most
246 dependent
244 same
234 wild
233 several
228 functional
226 multiple
216 highly
210 important
210 as
205 cellular
204 efficient
203 genomic
198 nucleotide
196 molecular
191 positive
189 long
182 bacterial
178 translational
177 downstream

Top 50 lemmatized superlative adjectives; "How are things described to the extreme?"
-------------------------------------------------------------------------
93 most
52 least
43 high
38 good
22 Most
21 low
17 near
17 large
8 small
5 strong
5 fast
5 close
3 simple
3 late
3 great
2 weak
2 short
2 long
2 few
2 early
2 bad
2 -proximal
2 --b
1 setcov
1 old
1 new
1 hmG
1 bright
1 ClustalW
1 Arg102

Top 50 lemmatized superlative adverbs; "How do things do to the extreme?"
------------------------------------------------------------------------
159 most
49 least
12 well
4 -tag
1 -end

Top 50 Internet domains; "What Webbed places are alluded to in this corpus?"
----------------------------------------------------------------------------

Top 50 URLs; "What is hyperlinked from this corpus?"
----------------------------------------------------

Top 50 email addresses; "Who are you gonna call?"
-------------------------------------------------
3 journals.permissions@oupjournals.org
1 zhouxi@whu.edu.cn
1 zhiqi.chen@utoronto.ca
1 zemla1@llnl.gov
1 ncbi-help@ncbi.nlm.nih.gov
1 mmakrigiorgos@partners.org
1 mhoward@genetics.utah.edu
1 joacim.elmen@cgb.ki.se
1 ib103@mole.bio.cam.ac.uk
1 dorothylang@gmail.com
1 chun@seegene.com
1 aravind@ncbi.nlm.nih.gov
1 khl@udel.edu

Top 50 positive assertions; "What sentences are in the shape of noun-verb-noun?"
-------------------------------------------------------------------------------
7 pseudoknot is sensitive
3 cells were then
3 data are consistent
3 data were fit
3 domain is also
3 frameshift inducing ability
3 frameshift inducing stem
3 interaction is also
3 product was evident
2 activity is rna
2 assays described above
2 assays require further
2 assays were also
2 cells are not
2 cells is due
2 cells were co
2 cells were transiently
2 dna binding protein
2 domain is comparable
2 frameshift inducing capacity
2 frameshift inducing element
2 frameshift inducing pseudoknot
2 gene is highly
2 genome form high
2 interaction does not
2 interactions involving stem
2 motifs were not
2 product was also
2 products contained phe
2 products were fs
2 protein are essential
2 protein increases strand
2 protein was able
2 protein was further
2 proteins were then
2 pseudoknot containing constructs
2 ribosomes were not
2 rna binding activity
2 rna binding channel
2 rna containing exon
2 rna frameshifting pseudoknot
2 rna is also
2 rna is not
2 rna was then
2 samples were sta
2 sequence binding protein
2 sequence is also
2 sequence is not
2 sequences are black
2 sequences are blue

Top 50 negative assertions; "What sentences are in the shape of noun-verb-no|not-noun?"
---------------------------------------------------------------------------------------
1 analysis has not yet
1 assay has no additional
1 cells are not only
1 cells are not readily
1 cells were not high
1 domain has no counterpart
1 genes does not always
1 genome is not available
1 loop is not absolutely
1 motif is not critical
1 motifs were not experimentally
1 products were not faithfully
1 protein do not necessarily
1 proteins has not only
1 proteins were not available
1 pseudoknot does not necessarily
1 pseudoknot is not very
1 rna is not entirely
1 rna was not able
1 sample was not completely
1 sequence is not directly
1 sequence is not independent
1 sequenced was not homogenous
1 structure was not present
1 structures are not sufficient
1 type was not significant

A rudimentary bibliography
--------------------------

id = cord-002366-t94aufs3
author = Aurrecoechea, Cristina
title = EuPathDB: the eukaryotic pathogen genomics database resource
date = 2017-01-04
keywords = analysis; datum; figure; gene
summary = To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data.
The near-seamless integration of strategy results with tools for functional enrichment analyses and transcript interpretation as well as our new Galaxy workspace and the availability of publicly shared strategies augment the data mining experience in EuPathDB.
doi = 10.1093/nar/gkw1105

id = cord-000159-8y8ho2x5
author = Bekaert, Michaël
title = Recode-2: new design, new search tools, and many more genes
date = 2009-09-25
keywords = RNA; gene; recode
summary = 'Recoding' is a term used to describe non-standard read-out of the genetic code, and encompasses such phenomena as programmed ribosomal frameshifting, stop codon readthrough, selenocysteine insertion and translational bypassing. It provides access to detailed information on genes known to utilize translational recoding and allows complex search queries, browsing of recoding data and enhanced visualization of annotated sequence elements. The term 'translational recoding' describes the utilization of non-standard decoding during protein synthesis and encompasses such processes as ribosomal frameshifting, codon redefinition, translational bypassing and StopGo (1-7). To facilitate further development of computational tools for the prediction of recoded genes in the ever faster growing body of sequence data, as well as to provide bench researchers with up-to-date information on recoding, an efficient means of Recode database population and annotation are now required. RECODE: a database of frameshifting, bypassing and codon redefinition utilized for gene expression
doi = 10.1093/nar/gkp788

id = cord-000363-cbzd8ybv
author = Belew, Ashton T.
title = Endogenous ribosomal frameshift signals operate as mRNA destabilizing elements through at least two molecular pathways in yeast
date = 2010-11-24
keywords = EST2; NMD
summary = The slippery heptamers for these −1 RF signals begin at nucleotides 858, 1653, 279 and 1521 of their respective ORFs.
These were cloned into a yeast PGK1 reporter gene so that frameshifted ribosomes are directed to PTCs. All inserts were flanked by sequences derived from Renilla and firefly luciferase genes, providing unique exogenous sequences for specific detection of the reporter mRNAs. Two additional PGK1 reporters without −1 RF signals were used as controls: a readthrough reporter encoded a continuous ORF, while a PTC control contained an in-frame UAA termination codon (Figure 1). Analysis of the Programmed Ribosomal Frameshift Database (http://prfdb.umd.edu/) reveals that, along with the other four putative −1 RF signals in the EST2 mRNA, the mRNAs encoding Est1p, Stn1p, Cdc13p and Orc5p, all components or regulators of telomerase that are stabilized in NMD− cells, also contain high confidence −1 RF signals (Supplementary Figure S2).
doi = 10.1093/nar/gkq1220

id = cord-334127-wjf8t8vp
author = Brister, J. Rodney
title = NCBI Viral Genomes Resource
date = 2015-01-28
keywords = NCBI; Viral; sequence
summary = This, in turn, has placed increased emphasis on leveraging the knowledge of individual scientific communities to identify important viral sequences and develop well annotated reference virus genome sets. Whereas primary databases are archival repositories of sequence data, reference databases provide curated datasets that enable a number of activities, among them are transfer annotation to related genomes (11-13), sequence assembly and virus discovery (14-17), viral dynamics and evolution (18-20) and pathogen detection (14, 21-23). The second model captures and standardizes host information for all viruses, and whenever a new RefSeq record is created, a manually curated 'viral host' property is assigned to the relevant species within the NCBI Taxonomy database.
The link to the Retrovirus Resource (http://www.ncbi.nlm.nih.gov/genome/viruses/retroviruses) provides access to the Retrovirus Genotyping Tool and HIV-1, Human Interaction Database (50, 51).
doi = 10.1093/nar/gku1207

id = cord-000271-812uc4w7
author = Chen, Zhiqi
title = Alternative splicing of CD200 is regulated by an exonic splicing enhancer and SF2/ASF
date = 2010-06-17
keywords = ASF; ESE; SF2; cd200
summary = Taken together, our data suggest for the first time that SF2/ASF regulates the function of CD200 by controlling CD200 alternative splicing, through direct binding to an ESE located in exon 2 of CD200. Interestingly, in a mouse model of viral infection, we detected for the first time that the normal splicing pattern of CD200 was reversed in the lung tissue of A/J mice infected with mouse hepatitis virus strain I (MHV-1), following an increase in expression of SF2/ASF in this MHV-1 susceptible mouse strain. Two-and-a-half micrograms of siRNA, together with 10 µg of the alternative splicing construct DNA, was transfected to Daudi or SK-N cells by electroporation to detect exogenous expression pattern of CD200 following silencing SF2/ASF. As shown in Figures 3C and D, and 4A and B, expression of the full-length transcript (exon 2 inclusion) was reduced in both Daudi and SK-N cells after mutation or deletion of the ESE in exon 2.
doi = 10.1093/nar/gkq554

id = cord-314877-db7tze8j
author = Chkuaseli, Tamari
title = Activation of viral transcription by stepwise large-scale folding of an RNA virus genome
date = 2020-08-12
keywords = RNA; RTSL; SL59; figure
summary = When viewed in the context of the RNA secondary structure model for the TBSV genome (48), the DE/CE interaction corresponds to the closing stem of a sizable RNA domain, termed large domain 3 (LD3), which, along with formation of the adjacent LD2, acts to unite the AS2 and RS2 sequences (Figure 1B).
Translational readthrough for the CIRV genome requires a long-distance RNA-RNA interaction (LDRI) between RTSL and the 3′ UTR, involving the PRTE and DRTE partner sequences, respectively (Figure 1A, B) (39). The binding of RTSL-TL to SL59-5 was investigated functionally by introducing compensatory nucleotide substitutions into the candidate partner sequences and assessing the effects on sg mRNA1 accumulation following transfection of mutant viral RNA genomes into protoplasts pairing potential in mutants TC-6 and TC-7 diminished sg mRNA1 plus- and minus-strand levels below ∼10% of wt, while regenerating pairing capacity with alternate nucleotides in mutant TC-8 restored levels up to ∼50-62% of wt (Figure 3B, C).
doi = 10.1093/nar/gkaa675

id = cord-275519-98qxf6xo
author = Chun, Jong-Yoon
title = Dual priming oligonucleotide system for the multiplex detection of respiratory viruses and SNP genotyping of CYP2C19 gene
date = 2007-02-07
keywords = DPO; PCR
summary = This structure results in two primer segments with distinct annealing properties: a longer 5′-segment that initiates stable priming, and a short 3′-segment that determines target-specific extension. This DPO-based system is a fundamental tool for blocking extension of non-specifically primed templates, and thereby generates consistently high PCR specificity even under less than optimal PCR conditions. Since the development of the polymerase chain reaction (PCR), a variety of modifications in primer design and reaction conditions have been proposed to enhance and optimize specificity (1-3), but a fundamental solution for eliminating non-specific priming still remains a challenge and limits the versatility of PCR in nucleic-acid-based tests (NATs). In this article, we describe and demonstrate how effectively DPO eliminates extension of non-specifically primed templates and generates high PCR specificity under a range of sub-optimal or stringent reaction conditions.
We further evaluated the DPO-based multiplex PCR system for the detection of a single nucleotide polymorphism (SNP) in CYP2C19.
doi = 10.1093/nar/gkm051

id = cord-302895-471zei5o
author = Deng, Zengqin
title = Structural basis for the regulatory function of a complex zinc-binding domain in a replicative arterivirus helicase resembling a nonsense-mediated mRNA decay helicase
date = 2013-12-24
keywords = EAV; HEL1; RNA; ZBD; dna; figure
summary = Biochemical studies using recombinant arterivirus and coronavirus helicases revealed similar enzymatic properties, including nucleic acid-stimulated ATPase and 5′-3′ duplex unwinding activities on both RNA and DNA substrates containing 5′ single-stranded regions (34, 35). Amino acid substitutions in ZBD or the adjacent 'spacer' that connects it to the downstream domain can profoundly affect EAV helicase activity and RNA synthesis, with most replacements of conserved Cys or His residues yielding replication-negative virus phenotypes (36, 37). Thus, our study not only highlights how nidovirus helicase activity depends on the extensive relay of interactions between the ZBD, accessory and HEL1 domains but also provides a framework to propose and explore a role for the enzyme in the posttranscriptional quality control of nidovirus RNAs. Nsp10 of the EAV-Bucyrus isolate (NCBI Reference Sequence NC_002532) is composed of amino acids 2371-2837 of replicase pp1ab, which will throughout this study be referred to as nsp10 residues 1-467.
doi = 10.1093/nar/gkt1310

id = cord-350189-2su7oqbz
author = Elmén, Joacim
title = Locked nucleic acid (LNA) mediated improvements in siRNA stability and functionality
date = 2005-01-14
keywords = LNA; RNA; SARS; siRNA
summary = A priori, this suggests that LNA may be used to increase the functional half-life of siRNA in vivo by two different mechanisms, e.g.
by enhancing the resistance of the constituent RNA strands against degradation by single-stranded RNases and by stabilizing the siRNA duplex structure that is critical for activity. Next, we examined the effect of making single RNA to LNA exchanges at base-paired positions in the antisense strand of the firefly luciferase siLNA1. Although we cannot exclude that these modifications somehow prevent loading of the antisense strand into RISC, we believe this to be unlikely given the functionality of many significantly more modified siLNAs. Rather, as these positions are all close to the site where RNA target cleavage occurs [between pos. The SARS siRNA (Table 1) has identical closing base-pairs at both ends (A:U) making it likely that enough of both the antisense and sense strand would be incorporated into RISC to observe activity on the respective targets.
doi = 10.1093/nar/gki193

id = cord-011565-8ncgldaq
author = Elworth, R A Leo
title = To Petabytes and beyond: recent advances in probabilistic and signal processing algorithms and their application to metagenomics
date = 2020-06-04
keywords = Bloom; CMS; hash; sequence; set
summary =
doi = 10.1093/nar/gkaa265

id = cord-000435-2u49b7xo
author = Firth, Andrew E.
title = Stimulation of stop codon readthrough: frequent presence of an extended 3′ RNA structural element
date = 2011-04-27
keywords = RNA; SINV; UGA; figure
summary =
doi = 10.1093/nar/gkr224

id = cord-007041-rloey02j
author = Harel, Noam
title = Direct sequencing of RNA with MinION Nanopore: detecting mutations based on associations
date = 2019-12-16
keywords = Illumina; MS2; RNA; mutation
summary = We sequenced virus populations in parallel using both MinION and Illumina, allowing us to corroborate the inferences of AssociVar. This then allowed us to directly infer relationships between mutations and to deduce the entire genome sequences of viral strains in the population.
We then determined the population frequency of each mutation at passage 1 and passage 15 through whole genome deep sequencing as described below, using Illumina and MinION. After applying AssociVar to the data, we were able to identify five out of the six mutations appearing at a frequency above 10% in the Illumina results in p15A, and all eight positions within the p15B sample (Figure 4, Supplementary Table S2). We applied AssociVar to sequencing data from an evolved population of phages where Illumina sequencing was available, allowing us to corroborate whether mutations we found based on analysis of the MinION data alone were indeed real.
doi = 10.1093/nar/gkz907

id = cord-348427-worgd0xu
author = Hatcher, Eneida L.
title = Virus Variation Resource – improved response to emergent viral outbreaks
date = 2017-01-04
keywords = Resource; Variation; Virus; sequence
summary = The resource now includes expanded data processing pipelines and analysis tools, and supports selection and retrieval of nucleotide and protein sequences from four new viral groups: Ebolaviruses, MERS coronavirus, rotavirus, and Zika virus (Table 2). New processes have been added to parse source descriptor terms from GenBank records and map these to controlled vocabulary, and the resource now supports retrieval of sequences based on standardized isolation source and host terms in addition to standardized gene and protein names. The resource includes data processing pipelines that retrieve sequences from GenBank, provide standardized gene and protein annotation, and map sequence source descriptors (i.e. metadata) to uniform vocabularies. To resolve this issue, the Virus Variation database loading pipeline parses GenBank records, identifies important metadata terms, such as sample isolation host, date, country and source, and maps these to a standardized vocabulary using a hierarchical approach.
doi = 10.1093/nar/gkw1065

id = cord-048327-xgwbl8em
author = Henderson, Clark M.
title = Antisense-induced ribosomal frameshifting
date = 2006-08-18
keywords = RNA; UGA; site
summary = The ability of cis-acting RNA structures or trans-acting 2′-O-Methyl antisense oligonucleotides to induce ribosomal frameshifting was determined by in vitro transcription and translation of a dual luciferase reporter vector, p2Luc. p2Luc contains the Renilla and firefly luciferase genes on either side of a multiple cloning site, and can be transcribed using the T7 promoter located upstream of the Renilla luciferase gene (45). As the AZ1A antisense oligonucleotide was designed to anneal directly adjacent to the UGA codon of the shift site, it was of interest to determine whether the wild-type antizyme pseudoknot could induce −1 frameshifting when located in the equivalent position. The ability of spermidine to stimulate antisense oligonucleotide induced ribosome frameshifting to the +1 reading frame at the UCC UGA shift site in the absence of the natural 3′ stimulator demonstrates that this cis-acting element is not required for polyamine responsiveness.
doi = 10.1093/nar/gkl531

id = cord-320627-7vi6skvh
author = Horejsh, Douglas
title = A molecular beacon, bead-based assay for the detection of nucleic acids by flow cytometry
date = 2005-01-19
keywords = SARS; beacon; molecular
summary = We have developed a fluid array system using microsphere-conjugated molecular beacons and the flow cytometer for the specific, multiplexed detection of unlabelled nucleic acids in solution. Using beads of different sizes and molecular beacons in two fluorophore colours, synthetic nucleic acid control sequences were specifically detected for three respiratory pathogens, including the SARS coronavirus in proof-of-concept experiments. In this report, we describe the construction of molecular beacon-conjugated beads that we have called 'BeadCons', whose specific hybridization with complementary target sequences can be resolved by flow cytometry (see Figure 1).
In the multiplex detection experiment, the test sample contained 0.5 ml of the positive oligo DNA (100 mM stock) diluted in 9.5 ml of a complex mixture of oligonucleotides (equimolar levels of 10 mM each, equalling a 100 mM total concentration; sequences listed in Supplementary Table 1).
doi = 10.1093/nar/gni015

id = cord-001453-l1r416w7
author = Hou, Linlin
title = Archaeal DnaG contains a conserved N-terminal RNA-binding domain and enables tailing of rRNA by the exosome
date = 2014-11-10
keywords = Csl4; NTD; RNA; exosome; figure
summary = Consistently, a fusion protein containing full-length Csl4 and NTD of DnaG led to enhanced degradation of A-rich RNA by the exosome. The RNase PH-domain containing subunits Rrp41 and Rrp42 are arranged in a catalytically active hexamer, on the top of which a trimeric cap composed of the RNA-binding proteins Rrp4 and Csl4 is bound (Figure 1B; 4, 5, 22-24). The bacterial primase DnaG is composed of an NTD containing a Zn-finger motif involved in DNA binding, the central, catalytic TOPRIM domain and a CTD necessary for the interaction with the replicative helicase DnaB (Figure 1A, refs. coli cell-free extract was easily detectable by pull-down assays with Strep-Tactin Sepharose beads (for an example see Figure 7A below), we conclude that the CTD of DnaG is important for the binding to the archaeal exosome.
doi = 10.1093/nar/gku969

id = cord-330067-ujhgb3b0
author = Huang, Yi
title = CoVDB: a comprehensive database for comparative analysis of coronavirus genes and genomes
date = 2007-10-02
keywords = SARS; sequence
summary = To overcome the problems we encountered in the existing databases during comparative sequence analysis, we built a comprehensive database, CoVDB (http://covdb.microbiology.hku.hk), of annotated coronavirus genes and genomes.
CoVDB provides a convenient platform for rapid and accurate batch sequence retrieval, the cornerstone and bottleneck for comparative gene or genome analysis. In CoVDB, with the aim of facilitating gene retrieval, we tried to unify the naming of these non-structural proteins from different groups of coronaviruses. When we compared their putative amino acid sequences to the corresponding ones in other group 1 coronavirus genomes using BLAST, as well as searching for conserved domains using motifscan, results showed that the putative proteins encoded by these ORFs belonged to a protein family in Pfam originally assigned as 'Corona_NS3b' (accession number PF03053). database, CoVDB, of annotated coronavirus genes and genomes, which offers efficient batch sequence retrieval and analysis.
doi = 10.1093/nar/gkm754

id = cord-048222-1pq6dkl5
author = Imbeaud, Sandrine
title = Towards standardization of RNA quality assessment using user-independent classifiers of microcapillary electrophoresis traces
date = 2005-03-30
keywords = RIN; RNA; figure
summary = With that prospect in mind, and with the aim of anticipating future standards by pre-normative research, we identified and tested two software packages recently developed to gauge the integrity of RNA samples with a user-independent strategy: one open source, the degradometer software for calculation of the degradation factor and 'true' 28S:18S ratio based on peak heights (24) and the freely available RIN algorithm of the Agilent 2100 expert software, based on computation of a 'RNA Integrity Number' (RIN) (25). A RIN number is computed for each RNA profile (see Supplementary Table 4 online) resulting in the classification of RNA samples in 10 numerically predefined categories of integrity.
The values of the mean fold changes, calculated according to the 2^(−ΔΔCt) quantification method (see Materials and Methods), were found lower than 1.0, corresponding to the expression level (1×) in the sample exhibiting the highest RNA quality (Table 2 and Figure 5).
doi = 10.1093/nar/gni054

id = cord-335377-zrbn637z
author = Ishimaru, Daniella
title = RNA dimerization plays a role in ribosomal frameshifting of the SARS coronavirus
date = 2012-12-26
keywords = RNA; S3L2; SARS; Stem; figure; loop
summary = Furthermore, the inability to dimerize caused by the silent codon change in Stem 3 of SARS-CoV changed the viral growth kinetics and affected the levels of genomic and subgenomic RNA in infected cells. We further show that kissing dimer formation plays a role in frameshift-stimulation and modulates the relative abundance of full-length and subgenomic viral RNAs. Plasmids containing wild-type pseudoknot as well as the ΔS3 pk mutant were described in Plant et al (1). Our previous NMR analysis of exchangeable imino protons of the SARS-CoV pseudoknot (Figure 1A, wild-type pk) provided unequivocal evidence for the existence of Stem 3 (1). Surprisingly, in the context of the SARS-CoV Stem 3 sequence, 5′-cuug-3′ tetraloop-capped mutants readily formed extended duplex structures as revealed by native gel and NMR analysis.
doi = 10.1093/nar/gks1361

id = cord-325985-xfzhn1n1
author = Jabado, Omar J.
title = Comprehensive viral oligonucleotide probe design using conserved protein regions
date = 2007-12-13
keywords = Pfam; probe; sequence
summary = The method uses the Protein Families database (Pfam) and motif finding algorithms to identify oligonucleotide probes in conserved amino acid regions and untranslated sequences. Our method for probe design employs protein alignment information, discovered protein motifs, nucleic acid motifs and finally, sliding windows to ensure near complete coverage of the database.
The EMBL nucleotide sequence database [July 2007, Release 91; 461,353 nucleic acid sequences (31)] was chosen as the reference for this study because it is tightly integrated with the Pfam protein family database (23, 32 Taxon growth was estimated using a standard least squares method, with the SPSS statistical package. We have described a method that capitalizes on the Pfam protein alignment database and a motif finding algorithm to automate the extraction of nucleic acid sequence for probes from conserved protein regions.
doi = 10.1093/nar/gkm1106

id = cord-000125-uvf5qzfd
author = Kenworthy, Rachael
title = Short-hairpin RNAs delivered by lentiviral vector transduction trigger RIG-I-mediated IFN activation
date = 2009-09-03
keywords = B971; IFN; RNA; figure; rig
summary = The interaction between a PAMP and a PRR triggers activation of the interferon (IFN) pathway in mammalian cells, which significantly changes the gene-expression profile in the cells and contributes to the well-documented off-target effect of RNAi. IFN induction is especially problematic in antiviral studies employing RNAi, where the antiviral effect of IFN must be distinguished from that of RNAi. Typical IFN-inducing structure patterns include dsRNA of certain length, single-stranded RNA (ssRNA) containing 5′-triphosphates (5′-ppp), the dsRNA analogue polyinosinic-polycytidylic acid (poly I:C), and certain dsDNA molecules. Mammalian expression plasmids encoding each of these proteins, as well as the dominant negative (DN) mutants of RIG-I and MDA5, were transfected into 293FT cells with shRNAs and an IFN-β promoter reporter construct.
doi = 10.1093/nar/gkp714

id = cord-000881-s90geszi
author = Lang, Dorothy M.
title = Highly similar structural frames link the template tunnel and NTP entry tunnel to the exterior surface in RNA-dependent RNA polymerases
date = 2012-12-25
keywords = Motif; RNA; figure
summary = In contrast to the relatively short lengths of previously described motifs, we found that most homomorphs are long, and each provides a structural connection between the template tunnel or NTP entry tunnel and the exterior of the protein. The structurally aligned sequences that comprised homomorph of Motif F (hmF) for RdRps and HIV are summarized in Figure 3A. Using T7 DNAP as a query (lowest segment of the figure), only a small portion of the C-terminal edge of Motif D and a few species have similar structures. In the RdRps, the combined regions of structural homology represent ∼75% of the sequence from the start of homomorph of Motif G (hmG) through the end of hmE in each species (∼375 residues). The tertiary position of each of the homomorphs includes at least one residue (and sometimes more) in contact with the exterior surface of the protein and one or more highly conserved functional residues located within or at the wall of the template tunnel.
doi = 10.1093/nar/gks1251

id = cord-321352-174q2pjw
author = Lew, Qiao Jing
title = PCAF interacts with XBP-1S and mediates XBP-1S-dependent transcription
date = 2010-09-04
keywords = PCAF; XBP-1S; figure
summary = Further induction (more than 2-fold) of the XBP-1S-mediated activation of HTLV-1 and BiP promoters was detected in the PCAF-expressing cells (Figure 3A and B). Co-transfection of the PCAF shRNA in the XBP-1S-expressing cells led to 35, 74 and 52% inhibition of BiP, CHOP, and EDEM transcription, respectively (Figure 6B), demonstrating the involvement of PCAF in the XBP-1S-dependent transcription. In the XBP-1S/PCAF co-transfected cells, more XBP-1S proteins were found to bind to the promoter region of BiP and CHOP genes (Figure 7A and B).
This observation could explain why CREB1 and other CREB/ATF family proteins fail to up-regulate HTLV-1 transcription in the absence of Tax. In contrast, the requirement for PCAF in the XBP-1S-dependent HTLV-1 basal transcription was clearly demonstrated in the cell-based reporter assays (Figures 3A and 4A). doi = 10.1093/nar/gkq785 id = cord-048359-lz37rh82 author = Li, Jin title = s-RT-MELT for rapid mutation scanning using enzymatic selection and real time DNA-melting: new potential for multiplex genetic analysis date = 2007-06-01 keywords = PCR; dna summary = Subsequently, melting curve analysis, on conventional or nano-technology real-time PCR platforms, detects the samples that contain mutations in a high-throughput and closed-tube manner. Following denaturation and re-annealing of PCR products that leads to formation of cross-hybridized sequences at the positions of mutations (Figure 1A), the sample is exposed to Surveyor™ endonuclease that recognizes base pair mismatches or small loops with high specificity (28) and generates a break on both DNA strands 3′ to the mismatch. Finally, because the amplified mutated sequences contain defined primers at their ends, direct sequencing of enzymatically selected PCR products is readily possible following the real-time melting step, enabling sequencing of low-level mutations identified by Surveyor™. Here we enabled Surveyor™, an endonuclease that selectively recognizes mismatches formed by mutations and small deletions following 'cross-hybridized sequence' formation, to generate mutation-specific DNA fragments that are amplified and screened via differential melting curve analysis.
doi = 10.1093/nar/gkm403 id = cord-341154-wwq0sd2r author = Liao, Pei-Yu title = The many paths to frameshifting: kinetic modelling and analysis of the effects of different elongation steps on programmed –1 ribosomal frameshifting date = 2010-09-07 keywords = PRF summary = The model reveals three kinetic pathways to −1 PRF that yield two possible frameshift products: those incorporating zero frame encoded A-site tRNAs in the recoding site, and products incorporating −1 frame encoded A-site tRNAs. Using known kinetic rate constants, the individual contributions of different steps of the translation elongation cycle to −1 PRF and the ratio between two types of frameshift products were evaluated. Protein sequencing was originally employed to generate the simultaneous slippage model, and to confirm that the −1 PRF site for HIV-1 is U UUU UUA located within the gag/pol overlap (where the P-site of the ribosome during frameshifting is underlined) (1). In agreement with the model predictions, experimental perturbation of different translation steps resulted in different levels of −1 PRF efficiency as well as in the relative ratios of two types of frameshift proteins. Our model suggests that in both mechanisms, incomplete translocation and slippage of P- and A-site tRNAs participate in synthesizing frameshift proteins to varying extents for different −1 PRF signals. doi = 10.1093/nar/gkq761 id = cord-314572-1pou702r author = Lin, Ya-Hui title = Rational design of a synthetic mammalian riboswitch as a ligand-responsive -1 ribosomal frame-shifting stimulator date = 2016-10-14 keywords = PRF; RNA; SARS; Switch-1; figure summary = Conformational and functional analyses indicate that the engineered theophylline-responsive RNA functions as a mammalian riboswitch with robust theophylline-dependent −1 PRF stimulation activity in a stable human 293T cell-line.
In a first step to constructing a ligand-responsive −1 PRF stimulator, we designed Switch-0 RNA with a theophylline aptamer replacing the stem 3 of SARS-PK (Figure 1A and C). We rationalized that such an engineered switch hairpin of reasonable stability (predicted free energy of −12.7 kcal/mole (37)) would be the dominant conformation that could interfere with the formation of pseudoknot stem 2 in the absence of theophylline (Supplementary Figure S2A). To improve the dynamic range of ligand response and to see if theophylline aptamers can be functional while existing in both positive and negative regulators of −1 PRF, we fused previously designed theophylline-dependent upstream attenuator, theoOFF2 (24) with Switch-1 (Figure 5A) and examined theophylline-dependent −1 PRF activity in vitro. doi = 10.1093/nar/gkw718 id = cord-319681-kjet3e50 author = Lin, Zhaoru title = Spacer-length dependence of programmed −1 or −2 ribosomal frameshifting on a U(6)A heptamer supports a role for messenger RNA (mRNA) tension in frameshifting date = 2012-06-28 keywords = AON; RNA summary = The mRNA signal for −1 FS is composed of two elements, a slippery sequence with consensus X_XXY_YYZ (underlines denote zero frame; X can be any base, Y is A or U, Z is not G in eukaryotic systems) where the ribosome changes frame, and a downstream stimulatory RNA structure, a stem-loop or pseudoknot (reviewed in 3, 4).
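The slippery-sequence consensus quoted above (X_XXY_YYZ; X any base, Y is A or U, Z not G) can be expressed as a small pattern search. The following is a minimal sketch, not taken from any of the summarized papers: the function name `find_slippery_sites` and the optional `frame_start` parameter are hypothetical, and the sketch looks only at the heptamer, ignoring the spacer and the downstream stimulatory structure that the summary also describes.

```python
import re

# Consensus X_XXY_YYZ: three identical bases (X), three identical A or U (Y),
# then any base except G (Z). The lookahead allows overlapping matches.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2([AU])\3\3[ACU]))")

def find_slippery_sites(rna, frame_start=None):
    """Return (position, heptamer) pairs matching the consensus.

    frame_start (hypothetical parameter): 0-based index of the first base of
    any zero-frame codon. When given, only heptamers whose XXY codon lies in
    the zero frame are reported.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for match in SLIPPERY.finditer(rna):
        pos = match.start()
        # The zero-frame codons are XXY and YYZ, so a codon starts at pos + 1.
        if frame_start is not None and (pos + 1 - frame_start) % 3 != 0:
            continue
        hits.append((pos, match.group(1)))
    return hits
```

For example, `find_slippery_sites("GGUUUAAACGG", frame_start=0)` reports the heptamer starting at index 2, because the codon UUA then begins at index 3, which is in the zero frame.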
A version of the −1 FS reporter plasmid with the IBV slippery sequence (p2lucIBV-AON) was also Based on the published literature, including our own studies of 80S ribosomes stalled at the IBV frameshift-stimulatory pseudoknot (22, 37), we proposed a mechanical model of frameshifting in which a failure of intrinsic ribosomal helicase activity (15, 28) to unwind efficiently the stimulatory RNA during the translocation step leads to the build-up of tension in the mRNA and subsequently, breakage of codon:anticodon contacts and realignment of the tRNAs in the −1 reading frame. doi = 10.1093/nar/gks629 id = cord-003305-ya0siivm author = Liu, Weichi title = A unique intra-molecular fidelity-modulating mechanism identified in a viral RNA-dependent RNA polymerase date = 2018-11-16 keywords = NS5B; NTD; RNA; figure summary = doi = 10.1093/nar/gky848 id = cord-262076-b5u5hp2r author = Liu, Ying Poi title = Inhibition of HIV-1 by multiple siRNAs expressed from a single microRNA polycistron date = 2008-03-16 keywords = HIV-1; RNA; figure summary = We show that the expression of individual miRNAs is greatly enhanced in multiplex hairpin transcripts that are properly processed into functional miRNAs. HIV-1 replication can be potently inhibited by simultaneous expression of four antiviral miRNAs. These combined results indicate that the multiplex miRNA strategy is a promising therapeutic approach against escape-prone viral pathogens. By repeating this procedure we obtained constructs expressing different combinations of 1, 2, 3, 4 and 6 pri-miRNAs. The RNA structures formed by the transcripts were predicted with the Mfold program (47) at http://frontend.bioinfo.rpi.edu/applications/mfold/ and found to be similar to the predicted conformation of the wild-type pri-miRNAs.
The firefly luciferase (FL) reporters containing HIV-1 target sequences pol47 (Luc-A pol47 ), pol1 (Luc-B pol1 ), gag5 (Luc-C gag5 ), r/t5 (Luc-D r/t5 ), ldr9 (Luc-E ldr9 ) and the anti-HIV shRNAs have been described previously (32). doi = 10.1093/nar/gkn109 id = cord-302368-uhhtvdif author = Longhini, Andrew P. title = Chemo-enzymatic synthesis of site-specific isotopically labeled nucleotides for use in NMR resonance assignment, dynamics and structural characterizations date = 2016-04-07 keywords = CEST; NMR; NOESY; RNA summary = Finally, we showcase the improvement in spectral quality arising from reduced crowding and narrowed linewidths, and accurate analysis of NMR relaxation dispersion (CPMG) and TROSY-based CEST experiments to measure μs-ms time scale motions, and an improved NOESY strategy for resonance assignment. Additionally, we show that the measurements of relaxation parameters using CPMG, R 1 , and CEST are possible for both small and large RNAs. Furthermore, we demonstrate substantial improvements in signal-to-noise and line width for relaxation optimized spectroscopy (TROSY) experiments compared to the traditional heteronuclear single quantum coherence (HSQC) experiments for isolated two-spin systems approximated by our purine and pyrimidine labeling schemes (30) (31) (66) (67). Thus, RNAs synthesized with our selective site-specifically labeled NTPs should benefit from TROSY based NMR experiments that reduce the problems of crowding, fast signal decay, low resolution, and decreased S/N ratios (12, 34, 31, (66) (67) (80) (81) . [Nucleic Acids Research, 2016, Vol. 44, No. 6 e52]
doi = 10.1093/nar/gkv1333 id = cord-320325-sjab8zsk author = Mendez, Aaron S title = Site specific target binding controls RNA cleavage efficiency by the Kaposi's sarcoma-associated herpesvirus endonuclease SOX date = 2018-12-14 keywords = RNA; SOX; Supplementary; figure; limd1 summary = Using purified KSHV SOX protein, we reconstituted the cleavage reaction in vitro and reveal that SOX displays robust, sequence-specific RNA binding to residues proximal to the cleavage site, which must be presented in a particular structural context. Using an RNA substrate that is efficiently cleaved by SOX in cells, we revealed that specific RNA sequences within and outside of the cleavage site significantly contribute to SOX binding efficiency and target processing. Given that both substrates contain the requisite unpaired bulge at the predicted cleavage site (see Figure 2A and Supplementary Figure S2), these observations suggest that additional sequence or structural features impact SOX targeting efficiency on individual RNAs. Two SOX point mutants, P176S and F179A, located in an unstructured region of the protein that bridges domains I and II have been shown to be selectively required for its endonucleolytic processing of RNA substrates (Supplementary Figure S3A and S3B) (8, 21). doi = 10.1093/nar/gky932 id = cord-263645-wupre5uj author = Morgan, Brittany S title = Insights into the development of chemical probes for RNA date = 2018-09-19 keywords = HIV-1; RNA; chemical; molecule; small; target summary = One important example is the development of chemical probes, which has greatly progressed the study of proteins and related diseases (11, 12) but has been challenging for non-ribosomal RNAs. This powerful chemical tool requires small molecules with well-defined biological activity, cell permeability, and selectivity to accurately and reliably probe specific mechanistic and phenotypic questions (11, 12).
While ligands that bind non-ribosomal RNA in vitro have been reported for decades, the development of chemical probes with evidence of specific small molecule:RNA engagement in cell or animal models has dramatically increased in the last four years. Recent studies report several drug-like small molecules that target a range of RNAs in animal models, including riboswitches (15) , miRNAs, (16, 17) splice sites (18) , and mature mRNAs (19) , at least one of which is currently in clinical trials (NCT02268552). doi = 10.1093/nar/gky718 id = cord-275859-ix8du1er author = Mouzakis, Kathryn D. title = HIV-1 frameshift efficiency is primarily determined by the stability of base pairs positioned at the mRNA entrance channel of the ribosome date = 2012-12-15 keywords = HIV-1; RNA; figure; frameshift summary = In contrast, there is a strong correlation between frameshift efficiency and the local thermodynamic stability of the first 3–4 bp in the stem–loop, which are predicted to reside at the opening of the mRNA entrance channel when the ribosome is paused at the slippery site. Here, we investigate the role of the HIV-1 RNA structure in frameshifting, focusing on elucidating the relationships between frameshift efficiency and (i) the downstream RNA stem-loop thermodynamic stability, (ii) spacer length and (iii) surrounding genomic secondary structure. Our data further indicate that the base pairs important for frameshifting are located at a distance of 8 nt from the slippery site, which corresponds to the length of the spacer and is consistent with a structural model of the ribosome paused at the frameshift site. Instead, we observe a strong correlation (R 2 = 0.88) between frameshift efficiency and local stability of the first 3 bp at the base of the stem-loop using a one-phase exponential decay function ( Figure 3C and Supplementary Table S3 ). 
doi = 10.1093/nar/gks1254 id = cord-048478-ftlb5b95 author = Mroczek, Seweryn title = Apoptotic signals induce specific degradation of ribosomal RNA in yeast date = 2008-04-01 keywords = 25S; RNA; ROS; figure summary = One striking characteristic, which accompanies apoptosis in both vertebrates and yeast, is a fragmentation of cellular DNA and mammalian apoptosis is often associated with degradation of different RNAs. We show that in yeast exposed to stimuli known to induce apoptosis, such as hydrogen peroxide, acetic acid, hyperosmotic stress and ageing, two large subunit ribosomal RNAs, 25S and 5.8S, became extensively degraded with accumulation of specific intermediates that differ slightly depending on cell death conditions. For example, in metazoans the programmed cell death (PCD) called apoptosis, in addition to irreversible DNA damage, which is considered an apoptotic hallmark (3) , also involves specific cleavage of several RNA species, including 28S rRNA, U1 snRNA or Ro RNP-associated Y RNAs (4) . Together, this strongly suggests that rRNA degradation observed in apoptotic and oxidative stress conditions is not simply a result of cell death but is produced in the process that requires enzymatic activity and functional cellular machinery. doi = 10.1093/nar/gkm1100 id = cord-003711-l3brhmzq author = Munnur, Deeksha title = Reversible ADP-ribosylation of RNA date = 2019-06-20 keywords = ADP; PARP10; RNA; TRPT1; figure summary = ADP-ribosylation is a reversible chemical modification catalysed by ADP-ribosyltransferases such as PARPs that utilize nicotinamide adenine dinucleotide (NAD(+)) as a cofactor to transfer monomer or polymers of ADP-ribose nucleotide onto macromolecular targets such as proteins and DNA. We further reveal that ADP-ribosylation of RNA mediated by PARP10 and TRPT1 can be efficiently reversed by several cellular ADP-ribosylhydrolases (PARG, TARG1, MACROD1, MACROD2 and ARH3), as well as by MACROD-like hydrolases from VEEV and SARS viruses. 
Importantly, PARP3 could only ADP-ribosylate DNA ends and did not have any activity on RNA oligos, while PARP10 specifically modified phosphorylated ssRNA oligo in the conditions tested (Figure 1A and B). Since PARP10 can ADP-ribosylate both 5′ and 3′ phosphorylated ends of RNA, we tested both of these modified oligos as substrates for well characterized human ADP-ribosylhydrolases: PARG, TARG1, MACROD1, MACROD2 and ARH1-3. doi = 10.1093/nar/gkz305 id = cord-304794-z2kx314h author = Métifiot, Mathieu title = G-quadruplexes in viruses: function and potential therapeutic applications date = 2014-11-10 keywords = HIV-1; RNA; dna; figure; structure summary = Conversely, a G-quadruplex or G4 is formed by nucleic acid sequences (DNA or RNA) containing G-tracts or G-blocks (adjacent runs of guanines) and composed of various numbers of guanines. Short RNA templates from the central region of the HIV-1 genome contain G-rich sequences near the central polypurine tract (cPPT) at the 3′ end of the pol gene (IN coding sequence); this is a region where one of the two primers used for synthesizing the (−) strand DNA is produced during reverse transcription. In addition, one could imagine alternative therapeutic strategies focused on targeting RNA structures within viral ORFs to interfere with the virus cycle as well as to promote antigen presentation and to stimulate the host immune response.
Topology of a DNA G-quadruplex structure formed in the HIV-1 promoter: a potential target for anti-HIV drug development U3 Region in the HIV-1 genome adopts a G-quadruplex structure in its RNA and DNA sequence doi = 10.1093/nar/gku999 id = cord-297760-uzzuoy9v author = Naito, Yuki title = siVirus: web-based antiviral siRNA design software for highly divergent viral sequences date = 2006-07-01 keywords = RNA summary = siVirus () is a web-based online software system that provides efficient short interfering RNA (siRNA) design for antiviral RNA interference (RNAi). siVirus searches for functional, off-target minimized siRNAs targeting highly conserved regions of divergent viral sequences. siVirus will be a useful tool for designing optimal siRNAs targeting highly divergent pathogens, including human immunodeficiency virus (HIV), hepatitis C virus (HCV), influenza virus and SARS coronavirus, all of which pose enormous threats to global human health. Consequently, only a limited fraction of 21mers is suitable for use as antiviral siRNAs. In this study, we developed a novel web-based online software system, siVirus, which provides functional, off-target minimized siRNAs targeting highly conserved regions of divergent viral sequences.
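The idea of restricting candidates to 21mers that are conserved across divergent viral sequences can be illustrated with a simple k-mer count. This is only a sketch under stated assumptions, not the siVirus algorithm: the function name `conserved_kmers` and the `min_fraction` threshold are hypothetical, conservation is measured here as exact k-mer presence without any alignment, and the functional and off-target filters mentioned in the summary are not modelled.

```python
from collections import Counter

def conserved_kmers(sequences, k=21, min_fraction=0.9):
    """Map each k-mer to the fraction of sequences that contain it,
    keeping only k-mers present in at least min_fraction of them."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set ensures each k-mer is counted at most once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    total = len(sequences)
    return {
        kmer: count / total
        for kmer, count in counts.items()
        if count / total >= min_fraction
    }
```

With highly divergent input, raising `min_fraction` quickly shrinks the result, which mirrors the summary's remark that only a limited fraction of 21mers is suitable.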
Guidelines for the selection of highly effective siRNA sequences for mammalian and chick RNA interference siDirect: highly effective, target-specific siRNA design software for mammalian RNA interference doi = 10.1093/nar/gkl214 id = cord-000010-prsvv6l9 author = Qin, Jian title = Studying copy number variations using a nanofluidic platform date = 2008-08-18 keywords = PCR; copy; dna summary = Copy number variations (CNVs) in the human genome are conventionally detected using high-throughput scanning technologies, such as comparative genomic hybridization and high-density single nucleotide polymorphism (SNP) microarrays, or relatively low-throughput techniques, such as quantitative polymerase chain reaction (PCR). We have developed a new technology to study copy numbers using a platform known as the digital array, a nanofluidic biochip capable of accurately quantitating genes of interest in DNA samples. Other existing technologies, such as quantitative polymerase chain reaction (PCR), are limited because of their inability to reliably distinguish less than a twofold difference in copy number of a particular gene in DNA samples (11) (12) (13) . In this study we demonstrate the use of a unique integrated nanofluidic system, the digital array, in the study of CNVs. The digital array (14, 15) is able to accurately quantitate DNA samples based on the fact that single DNA molecules are randomly distributed in more than 9000 reaction chambers and then PCR amplified. 
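Because single molecules are distributed at random across the reaction chambers, a chamber that scores positive may hold more than one molecule, so counting positives undercounts the input. A common way to correct for this in digital PCR is a Poisson estimate. The sketch below is a generic illustration of that correction, not the authors' stated analysis; the function name `poisson_copy_estimate` is hypothetical, and the summary only says that the array has more than 9000 chambers.

```python
import math

def poisson_copy_estimate(positive_chambers, total_chambers):
    """Estimate the number of target molecules loaded onto the array.

    If molecules land in chambers independently, the fraction of empty
    chambers is exp(-m), where m is the mean number of molecules per chamber.
    Solving for m gives m = -ln(1 - positive / total).
    """
    if not 0 <= positive_chambers < total_chambers:
        # A fully positive array is saturated and cannot be quantified.
        raise ValueError("need 0 <= positive_chambers < total_chambers")
    fraction_positive = positive_chambers / total_chambers
    mean_per_chamber = -math.log(1.0 - fraction_positive)
    return mean_per_chamber * total_chambers
```

The ratio of two such estimates, one for the gene of interest and one for a reference gene in the same sample, is one way to express relative copy number.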
doi = 10.1093/nar/gkn518 id = cord-273107-xc61osdx author = Qureshi, Abid title = AVPdb: a database of experimentally validated antiviral peptides targeting medically important viruses date = 2014-01-01 keywords = database; peptide summary = Therefore, we have developed AVPdb, available online at http://crdd.osdd.net/servers/avpdb, to provide a dedicated resource of experimentally verified AVPs targeting over 60 medically important viruses including Influenza, HCV, HSV, RSV, HBV, DENV, SARS, etc. Whereas HIPdb is a specific database of experimentally validated HIV inhibiting peptides, which is freely available at http://crdd.osdd.net/servers/hipdb. Further 624 modified peptides were also extracted and have been provided separately in AVPdb. In our database, complete AVP data of almost all human viruses reported in the literature have been included. AVPdb also provides physicochemical properties and predicted structure of AVPs along with more informative tools for data analysis such as BLAST and MAP as well as links to major peptide resources. For modified AVPs also, a separate browse option is provided where the data can be sought by Virus, Modification, Peptide Source, Cell Line, Target and Assay. doi = 10.1093/nar/gkt1191 id = cord-048345-m56agj4z author = Reddy, Timothy E.
title = Positional clustering improves computational binding site detection and identifies novel cis-regulatory sites in mammalian GABA(A) receptor subunit genes date = 2007-01-03 keywords = GABA; Gibbs; figure summary = We evaluate the efficacy of our approach using known examples of binding and regulation in yeast and experimentally testing predicted TF-binding sites upstream of the subunit genes coding for the heteromeric mammalian neurotransmitter receptor system, the type A γ-aminobutyric acid receptor (GABA A R). In the current study, we test the ability of positional clustering to detect known TF-binding sites in a series of increasingly noisy sets of yeast promoters, and found marked improvement in the percentage of correct predictions over Gibbs sampling alone. We also present de novo predictions of TF-binding sites in promoter regions of GABA A receptor subunit genes (GABRs) whose expression is altered (either up-regulated or down-regulated) in an animal model of temporal lobe epilepsy (35). Using positional clustering, we predicted 13 TF-binding sites upstream of GABA A receptor subunit genes (Table 1). doi = 10.1093/nar/gkl1062 id = cord-000294-2g471tb4 author = Rhodin, Michael H. J. title = A flexible loop in yeast ribosomal protein L11 coordinates P-site tRNA binding date = 2010-08-12 keywords = L11; PRF; site summary = High-resolution structures reveal that yeast ribosomal protein L11 and its bacterial/archael homologs called L5 contain a highly conserved, basically charged internal loop that interacts with the peptidyl-transfer RNA (tRNA) T-loop. The two mutants also had opposing effects on binding of aa-tRNA to the ribosomal A-site, and downstream functional effects were observed on translational fidelity, drug resistance/hypersensitivity, virus maintenance and overall cell growth.
The high degree of similarity across species, from the primary amino acid sequences to their tertiary structures, suggests conserved functional roles beyond serving as mere scaffolding for the rRNAs. Ribosomal protein L11 of Saccharomyces cerevisiae is an essential, highly conserved component of the 60S subunit (in bacteria and archaea, the homologous protein is named L5; the yeast nomenclature is used throughout this text to minimize confusion). Analysis of the recent cryo-EM yeast ribosome structure (4) revealed that these H84 loop bases are located within 3 Å of the stretches of amino acids changed to alanines in both the 51-4A and 54-7A mutants ( Figure 5B ). doi = 10.1093/nar/gkq711 id = cord-324367-il9mz5na author = Rodnina, Marina V title = Translational recoding: canonical translation mechanisms reinterpreted date = 2020-02-20 keywords = Sec; UGA; codon; ribosome summary = This review summarizes the recent advances in understanding the mechanisms of three types of recoding events: stop-codon readthrough, –1 ribosome frameshifting and translational bypassing. In this review, we will focus on three types of recod-ing: (i) stop-codon readthrough; (ii) ribosome frameshifting and (iii) translational bypassing ( Figure 1 ). The key element for recruitment of the SelB-GTP-Sec-tRNA Sec to the stop codon on bacterial ribosomes is a selenocysteine insertion sequence (SECIS) in the mRNA, a SL structure located immediately downstream of the in-frame UGA codon at which Sec is incorporated (67) . A recent crystal structure of a translocation intermediate formed in the absence of EF-G indeed shows that the interactions of the ribosome with the codon-anticodon complex are disrupted and the A-site tRNA in the complex is shifted by one nucleotide toward the -1-frame of the mRNA (78) (Figure 4) . 
doi = 10.1093/nar/gkz783 id = cord-009318-zt1o1bcz author = Rolando, Justin C title = Real-time kinetics and high-resolution melt curves in single-molecule digital LAMP to differentiate and study specific and non-specific amplification date = 2020-04-17 keywords = Bst; HRM; amplification; figure; lod; specific summary = We then develop a real-time digital LAMP (dLAMP) with high-resolution melting temperature (HRM) analysis and use this single-molecule approach to analyze approximately 1.2 million amplification events. By differentiating true and false positives, HRM enables determination of the optimal assay and analysis parameters that leads to the lowest limit of detection (LOD) in a digital isothermal amplification assay. To test this hypothesis, we used a dLAMP assay with CT DNA as the target (combined with sequencing to identify the products of bulk reactions) to analyze both specific and non-specific amplification under conditions that include clinically relevant concentrations of background human DNA. Single digital partition counts were observed at low-T m non-specific amplification in both the presence of template and the NTC and independent of hgDNA concentration ( Figure 9B and C). When using Bst 3.0 (Supplementary Figure S11F) and HRM to remove non-specific amplification, LOD tracks with the number of true-positive events. doi = 10.1093/nar/gkaa099 id = cord-308331-55ge7kmr author = Routh, Andrew title = Discovery of functional genomic motifs in viruses with ViReMa–a Virus Recombination Mapper–for analysis of next-generation sequencing data date = 2013-10-09 keywords = FHV; RNA; recombination summary = Using ViReMa, we demonstrate that by mapping the distribution and frequency of recombination events in the genome of flock house virus (FHV), we can discover de novo functional genomic motifs required for viral replication and encapsidation. 
Here, segments at the 5′ and the 3′ end of a complex recombination event have been mapped to nt 500-550 and nt 1040-1080 of FHV RNA 1, but there remain a small number of trimmed nucleotides in the middle. We generated 5 043 791 synthetic reads containing 99 033 unique recombination events and aligned these reads to the FHV genome with ViReMa using a seed length of 20 nt. Our analysis of FHV demonstrates that by isolating a small number of virus particles, deep sequencing the encapsidated RNA and mapping the positions of recombination events, functional RNA motifs can be discovered. doi = 10.1093/nar/gkt916 id = cord-319116-2ts6zpdb author = Ruggiero, Emanuela title = G-quadruplexes and G-quadruplex ligands: targets and tools in antiviral therapy date = 2018-04-20 keywords = HIV-1; LTR; RNA; dna summary = Since the number of reports describing the presence of G4s in virus genomes has boomed in the past 2 years and treatment with several G4 ligands has shown potentially interesting therapeutic activity, we here aim at presenting, organizing and discussing an up-to-date close-up of the literature on G4s in viruses and the classes of molecules that have shown antiviral activity by viral G4 targeting. The research of G4s in the HIV-1 genome has been quite productive, concerning not only the two RNA viral genome copies, but also the integrated proviral genome, specifically For each virus the following information is shown: virion structure and dimension, genome size and organization; schematic representation of the G4 (red dots) location in the viral genomes or in the mRNA and G4 binding proteins; number of G4s assessed through bioinformatics analysis, according to the corresponding references; G4 ligands reported to date to display antiviral effect and corresponding references.
doi = 10.1093/nar/gky187 id = cord-001502-omq0becw author = Shabanpoor, Fazel title = Bi-specific splice-switching PMO oligonucleotides conjugated via a single peptide active in a mouse model of Duchenne muscular dystrophy date = 2015-01-09 keywords = Dmd; PMO; figure summary = Studies to date have focused on the delivery of a single SSO conjugated to a CPP, but here we describe the conjugation of two phosphorodiamidate morpholino oligonucleotide (PMO) SSOs to a single CPP for simultaneous delivery and pre-mRNA targeting of two separate genes, exon 23 of the Dmd gene and exon 5 of the Acvr2b gene, in a mouse model of Duchenne muscular dystrophy. Importantly, the cell viability using a bi-specific compound was significantly better than for a mixture of the two individual Pip6a-PMOs. We furthermore assessed the potential of this approach in an in vivo environment through intramuscular administration and demonstrated that there were no significant differences in exon skipping activities for both Dmd and Acvr2b targets between bi-specific conjugates (D2 and D3) and a cocktail of the individual P-PMO equivalents. [Nucleic Acids Research, 2015, Vol. 43, No. 1 31] doi = 10.1093/nar/gku1256 id = cord-001337-a3y1rfas author = Sharma, Virag title = Analysis of tetra- and hepta-nucleotides motifs promoting -1 ribosomal frameshifting in Escherichia coli date = 2014-07-01 keywords = IS3; figure; motif summary = The present study combines experimental and bioinformatics approaches to carry out (i) a systematic analysis of the frameshift propensity of all possible motifs (16 Z_ZZN tetramers and 64 X_XXZ_ZZN heptamers) in Escherichia coli and (ii) the identification of genes potentially using this mode of expression amongst 36 Enterobacteriaceae genomes.
Prokaryotic signals sometimes possess another type of stimulatory element upstream of the frameshift motif: a Shine-Dalgarno (SD)-like sequence normally involved in translation initiation through pairing with the CCUCC sequence at the 3′ end of 16S ribosomal RNA (32) (33) (34). A preliminary study of 271 IS3 family members (see Supplementary Figure S2) led us to choose the following empirical rules: the structure is (i) a simple or branched hairpin of a length ranging from 17 to 140 nt, (ii) that starts 4-10 nt after the last base of the motif, (iii) with a G-C (or C-G) base-pair followed by at least three consecutive Watson-Crick or G-U or U-G base-pairs and (iv) has a ΔG unfold at 37 °C ≥ 7.6 kcal mol⁻¹; the ΔG at 37 °C value was determined using the default parameters of the RNAfold program from version 1.8.5 of the Vienna RNA package (47). doi = 10.1093/nar/gku386 id = cord-333502-3ulketgy author = Snyder, E. E. title = PATRIC: The VBI PathoSystems Resource Integration Center date = 2006-11-16 keywords = PATRIC summary = The PathoSystems Resource Integration Center (PATRIC) is one of eight Bioinformatics Resource Centers (BRCs) funded by the National Institute of Allergy and Infectious Diseases (NIAID) to create a data and analysis resource for selected NIAID priority pathogens, specifically proteobacteria of the genera Brucella, Rickettsia and Coxiella, and corona-, calici- and lyssaviruses and viruses associated with hepatitis A and E.
(i) collection and organization of existing genomic data for the eight pathosystems under a single, unified framework (ii) genome annotation and curation following standardized procedures (iii) visualization of raw data from analytical programs, as well as curated data (iv) creation of orthologous gene groups within each organism category allowing comparative analysis of gene content (v) prediction and visualization of bacterial metabolic pathways to complement functional analysis of proteins (vi) integration of online literature reviews from PathInfo (14) for selected organisms. doi = 10.1093/nar/gkl858 id = cord-337998-08tknscm author = Sztuba-Solinska, Joanna title = A small stem-loop structure of the Ebola virus trailer is essential for replication and interacts with heat-shock protein A8 date = 2016-11-16 keywords = EBOV; HSPA8; RNA; a30u; figure summary = Selective 2′-hydroxyl acylation analyzed by primer extension analysis of the secondary structure of the EBOV minigenomic RNA indicates formation of a small stem-loop composed of the HSPA8 motif, a 3′ stem-loop (nucleotides 1868–1890) that is similar to a previously identified structure in the replicative intermediate (RI) RNA and a panhandle domain involving a trailer-to-leader interaction. 3E-5E-GFP minigenome RNA secondary structure and host protein interactions were examined using selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) (38, 39), antisense-interfered SHAPE (aiSHAPE) (40), electrophoretic mobility shift assays (EMSA), siRNA, and mutational analysis, using both the 3E-5E-GFP minigenome system and EBOV reverse genetics.
The secondary structure of the EBOV 3E-5E-GFP minigenome RNA predicted by RNAstructure software version 5.7 (48) and chemical probing data from SHAPE were used to generate 10 three-dimensional (3D) models for the trailer-to-leader panhandle interaction in the wt-EBOV genome and variants, A30U and A26U/A30U, using open-source RNAComposer, version 1.0 (http://rnacomposer.cs.put.poznan.pl/). doi = 10.1093/nar/gkw825 id = cord-269150-d1sgnxc0 author = Tan, Yong Wah title = Binding of the 5′-untranslated region of coronavirus RNA to zinc finger CCHC-type and RNA-binding motif 1 enhances viral replication and transcription date = 2012-02-22 keywords = -utr; IBV; MADP1; RNA summary = In a screen based on a yeast three-hybrid system using the 5′-untranslated region (5′-UTR) of SARS coronavirus (SARS-CoV) RNA as bait against a human cDNA library derived from HeLa cells, we found a positive candidate cellular protein, zinc finger CCHC-type and RNA-binding motif 1 (MADP1), to be able to interact with this region of the SARS-CoV genome. In this study, we describe the interaction of a cellular protein, MADP1 (zinc finger CCHC-type and RNA-binding motif 1), with the 5′-UTR of IBV and SARS-CoV, using a yeast-based three-hybrid screen (34) and RNA-binding assays. Using indirect immunofluorescence, we confirmed that MADP1, despite being reported as a nuclear protein (35), was detected in the cytoplasm of virus-infected cells and partially co-localized with the RTCs. Upon silencing of MADP1 using siRNA, viral RNA synthesis in general was affected, resulting in a lower replication efficiency and infectivity.
doi = 10.1093/nar/gks165 id = cord-000532-e18licyc author = Tholstrup, Jesper title = mRNA pseudoknot structures can act as ribosomal roadblocks date = 2011-09-08 keywords = RNA; figure; pseudoknot summary = The results shown in Figure 2 revealed that a 1D SDS-PAGE assay could not firmly distinguish polypeptides originating from a −1 frameshifted ribosome stalled in the pseudoknot from the non-frameshifted product in a "Downstream Stop" construct. In order to quantify the amount of −1 frameshifted ribosomes stalled inside the pseudoknot, we performed a 2D SDS-PAGE separation of the radioactively labelled proteins originating from the "Upstream Stop" construct (Supplementary Figures S4 and S5), which is the type of construct most commonly used throughout the literature. In the "Upstream Stop" construct the non-frameshifting ribosomes will translate gene10 and terminate at a UAA stop codon in the spacer sequence and produce a 28 kDa polypeptide. In the following subsections "Identification of transcripts from the T7gene10-PK-lacZ gene fusions", "Messenger RNA stability" and "Coupling between translation and transcription is required for full-length transcripts", we will show that the observed proteins did indeed originate from stalled ribosomes and that they were not caused by other effects. doi = 10.1093/nar/gkr686 id = cord-289274-3g67f8sw author = Tosoni, Elena title = Nucleolin stabilizes G-quadruplex structures folded by the LTR promoter and silences HIV-1 viral transcription date = 2015-10-15 keywords = HIV-1; LTR; NCL; figure summary = The implementation of electrophoretic mobility shift assay and pull-down experiments coupled with mass spectrometric analysis revealed that the cellular protein nucleolin is able to specifically recognize G-quadruplex structures present in the LTR promoter.
In this direction, the significance of these structures as focal points of interactions with host and viral factors is supported also by the observation that G4-folded sequences are specifically recognized by various viral proteins, such as the Epstein Barr Virus Nuclear Antigen 1 (34, 35) and the SARS coronavirus unique domain (SUD), which occurs exclusively in highly pathogenic strains (36). The LTR-II+III+IV oligonucleotide was incubated with extracts of HIV-1 producing and non-producing 293T cells to test whether the presence of viral proteins affected in any detectable way the observed EMSA profiles (Figure 1C). Positive identification was also confirmed by performing EMSA analysis of samples that included the G4-folded wt and mutant LTR-II+III+IV sequences with either nuclear extracts or purified human NCL. (B) EMSA analysis of the binding of nuclear extract (NE) proteins and purified NCL to the wt and mutant LTR sequences. doi = 10.1093/nar/gkv897 id = cord-001824-7c37elh6 author = Tükenmez, Hasan title = The role of wobble uridine modifications in +1 translational frameshifting in eukaryotes date = 2015-10-30 keywords = Lys summary = doi = 10.1093/nar/gkv832 id = cord-048370-noscodew author = Wu, Rebecca P. title = Cell-penetrating peptides as transporters for morpholino oligomers: effects of amino acid composition on intracellular delivery and cytotoxicity date = 2007-08-01 keywords = ÀPMO summary = doi = 10.1093/nar/gkm478 id = cord-319649-d6dqr03e author = Yang, Jie title = A cypovirus VP5 displays the RNA chaperone-like activity that destabilizes RNA helices and accelerates strand annealing date = 2013-12-05 keywords = MBP; RNA; VP5; cpv; figure summary = Here, we expressed VP5 from type 5 Helicoverpa armigera cypovirus (HaCPV-5) in a eukaryotic system and determined that this VP5 possesses RNA chaperone-like activity, which destabilizes RNA helices and accelerates strand annealing independent of ATP.
In this study, we expressed HaCPV-5 VP5 in a eukaryotic expression system and determined that this CPV VP5 possesses an RNA chaperone-like activity that destabilizes RNA helices and accelerates strand annealing in an ATP-independent manner. Moreover, we found that HaCPV-5 VP5 could facilitate the transcription initiation of an alternative polymerase (i.e. reverse transcriptase) through a CPV panhandle-structured RNA template, thereby strongly suggesting a direct role of the RNA chaperone activity of VP5 in the initiation of cypoviral dsRNA replication. In the family Reoviridae, CPV VP5 may not be the only RNA chaperone, as rotavirus nonstructural protein 2 (NSP2), which is a multifunctional enzyme involved in rotaviral dsRNA replication, was previously shown to contain ATP-independent nucleic acid helix-destabilizing activity (45). doi = 10.1093/nar/gkt1256 id = cord-275232-0sg0hv9w author = Yeung, Siu-Wai title = A DNA biochip for on-the-spot multiplexed pathogen identification date = 2006-09-25 keywords = PCR; dna; probe summary = The DNA-based identification of the two model pathogens involved a number of steps including a thermal lysis step, magnetic particle-based isolation of the target genomes, asymmetric PCR, and electrochemical sequence-specific detection using silver-enhanced gold nanoparticles. The assay involves the following steps: (i) sample preparation using thermal cell lysis and magnetic particle-based target genome isolation; (ii) target DNA amplification by the PCR; (iii) hybridization of the amplicons to their complementary oligonucleotide capture probes immobilized onto individual detection electrode surfaces and (iv) electrochemical transduction of the recognition event via gold nanoparticles with signal amplification using electrocatalytic silver deposition (10).
The three main steps were (A) sample preparation: thermal cell lysis and magnetic particle-based isolation of specific genomic DNAs; (B) target DNA amplification: generation of single-stranded rich amplicons by asymmetric PCR; (C) product detection: gold nanoparticle labeling, electrocatalytic silver deposition, and electrochemical silver dissolution. doi = 10.1093/nar/gkl702 id = cord-000293-pc4x5e24 author = Yu, Chien-Hung title = Stimulation of ribosomal frameshifting by antisense LNA date = 2010-08-06 keywords = LNA; RNA; dna summary = The requirements for −1 ribosomal frameshifting are the presence of a slippery heptanucleotide sequence X XXY YYZ (where X can be A, U, G or C; Y can be A or U; and Z does not equal Y; the spaces indicate the original reading frame) (8) followed by a downstream structural element, such as a pseudoknot, a hairpin or an antisense oligonucleotide duplex [for reviews, see (9)]. We and others have demonstrated that small RNA oligonucleotides are able to mimic the function of frameshifter pseudoknots or hairpins by redirecting ribosomes into new reading frames (27, 28). These results demonstrate that LNA modifications indeed enhance the antisense-induced frameshifting efficiency, probably due to higher thermodynamic stability and RNA-like structural properties. In addition to RNA oligonucleotides, we demonstrated that LNA/DNA mix-mers are also capable of stimulating efficient −1 ribosomal frameshifting, in contrast to DNA oligonucleotides.
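The slippery heptanucleotide rule quoted in this summary (X XXY YYZ, with X any base, Y either A or U, and Z different from Y) can be illustrated with a short scan over an RNA string. This is only a minimal sketch and not the authors' method: the function name `find_slippery_sites`, the optional `frame` argument and the example sequences are assumptions made for illustration.

```python
import re

# X XXY YYZ: three identical bases (X), three identical A/U bases (Y), then Z.
# The lookahead lets overlapping candidates be reported; Z != Y is checked below.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2([AU])\3\3([ACGU])))")

def find_slippery_sites(rna, frame=None):
    """Return (start, heptamer) pairs for candidate slippery sites.

    `frame` (hypothetical parameter) is the 0-based offset of the first
    base of the first zero-frame codon. When given, only heptamers whose
    XXY triplet is an in-frame codon are kept, matching the spacing in
    "X XXY YYZ".
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    for match in SLIPPERY.finditer(rna):
        heptamer, y, z = match.group(1), match.group(3), match.group(4)
        if z == y:  # the rule requires Z to differ from Y
            continue
        start = match.start()
        if frame is not None and (start + 1 - frame) % 3 != 0:
            continue
        hits.append((start, heptamer))
    return hits
```

For instance, `find_slippery_sites("GGUUUUUUAGG")` reports the heptamer UUUUUUA at position 2, whereas a run such as AAAAAAA is rejected because Z equals Y. A hit only marks a candidate site: as the summary notes, frameshifting also requires a downstream structural element.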
Triplex structures in an RNA pseudoknot enhance mechanical stability and increase efficiency of −1 ribosomal frameshifting doi = 10.1093/nar/gkq650 id = cord-000482-wifs97yy author = Yu, Chien-Hung title = Stem–loop structures can effectively substitute for an RNA pseudoknot in −1 ribosomal frameshifting date = 2011-07-29 keywords = RNA; SRV-1; figure; hairpin summary = −1 Programmed ribosomal frameshifting (PRF) in synthesizing the gag-pro precursor polyprotein of Simian retrovirus type-1 (SRV-1) is stimulated by a classical H-type pseudoknot which forms an extended triple helix involving base–base and base–sugar interactions between loop and stem nucleotides. In short, pDUAL-HIV(0) was digested by KpnI and BamHI, followed by insertion of complementary oligonucleotides to clone the SRV-1 gag-pro pseudoknot, various hairpins as shown in Figures 2C and 5, and a negative control (NC) which formed no apparent secondary structure downstream of the slippery sequence. These data support the notion that downstream structures serve as barriers to stall translating ribosomes to stimulate frameshifting, and demonstrate that there is a correlation between the thermodynamic stability of a hairpin and its frameshift inducing capacity. In the present study, a 12 bp hairpin derivative of the SRV-1 gag-pro pseudoknot with a calculated stability of −26.9 kcal/mol was capable of inducing 22% of frameshifting, which is only 1.4-fold . doi = 10.1093/nar/gkr579 id = cord-284990-klsl1nzn author = Zhang, Dapeng title = A novel immunity system for bacterial nucleic acid degrading toxins and its recruitment in various eukaryotic and DNA viral systems date = 2011-02-08 keywords = CDI; SUKH; domain; figure; protein; toxin summary = By analyzing the toxin proteins encoded in the neighborhood of the SUKH superfamily we predict that they possess domains belonging to diverse nuclease and nucleic acid deaminase families.
Our above observations indicate that outside of CDI systems, the SUKH superfamily genes are linked to genes encoding the HNH and NucA nucleases; hence, it is likely that even these nucleases function as distinct but analogous toxins that cleave nucleic acids in target cells. Together, the above observations raised the possibility that the SUKH superfamily protein might serve as immunity proteins, not just in certain proteobacterial CDI systems, but also more generally function, across all major bacterial lineages, to protect against linked genes, which are predicted to act as toxins. In bacteria the SUKH superfamily domains are one of the most widespread immunity proteins that appear to function in conjunction with a repertoire of nuclease toxins that are extremely diverse in sequence and structure (Figures 3 and 4). doi = 10.1093/nar/gkr036 id = cord-291070-y0wf456f author = Zhang, Guang Lan title = PRED(BALB/c): a system for the prediction of peptide binding to H2(d) molecules, a haplotype of the BALB/c mouse date = 2005-07-01 keywords = BALB; MHC summary = PRED(BALB/c) is a computational system that predicts peptides binding to the major histocompatibility complex-2 (H2(d)) of the BALB/c mouse, an important laboratory model organism. To our knowledge, this is the first online server for the prediction of peptides binding to a complete set of major histocompatibility complex molecules in a model organism (H2(d) haplotype). PRED(BALB/c) is a computational system for the prediction of peptides binding to all five MHC molecules in BALB/c mice (H2(d)), class I (H2-K(d), H2-L(d) and H2-D(d)) and class II (I-A(d) and I-E(d)), that allows analysis of proteins for the presence of binding motifs to all five H2(d) molecules in parallel. We derived the initial quantitative matrices for PRED(BALB/c) using logarithmic equations based on the frequency of amino acids at specific positions within the training set of 9mer peptides as described previously (16).
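The idea of a quantitative matrix built from position-specific amino acid frequencies can be sketched as follows. The summary does not give the actual equations, so everything here is an assumption for illustration: the log-odds form against a uniform background, the pseudocount, the function names `build_matrix` and `score_peptide`, and the toy peptides (which are made up and are not real binding data).

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_matrix(peptides, pseudocount=1.0):
    """Build one {amino acid: log-odds} column per peptide position.

    Assumption: score = ln(smoothed frequency / uniform background).
    This is a generic stand-in, not the published PRED(BALB/c) formula.
    """
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all training peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        column = {
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        matrix.append(column)
    return matrix

def score_peptide(matrix, peptide):
    """Sum the per-position scores of a peptide of matching length."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length does not match the matrix")
    return sum(column[aa] for column, aa in zip(matrix, peptide))

# Toy 9mer training set (hypothetical, for illustration only).
toy_training = ["AYAAAAAAL", "CYCCCCCCL", "DYDDDDDDI"]
toy_matrix = build_matrix(toy_training)
```

With this toy set, a peptide that shares the conserved Y at position 2 and L at position 9 scores higher than one that lacks them, which is the behaviour a frequency-based matrix is meant to capture.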
To our knowledge, PRED(BALB/c) is the first online server for the prediction of peptides binding to a complete set of MHC molecules in a model organism (H2(d) haplotype). doi = 10.1093/nar/gki479 id = cord-271701-tx0lqgff author = te Velthuis, Aartjan J.W. title = The SARS-coronavirus nsp7+nsp8 complex is a unique multimeric RNA polymerase capable of both de novo initiation and primer extension date = 2011-10-29 keywords = RNA; SARS; figure summary = Commonly, its core subunit is a single RNA-dependent RNA polymerase (RdRp) that drives the production of template strands for replication, new genome molecules, and, in many RNA virus groups, also subgenomic (sg) mRNAs. This canonical RdRp is structurally conserved among RNA viruses and widely accepted to drive catalysis of phosphodiester bond formation via a well-established reaction mechanism involving two metal ions that are coordinated by aspartate residues in its motifs A and C (3) (4) (5). Interestingly, both nsp8 and nsp(7+8) are able to extend the RNA primers beyond template length in the presence of heparin (Figure 4D and Supplementary Figure S2B), suggesting that these extensions result from terminal transferase activity and not from template switching, as was previously observed for poliovirus 3D pol (20). Subsequent alanine substitution of the N-terminal D/ExD/E motif, composed of D50 and D52 in SARS-CoV, greatly affected primer extension activity on the CU 10 template as shown in Figure 5C. doi = 10.1093/nar/gkr893