key: cord-0296360-vz9awlqv
authors: Yeung, Jason; Routh, Andrew L
title: ViReMaShiny: An Interactive Application for Analysis of Viral Recombination Data
date: 2022-04-06
journal: bioRxiv
DOI: 10.1101/2022.04.06.487215
sha: 05af6fbe14688dc67eb37830cf1093719e775b4d
doc_id: 296360
cord_uid: vz9awlqv

Recombination is an essential driver of virus evolution and adaption, giving rise to new chimeric viruses, structural variants, sub-genomic RNAs, and Defective-RNAs. Next-Generation Sequencing (NGS) of virus samples, from either experimental or clinical settings, has revealed a complex distribution of recombination events that contributes to intrahost diversity. We and others have previously developed alignment tools to discover and map these diverse recombination events in NGS data. However, there is no standard for data visualization to contextualize events of interest, and downstream analysis often requires bespoke coding. To address this, we present ViReMaShiny, a web-based application built using the R Shiny framework to allow interactive exploration and point-and-click visualization of viral recombination data provided in BED format generated by computational pipelines such as ViReMa (Viral-Recombination-Mapper).

Viruses exist as dynamic populations of diverse genomes (often referred to as intrahost diversity), maintained by the error-prone nature of viral replication (Lauring, et al., 2013). In addition to single-nucleotide variations (SNVs), viral recombination also contributes to intrahost diversity through the production of structural variants (SVs), sub-genomic RNAs (sgmRNAs), Defective-RNAs (D-RNAs) and chimeric viruses that seed the emergence of novel virus strains (Simon-Loriere and Holmes, 2011). Viral recombination has contributed to the generation of notable variants in SARS-CoV-2, such as conserved insertions and deletions in the Spike region of the Alpha, Delta and Omicron variants of concern (VOCs).
Consequently, recombination has an important influence on viral evolution both within single hosts and on ecological scales.

[...] intra- and extra-cellular compartments of alphaviruses (Langsjoen, et al., 2020), and compared recombination events from experimental and clinical isolates of SARS-CoV-2 (Jaworski, et al., 2021).

Due to the combination of skills required, these studies are commonly collaborations between wet-lab experimentalists and bioinformaticians. The collaborative process can require multiple iterations of analyses to home in on data that are both valid and biologically interesting. Data exploration using easily accessible, GUI-based applications can improve turn-around times between iterations and allow experimentalists with limited coding experience to actively engage in analysis.

We present an R Shiny application, ViReMaShiny, that enables rapid visualization of viral recombination data from ViReMa or other applications that output recombination events as BED files. This application seeks to standardize the representation of key features of viral recombination events, such as their frequency and position relative to important genomic elements.

The ViReMaShiny application was created using the R Shiny framework and relies on the ggplot2 and circlize (Gu, et al., 2014) packages for plotting. The Shiny framework provides interactivity, extensibility, and flexibility, with local and web-hosted options available. The application takes as input BED files output by ViReMa (Sotcheff, et al., 2022) or other recombination mappers; these files are hosted locally. Either a single BED file or multiple BED files from multiple biological or experimental replicates can be uploaded. The BED files follow the standardized format depicted in Figure 1A. Briefly, the genome reference, strand, and [...] represented by the numerous points close to the x=y axis.
This approach has been used extensively to visualize recombination events in a range of viruses, including nodaviruses (Routh, et al., 2012), alphaviruses (Langsjoen, et al., 2020) and coronaviruses (Gribble, et al., 2021). Scrubbing the scatterplots generates a filterable table, allowing users to identify events with specific features. Alternatively, a text box above this table accepts filter expressions in R syntax, which are applied to the data based on any of the parameters in the BED file(s). This allows users to select specific events with desired features, such as only small InDels or only highly abundant events. Events in the table can be highlighted on the scatterplot using a toggle button or by clicking on a row in the sequence table.

[...] to use the application for these purposes. Associated documentation also includes tutorials for analysis of ViReMa output data in R. Code for the Shiny application is available for local use and extension at https://github.com/routhlab/ViReMaShiny.

ViReMaShiny standardizes outputs and improves the approachability of exploratory viral recombination analysis. This application is built on the outputs of the ViReMa Python script, allowing for intuitive investigation of data with no coding requirement. We plan to expand support for analysis in the R environment and to provide options to visualize recombination between multiple genes of multi-partite viruses such as influenza virus (Alnaji, et al., 2021). The application is hosted at https://routhlab.shinyapps.io/ViReMaShiny/ with associated documentation at https://jayeung12.github.io/. Code is available at https://github.com/routhlab/ViReMaShiny.
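As an illustration of the kind of filtering described above (keeping only small InDels or only highly abundant events from BED records), the following is a minimal Python sketch. It is not the application's code, which is written in R. It assumes the generic six-column BED layout (chrom, chromStart, chromEnd, name, score, strand) and further assumes that the score column holds the event count; the exact columns ViReMa writes may differ. The names `Event`, `parse_bed`, `small_indels` and `abundant`, the thresholds, and the example rows are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    reference: str  # BED column 1 (chrom): genome reference name
    start: int      # BED column 2 (chromStart)
    end: int        # BED column 3 (chromEnd)
    name: str       # BED column 4 (name): free-text label
    count: int      # BED column 5 (score); ASSUMED here to hold the event count
    strand: str     # BED column 6 (strand)


def parse_bed(lines: Iterable[str]) -> List[Event]:
    """Parse tab-separated BED6 lines, skipping blank, track, browser and comment lines."""
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith(("track ", "browser ", "#")):
            continue
        f = line.split("\t")
        events.append(Event(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), f[5]))
    return events


def small_indels(events: List[Event], max_span: int = 25) -> List[Event]:
    """Keep events whose two coordinates lie within max_span of each other."""
    return [e for e in events if abs(e.end - e.start) <= max_span]


def abundant(events: List[Event], min_count: int = 100) -> List[Event]:
    """Keep events observed at least min_count times."""
    return [e for e in events if e.count >= min_count]


# Made-up example rows; not real data.
EXAMPLE = [
    "track name=example",
    "virus\t100\t110\tDeletion\t250\t+",
    "virus\t500\t2500\tDeletion\t12\t+",
    "virus\t3000\t2990\tDuplication\t3\t-",
]

events = parse_bed(EXAMPLE)
print([e.start for e in small_indels(events)])
print([e.start for e in abundant(events)])
```

Events that pass `small_indels` are the ones that would fall close to the x=y diagonal of a start-versus-end scatterplot, since their two coordinates are nearly equal.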
Influenza A Virus Defective Viral Genomes Are Inefficiently Packaged into Virions Relative to Wild-Type Genomic RNAs
The coronavirus proofreading exoribonuclease mediates extensive viral recombination
circlize Implements and enhances circular visualization in R
Tiled-ClickSeq for targeted sequencing of complete coronavirus genomes with simultaneous capture of RNA recombination and minority variants
Parallel ClickSeq and Nanopore Sequencing elucidates the rapid evolution of Defective-Interfering RNAs in Flock House virus
Circos: an information aesthetic for comparative genomics
Differential Alphavirus Defective RNA Diversity between Intracellular and Extracellular Compartments Is Driven by Subgenomic Recombination Events
The role of mutational robustness in RNA virus evolution
Discovery of functional genomic motifs in viruses with ViReMa-a Virus Recombination Mapper-for analysis of next-generation sequencing data
Nucleotide-resolution profiling of RNA recombination in the encapsidated genome of a eukaryotic RNA virus by next-generation sequencing
Why do RNA viruses recombine?
A Virus Recombination Mapper of Next-Generation Sequencing data characterizes diverse recombinant viral nucleic acids
Covariation of viral recombination with single nucleotide variants during virus evolution revealed by CoVaMa

We would like to thank Dr. Fadi Alnaji [...]

The authors declare no conflicts of interest.

These files are also available at https://routhlab.shinyapps.io/ViReMaShiny and https://jayeung12.github.io