key: cord-0261923-bx9v9ex8
authors: Pasquesi, Giulia Irene Maria; Kelly, Conor J.; Ordonez, Andrea D.; Chuong, Edward B.
title: Transcriptional dynamics of transposable elements in the type I IFN response in Myotis lucifugus cells
date: 2022-04-18
journal: bioRxiv
DOI: 10.1101/2022.04.18.488675
sha: 9849f790ad279d2a83c1be5364d3a282f4b1f736
doc_id: 261923
cord_uid: bx9v9ex8

Background: Bats are a major reservoir of zoonotic viruses, and there has been growing interest in characterizing bat-specific features of innate immunity and inflammation. Recent studies have revealed bat-specific adaptations affecting interferon (IFN) signaling and IFN-stimulated genes (ISGs), but we still have a limited understanding of the genetic mechanisms that have shaped the evolution of bat immunity. Here we investigated the transcriptional and epigenetic dynamics of transposable elements (TEs) during the type I IFN response in little brown bat (Myotis lucifugus) primary embryonic fibroblast cells, using RNA-seq and CUT&RUN.

Results: We found multiple bat-specific TEs that undergo both locus-specific and family-level transcriptional induction in response to IFN. Our transcriptome reassembly identified multiple ISGs that have acquired novel exons from bat-specific TEs, including NLRC5, SLFN5 and a previously unannotated isoform of the IFITM2 gene. We also identified examples of TE-derived regulatory elements, but did not find strong evidence supporting genome-wide epigenetic activation of TEs in response to IFN.

Conclusion: Collectively, our study uncovers numerous TE-derived transcripts, proteins, and alternative isoforms that are induced by IFN in Myotis lucifugus cells, highlighting candidate loci that may contribute to bat-specific immune function.

To characterize the contribution of TEs to the IFN response in bats, we conducted transcriptomic and epigenomic profiling of the type I IFN response in M. lucifugus primary embryonic fibroblast cells (Fig. 1).
We stimulated cells using recombinant universal IFN alpha (IFNa) and profiled the transcriptome at the 0, 4 and 24 h time points using RNA-seq. We confirmed the cellular response to universal IFN treatment using qPCR on canonical ISGs (Fig. S1), as shown previously for M. lucifugus dermal fibroblasts [8]. We also profiled the 0 and 4 h time points using CUT&RUN to map the genome-wide localization of H3K27ac, POLR2A, and STAT1. We aligned these data to a chromosome-scale HiC assembly of the little brown bat genome (myoLuc2.0_HiC) [36], which was the most contiguous assembly available (scaffold N50 of ~95.5 Mb).

Prior to analyzing our functional genomic data, we performed de novo repeat identification on the myoLuc2.0_HiC assembly using RepeatModeler2 and HelitronScanner [37-39], followed by repeat annotation using RepeatMasker [40]. We annotated 42.7% of the genome as [...]

To analyze transcriptional activity at the TE family level, we mapped RNA-seq reads to both genes and TE families using TEtranscripts (Fig. 2B) [42]. On average, 6.38% of RNA-seq reads mapped to TEs in unstimulated cells, while 7.26% of reads mapped to TEs after 4 h of IFN treatment, and 7.15% after 24 h of IFN treatment. The most abundant TE-derived transcripts we identified included L1 LINEs, DNA/hAT elements, ERVs, SINEs and L2 LINEs. We identified 45 TE families that showed significant family-level transcriptional induction at 4 h (adj. p-val < 0.05; log2FC > 1.5), and 8 families induced at 24 h according to the same cutoff thresholds (Fig. 2C). These included multiple ERV families (21), L1 LINEs (6) and DNA transposons (6) [...]

We identified a total of 11 transcripts with a TSS deriving from a TE that are IFN-inducible at 4 h, 9 of which are shared with the 24 h subset (Table S7). Most of these belong to genes known to be involved in immune function and regulation, like NLRC5 (Fig. 4). [...]

Of these regions, we found that 466 out of 1113 fully overlapped at least one TE (Table S10). Additionally, we identified 766 inducible, STAT1-bound TEs that fall within 100 kb of an ISG (Table S11). These include an LTR14_ML element that may be functioning as the promoter for the NLRC5 locus, in addition to an intronic Ves2_ML SINE element (Fig. 4). However, in contrast to previous studies in other species [22-26], we did not observe any overrepresented TE families within this set (Fig. S3B; Table S12). The only subfamilies that overlapped more than [...]

We also found that only a small subset of genes that were overexpressed at 4 h maintain high [...]

We also identified instances of TE-derived constitutively expressed genes. We verified through multiple BLAST and sequence alignments that the ~100 amino acids of the EEF1A1 protein of [...]

[...] MultiQC v1.7 [65]. Filtered FASTQ files were then mapped to the myoLuc2_HiC genome using a 2-pass approach in STAR v2.7.3a [66]. STAR was run following default parameters and [...] DESeq2 v1.32 [68] (Table S3) [...]

[...] we were unable to identify peaks that were significantly upregulated in response to IFN with an FDR < 0.10. We therefore took a more relaxed approach, retaining all peaks with an unadjusted p-val < 0.10 and log2FC > 0. For visualization, log2FC values were shrunken using apeglm v1.8.0 [83]. Motif analysis was performed using XSTREME v5.4.1 [84] with options '-- [...]

References

The World Goes Bats: Living Longer and Tolerating Viruses
A comparison of bats and rodents as reservoirs of zoonotic viruses: are bats special?
Egyptian rousette bats maintain long-term protective immunity against Marburg virus infection despite diminished antibody levels
Pteropid bats are confirmed as the reservoir hosts of henipaviruses: a comprehensive experimental study of virus transmission
Experimental inoculation of plants and animals with Ebola virus
Replication and shedding of MERS-CoV in Jamaican fruit bats (Artibeus jamaicensis). Sci Rep
Contraction of the type I IFN locus and unusual constitutive expression of IFN-α in bats
Fundamental properties of the mammalian innate immune system revealed by multispecies comparison of type I interferon responses
Response in Bats Displays Distinctive IFN-Stimulated Gene Expression Kinetics with Atypical RNASEL Induction
The evolution of bat nucleic acid-sensing Toll-like receptors
Unique Loss of the PYHIN Gene Family in Bats
Dampened NLRP3-mediated inflammation in bats and implications for a special viral reservoir host
Dampened STING-Dependent Interferon Activation in Bats
A prenylated dsRNA sensor protects against severe COVID-19
Co-option of endogenous viral sequences for host cell function
Late viral interference induced by transdominant Gag of an endogenous retrovirus
Evolutionary journey of the retroviral restriction gene Fv1
Co-option of an endogenous retrovirus envelope for host defense in hominid ancestors
Methylation Causes an Interferon Response in Cancer via dsRNA Including Endogenous
Demethylating Agents Target Colorectal Cancer Cells by Inducing Viral Mimicry by Endogenous Transcripts
An influenza virus-triggered SUMO switch orchestrates co-opted endogenous retroviruses to stimulate host antiviral immunity
Regulatory evolution of innate immunity through co-option of endogenous retroviruses
HIV-1 infection activates endogenous retroviral promoters regulating antiviral gene expression
Transposable elements in mammals promote regulatory variation and diversification of genes with specialized functions
Transposable elements have contributed human regulatory regions that are activated upon bacterial infection
Rules of engagement: molecular insights from host-virus arms races
Evolution: Endogenous Viruses Provide Shortcuts in Antiviral
Multiple waves of recent DNA transposon activity in the bat, Myotis lucifugus
A Helitron transposon reconstructed from bats reveals a novel mechanism of genome shuffling in eukaryotes
Functional characterization of piggyBat from the bat Myotis lucifugus unveils an active mammalian DNA transposon
Massive amplification of rolling-circle transposons in the lineage of the bat Myotis lucifugus
Recurrent evolution of vertebrate transcription factors by transposase capture
The Potential Role of Endogenous Viral Elements in the Evolution of Bats as Reservoirs for Zoonotic Viruses. Annual Reviews
The evolution of endogenous retroviral envelope genes in bats and their potential contribution to host biology
De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds
RepeatModeler2 for automated genomic discovery of transposable element families
HelitronScanner uncovers a large overlooked cache of Helitron transposons in many plant genomes
Using and understanding RepeatMasker
Bats with hATs: evidence for recent DNA transposon activity in genus Myotis
TEtranscripts: a package for including transposable elements in differential expression analysis of RNA-seq datasets
Transcriptome assembly from long-read RNA-seq alignments with StringTie2
Novel Superfamily of DNA Transposable Elements Recently Active in Fish
Repbase Update, a database of repetitive elements in eukaryotic genomes
Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis
RNA polymerase gene in bat genomes derived from an ancient negative-strand RNA virus
Comparative analysis of bat genomes provides insight into the evolution of flight and immunity
Regulatory evolution of innate immunity through co-option of endogenous retroviruses
NLRC5: a key regulator of MHC class I-dependent immune responses
NLRC5 Functions beyond MHC I Regulation-What Do We Know So Far? Front Immunol
Latent enhancers activated by stimulation in differentiated cells
Synergistic activation of inflammatory cytokine genes by interferon-γ-induced chromatin remodeling and toll-like receptor signaling
Chiropteran types I and II interferon genes inferred from genome sequencing traces by a statistical gene-family assembler
A high-resolution map of human evolutionary constraint using 29 mammals
Juicebox Assembly Tools module facilitates de novo assembly of mammalian genomes with chromosome-length scaffolds for under $1000
Hi-C: a comprehensive technique to capture the conformation of genomes
Squamate reptiles challenge paradigms of genomic repeat element evolution set by birds and mammals
Targeted in situ genome-wide profiling with high efficiency for low cell numbers v3
Improved CUT&RUN chromatin profiling tools
Babraham bioinformatics - FastQC A quality control tool for high throughput sequence data
MultiQC: summarize analysis results for multiple tools and samples in a single report
STAR: ultrafast universal RNA-seq aligner
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2
WebGestalt 2019: gene set analysis toolkit with revamped UIs and APIs
Bairoch A, Apweiler R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000
architecture and applications
BEDTools: a flexible suite of utilities for comparing genomic features
RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome
gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes
Database resources of the national center for biotechnology information
Search and clustering orders of magnitude faster than BLAST
Fast gapped-read alignment with Bowtie 2
deepTools: a flexible platform for exploring deep-sequencing data
deepTools2: a next generation web server for deep-sequencing data analysis
Use Model-Based Analysis of ChIP-Seq (MACS) to Analyze Short Reads Generated by Sequencing Protein-DNA Interactions in Embryonic Stem Cells. Cell Transcriptional Networks: Methods and Protocols
Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences
XSTREME: Comprehensive motif analysis of biological sequence datasets
JASPAR 2020: update of the open-access database of transcription factor binding profiles
GIGGLE: a search engine for large-scale integrated genome analysis

Newly generated RNA-seq and CUT&RUN raw files have been deposited under the GEO SuperSeries accession GSE200833. Processed data, including TE annotation, can be visualized as a UCSC genome browser custom track.

The authors declare that they have no competing interests.