key: cord-0065286-it20um7r authors: Tian, Yun; Khwatenge, Collins N.; Li, Jiuyi; De Jesus Andino, Francisco; Robert, Jacques; Sang, Yongming title: Targeted Transcriptomics of Frog Virus 3 in Infected Frog Tissues Reveal Non-Coding Regulatory Elements and microRNAs in the Ranaviral Genome and Their Potential Interaction with Host Immune Response date: 2021-06-17 journal: Front Immunol DOI: 10.3389/fimmu.2021.705253 sha: 6e72ae1e32c3080641b034c765ae7d0d0ad51517 doc_id: 65286 cord_uid: it20um7r BACKGROUND: Frog Virus 3 (FV3) is a large dsDNA virus belonging to Ranaviruses of family Iridoviridae. Ranaviruses infect cold-blood vertebrates including amphibians, fish and reptiles, and contribute to catastrophic amphibian declines. FV3 has a genome at ~105 kb that contains nearly 100 coding genes and 50 intergenic regions as annotated in its reference genome. Previous studies have mainly focused on coding genes and rarely addressed potential non-coding regulatory role of intergenic regions. RESULTS: Using a whole transcriptomic analysis of total RNA samples containing both the viral and cellular transcripts from FV3-infected frog tissues, we detected virus-specific reads mapping in non-coding intergenic regions, in addition to reads from coding genes. Further analyses identified multiple cis-regulatory elements (CREs) in intergenic regions neighboring highly transcribed coding genes. These CREs include not only a virus TATA-Box present in FV3 core promoters as in eukaryotic genes, but also viral mimics of CREs interacting with several transcription factors including CEBPs, CREBs, IRFs, NF-κB, and STATs, which are critical for regulation of cellular immunity and cytokine responses. Our study suggests that intergenic regions immediately upstream of highly expressed FV3 genes have evolved to bind IRFs, NF-κB, and STATs more efficiently. Moreover, we found an enrichment of putative microRNA (miRNA) sequences in more than five intergenic regions of the FV3 genome. Our sequence analysis indicates that a fraction of these viral miRNAs is targeting the 3’-UTR regions of Xenopus genes involved in interferon (IFN)-dependent responses, including particularly those encoding IFN receptor subunits and IFN-regulatory factors (IRFs). CONCLUSIONS: Using the FV3 model, this study provides a first genome-wide analysis of non-coding regulatory mechanisms adopted by ranaviruses to epigenetically regulate both viral and host gene expressions, which have co-evolved to interact especially with the host IFN response. Frog virus 3 (FV3) is a large (~105 kb), double-stranded DNA (dsDNA) virus belonging to Ranaviruses of the family Iridoviridae, which consists of a group of emerging viruses infecting fish, amphibians, and reptiles (1, 2) . FV3 infects amphibians at various life stages; whereas the infection is usually lethal in tadpoles, adult animals are more resistant and even become asymptomatic carrier following the infection. Hence, FV3 has been isolated from both sick and apparently healthy frogs in the wild and laboratory conditions (1) (2) (3) . The association of FV3 with apparently healthy frogs indicates hostadaptive evolution for effective viral transmission and infection manifested at susceptible stages during the amphibian life cycle (1) (2) (3) . This resembles the balance between deadliness and contagiousness exhibited by most successful viruses, which have effectively caused epidemics even pandemics in affected animals and humans (4) . Increasing evidence suggests that Ranaviruses are important contributors of the catastrophic global amphibian declines, which pose emerging pressure on bio-ecological health and biodiversity (5) (6) (7) . So far, FV3 acts as the most frequently reported iridovirus in infected anuran cases worldwide; it is widespread in wild amphibians and the only ranavirus detected in turtles in North America (5) (6) (7) (8) . Vilaca et al. (8) detected several FV3 lineages in wild amphibians in Canada, and these new FV3 isolates seem to have undergone genetic recombination with common midwife toad virus (CMTV) (8, 9) . In this context, CMTV represents another ranavirus endangering amphibians and reptiles throughout Europe and Asia (8, 9) . Owing to their prevalence and negative impact on many aquatic vertebrate species, more extensive studies of ranavirus biology at the genomic and molecular level are needed (1) (2) (3) (4) (5) (6) (7) (8) (9) . FV3 is the one of the best characterized models for ranaviral research, and previous studies using this virus have discovered features applicable to all iridoviruses, including the characterization of two-stage viral genome replication, phagelike hyper-methylated genomic DNA, temporal transcription of coding genes, and virus-mediated arrest of host immune response (10) (11) (12) (13) (14) . Focused on coding genes, early studies had classically examined the expression of 47 viral RNAs and 35 viral proteins in FV3-infected fish cell lines, and designated them into immediate early, delayed early, and late genes expressed in a sequential fashion during the viral infection (10) (11) (12) . Majji et al. (15) reported a first FV3 transcriptomic analysis of all putative annotated 98 coding genes (or open reading frames, ORFs) using microarray (15) . They identified 33 immediate early (IE) genes, 22 delayed early (DE) genes, 36 late (L) viral genes, while seven genes remained undetermined (15) . These previous transcriptomic studies were performed in vitro mostly using a model of fathead minnow (FHM) fish cells (10) (11) (12) 15) . Thus, FV3's transcriptomic information in vivo in infected amphibians under pressure from host various microenvironment and immune responses may provide important and more realistic information about ranaviral transcriptome. Furthermore, besides the 98 coding genes that occupy about 80% of FV3's genome, there are about 50 intergenic regions from 20 to 900 nt long spanning the remaining~20% of FV3's genome. The potential regulatory property and transcription of these non-coding genomic regions is largely unknown. Given the relatively small size of viral genomes (even for large DNA viruses), it is reasonable to hypothesize that these intergenic regions in the FV3 genome exert a regulatory role underlying viral gene expression and virus-host interaction, especially at the epigenetic level (16, 17) . The best-characterized core promoter in eukaryotic genes contains a TATA-Box, which is located at the positions −25 and −30 from the transcription start site (TSS). The TATA-Box is recognized by the TATA-binding protein (TBP) in a complex of several other transcription factors (TF), which recruits the RNA polymerase II (pol II) to initiate transcription process (18) . Viruses rely on cellular metabolism for completing their infection cycle. Viral genes, thus, adopt similar cis-regulatory elements (CREs) for interacting with host transcription machinery and orchestrating viral and host gene expression (16) . For example, in human herpes simplex viruses (HSV), a recent study detected the binding sites for TBP, pol II, and a viral ICP4 protein on the promoter regions of representative immediate early (IE), early (E), and late (L) genes, and relevant CRE-TF interaction to mediate associated HSV gene expression in a function of time post-infection (19) . Various promoter elements have also been examined in other large dsDNA viruses of Poxviridae, Asfarviridae; Phycodnaviridae and Iridoviridae (20) . Studies of viral gene promoters in iridoviruses have mainly used FV3 and only focused on a few genes. A cis-element with 23 bp core region at 78-bp upstream of a major FV3 IE gene encoding ICP-18 (a.k.a, ICR-169, encoded by FV3gorf82R), was shown to interact with a FV3 protein (and potentially other cellular transcription factors) critical for transcription of ICP-18 gene (21) . Additional analysis of the promoter region for another IE gene encoding ICP-46 (a.k.a ICR489, encoded by FV3gorg91R) detected no similar CRE (22) . This lack of similarity between the two IE gene promoters indicated that the temporal regulation of IE genes is diverse. Furthermore, other CRE elements including those containing 'TATA', 'CAAT', and 'GC' motifs were identified in the ICP46 gene promoter, like to those of typical eukaryotic gene promoters (21) (22) (23) . Other studies of three Bohle iridovirus genes -two early (ICP-18 and ICP-46) and one late (major capsid protein [MCP]) identified conservative CRE motifs located 127 to 281 bp upstream of the transcription start site (TSS), and other ones located within 30 bp proximity to the TSS (21) (22) (23) (24) . While these studies provide a good first step, a more extensive analyses of promoter and relevant cis-trans interaction are imperative for understanding the temporal expression and transcriptomic profile of ranaviral genes, and for progressing in comparative studies of large dsDNA viruses (16, 20) . Viruses have evolved various strategies to evade host immune responses. In addition to the commonly studied antagonistic role exerted by viral proteins, multiple families of viruses, particularly DNA viruses, also encode regulatory microRNA (miRNA) species (25) . miRNAs are small non-coding RNAs acting as RNA silencing and post-transcriptional regulators of gene expression by targeting primarily 3'-UTR regions of cellular transcripts. Virus-derived miRNAs (v-miR) potently act on either host or virus transcripts, and have been shown to be critical in shaping host-pathogen interaction (26) . A variety of v-miRs has been identified in different DNA viruses, and their role in viral pathogenesis is emerging. v-miRs can subvert host defense responses and mediate other cellular processes such as cell death and proliferation. Whether v-miRs are present in ranavirus and play a role in regulation of virus-host interaction is largely unknown (25, 26) . Along with recent virome studies and the identification of novel ranavirus isolates (8, 9) , we performed a whole transcriptomic analysis (RNA-Seq) using total RNA samples containing both the viral and cell transcripts from FV3infected frog tissues (27) . The virus-specific transcriptome mapped authentic reads, which spanned the full FV3's genome at~10× depth (both positive and negative strands) in several infected tissue including intestine, liver, spleen, lung and particularly kidney. Focusing on viral coding genes, we previously profiled their differential expression in a virus-, tissue-, and temporal class-dependent manners. Further functional analysis based on transcriptomic detection unraveled some viral genes encoding hypothetical proteins that contain domains mimicking conserved motifs found in host interferon (IFN) regulatory factors (IRFs) or IFN receptors (27) . The IFN system is a critical antiviral mechanism that has diversified during vertebrate evolution. The IFN system in most tetrapod species include three types of IFNs (type I, II, and III), which are classified mainly based on type-specific molecular signatures and recognizing receptors (28) (29) (30) . The binding of an IFN ligand with its cognate receptor, thus, elicits a signaling cascade involving IFN receptors and various transcription factors such as IRFs and STATs (28) (29) (30) . Here, we report that in addition to reads mapping in the coding region, we also detected RNA-Seq reads that distributed in non-coding intergenic regions of both positive and negative strands the FV3 genome. Further analyses identified various non-coding regulatory CREs in these intergenic regions corresponding to transcriptomic profiles of the coding genes. These CREs include those similar to TATA-Box marking the core promoters of typical eukaryotic genes (18) , and viral mimics of CREs interacting with various transcription factors including CEBPs, CREBs, IRFs, NF-kB, and STATs, which are critical for regulation of cellular immunity and cytokine responses in antimicrobial immunity (29, 31) . Moreover, we discovered for the first time, an enrichment of putative viral miRNA sequences in more than five intergenic regions of FV3 genome. A variety of these viral miRNAs have the potential to target the 3'-UTR of Xenopus genes involved in antiviral IFN response, including those encoding IFN receptor subunits and IRFs (26) . Collectively, using FV3 model, this study provides a first comprehensive genome-wide analysis of non-coding regulatory mechanisms acquired by ranavirus pathogens to epigenetically regulate both viral and host gene expressions. Two Frog virus 3 (FV3) strains, a wild type (FV3-WT) and an ORF64R-deprived strain (FV3-D64R), were used. The virus preparation and animal infection were conducted as previously described (13, 27, 32) . In brief, fathead minnow (FHM) cells (ATCC ® CCL-42) or baby hamster kidney (BHK) cells (ATCC ® CCL-10) or a kidney A6 cell line (ATCC ® CCL-102) were maintained and used for propagation and titration of FV3 virus stocks. Virus stocks were purified and the virus load was assessed by plaque assays in the BHK or A6 cells. Outbred specific-pathogen-free adult (1-2 years old) frogs were obtained from the X. laevis research resource for immunology at the University of Rochester (http://www.urmc.rochester.edu/mbi/ resources/xenopus-laevis/). Animal handling procedures were approved and performed under strict laboratory and University Committee on Animal Resources (UCAR) regulations (approval number 100577/2003-151). Adult frogs with the comparable Age/body-weight were randomly allotted into mock controls and infected groups (n = 5/ group). Animal infections were conducted by intraperitoneal (i.p.) injection with FV3-WT (at 1 × 10 6 PFU/each) or FV3-D64R (at 1 × 10 6 PFU/each) virus in 100-ml amphibian phosphatebuffered saline solution (APBS) or only APBS for mock controls. At 0, 1, 3, and 6 days postinfection (dpi), animals were euthanized and indicated tissues were sampled and pairwise allotted for classical viral titration and gene expression analyses, and the samples of 3 dpi were cryopreserved for further transcriptomic analysis as described (13, 27, 32, 33) . Total RNA and DNA were isolated from frog cells or tissues using a TRIzol reagent (Invitrogen) for PCR-based assays or a column-based RNA/DNA/protein purification kit (Norgen Biotek, Ontario, Canada) for transcriptomic analysis. RNA concentration and integrity were examined with a NanoDrop 8000 spectrometer (NanoDrop, Wilmington, DE) and an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA) to ensure RNA samples with A260/A280>1.8 and RNA integrity number (RIN) >7.0 qualified for construction of sequencing libraries (27, 32, 33) . Quantitative PCR (qPCR) or qRT-PCR assays were conducted as described (29, 33) . In brief, 150 ng/reaction of DNA templates were used to measure FV3 gene copies based on detection of FV3gorf60R, which encodes a viral DNA polymerase II (Pol II), in an ABI 7300 real-time PCR system and PerfeCta SYBR green FastMix, ROX (Quanta) (29, 33) . For qRT-PCR analyses, assays were performed in a 96-well microplate format using a QuantStudio ™ 3 Real-Time PCR System (Thermofisher) with the validated primers. Reactions were formed with a SYBR Green RT-PCR kit (Qiagen, Valencia, CA) with 500 ng of total RNA in a 20-ml reaction mixture. Specific optic detection was set at 78°C for 15 s after each amplification cycle of 95°C for 15 s, 56-59°C for 30 s and 72°C for 40 s. Cycle threshold (Ct) values and melt curves were monitored and collected with an enclosed software. Relative gene expression was first normalized against Ct values of the housekeeping gene (GAPDH) for relative expression levels, and compared with the expression levels of control samples for stimulated regulation if needed (29, 32, 33) . RNA sample and RNA-Seq sequencing library preparation were performed using the Illumina Pipeline (Novogene, Sacramento, CA) as previously described (27) . For RNA-Seq, approximately 40 M clean reads per sample were generated for sufficient genome-wide coverage. The clean reads were assembled and mapped to the Reference genome/transcripts of X. laevis or FV3 virus through Xenbase (http://ftp.xenbase.org/) or NCBI genome ports (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF), respectively. Data of virus-targeted transcriptome was reported here. The workflow of RNA-Seq analysis, bioinformatics software used, and some exemplary data to show general quality and comparability of the transcriptome data was schematically shown and previously reported (27) . Differentially expressed genes (DEGs) between two treatments were called using DeSeq and edgeR packages and visualized using bar charts (FPKM) or heatmaps (Log2 fold ratio) as previously described (27) . The transcriptomic dataset was deposited in the NIH Short Read Archive (SRA) linked to a BioProject with an accession number of PRJNA705195. The sequences of 51 intergenic regions between coding ORFs (including the 5'-and 3'-UTR regions of the viral genome) were extracted from FV3's reference genome (GenBank accession number: NC_005946.1). The sequences were aligned using the multiple sequence alignment tools of ClustalW or Muscle through an EMBL-EBI port (https://www.ebi.ac.uk/). Other sequence management was conducted using programs at the Sequence Manipulation Suite (http://www.bioinformatics.org). Sequence alignments were visualized using Jalview (http://www. jalview.org) and MEGAx (https://www.megasoftware.net). Two programs/databases were used to confirm each other for the major CRE detection. The CREs (and corresponding binding TFs) in intergenic regions were examined against both human/animal TFD Database using a program Nsite (Version 5.2013, at http://www.softberry.com). The mean position weight matrix (PWM) of key cis-elements in intergenic regions were examined and calculated using PWM tools through https://ccg. epfl.ch/cgi-bin/pwmtools, and the binding motif matrices of examined TFs were extracted from MEME-derived JASPAR CORE 2020 vertebrates or JASPAR CORE 2018 vertebrates clustering affiliated with the PWM tools (34). Comparative CRE-Analysis of Intergenic Regions Immediately Upstream of Top-Ranked Highly Expressed FV3 Genes FV3's coding genes were categorized based on their temporal classes into immediate early (IE), delayed early (DE), and late (L) viral transcripts as previously designated. The expression levels of individual FV3 ORF coding genes were determined as averages across all samples to demonstrate the differential expression using the transcriptomic data. The relative expression order across and within each temporal gene classes was sorted. The intergenic regions immediately upstream of topten highly expressed FV3's coding genes in each temporal class were extracted to perform PWM analyses as described above, and were compared to overall scores of all intergenic regions. The comparative analyses were broadly performed against various CRE types/clusters, but focused on those potently interacting with vertebrate transcription factors critically in antiviral immune regulation including CEBPs, CREBs, IRFs, NF-kB2-like, and STAT1-like transcription factors (31, 34) . The miRNA prediction and RNA structure prediction were analyzed using a findMiRNA and FoldRNA programs, respectively, through an online bioinformatic suite at http:// www.softberry.com. The miRNA target prediction on the 3'-UTR of various Xenopus genes were performed using three RNA analysis programs through an online BiBiServ Service (https://bibiserv.cebitec.uni-bielefeld.de/). The sequences of 3'-UTR regions and information about alternative transcripts of X. laevis genes/transcripts were extracted from the gene annotations at Reference genome/transcripts of X. laevis or FV3 virus through Xenbase (http://ftp.xenbase.org/) and NCBI genome ports (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF). The locations and sequences of all predicted v-miR are listed in Supplemental Excel Sheet, and the GenBank accession numbers of analyzed genes/transcripts are listed in indicated tables. Due to the enrichment of predicted v-miR target sites on the 3'-UTR of Xenopus IRF and IFN receptor genes, transcriptomic analyses of X. laevis mRNA encoding various IRF and IFN receptor gene families to show the differential expression of these genes was compared between FV3-D64R-and FV3-WTinfected tissues. Wherein, some intergenic regions containing putatively responsible v-miR were demonstrated to transcribe differentially between these two FV3 strains. Particularly, several representative v-miR were synthesized and transformed into X. laevis A6 kidney cells to evaluate RNA interference effect against Xenopus IRF and IFN receptor genes. The small interfering RNA (siRNA) identical to representative v-miR sequences were synthesized and transformed as previously described (35) . In brief, the sense and antisense sequences of the siRNA were synthesized at IDT (Coralville, Iowa) together with an AlexaFluor-488 (AF488) labeled scramble siRNA, which was designed to serve as control siRNA and allow transfection optimization. A6 cells were cultured as described in a 24-well plate and transfected with Oligofectamine (Invitrogen to attain >90% transfected ratio as estimated by the AF488-scramble siRNA (35) . Forty-eight hours after siRNA transfection, cells in different wells were collected for RNA extraction and gene specific RT-PCR was used to quantify the expression of target genes as described above (27, 33) . RNA samples used for RT-PCR assays were treated with RNase-free DNase I (NEB) to remove potential DNA contamination (29, 33) . Statistical analysis was completed using the SAS package (Company information)?. One-way analysis of variance (ANOVA) and Tukey's post hoc test, as well as a two-sample F test was applied for significant evaluation between samples/ treatments. A probability level of p<0.05 was considered significant (27, 32, 33) . The FV3 genome regions are functionally classified into exons, or intergenic regions based on annotation of the reference genome (NC_005946.1). All FV3's coding ORFs span about 80% of the genome sequence, and lack introns, i.e., intronless (27) . In contrast, we extracted 51 intergenic regions that are intermediate between sequential ORFs, including the terminal 5'-and 3'-untranslational regions (UTRs) that are known to play important regulatory role in viral replication and gene expression. These ranaviral intergenic regions take about 20% of the FV3 genome with a length varying from 20 to 900 bp and an average length of 340 bp long. As expected, the majority of RNA-Seq reads (>90%), representing a significant coverage of the whole viral genome, mapped to coding regions in most infected tissues including the intestine, kidney, liver, spleen, thymus and lung ( Figure 1 ). However, a careful examination of virus-specific reads in most infected tissues also detected~5-10% authentic reads being specifically mapped on intergenic regions. This indicates that these intergenic regions in the FV3 genome are transcribed and probably function as regulatory RNA species. In addition, consistent with data previously reported for coding genes, the FV3-D64R mutant virus had also a general higher transcription of reads mapped to intergenic regions in most infected tissues ( Figure 1 ) (27) . This implies that the disruption of the FV3orf64R gene, which encodes a putative interleukin-1 beta convertase containing caspase recruitment domain (vCARD), may change the overall viral transcription dynamics, or result in accumulation of viral transcripts due to inefficient virus assembly process (36) . To reveal cis-regulatory role of intergenic regions on expression of coding genes, we first searched for putative viral TATA-box equivalent. In eukaryotic genes, the TATA-box is a cis-regulatory element (CRE) marking the core promoters. To identify a putative viral TATA-like box we used a software based on an evaluating score system of position weight matrix (PWM) used for vertebrate CREs (18, 19) . The bar chart in Figure 2A shows that a significant score (pseudo-weight value <0.0001 as defaulted in the system) for putative FV3 TATA-box-like was detected in all intergenic UTR sequences including two terminal 5'-and 3'-UTR regions. The location of these putative TATA-Box-like CREs are at 11-470 nt (overall average at 190 nt) ahead of the TSS of downstream associated coding genes (Supplemental Excel Sheet). These results from a bulk study are consistent with previous single promoter characterization of a few genes in FV3 and Bohle iridovirus, where CRE motifs were found located 127 to 281 bp upstream of the TSS (24) . The average PWM score of TATA-Box CRE across all intergenic regions was 8.0 (Log 2 Unit) with most scores higher than 5.0, which is close to the median value across PWM scores of multiple CREs executed in this study. The line chart in Figure 2A illustrates the transcriptomic average of all 98 coding genes annotated on the FV3 reference genome (27) . Careful comparison did not show obvious positive correlation between higher PWM scores of TATA-Box-like in intergenic regions and increased expression of associated coding genes. A similar PWM score at 8.1 was obtained by executing the PWM evaluation for FV3 genes exhibiting top-ten ranked transcribing levels in different temporal classes ( Figure 2B ) (27) . This suggests that although the putative TATA-Box CRE in intergenic regions may function to recruit vPol II through binding of the transcription factor TBP and signifies the corepromoter regions, it is not the only determinant ( Figure 2C ) (18) . Rather these putative intergenic TATA-Box CRE are likely to cooperates with other intergenic CREs to induce relative expression levels of associated genes in the virus-host interaction (18, 19) . Further analysis detected the presence of multiple types of viral CRE mimics (v-CREs) in FV3 genome intergenic regions. We focused our interest on CRE families that are critical in regulation of amphibian antiviral immunity. These v-CREs include those predicted to interact with transcription factors (TFs), such as the IRF and STAT families that critically mediate cytokine-and IFNdependent signaling. Among these factors, NF-kB-like and PU.1 (a.k.a. SPI1) regulate inflammation, whereas other like the CEBP and CREB families control immune cell proliferation and activation (37) (38) (39) (40) (41) . Figure 3 shows the distribution v-CREs that have likely evolved to interact with representative TFs critical for regulating antimicrobial immunity as aforementioned. Besides v-CRE showing a significant binding score for IRF1, most intergenic regions also exhibit conserved v-CREs with comparable PWM scores that can bind IRF2 IRF5 and IRF6 ( Figure 3 and Supplemental Excel Sheet). In contrast, only a portion (a third to a half) of intergenic regions contain v-CREs with a high PWM binding score (>2 Log 2 Unit) for other IRFs. Similarly, v-CREs with significant prediction for binding members of the STAT family were detected in almost all intergenic regions and for all vertebrate STAT members with average PWM scores between 2.0-6.0 log 2 Unit in an increasing order of STAT1(2.0)