key: cord-0845761-94bm64bi
authors: Gowda, S.; Satyanarayana, T.; Naidu, R. A.; Mushegian, A.; Dawson, W. O.; Reddy, D. V. R.
title: Characterization of the large (L) RNA of peanut bud necrosis tospovirus
date: 2014-05-16
journal: Arch Virol
DOI: 10.1007/s007050050468
sha: e871a8d291b1eacb6dbd46e962d005964dc5d46e
doc_id: 845761
cord_uid: 94bm64bi

The nucleocapsids purified from peanut plants systemically infected with peanut bud necrosis virus (PBNV), a member of the genus Tospovirus, contained both viral (v) and viral complementary (vc) sense L RNAs. Defective forms of L RNA containing the ‘core polymerase region’ were observed. The full length L RNA of PBNV was sequenced using overlapping cDNA clones. The 8911 nucleotide L RNA contains a single open reading frame (ORF) in the vc strand, and encodes a protein of 330 kDa. At the 5′ and 3′ termini of the v sense RNA there were 247 and 32 nt untranslated regions, respectively, containing an 18 nt complementary sequence with one mismatch. Comparisons of the predicted amino acid sequence of the L protein of PBNV with those of other members of the Bunyaviridae suggest that the L protein of PBNV is a viral polymerase. The L protein had the highest identity in the ‘core-polymerase domain’ with the corresponding regions of other tospoviruses, tomato spotted wilt virus and impatiens necrotic spot virus.

Peanut bud necrosis virus (PBNV), a member of the genus Tospovirus of the family Bunyaviridae, is the most economically important virus of peanuts in the Indian sub-continent. It is transmitted by melon thrips, Thrips palmi, in a propagative manner [19]. Membrane-bound spherical virus particles (80-100 nm) encapsidate three species of RNA, small (S), medium (M) and large (L). Based on serological cross-reactivity and sequence homology of the nucleocapsid (N) protein, five serogroups have been recognized in the tospovirus genus.
PBNV and the closely related watermelon silver mottle virus (WSMV) [33] represent serogroup IV and are antigenically distinct from other tospoviruses: serogroup I = tomato spotted wilt virus (TSWV) [2]; serogroup II = groundnut ring spot virus and tomato chlorotic spot virus [1]; serogroup III = impatiens necrotic spot virus (INSV) [13]; and serogroup V = peanut yellow spot virus (PYSV) [26]. The complete nucleotide sequences of the S (3,057 nt) and M (4,801 nt) RNAs of PBNV have been published [23, 24] and they have ambisense coding strategies, similar to those previously described for TSWV [2, 11] and INSV [4, 13]. In the viral (v) sense, the S and M RNAs of PBNV encode the non-structural proteins NSs (49 kDa) and NSm (34 kDa), respectively, while in the viral complementary (vc) sense the M RNA encodes the envelope glycoprotein precursor (GP), with a predicted size of 127 kDa, and the S RNA encodes the N protein (30 kDa) [23, 24]. The proteins encoded by the S and M RNAs are expressed through subgenomic RNAs that terminate in the intergenic regions of the corresponding RNA. In contrast, the L RNAs of TSWV and INSV have negative polarity ORFs that encode a putative viral polymerase [3, 31]. Both v and vc sense S and M RNAs are encapsidated in TSWV particles. Although the vc strand of the L RNA was not observed in purified viral particles [8], the purified nucleocapsids contained both v and vc sense strands of the L RNA [11]. In this communication, we present evidence that nucleocapsids of PBNV contain both v and vc forms of L RNA. Further, we observe the presence of several defective L RNA (dL) species of both v and vc polarities in the nucleocapsids. The sequence of the L RNA of PBNV was determined from a set of cDNA clones prepared from the purified L RNA.
This completes the sequence of all three genomic RNAs and allowed comparison of the L RNA sequence of PBNV, a serogroup IV member, with those of TSWV (serogroup I), INSV (serogroup III) and selected members of the animal Bunyaviridae.

The preparation of the PBNV L RNA from nucleocapsids purified from systemically infected peanut plants, and the synthesis of cDNA to the gel purified L RNA, were essentially as described earlier [23]. The L RNA specific cDNA clones were generated by random primed first and second strand syntheses [9] using the SuperScript choice system (Gibco-BRL), followed by addition of EcoRI adaptors and cloning into the EcoRI site of pGEMf7Z. Clones covering regions not represented in the initial cDNA clones were generated by reverse transcription coupled with polymerase chain reaction (RT-PCR) of purified RNA, using specific primers designed from the available sequence information. Oligonucleotides M-258 and M-210 (Table 1) were used to amplify the gap between pb40 and pb21 to generate the clone pbM3 (Fig. 1). Primers based on the terminal conserved sequences in the S and M RNAs of PBNV [23, 24] were used to amplify the 3′ and the 5′ end clones by RT-PCR. The oligonucleotide M-250, complementary to nucleotides 3 049-3 057 of the S RNA of PBNV [23], and an oligonucleotide, M-255 (Table 1), located towards the 3′ end of pb21, were used in the RT-PCR to generate the clone pb6 (Fig. 1). Similarly, oligonucleotide M-249, corresponding to the 5′ conserved terminus of PBNV S RNA [23], and an oligonucleotide, M-256 (Table 1), located in the 5′ end of pb27, were used to generate the 5′ end clone pb2 (Fig. 1). The products of the RT-PCR were made blunt with T4 DNA polymerase and cloned into the SmaI site of pUC119 prior to sequencing.
The sequencing of the double-stranded DNA was carried out by Taq cycle sequencing using fluorescence based chain termination on an automated Applied Biosystems Model 373A DNA sequencer (Perkin Elmer/Applied Biosystems) at the ICBR DNA sequencing core facility of the University of Florida, Gainesville, Florida. The nucleotide sequence and the deduced amino acid sequence from the overlapping clones were assembled using the GCG program package [5]. Northern analysis of the RNA isolated from the purified nucleocapsids was as described previously [23]. The clone pb40 in pGEMf7Z was linearized with either SmaI or XhoI prior to transcription with T7 or SP6 RNA polymerase to generate v and vc sense transcripts, respectively. The strand-specific probes and oligonucleotide probes were prepared by the incorporation of digoxigenin-labelled UTP according to the manufacturer's specifications (Boehringer Mannheim). Pre-hybridization, hybridization and chemiluminescent detection were carried out as specified in the Genius 5 kit (Boehringer Mannheim).

The nucleocapsids purified from peanut plants systemically infected with PBNV separated, on sucrose density gradient centrifugation, into three light-scattering zones (top, middle and bottom), which respectively contained S, M and L RNAs in approximately equal amounts [25]. Previously it was shown that total RNA extracted from Nicotiana rustica plants infected with TSWV contained both v and vc forms of RNA for each of the three genomic RNAs, with v sense RNAs present at approximately ten times the levels of the corresponding vc RNAs [11]. However, the RNAs extracted from intact enveloped TSWV particles contained both polarities of S and M RNAs and only the v form of the L RNA [11]. Analysis of PBNV RNAs from purified nucleocapsids in northern blots using strand specific L RNA riboprobes showed the presence of genomic length v and vc forms of L RNA (Fig. 2a, 2b; the genomic length L RNA is indicated by arrows).
In initial studies, using RNA from purified PBNV nucleocapsids, we amplified parts of the genomic L RNA by RT-PCR with either v sense or vc sense primers. Specific PCR products were obtained with the sense primer, To demonstrate the v and vc sense forms of L RNA by northern analysis, strand specific digoxigenin labeled riboprobes were prepared from clone pb40 (Fig. 1) linearized with SmaI or XhoI prior to transcription with T7 or SP6 RNA polymerase to generate v and vc sense transcripts, respectively. The blots in Fig. 2a showed both v (left panel) and vc (right panel) sense genomic length RNA (indicated by arrows). To confirm that the hybridizations were strand specific, an independent set of synthetic oligonucleotide probes was made to both v and vc sense strands. Two oligonucleotide primers, M-251 and M-306 (Table 1), were 3′ end labeled and used as probes. The v and vc riboprobes and oligoprobes were equalized based on hybridization to the respective PBNV cDNA clones. The pattern of hybridization observed with strand specific riboprobes was also observed with the oligonucleotide probes (Fig. 2b; left panel, hybridization using the v sense oligoprobe; right panel, hybridization using the vc sense oligoprobe). In addition to the genomic length RNA, several smaller L RNAs differing in size and abundance were observed that specifically hybridized to both probes. These smaller than genomic length RNAs, indicated by arrowheads in Fig. 2a, 2b, probably represent defective species of the L RNA (dL), since they were not observed when hybridized with PBNV M RNA or S RNA specific probes (data not shown). Both v and vc forms of these RNAs were found in nucleocapsids. Defective RNAs have been reported for the M RNA of PBNV [24] and the L RNA of TSWV [20], whereas defective RNAs specific for the S RNA have not been observed in tospoviruses.
The dL RNAs of TSWV nucleocapsids contain the 5′ and 3′ genomic termini with an internal deletion of 60-80% of the L RNA segment [21]. The dL RNAs observed in PBNV contained at least some of the middle portion of the L RNA, since the strand specific riboprobes used in the northern analysis contained the core polymerase region. The presence of such dL RNAs suggests multiple deletions of internal sequences, with part or all of the 'core' region retained, to create the observed defective RNAs. However, the precise nature of the dL RNAs observed in PBNV has not been confirmed, since this can only be done by sequencing to map the junction sites. Several groups of RNA plant viruses possess defective RNAs which often interfere with the replication of the wild type virus and/or modulate symptom expression [29]. The presence of the 5′ and 3′ RNA sequences and the maintenance of the ORF despite large internal deletions in TSWV [21] suggest that dL RNAs in tospoviruses interfere with replication by competing for the replication machinery and/or packaging of the wild type virus. The occurrence of the defective RNAs and the appearance of virus particles without the envelope resulted in the loss of transmission of PBNV by T. palmi (D. V. R. Reddy, pers. comm.).

We sequenced the PBNV L RNA from cDNA clones generated by random priming of the gel purified L RNA. Clones pb27, pb20, pb40 and pb21 (Fig. 1) were selected for further analysis because they were large and specific for L RNA in northern hybridization analyses (data not shown). Together, they represented more than 70% of the L RNA sequence. We did not obtain 5′ or 3′ terminal clones by this method. Since all genomic RNAs of tospoviruses contain conserved sequences at their termini, primers corresponding to the conserved sequences in the S and the M RNA of PBNV (Table 1) were combined with primers M-256 and M-255 (Table 1) to amplify the terminal sequences of the L RNA.
In addition, clone pbM3 was obtained by RT-PCR using primers from the 5′ sequence of pb21 (M-258) and the 3′ sequence of pb40 (M-210) to fill the sequence gap between clones pb40 and pb21. The clones obtained by RT-PCR using specific primers (Fig. 1) were selected based on their hybridization specificity to L RNA and on sequences overlapping the adjacent cDNA clones. The L RNA of PBNV was 8 911 nt long, with 37% A, 29% T, 18% G and 16% C. The size was similar to that estimated from its migration in denaturing agarose gels [25] and to that of the L RNA of TSWV (8 897 nt) [3] and INSV (8 776 nt) [31]. However, the tospovirus L RNAs are substantially larger than the L RNAs of animal Bunyaviridae: Rift Valley fever phlebovirus = 6 404 nt [15] and Uukuniemi phlebovirus = 6 423 nt [7]; Bunyamwera bunyavirus = 6 875 nt [6]; Hantaan hantavirus = 6 530 nt [28] and Puumala hantavirus = 6 550 nt [30]; and lymphocytic choriomeningitis arenavirus = 6 680 nt [22]. The 5′ and 3′ termini of PBNV L RNA (v sense) contained untranslated regions of 247 and 32 nucleotides, respectively, with an 18 nt complementary sequence (with one mismatch) that could form a stable panhandle structure characteristic of Bunyaviridae. Sequence analysis showed that the L RNA of PBNV contained a single large open reading frame (ORF) initiating at the AUG codon at position 8 879 and terminating at the TAA codon at position 32, encoding a protein of 2 877 amino acids with a predicted molecular weight of 330 kDa. No other reading frame in v or vc sense contained ORFs of significant size. Thus the L RNA of PBNV appears to function as a negative sense RNA. The predicted size of the L protein was similar to those of TSWV (2 875 amino acids, 331 kDa) [3] and INSV (2 865 amino acids, 329 kDa) [31].
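
The conclusion that the L RNA is of negative sense rests on finding the only large ORF on the vc strand and none of significant size on the v strand. The following is a minimal Python sketch of that kind of scan, not the procedure used in this work (the sequences here were analysed with the GCG package). The function names and the 22 nt toy sequence are invented for illustration and are not PBNV data; the sketch also assumes DNA letters (A, C, G, T) and the standard stop codons.

```python
# Illustrative sketch only: toy sequence and helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Longest ATG...stop ORF over the three forward frames.

    Returns (start, end, codons): 0-based start, end exclusive and
    including the stop codon, codons counted without the stop.
    Returns (0, 0, 0) if no complete ORF is present.
    """
    best = (0, 0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                codons = (i - start) // 3
                if codons > best[2]:
                    best = (start, i + 3, codons)
                start = None
    return best


def longest_orf_both_strands(v_seq):
    """Scan the given (v sense) strand and its complement (vc sense)."""
    return {
        "v": longest_orf(v_seq),
        "vc": longest_orf(reverse_complement(v_seq)),
    }


def vc_to_v_coords(start, end, length):
    """Map a vc-strand ORF (0-based, end exclusive) to 1-based v-strand
    positions: (position of first base of start codon,
    position of last base of stop codon)."""
    return (length - start, length - end + 1)


toy_v = "GGTTAAAACCCGGGTTTCATGG"  # invented v-sense sequence
orfs = longest_orf_both_strands(toy_v)
print(orfs)
print(vc_to_v_coords(orfs["vc"][0], orfs["vc"][1], len(toy_v)))
```

Because positions are reported on the v strand, an ORF read from the vc strand starts at a high coordinate and ends at a low one, which is the orientation in which the start and stop positions of the PBNV ORF are quoted above.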
The predicted translation product of the PBNV L RNA appears to be the viral polymerase, based on its similarity in sequence and size to the other tospovirus L proteins and on the presence of conserved signature sequences in the 'core polymerase region' (GDX1-3K, GXXNXXS, SDD, FX10-17KK, EFXSXR), which are characteristic of the viral RNA polymerases of the negative stranded Orthomyxoviridae (influenza A virus) and Bunyaviridae (see below). The putative viral polymerase of PBNV was compared with those of TSWV and INSV. The degree of conservation varied among the three tospoviruses. The highest level of identity was observed in the core polymerase region (60%) and the lowest towards the ends of the protein (data not shown). Pairwise comparison of the L proteins of PBNV, INSV and TSWV using the GAP program of the Wisconsin GCG sequence analysis software package indicated an overall identity of 45% (63.8% similarity) between PBNV and TSWV and 46% (65% similarity) between PBNV and INSV, while an identity of 69% (84% similarity) was observed between INSV and TSWV. These comparisons demonstrate that the L proteins of INSV and TSWV are more closely related to each other than to that of PBNV. The putative proteins encoded by S RNA and M RNA had similar patterns of relatedness between serogroups. For example, the N proteins were 55% identical between serogroups I and III [2, 13] and only 30-33% identical when comparing TSWV and PBNV or INSV and PBNV [23]. In contrast, within serogroup IV (PBNV and WSMV) the relatedness of the N protein was much higher (> 80% identity) [23, 33]. The clustering phenogram of the L proteins of INSV, PBNV and TSWV, constructed by progressive pairwise alignment with the PILEUP program of the GCG software package, revealed that INSV clustered together with TSWV while PBNV represented a separate branch (data not shown).
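
The signature sequences listed for the 'core polymerase region' (GDX1-3K, GXXNXXS, SDD, FX10-17KK, EFXSXR) can be read as simple patterns in which X stands for any residue and a subscript range gives the allowed number of such residues. As a rough illustration of how a protein sequence could be screened for them, here is a minimal Python sketch using regular expressions. The translation of each signature into a regular expression is an assumption based on that reading, and the 45-residue test string is invented for the example, not taken from the PBNV L protein.

```python
import re

# Assumed regex reading of the signatures: X = any residue,
# X1-3 = one to three residues, X10-17 = ten to seventeen residues.
SIGNATURES = {
    "GDX1-3K": r"GD.{1,3}K",
    "GXXNXXS": r"G..N..S",
    "SDD": r"SDD",
    "FX10-17KK": r"F.{10,17}KK",
    "EFXSXR": r"EF.S.R",
}


def find_signatures(protein):
    """Return {signature: [(1-based start, matched text), ...]}.

    Matches are non-overlapping and found left to right.
    """
    hits = {}
    for name, pattern in SIGNATURES.items():
        hits[name] = [
            (m.start() + 1, m.group()) for m in re.finditer(pattern, protein)
        ]
    return hits


# Invented toy sequence containing one copy of each signature.
toy_protein = (
    "MA" + "GDTVK" + "LL" + "GAANPTS" + "QQ" + "SDD"
    + "AA" + "F" + "A" * 11 + "KK" + "LL" + "EFTSPR"
)

for name, found in find_signatures(toy_protein).items():
    print(name, found)
```

Short patterns such as SDD will match by chance in any large protein, so hits of this kind are only suggestive unless they occur in the expected order and spacing within an aligned polymerase region.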
It is likely that PBNV and the closely related WSMV, both of which have been found only in the Asian subcontinent, have recently diverged. Partitioning of the L protein sequence into globular and non-globular regions using local sequence complexity measures, performed with the SEG program with parameters optimized to recognize such regions efficiently [32], identified two non-globular and three globular regions. The two non-globular regions occupied amino acid residues 361-452 and 1 073-1 160. Non-globular elongated regions are common in large eukaryotic proteins, where they are thought to serve as hinges connecting globular domains [32]. The three predicted globular regions in the L protein correspond to the three most conserved regions (regions 1, 2, and 3 [15]). Region 1, located towards the amino terminus of the L protein (Fig. 3A), contained two motifs conserved in tospoviruses and selected bunya- and arenaviruses. The second globular region is nearly 700 amino acids downstream from region 1. Globular region 3 contains motifs A, B, C, and D, which are also conserved in the L proteins of tospoviruses and bunyaviruses (Fig. 3B). This globular region of the L protein contains a putative RNA-dependent RNA polymerase domain followed by another short conserved domain, which, by way of a coiled-coil region, might be involved in protein-protein interaction in vivo. Further downstream in region 3, a long region moderately similar to the coiled-coil repeats in animal myosin was found (data not shown). Based on superimposition onto the known three-dimensional structure of HIV-1 reverse transcriptase, motifs A, C, and D have been suggested to be directly involved in enzyme activity [15]. The general organization of PBNV L RNA resembles the L RNAs of TSWV and INSV.
However, the latter are more similar to each other than to PBNV [15], except for motif "pre-B" detected in this work (45-46% identity at the amino acid level between PBNV and TSWV or PBNV and INSV, while 69% identity between TSWV and INSV). Similarly, the above pattern of relatedness was also seen with the N and NSs proteins (S RNA) and the glycoprotein precursor and NSm proteins (M RNA) of these viruses [23, 24]. These features, together with the absence of serological cross-reactivity of PBNV with other serogroups [19], indicate that it is a distinct species in the genus Tospovirus.

1. Distinct levels of relationships between tospovirus isolates
2. The S RNA segment of a tomato spotted wilt virus has an ambisense character
3. Tomato spotted wilt L RNA encodes a putative RNA polymerase
4. The nucleotide sequence of the S RNA of Impatiens necrotic spot virus, a novel tospovirus
5. A comprehensive set of sequence analysis programs for the VAX
6. Nucleotide sequence analysis of the large genomic RNA segment of Bunyamwera virus, the prototype of the family Bunyaviridae
7. Nucleotide sequence and coding strategy of Uukuniemi virus L RNA segment
8. Molecular and biological aspects of tospoviruses
9. A simple and very efficient method for generating cDNA libraries
10. The nucleotide sequence of the M RNA segment of tomato spotted wilt virus, a bunyavirus with two ambisense RNA segments
11. Viral RNA synthesis in tomato spotted wilt virus-infected Nicotiana rustica plants
12. Transcription and replication of influenza virion RNA in the nucleus of infected cells
13. The M RNA of impatiens necrotic spot tospovirus (Bunyaviridae) has an ambisense genomic organization
14. Primary structure and translation of defective interfering RNA of murine coronavirus
15. Rift valley fever virus L segment: correction of the sequence and possible functional role of newly identified regions conserved in RNA-dependent polymerases
16. Sequence analysis of eukaryotic developmental proteins: ancient and novel domains
17. Structure of defective interfering (DI) RNAs of influenza viruses and their role in interference
18. Characterization of Bunyamwera virus defective interfering particles
19. Serological relationships and purification of bud necrosis virus, a tospovirus occurring in peanut (Arachis hypogaea L.) in India
20. Generation of envelope and defective interfering RNA mutants of tomato spotted wilt virus by mechanical passage
21. Defective interfering L RNA segments of tomato spotted wilt virus retain both virus genomic termini and have extensive internal deletions
22. Primary structure of the lymphocytic choriomeningitis virus L gene encodes putative viral polymerase
23. Peanut bud necrosis tospovirus S RNA: complete nucleotide sequence, genome organization and homology to other tospoviruses
24. The complete nucleotide sequence and genome organization of the M RNA segment of peanut bud necrosis tospovirus and comparison with other tospoviruses
25. Peanut bud necrosis tospovirus: purification of nucleocapsids, and sequence homology of nucleocapsid protein and glycoprotein precursor with other tospoviruses
26. Peanut yellow spot virus: a distinct tospovirus species based on serology and nucleic acid hybridization
27. Defective interfering influenza RNAs of polymerase 3 gene contain single as well as multiple internal deletions
28. Nucleotide sequence of the L genome segment of Hantaan virus
29. RNA-RNA recombination and evolution in plants
30. Primary structure of the large (L) RNA segment of nephropathia epidemica virus strain Hällnäs B1 coding for viral RNA polymerase
31. Completion of the impatiens necrotic spot virus genome sequence and genetic comparison of the L proteins with the family Bunyaviridae
32. Analysis of compositionally biased regions in sequence databases
33. Nucleotide sequence of the N gene of watermelon silver mottle virus, a proposed new member of the genus tospovirus

Florida Agricultural Experiment Station Journal Series Number R-06269, ICRISAT Journal Series number 2 021.
The sequence presented in this report has been submitted to GenBank with the assigned accession number AF025538. Authors' address: Dr. S. Gowda, The University of Florida, Citrus Research and Education Center, 700 Experiment Station Road, Lake Alfred, FL 33850, U.S.A. Received October 17, 1997