key: cord-0874585-jh6d1o0i
authors: Allison, Andrew B.; Mead, Daniel G.; Palacios, Gustavo F.; Tesh, Robert B.; Holmes, Edward C.
title: Gene duplication and phylogeography of North American members of the Hart Park serogroup of avian rhabdoviruses
date: 2014-01-01
journal: Virology
DOI: 10.1016/j.virol.2013.10.024
sha: 443dcdf7fa35b481fde73608479940044ec91d49
doc_id: 874585
cord_uid: jh6d1o0i

Flanders virus (FLAV) and Hart Park virus (HPV) are rhabdoviruses that circulate in mosquito-bird cycles in the eastern and western United States, respectively, and constitute the only two North American representatives of the Hart Park serogroup. Previously, it was suggested that FLAV is unique among the rhabdoviruses in that it contains two pseudogenes located between the P and M genes, while the cognate sequence for HPV has been lacking. Herein, we demonstrate that FLAV and HPV do not contain pseudogenes in this region, but encode three small functional proteins designated as U1, U2, and U3 that apparently arose by gene duplication. To further investigate the U1-U2-U3 region, we conducted the first large-scale evolutionary analysis of a member of the Hart Park serogroup by analyzing over 100 spatially and temporally distinct FLAV isolates. Our phylogeographic analysis demonstrates that although FLAV appears to be slowly evolving, phylogenetically divergent lineages co-circulate sympatrically.

Flanders virus (FLAV) and Hart Park virus (HPV) are two closely related members of the Hart Park serogroup of the family Rhabdoviridae that are maintained in mosquito-passerine bird transmission cycles in the eastern and western United States, respectively (Whitney, 1964; Johnson, 1965; Kokernot et al., 1969; Crane et al., 1970; Main et al., 1979; Main, 1981).
Viruses in the Hart Park serogroup were initially classified together based on antigenic cross-reactivity in complement fixation, neutralization, immunodiffusion and/or immunofluorescence assays (Boyd, 1972; Frazier and Shope, 1979). In addition to FLAV and HPV, other members in earlier classifications of the serogroup included Mosqueiro virus (MQOV), a virus first isolated in Brazil, and two African viruses, Mossuril virus (MOSV) and Kamese virus (KAMV) (Tesh et al., 1983; Calisher et al., 1989). Besides their antigenic relatedness, these five geographically disparate viruses appear to share a similar mechanism of transmission, as virus isolation data indicated that they were predominantly associated with birds and/or culicine (e.g., Culex, Culiseta) mosquitoes (Karabatsos, 1985). More recently, Wongabel virus (WONV), Parry Creek virus (PCRV), and Ngaingan virus (NGAV) have also been provisionally included in the serogroup based on genetic and phylogenetic (rather than antigenic) relationships (Bourhy et al., 2005; Gubala et al., 2008, 2010). These three viruses were originally isolated in Australia, and besides the serological observation that the natural host range of NGAV may include macropods, they also appear to be predominantly associated with birds and culicine mosquitoes or other hematophagous insects such as Culicoides biting midges (Humphery-Smith et al., 1991; Bourhy et al., 2008; Gubala et al., 2010). Additionally, two recently described (but historically isolated) Australian viruses recovered from Culex annulirostris, Holmes Jungle virus (HOJV) and Ord River virus (ORRV), appear to be new members of the serogroup (Gubala, 2012), as do Bangoran virus (BGNV) and Porton's virus (PORV) (Dacheux et al., 2010).
Determining whether these twelve potential Hart Park serogroup members should eventually be designated as a new genus within the Rhabdoviridae will likely require a more comprehensive phylogenetic analysis (such as full genome studies) of these and other unclassified rhabdoviruses of the Dimarhabdovirus supergroup.

FLAV is unique among the rhabdoviruses in that it purportedly contains a 19 kDa protein gene flanked on either side by putative pseudogenes (GenBank accession AH012179). No comparative sequence for HPV has previously been available. These three consecutive genes, originally termed pseudogene 1, 19 kDa protein gene, and pseudogene 2, are located between the phosphoprotein (P) and matrix (M) genes, such that the FLAV genome is currently represented as 3′-nucleoprotein(N)-P-pseudogene1-19K-pseudogene2-M-glycoprotein(G)-polymerase(L)-5′. However, given the constraints on genome size that seem to characterize RNA viruses as a whole (Holmes, 2009), it is surprising that FLAV would apparently carry two sequences that have no functional role. As Australian Hart Park serogroup viruses (i.e., WONV and NGAV) contain three complete intact ORFs between their P and M genes (Gubala et al., 2008, 2010), we sought to analyze this region in the two North American members of the serogroup, FLAV and HPV, and clarify this apparent genomic complexity. Additionally, we investigated the potential encoding of a viroporin-like small hydrophobic (SH) protein located between the G and L proteins and undertook the first comprehensive evolutionary study of a Hart Park serogroup virus by analyzing more than 100 pseudogene region sequences of FLAV isolates collected over a 50-year period.
Gene, mRNA, and protein analysis of the pseudogene region and SH ORF

Our genetic analysis of multiple FLAV isolates indicated that the two putative pseudogenes located between the P and M genes contained complete uninterrupted ORFs flanked by conserved transcriptional start (UCGUCMKUAG) and stop/polyadenylation (CU7) sequences, suggesting that they in fact encode functional proteins (GenBank accessions KF028661-KF028670). The predicted proteins associated with pseudogene 1, the 19 kDa protein gene, and pseudogene 2 ORFs in FLAV were very similar in size, with lengths of 161, 165, and 160 amino acids, respectively. Similar results were found with HPV (GenBank accession KF028764), indicating both viruses had three complete ORFs between the P and M genes. Cloning of RT-PCR products generated from RNA extracted from FLAV-infected Vero cells demonstrated that polyadenylated transcripts of the two putative pseudogene sequences (as well as the 19 kDa protein gene) were being produced, again indicating that they are functional ORFs. Functionality was further supported by an analysis of the pseudogene 1, 19 kDa protein gene, and pseudogene 2 sequences of 10 FLAV isolates, which produced dN/dS ratios of 0.07, 0.02 and 0.09, respectively, indicative of strong selective (i.e., functional) constraints rather than the selective neutrality expected of pseudogenes (in which dN/dS ratios would tend toward a value of ~1.0). Similarly, a dN/dS of 0.07 was observed in 103 pseudogene 1 (U1) sequences (see below), again revealing strong selective constraints.

In addition to the predicted N, P, M, G, and L proteins, we detected three small viral protein bands when we probed FLAV-infected Vero cell lysates in a Western blot using FLAV-specific antisera (Fig. 1). Based on their respective molecular weights, the L (238.54 kDa), G (71.05 kDa), N (50.40 kDa), and M (25.83 kDa) proteins were identified by their approximate size in the immunoblot (Fig. 1).
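The conserved start (UCGUCMKUAG) and stop/polyadenylation (CU7) signals quoted above use IUPAC ambiguity codes (M = A or C, K = G or U), so locating them in a sequence is a small pattern-matching task. The following is a minimal sketch, not the authors' method: the function names and the toy sequence are hypothetical, and the sequence is assumed to be written in the same orientation as the motifs.

```python
import re

# Subset of the IUPAC nucleotide ambiguity codes, written for RNA (U, not T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "M": "[AC]", "K": "[GU]", "R": "[AG]", "Y": "[CU]",
    "S": "[CG]", "W": "[AU]", "N": "[ACGU]",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate an IUPAC motif such as UCGUCMKUAG into a compiled regex."""
    return re.compile("".join(IUPAC_RNA[base] for base in motif))


# Motifs taken from the text; CU7 is read here as one C followed by seven U.
START_MOTIF = motif_to_regex("UCGUCMKUAG")
STOP_MOTIF = re.compile("CU{7}")


def find_gene_junctions(rna: str):
    """Return 0-based positions of start motifs and stop/poly(A) motifs."""
    starts = [m.start() for m in START_MOTIF.finditer(rna)]
    stops = [m.start() for m in STOP_MOTIF.finditer(rna)]
    return starts, stops


# Hypothetical toy sequence (not FLAV data): start, spacer, stop, start.
toy = "GG" + "UCGUCAGUAG" + "ACGACG" + "CUUUUUUU" + "AA" + "UCGUCCUUAG" + "GCA"
print(find_gene_junctions(toy))
```

On a real genome, an ORF bracketed by one start hit and the next stop hit, with no internal stop codon, would be a candidate transcription unit of the kind described above.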
Although the predicted P protein (25.78 kDa) is very similar in size to the M protein, the former is known to migrate in SDS-PAGE gels at between 40 and 50 kDa, suggesting P is the band around 40 kDa (size known from additional blots) beneath N. As the predicted molecular weights of the products of pseudogene 1, the 19 kDa protein gene, and pseudogene 2 are essentially identical to one another (18.58, 18.98, and 18.93 kDa, respectively), this suggests that the band just beneath the 20 kDa marker (which is as immunoreactive as the N or M bands) might be the co-migration of the three protein products, provided that their migration is not affected by any post-translational modifications or physicochemical differences. Similarly, the slightly larger band of ~23 kDa might represent a modified form (e.g., phosphorylated) of one of the pseudogene region proteins or an in vivo cleavage product as suggested by Boyd and Whitaker-Dowling (1988). Finally, the lowest band could represent an additional cleavage product, a faster migrating form of one of the pseudogene region proteins (e.g., the acidic pseudogene 1), or the putative SH protein, a predicted 10.37 kDa viroporin-like protein lying between the G and L genes (see below).

To determine if the lower viral protein bands detected in the immunoblot were the pseudogene region products (and/or SH protein) or proteolytically truncated forms of the five major structural proteins, FLAV was purified by sucrose density gradient ultracentrifugation and select SDS-PAGE protein bands were further analyzed by nano-scale high performance liquid chromatography coupled to tandem mass spectrometry (nano HPLC-MS/MS). Although the same or similarly-sized viral bands seen in the infected cell lysates (Fig.
1) were also present (but at a lower intensity) in purified virions by immunoblotting, they were not clearly observed in the SYPRO Ruby-stained gels, suggesting that these proteins/peptides may be incorporated into virions at low concentrations, either selectively or randomly. However, a bright band (or bands) of approximately 10-20 kDa was abundantly present in purified viruses and was the only distinct band present beneath the putative M protein in the fluorescent gel (not shown). In-gel tryptic digestion of this band followed by nano HPLC-MS/MS analysis identified peptides corresponding to both pseudogene 1 and pseudogene 2 products (Table 1), conclusively demonstrating that proteins of these reported pseudogenes are being expressed; whether they are normal structural components of the virus or are incorporated into particles by chance during morphogenesis is uncertain. Additionally, peptides corresponding to N, and to a lesser extent P, M, and G, were also detected (Table 1), suggesting that cleavage products of the major structural proteins may also contribute to the observed immunoreactivity in Western blots. However, the vast majority (96 percent) of the peptides (and hence, the major component of the 10-20 kDa band intensity) identified in the MS/MS spectra were derived from two cellular proteins, histone H4 (~11.4 kDa) and cyclophilin A (~17.9 kDa) (Table 1), both of which have been previously identified as being incorporated into rhabdovirus virions. In vesicular stomatitis New Jersey virus (VSNJV), cyclophilin A (a chaperone protein involved in protein folding) has been shown to bind to N and is required for VSNJV replication (Bose et al., 2003). Histone H4 has been observed in vesicular stomatitis Indiana virus particles (Moerdyk-Schauwecker et al., 2009), as well as other viruses such as retroviruses (Chertova et al., 2006; Segura et al., 2008) and coronaviruses (Neuman et al., 2008).
Although contamination of chromatin on the viral surface could be the source of histone H4, the complete absence of other similarly sized core histone proteins (i.e., H2A, H2B, H3), despite the very high abundance of histone H4, suggests its incorporation into virions may be selective and that FLAV infection may entail a tentative nuclear phase as observed in other rhabdoviruses (Glodowski et al., 2002), including the related WONV (see below). Additional cellular proteins of interest found in FLAV particles (but at a much lower concentration than either histone H4 or cyclophilin A) were CD59 and heat shock protein 70 (Hsp70) (Table 1). CD59 is a complement regulatory protein which inhibits the membrane attack complex and has previously been found embedded in the outer membrane of a number of different viruses, thus providing a unique mechanism to avoid complement-mediated lysis (Vanderplasschen et al., 1998; Hu et al., 2010; Amet et al., 2012). Like cyclophilin A, Hsp70 is a protein chaperone that has been demonstrated to associate with N in rhabdovirus particles (Lahaye et al., 2012), and is often subverted from the host by viruses for a variety of functions (Gurer et al., 2002; Mayer, 2005; Nagy et al., 2011). The biological significance of these cellular proteins within FLAV particles and their potential role in the viral life cycle, if any, remain to be determined (e.g., Colpitts et al., 2011).

Based on these results, we suggest that the pseudogene 1 and 2 sequences be renamed as U1 and U3, respectively, to conform to the standard nomenclature first set forth by Gubala et al. (2008) with WONV. We also suggest that the 19 kDa protein gene be renamed as U2 for clarity among related viruses. Additionally, a putative ORF between the G and L genes in FLAV (GenBank accession KF028661), denoted as the SH ORF by Walker et al.
(2011), encodes a viroporin-like protein which contains a hydrophobic transmembrane domain (WIGTGILGLLGFIVIK), similar to the transmembrane domain of the G protein (WISIGILIVISILIC), and a highly basic C-terminus. Although the putative SH protein (120 aa) could be translated by mechanisms such as leaky ribosomal scanning, similar to that observed with the C proteins of the vesiculoviruses (Spiropoulou and Nichol, 1993), or by ribosomal frameshifting (−1) to produce a G-SH polyprotein (Liston and Briedis, 1995), conserved motifs found in FLAV strongly suggest that the SH protein is expressed by coupled translation. In addition to the pentanucleotide UAAUG junction between the G and SH proteins (where UAA is the termination codon for G and AUG is the start codon for SH), which has previously been demonstrated to be a common sequence for translational termination-reinitiation in a number of viruses (Horvath et al., 1990; Powell et al., 2008; Guo et al., 2009), FLAV contains sequences (motifs 1, 2, and 2*) that constitute the termination upstream ribosome-binding site (TURBS) essential for coupled translation that are very similar to those seen in members of the Norovirus genus within the family Caliciviridae, as well as Influenza B virus (Fig. 2) (Meyers, 2007; Powell, 2010). This represents the first, albeit tentative, recognition of a rhabdovirus utilizing coupled translation for protein expression and demonstrates that convergent evolution of this expression strategy has occurred in a diverse range of viral families.
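The hydrophobic character of the two transmembrane segments quoted above (WIGTGILGLLGFIVIK for SH and WISIGILIVISILIC for G) can be illustrated with a mean hydropathy score. The sketch below uses the Kyte-Doolittle scale, which is an assumption made here for illustration; the text does not state which method was used to predict the transmembrane domain, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues of a peptide."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


# Transmembrane segments as given in the text.
SH_TM = "WIGTGILGLLGFIVIK"
G_TM = "WISIGILIVISILIC"

for name, segment in (("SH", SH_TM), ("G", G_TM)):
    print(f"{name} transmembrane segment: {mean_hydropathy(segment):.2f}")
```

Both segments score well above zero on this scale, which is consistent with their description as hydrophobic membrane-spanning regions; a sliding-window version of the same average is the usual way to scan a full-length protein.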
Although direct evidence that the U2 and SH proteins are expressed is still lacking and will likely require more specific immunological analysis with protein-specific antibodies and/or further mass spectrometry analysis, the fact that U1 and U3 proteins were detected and that other accessory proteins in related rhabdoviruses have been shown to be expressed suggests that the U2 and SH proteins are also likely being produced in FLAV. Based on these results, we propose that the genomic organization of FLAV be represented as 3′-N-P-U1-U2-U3-M-G-SH-L-5′ (Fig. 3). Whether other small putative ORFs within the FLAV genome, such as overlapping ORFs found within the N gene, may also be functional and express proteins remains to be determined.

Table 1 (fragment recovered from the extracted text): Vero cell-derived proteins and their identified peptides.
Histone H4 (a): 16-ISGLIYEETRGVLK-30, 26-GVLKVFLENVIR-37
Cyclophilin A (a): 2-VNPTVFFDIAVDGEPLGR-19, 20-VSFELFADKVPK-31, 56-IIPGFMCQGGDFTR-69, 77-SIYGEKFEDENFILK-91, 92-HTGPGILSMANAGPNTNGSQFFICTAK-118, 132-VKEGMNIVEAMER-144, 155-KITIADCGQLE-165
CD59 (a): 56-AGLQVYNQCWK-66, 67-FANCNFNDISTLLK-80, 81-ESELQYFCCK-90
Heat shock protein 70 (a): 26-VEIIANDQGNR-36, 37-TTPSYVAFTDTER-49
(a) Numbering of the amino acid residues in the Vero cell-derived proteins is based on Chlorocebus aethiops GenBank accessions AAT78443, P62938, Q28222, and Q28216, respectively.

The degree of amino acid sequence similarity between the U1-U3 proteins in FLAV (Fig. 4) and HPV was particularly evident and strongly suggests that they arose through gene duplication in an ancestral rhabdovirus, similar to that observed with WONV and suggested for FLAV (Simon-Loriere and Holmes, 2013). Additionally, the presence of identical motifs in U1 and U3, but not U2 (e.g., YDFVWP in WONV), is of interest, and means that the order in which duplication of the genes occurred is uncertain.
Previously, gene duplication has been described in a number of other rhabdoviruses, including Bovine ephemeral fever virus (BEFV) and Adelaide River virus (ARV) (Walker et al., 1992; Wang and Walker, 1993), illustrating that this particular mechanism of virus evolution appears to have occurred independently multiple times among different members of the Rhabdoviridae, thereby contributing to the noted complexity of their genomes. As rhabdovirus genomes contain similar initiation and termination sequences within each gene, this repetitive genetic feature may facilitate the occurrence of homologous gene duplication and novel gene evolution within the family. In the case of BEFV and ARV, a nonstructural G protein (GNS), which lies directly downstream of the G protein, is believed to have been generated by homologous gene duplication of the G protein in an ancestral rhabdovirus (Walker et al., 1992; Wang and Walker, 1993). As the G and GNS proteins exhibit low levels of amino acid identity, and the GNS protein does not share characteristics of the G protein such as being incorporated into virions or inducing neutralizing antibodies in the host (Hertig et al., 1996; Johal et al., 2008), it is likely that GNS has undergone adaptive evolution and functional divergence after duplication, although its role in viral infection is unknown. While recent functional analysis of WONV has demonstrated that U3 is required for efficient viral replication, is translocated to the nucleus and modulates the host response to infection through targeting the SWI/SNF chromatin remodeling complex (Peter Walker, personal communication), it is unclear whether U1 and U2 have similar roles to U3, and whether the three proteins may act synergistically. Similarly, whether the functions of U1-U3 are conserved throughout the viruses of the Hart Park serogroup remains to be determined.
To explore the evolution of U1 in more detail, we analyzed 103 FLAV isolates from mosquitoes and birds collected annually over a 9-year period (2002-2010) in Georgia and over a 6-year period in Texas (2005-2010), as well as additional isolates from other states and older archived viruses dating back to the prototype FLAV isolate from Flanders, New York in 1961 (Table 2). Although our phylogenetic analysis revealed a low level of evolutionary change, with the vast majority of viruses falling into a single large clade (denoted as lineage A) (Fig. 5), the most notable result was the identification of a unique FLAV variant (termed lineage B), which demonstrated ~15% nucleotide divergence in U1 from lineage A. This variant lineage, which was first identified in 2005, appears to be localized to the lower coastal plain region of Georgia (Lowndes Co., Chatham Co.), and despite longitudinal in-state surveillance has never been found outside of this two-county area. Interestingly, both the prototypical FLAV (lineage A) and the variant (lineage B) appear to circulate sympatrically (Fig. 5), as they have been repeatedly isolated together from the same county (i.e., Lowndes) over a 6-year period (2005-2010). Despite such co-circulation in Georgia, it is also notable that all viruses sampled outside of Georgia fell into lineage A, as did all viruses sampled from 1961 to 1999. Although available data suggest that lineage B is transmitted primarily by the same mosquito species as other FLAV isolates (Table 2), the evolutionary factors that have driven this phylogenetic divergence in a sympatric set of viruses, such as switching to non-avian (more sedentary) hosts or to a mosquito-only cycle, are unknown and would require additional virus surveillance and serological surveying in the region. In this context, it is important to note that some Culex species (e.g., Cx.
quinquefasciatus) may feed upon mammals, including dogs and humans (Niebylski and Meek, 1992; Molaei et al., 2007). Phylogeographic analysis also revealed a significant clustering (i.e., more than expected by chance alone) by both state and county of sampling (p < 0.001 in both the AI and PS tests), indicative of some spatial barriers to viral gene flow.

Our phylogenetic analysis of the U1 sequences was also notable for the marked absence of temporal structure, which precluded a detailed analysis of the rate of evolutionary change. This is apparent both from a visual inspection of the phylogeny where, for example, the oldest viruses in our sample set (from 1961 to 1974) are generally no less divergent than viruses collected more than 40 years later, and from the very weak correlation coefficient (0.11) in the regression analysis of sampling year against root-to-tip genetic distance. Importantly, multiple independent stocks of older isolates were sequenced to confirm this observation. Such a lack of temporal structure is compatible with a relatively low rate of evolutionary change in FLAV, which is in contrast both to other rhabdoviruses studied to date, in which rates of nucleotide substitution are high (in the range of 10⁻³ to 10⁻⁴ nucleotide substitutions per site per year), as well as to a broad array of other RNA viruses (Duffy et al., 2008; Jenkins et al., 2002). The reasons underlying this very low rate of FLAV evolution and whether it is true of other Hart Park serogroup viruses clearly merit further investigation.

Mosquitoes in Georgia, USA, were collected as part of a statewide arbovirus surveillance program using a variety of methods (CDC light traps, gravid traps), identified to the species level (when possible), and stored at −80 °C until further processing.
Mosquito pools were mechanically homogenized in BA-1 media (Lanciotti et al., 2000), clarified by centrifugation (6700 × g for 10 min), and an aliquot (100 μl) was inoculated into confluent 2-day-old 4.0 cm² cultures of Vero E6 cells. Wells exhibiting cytopathology were harvested, RNA was extracted using a QIAamp Viral RNA Mini kit (Qiagen, Valencia, CA), and virus isolates were identified as FLAV by RT-PCR targeting the N gene (Nasci et al., 2001) using an AMV reverse transcriptase/GoTaq® Flexi DNA polymerase system (Promega, Madison, WI). Arbovirus surveillance in Texas, USA, was performed as described previously (Lillibridge et al., 2004). A small number of avian isolates of FLAV were included in the analysis (Table 2), and these were recovered from homogenized brain tissue of dead bird submissions using the methods described above. Archived FLAV isolates from 1961 to 1999 (Table 2) and the prototype strain of HPV from 1955 (Ar70, Culex tarsalis, Hart Park, Kern County, California, USA) were obtained from the World Reference Center for Emerging Viruses and Arboviruses (WRCEVA) at the University of Texas Medical Branch (UTMB).

The pseudogene 1, 19 kDa gene, and pseudogene 2 sequences in a representative set of spatially and temporally discrete FLAV isolates, including the original prototype strain 61-7484 (GenBank accessions KF028661-KF028670), were amplified by RT-PCR using primers designed from the original FLAV sequences (GenBank accession AH012179). The analogous region in HPV (GenBank accession KF028764) was amplified by designing primers based on highly conserved regions in FLAV. All pseudogene 1 (U1) sequences used in phylogenetic analysis (see below) have been submitted to GenBank under the accession numbers KF028671-KF028763. cDNA products of the transcripts of the pseudogene region were generated using an oligo(dT) primer and gene-specific primers based on the 5′-terminal mRNA sequence and cloned using a PCR Cloning kit (Qiagen, Valencia, CA).
Primer sequences are available from the authors upon request.

FLAV mouse hyperimmune ascites fluid (MHIAF) was generated as described previously (Tesh et al., 1983), with the work carried out under an animal use protocol approved by the UTMB. Immunoblots were performed according to standard methods (Harlow and Lane, 1999). Vero cells were infected with FLAV at a multiplicity of infection (M.O.I.) of ~1, trypsinized at day 3 post-infection, and pelleted by light centrifugation (4300 × g for 15 min). The cell pellet was washed 2X in PBS and then lysed in RIPA buffer (50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% sodium deoxycholate, 1 mM EDTA). Insoluble protein was removed by centrifugation (6700 × g for 10 min) and the lysate was mixed with 5X Laemmli sample buffer (250 mM Tris-HCl, pH 6.8, 25% β-mercaptoethanol, 10% SDS, 50% glycerol, 0.05% bromophenol blue) and boiled for 5 min. Proteins were electrophoresed by SDS-PAGE in a 10% or 12% polyacrylamide gel and transferred to 0.45 μm nitrocellulose. The membrane was blocked with 5% dry milk in TBS-0.05% Tween and probed using a 1:100 dilution of FLAV MHIAF and a 1:2000 dilution of a goat anti-mouse IgG (H+L) HRP conjugate (Jackson ImmunoResearch Laboratories, West Grove, PA). Viral protein sizes were estimated against a SuperSignal™ Molecular Weight Protein Ladder (ThermoScientific, Waltham, MA) and protein-antibody complexes were detected using a SuperSignal™ West Pico Chemiluminescent Substrate Kit (ThermoScientific). Blots were analyzed using a ChemiDoc™ MP imaging system (BioRad, Hercules, CA).

To obtain viral proteins for mass spectrometry, large-scale purification of FLAV was performed. Briefly, confluent Vero MARU cell cultures were grown in 850 cm² roller bottles (Corning Inc., Corning, NY) and infected with FLAV at an M.O.I. of ~1.
Supernatant was harvested at day 4 post-infection, clarified by low-speed centrifugation at 4400 × g for 30 min, and virus was precipitated overnight at 4 °C with 7% polyethylene glycol (PEG) and 2.3% NaCl. Virus was pelleted by centrifugation at 13,000 × g for 1 h and the pellet was resuspended in TES buffer (10 mM Tris-Cl, pH 7.4, 2 mM EDTA, 150 mM NaCl) and centrifuged (13,000 × g, 15 min) to remove the PEG. Virus was then purified on a 20% sucrose cushion followed by a 20-60% sucrose gradient in a Beckman SW 32 Ti rotor at 134,000 × g for 2 h at 4 °C using an Optima™ L-100K Ultracentrifuge (Beckman Coulter, Brea, CA). The virus band was recovered, loaded on an Amicon® Ultra-15 100K centrifugal filter unit (Millipore, Billerica, MA) for concentration and to remove low molecular weight proteins, and subjected to SDS-PAGE as previously noted except that the gel was stained with a SYPRO Ruby Protein Gel Stain (Molecular Probes, Invitrogen, Carlsbad, CA). Proteins in the gel were visualized using a UV transilluminator and a band corresponding to the approximate size of the accessory proteins of interest (U1-U3, SH; ~10-20 kDa) was cut from the gel.

Nano-scale high performance liquid chromatography coupled to tandem mass spectrometry (nano HPLC-MS/MS) was performed as described previously (Hochrainer et al., 2012). Briefly, SYPRO Ruby-stained proteins were destained, reduced using dithiothreitol (10 mM), alkylated with iodoacetamide (55 mM), and digested overnight with trypsin (0.5 μg). Tryptic peptides were collected by centrifugation (4000 × g, 2 min) and the remaining peptides in the gel were sonicated in 50% acetonitrile-5% formic acid and collected. Tryptic peptides were pooled, evaporated in a Speedvac SC110 (Thermo Savant, Milford, MA, USA), reconstituted in 2% acetonitrile-0.5% formic acid, and analyzed with nano HPLC-MS/MS using an LTQ-Orbitrap Elite mass spectrometer (Thermo-Fisher Scientific, San Jose, CA).
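The tryptic digestion step above determines which peptides can appear in the MS/MS spectra. As a minimal sketch, the function below performs an in-silico digest under a commonly used rule (cleave C-terminal to K or R unless the next residue is P), allowing a limited number of missed cleavages. The rule, the function name, and the `max_missed` parameter are assumptions made for illustration and are not taken from the search software. The test fragment is assembled from the two histone H4 peptides reported in Table 1, which share the residues GVLK; treating them as contiguous is an assumption.

```python
def tryptic_peptides(protein: str, max_missed: int = 1) -> set:
    """In-silico trypsin digest: cut after K/R (not before P)."""
    if not protein:
        return set()
    # Collect cut positions, including both termini of the protein.
    cuts = [0]
    for i, residue in enumerate(protein[:-1]):
        if residue in "KR" and protein[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(protein))
    # Join up to max_missed + 1 consecutive fully cleaved pieces.
    peptides = set()
    for i in range(len(cuts) - 1):
        last = min(i + 2 + max_missed, len(cuts))
        for j in range(i + 1, last):
            peptides.add(protein[cuts[i]:cuts[j]])
    return peptides


# Fragment joined from the two overlapping histone H4 peptides in Table 1.
H4_FRAGMENT = "ISGLIYEETRGVLKVFLENVIR"

for peptide in sorted(tryptic_peptides(H4_FRAGMENT), key=len):
    print(peptide)
```

With one missed cleavage allowed, both reported histone H4 peptides (ISGLIYEETRGVLK and GVLKVFLENVIR) are among the products, whereas the full fragment, which would need two missed cleavages, is not.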
Proteins were identified by searching MS/MS spectra using the Mascot Daemon search engine (version 2.3.02, Matrix Science, Boston, MA) against a combination database of Chlorocebus aethiops from NCBI and FLAV-specific proteins. Mascot search settings included tryptic peptide specificity of one missed cleavage site, carbamidomethyl cysteine as a fixed modification, and Asn and Gln deamidation and methionine oxidation as variable modifications. Search results of Mascot were comparable to those found using the database search algorithm SEQUEST in Proteome Discoverer 1.4 (ThermoScientific). Proteins identified by MS/MS were filtered with the false discovery rate of detected tryptic peptides at ~1% using a decoy database search in Mascot.

A phylogenetic tree was inferred for 103 U1 gene sequences (511 nt) of FLAV isolates sampled across the eastern United States (Table 2). Phylogenetic analysis was performed using the maximum likelihood (ML) method implemented in PAUP* (Swofford, 2003), employing TBR branch swapping with the best-fit model of nucleotide substitution (GTR+Γ4) determined using MODELTEST (Posada and Crandall, 1998). To assess the reliability of the groupings obtained, a bootstrap resampling analysis was undertaken, employing 1000 pseudo-replicate neighbor-joining trees estimated under the ML substitution model. To assess whether there was sufficient temporal structure in these sequence data to estimate rates of evolutionary change, we plotted the root-to-tip genetic distances determined from the ML tree against year of sampling using the Path-O-Gen program (http://tree.bio.ed.ac.uk/software/pathogen/). A broad-scale analysis of selection pressures was undertaken by estimating the numbers of synonymous (dS) and nonsynonymous (dN) nucleotide substitutions per site (ratio dN/dS) using the Single Likelihood Ancestor Counting (SLAC) method available at the Datamonkey webserver (Delport et al., 2010).
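The root-to-tip regression described above amounts to an ordinary least-squares fit of genetic distance against sampling year, with the correlation coefficient indicating how clock-like the data are. The sketch below shows only that arithmetic; it does not reproduce Path-O-Gen (which also handles rooting of the tree), the function name is hypothetical, and both data sets are invented for illustration and are not FLAV measurements.

```python
from math import sqrt


def root_to_tip_regression(years, distances):
    """Least-squares fit of root-to-tip distance on sampling year.

    Returns (slope, intercept, r): the slope approximates substitutions
    per site per year and r is the Pearson correlation coefficient.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data sets (not from the study).
years = [1960, 1970, 1980, 1990, 2000, 2010]
clocklike = [0.001 * (y - 1950) for y in years]          # steady accumulation
unstructured = [0.050, 0.047, 0.052, 0.049, 0.053, 0.048]  # no clear trend

for label, dists in (("clock-like", clocklike), ("unstructured", unstructured)):
    slope, intercept, r = root_to_tip_regression(years, dists)
    print(f"{label}: slope={slope:.2e} per site per year, r={r:.2f}")
```

A correlation near 1 supports estimating a substitution rate from the slope, whereas a weak correlation, like the 0.11 reported for the U1 data set, indicates too little temporal signal for a reliable rate estimate.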
To determine if the FLAV phylogeny is more structured by place of sampling than expected by chance alone, we computed the Association Index (AI) and Parsimony Score (PS) metrics of phylogeny-trait association using the BaTS (Bayesian tip-association significance testing) program (Parker et al., 2008). This analysis utilized a posterior distribution of phylogenetic trees inferred using the Bayesian Markov Chain Monte Carlo method available in the MrBayes package (version 3.1.2, Ronquist and Huelsenbeck, 2003) and again utilizing the GTR+Γ4 model of nucleotide substitution. For this analysis the sequences were categorized according to (a) their state of origin, and (b) their state and county of origin within the state of Georgia.

CD59 incorporation protects hepatitis C virus against complement-mediated destruction Requirement for cyclophilin A for the replication of vesicular stomatitis virus New Jersey serotype Phylogenetic relationships among rhabdoviruses inferred using the L polymerase gene Animal Rhabdoviruses Serological comparisons among Hart Park virus and strains of Flanders virus Flanders virus replication and protein synthesis Antigenic relationships among rhabdoviruses from vertebrates and hematophagous arthropods Proteomic and biochemical analysis of purified human immunodeficiency virus type 1 produced from infected monocyte-derived macrophages Dengue virus capsid protein binds core histones and inhibits nucleosome formation in human liver cells Arbovirus isolations from mosquitoes collected in central Utah in 1967 Application of broad-spectrum resequencing microarray for genotyping rhabdoviruses Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology Family Rhabdoviridae Rates of evolutionary change in viruses: patterns and determinants Serological relationships of animal rhabdoviruses Complex nuclear localization signals in the matrix protein of vesicular stomatitis virus Genomic characterisation of Wongabel virus reveals
novel genes within the Rhabdoviridae Ngaingan virus, a macropod-associated rhabdovirus, contains a second glycoprotein gene and seven novel open reading frames Rhabdoviruses: Molecular Taxonomy, Evolution, Genomics, Ecology, Host-Vector Interactions Coupled termination/ reinitiation for translation of the downstream open reading frame B of the prototypic hypovirus CHV1-EP713 Specific incorporation of heat shock protein 70 family members into primate lentiviral virions Using Antibodies: A Laboratory Manual Vaccinia virus-expressed bovine ephemeral fever virus G but not G(NS) glycoprotein induces neutralizing antibodies and protects against experimental infection Monoubiquitination of nuclear RelA negatively regulates NF-KB activity independent of proteasomal degradation The evolutionary genetics of emerging viruses Eukaryotic coupled translation of tandem cistrons: identification of the influenza B virus BM2 polypeptide A high-affinity inhibitor of human CD59 enhances complementmediated virolysis of HIV-1: implications for treatment of HIV-1/AIDS Seroepidemiology of arboviruses among seabirds and island residents of the Great Barrier Reef and Coral Sea Rates of molecular evolution in RNA viruses: a quantitative phylogenetic analysis Antigenic characterization of bovine ephemeral fever rhabdovirus G and GNS glycoproteins expressed from recombinant baculoviruses Diseases derived from wildlife International Catalogue of Arboviruses, Including Certain Other Viruses of Vertebrates Arbovirus studies in the Ohio-Mississippi Basin, 1964-1967. III. Flanders virus Hsp70 protein positively regulates rabies virus infection Rapid detection of West Nile virus from human clinical specimens, field-collected mosquitoes, and avian samples by a TaqMan reverse transcriptase-PCR assay The 2002 introduction of West Nile virus into Harris County, Texas, an area historically endemic for St. 
Louis encephalitis Ribosomal frameshifting during translation of measles virus P protein mRNA is capable of directing synthesis of a unique protein Arbovirus surveillance in Connecticut. III. Flanders virus Field evidence against transovarial transmission of Flanders virus in Connecticut Recruitment of Hsp70 chaperones: a crucial part of viral survival strategies Characterization of the sequence element directing translation reinitiation in RNA of the calicivirus rabbit hemorrhagic disease virus Analysis of virion associated host proteins in vesicular stomatitis virus using a proteomics approach Host feeding pattern of Culex quinquefasciatus (Diptera: Culicidae) and its role in transmission of West Nile Emerging picture of host chaperone and cyclophilin roles in RNA virus replication West Nile virus isolates from mosquitoes in New York and New Jersey Proteomics analysis unravels the functional repertoire of coronavirus nonstructural protein 3 Blood-feeding of Culex mosquitoes in an urban environment Correlating viral phenotypes with phylogeny: accounting for phylogenetic uncertainty MODELTEST: testing the model of DNA substitution Characterization of the termination-reinitiation strategy employed in the expression of influenza B virus BM2 protein Translational termination-reinitiation in RNA viruses MRBAYES 3: Bayesian phylogenetic inference under mixed models Identification of host proteins associated with retroviral vector particles by proteomic analysis of highly purified vector preparations Gene duplication is infrequent in the recent evolutionary history of RNA viruses A small highly basic protein is encoded in overlapping frame within the P gene of vesicular stomatitis virus PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. 
Sinauer Associates Antigenic relationship among rhabdoviruses infecting terrestrial vertebrates Extracellular enveloped vaccinia virus is resistant to complement because of incorporation of host complement control proteins into its envelope Rhabdovirus accessory genes The genome of bovine ephemeral fever rhabdovirus contains two related glycoprotein genes Adelaide River rhabdovirus expresses consecutive glycoprotein genes as polycistronic mRNA: new evidence of gene duplication as an evolutionary process Flanders strain, an arbovirus newly isolated from mosquitoes and birds of New York state We thank Laura Fiorentino, Julia Sprang, and Jennifer Abi Younes for technical support and Peter Walker for helpful discussions of the manuscript. We also thank Wei Chen, James McCardle, and Sheng Zhang of the Proteomics and Mass Spectrometry Core Facility, Cornell University Institute of Biotechnology, for their expertise and support on gel-based protein identifications. Funding for arbovirus surveillance in Georgia was provided by the Centers for Disease Control Epidemiology and Laboratory Capacity Cooperative Agreement. Support for work at the University of Texas Medical Branch was provided by the National Institutes of Health (NIH) contract HHSN27220-100004OI/HHSN27200004/ D04. Additional funding was provided by the wildlife management agencies of the Southeastern Cooperative Wildlife Disease Study member states through the Federal Aid to Wildlife Restoration Act (50 Stat.917) and other sources, and by the U.S. Department of the Interior Cooperative Agreement G11AC20003. E.C.H. was supported by a NHMRC Australia Fellowship. Additional support was provided through a NRSA Fellowship (F32AI100545) to A.B.A. from the National Institute of Allergy and Infectious Diseases, NIH.
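As an illustration of the phylogeny-trait association analysis described in the methods, the sketch below shows how a parsimony score (PS) might be computed for a discrete sampling-location trait and compared against a null distribution obtained by shuffling tip labels. It is a simplified stand-in and not the BaTS implementation: the nested-tuple tree representation, the function names, the toy trees, and the permutation scheme are all assumptions made for illustration, and the Association Index (AI) is not reproduced.

```python
import random


def fitch_parsimony_score(tree, tip_states):
    """Minimum number of trait changes on a rooted binary tree (Fitch algorithm).

    tree: nested 2-tuples whose leaves are tip names (str); hypothetical format.
    tip_states: dict mapping tip name -> discrete trait (e.g. state of origin).
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):
            return {tip_states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:
            return shared
        changes += 1  # children disagree, so one change is required here
        return left | right

    visit(tree)
    return changes


def mean_ps(trees, tip_states):
    """Average PS over a sample of trees (e.g. a posterior sample)."""
    return sum(fitch_parsimony_score(t, tip_states) for t in trees) / len(trees)


def ps_permutation_test(trees, tip_states, n_null=1000, seed=0):
    """Compare the observed mean PS with a null built by shuffling tip traits.

    Returns (observed_mean_ps, p_value), where p_value is the fraction of
    shuffled replicates whose mean PS is as low as or lower than observed.
    A low PS means the trait is more clustered on the tree.
    """
    rng = random.Random(seed)
    names = sorted(tip_states)
    observed = mean_ps(trees, tip_states)
    as_extreme = 0
    for _ in range(n_null):
        shuffled = [tip_states[name] for name in names]
        rng.shuffle(shuffled)
        null_states = dict(zip(names, shuffled))
        if mean_ps(trees, null_states) <= observed:
            as_extreme += 1
    return observed, as_extreme / n_null


# Toy example: eight tips, two sampling locations, perfectly clustered by clade.
clustered_tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
locations = {"a": "GA", "b": "GA", "c": "GA", "d": "GA",
             "e": "NY", "f": "NY", "g": "NY", "h": "NY"}
observed_ps, p_value = ps_permutation_test([clustered_tree], locations)
print(observed_ps, p_value)
```

In this toy case a single change explains the trait distribution, so the observed PS is far lower than most shuffled replicates, which is the pattern expected when a phylogeny is structured by place of sampling.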