key: cord-0874987-woy1plcv authors: Le Loc’h, Guillaume; Bertagnoli, Stéphane; Ducatez, Mariette F. title: Time scale evolution of avipoxviruses date: 2015-07-29 journal: Infect Genet Evol DOI: 10.1016/j.meegid.2015.07.031 sha: b571790a3eb2e8e7b5448ff8a972d5b9c7fee259 doc_id: 874987 cord_uid: woy1plcv Avipoxviruses are divided into three clades: canarypox-like viruses, fowlpox-like viruses, and psittacinepox-like viruses. Several molecular clock and demographic models available in the BEAST package were compared on three avipoxvirus genes (P4b, cnpv186 and DNA polymerase genes), which enabled to determine that avipoxviruses evolved at a rate of 2–8 × 10(−5) substitution/site/year, in the range of poxviruses previously reported evolution rates. In addition, the date of mean time of divergence of avipoxviruses from a common ancestor was extrapolated to be about 10,000–30,000 years ago, at the same period as modern poxvirus species. Our findings will facilitate epidemiological investigations on avipoxviruses’ spread, origin and circulation. The Poxviridae virus family is divided into two subfamilies: the Chordopoxvirinae (containing vertebrate viruses) and the Entomopoxvirinae (containing insect viruses). The Chordopoxvirinae is further divided into ten genera: nine mammalian genera and the Avipoxvirus genus, subject of the present study (Haller et al., 2014; http://ictvonline.org/). Based on basic genomic analyses (restriction fragment length polymorphism), avipoxviruses (APV) are currently divided into ten virus species (http://ictvonline.org/) but more recent phylogenetic studies identified three major clades: canarypox-like viruses, fowlpox-like viruses, and psittacinepox-like viruses (Jarmin et al., 2006) . Classically, APV were considered to be host species-or orderspecific. Genus taxonomy had been based on this concept until recent studies showed that Otididae, Columbidae and Accipitridae can be infected by a large diversity of strains (Abdallah and Hassanin, 2013; Gyuranecz et al., 2013; Jarmin et al., 2006; Le Loc'h et al., 2014) . The age of divergence between the three main APV clades has not been estimated. So far, only three genes have been widely studied: fpv094/cnpv121 (locus P4b, hereafter ''P4b''), fpv140/cnpv186 (hereafter ''cnpv186'') and fpv167/cnpv240 (hereafter ''DNA polymerase'') (Gyuranecz et al., 2013; Jarmin et al., 2006; Le Loc'h et al., 2014) . Besides, only four APV have been fully sequenced: one Fowlpox virus (Afonso et al., 2000) , one Pigeonpox virus, one Penguinpox virus (Offerman et al., 2014) , and one Canarypox virus (Tulman et al., 2004) . Their analyses have shown large genomic rearrangements and suggest significant genomic diversity among APV. Codivergence between DNA (and in some cases RNA) viruses and their hosts is considered as a main evolution path and should thus be associated with low mutation rates (Holmes, 2004; Holmes and Drummond, 2007; Villarreal and DeFilippis, 2000) . Although this codivergence seems more likely in the case of papillomaviruses (millions of years of association, (Bernard et al., 2006) ), it is less probable or at least more recent in the case of newly emerging viruses (e.g. the Severe Acute Respiratory Syndrome Coronavirus evolution (Peiris et al., 2004) ). The literature suggests that a short sampling time interval (as compared to the real evolution timescale) biases the calculations (Ho et al., 2007) . The aim of the present preliminary study was to assess substitution rate and time to most recent common ancestor (TMRCA) of APV using the currently available data. Our results were compared with data available for other poxviruses and a hypothesis on the pathogen's spread and circulation is given. All available DNA sequences of APV fpv094/cnpv121 (locus P4b, hereafter ''P4b''), fpv140/cnpv186 (hereafter ''cnpv186'') and fpv167/cnpv240 (hereafter ''DNA polymerase'') genes were retrieved from GenBank/EMBL databases. least year, month and day when available) was collected for each sequence. In case no information was available, the 1st of July of the year of sequence publication was then considered as sample collection date. Only the oldest sequence was kept when identical sequences were available. Table 1 summarizes the details of included APV sequences. For each gene, DNA sequences were translated to amino acid (aa) sequences using BioEdit 7.2.5 (Hall, 1999) , the aa sequences were aligned with ClustalX 2.1 (Larkin et al., 2007) , and the original BioEdit format aa sequences were aligned with the ClustalX aa aligned sequences before they were reverse-translated in an aligned nucleotide file using BioEdit. The sequence of the APV pigeon/CAPM V-101/CZE/1965 P4b gene (accession number: GU982736) was removed due to many deletions resulting in non-functional short proteins; the P4b sequences of APV isolates from Hawaii EL1.1H1900, APAP16.2H03, and HAAMB.07H02 (accession numbers EF568402 to EF568404) were also removed because they were too short (116 bp). In order to exclude recombinant sequences (as done in Firth et al., 2010) which can bias the calculation, possible recombinant sequences were searched with the Recombinant Identification Program (RIP) (www.hiv.lanl.gov). None were identified. Overall, we analyzed 81 P4b gene sequences (partial open reading frame, ORF: 426 nucleotides) from specimens collected between 1865 and 2014; 31 cnpv186 genes (complete ORF: 933 to 1020 nucleotides) from specimens collected between 1965 and 2014; and 47 DNA polymerase genes (partial ORF: 555 nucleotides) from specimens collected between 1980 and 2014 (Table 1) . Phylogenies taking sampling time into account were estimated using the Bayesian Markov chain Monte Carlo (MCMC) inference methods available in the BEAST package (v.1.4.8, (Drummond and Rambaut, 2007) ). The analyses presented were run using the Tamura and Nei 1993 model (TN93) with gamma-distributed rate: the closest substitution model available in BEAST to the best fitting nucleotide substitution model estimated by means of hierarchical likelihood ratio approach using Mega v 6.06 (Tamura et al., 2013) ; and the second best nucleotide substitution model as per MrModeltest (Nylander, 2004) . We used empirical base frequencies and three partitions into codon positions. Different combinations of demographic models and clock models were compared as suggested (Drummond and Rambaut, 2007) : constant size or extended Bayesian skyline plot models, strict or relaxed (lognormal or exponential) clocks. At least 200 million MCMC iterations were run for each gene and each demographic model/clock model combination. Subsampling was performed every 10,000 generations to decrease autocorrelation between model parameter samples. The estimation of parameters and divergence time was carried out using Tracer (http://beast.bio.ed.ac.uk/Tracer). A correct mixing of MCMC was verified by effective sampling size (ESS) calculations available in Tracer. The best fitting model was also assessed using ''Model Comparison'' available in Tracer and choosing the lowest Akaike's information criterion (AICM) value. The dN/dS ratios (x) were calculated using the single-likelihood ancestor (SLAC) method available in the HyPhy package available on the http://www.datamonkey.org/ server (Kosakovsky Kosakovsky Pond and Frost, 2005) . Phylogenetic trees based on three APV genes (P4b, cnpv186, and DNA polymerase) DNA sequences were constructed and all confirmed the common classification in the genus: canarypox-like viruses, fowlpox-like viruses, and psittacinepox-like viruses ( Fig. 1) . To assess both substitution rate per site per year and TMRCA, different combinations of demographic models and clock models were compared: constant size or extended Bayesian skyline plot models, strict or relaxed (lognormal or exponential) clocks. The lowest AICM values were obtained with a relaxed exponential clock and the extended Bayesian skyline plot model for both cnpv186 and the DNA polymerase genes (AICM values of 11,673.74 ± 0.11 standard error (SE) and 7,316.26 ± 0.12 SE for cnpv186 and DNA polymerase genes, respectively, Table 2 ). For the P4b gene, the AICM value was slightly lower with a strict clock and the Extended Bayesian Skyline Plot model than with a relaxed exponential clock and the extended Bayesian skyline plot model (AICM values of 8, 492.42 ± 0.21 SE and 8, 493.49 ± 0.44 SE, respectively) . We however decided to use the same combination for our three genes for consistency: a relaxed exponential clock and the extended Bayesian skyline plot model (Table 2 ). Analyses were also performed using the General Time Reversible substitution model with gamma-distributed rate and invariant sites (GTR + G + I, best fitted model as per MrModeltest), which gave similar results (data not shown). Our molecular clock analyses showed that APV diverged from a common ancestor approximately 10-30 thousand years ago: 8,682 (95% Highest Posterior Density (HPD): 950-60,718), 10,891 (95% HPD: 963-93,707), and 29,259 (95% HPD: 1871-197,714) years ago for DNA polymerase, cnpv186, and P4b, respectively ( Fig. 1) . Interestingly, fowlpox-like and psittacinepox-like viruses shared a more recent common ancestor than canarypox-like viruses: they diverged 5-16 thousand years ago (5,034 (95% HPD: 625-34,636) years ago for DNA polymerase, 16,012 (95% HPD: 1208-117,230) years ago for P4b, no psittacinepox-like viruses sequence are available for cnpv186, Fig. 1 ). When considering APV collected the same year in the same location such as The present study aimed at assessing the substitution rate of APV and at estimating their TMRCA. We compared several clock and demographic models available in the BEAST package and found that APV evolved at a rate of 2-8 Â 10 À5 substitution/site/year. The date of mean time of divergence of APV from a common ancestor was extrapolated to be about 10,000-30,000 years ago. We observed a threefold difference between substitution rates and TMRCA of P4b on the one hand and cnpv186 and DNA polymerase on the other hand (Table 2) . DNA polymerase encodes a polymerase protein, cnpv186 an immunodominant protein, and P4b a core protein (Tulman et al., 2004) . The slower evolution rate of P4b is therefore somewhat surprising: while viable DNA polymerase and cnpv186 mutants may be more difficult to generate, one could expect more flexibility on a core protein. The selection pressure may however be lower on P4b than on its counterparts. Esposito et al. showed that the genes coding for host range and virulence proteins vary the most for Smallpox Virus compared to genes involved in virus replication, suggesting that these genes are most targeted by selection pressure (Esposito, 2006) . But the large confidence intervals (95% HPD) associated with the values we found (within the range of what had been calculated in previous studies) does not show a statistically significant difference (Table 2) . Therefore, it is not possible to draw any conclusion. In addition, our analysis has a bias as it was carried out on three genes only: a very small fragment of avipoxvirus genomes (260-306 kbp). Cnpv186, P4b and DNA polymerase genes were selected on the basis of available sequence data but might have a substitution rate different from the rest of the avipoxvirus genome. The molecular clock analyses of avipoxviruses will gain in precision when more sequence data (and more full genomes) become available. To run the analyses with as much data as possible, we extrapolated sampling years for some sequences. This also represents a bias. When excluding the sequence data for which no collection year was available, the substitution rates and TMRCA results for P4b and DNA polymerase genes were almost unchanged, but cnpv186 gene had a tenfold lower calculated substitution rate and a TMRCA of 2,274 years only. McLysaght et al. (2003) tested poxvirus genes for selection acting on the evolution of genes by looking at synonymous and nonsynonymous mutations. They observed positive selection for genes involved in virulence or for gene candidates for host-pathogen interaction (McLysaght et al., 2003) . Indeed, positive selection of viruses could also contribute to an increase of substitution rate, which can be observed either for viruses inducing a strong host immune response or for those which have been the object of vaccination or intervention campaigns (Firth et al., 2010) . While the houbara bustards we focus on in our projects in Morocco, the United Arab Emirates, Kazakhstan and Uzbekistan are vaccinated annually (Le Loc'h et al., 2015) , APV vaccination is not systematic worldwide, thus making selection pressure assessment difficult. Our selection pressure analyses demonstrated a purifying selection of the three APV genes we studied with x values ranging between 0.065 and 0.200, suggesting that a non-synonymous mutation has only 7-20% as much chance as a synonymous mutation of being fixed in the population. This finding might correlate with relatively weak vaccination/intervention pressure and may be linked with the low host specificity of avipoxviruses. Vaccination is likely not the sole factor driving potential selection pressure and ecological and epidemiological factors should be further investigated for a proper understanding of the observed selection phenomenon. Li et al. (2007) worked on Smallpox Virus evolution and linked sequence data and the first description of smallpox in humans (4th century AD in China; suspicions of smallpox as early as 1122 BC). Their dating was based on time-structured sequence data (207-231 years before present (ybp) using strict and relaxed clocks, respectively; these values were discarded for obvious discrepancy with historical and epidemiological data), or with calibration with historical records of smallpox infection (1400-6300 ybp) (Li et al., 2007) . However, the latter method proved erroneous as epidemiological records may not correlate with real origin of a pathogen (Hughes et al., 2010; Shchelkunov, 2009 ). The absence of ancient epidemiological data on APV prevented us from including any historical calibration in our analyses. Several research teams have in the past warned evolutionary biologists studying DNA viruses about the biases associated with the use of heterochronous phylogenetic modeling designed for faster evolving RNA pathogens (Firth et al., 2010) . In addition, it was shown that molecular evolution is artificially accelerated on short timescales (Ho et al., 2007) . Indeed our datasets contain very recent sequences only (from 1865 to now) compared to the TMCRA of APV. Hence our results should be considered with caution. But Babkin and Shchelkunov estimated that Poxviridae evolve at a rate of 0.9-1.2 Â 10 -6 substitutions/site/year. They calculated the ancestors of poxviruses, orthopoxviruses, and modern poxvirus species dated approximately 500,000, 300,000, and 14,000 years, respectively (Babkin and Shchelkunov, 2006) . The evolution rate of APV estimated in the present study is therefore within the range of what was observed for poxviruses. The divergence between APV clades (canarypox-like viruses, fowlpox-like viruses, and psittacinepox-like viruses) is as old as the divergence among modern poxvirus species. Our results therefore suggest that APV may keep their ''Genus'' classification and not become a sub-family as proposed in the past (Jarmin et al., 2006) , even if taxonomic classifications could not be based solely on the time of divergence. In addition, when Firth et al. tried to account for the specificity of double stranded DNA genomes in evolution calculations, their most robust virus model (with the different clock and demographic models tested) was Variola virus, suggesting that heterochronous phylogenetic modeling may be used for poxviruses evolution calculations (Firth et al., 2010) . They however emphasized that the TMRCA calculated for Variola virus based on the high substitution rates observed were likely erroneous. Our TMRCA values should really be considered as first estimates and not overinterpretated. In conclusion, the present study based on three genes showed that APV evolve at a rate of 2-8 Â 10 À5 substitution/site/year and that the date of mean time of divergence of APV from a common Table 2 Mean and 95% HPD of the Bayesian posterior estimates of substitution rate and TMRCA for avipoxviruses. ancestor is about 10,000-30,000 years ago. This new finding should facilitate future epidemiological investigations on virus spread, origin and circulation. G.L.L., M.F.D. and S.B. designed the study and drafted the manuscript. G.L.L. and M.F.D. performed and interpreted the analyses. The authors declare no conflict of interest. Detection and molecular characterization of avipoxviruses isolated from different avian species in Egypt The genome of fowlpox virus Time scale of poxvirus evolution Genome variation of human papillomavirus types: phylogenetic and medical implications BEAST: Bayesian evolutionary analysis by sampling trees Genome sequence diversity and clues to the evolution of variola (smallpox) virus Using time-structured data to estimate evolutionary rates of double-stranded DNA viruses Worldwide phylogenetic relationship of avian poxviruses BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT Poxviruses and the evolution of host range and virulence Evidence for time dependency of molecular rate estimates The phylogeography of human viruses The evolutionary genetics of viral emergence The evolutionary biology of poxviruses Avipoxvirus phylogenetics: identification of a PCR length polymorphism that discriminates between the two major clades Not so different after all: a comparison of methods for detecting amino acid sites under selection Datamonkey: rapid detection of selective pressure on individual sites of codon alignments Clustal W and clustal X version 2.0 Diversity of avipoxviruses in captive-bred Houbara bustard Outbreaks of pox disease due to canarypox-like and fowlpox-like viruses in large-scale Houbara bustard captive-breeding programmes, in Morocco and the United Arab Emirates On the origin of smallpox: correlating variola phylogenics with historical smallpox records Extensive gene gain associated with adaptive evolution of poxviruses MrModeltest v2. Program Distributed by the Author The complete genome sequences of poxviruses isolated from a penguin and a pigeon in South Africa and comparison to other sequenced avipoxviruses Severe acute respiratory syndrome How long ago did smallpox virus emerge? MEGA6: molecular evolutionary genetics analysis version 6.0 The genome of canarypox virus A hypothesis for DNA viruses as the origin of eukaryotic replication proteins Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.meegid.2015.07. 031.