key: cord-0686888-yb1vaggq authors: Valastro, Viviana; Holmes, Edward C.; Britton, Paul; Fusaro, Alice; Jackwood, Mark W.; Cattoli, Giovanni; Monne, Isabella title: S1 gene-based phylogeny of infectious bronchitis virus: An attempt to harmonize virus classification date: 2016-04-30 journal: Infection, Genetics and Evolution DOI: 10.1016/j.meegid.2016.02.015 sha: 6ce753b839a8001030d5f70cb9d6e5e2099df01c doc_id: 686888 cord_uid: yb1vaggq Abstract Infectious bronchitis virus (IBV) is the causative agent of a highly contagious disease that results in severe economic losses to the global poultry industry. The virus exists in a wide variety of genetically distinct viral types, and both phylogenetic analysis and measures of pairwise similarity among nucleotide or amino acid sequences have been used to classify IBV strains. However, there is currently no consensus on the method by which IBV sequences should be compared, and heterogeneous genetic group designations that are inconsistent with phylogenetic history have been adopted, leading to the confusing coexistence of multiple genotyping schemes. Herein, we propose a simple and repeatable phylogeny-based classification system combined with an unambiguous and rational lineage nomenclature for the assignment of IBV strains. By using complete nucleotide sequences of the S1 gene we determined the phylogenetic structure of IBV, which in turn allowed us to define 6 genotypes that together comprise 32 distinct viral lineages and a number of inter-lineage recombinants. Because of extensive rate variation among IBVs, we suggest that the inference of phylogenetic relationships alone represents a more appropriate criterion for sequence classification than pairwise sequence comparisons. The adoption of an internationally accepted viral nomenclature is crucial for future studies of IBV epidemiology and evolution, and the classification scheme presented here can be updated and revised as novel S1 sequences become available.
Infectious bronchitis virus (IBV) is the etiological agent of an acute and highly contagious disease that affects chickens of all ages and poses a major economic burden on the poultry industry. The virus exists in a wide range of antigenically and genetically distinct viral types, making the prevention and control of this important pathogen both complex and challenging. Although the natural host of IBV is the chicken, the presence of IBV-like and other avian coronaviruses in both domestic and wild animals, including domestic fowl, partridge, geese, pigeon, guinea fowl, teal, duck and peafowl, has been reported (Cavanagh, 2005, 2007). IBV is a single-stranded, positive-sense RNA virus of the family Coronaviridae, genus Gammacoronavirus (Cavanagh and Naqi, 2003; International Committee on Taxonomy of Viruses, http://www.ictvonline.org/virustaxonomy.asp). The viral genome comprises two untranslated regions (UTRs) at the 5′ and 3′ ends (Boursnell et al., 1987; Ziebuhr et al., 2000), two overlapping open reading frames (ORFs) encoding the polyproteins 1a and 1ab, and regions encoding the main structural proteins: spike (S), envelope (E), membrane (M) and nucleocapsid (N) (Spaan et al., 1988; Sutou et al., 1988). In addition, two accessory genes, ORF3 and ORF5, expressing proteins 3a and 3b and 5a and 5b, respectively, have been described (Casais et al., 2005; Hodgson et al., 2006; Lai and Cavanagh, 1997). The S protein (encoded by a gene of ~3462 nt), located on the surface of the viral membrane, is the major inducer of neutralizing antibodies (Cavanagh and Naqi, 1997; Winter et al., 2008) and is responsible for virus binding and entry to host cells (Cavanagh et al., 1986a; Koch et al., 1990; Niesters et al., 1987). It is post-translationally cleaved into the amino-terminal S1 (~535 amino acids) and the carboxyl-terminal S2 (~627 amino acids) subunits at a multi-basic cleavage site (Cavanagh et al., 1986b).
The observation that IB serotypes may differ by 20% to 25% at the genomic scale, and by up to 50% of amino acids in the S1 protein, has warranted considerable attention (Cavanagh and Gelb, 2008). Such variability may lead to important biological differences between strains, and novel serotypic variants can emerge as the result of a limited number of amino acid changes in the spike protein. Nucleotide heterogeneity is most prevalent in the S1 portion of the S gene and is largely contained within three different hypervariable regions (HVRs) (aa 38-67, 91-141 and 274-387) (Moore et al., 1997). Accordingly, the analysis of the complete or partial S1 gene nucleotide sequence has been conventionally used to determine viral genetic types. Currently, more than 50 different antigenic and genetic types of IBV have been recognized, some with substantial economic impact on the livestock industry, and others restricted to specific geographical areas (de Wit et al., 2011a; Jackwood, 2012). Effective surveillance is primarily based on the identification of the virus type causing disease (Jackwood and de Wit, 2013). A variety of methods have been developed to differentiate IBV strains. Systems that examine the antigenic or genetic features of an isolate result in the description of serotypes and genotypes, respectively, whereas methods that are focused on the immune response of chickens against challenge with an IBV strain lead to the designation of protectotypes (Lohr, 1988). Importantly, however, the genotype-, serotype- or protectotype-based approaches do not always group IBVs in the same way. In the absence of fast and appropriate biological assays for IBV classification, analyses of S1 sequence data are the most widely used means to assign IBV strains to groups, arbitrarily and confusingly defined as genetic types, genotypes, clades or clusters.
Both phylogenetic analysis and measures of pairwise similarity between nucleotide and amino acid sequences have been used for this purpose. However, there is no agreement on the exact method by which sequences should be compared nor on the criteria used to distinguish viral genetic types. This is in part due to the rapid appearance of novel variants and a lack of consistency and uniformity in the nomenclature of the IBV genetic groups. For example, several genotyping studies have been performed on IBV within a specific geographic area without considering a more global context (see below). As a consequence, different clade designations, such as 'Korean New Cluster II' (Lim et al., 2012), 'JP-IV' (Mase et al., 2010) and 'Chinese New Type', have been assigned to describe closely related viruses. Further confusion arises because different regions of the S1 subunit have been used to infer phylogenetic trees, and which region is most informative is debated (Kingham et al., 2000; Li et al., 2012; Mo et al., 2013; Schikora et al., 2003; Wang and Huang, 2000). Although it is generally true that longer sequences are more informative, several laboratories use a part of S1 that can include one or more HVRs. The study described here was performed with the aim of constructing a comprehensive, reliable and robust phylogenetic inference on a global scale as the basis for classifying IB viruses for epidemiological purposes. Due to its variability and biological function, the S1 gene is the region commonly sequenced as an ideal target in molecular assays to type IBV strains. Accordingly, we focused on the complete S1 gene. Using all publicly available S1 gene sequence data, our goal was to determine the genetic structure of IBV and to propose a rational and standardized nomenclature of the IBV genetic groups identified here, referred to as lineages.
In addition, we evaluated the ability of S1 fragments of different sizes to recapitulate the phylogeny and classification obtained from full-length S1 sequences. All available nucleotide sequences corresponding to the complete coding sequence of the S1 gene (~1620 bp) of IBV (n = 1652) were downloaded from GenBank (http://www.ncbi.nlm.nih.gov). Details on these sequences, including their genotype and serotype, were extracted from the GenBank annotations. Sequences shorter than 1440 bp, those of low quality (for example, resulting in a nonsense and/or truncated S1 protein) and those identical in both sequence and strain name were removed, resulting in a final data set of 1518 sequences. An alignment of the complete S1 gene was performed with a slow and iterative refinement method (FFT-NS-i) implemented in Mafft v.7.0 (http://mafft.cbrc.jp/alignment/software/; Katoh and Standley, 2013) and a maximum likelihood (ML) phylogenetic tree was estimated (see below). The initial ML tree revealed that some previously recognized IBV groups did not form monophyletic groups (Supplementary material Fig. S1; see Results), indicative of inter-lineage recombination events that are relatively frequent in IBV (Cavanagh et al., 1992b; Kottier et al., 1995; Lee and Jackwood, 2000). To confirm the occurrence of recombination, smaller sequence data sets comprising the suspected recombinant and the putative parental strains were analyzed using the RDP, Geneconv, Maxchi, BootScan, 3Seq and Chimaera methods available in the RDP package v.4 (Martin et al., 2010), applying default settings. The Simplot program v.3.5 was also used to define the locations of recombination break-points (Lole et al., 1999). We considered "true recombinants" to be those sequences identified by at least two methods (P < 1 × 10⁻¹⁰) and confirmed by significant phylogenetic incongruence among trees estimated on either side of the putative recombination break-points.
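The initial quality-control step described above (minimum length, no nonsense codon, no duplicate by strain name and sequence) can be illustrated with a minimal sketch. It is not the procedure actually used in this study: the function names and the `(strain_name, sequence)` record layout are hypothetical, and the sketch assumes that each sequence starts in frame at the first codon of S1, so that any in-frame stop codon indicates a truncated protein.

```python
# Illustrative sketch only; names and record layout are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_LENGTH = 1440  # length threshold stated in the text (bp)


def has_in_frame_stop(seq):
    """Return True if any complete in-frame codon is a stop codon.

    Assumes the sequence starts at the first codon of S1; S1 is an
    internal part of the S gene, so no stop codon is expected.
    """
    seq = seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS for i in range(0, len(seq) - 2, 3))


def filter_records(records):
    """Keep records that pass the length, nonsense and duplicate checks.

    records: iterable of (strain_name, sequence) tuples.
    A record is a duplicate only if both name and sequence match an
    earlier record, mirroring the criterion described in the text.
    """
    seen = set()
    kept = []
    for name, seq in records:
        key = (name, seq.upper())
        if len(seq) < MIN_LENGTH or has_in_frame_stop(seq) or key in seen:
            continue
        seen.add(key)
        kept.append((name, seq))
    return kept
```

In practice, other quality problems mentioned in the text (such as a mismatch between the published strain description and the deposited sequence) require manual curation and are not captured by a filter of this kind.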
All sequences with a history of recombination determined in this manner were removed from the original 1518 sequence data set used to identify 'pure' IBV lineages, but are described as recombinant IBV forms (see Results). In addition, a number of sequences were considered to be unreliable due to a lack of congruence between the strain description in the associated publication and the corresponding nucleotide sequences. This quality control step resulted in a final data set of 1286 full-length S1 sequences, which was used to determine the phylogenetic relationships among IBV strains and to classify them into well-established lineages. Evolutionary distances between lineages and genotypes were inferred using the complete S1 data set, with pairwise (p-distance) comparisons of nucleotide and amino acid sequences performed using the MEGA6 program (Tamura et al., 2013). To facilitate tree visualization we performed an additional phylogenetic analysis using a smaller subset of full-length S1 sequences (n = 199). This subset comprised, where available, 6 representative sequences of each IBV lineage identified in the final 'cleansed' data set described above. In addition, 26 strains recognized as unique variants because they did not group with any of the identified lineages were included. Detailed information on the selected isolates along with their corresponding nucleotide sequences is provided as Supplementary materials (Table S1). The same data set was also used to assess whether the lineages established using phylogenetic analysis of the complete data set were maintained when only a portion of the S1 gene was analyzed. To that end, two different phylogenetic trees were inferred using the two most commonly sequenced regions, corresponding to the coding sequences of HVRs1 and 2, located between nucleotide positions 112 and 423, and HVR3, located between positions 820 and 1161 of the S1 gene (numbering according to the sequence M21883).
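Because the HVR coordinates above are given relative to a reference sequence (M21883), extracting the same regions from an alignment requires mapping reference positions to alignment columns. The following is a minimal sketch of that mapping, assuming a gapped alignment in which the reference row is available as a string; the function and variable names are hypothetical and not taken from any of the software packages cited here.

```python
# 1-based, inclusive nucleotide coordinates stated in the text.
REGIONS = {"HVR1-2": (112, 423), "HVR3": (820, 1161)}


def ref_to_alignment_columns(aligned_ref, start, end, gap="-"):
    """Map 1-based inclusive reference coordinates to alignment columns.

    Returns a (first, stop) pair usable as a Python slice, where gap
    characters in the aligned reference row are not counted as positions.
    """
    pos = 0
    first = last = None
    for col, ch in enumerate(aligned_ref):
        if ch == gap:
            continue
        pos += 1
        if pos == start:
            first = col
        if pos == end:
            last = col
            break
    if first is None or last is None:
        raise ValueError("coordinates fall outside the reference sequence")
    return first, last + 1


def extract_region(aligned_seq, aligned_ref, start, end):
    """Slice one aligned sequence at the columns spanning start..end."""
    first, stop = ref_to_alignment_columns(aligned_ref, start, end)
    return aligned_seq[first:stop]
```

On an ungapped reference these coordinates yield fragments of 312 nt (HVRs1 and 2) and 342 nt (HVR3), which agrees with the fragment lengths reported for the partial-gene trees.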
Finally, two additional data sets were created to determine whether there was sufficient temporal structure in the data to undertake a molecular clock dating analysis. The first data set consisted of 372 sequences sampled between 1956 and 2013 and randomly selected from the complete data collection, while the second data set represented a single large lineage (here named lineage GI-19 but originally designated as QX) of relatively closely related viruses sampled between 1993 and 2010. Specifically, all the GI-19 S1 gene full-length sequences collected before the administration of the homologous vaccine in the field (n = 354) were selected. To assess the extent of temporal structure in these data, a regression of root-to-tip genetic distances against date of sampling was performed using the Path-O-Gen program v.1.4 (http://tree.bio.ed.ac.uk/software/pathogen/) based on an input ML phylogenetic tree (see below). In all cases phylogenetic trees were inferred using the ML method available in PhyML (Guindon et al., 2010) and implemented in Geneious v.7.1.8 (Kearse et al., 2012), employing a combination of NNI and SPR branch swapping. Prior to phylogenetic analysis, all hypervariable and potentially poorly aligned regions were removed using Gblocks (Castresana, 2000). For this analysis, a less stringent procedure, allowing for gap positions within final blocks, was employed. In addition, the best-fit model of nucleotide substitution was inferred using JModeltest v.2.1.4 (Darriba et al., 2012). Accordingly, the General Time Reversible (GTR) model with a discrete gamma distribution (Γ) and allowing for invariant sites (I) was selected in all data analyses based on AICc. Nodal supports in the PhyML analyses were assessed using Shimodaira-Hasegawa (SH)-like branch supports (Anisimova and Gascuel, 2006; Guindon et al., 2010).
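The root-to-tip regression used above to assess temporal structure amounts to an ordinary least-squares fit of genetic distance against sampling date. The sketch below illustrates the calculation only and is not the Path-O-Gen implementation: it assumes that root-to-tip distances have already been read from a rooted ML tree, and the function name is hypothetical.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date.

    dates:     sampling dates as numbers (e.g. decimal years)
    distances: root-to-tip distances (substitutions per site), same order
    Returns (slope, intercept, r_squared). The slope is a crude estimate
    of the substitution rate, and -intercept / slope is the date at which
    the fitted line reaches zero distance.
    """
    n = len(dates)
    if n != len(distances) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared
```

A positive slope with a reasonably high correlation is usually taken as evidence of temporal signal; a weak or negative relationship suggests that the data are unsuitable for molecular clock dating.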
To further assess the robustness of the phylogenetic tree, additional analyses of the small S1 gene sequence data set were performed using the Bayesian approach within MrBayes v.3.2 (Huelsenbeck and Ronquist, 2001) , and the Neighbor-Joining method available in MEGA6 (Tamura et al., 2013) . In both these cases we employed the GTR + I + Γ substitution model, with nodal support values obtained by posterior probabilities and 1000 bootstrap replicates, respectively. Topological congruence between trees was compared through visual inspection for (i) ML trees obtained for the complete (n = 1286) and the small data sets (n = 199), (ii) the ML trees estimated for the full-length S1 sequences (∼1620 nt) and those corresponding to the HVRs1 and 2 (312 nt) and HVR3 (342 nt) regions, and (iii) the ML, NJ and Bayesian trees all run on the small data set. To assess the phylogenetic relationships among the IBV variants and develop a harmonized system to define and name viral lineages, we analyzed all full-length S1 gene IBV sequences available on GenBank. These data comprised 1652 nucleotide sequences obtained from field samples and IBV vaccine strains collected worldwide between 1937 and 2013. After quality control, we inferred a ML phylogenetic tree on a total of 1518 sequences ( Fig. S1 ) with the aim of obtaining a picture of the global genetic variability of this pathogen. The topology of the preliminary ML tree showed evidence for recombination among IBV lineages. In particular, although defined previously, the so-called QX, 793B and Italy02 genetic groups, here referred to as the GI-19, -13 and -21 lineages (see below), no longer appeared as monophyletic groups (Fig. S1 ). We therefore performed additional analyses to determine whether recombination has occurred within the S1 gene and how this may have impacted the tree topology. 
This revealed a total of 213 recombinant viruses, which were removed from the data set to enable a more robust phylogenetic inference and identification of major viral lineages. For the purposes of classification, we propose that such recombinant viruses are simply referred to as combinations of the 32 IBV lineages defined below. Recombination has clearly been of importance in shaping the evolution of some IBV variants. In particular, 143 viruses sampled in China (n = 107) and Korea (n = 36) since the 1990s were found to descend from parental strains belonging to the QX and HN08 (here referred to as lineage GI-22) genetic groups. This recombination involves, among others, viruses originally described as clustering into Chinese genotype III (Liu et al., 2006b), also known as the ck/CH/LSC/95I-type or tl/CH/LDT3/03I-type (Mo et al., 2013; Sun et al., 2011), and those previously assigned to the ck/CH/LHLJ/95I-type and BJ-type cluster. In addition, the Korean nephropathogenic strains already known to be recombinants and originally designated as New Cluster I (Lim et al., 2011, 2012) also fell into the group derived from the recombination between QX and HN08. Multiple recombinant break-points were detected within this group, with most located between nucleotides 550 and 652 and between nucleotides 934 and 1125 (according to the sequence AY561711). In 44 viruses we found evidence of inter-lineage recombination between the 793B and the QX or HN08 clades, thereby supporting previous observations. Notably, all of these sequences possessed break-points located between nucleotide positions 665 and 709 (according to the sequence AY561711). These strains were collected in China from 2004 to 2012 and some were originally grouped by phylogenetic analysis with the 793B or QX genetic groups (Ji et al., 2011). With the exception of a few viruses, the remaining recombinant sequences do not share any common break-points or parental strains and were a mosaic of diverse parental lineages.
However, taken together, these results reveal that the majority of recombination break-points are located in the intermediate region between the HVRs1 and 2 and the HVR3. Our phylogenetic analysis of 1286 IB strains (Fig. 1) was used to derive a new and coherent classification scheme for IBV based on the S1 gene. Not only is this the most variable region within the IBV genome, containing abundant phylogenetic information, but it is also the major immunogenic component and the most commonly sequenced region of the IBV genome. Accordingly, 32 IBV lineages, each of which was defined by strongly supported nodes (> 0.98 SH-like test support values), were identified using our expansive S1 gene phylogeny. The designation of "lineage" was arbitrarily assigned to monophyletic groups of at least three viruses sampled from at least two different outbreaks. Strains that do not cluster into any lineage according to these subjective criteria are labeled as unique variants (UV) in the phylogenetic tree (n = 26). The lineages further fall into 6 well-supported (i.e. SH-like test support values of 1.0) and more genetically divergent groups, herein termed "genotypes"; 27 lineages cluster into genotype I (GI), which includes the majority of the IBV strains, whereas the remaining 5 genotypes contain one lineage each. The IBV lineages defined in this manner exhibit uncorrected pairwise distances of 13% and 14% for nucleotide and amino acid sequences, respectively. Similarly, viral genotypes differed at 30% of nucleotides and 31% of amino acids. Importantly, however, because natural virus evolution is unlikely to always produce discrete boundaries, these distance values should only be considered as "rules of thumb" rather than universally valid parameters. Thus, IBV classification should not be undertaken on pairwise distance comparisons alone, but requires input from phylogenetic data.
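For readers unfamiliar with the uncorrected pairwise distance (p-distance) quoted above, the following minimal sketch shows the calculation for two aligned sequences, together with a mean distance between two groups of sequences. It is an illustration rather than the MEGA6 procedure: sites with a gap in either sequence are skipped (pairwise deletion), which is an assumption about gap handling that the text does not specify, and the function names are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of compared sites at which two aligned sequences differ.

    Sites where either sequence has a gap are ignored (pairwise deletion,
    an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def mean_between_group_distance(group_a, group_b):
    """Average p-distance over all pairs drawn from two groups."""
    distances = [p_distance(a, b) for a in group_a for b in group_b]
    return sum(distances) / len(distances)
```

For example, `p_distance("ACGT", "ACGA")` gives 0.25. The same function works for aligned amino acid sequences, since it only compares characters column by column.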
To avoid confusion, IBV lineages were labeled using the abbreviation of the genotype in which they fall, followed by a consecutive number assigned according to the temporal order of the collection date of the first virus detected per lineage, here referred to as the prototype strain. More details on the prototype viruses are provided in Table 1. The same temporal scheme was used to assign consecutive Roman numerals to the different genotypes. For those viruses collected in the same year and belonging to different lineages within GI we have followed the temporal order of their GenBank sequence submissions. Accordingly, they are labeled GI-1 to GI-27; the oldest IBV in the current study falls into lineage 1, whereas lineage 27 represents the most recently identified cluster within GI. Moreover, to simplify the possible future designation of additional genetic variants, we assigned the number '1' to all lineages outside genotype I, even though a second IBV lineage has not yet been detected in any of these five genotypes. Accordingly, they are labeled GII-1, GIII-1, GIV-1, GV-1 and GVI-1. Of note is that the lineage GI-24 consists of Indian IB viruses that so far have not been included in a scientific publication or for which a phylogenetic analysis has not yet been performed. The sequence details, where available, were added to the strain name in the format: GenBank accession number, strain name (as reported in the public database), country of origin and collection date. To further assess the reliability of our classification scheme and to better display the IBV phylogeny, we performed an additional ML phylogenetic analysis on a smaller, sub-sampled data set (n = 199), representative of IBV variability in the field (Fig. 2). These two trees had very consistent topologies; all lineage-defining branches are distinct from each other and strongly supported (> 0.97 SH-like support values).
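The numbering rule described above (genotype abbreviation followed by a consecutive number, ordered by the collection date of the prototype strain, with ties broken by GenBank submission order) can be expressed as a short sketch. The record layout and function name are hypothetical, the genotype is assumed to have been assigned already from the phylogeny, and submission dates are assumed to be ISO-formatted strings so that they sort chronologically.

```python
from collections import defaultdict


def label_lineages(prototypes):
    """Assign lineage labels such as 'GI-1' to prototype strains.

    prototypes: iterable of (strain, genotype, collection_year,
                submission_date) tuples, where genotype is a string such
                as 'GI' and submission_date is an ISO date string.
    Within each genotype, lineages are numbered by collection year of the
    prototype; ties are broken by submission date.
    Returns a dict mapping strain name to lineage label.
    """
    by_genotype = defaultdict(list)
    for strain, genotype, year, submitted in prototypes:
        by_genotype[genotype].append((year, submitted, strain))
    labels = {}
    for genotype, items in by_genotype.items():
        for number, (_, _, strain) in enumerate(sorted(items), start=1):
            labels[strain] = f"{genotype}-{number}"
    return labels
```

Under this rule a genotype with a single known lineage still receives the suffix '-1', which leaves room for later additions without renaming existing lineages.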
To confirm these findings, we analyzed the smaller data set using different phylogenetic methods. Importantly, equivalent branching patterns were obtained using both NJ and Bayesian methods (Supplementary materials Figs. S2 and S3). Accordingly, we suggest that this smaller data set is used as a reference tool for future epidemiological and evolutionary studies of IBV. The nucleotide sequences of the reference data set are provided as Supplementary materials (Table S1). We used a geography-based system (see below) to describe the 32 IBV lineages reported here. Because of their wide geographic distribution, some lineages are clearly of importance. Among these, lineages GI-1 and -13 (previously named as the Mass and 793B types, respectively) are commonly found, partly reflecting the use of vaccines derived from them in the countries where they have been reported. In contrast, other lineages are confined only to specific countries, many of which are limited to Asia and North America. Africa and South America possess unique lineages as well as some of the European-origin types. Notably, geographically distinct wild-type lineages were identified in Australia and New Zealand, likely reflecting their spatial isolation.
Table 1 Prototype strains and period of circulation of each lineage (data based on the complete S1 nucleotide sequences of the viruses included in the analysis).
The GI-1 lineage comprises the first IBV serotype identified and even today is one of the best known and most widely distributed genetic groups, likely due to the extensive use of a homologous vaccine derived from one of its strains. In our data set this group contains 189 viruses collected worldwide (with the exception of Oceania), which were previously assigned to the Massachusetts (also known as Mass or M41), the H120 and the Connecticut (Jungherr et al., 1956) types.
The Mass serotype, of which the M41 is the representative strain, is mainly associated with respiratory disease (Cavanagh and Naqi, 1997) . The GI-13 lineage is present in many parts of the world and in our study comprises 70 viruses, both vaccine and virulent field strains, previously assigned to the 793B type (also known as 4/91 and CR88) (Gough et al., 1992; Parsons et al., 1992; Picault et al., 1995) . Notably, the so-called Israeli variant 1 viruses are members of this lineage (Callison et al., 2001; Gelb et al., 2005) . The first known strain of CR88 serotype was isolated in France in 1985 (Picault et al., 1995) , whereas the 793B strain emerged in the United Kingdom in 1991 and was originally described as a unique serotype responsible for severe respiratory syndromes (Callison et al., 2001; Cook et al., 1996) . A retrospective study revealed a 96% sequence similarity between a strain isolated in Morocco in 1983, which is here referred to as the GI-13 lineage prototype strain, and the 793B variant, suggesting that this North African virus is the progenitor of the lineage . Recently, this genetic type was identified for the first time in Canada in outbreaks with predominantly respiratory disease and/or egg production problems (Martin et al., 2014) . The largest number of IBV strains included in the present investigation comes from the GI-19 lineage that contains 546 viruses collected between 1993 and 2012. The GI-19 variant, the so-called QXIBV strain, was first detected in China in 1996 where it was associated predominantly with severe nephritis, 'false layer' syndrome and potentially proventriculitis (Wang et al., 1998) . Since then, several QX-type strains have been identified in China, although most cases have been associated with renal pathology (Liu et al., 2006b) . 
In Europe, numerous reports described QX-like strains following the Chinese index case (Abro et al., 2011; Beato et al., 2005; de Wit et al., 2011b; Gough et al., 2008; Monne et al., 2008; Valastro et al., 2010; Worthington et al., 2008). At the same time, the first QX-like strains were identified in Japan (Ariyoshi et al., 2010; Mase et al., 2004) and Korea (Lee et al., 2008). Thereafter, the lineage was soon reported in such diverse localities as Russia, Africa and the Middle East (Amin et al., 2012; Bochkov et al., 2006; Toffan et al., 2011). Thus, all strains falling in the GI-19 lineage have been previously assigned to the QX clade, also called LX4 (Li et al., 2013; Liu et al., 2009) and A2 (Ji et al., 2011; Li et al., 2010). Confusingly, the same genetic group has also been referred to as the Korean-II (K-II) (Lee et al., 2008) and Japanese-III (JP-III) clusters (Ariyoshi et al., 2010; Mase et al., 2004). Of note, a recently submitted sequence (KC577395) shows that the lineage had arisen in China by 1993. The GI-16 lineage contains 19 viruses collected between 1986 and 2011 in China, Taiwan and Italy, previously classified as the Q1 (also known as T3 and J2) or ck/CH/LDL/97I type (Liu et al., 2006b; Yu et al., 2001). The designation of Korean III genotype (K-III) was also used to describe Korean strains clustering with Chinese viruses of LDL-like type. Notably, the classification into K-III was performed using phylogenetic analyses of partial S1 gene sequences (620-642 nt) (Lee et al., 2008), and to our knowledge no complete S1 nucleotide sequences of this genetic group are currently available. The GI-16 lineage has been associated with respiratory syndrome (Ababneh et al., 2012; Yu et al., 2001), severe drops in egg production (de Wit et al., 2012) and nephropathogenic disease (Huang et al., 2004; Toffan et al., 2013). Although it is known to have a more widespread geographic distribution, we only include sequences from three countries.
After the first isolation of the Q1 strain in China between 1996 and 1998 (Yu et al., 2001), the lineage was reported in Taiwan in 2002 (Huang et al., 2004), in South America since 2009 (Marandino et al., 2015; Sesti et al., 2014), in some Middle Eastern countries (Ababneh et al., 2012) and in Italy (Toffan et al., 2013) in 2011, and in Colombia in 2012 (Jackwood, 2012). Of note, our phylogenetic analysis reveals that the GI-16 prototype strain is an Italian virus (IZO28/86) isolated in 1986, approximately 10 years before the first identification of the Q1 strain in China. In addition to those of European and American origin, we found 6 different lineages to be geographically strictly confined to Asia, of which one constitutes a different genotype (GVI-1). Thus, two distinct genotypes have been present and are probably still circulating in this continent. Most GI-7 lineage viruses were associated with nephropathogenic diseases in infected chickens (Huang et al., 2004). The lineage was detected in Taiwan and China and comprises a total of 43 isolates; the majority were isolated after 1988, with the exception of the TP/64 strain, which was isolated in Taiwan in 1964 from layers showing respiratory problems and a drop in egg production (Huang et al., 2004). Due to high nucleotide sequence similarity (90%), we propose the existence of a single group comprising strains previously assigned to two different genetic groups referred to as Taiwan-I (TW-I) and Taiwan-II (TW-II) (Liu et al., 2003; Wang and Tsai, 1996). GI-15 consists of 11 respiratory strains collected exclusively in Korea between 1986 and 2008 and previously placed into the genotype named Korean I (K-I) (Hong et al., 2012; Lee et al., 2008, 2010; Song et al., 1998). The GI-18 lineage comprises 3 Japanese and 2 Chinese viruses collected between 1993 and 1999, and contains both respiratory and nephropathogenic strains (Mase et al., 2004; Shieh et al., 2004).
The lineage was designated as Japan I (JP-I) (Ariyoshi et al., 2010; Mase et al., 2004; Shieh et al., 2004), as it originally contained only Japanese wild-type field strains. However, it is clear that the lineage is no longer confined to Japan. The GI-22 lineage is the only Chinese indigenous genetic type identified here. Since its first detection, it has been of direct relevance to the poultry industry, reflecting its occurrence and widespread distribution in China, as well as its virulence. GI-22 includes 82 field viruses, mainly of nephropathogenic nature, collected from outbreaks in both broiler and layer flocks during 1997-2011. These local strains were initially assigned to the ck/CH/LSC/99I-type cluster following the inclusion of the Chinese IBV reference strain ck/CH/LSC/99I isolated in 1999 (Liu et al., 2006b, 2009; Mo et al., 2013; Sun et al., 2011), although the group is also known as HN08 (Ji et al., 2011; Li et al., 2013). Based on numerous epidemiological surveys conducted in China, the GI-22 lineage, along with GI-19, appears to be dominant in the country (Ji et al., 2011; Li et al., 2010, 2013; Liu et al., 2006b, 2009; Ma et al., 2012; Mo et al., 2013; Sun et al., 2011). The GI-24 lineage contains IB viruses indigenous to India and, to date, no publications describe these strains (i.e. they have only recently been reported as accession numbers), such that little epidemiological and clinical information is available. The lineage comprises 24 viruses collected during the period 1998-2013. Of these, 11 have been assigned to a genotype named NPR by the submitting authors, while 12 others seem to be of nephropathogenic nature (according to the data reported in GenBank), with no data reported for one strain. The only published report on the circulation of a local Indian variant is that of Bayry et al.
(2005), who described the emergence in India of a unique nephropathogenic IBV classified as a novel genotype (isolate PDRC/Pune/Ind/1/99, AY091551). A BLAST search revealed that PDRC/Pune/Ind/1/99 is 99% similar to the GI-24 prototype strain. However, there are currently insufficient data to determine whether this strain can be included in GI-24. As noted above, GVI-1 represents a genetically distinct lineage present in Asia. It comprises 13 isolates collected in China and Korea between 2007 and 2012, which were originally grouped in the 'Korean New Cluster II' (Lim et al., 2012), also designated as Chinese New-Type. The available data on the pathogenicity of these strains revealed them to be of respiratory nature (Li et al., 2010; Lim et al., 2012). Notably, viruses closely related to those included here were also sampled in Japan in 2009 and assigned to a group named JP-IV (Mase et al., 2010) by sequencing of the partial S1 gene (621 nt). To date, no complete JP-IV-like S1 sequences are available, so we cannot determine whether the so-called "JP-IV strains" are included in GVI-1 or whether they cluster into a separate lineage. A large number of lineages, falling into two well-separated genotypes (GI and GIV), have been reported as indigenous to North America. However, only some of these (GI-9, GI-27 and GIV-1) have been implicated in widespread disease dissemination and persistent virus infections (Jackwood, 2012). The GI-8 lineage includes one of the first IBV serotypes (SE-17) recognized to be different from the pre-existing IBV antigenic types. However, as this lineage was only detected for a brief period it is likely of limited importance. The variant was isolated in 1967 in Georgia from a chicken flock with acute respiratory distress and was designated as SE-17 (Hopkins, 1969). A retrospective study identified respiratory SE-17 IBVs to be present in the USA since 1965 (Mondal et al., 2013).
The GI-9 lineage contains vaccine and virulent field strains collected from 1973 to 2011, the majority from the USA (44/49). Herein, we report IBVs previously known to be of the Arkansas (Ark) and Ark DPI-like types and strains referred to as the California 99 type, first detected in California in 1999 (Martin et al., 2001; Mondal and Cardona, 2004). These viruses are the causative agents of respiratory syndromes, observed in the field as well as under experimental conditions (Fields, 1973; Johnson et al., 1973; Martin et al., 2001; Mondal and Cardona, 2007). A total of 12 viruses sampled in Pennsylvania, California and Alabama from 1988 to 1999 fall within the GI-17 lineage. This includes strains associated with respiratory distress and renal pathologies, with one also implicated in reproductive pathology (Moore et al., 1998; Ziegler et al., 2002). These strains were previously designated as California variants (CAV) (Hein et al., 1989; Moore et al., 1998). Among these, two viruses isolated in the late 1990s in Pennsylvania, PA/Wolgemuth/98 and PA/171/99, were classified as two unique genotypes, genetically similar to but antigenically distinct from the CA/Machado/88 reference prototype strain (Ziegler et al., 2002). Although we only identified two strains as members of the lineage GI-20, it has been included in our classification because of its epidemiological relevance in Canada. The lineage has never been described outside of Eastern Canada, yet appeared to be the most common lineage circulating in the country between 2000 and 2013 (Martin et al., 2014). The Qu_mv prototype variant (AF349621) was isolated in Québec in 1996 from commercial broiler flocks displaying respiratory signs of disease (Ojkic and Binnington, 2002; Smati et al., 2002). Since then, its prevalence in the region has risen, and it has also spread to Nova Scotia and Ontario.
Finally, we group 26 American indigenous viruses collected between 2004 and 2013 and associated with respiratory infection into the GI-25 (n = 9) and GI-27 (n = 17) lineages, which were previously designated as GA07 and GA08, respectively (Jackwood et al., 2007; Kulkarni and Resurreccion, 2010). GI-25 includes, among others, the prototype CA/1737/04 strain (Jackwood et al., 2007) along with the DMV/5642/06 (Wood et al., 2009) and GA/60,173/07 (Jackwood et al., 2007) variants. The GI-27 lineage contains the most recent IBV defining a lineage, being first identified in 2007. The variant, which became the predominant virus type at that time, was reported to be a novel genotype and designated as GA08 (Jackwood et al., 2010; Kulkarni and Resurreccion, 2010). Another cluster restricted to the USA is lineage 1 of GIV, which is also the only North American lineage belonging to a different genotype. This group contains both vaccine and field strains (n = 24) isolated between 1992 and 2003. Among these is the variant referred to as the Delaware variant (DE or DE072), isolated in 1992 from commercial broiler chicks during severe respiratory disease and designated as a novel genotype and serotype compared to the others (Gelb et al., 1997; Mondal et al., 2001). In the same lineage are IBV strains previously designated as GA98 and described as closely related to the DE variant, although of a different serotype. It has been suggested that the GA98 variant arose from immune selection caused by the DE072 attenuated live vaccine introduced in the country in 1993. In addition, viruses recovered in 2000 from layer flocks experiencing a reduction in egg production also fell into this lineage (Mondal et al., 2001). The GI-2, GI-3 and GI-4 lineages were first described in the USA between the 1950s and the 1960s and detected in Asia many years later.
Notably, however, GI-2 and GI-4 were reported in the USA only during the 1950s-1960s and never again, while GI-3 was also reported in North America in the late 1990s (Gelb et al., 2001) before being identified in Taiwan in 2006. Hence, old lineages may be sporadically re-detected. A total of 5 viruses cluster within the GI-2 lineage, which, among others, includes the serotypes known as Holte and Iowa 97 (Albassam et al., 1986; Hofstad, 1958) and two viruses sampled in China between 2004 and 2006 (Bing et al., 2007). The GI-4 lineage consists of 1 nephropathogenic strain, isolated in the USA in 1962 (Winterfield and Hitchner, 1962), whose S1 gene was entirely sequenced in 1994 (accession number L18988; Wang et al., 1994), and two additional viruses collected in China for which no published data are currently available. Of note, the same IB strain name has been used to identify two viruses that are genetically distant from each other and belong to two different lineages. Hence, both the GI-2 and GI-4 lineages include a virus called 'Holte' as prototype strain. The GI-3 lineage contains 7 viruses, comprising both respiratory and nephropathogenic strains. It was originally designated as the JMK or the Gray serotype after the respective reference strains (accession numbers L14070 and L14069, respectively). Although these two viruses are antigenically very similar (Cowen and Hitchner, 1975), their pathogenicity is different because the Gray variant can be nephropathogenic while the JMK virus is strictly respirotropic (Kwon and Jackwood, 1995; Thor et al., 2011; Winterfield et al., 1964; Winterfield and Hitchner, 1962). The GI-11 lineage is unique to South America and comprises a total of 13 Brazilian viruses collected between 1975 and 2009. However, novel IBV sequences, which were obtained from field samples from Argentina and Uruguay, have recently been submitted to GenBank (Marandino et al., 2015).
By phylogenetic analysis of the complete S1 coding region, the authors included these strains in a genotype referred to as South America I (SAI), which also contains the GI-11 Brazilian viruses.
Fig. 2. Phylogenetic tree of complete S1 nucleotide sequences. The phylogeny contains a total of 199 IBV strains, including 6 representative sequences of each lineage detected and 26 strains recognized as unique variants. Each lineage is color-coded and its corresponding designation is reported. Bars reporting the genotypes in which the lineages fall are shown. GenBank accession number, isolate number or name, country of origin and collection date are given for each strain. The designation "UV" indicates unique variants, here marked in black. A complete list of the 199 sequences used is provided in Table S1. SH-like branch supports are shown for key nodes. The scale bar represents the number of nucleotide substitutions per site, and the tree is mid-point rooted for clarity only.
A previous nomenclature based on partial S1 nucleotide sequences of local Brazilian field variants has also been adopted (Balestrin et al., 2014; Chacón et al., 2011; Fraga et al., 2013; Villarreal et al., 2010), and referred to as the Brazil (Villarreal et al., 2010) or BR-I (Chacón et al., 2011) genotypes. The partial Brazilian sequences show a high degree of nucleotide similarity with those of , such that it is unclear whether they represent the same genetic type. The GI-11 lineage has been associated with a variety of clinical conditions, ranging from respiratory disease, infertility, drop in egg production and egg quality (Chacón et al., 2008, 2011; Montassier, 2010; Villarreal et al., 2007a) to enteric disorders (Villarreal et al., 2007b, 2010). It was recently demonstrated that the Brazilian variant causes predominantly respiratory and kidney diseases under experimental conditions (Chacón et al., 2014; de Wit et al., 2015).
Interestingly, our phylogenetic analysis demonstrates that the indigenous GI-11 lineage has been circulating in the country since 1975, supporting the hypothesis of Montassier (2010) that this variant had already been present in the field since at least as early as 1988. Two distinct lineages that fall in two different genotypes, GI-21 and GII-1, were identified as unique to Europe. Notably, one of these has also been reported in Russia (Bochkov et al., 2006) and recently in Morocco (Fellahi et al., 2015). Within the GI-21 lineage we group 14 viruses sampled between 1997 and 2005 in Italy, the United Kingdom and Spain. The IB viral type of the lineage was originally isolated in Italy in 1999 and designated Italy02 (Bochkov et al., 2007). Thereafter, it was reported to be one of the most predominant genotypes in Spain (Dolz et al., 2009) and the third most frequent in Western Europe over . This variant has mainly been detected in broiler flocks that experienced respiratory signs, as well as in adult birds, broiler breeders and layers, associated with a drop in egg production (Worthington et al., 2004). It also appeared to induce renal disease in young chickens (Dolz et al., 2012). Although strains in this lineage are related to one of the major and widespread European wild types, only a limited number of complete S1 nucleotide sequences are available for analysis. The GII-1 lineage is the only group of European viruses that falls in a different genotype from all the other viruses, which are classified here as GI. The lineage comprises only the Dutch isolates D1466 and V1397, which show a large evolutionary distance from the remaining IBV genotypes. The D1466 variant (also called D212) was detected for the first time in The Netherlands in the late 1970s, when it was recognized to have antigenic and molecular properties significantly different from known IBV strains (Adzhar et al., 1995; Davelaar et al., 1984; Kusters et al., 1987, 1989).
Historically, D1466 has never been responsible for major disease in flocks and hence may be of relatively low pathogenicity. However, an increase in virulence of this variant was recently observed. In particular, poor egg production in both layers and broiler breeders was reported between 2005 and 2006 in some countries of Western Europe and more recently in Poland (Domanska-Blicharz et al., 2012). The GI-26 lineage represents a unique African cluster of viruses that were identified relatively recently. It contains 32 viruses isolated in Nigeria and Niger between 2006 and 2007, for which no obvious clinical signs were recorded. These local strains were previously grouped into a novel IBV genotype designated as IBADAN, referring to the name of the city (in Nigeria) where the variant was first detected, and were described as genetically and antigenically clearly distinct from all other known IBV strains (Ducatez et al., 2009). Two IBV lineages, GI-12 and GI-14, were found in some European countries as well as in Nigeria. Both fall into GI and were also reported in Russia (Bochkov et al., 2006). Strains previously classified as D207-like, D274-like or UK/6/82-like types fall into the GI-12 lineage. Here, we report 3 Dutch and 3 British strains isolated during 1978-1986 from broilers experiencing respiratory infection and from breeding flocks showing aberrant egg production (Cavanagh et al., 1992a; Cook and Huggins, 1986; Cook, 1983, 1984; Davelaar et al., 1984). In addition, 1 field strain from Russia and 2 from Nigeria (Ducatez et al., 2009), collected in 2002 and 2006, are included in this lineage. Although the circulation of this variant is well documented (Bochkov et al., 2006; Cavanagh et al., 1992a, 1999; Cook, 1984; Davelaar et al., 1984; Meulemans et al., 2001; Monne et al., 2009; Valastro et al., 2014; Worthington et al., 2008), only a relatively small number of D274-like sequences are available for analysis.
A GI-12-like strain was also identified in Egypt in 1989 (Abdel-Moneim et al., 2006), although the status of this virus is ambiguous as only a partial S1 sequence (722 nt) is currently available. The GI-14 lineage comprises only two viruses collected in Belgium (B1648) (Meulemans et al., 1987) and Nigeria (NGA/324/2006) (Ducatez et al., 2009), although it merits classification due to its epidemiological relevance and pathogenicity. After its first identification in Belgium in 1984 (Meulemans et al., 1987), the variant was again reported in the country in 1993 (Meulemans et al., 2001) and later in Italy (Capua et al., 1999), Russia (Bochkov et al., 2006) and Slovenia (Krapez et al., 2011). No other complete S1 gene sequences are available. The viruses related to this variant were previously referred to as the B1648-like type and reported to be mostly nephropathogenic (Meulemans et al., 1987; Capua et al., 1999) and also associated with egg production problems (Capua et al., 1999). The variant was rarely detected in France and Germany between 2002 and 2006, and did not appear to be causing relevant illness in poultry flocks. The GI-23 lineage represents the unique wild-type cluster geographically confined to the Middle East. Strains belonging to this lineage have been detected since 1998 in Israel and are still circulating in the area (Ganapathy et al., 2015; Najafi et al., 2015). Some have become dominant in the majority of farms and are involved in respiratory and renal pathologies (El-Mahdy et al., 2012; Meir et al., 2004). However, the complete S1 sequence is only available for a limited number of viruses (n = 9). Some authors have previously designated these strains as Israeli Variant 2 to distinguish them from those clustering within Israeli Variant 1 (Abdel-Moneim et al., 2002; Callison et al., 2001; Mahmood et al., 2011; Meir et al., 2004).
Alternatively, studies performed on the Egyptian isolates divided them into different genotypes on the basis of their HVR3 sequences; they were defined as Egyptian Variant 1, having as reference the strain Egypt/Beni-Suef/01 (Abdel-Moneim et al., 2002), and Egyptian Variant 2, which includes the viruses ck/Eg/BSU-2/2011 and ck/Eg/BSU-3/2011 (Abdel-Moneim et al., 2012). To date, no complete nucleotide sequences are available for the three Egyptian strains.
Fig. 3. Phylogenetic tree of partial S1 nucleotide sequences including HVRs1 and 2. The phylogeny contains a total of 199 IBV strains, including 6 representative sequences of each lineage detected and 26 strains recognized as unique variants. All strains belonging to the same lineage, assessed on the basis of the complete full-length sequences, are labeled with a unique color code as in Figs. 1 and 2. The color-coded boxes reporting the lineage designations are only shown for those lineages correctly identified. GenBank accession number, isolate number or name, country of origin and collection date are given for each strain. The designation "UV" indicates unique variants, here marked in black. SH-like branch supports are shown for key nodes. The scale bar represents the number of nucleotide substitutions per site, and the tree is mid-point rooted for clarity only.
Likely due to their geographical isolation, Australia and New Zealand possess only unique indigenous variants. We found 5 distinct IBV lineages in these localities, 3 falling into GI (GI-5, -6 and -10) and 2 possessing large evolutionary distances between each other and compared to those found elsewhere. Hence, we classified the latter two into distinct genotypes, designated as GIII-1 and GV-1. The GI-5 and GI-6 lineages contain both vaccine and field strains (13 and 17 viruses, respectively), mostly sampled in Australia.
The only Chinese sequences included here are 4 field viruses (1 in GI-5 and 3 in GI-6) that presumably represent re-isolations of the vaccine strains JAAS and J9, which originated in Australia and were used in China to control IBV (Liu et al., 2006a). Hence, both these lineages may be geographically confined to Oceania. In addition, one strain sampled in New Zealand falls in GI-6. Among the strains clustering in GI-5 are the Armidale vaccine strain and the nephropathogenic N1/62, also known as the T strain. Within the GI-6 lineage is the VicS/62 strain that was introduced as a vaccine into Australia in 1966 (Cumming, 1969). The strains within GI-5 and GI-6 were originally grouped as Australian subgroup I (Ignjatovic et al., 2006), which includes both respiratory and nephropathogenic strains (Sapats et al., 1996). The GI-10 lineage contains 6 New Zealand indigenous viruses; 3 were collected in the 1970s and the remainder in the 2000s (McFarlane and Verma, 2008). This IBV variant was first reported in the country in 1967 (Pohl, 1967), and ten years later 4 different strains designated as A, B, C and D were identified using virus neutralization tests (Lohr, 1976, 1977). Finally, both the lineages falling into GIII and GV contain respiratory and indigenous Australian pathogens (4 and 7 strains, respectively). The GIII-1 lineage was first identified in 1988 (Ignjatovic and McWaters, 1991) and designated as Australian subgroup II (Sapats et al., 1996), whereas the GV-1 lineage was described approximately 14 years later and referred to as Australian subgroup III (Ignjatovic et al., 2006). Both appear to be genetically and antigenically different from the classical strains, here grouped into GI-5 and GI-6 (Ignjatovic et al., 1997; Mardani et al., 2010; Sapats et al., 1996).
Since partial S1 gene sequences are often used to classify IBV strains, we inferred two additional ML phylogenetic trees based on HVRs1 and 2 (312 nt) and HVR3 (342 nt) of the reference subsampled data set (n = 199). Strikingly, important topological inconsistencies were observed between the HVRs1 and 2 phylogeny and that inferred using the complete S1 gene. Specifically, although GIII, GV and GVI exhibit large evolutionary distances compared to the remaining lineages, they cluster within GI, in marked contrast to what is seen in the complete S1 tree, while 4 lineages (GI-7, -14, -23 and -27) do not form monophyletic groups (Fig. 3). In addition, while most groups were strongly supported in the SH test (> 0.96 SH-like), others were more weakly supported, such as GI-15, which only received 0.60 support, and GI-6 and GI-9, both of which received 0.80 SH-like support. Genetic typing based on HVR3 was similarly inconsistent with that obtained from the whole S1 gene (Fig. 4). In particular, 8 lineages are no longer monophyletic. Overall, these results indicate that both the genotypes and lineages identified using the HVRs are not representative of those obtained from the phylogenetic analysis of the whole S1 gene, so that only the latter should be used in IBV genetic classification. Finally, to determine whether there was sufficient temporal structure for molecular clock dating, we fitted a linear regression of root-to-tip genetic distance from the ML tree against the date (year) of collection for 372 randomly selected sequences from the entire data set. This revealed a weakly negative relationship between genetic distance and time (R-squared = −0.003; correlation coefficient = −0.181 under the best-fitting root). Such a clear lack of temporal structure means that molecular clock dating schemes based on 'tip dating' alone cannot proceed.
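The root-to-tip regression described above can be sketched in a few lines. This is a minimal illustration, not the software used in the study: the function name `root_to_tip_regression` and the toy dates and distances are hypothetical, and the sketch fits a single, fixed rooting rather than searching for the best-fitting root. For an ordinary least-squares fit with an intercept, R-squared equals the square of the correlation coefficient, which is consistent with the values reported for the GI-19 lineage (0.399 squared is approximately 0.159).

```python
from math import sqrt


def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling date.

    Returns the slope (substitutions/site/year for this rooting), the
    intercept, Pearson's correlation coefficient r and R-squared (r * r).
    """
    n = len(dates)
    if n != len(distances) or n < 3:
        raise ValueError("need at least 3 paired (date, distance) values")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and distances must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return {"slope": slope, "intercept": intercept, "r": r, "r_squared": r * r}


# Hypothetical toy data: collection years and root-to-tip distances
# (substitutions per site) read from a rooted tree. Not data from the study.
years = [1990, 1995, 2000, 2005, 2010]
dists = [0.10, 0.12, 0.13, 0.16, 0.17]
fit = root_to_tip_regression(years, dists)
print(fit)
```

A slope near zero or below zero, together with a small R-squared, is the pattern interpreted in the text as insufficient temporal structure for tip dating.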
An equivalent root-to-tip regression using the GI-19 lineage alone, which includes samples collected from 1993 to 2010 (n = 354), was conducted to determine whether this was also true of more closely related sequences. Similarly, the analysis revealed only weak temporal structure (R-squared = 0.159; correlation coefficient = 0.399). Advances in molecular biology and bioinformatics analyses have impacted virus classification at all taxonomic levels. The International Committee on Taxonomy of Viruses (ICTV) has no guidelines for the classification of viruses below the species level. However, classification systems have been developed and widely used for a variety of avian pathogens, including avian influenza (AI) (WHO/OIE/FAO H5N1 Evolution working group) and Newcastle disease (ND) viruses (Aldous et al., 2003; de Almeida et al., 2013), within which distinct "lineages" have been established through phylogenetic analysis and sequence similarities. Herein, we propose a similar framework for IBV. To date, no genetic characterization of IBV has included sequences from all the existing viral variants or adopted a unified system for naming the groups, such that no consensus on IBV classification has been reached. Indeed, the diversity of IBV genetic clustering and naming available at present is highly confusing. Hence, we have attempted to construct a comprehensive phylogenetic history of this virus and from this to derive a rational and harmonious scheme for the classification of IBV that we suggest should be used for future epidemiological and evolutionary studies. We have focused on the complete nucleotide sequence of the S1 gene as the basis for IBV lineage assignment. Not only is it the most variable region within the IBV genome, containing abundant phylogenetic information, but it also encodes the major immunological determinants (Jackwood and de Wit, 2013) and is used by many laboratories studying IBV.
Hence, phylogenetic analysis of the S1 gene and an S1-based viral classification might provide data of direct epidemiological relevance for controlling IBV spread, particularly as field and vaccine strains share a high degree of S1 sequence identity. Importantly, our classification was exclusively based on the topology of the phylogenetic tree, with strong statistical (SH-like) support values at each node defining monophyletic groups. Hence, IBV strain clustering was evaluated by a robust (maximum likelihood) phylogenetic method that is able to efficiently handle a large number of sequences, combined with an efficient statistic, the SH-like test, that can rapidly estimate the support for individual groupings on the tree. That very similar tree topologies were estimated using different phylogenetic techniques not only suggests that they are robust, but also that faster phylogenetic methods can be used if necessary. A more challenging issue is recombination, which undoubtedly has major implications for virus classification (Simmonds, 2015).
Fig. 4. Phylogenetic tree of partial S1 nucleotide sequences including HVR3. The phylogeny contains a total of 199 IBV strains, including 6 representative sequences of each lineage detected and 26 strains recognized as unique variants. All strains belonging to the same lineage, assessed on the basis of the complete full-length sequences, are labeled with a unique color code as in Figs. 1 and 2. The color-coded boxes reporting the lineage designations are shown only for those lineages correctly identified. GenBank accession number, isolate number or name, country of origin and collection date are given for each strain. The designation "UV" indicates unique variants, here marked in black. SH-like branch supports are shown for key nodes. The scale bar represents the number of nucleotide substitutions per site, and the tree is mid-point rooted for clarity only.
Importantly, however, viral phylogenies based on a single gene (as here) have previously been used to establish viable classification schemes. Notable examples include members of the genus Enterovirus (Mirand et al., 2006; Oberste et al., 1999), pestiviruses such as BVDV-1 (Deng et al., 2012; Vilcek et al., 2001) and BVDV-2 (Flores et al., 2002; Jenckel et al., 2014; Weber et al., 2015), circoviruses such as PCV2 (Franzo et al., 2015; Grau-Roma et al., 2008; Segalés et al., 2008), and lentiviruses such as FIV (Marçola et al., 2013; Sodora et al., 1994). In the case of IBV we propose that an effective classification scheme, particularly the designation of lineages and genotypes, should be based on clearly identifiable genetic groups (i.e. with recombinants removed) as these represent a robust phylogenetic backbone. A similar approach has been undertaken for PCV2 (Franzo et al., 2015). Hence, we contend that this is the most coherent and practical approach to virus classification in the face of recombination, particularly as it is impractical to integrate multiple incongruent phylogenies and simplistic to think that such complex evolutionary histories will produce more rational classifications. Rather than being defined as unique variants in their own right, recombinants can then be referred to as combinations of these distinct lineages and genotypes, analogous to the definition of 'circulating recombinant forms' among HIV subtypes. However, it is evident that more experimental studies are needed to assess how recombination might impact viral fitness. In this respect, the relatively high number of recombinant viruses in our data (n = 213) is in part due to the presence of strains showing an identical recombinant structure and possessing a strong epidemiological link with each other. Hence, these should not be regarded as the result of independent recombination events.
As well as providing the first complete picture of IBV biodiversity, by determining the phylogenetic relationships between all described genetic groups we have provided a well-defined evolutionary history of IBV, which in turn results in a clear definition of viral genotypes and lineages. Accordingly, a total of 6 genotypes (GI-GVI) and 32 lineages were identified, with other potential groups present as unique variants (UVs), which may become established lineages should future viruses be sequenced. Some well-established lineages such as GI-1 and -13 have a broad geographic distribution, which is presumably associated with the use of vaccines derived from them. Therefore, the majority of the IBV strains included in these lineages might be vaccine and vaccine-like strains. The first vaccine to control the disease was developed in the USA in the 1950s using the van Roeckel M-41 strain (van Roeckel et al., 1942), which represents the parent strain of most of the Mass type vaccines used there. By the early 1960s IB had been diagnosed in The Netherlands, leading to the development of a Mass-based vaccine known as the H strain (Bijlenga et al., 2004). The resulting vaccines, H120 and H52, soon became widely used. Today, the Mass and H120 strains of the lineage GI-1 continue to be the most commonly administered attenuated live vaccines. In contrast to the GI-1 vaccine strains, the 793B-like vaccines (GI-13), which were developed in Europe in the 1990s and used in many countries, have never been administered in North America. To date, the GI-13 lineage has not been detected in the USA, Oceania and many African and Latin American countries. The S1 gene phylogeny was also characterized by strong geographic structure, such that IBV strains are often clustered by place of sampling. In particular, with the exception of strains of the GI-1 lineage, IBVs in Europe differ from those found in the USA or Australia, and each geographic group can be distinguished at the phylogenetic scale.
That most of the strains in the GI-9 lineage come from the USA might suggest that the pathogenic Ark variant is geographically confined to that country. However, there are unpublished reports recording the circulation of Ark-like strains in South America (Jackwood, 2012; Marandino et al., 2015). The Ark virus is one of the most commonly reported types able to cause widespread disease in the USA, against which an attenuated vaccine was developed. When it first emerged in Arkansas in 1973, it was described as genetically distinct from all the known IBV serotypes recognized at that time and was referred to as Ark99 (Fields, 1973; Johnson et al., 1973). During the 1980s, an attenuated vaccine derived from an Ark-type virus isolated in the Delmarva Peninsula (Ark DPI strain) (Gelb et al., 1981, 1983) was extensively used in the USA and remains one of the most common vaccines administered to flocks in this country and also in the United Kingdom. In this respect, a previous epidemiological survey reported the identification of GI-9-like strains in Western Europe only in flocks that had received the commercial bivalent IBMM + Ark vaccine. However, no European IBV sequences similar to the GI-9 lineage are available in the public database. A similar situation arises with the Chinese IBVs in GI-9 (n = 5). Among these, the Jilin strain (AY839144), which was previously reported to be 100% identical to the Ark DPI strain (Ammayappan et al., 2008), is currently used as a vaccine in China (Liu et al., 2006a). This suggests that the Chinese IBVs present in this lineage most likely represent re-isolations of the vaccine strain and not of the Ark field type. Although the widespread circulation of some specific lineages is probably attributable to the use of vaccination programs based on strains derived from these, this is likely not always the case.
In particular, the spread of the nephropathogenic QX-like variant of the GI-19 lineage occurred long before its homologous vaccine was administered in the field. This Chinese lineage has generated considerable attention due to its ability to become endemic, causing major economic losses in the poultry industry worldwide, with the exception of the Americas and Oceania, where it has never been detected. The origin of this lineage and the factors responsible for its distinctive distribution remain unclear (Bochkov et al., 2006; Gough et al., 2008). A role for wild birds has been hypothesized based on evidence that IBV may replicate in Anseriformes (Bochkov et al., 2006; Cavanagh, 2005). Importantly, the present study seems to counter the common assumption that the GI-16 lineage arose in China in 1996. In particular, our analysis provides evidence that the Italian IZO28/86 strain, isolated in 1986, belongs to GI-16 such that it constitutes the lineage prototype strain. This nephropathogenic virus was originally sampled in Italy about 10 years before the first identification of the Q1 strain in China, and its sequence has only recently been submitted to the public database. Additionally, the IZO28/86 sequence is closely related to strain 624/I (JQ901492), suggesting that they belong to the same lineage, which also includes the Q1-like strains. However, they have previously been classified as distinct genotypes. The 624/I virus was first reported as a novel variant in Italy in 1993 (Capua et al., 1994) during an outbreak of severe respiratory disease, although only a 350 nt region of S1 was sequenced (Capua et al., 1999). Thereafter, it has been sporadically detected in Italy, in Russia (Bochkov et al., 2006) and in Slovenia (Krapez et al., 2011). More recently, a longer 624/I nucleotide sequence (1043 nt in length) has been released (JQ901492), which clusters into the same monophyletic group.
Based on these observations, it is plausible that both variants belong to GI-16, unless recombination has occurred in the C-terminal portion of the 624/I S1 sequence. In the last two decades, a combination of phylogenetic clustering and patterns of sequence similarity in the S1 gene has conventionally been used to group IBV isolates into genetic clades, although a confusing variety of such clustering schemes currently exists. For instance, IBVs have been referred to as novel variants when their S1 nucleotide sequences are at least ≤75% dissimilar from that of any other IBV type (Kingham et al., 2000). However, because of rate variation between sequences, reflected here in the lack of temporal structure in the data, distance-based classification methods are susceptible to error. In particular, elevated evolutionary rates leading to individual clusters may result in high genetic distances between sister taxa even though they are closely related. Thus, we suggest that phylogenetic relationships are a more appropriate measure of evolutionary history, and hence a better basis for a rational classification, than pairwise comparison of sequences. In addition, it is unrealistic to think that nature will create discrete groups of sequences that can consistently be recovered using genetic distances. Most phylogenetic analyses of IBV have been based on the three hypervariable regions (HVRs) of the S1 gene. Some investigators have reported that genetic typing based on HVR1 of the S1 gene is inconsistent with the groupings based on the whole S1 gene (Li et al., 2012; Mo et al., 2013; Schikora et al., 2003), although others disagree (Wang and Huang, 2000). We clearly show here that the hypervariable fragments (HVRs1 and 2 and HVR3) do not consistently produce clusters that are equivalent to those found through phylogenetic analysis of the complete S1 gene.
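The comparison between partial-region and complete-gene clusters rests on a monophyly test: a lineage is recovered in a tree only if its members make up one complete clade. The sketch below illustrates that test under stated assumptions. The nested-tuple tree encoding, the function names and the taxon labels (A1, B1, and so on) are hypothetical, both toy topologies are invented, and the trees are treated as rooted. It is not the procedure or software used in the study.

```python
def leaves(node):
    """Return the set of tip labels below a node.

    A tip is a string; an internal node is a tuple of child nodes.
    """
    if isinstance(node, str):
        return frozenset([node])
    result = frozenset()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, members):
    """True if some node of the rooted tree has exactly `members` as its tips."""
    members = frozenset(members)
    if not members:
        raise ValueError("members must not be empty")
    stack = [tree]
    while stack:
        node = stack.pop()
        if leaves(node) == members:
            return True
        if not isinstance(node, str):
            stack.extend(node)
    return False


# Hypothetical toy topologies: the same six taxa in a "complete gene" tree
# and in a "partial region" tree where one lineage is split apart.
complete_tree = ((("A1", "A2"), "A3"), (("B1", "B2"), "C1"))
partial_tree = ((("A1", "A2"), "B1"), (("A3", "B2"), "C1"))
lineage_a = {"A1", "A2", "A3"}

print(is_monophyletic(complete_tree, lineage_a))  # True
print(is_monophyletic(partial_tree, lineage_a))   # False
```

In practice a check of this kind would be paired with a branch-support threshold, since a clade can be monophyletic in a single tree and still be poorly supported.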
Therefore, the risk of misclassification decreases when a larger portion of the S1 gene is used, and sequencing only one of these regions may provide insufficient phylogenetic resolution. Hence, we strongly recommend that a phylogeny based on the complete S1 gene sequence be used for future designations of novel IBV lineages or genotypes. Given the rapid evolution of IBV and the use of mass vaccination strategies to control the disease worldwide, additional genetic variants will likely be discovered in the future. Because heterogeneous genetic group designations that are inconsistent with phylogenetic classification have been widely used to the present day, it is essential to employ a practical standard nomenclature and a well-supported system to identify these novel variants. Herein, we propose a simple and repeatable S1 phylogeny-based classification system combined with an unambiguous lineage nomenclature for the future assignment of IBV strains. Under the criteria proposed here, at least three complete S1 sequences from viral samples collected in at least two different outbreaks should be available before a new viral lineage is identified, and genotypes and lineages should be referred to according to the current numerical system. In addition, and in a similar manner to the convention in AIV, we encourage the use of a uniform and informative system for naming IBV isolates, which should include at least the name of the strain, the country of origin and the date of collection (Cavanagh, 2001). Clearly, the adoption of an internationally accepted nomenclature and a common system to coherently designate viruses is central to efficient communication on the evolution and emergence of epidemiologically important IBV variants.
Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.meegid.2016.02.015.
Presence of infectious bronchitis virus strain CK/CH/LDL/97I in the Middle East.
ISRN Vet Isolation and identification of Egypt/Beni-Suef/01 a novel genotype of infectious bronchitis virus S1 gene sequence analysis of a nephropathogenic strain of avian infectious bronchitis virus in Egypt Emergence of a novel genotype of avian infectious bronchitis virus in Egypt Emergence of novel strains of avian infectious bronchitis virus in Sweden Avian infectious bronchitis virus: differences between 793/B and other strains Comparison of the nephropathogenicity of four strains of infectious bronchitis virus A molecular epidemiological study of avian paramyxovirus type 1 (Newcastle disease virus) isolates by phylogenetic analysis of a partial nucleotide sequence of the fusion protein gene Circulation of QX-like infectious bronchitis virus in the Middle East Complete genomic sequence analysis of infectious bronchitis virus Ark DPI strain and its evolution by recombination Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative Classification of IBV S1 genotypes by direct reverse transcriptase-polymerase chain reaction (RT-PCR) and relationship between serotypes and genotypes of strains isolated between 1998 and 2008 in Japan Infectious bronchitis virus in different avian physiological systems-a field study in Brazilian poultry flocks Emergence of a nephropathogenic avian infectious bronchitis virus with a novel genotype in India Evidence of circulation of a Chinese strain of infectious bronchitis virus (QXIBV) in Italy Development and use of the H strain of avian infectious bronchitis virus from The Netherlands as a vaccine: a review Different genotypes of nephropathogenic infectious bronchitis viruses co-circulating in chicken population in China Molecular epizootiology of avian infectious bronchitis in Russia Phylogenetic analysis of partial S1 and N gene sequences of infectious bronchitis virus isolates from Italy revealed genetic diversity and recombination Completion of the sequence of the genome of the coronavirus 
avian infectious bronchitis virus Molecular characterization of infectious bronchitis virus isolates foreign to the United States and comparison with United States isolates A 'novel' infectious bronchitis strain infecting broiler chickens in Italy Co-circulation of four types of infectious bronchitis virus (793/B, 624/I, B1648 and Massachusetts) Gene 5 of the avian coronavirus infectious bronchitis virus is not essential for replication Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis A nomenclature for avian coronavirus isolates and the question of species status Coronaviruses in poultry and other birds Coronavirus avian infectious bronchitis virus Infectious bronchitis Infectious bronchitis Infectious bronchitis Coronavirus IBV: virus retaining spike glycopolypeptide S2 but not S1 is unable to induce virusneutralizing or haemagglutination-inhibiting antibody, or induce chicken tracheal protection Coronavirus IBV: partial amino terminal sequencing of spike polypeptide S2 identifies the sequence Arg-Arg-Phe-Arg-Arg at the cleavage site of the spike precursor propolypeptide of IBV strains Beaudette and M41 Amino acids within hypervariable region 1 of avian coronavirus IBV (Massachusetts serotype) spike glycoprotein are associated with neutralization epitopes Location of the amino acid differences in the S1 spike glycoprotein subunit of closely related serotypes of infectious bronchitis virus Infectious bronchitis virus: evidence for recombination within the Massachusetts serotype Longitudinal field studies of infectious bronchitis virus and avian pneumovirus in broilers using type-specific polymerase chain reactions Variation in the spike protein of the 793/B type of infectious bronchitis virus, in the field and during alternate passage in chickens and embryonated eggs S1 gene fragment amplification of Infectious bronchitis virus variant by RT-PCR from Brazil Epidemiological survey and molecular characterization of avian 
infectious bronchitis virus in Brazil between Pathogenicity and molecular characteristics of infectious bronchitis virus (IBV) strains isolated from broilers showing diarrhea and respiratory disease Isolation of a new serotype of infectious bronchitis-like virus from chickens in England The classification of new serotypes of infectious bronchitis virus isolated from poultry flocks in Britain between 1981 and 1983 Newly isolated serotypes of infectious bronchitis virus: their role in disease A survey of the presence of a new infectious bronchitis virus designated 4/91 (793B) Serotyping of avian infectious bronchitis viruses by the virus-neutralization test The control of avian infectious bronchitis/nephrosis in Australia jModelTest 2: more models, new heuristics and parallel computing Occurrence and significance of infectious bronchitis virus variant strains in egg and broiler production in The Netherlands New avian paramyxoviruses type I strains identified in Africa provide new outcomes for phylogeny reconstruction and genotype classification Infectious bronchitis virus variants: a review of the history, current situation and control measures Induction of cystic oviducts and protection against early challenge with infectious bronchitis virus serotype D388 (genotype QX) by maternally derived antibodies and by early vaccination Report of the Genotyping, Pathotyping and Protectotyping of Recent Strains from Chile Increased level of protection of respiratory tract and kidney by combining different infectious bronchitis virus vaccines against challenge with nephropathogenic Brazilian genotype subcluster 4 strains High prevalence of bovine viral diarrhea virus 1 in Chinese swine herds First report of IBV QX-Like Strains in Spain. 
Proceedings of the VI International Symposium on Avian Corona-and Pneumoviruses and Complicating Pathogens New insights on infectious bronchitis virus pathogenesis: characterization of Italy 02 serotype in chicks and adult hens D1466-like genotype of infectious bronchitis virus responsible for a new epidemic in chickens in Poland Characterization of a new genotype and serotype of infectious bronchitis virus in Western Africa Efficacy of some living classical and variant infectious bronchitis vaccines against local variant isolated from Egypt Prevalence and molecular characterization of avian infectious bronchitis virus in poultry flocks in Morocco from 2010 to 2014 and first detection of Italy 02 in Africa Arkansas 99, a new infectious bronchitis serotype Phylogenetic analysis of Brazilian bovine viral diarrhea virus type 2 (BVDV-2) isolates: evidence for a subgenotype within BVDV-2 Emergence of a new genotype of avian infectious bronchitis virus in Brazil Revisiting the taxonomical classification of Porcine circovirus type 2 (PCV2): still a real challenge Genotypes of infectious bronchitis viruses circulating in the Middle East between Serologic and cross-protection studies with several infectious bronchitis virus isolates from Delmarva-reared broiler chickens Prevalence of Arkansas-type infectious bronchitis virus in Delmarva Peninsula chickens Antigenic and S1 genomic characterization of the Delaware variant serotype of infectious bronchitis virus Novel infectious bronchitis virus S1 genotypes in Mexico 1998-1999 S1 gene characteristics and efficacy of vaccination against infectious bronchitis virus field isolates from the United States and Israel A 'new' strain of infectious bronchitis virus infecting domestic fowl in Great Britain Chinese QX strain of infectious bronchitis virus isolated in the UK A proposal on Porcine circovirus type 2 (PCV2) genotype definition and their relation with postweaning multisystemic wasting syndrome (PMWS) occurrence New 
algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0 A 15-year analysis of molecular epidemiology of avian infectious bronchitis coronavirus in China Cooperative investigation for the isolation of respiratory agents on problem cage layer operations Neither the RNA nor the proteins of open reading frames 3a and 3b of the coronavirus infectious bronchitis virus are essential for replication Antigenic differences among isolates of avian infectious bronchitis virus Comparative genomics of Korean infectious bronchitis viruses (IBVs) and an animal model to evaluate pathogenicity of IBVs to the reproductive organs Serologic and immunologic properties of a recent isolate of infectious bronchitis virus S1 and N gene analysis of avian infectious bronchitis viruses in Taiwan MRBAYES: Bayesian inference of phylogenetic trees Monoclonal antibodies to three structural proteins of avian infectious bronchitis virus: characterization of epitopes and antigenic differentiation of Australian strains A long-term study of Australian infectious bronchitis viruses indicates a major antigenic change in recently isolated strains Isolation of a variant infectious bronchitis virus in Australia that further illustrates diversity among emerging strains Virus Taxonomy: Classification and Nomenclature of Viruses: Ninth Report of the International Committee on Taxonomy of Viruses Review of infectious bronchitis virus around the world Infectious bronchitis Molecular and serologic characterization, pathogenicity, and protection studies with infectious bronchitis virus field isolates from California Rapid heattreatment attenuation of infectious bronchitis virus Mixed triple: allied viruses in unique recent isolates of highly virulent type 2 bovine viral diarrhea virus detected by deep sequencing Phylogenetic distribution and predominant genotype of the avian infectious bronchitis virus in China during A new serotype of infectious bronchitis virus 
responsible for respiratory disease in Arkansas broiler flocks A possible North African progenitor of the major European infectious bronchitis virus variant (793B, 4/91, CR88, etc Immunological differences in strains of infectious bronchitis virus MAFFT multiple sequence alignment software version 7: improvements in performance and usability Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data Identification of avian infectious bronchitis virus by direct automated cycle sequencing of the S-1 gene Antigenic domains on the peplomer protein of avian infectious bronchitis virus: correlation with biological functions Experimental evidence of recombination in coronavirus infectious bronchitis virus Circulation of infectious bronchitis virus strains from Italy 02 and QX genotypes in Slovenia between Genotyping of newly isolated infectious bronchitis virus isolates from northeastern Georgia Molecular epidemiology of infectious bronchitis virus in The Netherlands Phylogeny of antigenic variants of avian coronavirus IBV Molecular cloning and sequence comparison of the S1 glycoprotein of the Gray and JMK strains of avian infectious bronchitis virus The molecular biology of coronaviruses Evidence of genetic diversity generated by recombination among avian coronavirus IBV Origin and evolution of Georgia 98 (GA98), a new serotype of avian infectious bronchitis virus Identification and analysis of the Georgia 98 serotype, a new serotype of infectious bronchitis virus Typing of field isolates of infectious bronchitis virus based on the sequence of the hypervariable region in the S1 gene Genetic diversity of avian infectious bronchitis virus isolates in Korea between Characterization of a novel live attenuated infectious bronchitis virus vaccine candidate derived from a Korean nephropathogenic strain Isolation and genetic analysis revealed no predominant new strains of avian infectious bronchitis virus circulating 
in South China during Serotype and genotype diversity of infectious bronchitis viruses isolated during 1985 Continuous evolution of avian infectious bronchitis virus resulting in different variants cocirculating in southern China An emerging recombinant cluster of nephropathogenic strains of avian infectious bronchitis virus in Korea Live attenuated nephropathogenic infectious bronchitis virus vaccine provides broad cross protection against new variant strains Detection of infectious bronchitis virus by multiplex polymerase chain reaction and sequence analysis Infectious bronchitis virus: S1 gene characteristics of vaccines used in China and efficacy of vaccination against heterologous strains from China Genetic diversity of avian infectious bronchitis coronavirus strains isolated in China between Molecular characterization and pathogenicity of infectious bronchitis coronaviruses: complicated evolution and epidemiology in China caused by cocirculation of multiple types of infectious bronchitis coronaviruses Serological differences between strains of infectious bronchitis virus from New Zealand, Australia and the United States Studies on avian infectious bronchitis virus in New Zealand Differentiation of IBV strains Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination Genetic diversity of avian infectious bronchitis coronavirus in recent years in China Isolation and molecular characterization of Sul/01/09 avian infectious bronchitis virus, indicates the emergence of a new genotype in the Middle East Phylodynamic analysis of avian infectious bronchitis virus in South America Identification of a novel subtype of feline immunodeficiency virus in a population of naturally infected felines in the Brazilian Federal District Naturally occurring recombination between distant strains of infectious bronchitis virus Evaluation of commercially produced infectious bronchitis virus 
vaccines against an IBV field isolate obtained from broilers in California RDP3: a flexible and fast computer program for analyzing recombination Genotyping of infectious bronchitis viruses identified in Canada between Phylogenetic analysis of avian infectious bronchitis virus strains isolated in Japan A novel genotype of avian infectious bronchitis virus isolated in Japan in 2009 Sequence analysis of the gene coding for the S1 glycoprotein of infectious bronchitis virus (IBV) strains from New Zealand Identification of a novel nephropathogenic infectious bronchitis virus in Israel Incidence, characterisation and prophylaxis of nephropathogenic avian infectious bronchitis viruses Epidemiology of infectious bronchitis virus in Belgian broilers: a retrospective study Prospective identification of HEV-B enteroviruses during the 2005 outbreak Molecular characterization of major structural protein genes of avian coronavirus infectious bronchitis virus isolates in southern China Comparison of four regions in the replicase gene of heterologous infectious bronchitis virus strains Genotypic and phenotypic characterization of the California 99 (Cal99) variant of infectious bronchitis virus Isolation and characterization of a novel antigenic subtype of infectious bronchitis virus serotype DE072 Sequence analysis of infectious bronchitis virus isolates from the 1960s in the United States QX genotypes of infectious bronchitis virus circulating in Europe Molecular survey of infectious bronchitis virus in Europe in 2008 Molecular epidemiology and evolution of avian infectious bronchitis virus Identification of amino acids involved in a serotype and neutralization specific epitope within the s1 subunit of avian infectious bronchitis virus Sequence comparison of avian infectious bronchitis virus S1 glycoproteins of the Florida serotype and five variant isolates from Georgia and California Molecular characterization of infectious bronchitis viruses isolated from broiler chicken farms 
in Iran Epitopes on the peplomer protein of infectious bronchitis virus strain M41 as defined by monoclonal antibodies Molecular evolution of the human enteroviruses: correlation of serotype with VP1 sequence and application to picornavirus classification Phylogenetic analysis of Ontario infectious bronchitis virus isolates Characterisation of an avian infectious bronchitis virus isolated from IB-vaccinated broiler breeder flocks L'epizootie recente de bronchite infectieuse aviaire en France: importance, evolution et etiologie Infectious bronchitis in chickens Sequence analysis of the S1 glycoprotein of infectious bronchitis viruses: identification of a novel genotypic group in Australia Genetic diversity of avian infectious bronchitis virus California variants isolated between 1988 and 2001 based on the S1 subunit of the spike glycoprotein PCV2 genotype definition and nomenclature Diagnostic, epidemiology and control of the Q1 infectious bronchitis virus (IBV) variant strain in Peru, Colombia, Argentina and Chile Complete nucleotide sequences of S1 and N genes of infectious bronchitis virus isolated in Japan and Taiwan Methods for virus classification and the challenge of incorporating metagenomic sequence data Molecular characterization of three new avian infectious bronchitis virus (IBV) strains isolated in Québec Identification of three feline immunodeficiency virus (FIV) env gene subtypes and comparison of the FIV and human immunodeficiency virus type 1 evolutionary patterns Epidemiology classification of infectious bronchitis virus isolated in Korean between Coronaviruses: structure and genome expression Phylogenetic analysis of infectious bronchitis coronaviruses newly isolated in China, and pathogenicity and evaluation of protection induced by Massachusetts serotype H120 vaccine against QX-like strains Cloning and sequencing of genes encoding structural proteins of avian infectious bronchitis virus MEGA6: molecular evolutionary genetics analysis version 6.0 
Recombination in avian gamma-coronavirus infectious bronchitis virus QX-like infectious bronchitis virus in Africa Diagnostic and clinical observation on the infectious bronchitis virus strain Q1 in Italy QX-type infectious bronchitis virus in commercial flocks in the UK An update of infectious bronchitis virus strains circulating in Europe between Bovine viral diarrhoea virus genotype 1 can be separated into at least eleven genetic groups Orchitis in roosters with reduced fertility associated with avian infectious bronchitis virus and avian metapneumovirus infections Molecular characterization of infectious bronchitis virus strains isolated from the enteric contents of Brazilian laying hens and broilers Molecular epidemiology of avian infectious bronchitis in Brazil from 2007 to 2008 in breeders, broilers, and layers The relationship between serotype and genotype based on hypervariable region of S1 gene of infectious bronchitis virus Genotypic grouping for the isolates of avian infectious bronchitis virus in Taiwan Evolutionary implications of genetic variations in the S1 gene of infectious bronchitis virus Isolation and identification of glandular stomach type IBV (QX IBV) in chickens Homologous recombination in pestiviruses: identification of three putative novel events between different subtypes/genogroups Continued evolution of highly pathogenic avian influenza A (H5N1): updated nomenclature The spike protein of infectious bronchitis virus is retained intracellularly by a tyrosine motif Etiology of an infectious nephritis-nephrosis syndrome of chickens Immunological characteristics of a variant of infectious bronchitis virus isolated from chickens Massachusetts live vaccination protects against a novel infectious bronchitis virus S1 genotype DMV/5642/06 An RT-PCR survey of infectious bronchitis virus genotypes in the UK and selected European countries between 2002 and 2004 and the results from a vaccine trial A reverse transcriptase-polymerase chain reaction 
survey of infectious bronchitis virus genotypes in Western Europe from Characterization of three infectious bronchitis virus isolates from China associated with proventriculus in vaccinated chickens Virus-encoded proteinases and proteolytic processing in the Nidovirales Nephropathogenic infectious bronchitis in Pennsylvania chickens
The present work has been performed within the framework of the COST Action FA1207. E.C.H. is financially supported by the NHMRC Australia Fellowship (AF30). The authors thank Dr. J.J. de Wit from GD Animal Health, Deventer, The Netherlands, for helpful advice and constructive discussions on IBVs.