key: cord-0699061-pj8p2x9l authors: Pratelli, Annamaria; Martella, Vito; Decaro, Nicola; Tinelli, Antonella; Camero, Michele; Cirone, Francesco; Elia, Gabriella; Cavalli, Alessandra; Corrente, Marialaura; Greco, Grazia; Buonavoglia, Domenico; Gentile, Mattia; Tempesta, Maria; Buonavoglia, Canio title: Genetic diversity of a canine coronavirus detected in pups with diarrhoea in Italy date: 2003-06-09 journal: J Virol Methods DOI: 10.1016/s0166-0934(03)00081-8 sha: 05338fa6a0ce3d65e93af268dee65b20998ae0f3 doc_id: 699061 cord_uid: pj8p2x9l
The sequence of the S gene of a field canine coronavirus (CCoV), strain Elmo/02, revealed low nucleotide (61%) and amino acid (54%) identity to reference CCoV strains. The highest identity (77% nt and 81.7% aa) was found with feline coronavirus type I. A PCR assay for the S gene of strain Elmo/02 detected analogous CCoVs of different geographic origin, all of which exhibited 92–96% nucleotide identity to each other and to strain Elmo/02. The evident genetic divergence between the reference CCoV strains and the newly identified Elmo/02-like CCoVs strongly suggests that a novel genotype of CCoV is widespread in the dog population.
Canine coronavirus (CCoV) is an enveloped, positive-stranded RNA virus of dogs associated with moderate to severe enteritis in young pups. The genome contains two large open reading frames (ORFs), 1a and 1b, encoding two polyproteins that give rise to the viral replicase. Downstream of ORF1b there are 8–10 smaller ORFs encoding the structural proteins S (ORF2), E (ORF4), M (ORF5) and the nucleocapsid (N) protein (Enjuanes et al., 2000). The small membrane protein (E) has recently been found to be important for viral envelope assembly (Raamsman et al., 2000). The M protein is a type III glycoprotein consisting of a short amino-terminal ectodomain, a triple-spanning transmembrane domain, and a long carboxyl-terminal inner domain (Rottier, 1995).
ORF2 encodes a glycosylated protein (S) ranging from 1160 to 1452 amino acids (aa) in length (Enjuanes et al., 2000), which constitutes the large, petal-shaped spikes on the surface of the virion. This large protein can be divided into three structural domains. The large external domain at the N-terminus is divided further into two subdomains, S1 and S2. The S1 subdomain includes the N-terminal half of the molecule and forms the globular portion of the spikes. It contains sequences that are responsible for binding to specific receptors on the membrane of susceptible cells. S1 sequences are variable, containing various degrees of deletions and substitutions in different coronavirus strains or isolates. Mutations in the S1 region have been associated with altered antigenicity and pathogenicity. In contrast, S2 sequences are more conserved and contain two heptad repeat motifs that suggest a coiled-coil structure (Lai and Holmes, 2001). On the basis of phylogenetic analysis and antigenic cross-reactivity, three groups can be distinguished in the Coronaviridae family. Group I includes CCoV, the transmissible gastroenteritis virus of swine (TGEV), the porcine epidemic diarrhoea virus (PEDV), the porcine respiratory coronavirus (PRCoV), the feline coronaviruses (FCoVs) and the human coronavirus 229E (HCoV 229E). FCoVs can be divided into two serotypes, I and II, on the basis of a virus neutralization assay in vitro using both type-specific feline sera and monoclonal antibodies directed against the S protein (Herrewegh et al., 1998). In the field, FCoVs type I are predominant and FCoVs type II are detected only sporadically. Differences between the S gene of FCoVs type I and that of FCoVs type II may also account for the different properties observed in vitro, as FCoVs type I grow poorly in tissue culture cells (Pedersen et al., 1984) while type II strains grow well.
In a previous study, sequence analysis of CCoVs detected in faecal samples collected from dogs with diarrhoea revealed multiple nucleotide substitutions accumulating over a fragment of the M gene (Pratelli et al., 2001). A genetic drift to FCoV type II was also observed in the sequence of CCoVs detected in the faeces of two naturally infected pups during the late stages of long-term viral shedding. It was thus hypothesized that (i) the dogs might have been infected by a mixed population of genetically different CCoVs, or (ii) the viruses detected in both pups were the result of mutation/recombination events (Pratelli et al., 2002b). Subsequently, extensive sequence analysis of multiple regions of the viral genome, including ORF1a, ORF1b and ORF5, in several CCoV-positive faecal samples provided strong evidence for the existence of two separate genetic clusters of CCoV. The first cluster includes CCoVs intermingled with reference CCoV strains, such as Insavc-1 and K378, while the second cluster segregates separately from CCoVs and, presumably, represents a genetic outlier referred to as FCoV-like CCoV (Pratelli et al., 2003). The aim of the present study was to evaluate the genetic differences between the FCoV-like and the 'typical' CCoVs in the sequence of the gene encoding the S protein.
Twenty faecal samples, collected in four kennels in Southern Italy from 2–6-month-old pups affected with diarrhoea, were tested. Three of the kennels were sited in different areas of Puglia, about 50 km from each other, while the fourth was located in Abruzzo, more than 400 km from the other three. The faecal samples were stored at −20 °C until tested. All the samples were negative by a haemagglutination test for canine parvovirus and positive for CCoV when submitted to a PCR assay targeting a fragment of the M gene (primers CCoV1–CCoV2) (Pratelli et al., 1999).
The presence of FCoV-like CCoVs in the same samples was detected by means of a differential PCR assay, using primers (CCoV1a–CCoV2) able to recognise nucleotide substitutions conserved in the M gene across all the FCoV-like CCoVs (Pratelli et al., 2002a). The sequence of the primers and their positions in the M gene are shown in Table 1. Comparative sequence analysis of the S gene has revealed a higher degree of variation at the N-terminus than at the C-terminus of the S protein (Jacobs et al., 1987; Motokawa et al., 1996; Horsburgh and Brown, 1995; Wesley, 1999). Taking into account the sequence drift to FCoVs observed in the M gene, we designed a pair of primers, UCD1F–UCD1R, amplifying a 502 bp fragment at the very 3′ end of the S gene of FCoVs type I (strains UCD1, KU-2 and Black), which encodes the highly conserved C-terminus of the spike protein. All 20 canine samples were tested with this primer pair and yielded an amplicon of the expected size. The sequence of the amplicons was determined by direct sequencing of the PCR products and displayed 81–82% nucleotide identity to FCoV type I strains. To verify the extent of genetic variation between the two clusters of CCoV in the S gene, we determined the nearly full-length sequence of the ORF2 of one (Elmo/02) of the samples that had tested positive for FCoV-like CCoV. Degenerate primers were designed with the CODEHOP strategy (Rose et al., 1998), using a wide selection of coronaviruses belonging to group I of the Coronaviridae. This strategy is based on the identification of blocks of homology in the amino acid sequences of distantly related organisms. Hybrid oligonucleotides, with a short 3′ degenerate core region and a longer 5′ consensus clamp region, are selected by retro-translation of the blocks identified. CODEHOP sense primers with a low degeneracy index were selected to amplify overlapping fragments of ORF2.
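The degeneracy of a hybrid primer of this kind can be illustrated with a short calculation: it is the number of distinct oligonucleotides that the primer stands for, obtained by multiplying the number of bases allowed at each position. The sketch below is only an illustration under stated assumptions. The example primer is hypothetical and is not one of the primers listed in Table 1, the function name `degeneracy` is invented here, and the 11-nt core length is an arbitrary choice for the example.

```python
# IUPAC nucleotide codes mapped to the bases each code can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return the number of distinct sequences a degenerate primer represents."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])  # KeyError on a non-IUPAC character
    return total


# Hypothetical CODEHOP-style primer (assumption, not taken from Table 1):
# a non-degenerate 5' consensus clamp followed by a short 3' degenerate core.
clamp = "GGTGCTTGTAGATGG"
core = "ACNTAYGARGG"
primer = clamp + core

print("clamp degeneracy:", degeneracy(clamp))    # 1: no ambiguous positions
print("core degeneracy:", degeneracy(core))      # N (4) x Y (2) x R (2) = 16
print("primer degeneracy:", degeneracy(primer))  # 16
```

Because the clamp carries no ambiguity codes, the whole primer is only as degenerate as its 3′ core, which is the property the hybrid design relies on.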
One-step reverse transcription and PCR amplification were carried out using SuperScript™ One-Step RT-PCR for Long Templates (Life Technologies, Invitrogen, Milan, Italy). To select against any CCoV-like virus during PCR amplification, reverse primers specific for the S gene of FCoV-like CCoV Elmo/02 were used in combination with the forward degenerate primers. Reference CCoV strains, S378, K378, 45/93 (Buonavoglia et al., 1994), USDA, 1/71 and SE, were used as controls in the PCR reactions, to verify that no CCoV was amplified by the primers. The amplicons were cloned into pCR® 2.1-TOPO® vectors (TOPO TA Cloning®, Invitrogen, Milan, Italy) and the recombinant clones were identified by blue/white screening. Plasmid DNA was extracted and subjected to sequence analysis (Genome Express, Labo Grenoble, France). Following this strategy, the fragments of the FCoV-like virus Elmo/02 were inserted into the clones. A consensus of the sequences obtained was determined and the overlapping fragments were edited manually. Alignments and sequence analysis were performed using the BIOEDIT software package (Hall, 1999). The strategy used to determine the sequence of the S gene of the FCoV-like CCoV is outlined in Fig. 1. The position and sequence of the degenerate primers are reported in Table 1. The nucleotide sequence of Elmo/02 will appear in the DDBJ/EMBL/GenBank databases under accession no. AY170345. The amino acid sequence of the FCoV-like CCoV Elmo/02 was inferred and aligned with a selection of group I coronaviruses. Phylogenetic and molecular evolutionary analyses were conducted using MEGA version 2.1 (Kumar et al., 2001) and PAUP version 4.0b (Swofford, 1998). A parsimony tree was constructed using a heuristic algorithm, with statistical support provided by bootstrapping over 100 replicates. The primer pair V3F–V3R, designed on the sequence of virus Elmo/02, was chosen to selectively amplify FCoV-like CCoVs.
The sequences and positions of the primers are shown in Table 1. All the samples previously characterised as FCoV-like by two separate primer pairs targeted to the M gene (CCoV1a–CCoV2) (Pratelli et al., 2002a) and to the S gene (UCD1F–UCD1R) were screened with the new primers specific for virus Elmo/02. The RNA was reverse transcribed with random hexamers using MuLV Reverse Transcriptase (Applied Biosystems, Roma, Italy) and then amplified with AmpliTaq DNA polymerase (Applied Biosystems, Roma, Italy) by 40 cycles at 94 °C for 1 min, 55 °C for 1 min and 72 °C for 1 min. To assess the intra-genotypic variability in the S gene of FCoV-like CCoVs, four strains, each representative of a different geographical area, were selected and subjected to sequence analysis.
All 20 faecal samples characterised previously as FCoV-like CCoVs were recognised by the primer pair UCD1F–UCD1R, yielding an amplicon of the expected size of 502 bp. About 80% (3347 nucleotides) of the ORF encoding the S protein of strain Elmo/02 was determined. Using the ORF2 of strain UCD1 as a reference sequence, the sequenced fragment maps approximately to nt 868–4205 and aa 300–1401. The highest nucleotide identity was to FCoV type I strains KU-2, UCD1 and Black (about 77%), whereas identity to FCoVs type II and CCoVs was about 61%. Comparison of the inferred amino acid sequences revealed 80.81–81.76% identity to FCoVs type I, 53.88–54.31% to FCoVs type II and 54.31% to reference CCoV strains (Table 2). In accordance with previous observations, the sequence of the S protein was much more conserved at the C-terminus than at the N-terminus. For instance, amino acid identity to the best-matching sequence (strain KU-2) ranged from 73.39 to 88.4%, and to strain Insavc-1 from 41.4 to 65.51%, in the N- and C-terminus, respectively. The inferred amino acid sequence of the S protein of strain Elmo/02 is shown in Fig. 2.
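Identity values of the kind reported above are usually computed column by column over a pairwise alignment. The following Python sketch shows one way to do this. It assumes that columns containing a gap in either sequence are excluded from the comparison; the text does not state how gaps were treated, and alignment software may follow a different convention. The function name and the toy sequences are illustrative only and are not taken from the Elmo/02 data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns with a gap ('-') in either sequence are skipped (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment: 5 ungapped columns, 4 of them identical.
print(percent_identity("ACGT-A", "ACCTTA"))  # 80.0
```

Applying the same function to slices of an alignment (for example the N-terminal and C-terminal halves) would give region-specific identities comparable in spirit to those quoted for the two ends of the S protein.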
Similar to other coronaviruses, there were several potential N-glycosylation sites, Asn–X–Ser (NXS) or Asn–X–Thr (NXT). Most of the glycosylation sites were conserved between strain Elmo/02 and feline/CCoVs, in particular with respect to the most closely related FCoVs type I. Interestingly, a potential cleavage site, the stretch of basic amino acids Arg–Arg–Ala–Arg–Arg (RRARR), was found. The basic stretch is at about the same position as in the S protein of group II and group III coronaviruses, but it is absent in the S protein of all the other group I coronaviruses. Parsimony analysis of the S protein of group I coronaviruses revealed that strain Elmo/02 is much more closely related to FCoVs type I than to typical CCoVs. Conversely, typical CCoVs segregate tightly with FCoVs type II and the porcine coronaviruses TGEV and PRCoV (Fig. 3). The new pair of primers, V3F–V3R, specific for strain Elmo/02, successfully amplified all 20 samples tested, yielding a PCR product of the expected size of 744 bp. The sequence of the V3F–V3R amplicon of four strains representative of different geographical areas was determined, revealing a nucleotide variability of 4–8%.
Genetic divergence within coronavirus group I is accounted for by linear evolution as well as by sudden, dramatic shifts generated by RNA deletions or recombination. For instance, the S protein of PEDV occupies an intermediate position between HCoV 229E and TGEV (Kocherhans et al., 2001), while the S protein of PRCoV is closely related to that of TGEV but has a large deletion in the N-terminus (more than 200 aa) that may explain the change in the pathobiology of the virus (Vaughn et al., 1994). Comparative sequence analysis of the genome of FCoVs type I and type II and CCoV has demonstrated that FCoV type II has arisen from a template switch between FCoV type I and CCoV, which took place between the S and M genes.
An additional template switch has been mapped in the ORF1b region for strain FCoV 79-1146 and in the ORF1a/b region for strain FCoV 79-1683. The double recombination event resulted in the introduction of a large genome fragment, encompassing the CCoV-like S-encoding gene, into the background of an FCoV genome (Herrewegh et al., 1998). The S gene of CCoV is closely related to that of FCoVs type II, TGEVs and PRCoVs, and is more divergent from that of FCoVs type I, PEDVs and HCoV 229E (Wesseling et al., 1994). So far, little evidence has been provided for genetic drifts or shifts affecting CCoV. Wesley (1999) described a canine strain displaying a higher sequence identity to TGEV in the N-terminus of the S protein, which was explained as a possible recombination between CCoV and TGEV and was related to improved growth in swine testicular cells. The findings of the present study clearly indicate that a novel CCoV type, highly divergent from the reference CCoV strains and more closely related to FCoVs type I, circulates among dogs. Indeed, by means of RT-PCR, Elmo/02-like strains were successfully detected in all the samples tested. All the samples had been characterised as 'atypical' CCoVs when screened with an RT-PCR targeted to the M gene and able to distinguish between the two genetic lineages previously identified (Pratelli et al., 2002a). Extensive sequence analysis of multiple regions in ORF1a and 1b, as well as in the M-encoding gene, has confirmed the existence of a distinct genetic lineage of CCoV, evolutionarily located between CCoV and FCoV. Many amino acid residues observed in the M protein of FCoV-like CCoVs are the same as in FCoVs and presumably represent retention of the sequence of an ancestral virus (Pratelli et al., 2003).
Reconsidering these data in the light of the findings of the present study, and considering the analogies with closely related viruses, we have concluded that the extent of genetic variation observed within the CCoVs is limited in ORF1a and slightly greater in ORF1b and ORF5, though it still accounts for a clear pattern of segregation into a distinct genetic lineage. The two genotypes of CCoV diverge dramatically in ORF2, where there is more than 38.4% nucleotide and about 45.5% amino acid variation from reference CCoVs. Analysis of the S gene of the Elmo/02-like CCoVs revealed a low degree of variation (4–8%), which may be explained by their different geographical origin. The majority of the sequence changes observed are conservative, demonstrating that there is some heterogeneity in the ORF2 of Elmo/02-like CCoVs. In conclusion, the findings suggest that the two canine genotypes underwent a linear evolution rather than a sudden shift originating from a recombination event analogous to those leading to the appearance of FCoVs type II. Finally, recombination with an ancestral coronavirus from which FCoVs type I and Elmo/02-like CCoVs directly evolved cannot be excluded. Whether the Elmo/02-like CCoVs have phenotypic properties different from those of typical CCoVs, as is the case for FCoVs type I and II, will be interesting to evaluate. The high divergence in the amino acid composition and the loss and gain of potential glycosylation sites, compared to the most closely related coronaviruses (FCoV type I, FCoV type II and typical CCoV), strongly suggest that the Elmo/02 strain is poorly related antigenically to the other coronaviruses of dogs and cats. Moreover, the presence of the stretch of basic residues RRARR is indicative of a potential cleavage of the protein (Wesseling et al., 1994).
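The two sequence features discussed here, potential N-glycosylation sites of the form Asn–X–Ser/Thr and the basic stretch RRARR, can both be located with a simple pattern search. The Python sketch below is only an illustration. The peptide is a made-up toy sequence and not the Elmo/02 S protein, the function names are hypothetical, and the optional exclusion of proline at position X is a widely used convention that is not stated in the text.

```python
import re


def find_sequons(protein: str, exclude_proline: bool = False) -> list[int]:
    """1-based start positions of N-X-S/T sequons (overlaps allowed)."""
    x = "[^P]" if exclude_proline else "."
    # The lookahead makes the search report overlapping sequons as well.
    pattern = re.compile(f"(?=N{x}[ST])")
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


def find_motif(protein: str, motif: str = "RRARR") -> list[int]:
    """1-based start positions of an exact motif such as the basic stretch."""
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() + 1 for m in pattern.finditer(protein.upper())]


# Toy peptide (assumption), 18 residues long.
toy = "MNGSANPTQRRARRSNNT"

print(find_sequons(toy))                        # [2, 6, 16]
print(find_sequons(toy, exclude_proline=True))  # [2, 16]
print(find_motif(toy))                          # [10]
```

A hit from such a search marks only a potential site; whether a sequon is glycosylated or a basic stretch is cleaved has to be established experimentally.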
A similar basic motif is present, at approximately the same position, in all the coronaviruses of groups II and III identified to date, but it is absent in all the other coronaviruses of group I. Cleavage of the S protein of coronaviruses has been correlated with cell-fusion activity in vitro (Hingley et al., 1998), but the potential implications for viral pathobiology have not been determined. On the basis of the significant genetic differences between the reference and the Elmo/02-like CCoVs, our tentative proposal is to designate the new genotype as CCoV type I, and to designate the reference strains, such as Insavc-1 and K378, as CCoVs type II. This new designation does not take into account the order of discovery of the viruses, but is based on the genetic similarity between CCoVs type II and FCoVs type II and between CCoV type I and FCoV type I.
L'infezione da coronavirus del cane: indagine sulla presenza del virus in Italia
Virus Taxonomy, Classification and Nomenclature of Viruses
BIOEDIT: a user-friendly biological sequence alignment and analysis program for Windows 95/98/NT
Feline coronavirus type II strains 79-1683 and 79-1146 originate from a double recombination between feline coronavirus type I and canine coronavirus
The spike protein of murine coronavirus mouse hepatitis virus strain A59 is not cleaved in primary glial cells and primary hepatocytes
Cloning, sequencing and expression of the S protein gene from two geographically distinct strains of canine coronavirus
The nucleotide sequence of the peplomer gene of porcine transmissible gastroenteritis virus (TGEV): comparison with the sequence of the peplomer protein of feline infectious peritonitis virus (FIPV)
Completion of the porcine epidemic diarrhoea coronavirus (PEDV) genome sequence
MEGA2: molecular evolutionary genetics analysis software
Coronaviridae: the viruses and their replication
Comparison of the amino acid sequence and phylogenetic analysis of the peplomer, integral membrane and nucleocapsid proteins of feline, canine and porcine coronaviruses
Pathogenic differences between various feline coronavirus isolates
Development of a nested PCR for the detection of canine coronavirus
Variation of the sequence in the gene encoding for transmembrane protein M of canine coronavirus (CCV)
PCR assay for the detection and the identification of atypical canine coronavirus in dogs
M gene evolution of canine coronavirus in naturally infected dogs
Identification of coronaviruses in dogs that segregate separately from the canine coronavirus genotype
Characterization of the coronavirus mouse hepatitis virus strain A59 small membrane protein E
Consensus-degenerate hybrid oligonucleotide primers for amplification of distantly related sequences
The coronavirus membrane protein
PAUP*, phylogenetic analysis using parsimony (*and other methods)
Three new isolates of porcine respiratory coronavirus with various pathogenicities and spike (S) gene deletions
The S gene of canine coronavirus, strain UCD-1, is more closely related to S gene of transmissible gastroenteritis virus than to that of feline infectious peritonitis virus
Nucleotide sequence and expression of the spike (S) gene of canine coronavirus and comparison with the S proteins of feline and porcine coronaviruses
This study was supported by grants from CEGBA (Centro di Eccellenza di Genomica in Campo Biomedico e Agrario) and from the Ministry of University, Italy (project: Enteriti virali del cane).