key: cord-1046081-fu964fvd authors: Jacobs, Liesbeth; de Groot, Raoul; van der Zeijst, Bernard A.M.; Horzinek, Marian C.; Spaan, Willy title: The nucleotide sequence of the peplomer gene of porcine transmissible gastroenteritis virus (TGEV): comparison with the sequence of the peplomer protein of feline infectious peritonitis virus (FIPV) date: 1987-11-30 journal: Virus Research DOI: 10.1016/0168-1702(87)90008-6 sha: 054d29f3f80c6fdb8f78be514632d9fe405bfaaa doc_id: 1046081 cord_uid: fu964fvd

Abstract

The amino acid sequence of the peplomer protein of transmissible gastroenteritis virus (TGEV) has been derived from the cloned cDNA sequence. The gene encodes a protein of 1447 amino acids with a molecular weight of 159,574. Comparison with the primary structure of the peplomer protein of feline infectious peritonitis virus (FIPV) (de Groot et al., 1987b) revealed one domain, from amino acids 1 to 274, in which the nucleotide homology was 39%, whereas in the second domain (from residues 275 to 1447) the homology was 93%.

In TGEV-infected cells five subgenomic mRNA species are synthesized. By in vitro translation, it has been shown that mRNA3 encodes the E2 protein (Hu et al., 1984; Jacobs et al., 1986). The peplomer protein of coronaviruses is responsible for virus attachment to the cell and for membrane fusion (for a review see Sturman and Holmes, 1983). It has been shown recently that neutralizing monoclonal antibodies are directed against the E2 protein of TGEV and that the main neutralizing epitopes are conserved among TGEV strains (Laude et al., 1986; Delmas et al., 1986; Jimenez et al., 1986). TGEV and feline infectious peritonitis virus (FIPV) are serologically closely related (Pederson et al., 1978; Horzinek et al., 1982), to the degree that monoclonal antibodies raised against TGEV react with FIPV. In this report we present the nucleotide sequence of the TGEV peplomer gene and the deduced amino acid sequence.
The results show that the peplomer proteins of TGEV and FIPV (de Groot et al., 1987a) are also closely related at the nucleotide level.

The Purdue strain of TGEV was plaque-purified and grown on PD5 cells as described previously (Jacobs et al., 1986). Poly(A)+ selected RNA was isolated from infected cells and fractionated by isokinetic sucrose gradient centrifugation as described previously (Jacobs et al., 1986). cDNA was prepared using a sucrose gradient fraction enriched for mRNA3 as template and calf thymus pentanucleotides as primers. First and second strand synthesis was carried out essentially as described by Gubler and Hoffman (1983); for details see Niesters et al. (1986). Double-stranded cDNA was dC-tailed using 4 U terminal transferase (Amersham, England) for 30 s (Maniatis et al., 1982), annealed to G-tailed PstI-cleaved pUC9 vector (Pharmacia, Uppsala, Sweden) and used for transformation (Hanahan, 1983) of E. coli strain JM109 (Yanisch-Perron et al., 1985). Colonies containing viral inserts were identified by hybridization using nick-translated FIPV-E2 probes (de Groot et al., 1987a). Plasmid DNA was isolated from positive recombinants and analyzed by restriction enzyme mapping. After digestion with several restriction enzymes the DNA fragments were separated by agarose gel electrophoresis, isolated by binding to NA-45 paper (Schleicher and Schuell, Dassel, W. Germany) and recloned in bacteriophage M13 mp18 and mp19 vectors (Yanisch-Perron et al., 1985). Single-stranded DNA was isolated and sequenced using the dideoxynucleotide chain termination procedure of Sanger et al. (1977). Data were analyzed using the computer programs of Staden (1982).

TGEV and FIPV are serologically closely related, and we therefore assumed that the two viruses share nucleotide homology within the E2 gene.
Two FIPV E2-specific probes representing the 5' and 3' ends of the FIPV E2 gene (de Groot et al., 1987b) hybridized with RNA3, the mRNA for E2 (Jacobs et al., 1986), but not with the smaller subgenomic RNAs of TGEV. They were subsequently used to select TGEV E2-specific recombinants (Fig. 1). Twenty-nine cDNA clones hybridized to one probe, and two clones, pA6 and pB9, hybridized to both FIPV probes. Finally, three clones (pB1, pA6 and pB9) were selected for recloning into bacteriophage M13 and used for sequence analysis (Fig. 1). The sequencing strategy is depicted in Fig. 1. With the exception of the extreme 5'-end, the sequence was determined using at least two different cDNA clones. To avoid extensive subcloning into M13, synthetic oligonucleotides were used to prime the sequence reactions. The nucleotide sequence is given in Fig. 2.

Fig. 1. The restriction sites and sequence strategy of the TGEV peplomer gene. The position of recombinant cDNA clones and the two FIPV probes used for screening the cDNA library are indicated.

An open reading frame (ORF) with the potential to encode a protein of 1447 amino acids was found. A putative signal sequence is present at the amino terminal end (residues 1-14; Fig. 2). To identify and characterize hydrophobic segments which may penetrate the lipid bilayer we have used the algorithm of Eisenberg et al. (1984). In addition to the signal sequence, a segment with a mean hydrophobicity of 0.91 was detected from position 1389 to 1411 (Fig. 2). This part of the protein may be responsible for anchoring the protein in the lipid bilayer of the virion. Thirty-two potential glycosylation sites have been found (Asn-X-Ser or Asn-X-Thr, X ≠ Pro), clustered in the amino terminal end of the protein and in the carboxyl terminal end immediately upstream of the transmembrane anchor (Fig. 2).
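The search for potential N-glycosylation sites described above (Asn-X-Ser or Asn-X-Thr, with X being any residue except Pro) can be illustrated with a short sketch. This is not the procedure used in the paper, which does not describe how the sites were counted; the function name `find_sequons` and the toy peptide are hypothetical and chosen only for illustration.

```python
import re

# N-linked glycosylation sequon: Asn-X-Ser or Asn-X-Thr, X being any residue
# except Pro. The lookahead consumes only the Asn, so overlapping sequons
# (e.g. "NNSS") are each reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of the Asn in each potential N-glycosylation site."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide, NOT taken from the TGEV sequence.
toy = "MNGSANPTLNATQNNSS"
print(find_sequons(toy))  # [2, 10, 14, 15]; the Asn at position 6 is skipped because X = Pro
```

Applied to the full deduced amino acid sequence, such a scan only lists candidate sites; whether a given site actually carries a carbohydrate side chain cannot be inferred from the sequence alone.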
Many cysteine residues are located downstream of the transmembrane anchor.

Both the nucleotide and the predicted amino acid sequences of the E2 genes of TGEV and FIPV were compared. Data on the homology of the E2 genes of TGEV and FIPV are given in Table 1; the differences at the amino acid level are indicated in Fig. 2. Only 74 amino acid substitutions are found downstream of position 274 (homology of 94%); some of the mutations are clustered (Fig. 2B). There is a striking difference between the E2 proteins of FIPV and TGEV in the N-terminal part (residues 1-274), where a homology of only 30% is found.

An ORF of 1447 amino acids was identified in the E2 gene of TGEV. The calculated mol. wt. of 159,574 is in good agreement with the size of the protein found in tunicamycin-treated cells or after in vitro translation of mRNA3 (Hu et al., 1984; Jacobs et al., 1986). The M_r of the peplomer protein is approximately 200,000, suggesting that most of the 30 potential glycosylation sites (Fig. 2) carry carbohydrate side chains, each of which adds about 2000 to the mol. wt. of the protein (Neuberger et al., 1972). Recently, Rasschaert and Laude (1987) have also obtained the TGEV E2 sequence, and only 7 amino acid differences were found (Fig. 2).

The ORF is flanked by a repeat of the 9 nucleotides 5'-ACTAAACTT-3' (Fig. 2). The same repeat has been found at the boundaries of the E2 gene of FIPV (de Groot et al., 1987b) as well as upstream of the TGEV nucleocapsid gene (Kapke et al., 1986). Similar sequence homologies have been found adjacent to the ORFs in the genomes of mouse hepatitis virus (MHV) and infectious bronchitis virus (IBV). Presumably these sequences are recognition signals used in the discontinuous transcription mechanism of coronaviruses (Spaan et al., 1983; Brown and Boursnell, 1984).
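Locating a short flanking repeat of this kind in a nucleotide sequence amounts to a simple substring scan, sketched below. The helper `find_motif` and the toy sequence are hypothetical; the motif is written as ACTAAACTT, which is our reading of the 9-nucleotide repeat quoted in the text and should be treated as an assumption.

```python
def find_motif(seq: str, motif: str) -> list[int]:
    """Return 1-based start positions of every (possibly overlapping) occurrence of motif."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        # Advance by one base so overlapping occurrences are not missed.
        start = seq.find(motif, start + 1)
    return positions


# Assumed reading of the 9-nucleotide repeat quoted in the text.
REPEAT = "ACTAAACTT"

# Hypothetical toy sequence with the repeat on both sides of a short ORF-like
# stretch; it is NOT TGEV sequence.
toy = "GG" + REPEAT + "ATGAAAAAATAA" + REPEAT + "CC"
print(find_motif(toy, REPEAT))  # [3, 24]
```

An exact-match scan like this would miss the "similar sequence homologies" mentioned for MHV and IBV, which differ from the TGEV repeat; finding those would need a mismatch-tolerant comparison.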
The amino acid sequence derived from the E2 gene has the characteristic features of coronavirus peplomer proteins with regard to the signal sequence, the membrane anchor and the distribution of the glycosylation sites (Binns et al., 1985; Schmidt et al., 1987); however, the TGEV and FIPV peplomer proteins lack the proteolytic cleavage sites which are present in IBV and MHV. Cavanagh (1983) has proposed a model for the coronaviral peplomer in which the C-terminal half of the protein forms its stalk and the N-terminal half its bulbous part. Recently, de Groot et al. (1987a) have postulated a model in which a coiled-coil structure forms the connection between the globular part of the peplomer protein and the viral membrane. This model is based on the occurrence of heptad repeats, i.e., a periodicity (a-b-c-d-e-f-g) in which amino acids a and d are hydrophobic. Heptad repeats are indicative of a coiled-coil structure in which the hydrophobic residues form the interface between interlocking α-helices. Two such repeats were detected in E2 of FIPV (amino acid residues 1067-1149 and 1334-1380). There are three amino acid differences between TGEV and FIPV in the first region and none in the second. All amino acid substitutions leave the heptad repeat intact.

In TGEV four main antigenic epitopes (A, B, C and D) have been found (Delmas et al., 1986). The neutralization-relevant epitopes A and B are highly conserved in TGEV (Jimenez et al., 1986; Delmas et al., 1986), and monoclonal antibodies recognizing both epitopes also neutralized FIPV infectivity (data not shown). Hence these conserved epitopes should be located downstream of position 274, where FIPV and TGEV possess a homology of 94%. Neutralization epitopes in antigenic sites C and D of TGEV are less well conserved (Delmas et al., 1986). Three neutralizing monoclonal antibodies recognizing sites C and D were not able to neutralize FIPV infectivity.
Probably these epitopes are situated where amino acid substitutions between FIPV and TGEV are clustered.

In the absence of selection, mutations will occur randomly. Downstream of position 274 there is a preference for mutations in the third position of a codon (Table 1), where only 28% of the nucleotide substitutions result in an amino acid replacement (Masotoshi and Gojobozi, 1986). In this part of the protein there seems to be selection for conservation of the amino acid sequence, rather than for antigenic variation. The high sequence divergence at the N-terminus was unexpected and is not caused by selection by neutralizing antibodies. A possible explanation would be that FIPV arose by recombination of TGEV with a related virus (related, because a homology of 30% remains in residues 1-274). In vitro, a high frequency of recombination has been described for murine coronaviruses (Makino et al., 1986; Lai et al., 1985). We have compared the 274 amino acids with the NBRF sequence library and with the MHV and IBV structural proteins using the FASTP program (Lipman and Pearson, 1985). Apart from the 30% homology with the FIPV E2 protein, no homologies have been found. However, sequences of other related coronaviruses such as canine coronavirus are not yet available.

Cloning and sequencing of the gene encoding the spike protein of the coronavirus IBV
Genome of porcine transmissible gastroenteritis virus
Avian infectious bronchitis virus genomic RNA contains sequence homologies at the intergenic boundaries

This study was supported by a research grant from Duphar B.V., Weesp, The Netherlands. We thank Rente Ter Haar for the FIPV neutralisation experiments, Hans Lenstra for the comparison of the E2 amino acid sequence with the NBRF library and for many helpful discussions, Diane Ruyzendael for synthesis of the oligonucleotides, and Willem Luytjes for advice on computer analysis. We also thank Dr. H.
Laude for sending us the TGEV E2 sequence data prior to publication.

Virol. 58, 45-53.
Rasschaert, D. and Laude, H. (1987) The predicted primary structure of the peplomer E2 of porcine coronavirus TGEV. J. Gen. Virol. 68, in press.
Sanger, F., Nicklen, S. and Coulson, A.R. (1977)