key: cord-010045-eqzs01au authors: Britton, P.; Cármenes, R. S.; Page, K. W.; Garwes, D. J.; Parral, F. title: Sequence of the nucleoprotein gene from a virulent British field isolate of transmissible gastroenteritis virus and its expression in Saccharomyces cerevisiae date: 2006-10-27 journal: Mol Microbiol DOI: 10.1111/j.1365-2958.1988.tb00010.x sha: doc_id: 10045 cord_uid: eqzs01au

Subgenomic mRNA from a virulent isolate of porcine transmissible gastroenteritis virus (TGEV) was used to produce cDNA, which was sequenced. Two non-overlapping open reading frames (ORFs) were identified. The larger, encoding a polypeptide of 382 amino acids (relative molecular mass (Mr) 43 483), was shown to be the viral nucleoprotein gene. The second ORF, found 3' to the larger ORF, encodes a polypeptide of 78 amino acids (Mr 9068) which has yet to be assigned to a viral product. The nucleoprotein gene was expressed in yeast cells under the control of two types of yeast promoter: the constitutive PGK promoter and the inducible GAL1 promoter. Yeast cells containing recombinant plasmids with the nucleoprotein gene in the correct orientation produced a polypeptide of Mr 47 000, identical to the viral product, that reacted with a specific monoclonal antibody.

The coronaviruses comprise a large group of enveloped positive-stranded RNA viruses and cause a variety of infections in a wide range of animal hosts (Garwes, 1982; Siddell et al., 1983). Transmissible gastroenteritis virus (TGEV) is a coronavirus that causes gastroenteritis in pigs, resulting in high mortality among neonates. TGEV is composed of a capped polyadenylated single-stranded RNA genome of Mr 6.8 × 10^6 or 20 kb in length (Garwes et al., 1975; Dennis and Brian, 1981) and three major structural polypeptides: a surface glycoprotein (spike or peplomer protein) with a monomeric Mr of
200 000, a glycosylated integral membrane protein observed as a series of polypeptides of Mr 28 000-31 000, and a basic phosphorylated protein (the nucleoprotein) of Mr 47 000 associated with the viral RNA (Garwes and Pocock, 1975). Infection of susceptible cells with TGEV results in the synthesis of a 'nested' set of subgenomic mRNA species with common 3' termini but different extensions in the 5' direction, a phenomenon also observed with other coronaviruses such as infectious bronchitis virus (IBV; Stern and Kennedy, 1980), though the size and number of the mRNA species vary with the virus. TGEV appears to have, in addition to the genomic-sized RNA species, four major (11.2, 3.9, 2.6 and 1.7 kb) and two low-abundance (3.0 and 0.7 kb) mRNA species (Britton et al., 1986; Jacobs et al., 1986). Each coronavirus subgenomic mRNA species is believed to direct the synthesis of a protein from the 5' proximal region (Sturman and Holmes, 1983; Stern and Sefton, 1984). In vitro translation studies have shown that the mRNA species of 1.7 kb directs the synthesis of the viral nucleoprotein (Britton et al., 1986; Jacobs et al., 1986).

Kapke and Brian (1986) have sequenced the 3' end of genomic RNA from the avirulent Purdue strain of TGEV and identified two complete open reading frames within the first 2022 bp of the genome. One of the open reading frames encodes a polypeptide of Mr 43 426 and has properties similar to those found in the IBV and mouse hepatitis virus (MHV) nucleoproteins. Here we report the production of cDNA from subgenomic mRNA species isolated from TGEV-infected cells using a virulent British isolate of TGEV.

E. coli cells are frequently unable to process eukaryotic proteins correctly, often producing the protein in the wrong conformation as a result of incorrect folding, which leads to the degradation of foreign gene products.
Several heterologous proteins have been successfully expressed in Saccharomyces cerevisiae, including eukaryotic proteins (calf chymosin; Mellor et al., 1983) and viral antigens (hepatitis B surface antigen (HBsAg); Valenzuela et al., 1982; Miyanohara et al., 1983). S. cerevisiae cells are easy to grow and, unlike E. coli cells, have the advantage that they lack any pathogenicity, making them particularly useful as a source of large quantities of hormones and antigens of interest in human and veterinary medicine. We have expressed the nucleoprotein gene in yeast cells, and the product was shown to be identical to the TGEV nucleoprotein by immunoblotting using a specific monoclonal antibody.

Cloning from TGEV mRNA species

TGEV poly(A)-containing mRNA species were fractionated on non-denaturing sucrose gradients, and TGEV cDNA, derived from the 1.7 kb mRNA species, was synthesized by the vector-primer system and isolated as described in Experimental procedures. Out of 417 colonies transformed with recombinant DNA, 36 contained DNA that hybridized to 32P-labelled TGEV poly(A)-containing RNA. Three recombinant plasmids, shown to contain cDNA inserts of 1.5 kbp and designated pTS13-2, pTS15-1 and pTS15-2, were labelled with [35S]-dATP by nick translation and hybridized to glyoxylated TGEV mRNA species separated on 1% agarose gels as described in Experimental procedures. All three recombinant plasmids were found to hybridize to all TGEV RNA species. In contrast, other recombinant plasmids derived from different mRNA species hybridized only to one or more TGEV RNA species. This observation indicated that pTS13-2, pTS15-1 and pTS15-2 contained regions of cDNA complementary to regions present on all the TGEV 'nested' mRNA species. This can only occur if the cDNA was derived from the 3' end of the RNA and, since the cDNA inserts were about 1.5 kbp, most of the 1.7 kb mRNA species must have been copied.
Plasmid pTS15-1 was purified and a restriction map constructed (Fig. 1A). The end containing the poly(A) tail was confirmed by probing Southern blots with 32P-labelled oligo(dT)12-18 oligonucleotides. TGEV poly(A)-containing mRNA species isolated from infected cells were primed using a 90 bp HindIII-XbaI fragment purified from the 5' end of the TGEV cDNA cloned in plasmid pTS15-1. The cDNA generated was end-repaired and cloned into plasmid pUC9 following the addition of EcoRI linkers. A recombinant plasmid, pF4F-36 (Fig. 1B), was identified and found to contain a cDNA insert of 1.15 kbp with restriction sites at one end identical to those at the 5' end of plasmid pTS15-1 (Fig. 1A), indicating that the two cDNA fragments were contiguous.

The cDNA from plasmids pTS15-1 and pF4F-36 was subcloned into M13mp vectors, using either the shotgun method (Bankier and Barrell, 1983) or specific restriction fragments, and sequenced entirely in both directions. The corresponding DNA sequence, from 37 bp 5' of the first open reading frame to the poly(A) tail, is illustrated in Fig. 2. The cDNA was translated in all six reading frames, of which only the translations of the virus-sense strand revealed any open reading frames. The longest open reading frame, of 1149 bp and initiating from the ATG at position 38, overlapped both cDNA inserts and encoded a polypeptide of 382 amino acids with an Mr of 43 483. The basic nature and size of the polypeptide indicated that it corresponded to the viral nucleoprotein, which is observed to have Mr 42 000 and 47 000 in TGEV-infected cells and Mr 47 000 in virions (Garwes et al., 1984). The difference between the predicted and observed molecular weights of the protein may result from the degree of phosphorylation. Garwes et al.
(1984) for TGEV and Stohlman and Lai (1979) for MHV have shown that the nucleoproteins are phosphorylated on serine residues, which represent 41 (11%) of the residues in strain FS772/70 and 39 (10%) in the Purdue strain of TGEV (Kapke and Brian, 1986). The polypeptide sequence has three areas of homology with the nucleoproteins from IBV (Boursnell et al., 1985) and MHV (Skinner and Siddell, 1983) as analysed by dot matrix (Fig. 3). Previous work (Britton et al., 1987) showed that antibodies raised against a purified chimaeric protein, produced from the pTS15-1 HindIII fragment (1.38 kbp) fused to the 3' end of the lacZ gene, immunoprecipitated TGEV nucleoprotein. This confirmed that the cDNA carried on pTS15-1 contained most of the nucleoprotein gene.

A second open reading frame, present only in pTS15-1, is found 6 bp 3' to the nucleoprotein gene. This second open reading frame, initiating from the ATG at position 1192, is composed of 237 bp and encodes a polypeptide of 78 amino acids of Mr 9068. Initial analysis using a universal codon usage table (Staden and McLachlan, 1982), or a codon usage table constructed from the TGEV nucleoprotein gene, suggested that the polypeptide might not be produced because of its poor codon usage. The smaller open reading frame should be present on an mRNA species of 0.7 kb; the smallest mRNA species detected in TGEV-infected cells is the 0.7 kb low-abundance species (Britton et al., 1986), showing that a polypeptide of Mr 9068 could be produced by this mRNA species.

Fig. 2 legend (partly unrecoverable). Sequence encoding the nucleoprotein (bases 39-1185) and a hypothetical hydrophobic protein (bases 1192-1428), with the sequence ACTAAAC indicated and with comparison to the Purdue strain (Kapke and Brian, 1986). The boxed amino acids are identical residues found between IBV (Boursnell et al., 1985) and MHV (Skinner and Siddell, 1983). The broken lines refer to the two major and weaker areas of homology between the three nucleoproteins as determined by dot-matrix analysis (Fig. 3).

Fig. 3 legend (dot-matrix comparisons). A.
IBV Beaudette nucleoprotein (Boursnell et al., 1985) and TGEV FS772/70 nucleoprotein. B. MHV JHM nucleoprotein (Skinner and Siddell, 1983) and TGEV FS772/70 nucleoprotein. C. TGEV Purdue nucleoprotein (Kapke and Brian, 1986) and TGEV FS772/70 nucleoprotein. The comparisons used a window of 30 residues with a stringency of 8.

Plasmids pTS15-1 and pF4F-36 were digested with restriction endonucleases to produce fragments that did not overlap with pUC9 fragments, and the nucleoprotein gene was constructed by cloning the fragments into the purified 2.5 kbp NdeI-SmaI fragment from plasmid pUC9 (Fig. 4). A recombinant plasmid, pPBN10, was found to consist of the truncated pUC9 vector with an insert consisting of the fragments from plasmids pTS15-1 and pF4F-36 in the correct order. Plasmid pPBN10 contained the TGEV nucleoprotein gene initiating 19 bp 3' from the NdeI site. The plasmid also contained the smaller 3' ORF, which initiates, in a different reading frame, 6 bp 3' from the end of the nucleoprotein gene and terminates 166 bp 5' to the DraI/SmaI combined site between the TGEV cDNA insert and the pUC9 DNA. The 1.76 kbp NdeI-PvuI fragment from pPBN10, containing the complete nucleoprotein gene, was purified and repaired using the Klenow fragment of E. coli DNA polymerase I, and BamHI linkers were added (Fig. 4). The resulting 1.58 kbp DNA fragment was gel-purified and cloned into the BamHI site of plasmid pBR322. A recombinant plasmid, pBNP5 (Fig. 4), consisting of the 1.58 kbp BamHI fragment in pBR322, was used as a source of the nucleoprotein gene in a BamHI cassette.

Under a constitutive promoter

The 1.58 kbp BamHI fragment from plasmid pBNP5 was ligated into the BglII site of the yeast expression plasmid, pMA91, to position it downstream of the constitutive yeast PGK promoter. Two recombinant plasmids were identified as having the 1.58 kbp BamHI fragment in opposite orientations in pMA91.
Plasmid pYNP1 consisted of the TGEV nucleoprotein gene in the correct orientation for expression from the yeast PGK promoter, and pYNP2 had the BamHI fragment in the opposite orientation, so that expression from the PGK promoter was prohibited.

Under an inducible promoter

The 1.58 kbp BamHI fragment from plasmid pBNP5 was ligated into the BamHI site of the yeast expression plasmid pB620 to position it downstream of the galactose-inducible yeast GAL1 promoter. A recombinant plasmid, pYNG1, was found to contain the TGEV nucleoprotein gene in the correct orientation for expression under the control of the yeast GAL1 promoter. Unlike plasmid pMA91, the expression plasmid pB620 does not contain a yeast terminator sequence; such a sequence is believed to improve the efficiency of expression. The 0.72 kbp BglII-BamHI fragment, containing the PGK terminator sequence, was purified from plasmid pMA91 and ligated onto the 1.58 kbp BamHI fragment from pBNP5. The resulting mixture was electrophoresed into an agarose gel, and a DNA fragment corresponding to 2.3 kbp was purified and ligated into the BamHI site of plasmid pBR322. A recombinant plasmid, pBRNP2, was isolated which contained the 0.72 kbp fragment in the correct orientation and fused onto the correct end of the fragment containing the TGEV nucleoprotein gene. The 2.3 kbp BamHI fragment was cloned into the BamHI site of plasmid pB620 as described for the 1.58 kbp BamHI fragment. Two recombinant plasmids were isolated which had the 2.3 kbp fragment in opposite orientations. Plasmid pYNG2 (Fig. 5) consisted of the fragment in the correct orientation for expression from the yeast GAL1 promoter but differed from plasmid pYNG1 in that it now had the PGK termination sequence spliced to the other end. Plasmid pYNG3 was similar to pYNG2 except that the fragment was in the opposite orientation, prohibiting expression of the TGEV nucleoprotein gene from the GAL1 promoter.

The recombinant plasmids pYNP1, pYNP2, pYNG1,
pYNG2 and pYNG3 and expression vectors pMA91 and pB620 were transformed into the S. cerevisiae strain BWG1-7A by the spheroplast method. Yeast transformants were selected at 30°C for their ability to grow in the absence of leucine (pMA91, pYNP1 and pYNP2) or uracil (pB620, pYNG1, pYNG2 and pYNG3). The generation time of the yeast strain was the same before and after transformation with any of the plasmids (3.5 h in synthetic media). The stability of the plasmids in the yeast strain was determined by growing the cells in complex non-selective media (YPD) and plating the cells on either synthetic selective media (SD) or YPD. After seven generations in YPD, 90% of cells retained pMA91 and its derivatives pYNP1 and pYNP2, and 70% of cells retained pB620 and its derivatives pYNG1, pYNG2 and pYNG3. To check that the yeast cells had not modified the recombinant plasmids in any way, plasmid DNA was rescued from the yeast cells, amplified through DH1 E. coli cells and analysed by restriction endonuclease digestion. None of the plasmids rescued from the yeast cells appeared to be modified.

Yeast cells transformed with the above plasmids were grown in the presence of 2% glucose or 2% galactose to show the production of any polypeptides under the control of the yeast promoters. To determine whether any TGEV-specific polypeptides could be detected in transformed yeast, the cells were disrupted with glass beads and the cell lysates were dot-blotted onto nitrocellulose membranes and probed with a specific mouse monoclonal antibody as described in Experimental procedures. Lysates from yeast cells transformed with pYNP1 grown in the presence of 2% glucose or 2% galactose, and from cells transformed with pYNG1 and pYNG2 grown in the presence of 2% galactose but not 2% glucose, reacted with the monoclonal antibody. None of the other cell lysates reacted with the antibody.
The nucleoprotein was also detected in yeast spheroplasts using this monoclonal antibody by immunofluorescence, and in cell extracts by enzyme-linked immunosorbent assay (ELISA) (data not shown).

Cell lysates were prepared from cells transformed with pYNP1 and pYNP2 grown in the presence of 2% glucose, and from cells transformed with pYNG1, pYNG2 and pYNG3 grown in the presence of 2% glucose or 2% galactose, and analysed on 12% SDS-polyacrylamide gels. The fractionated polypeptides were electrophoretically transferred onto nitrocellulose membranes, which were probed as described. From Fig. 6A it can be seen that the TGEV nucleoprotein gene cloned in the correct orientation in plasmid pYNP1, and in plasmids pYNG1 and pYNG2 in the presence of 2% galactose, produced a polypeptide of Mr 47 000 detected by the nucleoprotein monoclonal antibody DA3. The nucleoprotein gene cassette cloned in the opposite orientation did not produce the polypeptide of Mr 47 000 or any other polypeptide detectable with DA3 under any conditions. Expression from the GAL1 promoter appeared to be more efficient than from the PGK promoter. The presence of the PGK termination sequence following the nucleoprotein gene under the control of the GAL1 promoter did not seem to increase the efficiency of expression. There was no distinguishable difference in size between the TGEV nucleoprotein produced in the yeast cells and that present in purified viral particles (Fig. 6B). Nucleoprotein degradation products, of which the main one had an Mr of 37 000 (Fig. 6A), also reacted with the nucleoprotein monoclonal antibody.

The complete nucleoprotein gene, from a British field isolate of TGEV, was cloned over two cDNA fragments. Both cDNA clones were sequenced and two potential open reading frames were identified in the viral sense strand.
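The sizes reported for the two open reading frames can be cross-checked with simple codon arithmetic. The following is a minimal Python sketch, assuming that the reported lengths (1149 bp and 237 bp) include the stop codon and that positions are 1-based; the function name `orf_summary` is hypothetical and is not part of the original analysis.

```python
def orf_summary(start: int, length_bp: int) -> dict:
    """Codon arithmetic for an ORF.

    start     -- 1-based position of the A of the initiating ATG
    length_bp -- ORF length in bases, assumed to include the stop codon
    """
    if length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = length_bp // 3
    return {
        "start": start,
        "end": start + length_bp - 1,  # last base of the stop codon
        "codons": codons,
        "amino_acids": codons - 1,     # the stop codon encodes no residue
    }


# Values taken from the text: ATG at position 38 (1149 bp) and at 1192 (237 bp).
nucleoprotein = orf_summary(38, 1149)
second_orf = orf_summary(1192, 237)

print(nucleoprotein["amino_acids"], second_orf["amino_acids"])
```

Under these assumptions the sketch gives 382 and 78 residues, matching the polypeptide lengths stated in the text.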
The initiation codons of both open reading frames were preceded by the heptameric sequence ACTAAAC, which is similar to the sequences preceding the nucleoprotein, matrix and spike genes and other open reading frames (whose products have not yet been identified) in the IBV and MHV genomes. The heptameric sequence contains the hexameric sequence, CTAAAC, found by Kapke and Brian (1986) for the Purdue strain of TGEV. It has been suggested that this sequence is involved in the initiation of mRNA synthesis for other coronaviruses (Shieh et al., 1987). The sequence context of the initiation codon in the nucleoprotein gene, TAAATGG, often occurs among functional eukaryotic initiator sequences, and the sequence at the initiation codon for the second open reading frame, GAGATGG, is also favoured for eukaryotic initiation sequences (Kozak, 1983). Both TGEV genes terminate with the same single stop codon, TAA. The 10-base sequence, TGGAAGAGCT, observed by Kapke and Brian (1986), found 69 bp from the poly(A) tail, is similar to that found in both IBV (81 bp from the poly(A) tail) and MHV (82 bp from the poly(A) tail) and may be a recognition site or binding site involved in the synthesis of the negative RNA strand (Boursnell et al., 1985; Kapke and Brian, 1986).

Fig. 6 legend. Samples were probed with a specific monoclonal antibody as described in the text. A. The cell-free extracts were as follows: lanes 1 and 2, the yeast recipient cell; lanes 3 and 4, cells transformed with pYNP1 and pYNP2; lanes 5 and 6, cells transformed with pYNG1; lanes 7 and 8, cells transformed with pYNG2; and lanes 9 and 10, cells transformed with pYNG3. The cells used for lanes 1, 3, 4, 5, 7 and 9 were grown in the presence of glucose and the cells used for lanes 2, 6, 8 and 10 were grown in the presence of galactose. B. Lanes 1 and 2 are extracts from cells transformed with pYNG3 and pYNG2 grown in the presence of galactose. Lane 3 contained fractionated proteins from purified TGEV virions.
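Locating short sequence signals such as the ACTAAAC heptamer, the CTAAAC hexamer and the ATG initiation codons amounts to an exact-match scan of the virus-sense strand. The sketch below illustrates such a scan in Python. The sequence `TOY` is an invented 21-base example built only to contain the motifs named in the text; it is not TGEV sequence, and the function name `find_motif` is hypothetical.

```python
def find_motif(seq: str, motif: str) -> list[int]:
    """Return the 1-based start positions of every occurrence of motif in seq.

    Overlapping occurrences are reported, because the search resumes one
    base after each hit rather than after the end of the hit.
    """
    seq, motif = seq.upper(), motif.upper()
    positions = []
    i = seq.find(motif)
    while i != -1:
        positions.append(i + 1)      # convert 0-based index to 1-based position
        i = seq.find(motif, i + 1)
    return positions


# Invented toy sequence (NOT viral sequence): heptamer, spacer, then an ATG
# placed in the TAAATGG context mentioned in the text.
TOY = "GGACTAAACTTCTAAATGGCC"

heptamer_sites = find_motif(TOY, "ACTAAAC")
hexamer_sites = find_motif(TOY, "CTAAAC")
start_codons = find_motif(TOY, "ATG")

print(heptamer_sites, hexamer_sites, start_codons)
```

The same function could be applied to a real sequence read from a file; positions are reported 1-based to match the numbering convention used for the ORFs in the text.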
Knowledge of the amino acid sequences of gene products from different viral strains can help to locate domains in the proteins conserved for their structure and function. Comparison of all the genes between virulent and avirulent strains may be useful for the identification of sequences involved in the pathogenicity of the virus. The amino acid sequences of the nucleoprotein from the FS772/70 and Purdue strains of TGEV differ by 2.1% and most of the changes are relatively conservative. However, the change at base 636 results in an amino acid substitution of cysteine in strain FS772/70 to arginine in Purdue.

There are three areas of homology between the nucleoproteins of TGEV (strains FS772/70 or Purdue), IBV (Beaudette) and MHV (JHM) (Figs. 2 and 3), indicating that all three coronavirus subgroups share common ancestry. The areas of homology, shown in Fig. 2, between TGEV amino acid residues 65-123 and 257-291 appear to be present to a similar extent in the three viruses. However, the homology in the region between amino acid residues 155-190 is much weaker with IBV than with MHV, indicating that TGEV more closely resembles MHV. Dot matrix analysis shows that this region between TGEV strains FS772/70 and Purdue has multiple repetitions (Fig. 3), probably resulting from a series of alignments of serine-arginine repeats, which could be a site for interaction with genomic RNA. There is no detectable homology between the nucleotide sequences of the nucleoprotein genes of the three viruses. This suggests that the three regions of protein homology may be involved in either RNA interaction or the formation of the correct conformation for interaction.

There is no antigenic homology between the nucleoproteins of TGEV, IBV and MHV (Pedersen et al., 1978), suggesting that the antigenic sites are present in the non-homologous regions. Motz et al. (1986) have shown that major antigenic sites on the Epstein-Barr virus 138 kD early protein occurred at hydrophilic β-turns.
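The dot-matrix comparisons referred to above (a window of 30 residues with a stringency of 8) can be illustrated with a small sliding-window sketch in Python. This is only an approximation under a stated assumption: a window is scored by counting identical residues, whereas the programs actually used (Staden; UWGCG) may score windows differently, for example with a residue similarity table. The function name `dot_matrix` and the toy sequence are hypothetical.

```python
def dot_matrix(a: str, b: str, window: int = 30, stringency: int = 8):
    """Return 0-based (i, j) window starts that would be plotted as dots.

    A dot is recorded when the windows a[i:i+window] and b[j:j+window]
    share at least `stringency` identical positions (identity scoring is
    an assumption of this sketch, not the published method).
    """
    hits = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in zip(a[i:i + window], b[j:j + window]))
            if matches >= stringency:
                hits.append((i, j))
    return hits


# Toy serine-arginine repeat compared with itself, using a short window so
# the effect is visible: the repeat matches at every in-register offset.
repeat = "SRSRSRSR"
hits = dot_matrix(repeat, repeat, window=4, stringency=4)
print(len(hits), hits[:4])
```

With a repetitive input such as this toy serine-arginine run, dots appear on several parallel diagonals rather than on the main diagonal alone, which is the kind of pattern the text attributes to the serine-arginine repeats.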
Determination of the secondary structure of TGEV nucleoprotein using the PREDICT program and the hydrophilicity of the primary structure showed that the areas of non-homology with IBV and MHV contained potential hydrophilic β-turns, indicating that the areas responsible for antigenicity may fall within these regions. None of the amino acid substitutions between the TGEV FS772/70 and Purdue strains fall within the two main homology regions, except for a conservative serine/threonine substitution at amino acid position 262. This indicates that the differences between the nucleoproteins of the two strains will probably not be responsible for the difference in the pathogenicity of the two strains, though there may be antigenic variations.

The 276-bp open reading frames of the two TGEV strains differed in amino acid sequence by 10.3%. Most of the changes observed are in the first base of the triplet, although they also resulted in conservative amino acid substitutions, with the exception of the alanine (FS772/70) to arginine (Purdue) change at position 39, which results from a double substitution at positions 1 and 2 of the triplet. Substitution of asparagine for aspartate at position 30 of the gene results in the introduction of a potential N-glycosylation site in the FS772/70 strain as compared to the Purdue strain. To date, no product has been identified in either infected cells or virions that corresponds to this polypeptide. However, Garwes (unpublished results) has detected a polypeptide of Mr 12 000-14 000, produced in infected cells, which does not appear to be incorporated into virions. The polypeptide is probably similar to the polypeptide of Mr 17 000 identified by Wesley and Woods (1986) and later sized as Mr 14 000 (Wesley and Woods, 1987) produced by the Purdue strain of TGEV.
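The percentage differences quoted for the two strains can be turned into approximate numbers of substituted residues. The short Python sketch below does this arithmetic for the nucleoprotein (2.1% of 382 residues) and the smaller polypeptide (10.3% of 78 residues). The resulting counts are inferred from the percentages and are not stated in the text; the function name `implied_substitutions` is hypothetical.

```python
def implied_substitutions(percent_diff: float, length_aa: int) -> float:
    """Number of differing residues implied by a percentage difference."""
    return percent_diff / 100 * length_aa


# Percentages and lengths as quoted in the text.
nucleoprotein_subs = implied_substitutions(2.1, 382)
small_orf_subs = implied_substitutions(10.3, 78)

print(round(nucleoprotein_subs, 2), round(small_orf_subs, 2))
```

Both values round to about eight residues, which suggests that a similar number of substitutions has a much larger proportional effect on the 78-residue polypeptide than on the nucleoprotein.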
The presence of a potential glycosylation site at amino acid 30 of FS772/70, which may add up to 2000 to the molecular weight of the polypeptide, and the hydrophobicity of the molecule may cause the difference in the molecular weight between the theoretical and observed values. Three oligopeptides have been synthesized for the production of antibodies in order to probe infected cells and virions for the presence of the polypeptide of Mr 9068.

The TGEV nucleoprotein gene was constructed from the two cDNA clones and sandwiched between BamHI linkers to produce a gene cassette for transfer to suitable expression vectors. The gene cassette, inserted in the correct orientation in the yeast expression plasmids, produced a polypeptide under the control of the yeast promoters that was identical in size to the nucleoprotein found in TGEV virions and reacted with a TGEV nucleoprotein monoclonal antibody. This observation demonstrated that the nucleoprotein gene construction was correct. To our knowledge, this is the first coronavirus protein expressed in yeast cells. The TGEV nucleoprotein gene or its product did not appear to be toxic to the yeast cells. The nucleoprotein gene inserted in the wrong orientation in the yeast expression vectors did not produce the TGEV nucleoprotein, indicating that the viral cDNA did not contain sequences capable of acting as a promoter in the yeast cells.

The PGK (phosphoglycerate kinase) promoter has been shown to be very efficient (Mellor et al., 1985), which makes it useful for the expression of foreign proteins. Expression of the TGEV nucleoprotein gene under the control of the PGK promoter was constitutive, as observed for other foreign genes (Mellor et al., 1983). To overcome any potential toxicity of the TGEV nucleoprotein, the gene cassette was also subcloned into a yeast expression vector under the control of the GAL1 inducible promoter.
This allowed selection of yeast cells containing the recombinant plasmids with the nucleoprotein gene but without expression from the GAL1 promoter. The TGEV nucleoprotein was expressed in the presence of galactose but not glucose, and the product was the same size and reacted with the same monoclonal antibody as the product obtained under the control of the PGK promoter. The level of expression observed under the control of the GAL1 promoter was higher than that observed under the control of the PGK promoter, although the latter is known to be one of the most efficient yeast promoters when acting with its natural gene (Mellor et al., 1985). Comparison of the yeast (Sharp et al., 1986) and the TGEV nucleoprotein codon usage showed that there was little difference in codon usage (data not shown), which should not cause any reduction in efficiency of expression.

The presence of yeast transcription termination sequences has been found to enhance the efficiency of expression. The expression plasmid containing the PGK promoter contains a termination site downstream of the cloning site used, whereas the plasmid containing the GAL1 promoter contains no yeast termination sequence. No apparent increase in expression was observed between the products from the GAL1 promoter in the presence or absence of the PGK terminator.

In addition to the TGEV nucleoprotein gene, the BamHI cassette also encoded the second potential open reading frame. While several genes can be translated from polycistronic mRNA in E. coli, no polycistronic mRNA has been found in yeast. Expression occurs only from the first open reading frame downstream of a promoter when an artificial or foreign gene is introduced into yeast cells, as observed with other eukaryotic cells. Thus it is unlikely that any expression will occur from the second open reading frame.
The antibodies raised against one of the three synthetic oligopeptides have been used in an ELISA test on yeast extracts for the presence of the potential product of this open reading frame, but no product has been detected so far (data not shown).

Strains, plasmids and media

E. coli strains DH1, C600 and JM83 were used for routine plasmid construction. E. coli transformants were selected on LB plates containing ampicillin (100 µg/ml). Saccharomyces cerevisiae strain BWG1-7A ([...] GAL+; Guarente et al., 1982) was used as a host strain for the expression of recombinant plasmids. The constitutive PGK expression vector, pMA91, has been described by Mellor et al. (1983). The inducible GAL1-GAL10 expression vector, pB620, was derived from pCGE329 by the addition of a BamHI site to the GAL1 side of the 0.74-kbp GAL1-GAL10 divergent promoter (Boeke et al., 1985). Yeast cells were grown in synthetic media containing 2% glucose or galactose and 0.67% yeast nitrogen base (Difco) supplemented with the required amino acids.

TGEV strain FS772/70 was grown in LLC-PK1 cells in the presence of [3H]-uracil and actinomycin D (Garwes et al., 1984). Messenger RNA was isolated from the cells and purified from other RNA species on poly(U) Sepharose as described previously (Britton et al., 1987). The TGEV mRNA species were fractionated on 15-30% (w/v) linear sucrose gradients in 50 mM Tris-HCl (pH 7.4), 100 mM LiCl, 1 mM EDTA and 0.1% SDS, for 5 h at 110 000 × g(av) and 10°C using an MSE 65 preparative ultracentrifuge.

Standard recombinant DNA methods were used (Maniatis et al., 1982) with enzymes purchased from New England Biolabs (CP Laboratories, Bishop's Stortford, England) unless otherwise stated in the text. DNA fragments were isolated from agarose gels by electroelution. Ligation reactions were carried out as described by Britton et al. (1984). E. coli cells were routinely transformed using the RbCl method developed by V. Simanis (Hanahan, 1985).
Vector DNA was routinely treated with alkaline phosphatase prior to ligation. Yeast cells were transformed using the spheroplast method described by Rothstein (1985).

Preparation of the oligo(dT)-tailed vector-primer. Purified pUC9 was digested using PstI, and homopolymer tails of about 50 deoxythymidylate (dT) residues were added to the DNA in a solution containing 0.14 M potassium cacodylate, 30 mM Tris [...] (Okayama and Berg, 1982). The extent of tailing was analysed by digestion of the vector-primer with HaeII. An increase in the size of the two smallest fragments, compared to PstI- and HaeII-digested pUC9, indicated the addition of oligo(dT) tails. One of the dT homopolymer tails was removed by digestion with BamHI and EcoRI and chromatography on CL-6B Sepharose (Pharmacia Ltd.).

Synthesis of first-strand cDNA. A sample of the oligo(dT)-tailed vector-primer was added to 2 µg of RNA solution, containing the 1.7 kb TGEV mRNA species (prepared as described above), previously denatured at 60°C for 5 min. Synthesis of first-strand cDNA was carried out in a solution containing 50 mM Tris-HCl, pH 8. [...]

Synthesis of second-strand cDNA. The second strand of cDNA was produced by a modification of the RNA/DNA replacement method of Okayama and Berg (1982). The vector-primer containing the mRNA/cDNA hybrid was redissolved in 50 µl of 20 mM Tris-HCl, pH 7.5, 4 mM MgCl2, 10 mM (NH4)2SO4, 125 mM KCl, 0.15 mM NAD+, 25 µg nuclease-free BSA, 25 µM dNTPs, 20 µCi [32P]-dCTP (3200 Ci/mmol, New [...]

Cloning using a specific restriction fragment for priming cDNA synthesis. The 90 bp HindIII-XbaI fragment (Fig. 1) from plasmid pTS15-1 was purified. A sample of the fragment was boiled for 5 min and added to poly(A) mRNA, isolated from TGEV-infected LLC-PK1 cells, at 60°C. The mixture was incubated at 60°C for 5 min and cooled to 42°C. First- and second-strand cDNA syntheses were carried out as described above.

Transformation of cDNA clones into E.
coli. cDNA was end-repaired using T4 DNA polymerase as described by Maniatis et al. (1982). The cDNA-vector-primer was self-ligated using T4 DNA ligase at 15°C for 24 h, followed by treatment with RNA ligase (Pharmacia Ltd.) for a further 15 h. The cDNA derived from the restriction fragment primer was ligated to phosphorylated EcoRI linkers (Pharmacia Ltd.). cDNA of 1 kbp was purified and ligated into pUC9 previously cut with EcoRI. The ligation mixtures were routinely diluted five-fold and used to transform E. coli cells. cDNA derived using the vector-primer method was transformed into strain JM83, and cDNA derived from the restriction fragment primer method was transformed into strain C600, using the method of Hanahan (1983).

All ampicillin-resistant colonies were replica-plated onto Biodyne membranes (P/N BNNG225, 1.2 µm; Pall, Portsmouth, England) and the plasmid copy number was amplified by transferring the membranes onto LB plates containing chloramphenicol (25 µg/ml; Hanahan and Meselson, 1980). The cells were lysed according to the method of Grunstein and Hogness (1975), but including the 10% SDS step described by Maniatis et al. (1982). The membranes were pre-hybridized in buffer containing formamide and poly(A) (Boehringer) as described by Maniatis et al. (1982). 32P-labelled TGEV poly(A)-containing RNA (6x10^ c.p.m.) was used to probe the cDNA for 42 h at 42°C. The membranes were then washed four times at room temperature in 2× SSC containing 0.1% SDS (1× SSC = 0.15 M NaCl, 0.015 M trisodium citrate, pH 7.0) and once in 1× SSC containing 0.1% SDS at 68°C for 1.5 h, and autoradiographed. Plasmid DNA was prepared and purified from positive clones as described by Britton et al. (1984).

TGEV poly(A)-containing RNA was glyoxylated as described by Maniatis et al. (1982) and separated on a 1% agarose gel. The RNA was immediately transferred onto Biodyne membranes in 20× SSC for 18 h and baked at 80°C for 2 h.
Nick translation of recombinant plasmid DNA and hybridization to TGEV RNA. [...] were removed by chromatography through Sephadex G-50. Labelled recombinant plasmid DNA was used to probe the TGEV mRNA Northern blots.

Random subclones of the cDNA inserts from pTS15-1 and pF4F-36 were generated by purifying HindIII, PvuII or the HaeIII-PvuII fragments from pTS15-1 and the EcoRI fragment from pF4F-36 and cloning sonicated (Deininger, 1983) fragments into SmaI-cut, phosphatase-treated M13mp10 (Amersham International). In addition, various restriction endonuclease fragments, deduced from the restriction maps (Fig. 1), were cloned into M13mp vectors to cover areas of sequence partially covered by random clones. M13/dideoxynucleotide sequencing (Sanger et al., 1977; Bankier and Barrell, 1983) was carried out using [α-³⁵S]-dATP (Amersham International) and sequencing reactions were analysed on buffer gradient gels (Biggin et al., 1983). A sonic digitizer (Graf/Bar, Science Accessories Corporation) was used to read the data, which were analysed on a VAX 11/750, using the programs of Staden (1982, 1984) and UWGCG (Devereux et al., 1984). Protein secondary structure was determined from its primary structure using PREDICT, a program that compares eight different prediction algorithms (Eliopoulos et al., 1982).

Yeast cells were isolated at mid-exponential phase, pelleted and resuspended in 1 ml of phosphate-buffered saline (PBS: 10 mM potassium phosphate, 150 mM NaCl, pH 7.2) containing 1 mM PMSF (phenylmethylsulphonyl fluoride) and 5 mM benzamidine. Cell-free extracts were obtained by disruption of the cells with glass beads (diam. 0.45 mm) for 3 min at 4°C and clarified by centrifugation. Samples containing 100 µg of protein were fractionated on 12% SDS-polyacrylamide gels as described by Britton et al. (1982). Following electrophoresis, the separated proteins were transferred onto nitrocellulose membranes (Schleicher & Schuell, BA85,
pore size 0.45 µm) using an electroblotting apparatus (Biorad Laboratories Ltd.) at 100 V for 2 h in electrophoresis running buffer with 0.025% SDS and 20% methanol. Nitrocellulose membranes containing the fractionated yeast proteins were washed several times with PBS containing 1% gelatine and 0.1% Tween 20. The presence of TGEV nucleoprotein was determined using a monoclonal antibody (DA3, previously raised against TGEV nucleoprotein in mice; Garwes et al., 1987), rabbit anti-mouse antibodies, and 5-10 µCi ¹²⁵I-protein A (Amersham International plc, 32 mCi mg⁻¹ Protein A, code No. IM112) per blot.

Shotgun DNA sequencing
Buffer gradient gels and ³⁵S label as an aid to rapid DNA sequence determination
Ty elements transpose through an RNA intermediate
Sequences of the nucleocapsid genes from two strains of avian infectious bronchitis virus
Towards a genetically-engineered vaccine against porcine transmissible gastroenteritis virus
Expression of porcine transmissible gastroenteritis virus genes in E.
coli as β-galactosidase chimaeric proteins
Location and direction of transcription of the ptsH and ptsI genes on the Escherichia coli K12 genome
Phosphotransferase-mediated regulation of carbohydrate utilisation in Escherichia coli K12: identification of the products of genes on the specialised transducing phages λiex(crr) and λgsr(tgs)
Random subcloning of sonicated DNA: application to shotgun DNA sequence analysis
Coronavirus cell-associated RNA-dependent RNA polymerase
A comprehensive set of sequence analysis programs for the VAX
A structural model for the chromophore-binding domain of ovine rhodopsin
Coronaviruses in animals
The polypeptide structure of transmissible gastroenteritis virus
Defective replication of porcine transmissible gastroenteritis virus in a continuous cell line
Identification of heat-dissociable RNA complexes in two porcine coronaviruses
Identification of epitopes of immunological importance on the peplomer of porcine transmissible gastroenteritis virus
Colony hybridization: a method for the isolation of cloned DNAs that contain a specific gene
A GAL10-CYC1 hybrid yeast promoter identifies the GAL4 regulatory region as an upstream site
Studies on transformation of Escherichia coli with plasmids
Techniques for transformation of Escherichia coli
Characterization and translation of transmissible gastroenteritis virus mRNAs
Sequence analysis of the porcine transmissible gastroenteritis coronavirus nucleocapsid protein gene
Comparison of initiation of protein synthesis in prokaryotes, eukaryotes and organelles
Molecular Cloning.
A Laboratory Manual
Factors affecting heterologous gene expression in Saccharomyces cerevisiae
Efficient synthesis of enzymatically active calf chymosin in Saccharomyces cerevisiae
Expression of hepatitis B surface antigen gene in yeast
Expression of the Epstein-Barr virus 138 kDa early protein in Escherichia coli for use as antigen in diagnostic tests
High-efficiency cloning of full-length cDNA
Antigenic relationship of the feline infectious peritonitis virus to coronaviruses of other species
Cloning in yeast
DNA sequencing with chain terminating inhibitors
Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes
The 5'-end sequence of the murine coronavirus genome: implications for multiple fusion sites in leader-primed transcription
The biology of coronaviruses
Coronavirus JHM, nucleotide sequence of the mRNA that encodes nucleocapsid protein
Codon preference and its use in identifying protein coding regions in long DNA sequences
Automation of the computer handling of gel reading data produced by the shotgun method of DNA sequencing
Graphic methods to determine the function of nucleic acid sequences
Coronavirus multiplication strategy. I. Identification and characterisation of virus-specified RNA
Coronavirus multiplication: the locations of genes for the virion proteins on the avian infectious bronchitis virus genome
Phosphoproteins of murine hepatitis virus
Synthesis and assembly of hepatitis B virus surface antigen particles in yeast
Identification of a 17,000 molecular weight antigenic polypeptide in transmissible gastroenteritis virus-infected cells
Antibody response in swine to individual transmissible gastroenteritis virus (TGEV) proteins

This research was supported by the Biomolecular Engineering Programme of the Commission of the European Communities, Contract No. GB1-2-089-UK. R. S. Cármenes (from Departamento de Bioquímica, Universidad de Oviedo, Spain) was supported by an EEC training grant. Dr F.
Parra was supported by a NATO Fellowship. We would like to thank Dr R. Serrano (EMBL, Heidelberg, RFA) for supplying plasmid pB620 and yeast strain BWG1-7A, Dr G. Yarranton (CellTech Ltd.) for plasmid pMA91, and Miss Fiona Stewart of this Institute for the monoclonal antibody DA3.