key: cord-0687860-i17gweqp authors: Kapke, Paul A.; Tung, Frank Y.T.; Hogue, Brenda G.; Brian, David A.; Woods, Roger D.; Wesley, Ronald title: The amino-terminal signal peptide on the porcine transmissible gastroenteritis coronavirus matrix protein is not an absolute requirement for membrane translocation and glycosylation date: 1988-08-31 journal: Virology DOI: 10.1016/0042-6822(88)90581-8 sha: bf78f453e4f454f92454f388f5167542c4825f33 doc_id: 687860 cord_uid: i17gweqp Abstract cDNA clones mapping within the first 2601 bases of the 3′ end of the porcine transmissible gastroenteritis coronavirus (TGEV) genome were sequenced by the method of Maxam and Gilbert and an open reading frame yielding a protein having properties of the matrix (M or E1) protein was identified. It is positioned at the 5′ side of the nucleocapsid (N) gene from which it is separated by an intergenic stretch of 12 bases. The deduced M protein comprises 262 amino acids, has a molecular weight of 29,544, is moderately hydrophobic, and has a net charge of +7 at neutral pH. Thirty-four percent of its amino acid sequence is homologous with the M protein of the bovine coronavirus (BCV), 32% with that of the mouse hepatitis coronavirus (MHV), and 19% with that of the avian infectious bronchitis coronavirus (IBV). Judging from alignment with the BCV, MHV, and IBV M proteins, the amino terminus of the TGEV M protein extends 54 amino acids from the virion envelope which compares with only 28 for BCV, 26 for MHV, and 21 for IBV. Eleven of the sixteen amino-terminal amino acids are hydrophobic and the positions of charged amino acids around this sequence suggest that the first 16 amino acids comprise a potentially cleavable signal peptide for membrane insertion. A similar sequence is not found in the M proteins of BCV, MHV, or IBV. 
When mRNA from infected cells, or RNA prepared by in vitro transcription of the reconstructed M gene, was translated in vitro in the presence of microsomes, the M protein became translocated and glycosylated. When a protein without the amino-terminal signal peptide was made by translating a truncated version of the M gene transcript, some translocation and glycosylation also occurred, suggesting that the amino-terminal signal peptide on the TGEV M protein is not an absolute requirement for membrane translocation. Interestingly, the amino-terminal peptide did not appear to be cleaved during in vitro translation in the presence of microsomes, suggesting that a step in virion assembly may be required for proper exposure of the cleavage site to the signal peptidase.
The porcine transmissible gastroenteritis coronavirus (TGEV) comprises three major structural proteins: an internal nucleocapsid phosphoprotein (N) of 43 kDa and two glycosylated envelope proteins, one of 29 kDa (a matrix-like protein, M or E1) and one of 200 kDa (the peplomeric, P, or E2 protein) (Brian et al., 1983; Garwes and Pocock, 1975; Kapke and Brian, 1986; Wesley and Woods, 1986). While the 200-kDa P glycoprotein is demonstrably important in stimulating neutralizing antibody (Garwes et al., 1978), the 29-kDa M glycoprotein may also be important, especially if complement is part of the virus-antibody reaction (Woods et al., 1987). To investigate the role of individual viral proteins in virus replication and in induction of immunity, we have prepared cDNA clones beginning from the polyadenylated 3' end of the TGEV genome and examined the sequences of potential genes (Kapke and Brian, 1986). Within the first (3') 2000 bases, we deduced, from an examination of open reading frames, a noncoding region of 276 bases, and genes for a 9101 mol wt hypothetical hydrophobic polypeptide, a 43,426 mol wt nucleocapsid protein, and part of a matrix protein, arranged in that order from the 3' end of the genome.
Assuming that a conserved intergenic sequence would be found in TGEV as has been found in the mouse hepatitis coronavirus (MHV) (Budzilowicz et al., 1985) and the avian infectious bronchitis coronavirus (IBV) (Brown and Boursnell, 1984), we prepared a synthetic oligodeoxynucleotide that is complementary to the TGEV intergenic sequence and used it as a primer for first-strand DNA synthesis in the preparation of additional genomic cDNA clones. Several cDNA clones were thus prepared and seven that mapped within the first (3') 2601 bases were sequenced in part and another clone was sequenced completely to derive a potential gene sequence for the M protein. The nucleotide sequence predicted an M protein that shared many features with the M proteins of the mouse hepatitis virus, the bovine coronavirus, and the avian infectious bronchitis virus, but also predicted an unexpected, potentially cleavable amino-terminal signal peptide that makes the TGEV M protein strikingly different. In this study, we have confirmed our preliminary report of the M gene sequence (Kapke et al., 1987) and have examined the behavior of the amino-terminal peptide during synthesis of the M protein.
MATERIALS AND METHODS
The Purdue strain of TGEV was grown on swine testicle (ST) cells as previously described (Kapke and Brian, 1986).
cDNA cloning of TGEV genomic RNA
Virus genomic RNA was prepared as previously described (Kapke and Brian, 1986). cDNA cloning was accomplished by the method of Gubler and Hoffman (1983) essentially as described (Kapke and Brian, 1986) except that the synthetic oligodeoxynucleotide 5'TTAGAAGTTTAGTTA3' was used as primer for first-strand cDNA synthesis. The primer was synthesized by the phosphoramidite method and was purified by polyacrylamide gel electrophoresis. Clones were selected by colony hybridization to random-primed cDNA prepared from size-selected genomic RNA (Kapke and Brian, 1986).
Clones were initially physically mapped to obtain their approximate position on the genome by using a matrix cross-hybridization method in which plasmid DNA from individual clones was probed with purified inserts or segments of inserts that had been radiolabeled with 32P by nick-translation.
DNA sequencing and sequence analyses
DNA sequencing was done by the chemical method of Maxam and Gilbert (1980) and sequence analyses were done with the aid of the computer programs developed by Queen and Korn (1984).
To reconstruct the full-length M gene, clones FT36 and C4 were digested with AccI and the small fragment of FT36 (which contained the first 782 bases of the sequence shown in Fig. 2) and the large fragment of C4 (which contained bases 783 through 934, an oligo(dC) tail of 13 bases, and the rest of the pUC9 sequence) were ligated and used to transform Escherichia coli strain 294. The insert was removed from this plasmid using PstI, digested with Bsp1286 to remove the 5' 114 bases, and blunt-ended with mung bean nuclease. The fragment now contained 13 bases upstream (5'-ward) from the CTAAAC presumed intergenic sequence (or 22 bases upstream from the presumed M gene start codon) and extended to the end of the 13-base C-tail which begins 9 bases downstream from the M gene stop codon (bases 922-924 in Fig. 2). The blunt-ended fragment was ligated into the SmaI site of the pGEM3 vector (Promega Biotec) and the orientation yielding a sense-strand RNA by transcription with SP6 polymerase was chosen. The construct was designated pGEM-M-1. Transcripts were prepared from EcoRI-cut plasmid using the SP6 transcription system marketed by Promega Biotec. To reconstruct the truncated version of the M gene (i.e., the M gene with no N-terminal signal peptide), clone pGEM-M-1 was cut with SphI, which cuts within the multiple cloning region of the pGEM vector and between bases 174 and 175 in Fig. 2, and the resulting large fragment was isolated, religated, and designated pGEM-M-2.
Translation of pGEM-M-2 transcripts allows initiation at the second AUG downstream from the CTAAAC intergenic sequence (beginning at base 200 in Fig. 2), i.e., at the fifth amino acid downstream from the potential peptidase cleavage site between glycine and lysine (von Heijne, 1986). The sequences at the 5' junction of the insert (virus sense) and the vector were confirmed by sequencing for both the pGEM-M-1 and pGEM-M-2 constructs.
Cells were grown to confluency in 850-cm2 roller bottles (Falcon) and infected with a multiplicity of infection of approximately 10. At 6 hr p.i., cells were rinsed twice with Earle's balanced salt solution (EBSS), scraped from the bottle, transferred to a 50-ml conical tip polypropylene tube, and pelleted. Cells from five roller bottles yielded a pellet of approximately 5 ml and constituted one batch for RNA extraction. Cells were lysed by the addition of 5 vol (25 ml to a 5-ml cell pellet) of lysis buffer containing 10 mM Tris-HCl (pH 7.0), 10 mM NaCl, 5 mM MgCl2, 1% NP-40 (v/v) at room temperature followed by vigorous vortexing for 10 sec. Nuclei were removed immediately by centrifugation at 4000 x g for 5 min, and to the supernatant was added 0.2 vol of 10% SDS (w/v) in water and 4 mg proteinase K crystals. The solution was incubated 30 min at 37° and extracted with an equal volume of phenol-chloroform-isoamyl alcohol (24:24:1), and RNA was precipitated with 2.2 vol of ethanol after adding 0.1 vol 3 M Na acetate. RNA from a 5-ml cell pellet was dissolved in water and polyadenylated RNA was selected by oligo(dT)-cellulose chromatography using a binding buffer of 0.5 M NaCl, 0.1 M Tris-HCl (pH 7.5), 0.2% SDS (w/v), and an elution buffer of 0.1 M Tris-HCl (pH 7.5), 0.2% SDS (w/v). Polyadenylated RNA was ethanol precipitated with Na acetate, dissolved in 100 µl water, ethanol precipitated without salt, dissolved in 100 µl water, and 5 µl of this solution was used in a 50-µl translation reaction.
In vitro translation was done using a wheat germ system (Amersham) in a 50-µl reaction volume that contained 25 µl wheat germ extract, 1 µl 1 mM amino acid solution deficient in methionine (Promega Biotec), 3 µl 1 M KAc (to make a final K+ concentration of 1.17 mM), 2 µl RNasin (Promega Biotec), 4 µl microsomes (Amersham, Promega Biotec, or as a gift from Dr. Peter Walter, University of California School of Medicine, San Francisco) or 4 µl microsome blank solution when microsomes were left out, 5 µl 300 µM octanoyl-asparagine-leucine-threonine or 5 µl water when the tripeptide was left out, 5 µl [35S]methionine (>800 Ci/mmol, New England Nuclear), and 5 µl RNA. Octanoyl-asparagine-leucine-threonine, a competitive inhibitor of N-linked glycosylation (Lau et al., 1983), was a kind gift from Dr. Fred Naider, City University of New York, and was prepared as a 300 µM stock in an aqueous solution containing 25% dimethyl sulfoxide. Translations were done for 1 hr at 25°. Sodium carbonate treatment of microsomes followed the procedure of Fujiki et al. (1982). Deglycosylation of translation products was done with N-glycanase (Genzyme Corp.) or with endoglycosidase H (ICN) using methods recommended by the manufacturers. Immunoprecipitates were prepared as described by Anderson and Blobel (1983) except that iodoacetamide was not used to block SH groups prior to electrophoresis. Five microliters porcine hyperimmune TGEV-specific serum was used per 50 µl translation volume, and protein A-Sepharose CL-4B (Pharmacia) was used to adsorb the immunoprecipitates. Porcine hyperimmune anti-TGEV serum was produced in a specific pathogen-free gilt (gilt 53) and was a kind gift from Dr. Lorant Kemeny, National Animal Disease Center (Ames, IA) (Kemeny, 1976). Preprolactin mRNA was generated with SP6 polymerase from cloned cDNA kindly given to us by Drs. William Hansen and Peter Walter, University of California, San Francisco, and β-lactamase mRNA was obtained from Promega Biotec.
FIG. 1. Sequencing strategy used to derive the TGEV M gene sequence. cDNA clones FG5, C4, F5, E2, FT36, FT35, and FT44 were cloned into the PstI site of vector pUC9 and were all found to be in the same orientation with respect to the virus genomic RNA illustrated at the top of the figure. That is, the 3' end of the insert is near the HindIII site in the multiple cloning region, and the 5' end is near the SalI site in the multiple cloning region. FT43 was likewise cloned but was found to be in the opposite orientation. Nucleotide position 1 on the restriction map sequence is the first base at the 5' end (virus-sense) of the FT36 insert. Symbols on the map mark labeled sites: sites on fragments of clone FG5 labeled at the 3' end of DNA with reverse transcriptase and at the 5' end with polynucleotide kinase; 3' end-labeling with reverse transcriptase at the SalI site in the multiple cloning region of clones C4, F5, E2, and FT36; 3' end-labeling with reverse transcriptase at the HindIII site in the multiple cloning region of clones C4, F5, E2, FT36, and FT35; and 3' end-labeling with reverse transcriptase at the XhoII site in clones E2 and FT43, or at the HinfI site on clone FT44.
For labeling intracellular M protein, ST cells in 60-mm plastic petri dishes were infected with a multiplicity of infection of approximately 10, incubated 1 hr, and refed, after rinsing, with 10 ml per dish of minimum essential medium containing 5% normal methionine concentration, 10% fetal calf serum (Sterile Systems), and 200 µCi [35S]methionine (Translabel, ICN). Where indicated, tunicamycin (Sigma) at a final concentration of 2 µg/ml was included in the medium used for refeeding. At 6 hr p.i., cells were rinsed with EBSS, scraped into a 15-ml conical tip polypropylene tube and pelleted, and lysate was prepared by adding 0.5 ml phosphate-buffered saline, 1% NP-40, 10 units Aprotinin (Sigma)/ml, and incubating the mixture at 25° for 30 min with frequent vortexing.
Nuclei and cell debris were removed by centrifugation at 13,000 g for 5 min and 50 µl cell lysate supernatant was used in an immunoprecipitation reaction as described above for products of a 50-µl in vitro translation reaction. The M protein was immunoprecipitated with 10 µl M-specific monoclonal antiserum (identified as 1A6; Woods et al., 1987). For labeling virion M protein, cells grown in 150-cm2 plastic flasks were infected as described above, and incubated with 500 µCi [35S]methionine (Translabel, ICN) per flask. At 18 hr p.i., virus was purified from clarified supernatant fluids by isopycnic sedimentation in sucrose gradients as previously described (Brian et al., 1980). Virion proteins were solubilized in 4% SDS, and M protein was immunoprecipitated with M-specific monoclonal antibody and deglycosylated with N-glycanase. In vitro translation reaction products or immunoprecipitates on protein A-Sepharose CL-4B beads were diluted with an equal volume of 2X Laemmli sample treatment buffer [1X sample treatment buffer is 0.0625 M Tris-HCl (pH 6.8), 2% SDS, 10% glycerol, 5% 2-mercaptoethanol] that contained 5 M urea, heated 2 min at 100°, and electrophoresed using the method of Laemmli (1970).
Deduced amino acid sequence of the matrix protein
Seven clones, C4, E2, F5, FT35, FT36, FT43, and FT44, mapping in the positions illustrated in Fig. 1, were sequenced in part to extend the TGEV genomic sequence that was known from clones FG5 and 121 (Kapke and Brian, 1986). Clone FG5 maps at the extreme 3' end of the genome and contains the sequence for the hypothetical hydrophobic protein gene, the N gene, and part of the M gene. Identification of the third open reading frame as the M gene sequence was based on regions of extensive amino acid homology with the M proteins of MHV (Armstrong et al., 1984) and IBV (Boursnell et al., 1984). The sequencing strategy we used is described in Fig. 1.
The molecular weight of the glycosylated M protein has been estimated from electrophoretic migration patterns to be approximately 28 to 30 kDa (Brian et al., 1983; Garwes and Pocock, 1975; Wesley and Woods, 1986). We therefore anticipated that we would be able to deduce from the gene sequence a molecular weight of 28 kDa or less for the unglycosylated protein. [...] (Kapke and Brian, 1986) yielded a protein of 289 amino acids having a molecular weight of greater than 32,000 (Fig. [...]). We hypothesized that this is unlikely, however, based on the documented evidence for leader-primed transcription in coronavirus replication (Makino et al., 1986b), and on the existence of a primer binding-like intergenic sequence early in the open reading frame. The most probable site for initiation of transcription of the M message is suggested by the sequence CTAAAC beginning at base 128 in Fig. 2, which is part of a conserved intergenic sequence in the TGEV genome. It is a sequence found in total and again in part between the M and N genes beginning at base 926 in Fig. 2, and also between the N and hypothetical hydrophobic protein genes (Kapke and Brian, 1986). It is also part of the intergenic sequence found in the MHV genome (Budzilowicz et al., 1985). If CTAAAC functions as part of an intergenic sequence that directs leader-primed synthesis and thereby defines the start of the M transcript for TGEV, then the M protein coding sequence could start with the first available methionine 3'-ward of the CTAAAC sequence, a codon that begins at base 137 in Fig. 2. It could also start with the second, downstream, in-frame AUG codon beginning at base 200, but this is surrounded by a much less favorable Kozak consensus sequence (Kozak, 1983).
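The search for candidate initiation codons described above can be illustrated with a minimal sketch. It assumes a DNA-sense sequence string, takes the first ATG 3'-ward of the CTAAAC motif as the reading frame, and lists every in-frame ATG up to the first stop codon. The function name and the toy sequence are hypothetical and are not taken from the TGEV sequence in Fig. 2; indices are 0-based, unlike the 1-based numbering of the figure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_start_codons(seq, motif="CTAAAC"):
    """Return 0-based indices of in-frame ATG codons downstream of `motif`.

    The reading frame is set by the first ATG found after the motif, and
    scanning stops at the first in-frame stop codon.
    """
    motif_at = seq.find(motif)
    if motif_at < 0:
        return []
    first_atg = seq.find("ATG", motif_at + len(motif))
    if first_atg < 0:
        return []
    starts = []
    for i in range(first_atg, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            break
        if codon == "ATG":
            starts.append(i)
    return starts


# Toy sequence (hypothetical): motif, 3 spacer bases, then an ORF with two ATGs.
TOY = "GGTT" + "CTAAAC" + "GAA" + "ATG" + "AAGATTTTGTTA" + "ATG" + "GCTTGT" + "TAA" + "CC"
print(in_frame_start_codons(TOY))
```

In this toy case the two indices returned play the roles of the first and second AUG discussed in the text; choosing between them on the basis of flanking sequence context is not modeled here.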
To test our hypothesis, we prepared transcripts identical to the postulated functional mRNA structure (transcripts from the reconstructed full-length M gene, construct pGEM-M-1, that would initiate translation at the first AUG downstream from the CTAAAC sequence) and compared sizes of the resulting translation products with those of mRNA isolated from infected cells. As shown in Fig. 3A, lanes 2 and 5, M protein translated in vitro from cell-derived, poly(A)-selected mRNA and immunoprecipitated with TGEV-specific antibody had an electrophoretically determined molecular weight of 25K and comigrated with protein translated from the reconstructed full-length M gene. Protein translated from the reconstructed full-length M gene immunoprecipitated with the same antibody, thus confirming its authenticity (Fig. 3A, lane 8). Furthermore, truncated M protein (generated from construct pGEM-M-2), although also antigenically authentic, migrated distinctly faster than full-length M but with a migration rate much less than expected for a 2-kDa molecular weight difference (Fig. 3A, lanes 9 and 12). The small difference in electrophoretic mobility between the two forms of M again is an ostensible function of the hydrophobic nature of the protein. The electrophoretically determined molecular weight of the truncated M polypeptide is 24.5K. Judging from the size of the various translation products, it is unlikely that initiation of translation in vivo starts at any place other than at the first AUG downstream from the CTAAAC intergenic sequence. When the first methionine codon downstream from the CTAAAC sequence is used as the initiation site for translation, the deduced M protein comprises 262 amino acids and has a molecular weight of 29,544. It is moderately hydrophobic with 44% of its amino acids being hydrophobic, and is basic since it carries a net charge of +7 at neutral pH.
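Summary properties of the kind quoted above (fraction of hydrophobic residues, net charge at neutral pH) can be computed from a one-letter amino acid sequence along the following lines. This is only a sketch under stated assumptions: the set of residues counted as hydrophobic is an assumption, since the classification used for the 44% figure is not given in the text, and histidine is treated as uncharged at neutral pH. The function names are hypothetical.

```python
# Assumption: residues counted as hydrophobic; the paper does not state its set.
HYDROPHOBIC = set("AVLIMFWPG")
BASIC = set("KR")    # counted as +1 at neutral pH (His assumed uncharged)
ACIDIC = set("DE")   # counted as -1 at neutral pH


def hydrophobic_fraction(protein):
    """Fraction of residues belonging to the assumed hydrophobic set."""
    return sum(aa in HYDROPHOBIC for aa in protein) / len(protein)


def net_charge(protein):
    """Lys + Arg minus Asp + Glu; termini and His are ignored in this sketch."""
    return sum(aa in BASIC for aa in protein) - sum(aa in ACIDIC for aa in protein)


# Toy peptides (hypothetical, not the TGEV M sequence)
print(hydrophobic_fraction("MKIL"))
print(net_charge("MKKRDE"))
```

A different hydrophobic set or a different treatment of histidine would shift both numbers, so the values from such a sketch need not reproduce the published 44% and +7.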
A potentially cleavable amino-terminal signal peptide is not an absolute requirement for membrane translocation and glycosylation
A comparison of the amino acid sequence for the M proteins of TGEV, BCV, MHV, and IBV (Fig. 4) reveals several features that are shared among all four viruses, but also one feature for TGEV that is strikingly contrasting. Regions of high sequence homology are found among the proteins. Most notable is a stretch of 8 amino acids beginning at position 132 on the TGEV sequence that is identical for all four viruses. By computer analysis using a protein alignment function, 34, 32, and 19% of the TGEV protein sequence is homologous with that of BCV, MHV, and IBV, respectively. Furthermore, a hydrophobicity plot of the TGEV M protein shows three internal hydrophobic domains that align with similar domains in MHV (Rottier et al., 1984, 1986), BCV (Lapps et al., 1987), and IBV (Boursnell et al., 1984) (Fig. 4 and data not shown). This suggests that the topology of the four proteins is similar; that is, from its entrance into the virion membrane and as it extends toward the carboxy terminus, the protein spans the membrane three times and has a relatively hydrophilic intravirion carboxy-terminal region (Rottier et al., 1984, 1986). The striking feature of the TGEV M protein is its much longer amino terminus that includes a sequence resembling a cleavable peptide for membrane insertion. Assuming a parallel structure for the M proteins of the four viruses and assuming the MHV M protein enters the virion envelope at position 26 (Rottier et al., 1986), then the external amino-terminal portion is 28 amino acids for BCV, 21 for IBV, and 54 for TGEV. Within the first 54 amino acids there is one asparagine at position 32 that has the appropriate surrounding sequence required for N-linked glycosylation (Hubbard and Ivatt, 1981), the kind of glycosylation shown for the TGEV M protein (Jacobs et al., 1986).
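The "appropriate surrounding sequence" for N-linked glycosylation is commonly summarized as the sequon Asn-X-Ser/Thr, where X is any residue except proline. A minimal sketch for locating such sites in a one-letter protein sequence follows; the function name and the toy peptide are hypothetical, and the sequon rule is the general textbook form rather than a criterion quoted from this paper.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping candidate sites be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]


# Toy peptide (hypothetical): N-A-S and N-G-T qualify, N-P-T does not.
print(find_sequons("MNASNPTQNGT"))
```

Presence of a sequon only marks a potential site; whether the asparagine is actually glycosylated depends on its translocation to the lumenal side, which is what the experiments described here address.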
Unlike the amino terminus of the MHV, BCV, and IBV proteins, the TGEV M protein is hydrophobic for the first 16 amino acids. Eleven of the sixteen terminal amino acids are hydrophobic, and amino acids at positions 2, 17, and 18 are charged (Fig. 4). By inspection, this sequence has the properties of a cleavable amino-terminal signal peptide and from the "-3, -1" rule (von Heijne, 1986) peptidase cleavage would occur between amino acids 16 and 17. Since an amino-terminal peptide is not required for membrane translocation and glycosylation of the M proteins of BCV, MHV, and IBV, we examined what effect the peptide had on the translocation of the TGEV M protein. Both forms of the reconstructed M gene generate the asparagine glycosylation site when translated, pGEM-M-1 at amino acid position number 32 and pGEM-M-2 at amino acid position number 11 of their respective translation products (Fig. 4). Both forms of transcripts were therefore translated in the presence of microsomes known to have glycosylating activity. Figures 3A, lanes 5, 6, 9, and 10; 3B, lanes 1, 2, 3, and 4; and 3C, lanes 1, 2, 3, and 4, illustrate that, whereas the full-length reconstructed M gene (1st AUG) yielded a 25-kDa protein that was mostly glycosylated, 65-88% as determined from optical density tracings, the truncated M gene (2nd AUG) yielded a protein that was also glycosylated but to a far lesser extent, only 10-30%. The glycosylated products of both the 1st AUG and 2nd AUG transcripts were approximately 28 kDa (Fig. 3A, lanes 6 and 10), the same size as the glycosylated product from viral mRNA translation (Fig. 3A, lane 3) and glycosylated virion M (Fig. 3A, lanes 16 and 17). Glycosylation was confirmed to have taken place since the products from in vitro translation migrated again with the unglycosylated polypeptides after they had been digested with N-glycanase (Fig. 3A, lanes 7 and 11) or endoglycosidase H (Fig. 3B, lanes 5 and 6).
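The residue numbering quoted above for the two constructs follows from the base positions of the two AUG codons given in the text (137 and 200 in Fig. 2). The arithmetic can be sketched as below; the constant and function names are hypothetical, and the only inputs are values stated in the text.

```python
FIRST_AUG_BASE = 137   # start codon used by pGEM-M-1 (from the text)
SECOND_AUG_BASE = 200  # start codon used by pGEM-M-2 (from the text)


def codon_offset(first_base, second_base):
    """Number of codons separating two in-frame start codons."""
    delta = second_base - first_base
    if delta % 3 != 0:
        raise ValueError("start codons are not in the same reading frame")
    return delta // 3


def position_in_truncated(full_length_pos, offset):
    """Map a 1-based residue position in the full-length product to the
    product that initiates `offset` codons downstream."""
    if full_length_pos <= offset:
        raise ValueError("residue lies upstream of the second start codon")
    return full_length_pos - offset


offset = codon_offset(FIRST_AUG_BASE, SECOND_AUG_BASE)
print(offset)                             # codons skipped by the truncated construct
print(position_in_truncated(32, offset))  # glycosylated Asn in the truncated product
```

With these inputs the second AUG encodes residue 22 of the full-length protein, and the asparagine at position 32 maps to position 11 of the truncated product, in agreement with the numbering in the text.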
These results demonstrate that although the asparagine glycosylation site of both the full-length and truncated forms of M protein became translocated to the lumenal side of the microsome, translocation with the amino-terminal peptide present was far more efficient. Furthermore, translocation of the amino terminus appeared to be specifically enhanced by the peptide since carbonate treatment showed both full-length and truncated forms of the protein as a whole to be equally membrane embedded (Fig. 3C, lanes 5, 6, 7, and 8), presumably as a result of internal translocation signals of the kind described for the MHV and IBV M proteins (Rottier et al., 1985; Machamer and Rose, 1987). Following carbonate treatment greater than 85% of both the full-length and truncated forms of the M protein remained membrane bound as determined by optical density tracing of the autoradiogram.
Cleavage of the amino-terminal peptide did not appear to occur during in vitro translation
To test for amino-terminal peptide cleavage, viral mRNA from infected cells and transcripts of the cloned M gene were translated in the presence of microsomes known to contain signal peptidase activity (Fig. 3D). In initial experiments, a tripeptide serving as a competitive inhibitor of asparagine-linked glycosylation (Lau et al., 1983) was incubated with the microsomal-translation mixture in order to inhibit concurrent glycosylation that would otherwise obscure the results of peptide cleavage. Although inhibition of glycosylation was never complete, at no time was there evidence of peptide cleavage (data not shown). To use a second approach, mRNA and transcripts were translated in the presence of microsomes and the products were deglycosylated with either N-glycanase or endoglycosidase H, and sizes were compared by electrophoresis (Figs. 3A, lanes 4, 7, and 11, and 3B, lanes 5 and 6).
Interestingly, the sizes of the deglycosylated products, as described above, appeared to be no smaller than the polypeptides synthesized without microsomes. There appeared, therefore, to be no amino-terminal peptide removal during in vitro translation in the presence of microsomes. Results of experiments to determine whether the N-terminal peptide is cleaved in vivo suggest that cleavage in vivo may be dependent upon the glycosylation of M or other glycoproteins. When infected cells were incubated in the presence of tunicamycin and radiolabeled M was immunoprecipitated from cell lysate using monoclonal antibody, only one major band was found and it migrated as an uncleaved protein of 25 kDa (Fig. 3A, lane 14). Similar results were obtained when polyclonal hyperimmune TGEV serum was used (data not shown). On the other hand, in the absence of tunicamycin, two forms of M were precipitated. These were a 28-kDa species, the size of fully glycosylated M, and a 24.5-kDa species, the size of M from which the N-terminal signal had been removed (Fig. 3A, lane 13). Glycosylation in vivo may have important consequences on the exposure of the peptidase cleavage site because of interactions between M and other viral components in vivo. In light of a recent report demonstrating by amino-terminal amino acid sequencing that the virion form of TGEV M protein is indeed cleaved (Laude et al., 1987), we included deglycosylated, radiolabeled virion M in our electrophoretic analysis (Fig. 3A, lane 15). The deglycosylated virion M did not migrate as a cleaved polypeptide but rather migrated with an apparent molecular weight of approximately 26 kDa, suggesting that it had perhaps undergone a second as yet undetermined modification after becoming virion-associated.
The M protein of MHV was the first coronavirus M protein to be sequenced (Armstrong et al., 1984) and its topography with regard to membrane orientation and insertion has been carefully documented (Rottier et al., 1984, 1986). It therefore serves as the prototypic coronavirus M protein to which others can be compared. The coronavirus M protein apparently functions to direct the budding of virus into the rough endoplasmic reticulum and the Golgi since it inserts into these membranes and is found at highest concentrations there (Holmes et al., 1984; Tooze et al., 1984). Presumably virus assembly is mediated through an interaction between the M protein (in the membrane) and the N protein (in the cytoplasm) or the RNA (in the cytoplasm), or both. The M protein of MHV does not have an amino-terminal cleavable signal peptide for membrane translocation, but rather utilizes one or more of its three internal hydrophobic domains for membrane insertion (Rottier et al., 1985). The same is also true for the IBV M protein (Machamer and Rose, 1987), and is probably true for the BCV M protein (Lapps et al., 1987), which has a structure similar to that of the MHV M protein. It came as a surprise to us, therefore, that the deduced TGEV M protein has, in addition to the three internal hydrophobic domains, an amino-terminal sequence that possesses the properties of a cleavable signal peptide for membrane insertion (von Heijne, 1986). The potentially cleavable N-terminal peptide was also revealed by the work of Laude et al. (1987) in which a nearly identical M gene sequence was reported for the same strain of TGEV. In their sequence, bases in positions 375, 468, and 720 of Fig. 2 were T, C, and A, respectively, making the corresponding amino acids at these codon positions Val, Asn, and Asp. Furthermore, they obtained direct evidence that the N-terminal signal peptide was cleaved from the virion-associated M protein.
At least two fundamental questions are therefore raised by the existence of an amino-terminal signal peptide in the TGEV M protein: (i) What is the evolutionary origin of such a sequence, and (ii) what role does the sequence play for the TGEV M protein? With regard to the evolutionary origin of the hydrophobic amino terminus, two possibilities can be entertained, assuming the four coronaviruses, TGEV, BCV, MHV, and IBV, have a common evolutionary origin. (i) There was an amino-terminal hydrophobic sequence in the primordial protein which was lost during the evolution of BCV, MHV, and IBV since it is superfluous for membrane insertion and virus assembly. Mechanistically, the loss of a genetic sequence could be explained by the dissociating-reassociating polymerase hypothesized by Lai et al. (Makino et al., 1986a). The nucleotide sequence encoding the amino-terminal hydrophobic sequence could have been eliminated by the polymerase as it copied the negative-strand template. (ii) There was no amino-terminal hydrophobic sequence in the primordial protein but the TGEV M protein acquired one during evolution. The polymerase during replication of the TGEV genome could have become dissociated and then reassociated with another minus-strand RNA template that carried the sequence for a hydrophobic signal. While negative-strand RNA of this kind is probably nonexistent or of low abundance in eucaryotic cells normally, it does exist in cells coinfected with another RNA virus. Conceivably the TGEV M protein could have acquired its amino-terminal sequence by copying the negative strand of another RNA virus. In this regard, it is interesting that 6 of the first 8 amino acids in the TGEV M sequence are identical to the VSV G protein signal sequence (Rose and Gallione, 1981).
With regard to the function of the amino-terminal hydrophobic sequence, we propose that it aids in the translocation of the amino terminus of the TGEV M protein through the endoplasmic reticulum as would other amino-terminal signal peptides, but it is not an absolute requirement. The TGEV M protein without its amino-terminal signal peptide, as approximated by the translation product of the pGEM-M-2 construct, which is 4 amino acids shorter than the polypeptide identified by the peptidase cleavage site (Laude et al., 1987), can translocate apparently by the use of an internal signal(s) of the type described for the MHV and IBV M proteins (Rottier et al., 1985; Machamer and Rose, 1987). It is interesting to note that even after removal of the amino-terminal peptide, the TGEV M protein extends 12 amino acids farther from the envelope than does BCV, 11 amino acids farther than MHV, and 16 amino acids farther than IBV. Perhaps the amino-terminal signal peptide, while not being an absolute requirement for translocation of the TGEV M protein, aids enough in the translocation of the additional external amino-terminal sequence that it was evolutionarily selected. Since the amino-terminal peptide is apparently not removed from the M protein during in vitro translation in the presence of microsomes, but is removed from the M protein in the assembled virion (Laude et al., 1987), we propose that its cleavage depends on the context in which it finds itself. Perhaps a step in virion assembly, for example, an interaction with another viral protein or viral RNA, may be required for cleavage to occur. Glycosylation of the M protein alone is apparently not a prerequisite for cleavage in vitro (Figs. 3A and B) but may be important in the in vivo context. Context is important for the cleavability of other signal peptides. For example, under certain constraints the potentially cleavable signal on the invariant (Iγ)
chain of class II histocompatibility antigens (Lipp and Dobberstein, 1986) is not cleaved. We further propose that the M protein with an uncleaved signal peptide would have the orientation depicted in Fig. 5. With this orientation, the cleavage site is buried in the membrane and fewer than 37 amino acids are exposed as a loop on the lumenal side of the endoplasmic reticulum. Once virion assembly begins, the cleavage site becomes exposed to the signal peptidase, cleavage occurs, and 37 amino acids, including a glycosylated asparagine, are left remaining on the virion surface. Because of the amino-terminal hydrophobic sequence, the TGEV M protein may behave differently than its MHV, BCV, or IBV counterparts with regard to intracellular trafficking.
This work was supported by Grant AI-14367 from the National Institute of Allergy and Infectious Diseases, by Grant 82-CRSR-2-1090 from the U.S. Department of Agriculture, and in part by a grant from the National Foundation for Ileitis and Colitis, Inc.
Immunoprecipitation of proteins from cell-free translations
ARMSTRONG, J., NIEMANN, H., SMEEKENS, S., ROTTIER, P., and WARREN, G. (1984). Sequence and topology of a model intracellular membrane protein, E1 glycoprotein, from a coronavirus. Nature (London) 308, 751-752.
BOURSNELL, M. E. G., BROWN, T. D. K., and BINNS, M. M. (1984). Sequence of the membrane protein gene from avian coronavirus IBV. Virus Res. 1, 303-313.
BRIAN, D. A., DENNIS, D. E., and GUY, J. S. (1980) 12, 581-599.
ROSE, J. K., and GALLIONE, C. (1981). Nucleotide sequences of the mRNAs encoding the VSV G and M proteins as determined from cDNA clones containing the complete coding regions. J. Virol. 39, 519-528.
ROTTIER, P., ARMSTRONG, J., and MEYER, D. I. (1985). Signal recognition particle-dependent insertion of coronavirus E1, an intracellular membrane glycoprotein. J. Biol. Chem. 260, 4648-4652.
ROTTIER, P., BRANDENBURG, D., ARMSTRONG, J., VAN DER ZEIJST, B., and WARREN, G. (1984