key: cord-0977863-b3shy00j authors: Steinbacher, Stefan; Miller, Stefan; Baxa, Ulrich; Budisa, Nediljko; Weintraub, Andrej; Seckler, Robert; Huber, Robert title: Phage P22 tailspike protein: crystal structure of the head-binding domain at 2.3 Å, fully refined structure of the endorhamnosidase at 1.56 Å resolution, and the molecular basis of O-antigen recognition and cleavage (edited by K. Nagai) date: 1997-04-11 journal: Journal of Molecular Biology DOI: 10.1006/jmbi.1997.0922 sha: 56bd4bd786dd96bfda9f2c096eeed6eee4280ee8 doc_id: 977863 cord_uid: b3shy00j

Abstract

The tailspike protein of Salmonella phage P22 is a viral adhesion protein with both receptor binding and destroying activities. It recognises the O-antigenic repeating units of cell surface lipopolysaccharide of serogroup A, B and D1 as receptor, but also inactivates its receptor by endoglycosidase (endorhamnosidase) activity. In the final step of bacteriophage P22 assembly six homotrimeric tailspike molecules are non-covalently attached to the DNA injection apparatus, mediated by their N-terminal, head-binding domains. We report the crystal structure of the head-binding domain of P22 tailspike protein at 2.3 Å resolution, solved with a recombinant telluromethionine derivative and non-crystallographic symmetry averaging. The trimeric dome-like structure is formed by two perpendicular β-sheets of five and three strands, respectively, in each subunit, and caps a three-helix bundle observed in the structure of the C-terminal receptor binding and cleaving fragment, reported here after full refinement at 1.56 Å resolution. In the central part of the receptor binding fragment, three parallel β-helices of 13 complete turns are associated side-by-side, while the three polypeptide strands merge into a single domain towards their C termini, with close interdigitation at the junction to the β-helix part. Complex structures with receptor fragments from S. typhimurium, S. enteritidis and S.
typhi 253Ty determined at 1.8 Å resolution are described in detail. Insertions into the β-helix form the O-antigen binding groove, which also harbours the active site residues Asp392, Asp395 and Glu359. In the intact structure of the tailspike protein, head-binding and receptor-binding parts are probably linked by a flexible hinge whose function may be either to deal with shearing forces on the exposed, 150 Å long tailspikes or to allow them to bend during the infection process.

Salmonella phage P22 is a dsDNA phage of the Podoviridae family, which is characterised by a short base plate or tail structure incorporated into one of the 12 fivefold vertices of the icosahedral phage head (Poteete, 1994; Figure 1). Its scaffolding-protein-assisted assembly pathway proceeds through a DNA-free procapsid structure (Earnshaw & Casjens, 1980; Casjens & Hendrix, 1988). The structures of the scaffolding-containing procapsid, an empty procapsid, and the mature capsid with T = 7 laevo, analysed by cryo-electron microscopy at 22, 28 and 28 Å resolution, respectively, revealed a marked structural transition during DNA packaging (Prasad et al., 1993; Thuman-Commike et al., 1996). The formation of the short tail is initiated by incorporation of the portal protein 12-mer into one vertex of the procapsid during procapsid assembly, together with 10 to 20 copies of each of three pilot proteins, gp7, gp16 and gp20 (Bazinet et al., 1988). The pilot proteins and the portal complex are required for infectivity but not for procapsid assembly (Botstein et al., 1973). Upon DNA packaging, proteins gp4, gp10 and gp26 are attached, forming the slender neck structure of the base plate. The portal vertex is the site of both DNA entry during phage maturation and DNA exit during infection. In the very last step of phage assembly, tailspikes are attached to the neck structure.
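As a small worked check on the capsid geometry mentioned above, the sketch below applies the standard Caspar-Klug relations for an ideal icosahedral shell: T = h² + hk + k², 12 pentamers, 10(T − 1) hexamers and 60T subunits. The function names are illustrative and not taken from the paper, and the adjustment for the portal vertex is an assumption (that exactly one coat-protein pentamer is absent), which the text implies but does not state as a count.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number for lattice indices (h, k)."""
    return h * h + h * k + k * k


def ideal_shell(t: int) -> dict:
    """Capsomer and subunit counts for an ideal icosahedral shell."""
    return {"pentamers": 12, "hexamers": 10 * (t - 1), "subunits": 60 * t}


# (h, k) = (2, 1) and (1, 2) are the two mirror-image T = 7 lattices.
t = triangulation_number(2, 1)
shell = ideal_shell(t)

# Assumption: the portal replaces one coat-protein pentamer (5 subunits).
subunits_with_portal = shell["subunits"] - 5

print(t, shell, subunits_with_portal)
```

For T = 7 the ideal shell has 420 subunits, in line with the "approximately 420 molecules gp5" given in the legend to Figure 1; under the one-pentamer assumption the count would be 415.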
The tailspike protein, a homotrimer of 72 kDa subunits, is anchored by its N-terminal, head-binding domain in a non-covalent but irreversible manner (Berget & Poteete, 1980; Maurides et al., 1990). Three tailspikes are required on average to form an infectious phage (Israel, 1978). The tailspike protein functions as a viral adhesion protein and binds to the O-antigenic repeating units of Salmonella host lipopolysaccharide (LPS), its cellular receptor (Israel et al., 1967). The number of O-antigenic repeating units of LPS varies greatly within the LPS pool of a cell. The distribution of the O-antigen chain length was reported to be unequal in S. typhimurium, with 77% having between 19 and 34 repeating units, but two thirds of all LPS molecules being devoid of O side-chains (Palva & Mäkelä, 1980). Additional modifications, such as acetylation or glucosylation, known as O-antigen form variation, also contribute to microheterogeneity (Kauffmann, 1941; Goldman & Leive, 1980; Palva & Mäkelä, 1980). The chemical structure of the O-antigen correlates with the serological classification of Salmonella, as these are the main antigens upon gastrointestinal infections (Lüderitz et al., 1966). The interaction of the tailspike protein with its receptor defines the host range of phage P22, comprising serotypes A, B and D1. These share a common trisaccharide repeating unit α-D-mannose-(1-4)-α-L-rhamnose-(1-3)-α-D-galactose for the O-antigen and differ in their branching carbohydrate moieties, a 3,6-dideoxyhexose α-(1-3)-linked to D-mannose. The dideoxyhexoses are paratose (serogroup A), abequose (serotype B), or tyvelose (serotype D1). In addition, P22 tolerates the O-antigen 12₂, which shows an additional α-(1-4)-linked D-glucose at D-galactose, as in S. typhi 253Ty. The tailspike protein shows a receptor destroying endorhamnosidase activity, cleaving the α(1,3)-O-glycosidic bond between rhamnose and galactose of the O-antigenic repeats.
It produces mainly octasaccharide fragments of two repeating units with rhamnose at the reducing end (Iwashita & Kanegasaki, 1976; Eriksson & Lindberg, 1977; Eriksson et al., 1979).

Receptor destroying enzymatic activities are well known for viruses that use carbohydrates as receptors. Influenza A and B virus or paramyxovirus have neuraminidases (Air & Laver, 1989), whereas influenza virus C and some coronaviruses that recognise an O-acetylated sialic acid epitope have a sialate 9-O-acetylesterase, removing an acetyl group (Herrler et al., 1985). Endoglycosidase or acetylesterase activities have also been demonstrated for a large number of phages acting on encapsulated Gram-negative bacteria like Escherichia coli, Salmonella or Klebsiella (Lindberg, 1977; Svenson et al., 1979).

Proteolytic fragmentation of intermediates in thermal unfolding and refolding of the tailspike protein suggested a domain border between residues 100 and 120. The C-terminal fragment (residues 109 to 666) maintains LPS binding properties and enzymatic activity of the intact tailspike protein and resembles the complete protein in its folding pathway (Chen & King, 1991; Danner et al., 1993). The crystal structure of the C-terminal part, lacking the N-terminal head-binding domain, has been solved at 2.0 Å resolution (Steinbacher et al., 1994). We have recently analysed the crystal structure of the C-terminal fragment in complex with O-antigen fragments at 1.8 Å resolution comprising two repeating units, i.e. octa- or decasaccharides, derived from S. typhimurium (serogroup B), S. enteritidis (serogroup D1), and S. typhi 253Ty (serotype D1, O-antigen 12₂), which shows an additional α-(1-4)-linked D-glucose at D-galactose. Here we report on the crystal structure of the head-binding domain (residues 1 to 124 with an additional C-terminal hexa-His tag) at 2.3 Å resolution. The structure solution was mainly based on a telluromethionine-containing derivative produced by overexpressing the protein in the Met auxotrophic E. coli strain B834(DE3)(hsd metB) in the presence of Ac-DL-Met(Te)-OH lithium salt (Karnbrock et al., 1996; N.B. et al., unpublished results). We also present the crystal structure of the C-terminal fragment after full refinement at 1.56 Å and O-antigen complexes determined at 1.8 Å in detail.

Figure 1. Diagram of Salmonella phage P22. The icosahedral head is formed by approximately 420 molecules gp5. The portal protein (gp1) 12-mer is inserted into one icosahedral facet. Up to six tailspike molecules (gp9) are attached to a slender neck structure formed by gp4, gp10 and gp26. This contact is mediated by the N-terminal, head-binding domain. Minor components (gp7, gp16 and gp20) are associated with the portal protein or the DNA.

Structure determination of head-binding domain

The structures of a variety of plant, insect and animal viruses as well as those of a number of bacteriophages have been analysed by high resolution X-ray crystallography (Harrison et al., 1996; Liljas, 1996). These studies mainly focus on icosahedral capsids, taking advantage of their internal symmetry both for crystallisation and structure solution. In contrast, only very limited structural information at atomic resolution is available on tail or fibre components of phages and viruses protruding from symmetric capsids or anchored in lipid membranes, although these are crucial for receptor recognition or viral entry into the host cell. Exceptions are haemagglutinin (Wilson et al., 1981) and neuraminidase (Varghese et al., 1983) from influenza A virus, tick-borne encephalitis envelope protein (Rey et al., 1995) and the receptor binding knob domain of adenovirus type 5 fibre protein (Xia et al., 1994). The tailspike protein could be crystallised and subsequently analysed by X-ray crystallography by fragmentation into two structurally intact parts based on biochemical studies (Chen & King, 1991) and recombinant overexpression in E. coli.
Stable, functional proteins were formed upon expression of residues 1 to 124 comprising the N-terminal, head-binding domain (Miller, 1995) and of the C-terminal part consisting of residues 109 to 666 (Danner et al., 1993).

The crystal structure of the head-binding domain was solved by isomorphous replacement methods. Phasing was based on a strong telluromethionine derivative and a weak uranium derivative (Table 1). The incorporation of the amino acid analogues L-selenomethionine and L-telluromethionine into proteins overexpressed by recombinant DNA technology using Met auxotrophic E. coli strains opened the possibility to produce heavy atom derivatives of protein crystals on a rational basis. While use of synchrotron radiation is essential to take advantage of the anomalous contribution of selenium by MAD phasing (Hendrickson, 1991) in Met(Se)-derivatives, Met(Te)-derivatives show good phasing with CuKα radiation due to the larger number of additional electrons. The incorporation of the tellurium analogue Met(Te) was hampered by the lack of an efficient chemical synthesis and mainly by its very limited stability under aqueous, aerobic conditions as in E. coli cultures. N-acetyltelluromethionine and its Te-hydroxylated form, Ac-DL-Met(Te)-OH, which forms rapidly under aerobic conditions, were demonstrated to have a significantly increased stability. Ac-DL-Met(Te) has a half-life of about 20 hours compared to 30 minutes for free L-Met(Te) in non-degassed aqueous solution at pH 7.0 and room temperature, which is sufficient for bioincorporation (Karnbrock et al., 1996). Both compounds can be used directly for efficient high level incorporation of Met(Te) residues, as demonstrated for a variety of proteins (Budisa et al., 1995; and unpublished results). Met(Te) residues are stable when they are buried in the interior of the protein.

Footnotes to Table 1:
f $R_{iso} = \sum |F_{PH} - F_{P}| / \sum F_{P}$, where $F_{PH}$ and $F_{P}$ are the derivative and the native structure factor amplitudes, respectively.
g $R_{Cullis} = \sum_{hkl} \big|\, |F_{PH} \pm F_{P}| - F_{H,calc} \big| / \sum_{hkl} |F_{PH} \pm F_{P}|$.
h Phasing power $= F_{H}/\mathrm{residual}$: rms mean heavy-atom contribution / rms residual, the latter defined as $[\sum (F_{PHC} - F_{PH})^{2}/n]^{1/2}$ with the sum over all reflections, where $F_{PHC}$ is the calculated structure factor and $F_{PH}$ is the structure factor amplitude of the heavy-atom derivative.

For structure determination of the head-binding domain the refined coordinates of Te were used to derive ncs-operators for two trimers in the asymmetric unit. The MIR phased electron density was improved by sixfold ncs-averaging. The model was subjected to crystallographic refinement with ncs-restraints. The final crystallographic R-factor for the head-binding domain was 19.9% for data from 8.0 to 2.3 Å resolution using data set TEME and 18.6% using native data from 8.0 to 2.6 Å resolution for refinement. The rms-deviations between ncs-related protein molecules were 0.28 and 0.29 Å, respectively, for all protein atoms. The quality of the final model is summarised in Table 2.

The model of the N-terminal domain comprises residues 5 to 108 according to the numbering of Sauer et al. (1982), which does not include the N-terminal Met. Residues 1 to 4, the preceding Met and residues 109 to 124 as well as the C-terminal hexa-His tag are not visible in the electron density and are probably disordered or mobile. All residues lie within or near energetically allowed regions of (φ, ψ) values (Ramachandran & Sasisekharan, 1968). The molecular structure of each monomer of the homotrimer is characterised by two regular β-sheets, A and B, oriented nearly perpendicular to each other and composed of five and three strands, respectively. The direction of the strands with respect to the molecular triad is almost parallel for β-sheet A, whereas β-sheet B is perpendicular to the triad.
The topology of the strands is exclusively antiparallel (Figure 2). The N-terminal residues 7 to 9 from a neighbouring subunit extend β-sheet B by one strand, forming three regular hydrogen bonds with residues 81 to 83. Each monomer is stabilised by a hydrophobic core built of eight Val, nine Ile, one Pro, one Phe, one Met and two Leu residues. In the trimer the monomers form a dome-like structure 30 Å high and about 55 Å in diameter (Figure 3(a)). The two β-sheets do not form a barrel structure but the walls of the dome. Prominent features that stabilise the trimer structure are a cluster of residues Phe22 that close the dome at its top (Figure 3(b)) and the N-terminal strand that extends on the inner surface of the molecule to the neighbouring subunit (Figure 3(c)). Polypeptide arms extending over or under domains of neighbouring subunits and intertwining with others are a remarkable feature of virus structures (Harrison et al., 1996). The contacts between adjacent subunits are, however, not very extensive and involve a salt bridge formed between Arg13 and Asp100#, a hydrophobic contact between Lys80 and Tyr108 and hydrogen bonds between Lys71 and Thr17# and between Arg20 and the carbonyl group of Gly76# (# denotes residues from a neighbouring subunit).

Figure 2. Schematic representation of the topology of one subunit of the head-binding domain. β-Sheet A comprises five strands (A1 from 100 to 109, A2 from 89 to 94, A3 from 29 to 32, A4 from 65 to 67, A5 from 72 to 74), β-sheet B comprises three strands (B1 from 81 to 83, B2 from 49 to 54, B3 from 56 to 62). In the trimer, β-sheet B is extended by the N-terminal strand of a neighbouring subunit (residues 7 to 9). The secondary structure was analysed with DSSP (Kabsch & Sander, 1983).

During thermal denaturation, the N-terminal domain is the first part of the tailspike protein to unfold (Chen & King, 1991). The treatment of a thermal unfolding intermediate with proteases results in complete destruction of the N-terminal part up to residues 107 to 120, whereas the C-terminal part remains intact. The N-terminal domain is dispensable for thermostability and SDS resistance of the intact protein, and its deletion has only minor effects on tailspike folding kinetics (Danner et al., 1993). A single point mutation produced by chemical mutagenesis, Asp100Asn, has been reported to influence the head-binding properties of tailspike protein. The mutant's affinity for phage heads is only 1% of that of the wild-type protein and its N-terminal domain is denatured by SDS at room temperature. A set of intragenic second site mutations (Arg13Ser, Arg13Leu and Arg13His) suppresses the Asp100Asn phenotype (Schwarz & Berget, 1989). Both residues, Asp100 and Arg13, are located at the subunit interface and form an inter-subunit salt bridge. Arg13 protrudes from the N-terminal strand running across the inner surface of the trimer. Both residues are partly solvent accessible and located at the bottom of a shallow cleft formed at the subunit interface. The effects of the mutation can be rationalised by the electrostatic interaction within the salt bridge.

A marked clustering of charged residues is present at the interface of the subunits, extending to the top of the molecule. Hydrophilic residues predominate at exposed loop structures, whereas small patches of hydrophobic residues are exposed in the centre of the outer surface of β-sheet A. The electrostatic surface potential of the N-terminal domain is therefore dominated by areas of neutral potential, as seen in Figure 4. Regions of negative and only small ones of positive potential are concentrated at the subunit interface. It is not known which protein of the neck structure binds to the N-terminal domain of the tailspike protein. Prominent loop structures and the hydrophobic centre of β-sheet A are likely candidates for head binding.
Structure of the C-terminal fragment: the right-

The crystal structure of the C-terminal fragment, resembling the intact protein in enzymatic activity, receptor binding properties and temperature stability, was initially refined to 2.0 Å resolution (Steinbacher et al., 1994). Data to 1.56 Å resolution were collected at the beamline BW6 at the HASYLAB of the DESY in Hamburg. The model was subsequently refined to a final R-factor of 17.1% with data between 8 and 1.56 Å resolution (Table 3). The homotrimeric molecule is 133 Å in length and between 35 and 80 Å in diameter (Figure 5). Each subunit is composed of six segments, together resembling the shape of a fish. They correspond to the mouth, the main body, the dorsal fin, and the first, second and third segment of the caudal fin, respectively. In addition, three insertions in the main body group together to form a ventral fin.

In terms of secondary structure, the main body comprises a parallel β-helix (residues 143 to 540) of 13 complete turns, a domain motif also present in pectate lyases from Erwinia and Bacillus (Yoder et al., 1993; Pickersgill et al., 1994; Yoder & Jurnak, 1995), alkaline protease from Pseudomonas aeruginosa (Baumann et al., 1993), and the virulence factor P.69 pertactin from Bordetella pertussis (Emsley et al., 1996). A left-handed β-helix has been identified in UDP-N-acetylglucosamine acyltransferase from E. coli (Raetz & Roderick, 1995) and in carbonic anhydrase from Methanosarcina thermophila (Kisker et al., 1996). The β-helix motifs of the tailspike protein, pectate lyase and pertactin share a common triangular cross-section, but differ in their oligomer state, as pectate lyases and pertactin are monomeric proteins. Both structures with a left-handed domain motif, which has a much more regular triangular cross-section, are trimeric and share considerable overall similarities despite diverse functions.

Figure 4. Stereoview of the electrostatic potential of the head-binding domain from −7 kT/e⁻ (red) to 7 kT/e⁻ (blue). The three monomers are labelled I to III. The view is perpendicular to the molecular triad (green triangle). Charged residues mainly cluster at the domain interface, whereas the centre of β-sheet A is dominated by polar and hydrophobic residues corresponding to a neutral potential. The salt bridge Arg13/Asp100# is located at the subunit interface (B). The Figure was produced with GRASP (Nicholls et al., 1993).

In the trimer of the tailspike protein, adjacent β-helices form a β-sandwich with about 66 residues involved in the intersubunit contact. As the majority of these residues are either polar (18) or charged (18), the interface is hydrophilic. A wide channel along the molecular triad harbours about 132 ordered water molecules, whereas the interior of the β-helix is mainly filled with hydrophobic residues and only 12 ordered water molecules. A complete β-helix turn comprises approximately 22 residues in the N-terminal and central part, decreasing to 16 residues in the C-terminal part. Stacks of hydrophobic side-chains are formed by aliphatic or aromatic residues, but repetitive elements extend only up to three residues (Figure 6). The mouth segment preceding the main body is an extension of the β-helix by one turn, formed by a β-strand and a short seven-residue α-helix. At the N terminus of the endorhamnosidase fragment, a seven-residue α-helix forms a three-helix bundle in the trimer. The dorsal fin domain (residues 197 to 259) is inserted into the β-helix and forms a compact domain of mainly irregular structure and a strongly twisted triple-stranded β-sheet of topology 2-1-3 and only four regular hydrogen bonds. In contrast to the main body with independently folded subunits, the three polypeptides merge into a single domain at the caudal fin. The latter consists of three segments. In the first segment, each polypeptide folds around the molecular triad, interdigitating with the others.
The second segment is folded into a five-stranded antiparallel β-sheet, which is extended by two parallel strands from the first caudal fin segments of the other two subunits. The seven-stranded sheets form a prism-like structure in the trimer. The third segment of the caudal fin constitutes the C-terminal end of the tailspike protein, where a propeller is formed by three-stranded antiparallel β-sheets. Each β-sheet participates in two six-stranded β-barrels, one with each of the two remaining polypeptides. Only hydrophobic residues are buried in the interior of the caudal fin domain.

The N-terminal fragment (residues 1 to 124) shows electron density for residues 5 to 108, whereas in the structure of the C-terminal fragment (residues 109 to 666) residues 109 to 112 are disordered. The N-terminal three-helix bundle from residues 113 to 120 present at the mouth segment of the C-terminal fragment was chosen as an overlapping segment in the construction of the expression vector for the head-binding domain. It is stabilised by hydrophobic interactions between aromatic and aliphatic residues and was therefore expected to be ordered in the N-terminal fragment as well. Although the absence of electron density for residues 109 to 124 in the N-terminal fragment could be due to complete disorder of this part, it could also be explained by mobility of the stable three-helix bundle with respect to the compact head-binding domain. Simple docking experiments between both fragments show that the three-helix bundle cannot penetrate the N-terminal dome very deeply and therefore only few direct contacts between both fragments are possible. This is suggestive of a hinge function of the connecting part between both fragments, which might be necessary either to deal with shearing forces on the exposed 150 Å long molecule or might have a functional role during the infection process, allowing the tailspikes to bend away from the DNA injection apparatus.
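The residue bookkeeping behind the hinge argument can be laid out explicitly. The sketch below uses only the construct boundaries and ordered ranges quoted above; the variable names are illustrative, the hexa-His tag is ignored, and it assumes that every residue of the C-terminal fragment other than 109 to 112 is ordered, which the text implies but does not list residue by residue.

```python
FULL_LENGTH = range(1, 667)            # residues 1 to 666

# Construct boundaries as quoted in the text.
n_construct = set(range(1, 125))       # N-terminal fragment, residues 1 to 124
c_construct = set(range(109, 667))     # C-terminal fragment, residues 109 to 666

# Residues with electron density in each crystal structure.
n_ordered = set(range(5, 109))         # residues 5 to 108
# Assumption: only residues 109 to 112 of the C-terminal fragment are disordered.
c_ordered = c_construct - set(range(109, 113))

# Segment shared by both expression constructs.
overlap = sorted(n_construct & c_construct)

# Residues of the full-length chain seen in neither structure.
unseen = [r for r in FULL_LENGTH if r not in n_ordered and r not in c_ordered]

# Overlap residues that are ordered only in the C-terminal fragment.
overlap_ordered_in_c = sorted((n_construct & c_ordered) - n_ordered)

print("overlap:", overlap[0], "to", overlap[-1])
print("unseen in either structure:", unseen)
print("ordered only in C-terminal fragment:",
      overlap_ordered_in_c[0], "to", overlap_ordered_in_c[-1])
```

Under these assumptions the two constructs share residues 109 to 124, and the only internal stretch without density in either structure is 109 to 112, directly between the dome and the three-helix bundle, which is where the text places the putative hinge.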
The infection of Salmonella by phage P22 starts with the specific recognition of O-antigenic repeating units of LPS by the tailspikes. This interaction defines the host range comprising serotypes A, B and D1. The binding cleft and the active site of the tailspike protein have recently been identified by X-ray analyses of complexes with receptor fragments, which was confirmed by mutational data. The bound carbohydrates (Figures 7 and 8) were purified from S. typhimurium (serotype B), S. enteritidis (serotype D1) and S. typhi 253Ty (serotype D1, O-antigen 12₂) after partial digestion with the phage P22 endorhamnosidase. They comprise two O-antigenic repeats and represent the multiple host specificity of phage P22 and its tolerance to O-antigen 12₂, introduced by random α(1-4)-glucosylation at D-galactose due to form variation. The shape and location of the receptor binding site in the tailspike protein are ideal to recognise an elongated receptor molecule like LPS O-antigen repeats (Figure 9).

The terminal rhamnose at the reducing end of the bound product of the endorhamnosidase is located at the lower part of the binding site towards the C terminus of the protein. The electron density of the terminal rhamnose shows a discontinuity between C2 and C3, which is present in all three complexes determined and probably reflects some conformational flexibility or heterogeneity (Figure 10). However, the terminal rhamnose fits the electron density best in a twisted boat conformation, which is often observed for carbohydrates bound to the active site of a glycosidase. The C1 hydroxyl group is clearly present in α-configuration, as in the substrate. It is hydrogen-bonded to Asp392 with a distance of 2.7 Å. A water molecule is located 2.9 Å below the anomeric C-atom of the terminal rhamnose. This water molecule is H-bonded to Glu359, Ser360 and Asp395 with distances of 2.7, 2.6 and 2.8 Å, respectively. In addition, Glu359 forms a salt bridge with Lys363.
Residue Trp391 contributes to the binding of the terminal rhamnose by a hydrophobic contact to the 6-methyl group, and it may additionally influence the hydrophobicity in the environment of Asp392, which probably serves as general acid (Figure 11). The mutants Glu359Gln, Asp392Asn and Asp395Asn were shown to be enzymatically inactive without reducing the substrate binding affinity, with binding constants of about 1 × 10⁶ M⁻¹ for the octasaccharide product or dodecasaccharide substrate.

Figure 5. Stereoview of the tailspike protein with bound O-antigen receptor fragment from S. typhi 253Ty. The head-binding domain caps a three-helix bundle present at the N terminus of the receptor binding and catalytic fragment. The active site is located 80 Å above the C terminus.

It is not known to which class of endoglycosidases, inverting or retaining, the tailspike protein belongs. Both classes share two acidic residues as a common feature. The distance between both is assumed to be a valuable indicator of the enzymatic mechanism, as average distances are about 4.5 to 5.5 Å and 9.0 to 10.0 Å for retaining and inverting endoglycosidases, respectively. The larger distance for inverting enzymes originates from the necessity to harbour both the substrate and a catalytic water molecule between the general acid and the general base (McCarter & Withers, 1994; Davies & Henrissat, 1995). The active site topology of the tailspike protein shows distances of 5.8 Å (Asp392, Asp395), 8.2 Å (Asp392, Glu359) and 7.1 Å (Asp395, Glu359). The distance of 5.8 Å between Asp392 and Asp395 is comparable to the distance range observed for retaining enzymes. In that case Asp395 would serve as nucleophile and attack the anomeric C-atom. The distance of 4.0 Å between the anomeric C-atom of Rha I and Asp395 would make a different binding mode for the substrate necessary, with a disruption of the hydrophobic contact of the 6-methyl group to Trp391.
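The distance comparison above can be tabulated in a few lines. The sketch below takes the two typical ranges and the three carboxylate distances exactly as quoted in the text and reports how far each distance lies outside each range; the helper name `gap_to_range` is illustrative, and the calculation is only bookkeeping, not a mechanistic classifier.

```python
# Typical separations of the two catalytic carboxylates, in Å (from the text).
RANGES = {"retaining": (4.5, 5.5), "inverting": (9.0, 10.0)}

# Distances observed in the tailspike active site, in Å (from the text).
DISTANCES = {
    ("Asp392", "Asp395"): 5.8,
    ("Asp392", "Glu359"): 8.2,
    ("Asp395", "Glu359"): 7.1,
}


def gap_to_range(distance: float, bounds: tuple) -> float:
    """Return how far a distance lies outside a range (0.0 if inside)."""
    low, high = bounds
    return round(max(low - distance, distance - high, 0.0), 1)


report = {
    pair: {name: gap_to_range(d, bounds) for name, bounds in RANGES.items()}
    for pair, d in DISTANCES.items()
}

for pair, gaps in report.items():
    print(f"{pair[0]}/{pair[1]}: {DISTANCES[pair]} Å, "
          f"{gaps['retaining']} Å outside retaining range, "
          f"{gaps['inverting']} Å outside inverting range")
```

None of the three distances falls inside either range: 5.8 Å lies 0.3 Å above the retaining range and 8.2 Å lies 0.8 Å below the inverting range. The distance criterion alone therefore does not settle the mechanism, which is why the argument goes on to the steric situation around the bound water molecule.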
Taking into account the precise location of the extended product in the binding cleft, it is likely that substrate and product bind similarly. However, even if Asp395 served as a nucleophile, no acidic residue is present on the opposite side of the carbohydrate to activate a potential water molecule as nucleophile in a double displacement mechanism. On the other hand, the steric situation is ideal for a direct attack on the anomeric C-atom of rhamnose by the bound water molecule, activated by Glu359 and Asp395 as general bases. Therefore, an inverting mechanism with Glu359 and Asp395 as general bases and Asp392 as general acid is proposed.

In O-antigenic repeating unit I, which is proximal to the active site, the 3,6-dideoxyhexose points to the solvent region and makes only van der Waals contacts to Val240 and Ser237 of the dorsal fin domain by its 6-methyl group. The 6-methyl is common to the recognised 3,6-dideoxyhexoses paratose, abequose and tyvelose. Therefore, the position of repeating unit I is almost identical in the three complex structures determined by X-ray crystallography. In contrast, the 3,6-dideoxyhexose of repeating unit II points into a shallow depression. The stereochemistry of the 3,6-dideoxyhexoses abequose and tyvelose is clearly seen in the electron density, as shown for repeating unit II (Figure 12). The accommodation of abequose or tyvelose of serotypes B and D1 is achieved by positional wobbling between two acidic residues, Asp303 and Glu309 (Figure 13). Alternative interactions achieved by positional wobbling, combined with water-mediated protein-carbohydrate contacts, create a binding subsite with multiple specificities.

O-antigen form variation introduces D-glucose residues α-1,4 linked to D-galactose of the trisaccharide backbone. There is only electron density for the glucose of repeating unit I (Figure 14), although it is also present in 80% of repeating unit II of the S.
typhi 253Ty O-antigen fragments (Weintraub et al., 1988) used for soaking experiments. The electron density for Gal II is much weaker in the case of the S. typhi 253Ty O-antigen complex than in the complexes with the homogeneous S. typhimurium and S. enteritidis octasaccharides. No density is visible for Glc II of the S. typhi 253Ty decasaccharide, which might be explained by the lack of a well defined subsite for the additional D-glucose at the upper entrance of the binding cleft. The D-glucose of repeating unit I in S. typhi 253Ty O-antigens is accommodated in a well defined additional subsite (Figure 15). The occupation of this subsite requires Lys302 to change its conformation. This is the only major side-chain rearrangement observed upon O-antigen binding.

[Figure legend, beginning missing:] Each row corresponds to a β-helix turn (1 to 13) and each column to a ladder of aligned residues, pointing either into the interior of the β-helix (i) or to the outside (a). Dots indicate insertions or deletions in the β-helix. The sheets termed A (blue), B (yellow) and C (green) (Steinbacher et al., 1994) correspond to the sheets PB2, PB3 and PB1 in pectate lyase C from Erwinia chrysanthemi (Yoder et al., 1993). Only short repetitive elements of identical amino acids are present.

The overall dimensions of the tailspike protein and influenza virus haemagglutinin (Wilson et al., 1981) are rather similar. Both are trimeric, elongated molecules of 150 and 135 Å length, respectively, either bound to the neck structure of a bacteriophage via a specialised domain by strong but non-covalent protein-protein interactions or fixed in the outer membrane of an enveloped virus via a transmembrane anchor. Nature has solved the requirements for both proteins by a very different architecture. The molecular axis of haemagglutinin comprises a long three-α-helix bundle as a characteristic feature, whereas the central structural element of the tailspike protein is a three-β-helix bundle.
The receptor binding site of haemagglutinin is located in a carbohydrate binding domain at its tip, comprising a jelly-roll fold as often observed in viral coat proteins (Weis et al., 1988). The receptor binding site of the tailspike protein is located at the centre of the molecule. Its cleft structure is ideal to recognise the elongated O-antigen polymer of LPS. The different locations and architectures of the binding sites reflect different substrate types.

For the tailspike protein, the kinetics of receptor binding, which occurs rapidly and reversibly, suggest that the interaction with the receptor still allows the phage to browse over the surface of the host cell by rapid release and rebinding of the receptor. Therefore, the low enzymatic activity will not contribute much to lateral mobility, and it will certainly not give the tailspike protein the function of a drill in order to remove the O-antigen from the LPS and make a potential second receptor accessible. However, the activity is high enough to cleave a few O-antigen chains during the infection process. The location of the active site proximal to the cell surface with respect to the binding site is compatible with a liberating function to release newly assembled phages after cell lysis from the cell debris. Studies on the role of receptor destroying activities have been reported for influenza A virus (Liu et al., 1995) and human parainfluenza 3 virus (Huberman et al., 1995), where binding and receptor destroying activity are either separated on haemagglutinin and neuraminidase or combined in one protein, the haemagglutinin-neuraminidase, respectively. The loss or reduction of receptor destroying neuraminidase activity does not prevent viral entry but blocks multicycle infections due to aggregation of virus particles or delayed virus release.
Point mutants of the tailspike protein defective in endorhamnosidase activity but not in receptor

The tailspike protein of Salmonella phage P22 is a homotrimer of 72 kDa subunits, which functions as a viral adhesion protein with a very low receptor destroying enzymatic activity. The head-binding domain and the C-terminal part are probably linked in a flexible manner. Two O-antigenic repeating units are sufficient for receptor binding. They are recognised in a 20 Å long cleft in the centre of the 150 Å long molecule, which is located between insertions into the right-handed parallel β-helix main body of the molecule. The active-site topology suggests an inverting endoglycosidase mechanism with Asp392 as general acid and a catalytic water molecule activated by Glu359 and Asp395 as general bases. The multiple host specificity is achieved by a wobble mechanism between two acidic residues, Asp303 and Glu309, in a subsite for 3,6-dideoxyhexoses (paratose, abequose and tyvelose), which also involves water-mediated contacts to the protein as an essential feature. The glucose introduced by O-antigen form variation is accommodated in an additional subsite for the O-antigen repeating unit proximal to the active site.

Crystallization and heavy-atom derivative preparation

The N-terminal domain of the tailspike protein (residues 1 to 124 with a C-terminal hexa-His tag) was overexpressed in E. coli BL21(DE3) under the control of a T7 promoter system and purified by Ni²⁺-affinity chromatography (Miller, 1995). L-Met(Te) was incorporated as described by Budisa et al. (1995). Briefly, the methionine auxotrophic strain E. coli B834(DE3)(hsd metB) harbouring the expression plasmid was grown on minimal medium with a limited amount of Met(S) (0.05 mM) to an A600 of 1.0, harvested, washed and resuspended in minimal medium. After induction with 1 mM IPTG, Ac-DL-Met(Te)-OH lithium salt (Karnbrock et al., 1996) was added to a final concentration of 1 mM.
Cells were incubated for another 12 hours under vigorous shaking at 37 °C. Purification was done as for the wild-type protein (Miller, 1995). The head-binding domain in 50 mM Hepes/NaOH (pH 6.5) was concentrated to 10 mg/ml using a Centricon 30 device (Amicon). Initial crystallization experiments were performed with Crystal Screen I and II (Hampton Research). In the final condition 5 ml of protein solution were mixed with 3 to 5 ml of reservoir solution containing 20% (w/v) polyethylene glycol 8000, 0.2 M MgCl2, 0.1 M bis-Tris-HCl (pH 6.6) and equilibrated against 5 ml reservoir solution by vapour diffusion. Crystals of approximately 0.2 mm were obtained by seeding the hanging drop with a small single crystal 6 to 12 hours after mixing. Met(Te) material gave larger crystals than wild-type protein. Crystals were harvested in the reservoir solution. Uranium derivatives were prepared by soaking either native or Met(Te) crystals in reservoir solution with 15 mM UO2(SO4) for one day.

Figure 10. Stereoview of the 2Fo − Fc electron density of the terminal rhamnose I (Rha I) of the O-antigen fragment from S. typhi253Ty bound to the active site at 1.8 Å resolution contoured at 1σ. The discontinuity of the electron density between C2 and C3 is also observed for S. typhimurium and S. enteritidis receptor fragments. The main conformation adopted by Rha I corresponds to a twisted boat conformation with C1 in α-configuration as in the substrate.

Figure 11. Stereoview of the active site with Rha I of the bound product. The steric situation suggests an inverting mechanism. A water molecule tightly bound also in the absence of the carbohydrate is activated by Glu359 and Asp395 as general bases. Asp392 serves as general acid and is H-bonded to O1 of Rha I.

Measurement of X-ray intensities was carried out using a MARresearch image plate (MARresearch, Hamburg) and Cu Kα radiation from a Rigaku RU 200 X-ray generator operated at 5.4 kW at 16 °C.
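As a plausibility check on the packing of the head-binding domain crystals, the Matthews coefficient can be recomputed from the unit cell reported for them in this section (space group P2(1), a = 57.3 Å, b = 82.1 Å, c = 73.8 Å, β = 90.9°, two trimers per asymmetric unit). The following is a minimal sketch, not the authors' calculation: the function names are hypothetical, the subunit mass of about 14.1 kDa for residues 1 to 124 plus the hexa-His tag is an assumption not stated in the text, and the solvent estimate uses the common approximation of 1.23 Å³ Da⁻¹ for the protein partial specific volume.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit-cell volume in Å^3 for a monoclinic cell (alpha = gamma = 90°)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(volume, asu_per_cell, chains_per_asu, chain_mass_da):
    """Matthews coefficient V_M in Å^3 per Da: cell volume per unit protein mass."""
    return volume / (asu_per_cell * chains_per_asu * chain_mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M (Matthews, 1968 approximation)."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    # Cell dimensions as reported for the head-binding domain crystals.
    volume = monoclinic_cell_volume(57.3, 82.1, 73.8, 90.9)

    # P2(1) has two asymmetric units per cell; two trimers = six chains each.
    # ASSUMPTION: ~14.1 kDa per chain (not given in the paper).
    vm = matthews_coefficient(volume, asu_per_cell=2, chains_per_asu=6,
                              chain_mass_da=14100.0)

    print(f"cell volume      : {volume:.0f} Å^3")
    print(f"Matthews coeff.  : {vm:.2f} Å^3/Da")
    print(f"solvent fraction : {solvent_fraction(vm):.2f}")
```

With the assumed chain mass, the result comes out close to the reported value of 2.05 Å³ Da⁻¹, which corresponds to a fairly densely packed crystal with roughly 40% solvent. A different assumed mass shifts the value proportionally.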
Images were processed with MOSFLM (Leslie, 1991) and integrated intensities scaled and merged with AGROVATA, ROTAVATA and TRUNCATE of the CCP4 package (CCP4, 1994). The crystals were of space group P2(1) with cell dimensions a = 57.3 Å, b = 82.1 Å, c = 73.8 Å, α = 90°, β = 90.9°, γ = 90° and two trimers in the asymmetric unit, resulting in a Matthews coefficient (Matthews, 1968) of 2.05 Å³ Da⁻¹. Data to 1.56 Å resolution of the C-terminal fragment (residues 109 to 666) were collected at room temperature using a MARresearch image plate (MARresearch, Hamburg) at the beamline BW6 of the HASYLAB at DESY (Hamburg). Data were processed with MOSFLM (Leslie, 1991) and integrated intensities scaled and merged with AGROVATA, ROTAVATA and TRUNCATE of the CCP4 package (CCP4, 1994). The resolution limit of the images was chosen to have I > 5σ(I) in the outermost resolution shell using the program X-DAMAGE (Löwe, 1995). The self-rotation function, calculated with GLRF (Tong & Rossmann, 1990), showed only one peak when searching for triads, consistent with almost parallel molecular axes for both asymmetric trimers. The uranium derivative (UOTE) was identified using Met(Te) crystals as parent native crystals as these were easier to grow and diffracted to higher resolution due to their larger size. Six sites were identified in the Met(Te) derivative (TEME) using difference Patterson and difference Fourier maps with the PROTEIN package (Steigemann, 1991). A double derivative UOTE was subsequently prepared. Phases for difference Fourier maps were calculated with MLPHARE (CCP4, 1994) and improved with DM (CCP4, 1994) by solvent flattening and histogram mapping. The MIR phases, calculated with three derivatives, had an overall figure of merit of 0.57 in the resolution range 20

Figure 12. Stereoview of the 2Fo − Fc electron density of the 3,6-dideoxyhexoses in repeating unit II pointing towards the protein surface.
(a) Abequose (serotype B) and (b) tyvelose (serotype D1) at 1.8 Å resolution contoured at 1σ.

Figure 13. Stereoview of the binding site for 3,6-dideoxyhexoses in repeating unit II. Multiple host specificity is achieved by alternative contacts between the 3,6-dideoxyhexose and the protein, which are partly mediated by two water molecules. The distances between Abe O2 and Asp303 and Tyv O4 and Glu309 are 2.7 and 3.2 Å, respectively.

Stereoview of the binding site for glucose I of O-antigen 12₂ introduced by form variation. Lys302 has to reorient its side-chain to open the binding site. This is the only rearrangement of side-chains observed upon receptor binding.

Methods used in the structure determination of bovine mitochondrial F1 ATPase
The neuraminidase of influenza virus
Three-dimensional structure of the alkaline protease of Pseudomonas aeruginosa: a two-domain protein with a calcium binding parallel beta roll motif
Interactions of phage P22 tails with their cellular receptor, Salmonella O-antigen polysaccharide
Purification and organization of gene 1 portal protein required for phage P22 DNA packaging
Structure and function of the bacteriophage P22 tail protein
Mechanism of head assembly and DNA encapsulation in Salmonella phage P22: I. Genes, proteins, structures and DNA maturation
High-level biosynthetic substitution of methionine in proteins by its analogs 2-aminohexanoic acid, selenomethionine, telluromethionine and ethionine in Escherichia coli
Control mechanisms in dsDNA bacteriophage assembly
Thermal unfolding pathway for the thermostable P22 tailspike endorhamnosidase
The CCP4 suite: programs for protein crystallography
Folding and assembly of phage P22 tailspike endorhamnosidase lacking the N-terminal, head-binding domain
Structures and mechanisms of glycosyl hydrolases
DNA packaging by double stranded DNA bacteriophages
Accurate bond and angle parameters for X-ray protein structure refinement
Structure of Bordetella pertussis virulence factor P.69 pertactin
Adsorption of phage P22 to Salmonella typhimurium
Salmonella phage glycanases: substrate specificity of the phage
Heterogeneity of antigen-side-chain length in lipopolysaccharide from Escherichia coli 0111 and Salmonella typhimurium LT2
Virus structure
Determination of macromolecular structures from anomalous diffraction of synchrotron radiation
The receptor-destroying enzyme of influenza C virus is neuraminate-O-acetylesterase
Hemagglutinin-neuraminidase of human parainfluenza 3: role of the neuraminidase in the viral life cycle
In vitro morphogenesis of phage P22 from heads and base-plate parts
A model for the adsorption of phage P22 to Salmonella typhimurium
Enzymatic and molecular properties of base-plate parts of bacteriophage P22
A graphics model building & refinement system for macromolecules
Improved methods for building protein models in electron density maps and the location of errors in these models
Dictionary of protein secondary structure: pattern recognition of hydrogen bonded and geometrical features
A new efficient synthesis of acetyltelluro- and acetylselenomethionine and their use in the biosynthesis of heavy-atom protein analogs
A typhoid variant and a new serological variation in the Salmonella group
A left-handed β-helix revealed by the crystal structure of a carbonic anhydrase from the archaeon Methanosarcina thermophila
Masks made easy. ESF/CCP4 Newsletter
Halloween ... masks and bones
MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures
Recent changes to the MOSFLM package for processing film and image plate data
Bacteriophage surface carbohydrates and bacteriophage adsorption
Influenza type A virus neuraminidase does not play a role in viral entry, replication, assembly, or budding
Röntgenstrukturanalyse des 20S Proteasoms aus Thermoplasma acidophilum. Thesis, Technische Universität München
Bacteriol. Rev. 30
Solvent content of protein crystals
Intragenic suppression of a capsid assembly-defective P22 tailspike mutation
Mechanisms of enzymatic glycoside hydrolysis
GRASP: graphical representation and analysis of surface properties
Lipopolysaccharide heterogeneity in Salmonella typhimurium analyzed by sodium dodecyl sulfate/polyacrylamide gel electrophoresis
The structure of Bacillus subtilis pectate lyase in complex with calcium
P22 bacteriophage
Three-dimensional transformation of capsids associated with genome packaging in a bacterial virus
A left-handed parallel β helix in the structure of UDP-N-acetylglucosamine acyltransferase
Conformation of polypeptides and proteins
The envelope glycoprotein from tick-borne encephalitis virus at 2 Å resolution
Phage P22 tail protein: gene and amino acid sequence
Characterization of bacteriophage P22 tailspike mutant proteins with altered endorhamnosidase and capsid assembly activities
Recent advances in the PROTEIN program system for the X-ray structure analysis of biological macromolecules
Crystal structure of P22 tailspike protein: interdigitated subunits in a thermostable trimer
Crystal structure of phage P22 tailspike protein complexed with Salmonella sp. O-antigen receptors
Salmonella bacteriophage glycanases: endorhamnosidases of Salmonella typhimurium bacteriophages
Three-dimensional structure of scaffolding-containing phage P22 procapsids by electron cryomicroscopy
The locked rotation function
Structure of the influenza virus glycoprotein antigen neuraminidase at 2.9 Å resolution
Heterogeneity in oligosaccharides from the O-polysaccharide chain of the lipopolysaccharide from Salmonella typhi253Ty determined by fast atom bombardment mass spectrometry
Structure of the influenza virus haemagglutinin complexed with its receptor, sialic acid
Structure of the haemagglutinin membrane glycoprotein of influenza virus at 3 Å resolution
Crystal structure of the receptor-binding domain of adenovirus type 5 fiber protein at 1.7 Å resolution
The parallel β helix and other coiled folds
New domain motif: the structure of pectate lyase C, a secreted plant virulence factor. Science, 260, 1503–1507

We thank Dr H. D. Bartunik for help in data collection at beamline BW6 at the HASYLAB at DESY, Hamburg and Professor Dr L. Moroder and Dr W. Karnbrock (Max-Planck-Institut für Biochemie, Martinsried) for the generous gift of telluromethionine compounds. The work was supported by grants from the Deutsche Forschungsgemeinschaft and the Fonds der Chemischen Industrie (S.M., U.B. and R.S.).