key: cord-0759600-dnh4n8do authors: Rosenthal, Peter B.; Zhang, Xiaodong; Formanowski, Frank; Fitz, Wolfgang; Wong, Chi-Huey; Meier-Ewert, Herbert; Skehel, John J.; Wiley, Don C. title: Structure of the haemagglutinin-esterase-fusion glycoprotein of influenza C virus date: 1998 journal: Nature DOI: 10.1038/23974 sha: 0e50d299c3c9c97a52e3ceee9870f2784c34e630 doc_id: 759600 cord_uid: dnh4n8do The spike glycoproteins of the lipid-enveloped orthomyxoviruses and paramyxoviruses have three functions: to recognize the receptor on the cell surface, to mediate viral fusion with the cell membrane, and to destroy the receptor. In influenza C virus, a single glycoprotein, the haemagglutinin-esterase-fusion (HEF) protein, possesses all three functions (reviewed in ref. 1). In influenza A and B, the first two activities are mediated by haemagglutinin and the third by a second glycoprotein, neuraminidase. Here we report the crystal structure of the HEF envelope glycoprotein of influenza C virus. We have identified the receptor-binding site and the receptor-destroying enzyme (9-O-acetylesterase) sites, by using receptor analogues. The receptor-binding domain is structurally similar to the sialic acid-binding domain of influenza A haemagglutinin, but binds 9-O-acetylsialic acid. The esterase domain has a structure similar to the esterase from Streptomyces scabies and a brain acetylhydrolase (refs 2, 3). The receptor domain is inserted into a surface loop of the esterase domain and the esterase domain is inserted into a surface loop of the stem. The stem domain is similar to that of influenza A haemagglutinin, except that the triple-stranded, α-helical bundle diverges at both of its ends, and the amino terminus of HEF2, the fusion peptide, is partially exposed.
The segregation of HEF's three functions into structurally distinct domains suggests that the entire stem region, including sequences at the amino and carboxy termini of HEF1 which precede the post-translational cleavage site between HEF1 and HEF2, forms an independent fusion domain which is probably derived from an ancestral membrane fusion protein.

The structure of the HEF trimer is shown in Fig. 1. The HEF monomer is composed of three domains: an elongated stem (red in Fig. 2a) active in membrane fusion (F), a receptor-destroying esterase domain (E) (green in Fig. 2a), and a receptor-binding domain (R) (blue in Fig. 2a). Two of these compact domains are made from non-contiguous segments of amino-acid sequence (Fig.
2b): the stem domain F consists of the amino-terminal amino acids 1-40 and carboxy-terminal residues 367-432 of HEF1 and all of HEF2 (labelled F1, F2, F3 in Fig. 2a, b); the esterase domain E consists of HEF1 segments comprising residues 41-150, which precede the receptor domain, and residues 311-366, which follow R (labelled E1, E′ and E2 in Fig. 2a, b). The single-segment R domain is inserted into a surface loop of the esterase domain, and the esterase domain is inserted into a surface loop near the top of the stem domain F (Fig. 2c). The R and E domains of HEF are both compact, having their N and C termini within a few ångströms of each other, so that they can be accommodated into a pre-existing protein at surface loops without disrupting either protein's structure or function. Alignment of the amino-acid sequences of HEF and HA based on their three-dimensional structures indicates that they have 12% sequence identity (the alignment is available from the authors). Nevertheless, both the overall structure (compare Fig. 2a and d) and the detailed folds of individual segments (Fig. 2a) of HEF and HA are remarkably similar. This is true for the globular R domain, as well as for the highly extended segments F1 and F2 (Fig. 2a), and for the similar helical-hairpin and membrane-proximal five-stranded β-sheet forming HEF2 and HA2 (F3 in Fig. 2a). Two of the segments of the enzyme domain, E1 and E2, are not found in HA (compare Fig. 2b and e), which is consistent with its lack of enzyme activity. If the E′ domain of HA is a vestigial fragment of E, then HA may have evolved by the deletion of segments E1 and E2 from an ancestral gene similar to HEF. Segments E1, E′, R, E2, and F2 are present, with 30% sequence identity, in the haemagglutinin-esterase (HE) found in coronaviruses (ref. 4). (Data concerning the antigenicity of HEF and the structural relation between HEF and HE will be published elsewhere.)
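The nested domain composition described above (R inside E, E inside F) can be made concrete with a small lookup table. The following is only a minimal sketch: the F and E residue ranges are those quoted in the text, whereas the R range of 151-310 is an assumption inferred from the flanking esterase segments, and the names `HEF1_SEGMENTS`, `hef1_domain` and `domain_order` are hypothetical helpers, not part of any published software.

```python
# Residue ranges of HEF1 (inclusive) as quoted in the text above.
# The R range 151-310 is an ASSUMPTION: the text only states that R lies
# between the esterase segments 41-150 and 311-366.
HEF1_SEGMENTS = [
    (1, 40, "F"),     # stem, N-terminal segment of HEF1
    (41, 150, "E"),   # esterase, precedes R
    (151, 310, "R"),  # receptor-binding domain (inferred bounds)
    (311, 366, "E"),  # esterase, follows R
    (367, 432, "F"),  # stem, C-terminal segment of HEF1
]


def hef1_domain(residue):
    """Return 'F', 'E' or 'R' for a 1-based HEF1 residue number."""
    for start, end, domain in HEF1_SEGMENTS:
        if start <= residue <= end:
            return domain
    raise ValueError(f"residue {residue} is outside HEF1 (1-432)")


def domain_order():
    """Domains in the order they are met along the HEF1 chain."""
    return [domain for _, _, domain in HEF1_SEGMENTS]


if __name__ == "__main__":
    print(domain_order())
    print(hef1_domain(20), hef1_domain(100), hef1_domain(200))
```

Read along the chain, the domain order is F, E, R, E, F, which is the sequence-level signature of one domain inserted into a loop of another. HEF2, which belongs entirely to F, is deliberately left out of this table.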
The R domains of both HEF and HA are similar eight-stranded 'Swiss rolls' of β-sheet (Fig. 2a, d) containing the receptor-binding sites bounded by an α-helix, a loop, and an extended strand (superimposed in Fig. 3b) (except for one residue, Tyr 127 from E′). The structures of the complexes of HEF with two receptor analogues (refs 5, 6), 9-acetamidosialic acid α-methylglycoside (Fig. 3a) and 9-acetamidosialic acid α-thiomethylmercuryl glycoside (ref. 7), show that sialosides bind similarly to HEF and to HA, although they are recognized by different amino-acid side chains (see ref. 8 and references therein). Sialic acid linkage specificity (α(2,3) versus α(2,6)) has been attributed to residues near amino acid 226 in HA (refs 8-10). The homologous loop (near HEF 270) is truncated in HEF (Fig. 2b), consistent with the lack of linkage specificity of influenza C virus (ref. 11). The HEF receptor differs from the HA receptor by the addition of an acetyl group at the 9-O position of the glycerol side chain. The acetyl methyl group binds in a nonpolar pocket unique to HEF (Fig. 3a). Complexes of HEF with the two non-hydrolysable receptor analogues described above also show how substrate interacts with the novel 9-O-acetylesterase active site (Fig. 4a). Ser 57 in the catalytic triad (Ser 57 from E1, His 355 and Asp 352 from E2) is positioned for nucleophilic attack on the carbonyl carbon of the 9-O acetyl group of the bound sialoside. The carbonyl oxygen points into an 'oxyanion hole' formed by the side chain of Asn 117, and the NH groups of Gly 85 and Ser 57 (Fig. 4a). The ligand interactions and the structure of the enzyme site are completely different from those of the receptor-binding site. A search for proteins with structural similarity to the HEF enzyme domain identified the α1 subunit of platelet-activating factor acetylhydrolase (Ib) from bovine brain (PAF-AH) (ref. 3) and the esterase from Streptomyces scabies (SsEst) (ref. 2) (Fig. 4b).
All three proteins have a similar topology, despite sharing only 13% sequence identity, and their core residues superimpose (r.m.s. ~3 Å), including the central five-stranded β-sheet and long flanking helices (blue in Fig. 4b). When the core folds are superimposed, the catalytic triads (green in Fig. 4b) and oxyanion-hole residues of each protein overlap. Interactions between the α-helices in the stem of the HEF trimer and the packing of the N-terminal fusion peptide of F3 have implications for low-pH-induced conformational change in HEF and the mechanism by which this induces membrane fusion at low pH. The F3 segments of HA and HEF are very similar in structure (Fig. 2a): each monomer has a central α-helix along the threefold axis and a smaller N-terminal helix packed antiparallel on the outside and connected by an interhelical loop (Fig. 2a). In HEF2, although the central helices interact closely in the middle like HA, they diverge from the trimer axis at both ends (Fig. 5a). At the top, the interhelical loops interpose between the first five turns of the long helices (residues 80-97), where loop residues HEF2 Arg 69 and central helix residues HEF2 Glu 95 form salt bridges and contact an unidentified ion (possibly a sulphate) on the trimer axis. Although HEF2 has a sequence deletion of seven residues in the loop region, this difference would preserve the register of the heptads during the formation of an extended triple-stranded coil in the low-pH conformation, like that in HA (ref. 12). Unlike HA, however, the top third of the triple-stranded helical bundle must first make interactions on the trimer axis after the removal of the interhelical loop, before the N-terminal coiled-coil extension can form. Three tryptophans (HEF2 116) form the last interaction on the trimer axis (Fig. 5a), below which the helices diverge as in HA.
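The heptad-register argument made above for the seven-residue loop deletion is simple modular arithmetic, and a short sketch may help. It assumes an idealized coiled coil with strict seven-residue periodicity and 0-based positions along the helix; the function names are hypothetical and chosen only for illustration.

```python
HEPTAD = "abcdefg"  # conventional labels for the seven coiled-coil positions


def heptad_position(index, first_a=0):
    """Heptad letter of the residue at 0-based `index`, given that the
    residue at `first_a` occupies an 'a' position (idealized 7-residue repeat)."""
    return HEPTAD[(index - first_a) % 7]


def register_shift(deleted):
    """Change in heptad register (0-6) felt by residues downstream of a
    deletion of `deleted` residues; 0 means the register is preserved."""
    return (-deleted) % 7


if __name__ == "__main__":
    for n in (3, 4, 7, 14):
        print(n, "residues deleted -> register shift", register_shift(n))
```

Under these assumptions, only deletions whose length is a multiple of seven leave downstream residues in the same a-g positions, which is why a seven-residue deletion is compatible with an uninterrupted extended coil.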
Unlike HA, in which residues 2 (Leu) and 3 (Phe) of the HA2 N-terminal 'fusion peptides' interact across the trimer axis, HEF2 residues Val 11 and Leu 12 are closest to the trimer axis but further penetration is blocked by tryptophans at position 116. Residues N-terminal to residue 10 fold back out to the surface of the protein, where aspartic acids at positions 5 and 6 of HEF2 can interact with Arg 29 and Lys 30 of HEF2 and Lys 4 of HEF1. Residues 1-4 of HEF2 appear to be disordered on the surface of the molecule. In HEF, the residues that are functionally analogous to the fusion peptide of HA, namely the buried residues whose exposure would convert a soluble protein into a lipophilic one, are displaced along the sequence by six residues. HEF may therefore be regarded as having an internal fusion peptide, similar to virus fusion proteins that do not require cleavage activation. The conformation of the N terminus of the HEF2 fusion peptide may indicate that fusion peptides can insert into membranes as loops. Because the influenza C virus esterase domain E is folded like other esterases (SsEst and PAF-AH), and the R domain present in HEF and HA is a ubiquitous folding unit also found as a receptor-binding domain in the orbivirus BTV (ref. 13), we conclude that HEF must have evolved from functional domains (Fig. 2c). A precursor to HEF may have evolved by recombination events that resulted in the insertion of R into a surface loop of E, and E into the interhelical loop of the stem domain F (Fig. 2c). In such a scheme, the trimeric F domain (Fig. 5b) may have been an ancestral membrane-fusion protein analogous to the single-function fusion proteins of paramyxoviruses such as Sendai (ref. 14). Similar modular structures of the envelope glycoproteins of retroviruses are suggested by biochemical data.
The first 62 and the last 20 residues of gp120 from HIV-1 can be removed, retaining receptor binding of the fragment (ref. 15), indicating that those terminal segments might be analogous to F1 and F2, forming with gp41 (F3) the stem of gp160 (refs 15-17). The HEF structure implies that the membrane fusion domains of HEF and HA consist of F1 and F2, in addition to the segment F3 = HEF2, suggesting that F1 and F2 may play a part in membrane fusion, either by controlling the low-pH-induced conformational change required for fusion or during the formation of a fusion pore. In both HEF and HA, F2 packs against the interhelical loop of F3, which refolds to a helix at low pH, but the position of F2 after refolding is unknown. The β-hairpin of F1 (residues 15-32 of HEF1; Fig. 2a) has already been implicated in the low-pH-induced conformational change of HA by proteolytic susceptibility (ref. 18), and by its location adjacent to the site of the helix-to-β-turn refolding and chain-direction reversal on the long HA2 helix (ref. 12).

Structure determination. HEF was obtained by bromelain digestion of C/Johannesburg/1/66 virions and two crystal forms were characterized as described previously (ref. 19). Crystals in harvest buffer (60% saturated ammonium sulphate, 50 mM Tris-HCl pH 7.1, 140 mM NaCl) were transferred to a ligand soak solution (40 mM 9-acetamidosialic acid α-thiomethylmercury glycoside, 300 mM MOPS pH 7.1, 140 mM NaCl) for 2 h and then cryoprotected by serial transfer through ligand soak solution containing 5-25% glycerol in 5% steps. Complexes of 9-acetamidosialic acid α-methyl glycoside (K_i = 2 mM) with HEF were prepared by soaking form I crystals in 40 mM ligand using the same procedure but with a slightly different soak solution (40 mM 9-acetamidosialic acid α-methyl glycoside, 100 mM Tris-HCl pH 7.1, 140 mM NaCl). 9-Acetamidosialic acid α-methyl glycoside was a gift from J. Hanson.

Data collection and processing.
X-ray diffraction data were collected at the Cornell High Energy Synchrotron Source by flash-cooling crystals and ... (ref. 19). A 6.5 Å derivative dataset of form I crystals complexed with 9-acetamidosialic acid α-thiomethylmercury glycoside collected on a Mar scanner and Elliot GX-13 rotating anode provided initial phases. Data were processed using Denzo, Scalepack (ref. 20), and programs from the CCP4 suite (ref. 21).

Phasing, model building and refinement. 6.5 Å resolution SIR phases were extended to 3.5 Å resolution in form I crystals by iterative solvent flattening, histogram matching, and non-crystallographic symmetry averaging about the molecular three-fold axis using the program DM (ref. 22). Details of the phase extension, model building, two-crystal-averaging, and refinement to 3.2 Å resolution will be described elsewhere (X.Z. et al., manuscript submitted). The current model (R_free = 26.7%, R_work = 22.3%) contains HEF1 residues 1-427 (out of 432), residues 4-165 (out of 175) of HEF2, and no solvent molecules. Core oligosaccharide (MAN-NAG-NAG) was built at 5 of 8 potential N-linked glycosylation sites (indicated in Fig. 2). No evidence for CHO was found at HEF1 Asn 117, occurring at the rarely glycosylated sequence NWSP. For the liganded HEF structures, the HEF model was subjected to rigid body and positional refinement against ligand data to 3.5 Å. Ligands were built into iteratively averaged 2F_o − F_c density phased with the HEF model, omitting residues within 5 Å of the binding sites.

Structure analysis. A Go plot (ref. 23) was used to help identify domains in HEF. The Dali program and database were used to find structurally similar domains (ref. 24). Least-squares superpositions were performed using the program Lsqman (ref. 25). RIBBONS (ref. 26) was used to produce Fig. 1a; GRASP (ref. 27) for Fig. 1b; BOBSCRIPT (ref. 28) and Raster3D (ref. 29) for Figs 3a and 4a; and SETOR (ref. 30) for Figs 2a, 3b, 4b and 5.

1. Structure and function of the HEF glycoprotein of influenza C virus
2. A novel variant of the catalytic triad in the Streptomyces scabies esterase
3. Brain acetylhydrolase that inactivates platelet-activating factor is a G-protein-like trimer
4. Sequence of mouse hepatitis virus A59 mRNA 2: indications for RNA recombination between coronaviruses and influenza C virus
5. Crystallographic Studies of the Influenza C Virus Surface Glycoprotein. Thesis
6. A synthetic sialic acid analogue is recognized by influenza C virus as a receptor determinant but is resistant to the receptor-destroying enzyme
7. Synthesis and inhibitory properties of a thiomethylmercuric sialic acid with application to the X-ray structure determination of 9-O-acetylsialic acid esterase from influenza C virus
8. Binding of the influenza A virus to cell-surface receptors: Structures of five hemagglutinin-sialyloligosaccharide complexes determined by X-ray crystallography
9. Structure of the influenza haemagglutinin complexed with its receptor, sialic acid
10. Single amino-acid substitutions in influenza haemagglutinin change receptor binding specificity
11. Influenza C virus uses 9-O-acetyl-N-acetylneuraminic acid as a high affinity receptor determinant for attachment to cells
12. Structure of influenza haemagglutinin at the pH of membrane fusion
13. The crystal structure of bluetongue virus VP7
14. Identification of biological activities of paramyxovirus glycoproteins. Activation of cell fusion, hemolysis and infectivity by proteolytic cleavage of an inactive precursor protein of Sendai virus
15. Truncated variants of gp120 bind CD4 with high affinity and suggest a minimum CD4 binding region
16. Analysis of the interaction of the human immunodeficiency virus type 1 gp120 envelope glycoprotein with the gp41 transmembrane glycoprotein
17. Human immunodeficiency virus type 1 gp120 envelope glycoprotein regions important for association with the gp41 transmembrane glycoprotein
18. Changes in the conformation of influenza virus hemagglutinin at the pH optimum of virus-mediated membrane fusion
19. Crystallization and preliminary X-ray diffraction studies of the influenza C virus glycoprotein
20. Processing of X-ray diffraction data collected in oscillation mode
21. Collaborative Computational Project No. 4, the CCP4 Suite: Programs for Protein Crystallography
23. Correlation of DNA exonic regions with protein structural units in haemoglobin
24. Protein structure comparison by alignment of distance matrices
26. Ribbon models of macromolecules
27. Protein folding and association: insights from interfacial and thermodynamic properties of hydrocarbons
28. An extensively modified version of Molscript which includes greatly enhanced coloring capabilities
29. Raster3D: photorealistic molecular graphics
30. SETOR: Hardware-lighted three-dimensional solid model representations of macromolecules

Acknowledgements. We thank M. Frayser and R. Crouse for technical assistance, and members of the Harrison-Wiley laboratory and the staff of the Cornell High Energy Synchrotron Source for assistance with data collection. P.B.R. received graduate research assistant support from the Department of Molecular and Cellular Biology and the committee on higher degrees in biophysics, Harvard University, as a student, and an NIH training grant as a postdoctoral fellow; X.Z. was supported by the HHMI. This work was supported by the NIH, the Deutsche Forschungsgemeinschaft, the MRC and the HHMI. D.C.W. is an investigator of the HHMI.

Correspondence and requests for materials should be addressed to D.C.W. Coordinates will be deposited in the Brookhaven Database and are available before release from http://www.crystal.harvard.edu.