key: cord-0000094-9zmyojbu authors: Villullas, Silvia; Hill, Darryl J; Sessions, Richard B; Rea, Jon; Virji, Mumtaz title: Mutational analysis of human CEACAM1: the potential of receptor polymorphism in increasing host susceptibility to bacterial infection date: 2007-02-01 journal: Cell Microbiol DOI: 10.1111/j.1462-5822.2006.00789.x sha: 6ea2532aacbdcea4609067523e9e72fa8cd73200 doc_id: 94 cord_uid: 9zmyojbu A common overlapping site on the N-terminal IgV-like domain of human carcinoembryonic antigen (CEA)-related cell adhesion molecules (CEACAMs) is targeted by several important human respiratory pathogens. These include Neisseria meningitidis (Nm) and Haemophilus influenzae (Hi) that can cause disseminated or persistent localized infections. To define the precise structural features that determine the binding of distinct pathogens with CEACAMs, we have undertaken molecular modelling and mutation of the receptor molecules at previously implicated key target residues required for bacterial binding. These include Ser-32, Tyr-34, Val-39, Gln-44 and Gln-89, in addition to Ile-91, the primary docking site for the pathogens. Most, but not all, of these residues were located adjacent to each other in a previous N-domain model of human CEACAM1, which was based on REI, CD2 and CD4. In the current studies, we have refined this model based on the mouse CEACAM1 crystal structure, and observe that all of the above residues form an exposed continuous binding region on the N-domain. Examination of the model also suggested that substitution of two of these residues, 34 and 89, could affect the accessibility of Ile-91 for ligand binding. By introducing selected mutations at positions 91, 34 and 89, we confirmed the primary importance of Ile-91 in all bacterial binding to CEACAM1 despite the inter- and intraspecies structural differences between the bacterial CEACAM-binding ligands.
The studies further indicated that the efficiency of binding was significantly enhanced for specific strains by mutations such as Y34F and Q89N, which also altered the hierarchy of Nm versus Hi strain binding. These studies imply that distinct polymorphisms in human epithelial CEACAMs have the potential to decrease or increase the risk of infection by the receptor-targeting pathogens. The bacterial pathogens Neisseria meningitidis (Nm) and Haemophilus influenzae (Hi) are frequently found in the nasopharynx of a substantial proportion of the healthy population but are capable of causing serious infections in susceptible individuals (Turk, 1984; Foxwell et al., 1998) . Nm and typable Hi (THi) can invade the nasopharyngeal epithelial barrier to cause septicaemia and meningitis, which in the case of Nm, may rapidly become life threatening (van Deuren et al., 2000) . Non-typable Hi (NTHi), which lack a polysaccharide capsule, are associated with localized respiratory tract and conjunctival infections (Foxwell et al., 1998) . Strains belonging to Hi-biogroup aegyptius (Hi-aeg) are also associated with Brazilian purpuric fever (Foxwell et al., 1998) . The factors that determine susceptibility to infection by these frequent colonizers are not entirely clear. For both colonization and pathogenesis, the first essential step is adherence to mucosal epithelial cells. Many investigations have shown bacterial targeting of specific human signalling molecules such as integrins, sialic acid binding Ig like lectins (Siglecs) and carcinoembryonic antigen (CEA)-related cell adhesion molecules (CEACAMs) can lead to cellular invasion (Virji et al., 1995; 1999; Hauck and Meyer, 2003; Jones et al., 2003) . 
Of these, CEACAMs have emerged as common targets of several respiratory mucosal pathogens, including Nm, Hi and Moraxella catarrhalis, as well as the urogenital pathogen Neisseria gonorrhoeae and the enteric pathogens Escherichia coli and Salmonella (Leusch et al., 1991; Virji et al., 1996a; Chen et al., 1997; Gray-Owen et al., 1997; Hill and Virji, 2003). Carcinoembryonic antigen-related cell adhesion molecules belong to the immunoglobulin (Ig) superfamily. Several members of the CEACAM subgroup are expressed on human epithelial cells and include the widely expressed transmembrane CEACAM1 as well as GPI-anchored CEA and CEACAM6. All CEACAMs have an N-terminal IgV-like domain and variable numbers of IgC2-like A and B domains. CEACAM1 comprises up to four extracellular domains (N, A1, B and A2) and either a long or a short cytoplasmic tail (Tsutsumi et al., 1990; Prall et al., 1996; Hammarstrom, 1999). Various functions have been attributed to CEACAM1 including cell-cell adhesion, insulin regulation and angiogenesis (Obrink, 1997; Hammarstrom, 1999; Wagener and Ergun, 2000; Najjar, 2002). Targeting of CEACAMs by N. gonorrhoeae, as well as by Nm and Hi, leads to cellular invasion and passage across polarized monolayers (Virji et al., 1999; Gray-Owen, 2003; M. Soriani, K. Setchfield, D.J. Hill, and M. Virji, unpublished data). A wide range of bacterial adhesins is involved in targeting the CEACAM N-terminal domains, including Opa proteins, a major adhesin family of pathogenic Neisseria, and the P5 proteins of Hi (Chen and Gotschlich, 1996; Virji et al., 1996a; Hill et al., 2001). Nm and N. gonorrhoeae contain multiple copies of Opa genes that encode conserved domains, which form β-barrel structures in bacterial membranes, and variable domains that form surface-exposed loops. In spite of the surface diversity afforded by the hyper-variable domains of the loops, the majority of the Opa proteins are capable of targeting CEACAMs (Virji et al., 1996a; Hauck and Meyer, 2003).
The P5 proteins of Hi are similar β-barrel forming proteins, also with surface variable loops (Webb and Cripps, 1998; Vandeputte-Rutten et al., 2003). The interactions between these bacterial ligands and CEACAMs are complex and the binding domain appears to involve more than one variable loop of the bacterial adhesins (Virji et al., 1999; Bos et al., 2002; de Jonge et al., 2003). Interestingly, antibody inhibition studies have shown that the diverse ligands of Neisseria and Haemophilus bind to an overlapping site on the N-domain. In addition, mutational analysis of the N-domain of CEACAM1 has identified several critical residues, particularly Ile-91. Alanine substitutions at these sites abrogated binding of most Opa- and P5-expressing bacteria to CEACAM1. Additional residues such as Tyr-34, Ser-32, Val-39, Gln-44 and Gln-89, most of which are located in the vicinity of Ile-91, appear to determine the efficiency of interactions of various Opa and P5 molecules (Virji et al., 1999; . The bacterial binding surface on the CEACAM1 N-domain is the protein face composed of the beta strands C′′, C′, C, F and G (CFG for brevity). Despite extensive investigations in a number of laboratories, particularly on neisserial Opa proteins, it remains unclear precisely how CEACAM-binding ligands engage with the receptors. A three-dimensional structural model of the human CEACAM1 N-domain has been previously generated based on other Ig family molecules (Virji et al., 1999). In the current investigation, we have refined our previous model based on the murine CEACAM1 crystal structure (Tan et al., 2002). Examination of this model showed a better confluence of the above residues into a continuous binding site and suggested that substitutions at positions 34 and 89 at the core of the binding region may affect bacterial access to the implicated key binding residue Ile-91.
By introducing conservative and non-conservative substitutions at positions 34, 89 and 91, we have examined the binding of a variety of bacterial strains to the receptor constructs. The data suggest that single nucleotide polymorphisms (SNPs) in individuals or populations that introduce substitutions in the CEACAM sequence, particularly at the bacterial binding site, could not only decrease but also significantly increase the functional affinity of pathogen interactions. Increased binding affinity may result in increased cellular invasion and thus may lead to increased host susceptibility to infection by CEACAM-targeting bacteria.
A three-dimensional model of the CEACAM1 N-domain was previously produced based on the Ig family molecules REI, CD2 and CD4 (Fig. 1A) (Virji et al., 1999). In this model, whilst most mutations affecting binding of Nm and Hi were located centrally on the CFG face of the protein, Val-39, involved in Opa binding, was located at a distance towards the bottom of the CFG face (Fig. 1A). Subsequently, a crystal structure of murine soluble CEACAM1a domains 1 and 4 was reported (Tan et al., 2002). Based on this, we have remodelled the human CEACAM1 N-domain. The CC′ loop that contains Val-39 folds back against the CFG face of this model such that it lies in close proximity to the other critical residues involved in bacterial adhesion (Fig. 1B). Further examination of this model suggested that the bacterial binding pocket forms a rather flat surface in the centre of the CFG face. It also appears that substituting neighbouring residues Tyr-34 and Gln-89 by Phe and Asn, respectively, could result in a further flattening of the bacterial binding surface (Fig. 1C-E), providing increased access to the key Ile-91. These residue changes also increase the size of the hydrophobic patch centred on Ile-91 (Fig. 1C-E). Both of these effects might facilitate binding of some bacterial ligands to the target receptor.
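The substitutions discussed in this study are written in the usual one-letter form (Y34F means Tyr-34 replaced by Phe). As a minimal sketch of how such labels can be parsed and applied to a sequence, the following uses a placeholder sequence that is NOT the real CEACAM1 N-domain: only the residues named in the text (Ser-32, Tyr-34, Val-39, Gln-44, Gln-89, Ile-91) are set, every other position is filled with glycine, and the length is arbitrary. The function and class names are illustrative assumptions.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the native residue
    position: int   # 1-based residue number
    mutant: str     # one-letter code of the replacement residue


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'Y34F' into native residue, position, new residue."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a point substitution: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, label: str) -> str:
    """Return a copy of `sequence` carrying the substitution named by `label`."""
    sub = parse_substitution(label)
    index = sub.position - 1
    if sequence[index] != sub.wild_type:
        raise ValueError(
            f"{label}: expected {sub.wild_type} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.mutant + sequence[index + 1:]


def toy_n_domain(length: int = 110) -> str:
    """Hypothetical placeholder sequence; only the residues named in the text are real."""
    residues = ["G"] * length
    for position, aa in {32: "S", 34: "Y", 39: "V", 44: "Q", 89: "Q", 91: "I"}.items():
        residues[position - 1] = aa
    return "".join(residues)


native = toy_n_domain()
for label in ("Y34F", "Q89N", "I91A"):
    mutant = apply_substitution(native, label)
    position = parse_substitution(label).position
    print(label, native[position - 1], "->", mutant[position - 1])
```

The check against the native residue guards against off-by-one numbering, which is the most common mistake when mutation labels are applied to a sequence numbered from a different start.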
(Virji et al., 1999). (B) New model based on the murine CEACAM1a N-domain crystal structure (Tan et al., 2002). The models are presented as van der Waals surface representations. In the latter case V39 locates more centrally with the other critical residues for bacterial binding. (C-E, left) Stereo pairs of ribbon diagrams for the CEACAM1 N-domain in the native form (C) or with Y34F (D) or Q89N (E) substitutions show a flattening of the bacterial binding region on CEACAM1 and improved accessibility of the primary binding residue I91. The side-chain atom colouring (carbon, green; oxygen, red; nitrogen, blue) shows the increased hydrophobic nature of the binding site around I91 in the mutant structures. (C-E, right) Surface presentations of the binding site (views corresponding to a 90° rotation of the stereo images in the horizontal axis) coloured according to hydrophobicity.
In order to assess the importance of these residues in interactions with mucosal pathogens, substitutions were introduced by site-directed mutagenesis at three sites (91, 34 and 89) on the N-domain. The substitutions introduced in CEACAM1 [NA1B]-Fc are shown in Fig. 2. Chimeric receptor proteins with the native sequence or with substitutions were produced by transient transfection of COS cells for the functional studies described below. The N-domain specific monoclonal antibody (mAb) YTH71.3 has been shown to require Gln-89 and Ile-91 for receptor recognition, whereas substitution of these amino acids had no effect on the binding of the polyclonal anti-CEACAM antibody A0115 (Virji et al., 1999). The novel receptor constructs produced in the current studies were first examined for their ability to bind to YTH71.3, A0115 and Kat4c. The latter mAb recognizes the A and B domains of the receptor (Jones et al., 1995). A0115 and Kat4c bound to the various receptor constructs and to the native NA1B-Fc molecule with equal efficiency.
For YTH71.3, Leu but not Thr at position 91 supported binding to the same extent as Ile of the native molecule. Similarly at position 34, only Phe could be effectively substituted for Tyr. Finally, Q89A and Q89N completely abrogated the antibody binding. Opa-expressing phenotypes of two Nm strains (C751 and MC58) were used to investigate their binding to the modified NA1B-Fc receptors. Three C751 derivatives expressing distinct Opa proteins (OpaA, OpaB and OpaD) and the MC58 derivative expressing an Opa protein designated OpaX (Virji et al., 1999) were assessed by receptor overlay experiments (Fig. 3). All the Nm isolates showed reduced binding to the soluble receptor with Y34A and I91A substitutions, whilst Q89A affected OpaD binding most significantly, confirming previous studies (Virji et al., 1999). Introduction of a Leu or Thr residue at position 91 was generally less disruptive to Opa interactions. I91T substitution either did not affect binding (OpaB and OpaX) or reduced binding by 50-60% (OpaD and OpaA). I91L substitution had no effect on OpaD or OpaB whilst having opposite effects on OpaA (~50% reduction) and OpaX (twofold increase) binding. Substitutions at position 34 other than Ala also supported binding of some but not all Opa proteins. Y34S substitution was unsuitable for the three Opa proteins of strain C751 but was tolerated by OpaX of MC58. Interestingly, Phe proved to be a more favoured residue in at least two cases, with OpaX as well as OpaB binding to the Y34F construct at threefold higher levels than to the native receptor. Substitution of Gln-89 with Asn revealed distinct patterns of interaction, with binding reduced in the order OpaA > OpaX > OpaB/D (Fig. 3). Overall, the data suggest that Opa binding to the receptor requires an extended aliphatic chain at position 91 and certain arrangements of the chain may facilitate binding of most Opa proteins (Ile = Leu > Thr). At position 34, the removal of the hydroxyl group (Y34F) is tolerated or preferred whereas the absence of the aromatic ring (Y34S) has an overall deleterious effect on Opa binding. Finally, reduction of the side chain extension (Q89N) reduces binding of three of the four Opa proteins. Comparison of C751 Opa-A, -B and -D, in which the differences exist only between their surface exposed loop structures (HV1 and HV2, shown in Fig. 9), suggests further that a combination of the two loop structures must be involved in presenting the appropriate binding partners for the distinct residues of the receptor.
A, B, D, E β-strand: (Virji et al., 1999). Side chains of the residues introduced at positions 34, 89 and 91 in this study are shown in B for comparison.
Fig. 3. Relative binding of CEACAM1-Fc constructs to N. meningitidis isolates expressing distinct Opa proteins. Bacterial lysates were dotted on to nitrocellulose and overlaid with NA1B-Fc constructs as indicated. Binding relative to the native receptor was determined by densitometric analysis of immunoblots using the NIH Scion Image programme. One hundred per cent binding level is indicated by the horizontal line, allowing comparison to native receptor binding. Mean values and SE of > 3 replicates are shown in each case, except Q89A (n = 1) for OpaA, B and X. However, alanine substitutions at all three positions confirm previous observations (Virji et al., 1999). Black inserts in C show the levels of binding of the various receptor constructs to the Opa- isolate of strain C751; experiments were conducted simultaneously with C751OpaD in the presence of the receptors shown.
One non-typable (A950002) and two THi strains (Rd and Eagan) were used in receptor overlay experiments as above (Fig. 4). Alanine substitutions at Ile-91 confirmed previous results (Virji et al., 2000). Taken individually, substitution of Ile-91 with Ala or Thr abrogated binding of all three strains.
However, substitution I91L reduced binding only of the THi strain Eagan, suggesting a requirement for the extended hydrophobic arm of Ile in Eagan binding. Alanine substitution of Tyr-34 abrogated the binding of Rd and Eagan, whereas no effect on NTHi A950002 was observed. In contrast, Y34F substitution led to an increased binding of all three strains varying from 45 to 80% above native receptor binding levels, whereas Y34S had a differential effect on the three strains tested, ranging from no binding of Rd to a slight increase in the binding of A950002. These data broadly reflect results with neisserial Opa binding. Finally substitutions at Gln-89 show that reduction or abrogation of the side chain (Q89N and Q89A) has no deleterious effect, rather its effect is often that of enhanced binding (Fig. 4) . In summary, Hi strains primarily require Ile-91 to enable receptor targeting. Tyr-34 influences binding of THi and as for Nm, removal of the OH (Y34F) provides a more favourable environment. Gln-89 side chain also limits bacterial interaction and its substitution to shorter side chains (Q89A and Q89N) is more favourable especially for Rd-CEACAM binding. In previous studies, interactions of Hi-aeg strains were shown to be more analogous to Nm (Virji et al., 2000) . The current studies accordingly demonstrated a requirement both for Ile-91 and Tyr-34 for all Hi-aeg (Fig. 5) . In addition, as with some Nm strains, the requirement for hydrophobic and aliphatic chains was partly fulfilled by Leu and to a lesser extent by Thr. Substitutions of residue Tyr-34 with Phe or Ser produced a complex profile. Phe generally created a suitable or better environment but the loss of the aromatic ring was also tolerated (Y34S). At position 89, Q > N substitution enhanced binding dramatically for Ha3. 
Overall, the three strains tested exhibited similar binding to all receptor constructs that had substitutions at residue Ile-91 as well as those with alanine substitution at residue Tyr-34 (Fig. 5). However, the strains exhibited differences in binding to the receptors with substitutions Y34F and Q89N, demonstrating that the CEACAM1-binding ligands of distinct Hi-aeg strains are also structurally diverse. From the above data, it is clear that receptor constructs with Q89N substitution have the capacity to increase the binding of some Hi over that of Nm. In previous investigations, we have shown that certain isolates of the two bacteria are capable of competing for the native receptor and that Nm C751 derivatives can displace Ha3 as well as Eagan from the receptor (Virji et al., 2000). Here we investigated how changes in receptor structure at position 89 may influence this phenomenon. We used receptor dot blot overlay in the presence of varying amounts of the soluble receptor constructs to estimate the ability of C751OpaB and Ha3 to selectively adsorb the receptor. Consistent with the lower affinity of Ha3 compared with C751OpaB/D for CEACAM1 (Virji et al., 2000), the proportion of the native CEACAM1 that interacted with Ha3 compared with C751OpaB was lower and declined further at limiting concentrations of the receptor (Fig. 6A). In contrast, Ha3 and C751OpaB had similar affinities for the receptor when the Q89N construct was present in excess (> 0.13 mg ml⁻¹; Fig. 6B). Further, at limiting concentrations of Q89N, the greater affinity of Ha3 was apparent: it bound this construct threefold more than the Nm derivative (Fig. 6B). The data demonstrate the potential influence of structural modulations at the bacterial binding site in changing the colonization profile of the target tissue.
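The overlay results above are reported as binding relative to the native receptor, with means and standard errors over replicate blots. A minimal sketch of that arithmetic follows. The signal values are invented for illustration and are not data from this study, the function names are assumptions, and the densitometry itself (done here with NIH Scion Image) is outside the scope of the example.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Optional, Sequence, Tuple


def relative_binding(mutant_signal: float, native_signal: float) -> float:
    """Binding to a mutant construct as a percentage of native-receptor binding."""
    if native_signal <= 0:
        raise ValueError("native signal must be positive")
    return 100.0 * mutant_signal / native_signal


def summarize(
    replicates: Sequence[Tuple[float, float]]
) -> Tuple[float, Optional[float]]:
    """Mean and standard error of percent binding over replicate blots.

    Each replicate is a (mutant_signal, native_signal) pair taken from the
    same blot, so that every percentage is normalised within its own blot.
    The standard error is None for a single replicate (n = 1).
    """
    percents = [relative_binding(m, n) for m, n in replicates]
    average = mean(percents)
    if len(percents) < 2:
        return average, None
    return average, stdev(percents) / sqrt(len(percents))


# Hypothetical densitometry readings (arbitrary units), three replicate blots.
hypothetical_blots = [(300.0, 100.0), (290.0, 100.0), (310.0, 100.0)]
average, standard_error = summarize(hypothetical_blots)
print(f"mean = {average:.1f}% of native, SE = {standard_error:.1f}")
```

Normalising each blot to its own native-receptor spot before averaging keeps blot-to-blot differences in exposure from inflating the spread.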
To assess whether CEACAM mutations that either significantly increase or decrease bacterial binding to the soluble constructs also affect bacterial binding to cell-expressed receptor constructs, we analysed COS cells transiently transfected with CEACAM1-4L containing Q89A, Q89N and I91A substitutions. Initially, binding of the anti-CEACAM antibodies to transfected CHO cells was examined and their binding was as observed for the soluble receptors (Fig. 7). However, bacterially expressed ligands may overcome decreased affinity for specific mutant proteins due to multiple ligand-receptor engagement at the target cell surface. Examination of adherent bacteria and levels of receptor expression by microscopy revealed that bacteria bound to all transfectants other than sham-transfected cells (Fig. 8). However, receptor expression levels had to be considerably high in the case of I91A for significant bacterial numbers to bind the transfectants, whereas for Q89A and Q89N, bacteria could bind to cells with barely detectable levels of receptors (Fig. 8). Using CHO cells expressing full length CC1 or CC1(Q89N), the ability of the soluble native (CC1-Fc) or CC1(Q89N)-Fc to inhibit bacterial binding was assessed using C751 OpaB bacteria (Fig. 9). Data show the following: (i) binding to CHO cells, as reported previously (Virji et al., 1996b), was observed only with Opa+ and not Opa- bacteria and no binding was seen with untransfected cells (not shown); (ii) despite the low levels of binding of the isolate to the soluble CC1 construct carrying the Q89N mutation in the dot immunoblot analysis (Fig. 3B), bacterial binding to the receptor expressed on the cells was clearly visible (Fig. 9D); (iii) as in the case of THi Rd described above, association was only significantly high with cells expressing high levels of the receptor (assessed in parallel experiments, not shown); (iv) CC1-Fc could compete with bacterial binding to the homologous native structure expressed on CHO cells (Fig. 9B), confirming previous results (Virji et al., 1996b); (v) in comparison, CC1-Fc could more efficiently compete out C751 OpaB binding to the CC1(Q89N) receptor expressed on CHO cells (Fig. 9E), consistent with its higher affinity for the isolate than CC1(Q89N)-Fc (Fig. 3B); (vi) CC1(Q89N)-Fc construct was largely ineffective in competing with either cell-expressed receptor (Fig. 9C and F) ent bacterial numbers are thus also low (cf. Fig. 9D and F: lack of low level binding in F compared with D). The data emphasize the importance of cell presentation of receptor in addition to receptor structure in bacterial interactions. The final outcome of this depends on the balance between the effects exerted by residue substitutions and receptor density. Both can affect functional affinity of bacteria-host interactions.
For the native receptor, expression levels and bacterial binding correlated more frequently than for the mutated molecules. In the case of substitutions at position 89, bacterial binding could be seen at very low levels of receptor expression (arrows in Q89A and Q89N panels). The reverse was the case with I91A, where even at high receptor levels (arrowhead, left panel), bacterial binding (arrowhead, middle panel) was relatively low. In sampling of 100 cells, c. 50% exhibited this phenomenon. Note that the receptor recognition by Kat4c was not affected by mutations in the N-domain as determined by immunoblotting or by immunofluorescence microscopy (Fig. 7).
To assess the relative importance and the degree of influence of each substitution on bacterial interactions, degrees of receptor recognition were assigned several categories as shown in Fig. 10A. The data show that substitution I91A has a profound effect on the interactions of all strains and, whilst the side chain of Ile-91 is best suited to all, Leu can effectively substitute for Ile for several strains, whilst the polar Thr is less well tolerated. However, some Nm Opa proteins can also tolerate I91T substitution. Tyr-34 is also required in most cases. Interestingly, Phe-34 is preferred in general at this position over the native Tyr. Substitutions at Gln-89 have a deleterious effect on binding of some Nm Opa derivatives and the smaller Q89N apparently is better suited overall for binding by Hi and especially Hi-aeg strains. The data emphasize the extensive inter- and intraspecies ligand variations in this receptor targeting, perhaps with greater interstrain differences in Nm. Structural aspects of bacterial ligands that affect the receptor recognition can be considered for strain C751 Opa proteins whose structures are known (Fig. 9B) (Hobbs et al., 1994). The hypervariable loops HV1 and HV2 of Opa proteins have been implicated in the binding of CEACAMs (Virji et al., 1999; Bos et al., 2002; de Jonge et al., 2003). The HV regions of OpaA, B and D of strain C751 provide three distinct combinations (Fig. 10C), which is consistent with the distinct patterns of targeting of the variant receptor molecules observed in the study. However, the proteins must all contain sufficient similarity to bind the hydrophobic region around I91.
The N-terminal domains of the cell-expressed CEA family of molecules are highly homologous and the majority of CEACAMs are targeted by one or more pathogenic neisserial adhesins belonging to the Opa family of proteins (Virji et al., 1996a; Chen et al., 1997; Gray-Owen et al., 1997). Despite the structural variability, all Opa proteins target a common site on the receptors whose centre of binding appears to be Ile-91 on the CFG face of the N-domains.
Ile-91 is conserved throughout the CEA members. Most of the other important residues on CEACAM1 identified by alanine scanning mutagenesis are located in close proximity to Ile-91 (Virji et al., 1999). Precisely how variant Opa proteins can bind to a common receptor site is not entirely clear but a complementary set of sequences of more than one variable domain of Opa proteins may be involved. This variability of the ligands may determine the binding preference for distinct members of the CEA family, which represents one mechanism that may determine tissue tropism (Virji et al., 1999; Bos et al., 2002; Gray-Owen, 2003; Hauck and Meyer, 2003; de Jonge et al., 2003).
Fig. 9. Binding of N. meningitidis isolate C751OpaB to cell-expressed receptors: competition between cell-expressed and soluble receptors. Bacterial binding to cell-expressed CC1 or Q89N construct was detected using anti-Nm antisera and TRITC-conjugated secondary antibodies. The interactions of bacteria with the cell-expressed receptors were investigated in the absence (A, D) or presence of competing soluble receptors. Although binding to the soluble receptor Q89N is much lower than to the native CC1 in dot-blots (Fig. 3B), it is relatively high on certain cells (presumably, on those expressing high levels of the receptor) (D). In competitive experiments, the native CC1-Fc (B, E) and the CC1(Q89N)-Fc (C, F) were preincubated at 10 µg ml⁻¹ with bacteria for 15 min, prior to infection of target cells. CC1-Fc inhibited bacterial binding to the homologous receptor significantly (B), and almost abrogated binding to cell-expressed Q89N (E), whereas the soluble Q89N was inefficient at inhibiting bacterial binding to CHO-CC1 (C). However, a level of homologous inhibition was apparent when examining the adhesion of bacteria to cells expressing low levels of the receptor (e.g. 'peppered' areas shown in D are less evident in F). Panel label: CHO-CC1(Q89N).
Diverse strains of typable and NTHi lineages including the biogroup aegyptius also bind to the CFG face of the N-domain of CEACAM1 (Virji et al., 2000). In our previous analysis of receptor binding, THi isolates were shown to behave similarly: each strain tested had a primary requirement for Ile-91 and, in addition, was affected by Y34A and Q44A substitutions. In the current study also, the THi strains Rd and Eagan demonstrate similar overall binding patterns. However, in all cases, intraspecies differences in response to different amino acid substitutions were apparent. One consistency between all species and strains tested was the dramatic loss of binding following I91A substitution which, as observed previously, appears to be central in an overlapping bacterial binding footprint on CEACAM1 (Virji et al., 1999; . A similar situation occurs on the mouse CEACAM1a (MHVR1a) N-terminal domain, in which Ile-41 appears to engage with the mouse hepatitis virus spike protein. Replacement of Ile-41 by Thr in the MHVR1b allele reduces virus binding significantly (Tan et al., 2002). Cell surface receptor interactions with their ligands often involve hydrophobic contact points which provide the major binding energy. Hydrophobic residues surrounding these contribute to the specificity of binding (Clackson and Wells, 1995; Kwong et al., 1998; Kim et al., 2001). Thus in murine CEACAM1, the protruding Ile-41, which is surrounded by a number of surface exposed charged residues, e.g. Asp-42, Glu-44, Arg-47, Asp-89, Glu-93 and Arg-97, might form such a binding area (Tan et al., 2002). Accordingly, in the current studies also, mutations introduced at position 91 in the human receptor support the requirement for a hydrophobic pocket at this site. In addition, an extended aliphatic chain is preferred, as Ala disrupted binding of all bacteria and the polar Thr reduced binding of all Hi and several Nm strains. Only I91L was more frequently tolerated. This binding pocket is flanked by several polar residues (Fig. 1) whose contribution to bacterial ligand binding is apparent and variable (Virji et al., 1999; . The importance of Ile-91 and the surrounding residues on human CEACAM1 in bacterial binding is also supported by the observation that the mAb YTH71.3 directed against the N-domain also requires several residues in common with bacteria and blocks binding of all CEACAM1-binding bacteria we have investigated (Virji et al., 1999; Hill and Virji, 2003).
Fig. 10. Binding of bacterial ligands to CEACAM1-Fc constructs as determined in the current study. A. Receptor recognition categories are colour coded according to percent binding relative to the native molecule*. B and C. Diagrams showing CEACAM-binding meningococcal ligand structures. As only the Opa protein structure has been analysed in detail so far with respect to CEACAM binding (de Jonge et al., 2003), for clarity and ease of discussion, a 2D Opa protein structure (B) and the relationship of the three variable domains of strain C751 Opa proteins (C) are shown. The SV (semivariable) domains of strain C751 Opa-A, -B and -D are identical, as are the hypervariable structures HV1 of OpaA and OpaB and the HV2 of OpaB and OpaD.
The murine N-domain strand arrangement derived from the crystal structure depicts the CC′ loop as assuming a convoluted conformation. The previous model of human CEACAM1 contained a flat CC′ loop like the Ig-folds on which it was based. This resulted in Val-39 and Gly-41 being located at a distance from the binding focus Ile-91. Both Val-39 and Gly-41 have been implicated in bacterial binding from alanine scanning and homologue scanning mutagenesis (Bos et al., 1999; Virji et al., 1999). The remodelling of CEACAM1 shown here produces the convoluted structure of the CC′ loop, relocating Val-39 close to Ile-91. Further, Tyr-34, whose aromatic ring has been suggested to be required to maintain the convoluted structure of the CC′ loop (Tan et al., 2002), almost always supported bacterial binding when substituted with Phe.
Indeed, Y34F provides a better environment for most bacterial strains tested. On the other hand, Y34A frequently abrogated receptor recognition. However and surprisingly, Y34S is tolerated by a substantial number of bacterial ligands. Whether this is due to the flexible variable loop domains of the bacterial ligands which may produce an induced fit around the receptor needs consideration. Interestingly, Tyr-34 is conserved in the majority of the human CEACAMs with the exception of CEACAM4 which contains His at this site. However, the importance of Tyr-34 in human CEACAM1 maintaining the three dimensional structure requires human receptor crystallographic data. Substitutions Y34F and Q89N also produced interesting data from the point of view of pathogenesis. Whilst Q89A and Q89N appear to affect some Opa proteins by reducing receptor recognition, suggesting its potential contribution in determining tissue tropism (Virji et al., 1999) , Q89N substitution occasionally caused dramatic increase in bacterial adhesion, especially of Hi isolates. Y34F, as observed above, increased binding of strains within all species examined. As it is possible that multiple receptors presented on the target cells may overcome the reduced binding affinity of mutated receptors, we examined strain Rd binding to I91A and Q89A/N-substituted receptors expressed in transiently transfected COS cells. Whilst the latter receptors were targeted on cells with low levels of receptor expression, only a proportion (c. 50%) of cells with very high levels of I91A receptors had significant numbers of bacteria attached. Thus point mutations of CEACAMs can both decrease and increase bacterial load and additional factors that dictate bacterial binding include receptor levels on the target cells. The ligands of Neisseria (i.e. Opa proteins) and of Hi so far identified (i.e. P5 proteins) share similar beta-barrel structure with surface-exposed variable loops. 
The regions of P5 that may engage with the receptor have not been identified, but those of Opa proteins were studied by mutagenesis of strain H44/76 (de Jonge et al., 2003). This strain is related to strain MC58, one of the strains used in the current study. The studies of de Jonge et al. implicated Gx(I/V/L)x(S/E/Q) as the key motif of the HV2 regions (Fig. 9) of meningococcal Opa proteins in receptor targeting. In addition, an ELK motif at position 99 of the Opa HV1 region might be involved in the three-dimensional presentation of the receptor-engaging residues of the bacterial ligand (de Jonge et al., 2003). Within strain C751, the OpaB and OpaD proteins contain the motif GxLxS at positions 172-176 and 167-171, respectively, whereas OpaA contains a PxIxN motif at position 168. In the HV1 region, OpaA and OpaB contain DLK at position 99, whereas OpaD contains EDK. Studies are in progress in our laboratories to assess the precise variant C751 Opa and receptor residue pairs involved in mutual recognition.

Polymorphisms that affect host susceptibility may be found at various sites in the genes encoding host receptors targeted by pathogens and may result in loss or gain of receptor-associated functions. Some SNPs may lead to multiple and diverse downstream effects, e.g. altered transcriptional response and manifestation of disease (Sakuntabhai et al., 2005). Extracellular domain polymorphisms may have a more direct effect via altered binding of pathogen ligands to their receptors. SNPs of the innate immune system, especially those affecting pathogen-associated pattern recognition receptors and cytokines, have been studied extensively. Changes such as Asp299Gly and Thr399Ile in the extracellular domains of LPS-binding Toll-like receptor 4 (TLR4) have been implicated in increased risk of bacterial infections (Schroder and Schumann, 2005). These SNPs have also been associated with severe respiratory syncytial virus (RSV) bronchiolitis in infants.
In this case, altered interaction of the viral fusion (F) protein, implicated as a ligand for TLR4, is regarded as the primary mechanism (Tal et al., 2004; Schroder and Schumann, 2005). However, these TLR4 SNPs could not be correlated with meningococcal disease (Read et al., 2001). In contrast, rare SNPs were found more commonly in the TLR4 genes of patients with meningococcal disease (Smirnova et al., 2003). This supports the notion that rare rather than common variants of TLR4 may be associated with infectious disease susceptibility.

Studies presented here suggest some possible polymorphisms that can increase bacterial load. Several SNPs in CEACAMs have been identified and are listed in the NCBI SNP database, and three have been identified in the N-domain of CEACAM1, including one at Gln-89. In this case, a Gln-89 to His substitution is observed. Such a residue difference occurs among members of the CEACAM family. For example, CEA contains His at position 89 (Fig. 2). This is the only major difference between CC1 and CEA that could affect bacterial binding (Fig. 2). As such, the mutation Q89H in CC1 would be expected to produce a CEA-like binding pattern and could affect tropism of the bacteria as observed for CEA (Virji et al., 1999; de Jonge et al., 2003). Further studies are required to assess whether other CEACAM polymorphisms, for example substitutions such as Y34F and Q89N, occur in human populations, and at what frequencies in susceptible populations. SNP substitutions may also change the colonization profile of the nasopharynx: in our in vitro competition studies, the increased binding of Hi-aeg isolate Ha3 afforded by Q89N allowed it to out-compete Nm isolate C751OpaB for this receptor. With the native receptor, the situation was reversed.
As shown in Figs 8 and 9, additional factors that affect bacterial binding to cell-expressed receptor include receptor density. Whilst certain residue substitutions, e.g. I91A, reduce the functional affinity of bacterial interactions, high receptor densities increase such affinity. The final outcome must depend on the interplay between these two parameters. In recent studies, the role of receptor density in enhancing bacterial attachment and invasion has been evaluated in detail (Bradley et al., 2005; Rowe et al., 2006). It would be interesting to analyse bacterial invasion in cell lines expressing variant CEACAM1 carrying the above mutations, using cell lines in which the receptor expression levels can be controlled. Such cell lines are under development.

Besides receptor polymorphisms, several other scenarios may lead to increased bacterial ligand binding to CEACAMs. Both Nm and Hi CEACAM-binding ligands (Opa and P5) are known to undergo antigenic variation. Thus, in any population, antigenic/structural variants are present and may be selected for during the course of host colonization and subsequent pathogenesis (Virji et al., 1996a; Duim et al., 1997; Meyers et al., 2003). The receptor repertoire and subtypes may select bacteria capable of binding with high affinity. As mentioned above, upregulation of receptor expression on target cells may also increase bacterial binding affinity. In such cases, high-affinity interactions result in cellular invasion, whereas lower affinity or load of bacteria may not proceed beyond surface adhesion (Tran Van Nhieu and Isberg, 1993; Bradley et al., 2005). The expression of CEACAMs on normal epithelia of the respiratory tract has been reported, which would allow bacterial attachment and possible subsequent penetration into these tissues (Tsutsumi et al., 1990; Virji, 2001). Following exposure to cytokines such as IFN-γ or TNF-α, CEACAM expression by colonic carcinoma cells has been shown to increase (Fahlgren et al., 2003).
In addition, certain viral infections have also been shown to upregulate CEACAM1 expression in several epithelial cell lines (Avadhanula et al., 2006). Increased cytokine levels following viral infection could lead to increased CEACAM expression and bacterial association with respiratory epithelia and subsequent invasion of deeper tissue by these organisms. Such a situation may explain the epidemiological association of increased incidence of Nm and Hi infections following certain viral infections (Cartwright et al., 1991; Takala et al., 1993).

In summary, little is known about why certain people are more susceptible to infection by some of the frequent colonizers of the human nasopharynx. Interestingly, opportunistic pathogens such as Nm and Hi as well as M. catarrhalis (not investigated here) target CEACAM1. As specific substitutions such as Y > F and Q > N produce more favourable targets for distinct mucosal isolates, it is possible that the occurrence of such receptor polymorphisms in the human population could lead to greater bacterial binding, thus increasing the chances of cellular invasion. Given the colonization rate of these organisms (generally > 10% of the population) and the frequency of invasive infection (up to 3:100 000 population), a combination of events may be required to increase host susceptibility. Inflammatory conditions that increase receptor density in populations carrying specific polymorphisms could provide the worst-case scenario.

The sequence of the mature human CEACAM1 N-domain was aligned with the corresponding domain of murine CEACAM1, giving a gapless alignment for residues 1-109 with a residue identity of 42%. The crystal structure co-ordinates of mouse CEACAM1 (residues 1-109) were taken from the structure file containing domains 1 and 2 (PDB code 1L6Z) and a homology model of the human CEACAM1 N-domain was built using standard methods.
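The residue identity quoted for the gapless alignment (42% over 109 positions, i.e. roughly 46 identical residues) is a simple position-by-position count. The following minimal sketch illustrates that calculation only; the function name `percent_identity` and the two ten-residue sequences are hypothetical placeholders, not the CEACAM1 sequences or the software used in this study.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions in a gapless, equal-length alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("a gapless alignment requires equal-length sequences")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare residues position by position, ignoring letter case.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical toy sequences (illustration only, not real N-domain sequences).
toy_template = "ACDEFGHIKL"
toy_target = "ACDQFGHLKV"
print(percent_identity(toy_template, toy_target))  # 7 of 10 positions match
```

For real sequences, the two strings would be the aligned residues 1-109 of the human and murine N-domains.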
Final refinement of the model was performed by soaking it with a 5 Å thick layer of water and energy minimizing while constraining the backbone atoms to their original positions in the template structure. The final round of minimization was for 2000 conjugate-gradient steps, constraining the backbone heavy atoms with a force constant of 0.5 kcal/Å. A stereochemical analysis of the structure was performed using PROCHECK and the model was found to be of similar quality to the template crystal structure.

Production of mutants was carried out by site-directed mutagenesis of the pIG construct containing the DNA encoding the CEACAM1 NA1B domains described previously (Watt et al., 1994). The oligonucleotide primers used to create amino acid substitutions at positions 91, 34 and 89 of the N-domain are shown in Table 1. Some primer sequences have been published previously (Watt et al., 1994). For introducing mutations, CEACAM1 was amplified by polymerase chain reaction from the pIG-NA1B construct using either the common forward primer and a reverse primer containing the desired mutation, or a complementary forward primer containing the mutation and a common reverse primer. CEACAM1 with the appropriate mutation was amplified using the common forward and common reverse primers. The gene was then cloned into pIG using the restriction sites HindIII and EcoRI. Chimeric soluble receptor proteins containing the appropriate amino acid substitutions were prepared as previously described by transient transfection of COS cells (Teixeira et al., 1994; Virji et al., 1999). The CEACAM1-Q89A-Fc used in overlay experiments was kindly donated by Dr S. Watt (Virji et al., 1999; Watt et al., 2001).

The strains used in this study have been described previously (Virji et al., 1999; . Nm strain C751 (serogroup A) variants used were C751OpaA, C751OpaB and C751OpaD. The strain MC58 (serogroup B) variant used expressed an Opa previously designated OpaX, which is encoded by the opaB locus.
An Opa⁻ C751 isolate, which has been shown not to bind to CHO-CC1 (Virji, 1999), and RdCC⁻, a derivative of THi Rd known not to bind to CHO-CC1 (M. Virji, unpublished), were used as controls. THi strain Rd is an acapsulate serotype d isolate, Eagan is a serotype b isolate and A950002 is an NTHi strain. Hi-aeg strains Ha3, Ha30 and F2087 are all conjunctival isolates. Nm was grown on brain-heart infusion (BHI) agar supplemented with 10% heated horse blood (HBHI). Hi strains were grown on HBHI agar further supplemented with Levinthal base (10 µg ml⁻¹ each of NAD and haemin). All strains were cultured at 37°C in 5% CO2.

COS-1 cells (African green monkey kidney cells) used for transient transfection were cultured in Dulbecco's modified Eagle's medium (DMEM) containing 2-10% heat-inactivated foetal calf serum (FCS, Gibco™), 2 mM glutamine, 50 µg ml⁻¹ penicillin and 50 µg ml⁻¹ streptomycin in a humidified atmosphere of 5% CO2 at 37°C.

Antibody binding to CEACAM1 constructs. NA1B-Fc proteins were dotted at 0.2 µg ml⁻¹ on to nitrocellulose and non-specific binding sites were blocked using 3% (w/v) BSA in Dulbecco's PBS containing 0.05% Tween-20 (PBST) for 1 h at room temperature. Receptor was detected using the following antibodies: rabbit polyclonal AO115 and rat monoclonal YTH71.3, both directed against the N-domain, and mouse monoclonal Kat4c, which recognizes the A and B domains (Jones et al., 1995). Bound antibody was subsequently detected using an appropriate secondary antibody conjugated to alkaline phosphatase and developed using nitroblue tetrazolium and 5-bromo-4-chloro-3-indolylphosphate.

Bacterial interactions with receptor constructs. Bacterial lysates (c. 2 × 10⁷ bacteria) were applied to nitrocellulose strips, air-dried, and non-specific binding sites were blocked using 3% BSA in PBST for 1 h at room temperature. Strips were overlaid with either native or mutated soluble NA1B-Fc diluted in 1% BSA in PBST at the required concentrations for 1 h at room temperature.
In most experiments, an excess (1-3 µg ml⁻¹) of the receptor was used. In competition studies, a range of concentrations (0.008-0.5 µg ml⁻¹) was employed. Following washing to remove unbound NA1B-Fc, receptor binding was detected using anti-human-Fc alkaline phosphatase conjugate and substrates as described above. For quantification, densitometric analyses of the developed immunoblots were carried out using NIH Scion Image software. In most cases, multiple estimations were carried out, and the means and SE of each determination have been reported.

Site-directed mutagenesis of the CEACAM1-4L receptor gene was performed using the QuikChange® Site-Directed Mutagenesis Kit (Stratagene, La Jolla, CA, USA) according to the manufacturer's instructions. Primers (Table 1) were used to introduce the desired mutations into the pRc/CMV-CEACAM1-4L construct (kindly provided by Professor Wolfgang Zimmermann). Following sequencing to ensure that the desired substitution had been obtained, the pRc/CMV-CEACAM1-4L construct was transiently transfected into COS-1 cells for functional analysis using the DEAE dextran method described previously (Teixeira et al., 1994; Virji et al., 1996a).
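The quantification described above (densitometric readings expressed as percent binding relative to the native receptor, then summarized as a mean and SE) can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the readings are invented arbitrary units, the helper names `percent_of_native` and `mean_and_se` are hypothetical, and pairing each mutant reading with the native reading from the same blot is an assumption rather than the documented procedure.

```python
from math import sqrt
from statistics import mean, stdev


def percent_of_native(mutant_density: float, native_density: float) -> float:
    """Express a mutant-receptor signal as a percentage of the native signal."""
    if native_density <= 0:
        raise ValueError("native density must be positive")
    return 100.0 * mutant_density / native_density


def mean_and_se(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two estimations are needed for an SE")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical readings (arbitrary units) from three replicate blots.
native_readings = [200.0, 210.0, 190.0]
mutant_readings = [50.0, 63.0, 38.0]

# Assumption: each mutant reading is normalized to the native reading on the same blot.
relative = [percent_of_native(m, n) for m, n in zip(mutant_readings, native_readings)]
average, se = mean_and_se(relative)
print(f"{average:.1f} +/- {se:.1f} % of native binding")
```

Normalizing within each blot before averaging is one way to reduce blot-to-blot variation in development time; averaging raw densities first would be an equally plausible reading of the text.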
Respiratory viruses augment the adhesion of bacterial pathogens to respiratory epithelium in a viral species- and cell type-dependent manner
Homologue scanning mutagenesis reveals CD66 receptor residues required for neisserial Opa protein binding
Carcinoembryonic antigen family receptor recognition by gonococcal Opa proteins requires distinct combinations of hypervariable Opa protein domains
Critical determinants of the interactions of capsule-expressing Neisseria meningitidis with host cells: the role of receptor density in increased cellular targeting via the outer membrane Opa proteins
Influenza A and meningococcal disease
CGM1a antigen of neutrophils, a receptor of gonococcal opacity proteins
Several carcinoembryonic antigens (CD66) serve as receptors for gonococcal opacity proteins
Update on meningococcal disease with emphasis on pathogenesis and clinical management
Molecular variation in the major outer membrane protein P5 gene of nonencapsulated Haemophilus influenzae during chronic infections
Interferon-gamma tempers the expression of carcinoembryonic antigen family molecules in human colon cells: a possible role in innate mucosal defence
Nontypeable Haemophilus influenzae: pathogenesis and prevention
Neisserial Opa proteins: impact on colonization, dissemination and immunity
CD66 carcinoembryonic antigens mediate interactions between Opa-expressing Neisseria gonorrhoeae and human polymorphonuclear phagocytes
The carcinoembryonic antigen (CEA) family: structures, suggested functions and expression in normal and malignant tissues
'Small' talk: Opa proteins as mediators of Neisseria-host-cell communication
A novel cell-binding mechanism of Moraxella catarrhalis ubiquitous surface protein UspA: specific targeting of the N-domain of carcinoembryonic antigen-related cell adhesion molecules by UspA1
The variable P5 proteins of typeable and non-typeable Haemophilus influenzae target human CEACAM1
Microevolution within a clonal population of pathogenic bacteria: recombination, gene duplication and horizontal genetic exchange in the opa gene family of Neisseria meningitidis
Leukocyte Typing
Recognition of sialylated meningococcal lipopolysaccharide by siglecs expressed on myeloid cells leads to enhanced bacterial uptake
Mapping the binding domains on meningococcal Opa proteins for CEACAM1 and CEA receptors
Molecular dissection of the CD2-CD58 counter-receptor interface identifies CD2 Tyr86 and CD58 Lys34 residues as the functional 'hot spot'
Structure of an HIV gp120 envelope glycoprotein in complex with the CD4 receptor and a neutralizing human antibody
Binding of Escherichia coli and Salmonella strains to members of the carcinoembryonic antigen family: differential binding inhibition by aromatic alpha-glycosides of mannose
Epidemiology, hypermutation, within-host evolution and the virulence of Neisseria meningitidis
Regulation of insulin action by CEACAM1
CEA adhesion molecules: multifunctional proteins with signal-regulatory properties
CD66a (BGP), an adhesion molecule of the carcinoembryonic antigen family, is expressed in epithelium, endothelium, and myeloid cells in a wide range of normal human tissues
A functional polymorphism of toll-like receptor 4 is not associated with likelihood or severity of meningococcal disease
Co-ordinate action of bacterial adhesins and human carcinoembryonic antigen receptors in enhanced cellular invasion by capsulate serum resistant Neisseria meningitidis
A variant in the CD209 promoter is associated with severity of dengue disease
Single nucleotide polymorphisms of Toll-like receptors and susceptibility to infectious disease
Assay of locus-specific genetic load implicates rare Toll-like receptor 4 mutations in meningococcal susceptibility
Preceding respiratory infection predisposing for primary and secondary invasive Haemophilus influenzae type b disease
Association between common Toll-like receptor 4 mutations and severe respiratory syncytial virus disease
Crystal structure of murine sCEACAM1a[1,4]: a coronavirus receptor in the CEA family
The N-domain of the biliary glycoprotein (BGP) adhesion molecule mediates homotypic binding: domain interactions and epitope analysis of BGPc
Bacterial internalization mediated by beta 1 chain integrins is determined by ligand affinity and receptor density
Immunohistochemical demonstration of nonspecific cross-reacting antigen in normal and neoplastic human tissues using a monoclonal antibody. Comparison with carcinoembryonic antigen localization
The pathogenicity of Haemophilus influenzae
Crystal structure of Neisserial surface protein A (NspA), a conserved outer membrane protein with vaccine potential
CEA and innate immunity
Interactions of Haemophilus influenzae with cultured human endothelial cells
Opc- and pilus-dependent interactions of meningococci with human endothelial cells: molecular mechanisms and modulation by surface polysaccharides
The N-domain of the human CD66a adhesion molecule is a target for Opa proteins of Neisseria meningitidis and Neisseria gonorrhoeae
Carcinoembryonic antigens (CD66) on epithelial cells and neutrophils are receptors for Opa proteins of pathogenic neisseriae
Critical determinants of host receptor targeting by Neisseria meningitidis and Neisseria gonorrhoeae: identification of Opa adhesiotopes on the N-domain of CD66 molecules
Carcinoembryonic antigens are targeted by diverse strains of typable and non-typable Haemophilus influenzae
Angiogenic properties of the carcinoembryonic antigen-related cell adhesion molecule 1
CD66 identifies the biliary glycoprotein (BGP) adhesion molecule: cloning, expression, and adhesion functions of the BGPc splice variant
Homophilic adhesion of human CEACAM1 involves N-terminal domain interactions: structural analysis of the binding site
Secondary structure and molecular analysis of interstrain variability in the P5 outer-membrane protein of non-typable Haemophilus influenzae isolated from diverse anatomical sites

The studies were supported
by grants from the MRC, the Meningitis Research Foundation, the Wellcome Trust, the Basque Country Government PhD studentship (S.V.) and an MRC priority-area studentship (J.R.). The studies were carried out in the Spencer Dayman Meningitis Research Laboratories. We are grateful to Miss Natalie Griffiths for technical assistance.

Three days after transfection with either native or mutated pRc/CMV-CEACAM1-4L constructs, monolayers of transiently transfected COS cells were incubated with THi strain Rd or C751OpaB at a ratio of c. 200 bacteria per cell in medium 199 containing 2% FCS for 1 h at 37°C. Non-adherent bacteria were removed by washing (Virji et al., 1991). For immunofluorescence detection, cells were fixed in absolute methanol for 10 min, washed and blocked with 1% BSA-PBST for 1 h. The attached bacteria were detected using antibacterial antiserum and tetramethyl rhodamine isothiocyanate (TRITC)-conjugated secondary antibodies. Expressed CEACAM1-4L was detected using anti-CEACAM mAb Kat4c and fluorescein-conjugated secondary antibody.