key: cord-1004674-0lykpinp authors: Ma, Wenfu; Goldberg, Jonathan title: Rules for the recognition of dilysine retrieval motifs by coatomer date: 2013-03-12 journal: The EMBO Journal DOI: 10.1038/emboj.2013.41 sha: 8ef9877b632176cd5837575f0ea5d09d9649b800 doc_id: 1004674 cord_uid: 0lykpinp

Cytoplasmic dilysine motifs on transmembrane proteins are captured by coatomer α-COP and β′-COP subunits and packaged into COPI-coated vesicles for Golgi-to-ER retrieval. Numerous ER/Golgi proteins contain K(x)Kxx motifs, but the rules for their recognition are unclear. We present crystal structures of α-COP and β′-COP bound to a series of naturally occurring retrieval motifs, encompassing KKxx, KxKxx and non-canonical RKxx and viral KxHxx sequences. Binding experiments show that α-COP and β′-COP have generally the same specificity for KKxx and KxKxx, but only β′-COP recognizes the RKxx signal. Dilysine motif recognition involves lysine side-chain interactions with two acidic patches. Surprisingly, however, KKxx and KxKxx motifs bind differently, with their lysine residues transposed at the binding patches. We derive rules for retrieval motif recognition from key structural features: the reversed binding modes, the recognition of the C-terminal carboxylate group which enforces lysine positional context, and the tolerance of the acidic patches for non-lysine residues.

Many resident proteins of the endoplasmic reticulum (ER) are localized to that compartment by continuous retrieval from the Golgi complex. Luminal ER proteins may be retrieved by means of a C-terminal KDEL sequence that binds to the KDEL receptor to trigger retrograde transport (Lewis and Pelham, 1992). Type I transmembrane proteins, including ER residents and itinerant proteins of the ER/Golgi system, contain a C-terminal dilysine motif that signals their retrieval from post-ER membranes (Jackson et al, 1993; Gaynor et al, 1994; Townsley and Pelham, 1994).
The dilysine motif, initially identified on the adenoviral E3 19 kDa (E19) protein, consists of a pair of lysine residues at the −3 and −4 (or −5) positions relative to the cytoplasmic C terminus of the cargo molecule; these are termed KKxx and KxKxx motifs (Nilsson et al, 1989; Jackson et al, 1990). Variants of the dilysine motif have been characterized, in particular a KxHxx retrieval motif in the tail of the spike protein of group I coronaviruses (Lontok et al, 2004), and the RKxx motif present in the Golgi-localized Scyl1 protein (Burman et al, 2008). The dilysine motif is recognized by coatomer subunits, and dilysine-tagged transmembrane cargo is thereby packaged into COPI (coatomer)-coated vesicles for retrograde transport to the ER. The biochemical basis for recognition of dilysine motifs by coatomer has been controversial, but one path of research has led in recent years to a clear delineation of the dilysine motif-binding sites. These were mapped initially to the αβ′ε-COP subcomplex of coatomer (Lowe and Kreis, 1995; Fiedler et al, 1996; Schröder-Köhne et al, 1998). Subsequent yeast genetic studies implicated the N-terminal β-propeller domains of α-COP and β′-COP, the sequence-related large subunits of αβ′ε-COP (Eugster et al, 2004). A recent structural analysis of the β′-COP N-terminal domain crystallized with a KxKxx peptide pinpointed the binding site near to the 'pore' of the β-propeller protein, and a corresponding binding site was mapped via functional tests to the pore region of α-COP (Jackson et al, 2012). However, the mode of KxKxx motif recognition by coatomer was not resolved in this study because the dilysine-binding site on β′-COP was partially occluded by a crystal contact (Jackson et al, 2012).
The location of dilysine-binding sites on the cage-forming αβ′ε-COP subcomplex (and not the adaptin-like βδ/γζ-COP subcomplex) of coatomer is interesting in view of the finding that dilysine-tagged cargo molecules are required for efficient COPI-coat assembly and vesicle formation (Bremser et al, 1999; Aguilera-Romero et al, 2008). Thus, the dilysine motif binds to β′-COP at a site close to the vertex of the triskelion formed by three αβ′ε-COP assembly units (Lee and Goldberg, 2010), suggesting how COPI coat nucleation may be coupled to the packaging of dilysine-tagged cargo molecules, in particular oligomeric species such as p24-family proteins and the Mst27p/Mst28p complex (Bremser et al, 1999; Sandmann et al, 2003; Aguilera-Romero et al, 2008). Collectively, these findings highlight the importance of establishing the binding modes and specificity of dilysine-type motifs for coatomer subunits. Here, we describe 11 high-resolution crystal structures that show how α-COP and β′-COP bind to a comprehensive series of motifs, including the canonical KKxx and KxKxx classes as well as the variant KxHxx and RKxx signals. We show that the recognition modes of KKxx and KxKxx motifs are distinct, a non-intuitive observation that helps to rationalize the binding mode of the RKxx motif. We demonstrate that α-COP and β′-COP subunits have comparable specificity towards KKxx and KxKxx motifs, contrary to previous reports. And we show that coatomer subunit specificity diverges instead at the level of K(x)Kxx sequence context and in the recognition of the RKxx motif, which binds exclusively to β′-COP.

The results of yeast genetic experiments led to the proposal that α-COP and β′-COP are selective towards KKxx and KxKxx motifs, respectively (Eugster et al, 2004). The reporter molecule used in these studies was fused to the C terminus of yeast Wbp1 (a component of the ER N-oligosaccharyltransferase complex), terminating in the sequence TFKKTN.
A recent analysis demonstrated that this sequence binds to β′-COP with 10-fold lower affinity than do KxKxx peptides, and that while β′-COP has some capacity to transport KKxx-tagged cargo, it has a distinct binding preference for KxKxx motifs (Jackson et al, 2012). In preliminary binding experiments, we noted, however, that α-COP and β′-COP displayed highly similar selectivity towards a panel of KKxx and KxKxx sequences (Figure 1). Figure 1A shows the results of an initial test for binding of dilysine motifs to the complete αβ′ε-COP subcomplex of coatomer (see Materials and methods). The dilysine motifs were fused to the C terminus of glutathione S-transferase (GST), then immobilized on glutathione sepharose beads and probed for binding to αβ′ε-COP. Purified recombinant Bos taurus αβ′ε-COP was added to a mix of 'background' E. coli proteins, and the specificity of binding was assessed by the enrichment of COPs from the input mixture (Figure 1A). αβ′ε-COP binds to the KKxx motifs of the mammalian proteins p25 and Ergic-53 and the adenoviral E19 protein, and to a KKxx motif nested in a poly-serine sequence (labelled KKSS in Figure 1; the polyS-KKSS sequence confers ER residence to a CD8 reporter molecule in HeLa cells; Jackson et al, 1990). No specific binding was observed to the p24-family protein p24β, which does not contain a dilysine motif at its C terminus. For control sequences, we added two C-terminal serine residues, KKxxSS (labelled p25+SS in Figure 1), since this addition was shown to abolish the ER retention of a CD8 reporter molecule (Jackson et al, 1990). Next, we tested the binding of S. cerevisiae α-COP and β′-COP individually to the dilysine motifs.
Utilizing the sequence homology of α-COP and β′-COP, we designed two protein complexes of the form shown in Figure 1H, namely the propeller-propeller-solenoid array of one COP (containing the dilysine-binding site in the N-terminal β-propeller domain) plus a short interacting α-solenoid region of the other COP, which aids the stability of the dimer (Lee and Goldberg, 2010; Figures 1C and D). These data reveal that α-COP and β′-COP bind with comparable specificity to a panel of KKxx and KxKxx motifs, and we conclude that β′-COP does not disfavour KKxx motifs in general. Instead, we speculate that β′-COP may bind weakly to yeast Wbp1 owing to the particular sequence context of its dilysine motif, an issue we return to below.

Crystal structures of α-COP and β′-COP bound to KxKxx retrieval motifs

Efforts to crystallize α-COP and β′-COP (specifically their N-terminal β-propeller domains) bound to a comprehensive set of naturally occurring retrieval motifs met with two main challenges. First, the problem of poor solubility of yeast and mammalian α-COP proteins that hampered previous studies (Lee and Goldberg, 2010) was solved by using the Schizosaccharomyces pombe N-terminal β-propeller domain, engineered to improve its solubility by mutating five hydrophobic residues to lysine in a flexible loop: L181K, L185K, I192K, L196K and F197K. Second, a fraction of the crystals we obtained had intimate crystal contacts in the vicinity of the dilysine-binding site that occlude all or part of the site. Six crystal structures were discarded for this reason, yielding a final set of eleven structures: eight S. cerevisiae β′-COP/peptide complexes, two S. pombe α-COP/peptide complexes and one structure of S. pombe α-COP with a vacant binding site. The crystal structures were refined to resolutions ranging from 1.45 to 1.9 Å (see Table I for crystallographic data and a list of the dilysine sequences used).
Figures 1F and G show the closely related folds of the N-terminal β-propeller domains of α-COP and β′-COP (the S. pombe α-COP and S. cerevisiae β′-COP sequences share 30% sequence identity in this ~300 residue domain). Both coatomer subunits bind to dilysine motifs via a highly conserved surface region of the β-propeller domain, near to the central 'pore' of the propeller, as mapped previously by Jackson et al (2012). To facilitate a direct structural comparison of the dilysine-binding sites of α-COP and β′-COP, we solved crystal structures of both proteins bound to the KxKxx motif of yeast Emp47p, with the sequence IKTKLL (Figures 2B and D, and see Table I). These are the first views of dilysine motifs bound to coatomer unobstructed by crystal contacts, and they define the three 'primary' points of interaction between the retrieval motif and coatomer subunit: the C-terminal carboxylate group of the KxKxx peptide binds to a basic patch involving residues Arg15, Lys17 and Arg59 (S. cerevisiae β′-COP numbering), and the two lysine side chains interact with two acidic patches. These primary interactions can be seen more clearly in Figures 3B and D, which show the crystal structure of β′-COP bound to the KxKxx motif of human Wbp1 (hWbp1 in Table I; also called OST48). We refer to the acidic patches as patch 1, involving β′-COP residue Asp206 that binds to the −3 lysine side chain, and acidic patch 2, involving residues Asp98 and Asp117 binding to lysine −5 of the peptide (labelled accordingly in Figure 3B). In the crystal structure of S. cerevisiae β′-COP reported by Jackson et al (2012; PDB identifier 2YNN), a KxKxx peptide with sequence CTFKTKTN interacts with the basic patch and with patch 1 via the −3 lysine, but patch 2 is occluded by a crystal contact and the −5 lysine side chain of the motif projects away from the binding site as a result.
In addition to the three primary contacts, important secondary contacts involve the backbone carbonyl oxygen atoms of the −4, −3 and −2 residues, which interact with charged side chains Arg59 and Arg101 (β′-COP numbering) at the base of the dilysine-binding site (see, e.g., KEKSD of hWbp1 in Figure 3D). The primary and secondary interactions made by the KxKxx motif, as defined above, are common to the α-COP/Emp47p, β′-COP/Emp47p and β′-COP/hWbp1 complexes and, as this implies, the conformations of the bound peptides are highly similar (compare, e.g., Figures 2B and D). This is despite the significant difference in the character of the bound peptides, namely the highly charged hWbp1 sequence EKEKSD versus the more hydrophobic Emp47p sequence IKTKLL. The constancy of the KxKxx motif conformation and interactions among the four high-resolution crystallographic observations (the β′-COP/hWbp1 crystals contain two distinct complexes in the asymmetric unit) implies that this is the bona fide binding mode.

Figure 1H legend (fragment): ... bound to the αβ′-COP fragment of the COPI assembly unit. This is a composite model comprising the crystal structure from the present study, as in (G), together with the αβ′-COP fragment from a previous study (Lee and Goldberg, 2010).

Table I. Data collection and refinement statistics (table body not recovered). Footnotes: highest resolution shell is shown in parentheses; r.m.s.d., root-mean-squared deviation; Rfree was calculated with 5% of the data.

Equivalence of the dilysine-binding sites of α-COP and β′-COP

In hindsight, the close correspondence of KxKxx binding to α-COP and β′-COP is not surprising in view of the very high degree of conservation of the binding-site residues (Figures 2 and 3).
Only two residues differ, and these are conservative alterations: Tyr33 of β′-COP changes to His31 in α-COP and Phe142 of β′-COP changes to Tyr139 in α-COP. The close structural equivalence of the binding sites accords with the results of pull-down experiments showing the highly similar specificity of α-COP and β′-COP towards the dilysine motifs (Figure 1).

Figure 2 legend (fragment): All the α-COP residues that contact the dilysine motif are drawn in the picture. The right panel shows the KKMP peptide together with a difference electron density map calculated following a simulated annealing protocol (SA-omit map; Brünger, 1998), calculated at 1.9 Å resolution and contoured at 3.0σ (Table I). Electron density has been truncated for N-terminal residues outside the binding site. (B) View of the KTKLL motif of Emp47p bound to α-COP. In the right panel, the SA-omit map is calculated at 1.9 Å resolution and contoured at 3.0σ. (C) Structure of the KKLV motif of the p24-family protein p25 (from human) bound to β′-COP. The SA-omit map is calculated at 1.45 Å resolution and contoured at 3.0σ.

To extend the results of the pull-down experiments, we measured more precisely the affinities of dilysine motifs using isothermal calorimetry (ITC; Figures 2E-H, and see Materials and methods). All the functional dilysine motifs we tested caused a significant heat change upon mixing with α-COP and β′-COP (see Table I for a list of the sequences of peptides used). A control peptide KKxxSS caused no heat change upon mixing with β′-COP, confirming that the assay measures the protein/peptide interaction (black curve in Figure 2F, lower panel). Calorimetry data on the KxKxx motifs (Figures 2G and H) reveal that the Emp47p sequence IKTKLL binds with very similar affinity to α-COP and β′-COP.
The hWbp1 sequence EKEKSD binds somewhat more tightly to both proteins, but again with almost equal affinity to α-COP and β′-COP. Collectively, the results of the ITC and crystallography experiments on KxKxx motifs define a common and specific mode of binding to the α-COP and β′-COP subunits of coatomer.

Intuitively, one might expect the KKxx and KxKxx motifs to bind in a similar fashion, with the intermediary residue of KxKxx looped away from the binding site relative to KKxx. But this is not the case, as our crystallographic analysis shows that the two motifs adopt different binding modes. Figures 2A and C show the structure of α-COP bound to the KKMP sequence of adenoviral E19 protein and of β′-COP bound to the KKLV sequence of p25, a p24-family protein (see Table I for full peptide sequences). The KKxx motifs adopt a helical conformation at the binding sites. The comparison with the KxKxx binding mode shows some shared features: the terminal carboxylate group is bound in the same manner to the basic patch, and the backbone carbonyl oxygen atoms of the −3 and −2 (though not the −4) residues of the KKxx motif interact with the side chains of Arg59 and Arg101 in a similar fashion to KxKxx (see Figure 3 for a side-by-side comparison of motifs). The lysine side chains of the KKxx motifs are recognized by the acidic patches 1 and 2 but, surprisingly, they bind in a reversed fashion relative to KxKxx. In Figures 3A and B, we emphasize the alternate binding modes with the clockwise green arrow indicating the helical conformation of the −4 and −3 residues of the KKxx motif and the anticlockwise arrow denoting the direction of the −5, −4 and −3 residues of the KxKxx motif. These crystallographic results indicate that the KKxx and KxKxx motifs adopt distinctive low energy binding modes to accommodate their lysine residues in the acidic patches. How certain can we be that this represents the bona fide binding mode for KKxx motifs?
First, a comparison of the peptide conformations in the β′-COP/p25, α-COP/E19 (Figures 2 and 3) and β′-COP/yWbp1 (Supplementary Figure S1) complexes shows highly similar conformations and sets of contacts at the binding sites. This involves a total of five distinct crystallographic observations of KKxx peptides, as both the α-COP/E19 and β′-COP/yWbp1 crystals contain two complexes in the asymmetric unit (Table I). Second, the peptide conformations were defined by difference Fourier analysis of 1.45 Å (p25), 1.9 Å (E19) and 1.5 Å (yWbp1) resolution X-ray data sets. Third, the KKxx binding mode is common to the α-COP and β′-COP binding sites. Fourth, the binding mode is independent of the sequence context, unaffected by a preponderance of hydrophobic (p25; KKLV), hydrophilic (yWbp1; KKTN) or terminal proline (E19; KKMP) residues. On this basis, we conclude that the two dominant classes of dilysine motifs, KKxx and KxKxx, bind to a common site on α-COP and β′-COP proteins, as shown previously (Eugster et al, 2004; Jackson et al, 2012), but adopt distinct modes of interaction.

The measurements by calorimetry of KKxx motif interactions are shown in Figures 2E and F. As expected, both α-COP and β′-COP bind to the p25 and adenoviral E19 sequences, in accord with the results of the pull-down experiments, but there is somewhat more variation in the specificity of the two COPs (compared with their highly similar affinities towards KxKxx motifs). The proline-containing E19 motif, KKMP, binds with comparable affinity to α-COP (Kd = 64.5 µM) and β′-COP (Kd = 49.0 µM), while the more conventional KKLV sequence of p25 binds somewhat more tightly to α-COP (Kd = 11.3 µM) than to β′-COP (Kd = 27.7 µM). The results reinforce the view that β′-COP does not disfavour KKxx motifs in general.
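The dissociation constants quoted above can be restated as standard binding free energies using ΔG° = RT ln(Kd / 1 M). The following is a minimal sketch rather than part of the authors' analysis: the function name `binding_free_energy` is hypothetical, the Kd values are read as micromolar, and the temperature of 298.15 K is an assumption because the ITC temperature is not given in this section.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_micromolar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = R * T * ln(Kd / 1 M). The default temperature is an
    assumption; it is not taken from the paper.
    """
    kd_molar = kd_micromolar * 1e-6
    return R_KCAL * temperature_k * math.log(kd_molar)


# Kd values (micromolar) quoted in the text for the KKxx motifs.
kd_values = {
    ("alpha-COP", "KKMP"): 64.5,
    ("beta'-COP", "KKMP"): 49.0,
    ("alpha-COP", "KKLV"): 11.3,
    ("beta'-COP", "KKLV"): 27.7,
}

for (subunit, motif), kd in kd_values.items():
    dg = binding_free_energy(kd)
    print(f"{subunit:10s} {motif}: Kd = {kd:5.1f} uM, dG = {dg:6.2f} kcal/mol")
```

On this scale a tighter complex has a more negative ΔG°, so the roughly 2.5-fold difference between α-COP and β′-COP for KKLV corresponds to only a fraction of a kcal/mol, in keeping with the description of the two subunits as broadly comparable for KKxx motifs.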
The outlier in this analysis is the dilysine motif of yeast Wbp1 (yWbp1; KKTN), which binds with significantly lower affinity to β′-COP (Kd = 170.9 µM) than to α-COP (Kd = 36.6 µM). These measurements are consistent with a previous study showing a 10-fold lower affinity of the KKTN sequence for β′-COP compared to KxKxx motifs (Jackson et al, 2012). And they offer a possible explanation for the results of trafficking experiments that revealed a selective retrieval defect in coatomer mutants lacking a functional α-COP dilysine-binding site that presumably are reliant on the β′-COP subunit for dilysine retrieval. Specifically, the α-COP mutants were unable to retain a KKTN-tagged reporter molecule in the ER, but were able to maintain the localization of Emp47p, dependent on its KTKLL motif (Schröder-Köhne et al, 1998; Eugster et al, 2004).

Figure 3 legend (fragment): ... shows the KKLV sequence of human p25 bound to β′-COP (refined at 1.45 Å resolution). The molecular surface of β′-COP is coloured according to residue properties: acidic residues coloured red, basic side chains blue and hydrophobic side chains yellow. Labels indicate acidic patch 1 (red circle, Asp206) and acidic patch 2 (Asp98 and Asp117). The binding mode of the KKxx motif is indicated by the clockwise arrow coloured green. The KKLV peptide is coloured magenta, except for the side-chain nitrogen atoms of the lysine residues (blue) and the terminal carboxylate oxygen atoms (red); for complete atom colouring of the peptide see (C).

In the final stage of our analysis of KKxx motifs, we sought a molecular explanation for this, the difference in specificity of α-COP and β′-COP towards the yWbp1 KKTN signal.
Presumably, an explanation will be found in the interplay between the terminal residues of the motif KKTN (but not the upstream residues, TFKKTN, since these lie outside the binding site) and specific residue differences between the α-COP and β′-COP binding sites. As described previously, the binding sites differ at only two positions, and only one of the two is in the vicinity of the terminal KKTN side chains: namely Tyr33 (β′-COP) and its equivalent His31 (α-COP). Supplementary Figure S1A shows the structure of the β′-COP/yWbp1 complex (the millimolar concentrations of peptide and β′-COP protein used for crystallization are more than sufficient to saturate the binding site; Kd = 170.9 µM). In the picture, we have overlapped the His31 side chain of α-COP with Tyr33 of β′-COP. The juxtaposition of the Tyr33/His31 side chains and the threonine residue of the KKTN motif leads us to propose that an unfavourably close approach of the Tyr33 hydroxyl group impinges on the β-branched threonine residue (and hinders a favoured side-chain rotamer) and is responsible for lowering the affinity of the β′-COP/KKTN complex. This invokes a rather subtle interplay, such that the shorter His31 side chain of α-COP is more compatible with a β-branched residue at the −2 position, all the more so because this position is conserved in α-COP as either a histidine or phenylalanine residue across species (and is a Phe in S. cerevisiae α-COP). Nevertheless, a peptide in which the threonine residue is replaced with serine, TFKKTN to TFKKSN, was tested for its affinity to α-COP and β′-COP by ITC (Supplementary Figure S1B). Strikingly, the KKSN peptide binds more tightly than KKTN to β′-COP by a factor of 1.8-fold, and a comparison of the (ratio of) affinities of α-COP and β′-COP for KKxx motifs shows that the KKTN motif of yWbp1 favours α-COP by a factor of 4.7, the KKSN motif favours α-COP by a factor of 2.2, and the KKLV motif of p25 favours α-COP by a factor of 2.5.
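The preference factors quoted above follow directly from the ratios of the measured dissociation constants. The sketch below reproduces that arithmetic for the three KKxx motifs whose individual Kd values are stated in the text (read as micromolar). The helper names are hypothetical, and the 298.15 K used for the free-energy difference is an assumption. The KKSN peptide is left out because its individual Kd values are not given in this section.

```python
import math

# Kd values in micromolar as quoted in the text: (alpha-COP, beta'-COP).
KD_UM = {
    "KKTN (yWbp1)": (36.6, 170.9),
    "KKLV (p25)": (11.3, 27.7),
    "KKMP (E19)": (64.5, 49.0),
}


def alpha_preference(kd_alpha, kd_beta_prime):
    """Fold preference for alpha-COP: Kd(beta'-COP) / Kd(alpha-COP)."""
    return kd_beta_prime / kd_alpha


def ddg_kcal(fold, temperature_k=298.15):
    """Free-energy difference (kcal/mol) for a fold change in Kd.

    The temperature is an assumed value, not one stated in the paper.
    """
    return 1.987204e-3 * temperature_k * math.log(fold)


for motif, (kd_a, kd_b) in KD_UM.items():
    fold = alpha_preference(kd_a, kd_b)
    print(f"{motif:13s} alpha-COP preference = {fold:4.1f}-fold "
          f"(ddG = {ddg_kcal(fold):+.2f} kcal/mol)")
```

A ratio above 1 indicates tighter binding to α-COP. The E19 KKMP motif gives a ratio below 1, that is, a slight preference for β′-COP, which is why the text describes its affinities for the two subunits as comparable.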
The results confirm the interplay between the Tyr33/His31 residue at the binding site and the −2 position of the dilysine motif, and they reveal that threonine is disfavoured by β′-COP at the −2 position, hence the reliance of the yWbp1 retrieval motif on a functional α-COP subunit for trafficking in vivo (Eugster et al, 2004). In our subsequent analysis of the variant KxHxx motif, we found that β′-COP binds with far lower affinity than α-COP to the KVHVQ sequence (Figure 4C; peptide PEDVspike in Table I), and the crystal structure of the β′-COP/KVHVQ complex (Figure 4B) shows the close approach of the −2 valine residue to Tyr33. Collectively, these observations are consistent with the proposal that β′-COP disfavours dilysine motifs that contain a β-branched residue (I, T, or V) at the −2 position.

The KxHxx motif was identified as a functional retrieval signal present on the cytoplasmic tail of the spike protein of group 1 coronaviruses and of the SARS coronavirus (Lontok et al, 2004). We determined the crystal structure of β′-COP bound to the KVHVQ sequence of the spike protein from porcine epidemic diarrhoea virus, abbreviated as PEDVspike in Table I (Shirato et al, 2011). The PEDVspike motif binds very weakly to β′-COP, as described above, but crystallization drives the complex together (Figures 4B and C). The KxHxx sequence adopts a highly similar binding mode to the KxKxx motif. A slight positional shift of the entire histidine residue enables its side-chain ε2 nitrogen atom to bond to Asp206 of β′-COP at acidic patch 1. To assess whether the shift is a conserved feature of the KxHxx binding mode, we solved the structures of additional β′-COP/KxHxx complexes. For this purpose, we identified conserved KxHxx sequences in the cytoplasmic C termini of the Insig-1 (KPHSD) and Insig-2 (KSHQE) proteins.
The Insig proteins are ER localized and facilitate the retention of SCAP/sterol/SREBP complexes in the ER (Yang et al, 2002), but a role for the Insig C terminus in active retrieval of the protein from post-ER membranes has not been considered; we leave this as an open question. ITC measurements showed that the KxHxx motifs of the Insig proteins bind to α-COP and β′-COP (Figure 4C). Structures of β′-COP bound to the Insig-1 and Insig-2 sequences reveal a highly similar binding mode to the KxHxx motif of PEDVspike (Figure 4; Table I). Importantly, the position of the histidine residue is conserved in the three complexes. Figure 4A shows an overlap of the β′-COP/Insig-1 complex with the KxKxx sequence of Emp47p, and highlights the subtle shift in the position of the histidine residue relative to the −3 lysine of Emp47p (note, e.g., the displacement of the Cα position) that enables the histidine side-chain ε2 nitrogen atom to occupy a very similar position to the lysine side-chain nitrogen atom, both then bonding to Asp206 of acidic patch 1. These results indicate that acidic patch 1 of the binding site can accommodate histidine residues, and as a consequence the KxHxx motif mimics the KxKxx binding mode in interactions with coatomer.

Figure 4 legend (fragment): ... Table I, refined at 1.55 Å resolution). The molecular surface of β′-COP is coloured according to residue properties: acidic residues coloured red, basic side chains blue and hydrophobic side chains yellow. (C) Summary of binding measurements by isothermal calorimetry of KxHxx motif interactions with α-COP and β′-COP.

A CD8 reporter molecule tagged with the KKMP motif of adenoviral E19 protein is dependent on both lysine residues for ER retention, and among a series of substitutions of lysine with basic residues (HK, KH, RK and KR) only RKMP was shown to be a functional motif (Jackson et al, 1990).
Figure 5 illustrates the structural analysis of β′-COP bound to the RKLD motif of the Golgi-localized protein Scyl1 (Burman et al, 2008). Surprisingly, the conformation of the RKxx motif does not mimic KKxx, and we suggest that an arginine residue is slightly too long for optimal binding to acidic patch 1 in the KKxx mode of interaction. Instead, the RKxx motif mimics the KxKxx binding mode from position −4 to the C terminus (Figure 5A). The arginine side chain adopts a unique binding position, outside the circumscribed dilysine-binding site (shown schematically in Figure 6). Figure 5B illustrates how the arginine residue is extended to form bonds via its guanidinium group to the backbone carbonyl oxygen atom of β′-COP residue Arg185. Interestingly, the corresponding carbonyl oxygen atom is not available to form this bond in α-COP; instead, in all the α-COP crystal structures the carbonyl group of Arg213 points away from the binding site (ψ angle of Arg213 is rotated ~165° relative to the conformation in β′-COP; Figure 5B). These structural observations suggest that the RKxx motif should favour β′-COP over α-COP. ITC measurements show that the RKLD sequence binds to β′-COP (Kd = 20.4 µM) with comparable affinity to KKxx and KxKxx motifs (Figure 5C). The AKLD and KRLD test sequences each lose ~4-fold affinity, indicative of suboptimal binding relative to the RKLD motif. Strikingly, the RKLD sequence binds exceedingly weakly to α-COP (Kd > 1 mM), revealing a particular incompatibility of this motif with the α-COP binding site, the structural correlate of which we infer is the clash of the arginine side chain with the backbone amide group of α-COP residue Arg213 (Figure 5B). The concordance of the binding measurements and structural observations indicates that the β′-COP/RKxx crystal structure, and in particular the bonding arrangement of the −4 arginine residue, corresponds to the binding mode in solution. And it is of note that the RKxx motif adopts the same conformation in the two distinct complexes in the crystal asymmetric unit (Table I).

Figure 5 legend (fragment): (A) ... the RKxx motif from Scyl1, coloured magenta (sequence RKLD; refined at 1.5 Å resolution). Overlapped is the structure of the KxKxx motif of Emp47p, coloured green. The molecular surface of β′-COP is coloured according to residue properties: acidic residues coloured red, basic side chains blue and hydrophobic side chains yellow. (B) Picture of the Scyl1 RKLD motif bound to β′-COP in a similar orientation to (A), showing the bonding interactions made by the arginine residue of the RKLD motif (coloured magenta) to the backbone carbonyl oxygen atom of β′-COP residue Arg185 (β′-COP coloured cyan). The corresponding carbonyl group of α-COP residue Arg213 points in the opposite direction, as shown in the overlapped structure (coloured pink). Side-chain atoms of Arg185, Arg213 (and Asp212) are omitted for clarity. A key salt bridge that maintains the local conformation of β′-COP is drawn between residues Glu184 and Arg163 (indicated with blue dot). (C) Isothermal calorimetry measurements for α-COP and β′-COP binding to the GARKLD sequence of Scyl1 and variants thereof. The upper panel shows examples of the primary data; lower panel shows data fitted to binding isotherms. (D) Alignment of α-COP and β′-COP sequences in the vicinity of β′-COP residue Arg163 (blue dot), the key salt-bridge residue drawn in (B). This basic residue is conserved in β′-COP sequences, but changes to glutamine or glutamic acid in α-COP and the salt bridge is lost as a result.

An analysis of structure and sequence conservation in this region of the α-COP and β′-COP molecules suggests that the specificity of the RKxx motif for β′-COP is a phylogenetically conserved feature (Figures 5B and D). Binding of the RKxx motif relies on the orientation of the carbonyl group of Arg185 (S. cerevisiae β′-COP numbering), and this backbone orientation is imposed in turn by the salt bridge between residues Glu184 and Arg163 (Figure 5B). The salt-bridge pairing is highly conserved in β′-COP sequences: Glu184 is conserved as a glutamic acid, and Arg163 as an arginine, or as lysine in Caenorhabditis species. By contrast, the salt bridge is lost in α-COP sequences: the position corresponding to β′-COP Glu184 is highly conserved as an aspartic acid (or threonine in Saccharomyces spp. α-COP); however, as shown in Figure 5D, the position corresponding to β′-COP Arg163 changes in α-COP to a conserved glutamine or glutamic acid residue. Thus, the RKxx motif adopts a distinctive binding mode, and utilizes an expanded region of the binding site that is unique to the β′-COP subunit.

A comparative assessment of the crystal structures and ITC data suggests that the sequence context of the KKxx and KxKxx motifs plays a relatively minor role in coatomer recognition. The analysis of yWbp1 revealed one exception to this, namely the incompatibility of β-branched residues at the −2 position with tight binding to β′-COP. Additionally, we speculate that aspartic and glutamic acid residues may be favoured at the −1 position, to facilitate a charge interaction with Arg272 (β′-COP numbering), perhaps explaining the tight binding of the hWbp1 KEKSD motif (Figure 3D; 3.4 µM). These exceptions aside, the binding site seems to be designed to bind dilysine motifs without regard for sequence context. As illustrated in Figure 7A, the side chains of the −1 and −2 residues (KTKLL of Emp47p coloured magenta; KKLV of p25 coloured green) are directed away from the binding site; recognition of these residues is instead via their backbone carbonyl groups. Likewise, the −4 position of the KxKxx motif is directed away from the binding site (see, e.g., KEKSD in Figure 3D).
Figure 6. Binding modes of the four classes of dilysine-type retrieval signals. Schematic diagram shows the binding modes of the different classes of retrieval signals related to the dilysine motif. The acidic patches 1 and 2 are labelled and indicated in red; these patches are defined in the text and in Figure 3A. The basic patch that recognizes the terminal carboxylate group is coloured blue. α-COP and β′-COP recognize all the signals, except for the RKxx motif which is specific for β′-COP.

A study of ER retrieval signals in mammalian cells took a combinatorial approach to generate a series of K(x)Kxx motifs, and obtained a wealth of information on the trafficking itinerary of reporter molecules bearing these C-terminal sequences (Zerangue et al, 2001). The results showed that certain dilysine-containing sequences strongly enforced ER localization (e.g., KKYL and KKTN), whereas others did not (e.g., KKYY, KKTA and KKSP). With hindsight, it seems likely that many of the K(x)Kxx sequences generated in the combinatorial library encode additional trafficking information, such as the C-terminal (di)hydrophobic signal that links to the COPII coat machinery (Nufer et al, 2002; Miller et al, 2003). For example, the screen of Zerangue et al (2001) reported that proline at position −1 is strongly disfavoured in a functional retrieval motif, yet the adenoviral E19 KKMP sequence is a bona fide retrieval signal. We tested the affinity of two additional sequences described by Zerangue et al (2001): the KKYL sequence reported as an especially potent retrieval signal binds to both α-COP and β′-COP, with comparable affinity to other K(x)Kxx sequences; however, the KKAA motif reported not to function as a retrieval signal does also bind to both COP subunits in our ITC assay (Figure 7B).
We surmise that residues surrounding the lysines do not play a dominant role in binding to α-COP and β′-COP subunits, and we suggest that this may allow cargo C termini to nest additional topogenic information in their K(x)Kxx sequences without compromising the interaction with coatomer.
In this study, we establish how the four classes of dilysine-type retrieval motifs are recognized by coatomer subunits. Our focus was on naturally occurring motifs, rather than on K(x)Kxx sequence permutations that yield the highest affinity for coatomer. Consequently, there are no doubt rules and restrictions on dilysine-motif recognition not accounted for in our analysis. Nevertheless, the range of structures and sequences we have explored offers a fairly thorough description of the binding modes, recognition bonding interactions and certain sequence restrictions, to provide a heuristic basis on which to further explore the role of coatomer and dilysine signals in retrograde trafficking.
Here, we briefly summarize these rules with reference to Figure 6. A dilysine-type motif adopts one of three possible binding modes (the KxKxx and KxHxx motifs are considered as equivalent), dependent on the combination of the two basic residues (KxK, KK or RK) and independent of sequence context. For all binding modes, there is an absolute requirement for a basic residue at the −3 position of the motif, because the recognition of the C-terminal carboxylate group controls the distance to the acidic patches 1 and 2 on α-COP and β′-COP. The general preference for lysine residues in the retrieval signal does not arise from an absolute discrimination of the acidic patches against arginine and histidine side chains, as demonstrated by the finding that ER localization of hWbp1 protein is maintained when the KxKxx motif is replaced by the non-natural RxKxx, KxRxx, HxHxx, or the coronavirus KxHxx motif (Hardt and Bause, 2002).
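The positional rules summarized above lend themselves to a short illustration. The following Python sketch is not part of the study: the names `classify_tail`, `MotifCall` and `beta_branched_at_minus2` are hypothetical, and the code only encodes the rules as stated in the text (a basic residue at −3, the partner residue at −4 or −5, and RKxx recognition restricted to β′-COP). It assumes the input is the cytoplasmic tail in one-letter code with the free C terminus last. Where a tail matches more than one class (for example, lysines at −5, −4 and −3), all matches are returned, because the text does not define a precedence. A match indicates only a predicted binding class, not retrieval function in vivo; as noted above, the KKAA motif binds both subunits despite being reported as non-functional.

```python
from typing import List, NamedTuple, Tuple

ALPHA = "alpha-COP"
BETA_PRIME = "beta'-COP"
# Beta-branched residues (Val, Ile, Thr); the text reports that such a residue
# at the -2 position is incompatible with tight binding to beta'-COP.
BETA_BRANCHED = frozenset("VIT")


class MotifCall(NamedTuple):
    motif: str                 # motif class, e.g. "KKxx"
    binding_mode: str          # one of the three modes: "KK", "KxK" or "RK"
    subunits: Tuple[str, ...]  # subunits stated in the text to recognize it


def classify_tail(tail: str) -> List[MotifCall]:
    """Return every dilysine-type class matched by the C-terminal residues."""
    tail = tail.strip().upper()
    if len(tail) < 4:
        return []
    p3, p4 = tail[-3], tail[-4]               # positions -3 and -4
    p5 = tail[-5] if len(tail) >= 5 else ""   # position -5, if present
    calls: List[MotifCall] = []
    if p4 == "K" and p3 == "K":
        calls.append(MotifCall("KKxx", "KK", (ALPHA, BETA_PRIME)))
    if p4 == "R" and p3 == "K":
        calls.append(MotifCall("RKxx", "RK", (BETA_PRIME,)))
    if p5 == "K" and p3 == "K":
        calls.append(MotifCall("KxKxx", "KxK", (ALPHA, BETA_PRIME)))
    if p5 == "K" and p3 == "H":
        # KxHxx is treated in the text as equivalent to the KxKxx mode.
        calls.append(MotifCall("KxHxx", "KxK", (ALPHA, BETA_PRIME)))
    return calls


def beta_branched_at_minus2(tail: str) -> bool:
    """True if the -2 residue is Val, Ile or Thr (tail must have >= 2 residues)."""
    return tail.strip().upper()[-2] in BETA_BRANCHED


if __name__ == "__main__":
    # Tails quoted in the text: p25 peptide, Emp47p, hWbp1 and adenoviral E19.
    for seq in ("FEAKKLV", "KTKLL", "KEKSD", "KKMP"):
        print(seq, [call.motif for call in classify_tail(seq)])
```

With the sequences quoted in the text, the p25 peptide (FEAKKLV) and adenoviral E19 (KKMP) fall in the KKxx class, whereas Emp47p (KTKLL) and hWbp1 (KEKSD) fall in the KxKxx class.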
Favouritism towards lysine is also due to the binding modes of the motifs (enforced by interactions with peptide backbone carbonyl groups), which position arginine and histidine side chains suboptimally at the acidic patches. This effect seems to be the most pronounced for KKxx-family motifs, as reporter-based analysis shows that KRxx, KHxx and HKxx motifs are not functional retrieval signals (Jackson et al, 1990); only KKxx and RKxx motifs function as such and, as we have demonstrated, RKxx does so by adopting a KxKxx-like binding mode and positioning its arginine side chain beyond the acidic patches.
Our results shed light on the dilysine-binding specificity of the coatomer subunits. They demonstrate that α-COP and β′-COP do not discriminate between KKxx and KxKxx motifs, contrary to the original proposal from a study of Emp47p and Wbp1 trafficking in yeast (Eugster et al, 2004). Instead, discrete structural differences at the binding sites allow only α-COP to bind tightly to dilysine motifs with a β-branched residue at the −2 position, and allow only β′-COP to accept the rare RKxx signal.
The major unresolved issue in this context is the functional organization of α-COP and β′-COP in the polymerized COPI coat. Our previous structural analysis of αβ′-COP implicated a triskelion vertex formed by β-propeller contacts that are highly similar to those in the COPII cage (Stagg et al, 2006; Fath et al, 2007; Lee and Goldberg, 2010). However, as we emphasize in Supplementary Figure S2A, the αβ′-COP crystal structure included only a fragment of α-COP, so the location of the β-propeller domain of α-COP remains unclear. A direct attack on this problem will require electron microscopy, and a recent study of reconstituted COPI-coated vesicles using cryo-electron tomography (cryo-ET) has pointed the way forward (Faini et al, 2012).
This study identified a three-fold symmetry centre in the density distribution for the COPI cage, consistent with the αβ′-COP triskelion nucleus. However, fitting of additional solenoid regions of αβ′-COP and of the AP2 crystal structure (as a model for the βδ/γζ-COP subcomplex) was unsatisfactory, and clearer insights into the COPI architecture await an advance in the resolution of the cryo-ET analysis.
A key insight from the cryo-ET study is that protein features near to the three-fold symmetry centre reside close to the membrane surface (Faini et al, 2012). This is important, because most dilysine motifs are presented on cytoplasmic tail sequences that are intriguingly short. To assess this issue, we modelled the binding of a dilysine motif to β′-COP in the αβ′-COP triskelion on a membrane surface. As Supplementary Figure S2 illustrates, the dilysine motif binds very close to the contact surfaces at the triskelion centre: the KKxx motif fits snugly between adjacent β′-COP surfaces, whereas the KxKxx motif clashes somewhat with a neighbouring surface (however, we did not evaluate these contacts in detail because the triskelion interface is distorted from proper three-fold symmetry by crystal packing interactions). Importantly, the first lysine residue of the KKxx motif resides ~25 Å from the membrane surface, according to this model (Supplementary Figure S2C). This is just compatible with the lengths of cargo cytoplasmic tails: for example, the number of residues between the membrane surface and the first lysine residue is 13 in yWbp1, 10 in p25 and 12 in adenoviral E19. This striking juxtaposition of coat proteins and cargo signals has implications for the mechanism, still unresolved, through which coatomer couples cargo packaging to COPI-vesicle formation (Bremser et al, 1999).
The expression and purification of full-length B. taurus αβ′ε-COP subcomplex was carried out as described previously (Lee and Goldberg, 2010; Yu et al, 2012).
Two dimeric forms of S. cerevisiae αβ′-COP complex, one comprising α-COP(1-818)/β′-COP(651-814) and the other β′-COP(1-814)/α-COP(642-818), were prepared from baculovirus-infected Hi-5 insect cells, with the proteins encoded on individual vectors (pFastBac HTB vector from Invitrogen) and the baculoviruses constructed using the DH10Bac bacmid procedure (Invitrogen). Insect cells were harvested 48 h post infection, lysed by sonication and centrifuged at 100,000 × g for 90 min. Protein complexes were purified by nickel-affinity (Ni-NTA, Invitrogen), ion-exchange (HiTrap Q, Invitrogen) and size-exclusion chromatography. All proteins were brought to 20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 5 mM dithiothreitol (DTT), and were flash frozen in liquid nitrogen.
Proteins for crystallization were produced in E. coli. The N-terminal β-propeller domains of S. cerevisiae β′-COP (residues 1-301) and quintuple-mutant S. pombe α-COP (1-327; L181K, L185K, I192K, L196K and F197K) were expressed with an N-terminal hexa-histidine tag plus Smt3. Bacterial cells expressing target protein were lysed by French press, and the protein was purified on a nickel-affinity column. The Smt3 tag was removed by Ulp1 protease cleavage, and the protein was purified by anion-exchange and size-exclusion chromatography. Mutant protein constructs were generated using the Phusion PCR method (New England Biolabs) and purified in the same manner as wild-type protein.
For pull-down experiments, the cytoplasmic regions of dilysine-containing proteins were fused to the C terminus of GST. Constructs were made in the pGEX-4T-1 vector using the Phusion mutagenesis kit.
High-purity synthetic peptides were purchased from the Tufts University Core Facility. The sequences of the peptides used in crystallographic analyses and calorimetry experiments are listed in Table I.
The GST pull-down experiments were performed as described in Mossessova et al (2003), with several modifications.
A saturating quantity of GST-dilysine fusion protein was incubated with 10 µl of a 50% (v/v) slurry of glutathione sepharose 4B beads (GE Healthcare) for 30 min at 4 °C. Beads were washed once with 500 µl buffer A (20 mM Tris-HCl pH 7.5, 150 mM NaCl, 5 mM DTT and 0.1% Triton X-100), leaving ~60 µg GST fusion protein bound to the beads. Beads were mixed with 0.1 mg/ml coatomer proteins and 5 mg/ml E. coli proteins (prepared as described in Mossessova et al, 2003) in 100 µl buffer A. The assay mix was incubated at 4 °C for 45 min and washed once with 100 µl buffer A. Proteins were eluted with 60 µl SDS sample buffer, separated by 4-20% gradient SDS-PAGE and stained with Coomassie blue.
Binding experiments were performed on a MicroCal ITC200 instrument (GE Healthcare) operated at 25 °C. The protein and synthetic dilysine peptide samples were dialysed overnight against separate volumes of the same buffer: 20 mM Tris-HCl pH 7.5, 150 mM NaCl (peptides were dialysed using a 100-500 Da cutoff dialysis membrane). Measurements were taken over ~17 injections and data were fitted using Origin 7.0 software to estimate the equilibrium dissociation constant, stoichiometry and enthalpy of binding. Binding of peptides to α-COP was tested using the quintuple-mutant S. pombe α-COP(1-327), and binding to β′-COP was tested using the S. cerevisiae β′-COP(1-814)/α-COP(624-818) dimer.
For crystallization purposes, purified S. cerevisiae β′-COP(1-301) was concentrated to 30 mg/ml in 20 mM Tris-HCl pH 7.4, 150 mM NaCl and 5 mM DTT and flash frozen in liquid nitrogen. Crystals of protein/peptide complexes were grown by the hanging-drop method. For example, a 1-µl solution containing 1 mM β′-COP and 1.5 mM p25 peptide was mixed with an equal volume of well solution comprising 0.1 M MES pH 6.5, 30% PEG-400. Crystals grew at room temperature in 4 days in the space group P2₁2₁2₁ (see Table I for peptide sequence and crystallographic details).
Crystals were transferred into a cryoprotectant buffer containing well solution plus 20% glycerol, and frozen in liquid nitrogen. The β′-COP/p25 crystals diffracted synchrotron X-rays beyond 1.5 Å resolution. The model was refined using Phenix. Subsequently, the p25 peptide model was built manually into high-resolution difference Fourier maps. The density for the p25 peptide (FEAKKLV) is excellent, except for the first two residues, FE, which lie outside the binding site and are disordered in the maps. Crystallographic data and refinement results are summarized in Table I.
All the other structures of β′-COP/peptide complexes were determined in a similar manner. However, they almost all grew as distinct crystal forms M HEPES pH 7.5 and 20% PEG-10000; hWbp1, 0.1 M HEPES pH 6.3 and Emp47p, 0.1 M MES pH 6.0 and Scyl1, 0.1 M HEPES pH 7.1 and PEDVspike, 0.1 M MES pH 6.2 and
Synthetic peptides (adenoviral E19 and Emp47p) were added in a 1.5-fold molar excess to the α-COP protein, and crystals were grown by the hanging-drop method. For α-COP/E19 crystals, the well solution contained 0.2 M K Na tartrate plus 20% PEG-3350, and for the α-COP/Emp47p crystals it contained 0.2 M ammonium acetate plus 20% PEG-3350. The apo form of S. pombe α-COP was crystallized from 0.2 M tri-sodium citrate plus 20% PEG-3350. Crystals of α-COP/Emp47p and apo-α-COP were cryoprotected using well solution plus 25% glycerol.
The α-COP/E19 crystals were transferred to paraffin oil prior to freezing.
Coupling of coat assembly and vesicle budding to packaging of putative cargo receptors
Crystallography and NMR system: a new software suite for macromolecular structure determination
Scyl1, mutated in a recessive form of spinocerebellar neurodegeneration, regulates COPI-mediated retrograde traffic
Coatomer interaction with di-lysine endoplasmic reticulum retention motifs
The alpha- and beta'-COP WD40 domains mediate cargo-selective interactions with distinct di-lysine motifs
The structures of COPI-coated vesicles reveal alternate coatomer conformations and interactions
Structure and organization of coat proteins in the COPII cage
Bimodal interaction of coatomer with the p24 family of putative cargo receptors
Signal-mediated retrieval of a membrane protein from the Golgi to the ER in yeast
Lysine can be replaced by histidine but not by arginine as the ER retrieval motif for type I membrane proteins
Molecular basis for recognition of dilysine trafficking motifs by COPI
Identification of a consensus motif for retention of transmembrane proteins in the endoplasmic reticulum
Retrieval of transmembrane proteins to the endoplasmic reticulum
Structure of coatomer cage proteins and the relationship among COPI, COPII and clathrin vesicle coats
Coatomer is essential for retrieval of dilysine-tagged proteins to the endoplasmic reticulum
Ligand-induced redistribution of a human KDEL receptor from the Golgi complex to the endoplasmic reticulum
Intracellular targeting signals contribute to localization of coronavirus spike proteins near the virus assembly site
In vitro assembly and disassembly of coatomer
Multiple cargo binding sites on the COPII subunit Sec24 ensure capture of diverse membrane proteins into transport vesicles
SNARE selectivity of the COPII coat
Short cytoplasmic sequences serve as retention signals for transmembrane proteins in the endoplasmic reticulum
Role of cytoplasmic C-terminal amino acids of membrane proteins in ER export
Processing of X-ray diffraction data collected in oscillation mode
Suppression of coatomer mutants by a new protein family with COPI and COPII binding motifs in Saccharomyces cerevisiae
Alpha-COP can discriminate between distinct, functional di-lysine signals in vitro and regulates access into retrograde transport
Mutation in the cytoplasmic retrieval signal of porcine epidemic diarrhea virus spike (S) protein is responsible for enhanced fusion activity
Structure of the Sec13/31 COPII coat cage
The KKXX signal mediates retrieval of membrane proteins from the Golgi to the ER in yeast
Crucial step in cholesterol homeostasis: sterols promote binding of SCAP to INSIG-1, a membrane protein that facilitates retention of SREBPs in ER
A structure-based mechanism for Arf1-dependent recruitment of coatomer to membranes
Analysis of endoplasmic reticulum trafficking signals by combinatorial screening in mammalian cells
We thank staff of the NE-CAT beamlines at the Advanced Photon Source of the Argonne National Laboratory for access to synchrotron facilities. We thank all the members of our laboratory for advice and insights on coatomer behaviour and protein biochemistry experiments.
Author contributions: JG and WM designed the experiments and analysed the data. WM performed the experiments. JG and WM prepared the manuscript.
The authors declare that they have no conflict of interest.
Supplementary data are available at The EMBO Journal Online (http://www.embojournal.org).