key: cord-0770246-9gtsqyfj
authors: Rawlings, Neil D.; Barrett, Alan J.
title: Introduction: The Clans and Families of Cysteine Peptidases
date: 2012-11-09
journal: Handbook of Proteolytic Enzymes
DOI: 10.1016/b978-0-12-382219-2.00404-x
sha: 1f095b254f7775c63b8df3e68e7fddf12eae4b46
doc_id: 770246
cord_uid: 9gtsqyfj

The third edition of the Handbook of Proteolytic Enzymes aims to be a comprehensive reference work for the enzymes that cleave proteins and peptides, and contains over 800 chapters. Each chapter is organized into sections describing the name and history, activity and specificity, structural chemistry, preparation, biological aspects, and distinguishing features for a specific peptidase. The subject of Chapter 404 is Clans and Families of Cysteine Peptidases.

Peptidases in which the nucleophile that attacks the scissile peptide bond is the sulfhydryl group of a cysteine residue are known as cysteine-type peptidases. The catalytic mechanism is similar to that of the serine-type peptidases in that a nucleophile and a proton donor/general base are required, and the proton donor in all cysteine peptidases in which it has been identified is a histidine residue, as in the majority of serine-type peptidases. In some families only the dyad of Cys and His seems to be essential for catalysis, whereas in others there is evidence that a third residue is required to orientate the imidazolium ring of the His (a role analogous to that of the essential aspartate in some serine peptidases). The catalytic mechanisms of cysteine peptidases are described fully in Chapter 405. The clans and families of cysteine peptidases are listed in Table 404.1.

The first clearly recognized cysteine peptidase was papain (Chapter 418), and this forms the foundation of clan CA. The crystal structure (Figure 404.1) shows two structural domains separated by the active site cleft. The N-terminal domain consists largely of a bundle of α-helices whereas the C-terminal domain contains a β-barrel.
A long helix runs through the middle of the molecule, and the catalytic cysteine is at the start of this. Clan CA contains over 20 families (see Table 404.1). Almost all of these have been brought into the clan on the basis of structures that are known from crystallographic data to be similar. The other families are assigned to clan CA because they contain similar sequence motifs around the catalytic residues (Figures 404.2–404.7). Besides Cys158 and His292 of the catalytic dyad, two other residues are important for catalysis in papain and its relatives (numbering as in the MEROPS database). These are a Gln152 that helps in the formation of the 'oxyanion hole', an electrophilic center that stabilizes the tetrahedral intermediate, and Asn308 (sometimes Asp in families C12, C28 and C39), which is thought to orientate the imidazolium ring of the catalytic His (Chapter 405). The order of these residues in the sequence is Gln, Cys, His, Asn/Asp, and in mature papain the numbering is Gln19, Cys25, His159 and Asn175. In family C19, asparagine serves the same function as Gln152. No equivalent of Gln152 has been identified in the foot-and-mouth disease virus L-endopeptidase. The catalytic cysteine is usually followed by an aromatic, hydrophobic amino acid, but glycine occupies this position in some peptidases of family C2 and all those of family C12. Figure 404.8 shows secondary structure diagrams of example peptidases from the clan. All the examples shown are α/β proteins, and all have the nucleophilic Cys at the start of a helix and the catalytic His at the start of a β-sheet. Inhibitors are of some use in identifying peptidases of clan CA. Many members of the clan are irreversibly inhibited by compound E-64, but reversible inhibition by the arginine-like side chain of E-64 is not significant and should be ignored.
Proteins of the cystatin family also inhibit many peptidases in clan CA, but it should be noted that some cystatins have a second reactive site that can inhibit legumain, a clan CD peptidase [1], and some cystatins can inhibit metallopeptidases [2]. Family C1. The evolutionary tree (see the MEROPS database) shows that the family is divided into two subfamilies, C1A (or papain subfamily) and C1B (or bleomycin hydrolase subfamily). It may well be that a peptidase of family C1 was present in the universal ancestor of all organisms, and that this evolved into a subfamily C1A-type protein in early archaea and a subfamily C1B-type protein in early bacteria. Further divergence would have occurred in the separate groups of organisms, and horizontal transfers can explain the presence of a C1A homolog in Clostridium and of bleomycin hydrolase in eukaryotes. Subfamily C1A is the larger of the two subfamilies, and consists of secreted and lysosomal enzymes. Subfamily C1A contains mainly endopeptidases from DNA viruses, protozoa, plants and animals, and exopeptidases from gram-positive bacteria, fungi and animals. Members are known from archaea, but the majority of completed archaeal genomes lack C1A homologs. Homologs are patchily distributed in bacteria, with most genomes lacking a homolog. Most bacterial homologs are from members of the phyla Bacteroidetes and Chloroflexi. Even within a genus, some species will possess a homolog and some will not; for example, half of the Mycoplasma species with completed genomes have homologs. The subfamily includes the plant enzymes papain, chymopapain and actinidain, and the animal lysosomal enzymes cathepsins B, H and L. Figure 404.1 shows the three-dimensional structure of papain. There is a clear division within the subfamily between the cathepsin B-like enzymes and the papain-like enzymes. Among the cathepsin B-like enzymes are dipeptidyl-peptidase I and the endopeptidases from Giardia. Included among the papain-like enzymes are cathepsins O, H, L, K and S.
Cathepsin O is very divergent, whereas the others are more closely related. The close relationship between mammalian cathepsin H and plant aleurain is apparent. The distinction between exopeptidases and endopeptidases is blurred for some members of subfamily C1A. Dipeptidyl-peptidase I (Chapter 447) acts principally as an exopeptidase, removing N-terminal dipeptides, but may have limited endopeptidase activity. Cathepsins B and H (Chapters 406 and 408) both possess endopeptidase activity, but arguably are more important for their exopeptidase activities. Cathepsin B acts as a peptidyl-dipeptidase, releasing C-terminal dipeptides, and this activity is attributable to the existence of an extended loop that forms a cap to the active-site cleft, and carries a pair of His residues that are thought to bind the C-terminal carboxylate of the substrate (Figure 404.9). Cathepsin X is a rather specific carboxypeptidase (Chapter 415). The enzymes that enter the secretory pathway, destined either for secretion or for the lysosomal system, are synthesized as precursors, with N-terminal propeptides as well as the signal peptides. Most members of family C1 have propeptides homologous to that of papain, but the propeptide of cathepsin B is much shorter and very different in sequence. However, the propeptides of both cathepsin B and papain are thought to act in the same way in the proenzymes, blocking the active site by being bound in the reverse orientation to a substrate, as can be seen from the structure of procathepsin B (Figure 404.10). Papain-like propeptides may be identified from the presence of the ERFNIN motif, in which some of the following residues (numbered according to the papain propeptide) are conserved: Glu64, Arg68, Phe72, Asn75, Asn83, Phe96, Asp98 and Glu103. The papain propeptide is homologous to proteins CTLA-2α and CTLA-2β of activated T cells, which have been shown to be cysteine endopeptidase inhibitors [3].
It is notable that residue Pro2 is often conserved in the mature peptidases of family C1 (see the alignment in MEROPS), and this may prevent attack by aminopeptidases, since the Xaa-Pro bond is resistant to many such enzymes.

FIGURE 404.5 Richardson diagram of foot-and-mouth disease virus L-peptidase. The image was prepared from the Protein Data Bank entry (1QMY) as described in the Introduction (p. li). One molecule of the trimer is shown. Catalytic residues are shown in ball-and-stick representation: Cys51 (engineered to be Ala), His148 and Asp163 (numbering as in PDB entry).

The image was prepared from the Protein Data Bank entry (1CV8) as described in the Introduction (p. li). Catalytic residues are shown in ball-and-stick representation: Cys24, His120 and Asn141 (numbering as in PDB entry). E-64 is shown in gray in ball-and-stick representation.

Most members of subfamily C1A are monomeric, but dipeptidyl-peptidase I is a homotetramer in which each monomer consists of three chains as a result of proteolytic processing. Some peptidases of subfamily C1A have C-terminal extensions relative to papain. Among the endopeptidases from plants, the extensions at the C-terminus of oryzains α and β, and cysteine endopeptidases from tomato (UniProt: P20721), Arabidopsis thaliana (UniProt: P43297), Douglas fir (UniProt: Q40922), carnation (UniProt: Q43423) and pea (UniProt: Q41064) are homologous. The C-terminal extensions found in endopeptidases from Trypanosoma and Leishmania are unique in the family. Inserts relative to papain occur within the catalytic domain in other members of the family. In cathepsin B, the 'occluding loop' that carries the histidine residues important for peptidyl-dipeptidase activity is inserted between the catalytic Cys and His residues. This loop is not present in enzymes from Giardia [4] that are the most divergent of the cathepsin B-like proteins and probably represent the primitive state.
Some endopeptidases from Dictyostelium possess glycine- and serine-rich inserts between the active site His and Asn. The specificity subsite that is dominant in most peptidases of subfamily C1A is S2, which commonly displays a preference for occupation by a bulky hydrophobic side chain, and not a charged one. Unusually, however, the S2 subsite of cathepsin B readily accepts Arg, so that Z-Arg-Arg↓NHMec is a good selective substrate for the enzyme. This distinctive specificity of cathepsin B can be explained by the residue lying at the bottom of the S2 pocket, which in papain is Ser338, but in cathepsin B is Glu (numbering as in the MEROPS C1 alignment). Glu338 also occurs in cysteine endopeptidases from … Subfamily C1B (Figure 404.4) contains several soluble, intracellular peptidases. These are aminopeptidases that commonly show selectivity for release of N-terminal Arg residues, and include aminopeptidase C (PepC) from bacteria (Chapter 451), and bleomycin hydrolase (Chapter 449) from eukaryotes. Note that no bleomycin hydrolase homolog is … The aminopeptidase C-like enzymes are oligomeric. The yeast bleomycin hydrolase is probably representative, and is a homohexamer, with the active sites arranged on the inner face of the central channel, in an arrangement reminiscent of that in the proteasome. This arrangement evidently allows only small molecules to interact with the catalytic site. Unlike papain and cathepsin B, the aminopeptidases of subfamily C1B do not contain disulfide bonds and are synthesized without propeptides. As is discussed in Chapter 449, the mature bleomycin hydrolase subunit consists of three domains: the peptidase domain, an oligomerization (or 'hook') domain, and a helical domain. Half of the hook domain is N-terminal to the catalytic domain, but the other half is an insert (relative to papain) preceding the catalytic His. The helical domain corresponds to two inserts in the catalytic domain with respect to papain.
Figure 404.11 shows the tertiary structure of the yeast bleomycin hydrolase subunit.

FIGURE 404.9 Richardson diagram of the human cathepsin B/CA030 complex. The image was prepared from the Protein Data Bank entry (1CSB) as described in the Introduction (p. li). Catalytic residues are shown in ball-and-stick representation: Cys29, His199 and Asn219 (numbering as in PDB entry). One of the pair of histidines (His110) on the occluding loop is also shown in ball-and-stick representation.

FIGURE 404.10 Richardson diagram of rat procathepsin B. The image was prepared from the Protein Data Bank entry (1MIR) as described in the Introduction (p. li). Catalytic residues are shown in ball-and-stick representation: Cys29 (engineered to be serine), His199 and Asn219 (numbering as in PDB entry). One of the pair of histidines (His110) on the occluding loop is also shown in ball-and-stick representation. The propeptide is shown in gray.

There are a number of proteins homologous to papain that lack peptidase activity because catalytic residues have been replaced. These include an oil-body-associated protein from soybean (UniProt: P22895) in which the catalytic Cys is replaced by Gly, a surface protective protein from Plasmodium (UniProt: P08676) and a protein from Schistosoma japonicum (GenBank: X70969) in which the catalytic Cys has been replaced by Ser. In mammals, non-peptidase homologs include testin, which is secreted by Sertoli cells of the testis but has no known function, and tubulointerstitial nephritis antigen, an extracellular matrix basement protein that is a target for antibody-mediated interstitial nephritis [5]. Both mammalian proteins have the catalytic Cys replaced by Ser. Several pseudogenes are known in mammals, including a human cathepsin L-like pseudogene (GenBank: L07772). Family C2 contains the calpains. The molecule of a 'typical' calpain is a heterodimer of a heavy chain and a light chain (Chapters 454, 456).
The heavy chain is a mosaic protein containing the peptidase domain (domain II) (Figure 404.12) and also a C-terminal domain (domain IV) with four calcium-binding EF-hand structures. The calcium-binding domain is homologous to calcium-binding domains in other proteins. The functions of heavy chain domains I and III are unknown. Crystal structures of calcium-free human calpain have been described [6, 7], and they show that in the absence of calcium the peptidase unit adopts a conformation that disrupts the catalytic site. In most mammalian cells there are two calpains with different calcium requirements for activity: calpain-1 (Chapter 454) and calpain-2 (Chapter 455). Calpain-3 (Chapter 456) has a 48-residue insert in the catalytic domain between the catalytic Cys and His. A number of atypical calpains are known, in which the domains N-terminal and C-terminal to the protease domain are different from those in the typical calpains, and a small subunit may not be required for activity. A Drosophila homolog of calpain possesses an insert in the C-terminal domain, and there is evidence that differential splicing can give rise to a form lacking the calcium-binding domain [8]. Another calpain homolog exists in Drosophila and is involved in the development of the small optic lobe. This putative endopeptidase is the product of the sol gene, and the protein is predicted to be multidomain, including six zinc fingers as well as a peptidase domain, but no calcium-binding domain [9] (Figure 404.12). Again, an alternatively spliced form exists, this time lacking the peptidase domain. The calpain family is not restricted to animals: there are sequences from fungi, including Aspergillus and Saccharomyces, the protozoa Trypanosoma and Leishmania, the plant Arabidopsis, and the cyanobacterium Nostoc (but not Synechocystis). Homologs are also known from members of a wide variety of bacterial phyla, but homologs are absent from archaea.
The only bacterial homolog that has been characterized is the Tpr peptidase from Porphyromonas gingivalis, which is so dissimilar in sequence to other members of the family that it had been considered a member of a separate family, and is now the only representative in subfamily C2B. The absence of homologs from archaea and from many bacteria with completely sequenced genomes implies that horizontal gene transfer has occurred, probably from a protist or fungus to a bacterium (see the evolutionary tree in the MEROPS database). Aspergillus palB protein and a hypothetical protein from Caenorhabditis are unusual among peptidases in clan CA in that the catalytic Cys is followed by Ser rather than a hydrophobic residue (see the alignment in MEROPS). Both the Aspergillus and Caenorhabditis proteins are multidomain, but show only limited sequence similarity to domains I and III of human calpain. Neither protein possesses the calcium-binding domain IV. A non-peptidase homolog is known from Trypanosoma brucei in which the catalytic Cys has been replaced by Ser. Family C10 contains streptopain (Chapter 483) and a few similar enzymes, all of which are from gram-positive bacteria. Streptopain is inactivated by E-64 much more slowly than papain, but has a similar specificity, with a preference for a hydrophobic residue at P2 [10, 11]. In some members of the family, Asn308 (numbering as in Figure 404.7) is replaced by Asp. The prtT peptidase from Porphyromonas is a mosaic protein with a C-terminal hemagglutinin domain unrelated to those in gingipains R and K in family C25 but 98% identical to the C-terminal domain of hemin-regulated protein from Porphyromonas (UniProt: P72198). Surprisingly, streptopain homologs are found in some Bacteroides species but not others. There are several families in clan CA in which the peptidases release ubiquitin from conjugated proteins.
Ubiquitin is a 76-amino-acid protein that is attached to other proteins as a signal for intracellular translocation or degradation (usually mediated by the proteasome). Ubiquitin is attached through the carboxyl group of its C-terminal glycine residue either to the N-terminus of another polypeptide or to the ε-amino group of a lysine residue, in the latter case forming an isopeptide bond (Chapters 460–462). Ubiquitin may also be attached to another molecule of ubiquitin, and proteins targeted for degradation may be polyubiquitinated. In order for ubiquitin to be recycled there is a need for peptidases to liberate the free ubiquitin by hydrolysis of the C-terminal glycyl bond. Peptidases that release ubiquitin from conjugates are known as 'ubiquitin C-terminal hydrolases' or 'deubiquitinylating enzymes'. Because release of ubiquitin may involve cleavage of an isopeptide bond, ubiquitin C-terminal hydrolases are not classified as peptidases in Enzyme Nomenclature but as thiolester hydrolases. There are at least eight families of deubiquitinylating enzymes in clan CA, and there are enzymes in other, unrelated, families, including the metallopeptidase AMSH (family M67) which is a component of the 26S proteasome (Chapter 817). There are over 100 different deubiquitinylating enzymes in humans, which implies a sophisticated network controlling translocation and degradation of proteins that is only beginning to be understood. Family C12 includes peptidases that can hydrolyze the ubiquitin-conjugate glycyl bond whether it is an α-peptide bond or an isopeptide bond, with various specificities. The tertiary structure of the human ubiquitin C-terminal hydrolase UCH-L3 has been solved [12], and is shown in Figure 404.4. There are many structural similarities to papain. The molecule is bilobed, one lobe consisting mainly of helices, the other containing a β-barrel surrounded by helices. The catalytic Cys is at the start of one of the helices, and the catalytic His is at the start of a β strand.
A difference from papain is that the first strand of the β-barrel precedes the helix bearing the catalytic cysteine in the sequence, whereas in papain the helix precedes all the strands of the β-barrel. The spacing between Gln152 and Cys158 (numbering as in Figure 404.7) is identical to that in families C1, C2 and C10. Peptidases in family C12 are synthesized without propeptides, and are intracellular; they are known only from animals, plants and fungi. Family C19 contains the second group of ubiquitin C-terminal hydrolases. These are also intracellular peptidases, but are able to release ubiquitin from much larger peptides that have been polyubiquitinated (Chapters 463–473). Because ubiquitination is also essential for the assembly of multimeric proteins, there is a need to remove the ubiquitin molecules from polyubiquitinated subunits, which is presumably one of the functions of peptidases of family C19. Peptidases in family C19 have more complicated structures than those in C12, and many are multidomain proteins. There is also a much greater variety, with Saccharomyces possessing at least 19 homologs, and even greater numbers in mammals. The tertiary structure of ubiquitin-specific peptidase 7 has been solved and a catalytic triad consisting of Cys, His and Asp was identified (Figure 404.13; also see later Figure 404.35). The Asp is equivalent to the Asn175 of papain. The side chain of an asparagine stabilizes the oxyanion hole and is the equivalent of Gln19 in papain. The fold has been likened to that of an open hand, consisting of three subdomains, with the active site Cys on the 'thumb' subdomain, the active site His on the 'palm' and the active site cleft between the 'thumb' and the 'palm'. The 'fingers' interact with ubiquitin [13]. Members of family C19 are present in all the eukaryotic genomes so far completely sequenced, including Encephalitozoon cuniculi. Family C19 sequences are known from several protozoan species that do not contain family C12.
C19 homologs are known from bacteria such as the pathogen Burkholderia pseudomallei, but the sequences are shorter (c. 280 residues) and lack inserts between the catalytic Cys and His. The bacterial homologs are derived from the most divergent branch in the evolutionary tree. The TssM protein from Burkholderia mallei has been shown to be a deubiquitinating enzyme [14]. Other families of deubiquitinating enzymes, initially postulated by Makarova et al. [15], have now been experimentally verified. The human Cezanne protein, from family C64, was shown to release ubiquitin from linear or branched synthetic ubiquitin chains and from ubiquitinated proteins [16], and from the solved structure of tumor necrosis factor α-induced protein 3, the fold was shown to be similar to that of papain [17]. Family C65 includes the otubains (see Chapter 477). The tertiary structure of otubain-2 has been determined and shows that the fold is similar to that of papain [18]. Otubain-1 has been shown to cleave Lys48-linked polyubiquitin [19]. The CylD protein from family C67 (see Chapter 476) has been shown to have deubiquitinating activity that is directed towards non-Lys48-linked polyubiquitin chains [20]. The tertiary structure has been solved [21]. Family C85 includes the DUBA deubiquitinylating peptidase, which has been shown to cleave the Lys63-linked polyubiquitin chains on tumor necrosis factor receptor-associated factor 3 [22]. However, no tertiary structure has been solved for any member of the family, and the active site residues are predictions only [15]. Ataxin-3 (Chapter 479) is yet another deubiquitinating enzyme and a member of family C86. The nuclear magnetic resonance solution structure of the domain containing the active site of ataxin-3, known as the Josephin domain, has been determined [23], and shows the characteristics of a cysteine peptidase in clan CA.
Family C88 includes OTU1 peptidase from Saccharomyces cerevisiae (Chapter 478), which preferentially cleaves longer polyubiquitin chains with Lys48 linkages. The tertiary structure has been solved for the catalytic domain, and the structure is similar to that of other clan CA members [24]. Surprisingly, enzymes that release ubiquitin from proteins have been discovered in viruses. The UL36 protein from herpes simplex virus releases ubiquitin from proteins ubiquitinylated via Lys48 and is the founder of family C76 (Chapter 480). The tertiary structure has been solved, and shows a papain-like fold [25]. The Crimean-Congo hemorrhagic fever virus (a Nairovirus) also possesses a deubiquitinating enzyme, which can downregulate antiviral responses of the host by release of ubiquitin and interferon-stimulated gene product 15 from proteins of the antiviral and inflammatory pathways [26]. The Nairovirus enzyme is a member of family C87, and the order of active site residues, identified by mutagenesis [26], led us to include the family in clan CA. Other proteins act like ubiquitin and can be attached to proteins as tags; these include SUMO and NEDD-8. Ubiquitin-fold modifier-1 (Ufm1) is another ubiquitin-like protein, and like ubiquitin it attaches to proteins via the C-terminal glycine. The physiological relevance of tagging proteins with Ufm1 is unknown, but tagged proteins are not degraded. Peptidases in family C78, which includes UfSP1 peptidase (Chapter 481), release Ufm1 from conjugates and process the Ufm1 precursor by removing a C-terminal dipeptide [27]. Once again, resolution of the tertiary structure has confirmed the relationship to papain [28], and family C78 is included in clan CA. Unusually, the aspartic acid that is the third member of the catalytic triad is N-terminal to the histidine, whereas in most papain-like proteins it follows the histidine. The residue stabilizing the oxyanion hole is a tyrosine (whereas it is glutamine in papain).
In both these respects, members of family C78 resemble members of family C54. Family C54 includes the yeast peptidase Aut2 that is responsible for the removal of a C-terminal arginine residue from the Apg8 (or Aut7) protein. The Apg8FG protein, as the processed substrate is known, is then able to form a conjugate with an unidentified protein via the newly exposed C-terminal Gly, and the conjugate binds tightly to membranes, permitting the formation of the double-membrane vesicles required for the transport of material from the cytoplasm to an autophagic vesicle. Homologs are known from human, mouse, Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana. Inhibition by N-ethylmaleimide and site-directed mutagenesis of Cys159 (see Alignment C54 in MEROPS) have shown that yeast peptidase Aut2 is a cysteine peptidase (Chapter 482), but other catalytic residues have not been determined. The tertiary structure of human autophagin-1 has been solved [29, 30], and shows a catalytic triad consisting of Cys74, Asp278 and His280 that is unusual amongst papain-like proteins, in which the third member of the catalytic triad normally follows the histidine in the sequence (see Figure 404.14). Families C54 and C78 are unique in having the third member (in both families an aspartic acid) preceding the histidine. Members of both families are also unique in having a tyrosine helping to stabilize the oxyanion hole. Family C47 includes staphopain (see Chapters 484 and 485), a cysteine peptidase from Staphylococcus epidermidis. The tertiary structure has been determined and shows a papain-like fold [31] (see Figure 404.6). Staphopain is inhibited by E-64 and cystatin A. Family C39 includes endopeptidases that are involved in the processing and export of bacteriocins. Bacteriocins are antibiotic proteins secreted by some species of bacteria that have the effect of inhibiting the growth of other bacterial species.
The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide at a Gly-Gly↓ bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. The endopeptidase serves both functions, and is also known as an ATP-binding cassette transporter or ABC-transporter. The endopeptidases are integral membrane proteins containing an N-terminal peptidase domain followed by six or more transmembrane domains and a C-terminal ATP-binding domain. The structure of the peptidase domain has been solved, and shows a papain-like fold (see Figure 404.15) [32]. The non-peptidase domains are homologous to a variety of other ABC-transport proteins. Peptidase specificity is not unlike that of the ubiquitin peptidases. The complex structure of bacterial cell walls means that there are a large number of peptidases involved in cell wall processing and lysis. Newly synthesized bacteriophage virions escape from the host cell by lysing it, and the bacteriophage genomes encode a cell wall lytic enzyme that acts either as a peptidase to degrade the cell wall crosslinks, or as an amidase to disrupt the acetylmuramyl bonds. Staphylococcus aureus has a multi-domain enzyme known as autolysin, the C-terminal domain of which is related to lysostaphin (peptidase family M37). The autolysin gene has been acquired by a number of staphylococcal phages, and in bacteriophage ϕ11, autolysin has been shown to cleave the cell wall cross-linking peptide at D-Ala↓Gly. The peptidase activity is expressed by the N-terminal domain [33]. An alignment shows that residues Gln31, Cys32, His95 and Asn113 are conserved, suggesting that the N-terminal domain is a cysteine peptidase with a papain-like fold, which we include in family C51. Pseudomonas syringae is a plant pathogen which secretes disease-causing factors into the host cell via its type III secretion pathway [34].
The AvrRpt2 protein (Chapter 552) is one of these virulence factors; it activates itself and probably cleaves the RIN4 protein of the host plant cell. By comparison with staphopain, AvrRpt2 protein has been predicted to be a cysteine peptidase with a papain-like catalytic triad of Cys, His and Asp, and mutation of any of these residues prevented autoactivation or RIN4 digestion [35]. The AvrRpt2 protein is included in family C70. Family C58 includes the YopT effector protein from the plague organism Yersinia pestis. The Yop effectors are proteins secreted into host cells that disrupt the immune response; two of these effectors are now known to be peptidases, YopT and YopJ (see Chapter 488). The YopT effector has been shown to be a cysteine peptidase, and is known to remove a C-terminal prenylated cysteine (or a short peptide containing the C-terminal cysteine) from GTPases. The effect is to release the GTPase from the membrane. The tertiary structure has been solved for a homolog from the plant pathogen Pseudomonas syringae, which is the avirulence protein AvrPphB (see Figure 404.16). The only proteolytic activity that could be demonstrated was that for autocatalytic activation of the precursor protein. A papain-like catalytic triad (Cys/His/Asp) has been identified in both the YopT effector and the AvrPphB protein by site-directed mutagenesis. However, there does not appear to be a conserved residue in the family that would act like Gln19 in papain [36]. Another related peptidase from Pseudomonas syringae, HopN1, is a virulence factor and is injected into host plant cells by a type III secretion system, where it suppresses cell death [37]. The HopN1 peptidase is only distantly related and, consequently, family C58 is divided into two subfamilies, with YopT and AvrPphB members of subfamily C58A and HopN1 in C58B. The IdeS peptidase (Chapter 489) from Streptococcus pyogenes is included in family C66. This peptidase cleaves the γ-chains of human IgG.
From the crystal structure [38], a papain-like fold is apparent, as is a catalytic triad consisting of Cys94, His262 and Asp284 (see Figure 404.17). Family C83 includes phytochelatin synthases (Chapter 490), which are γ-glutamylcysteine dipeptidyltranspeptidases, from cyanobacteria and plants. These enzymes remove the C-terminal Gly from glutathione (γ-Glu-Cys-Gly). This is in contrast to the action of the γ-glutamyltransferases (Chapter 820) that remove the N-terminal residue. Removal of the C-terminal Gly is an important step towards formation of phytochelatin from linear polymers of γ-Glu-Cys, and phytochelatin is important because ions of heavy metals such as mercury and cadmium are rendered harmless by forming tight complexes with phytochelatin. The tertiary structure of γ-glutamylcysteine dipeptidyltranspeptidase from the cyanobacterium Nostoc sp. PCC 7120 has been solved, and shows a papain-like fold and an active site also similar to that of papain, containing Gln, Cys, His and Asp [39] (see Figure 404.18). The 'papain-like' endopeptidases of RNA viruses form a large group of cysteine peptidases that contain the catalytic dyad residues in the order Cys, His. All of the families had been included in clan CA, even though for many no tertiary structure for any member had been solved. When the crystal structure of the nsP2 (nonstructural protein 2) from Sindbis virus (a member of family C9) was determined, the fold was quite unlike that of papain, and the family was assigned to its own clan (CN, see below). Consequently, many families of peptidases from RNA viruses that had been included in clan CA have been removed, because until a tertiary structure has been solved it will not be possible from sequence alone to determine if any of these families belongs to clan CA or clan CN. The families of RNA viruses retained in clan CA, and for which tertiary structures have been solved, are C28, C31, C32 and C87.
Each family contains peptidases from only one family of viruses. Thus, endopeptidases from aphthoviruses are in family C28, those from arteriviruses in C31 and C32, and those from nairoviruses in C87. The cleavages mediated by the clan CA viral endopeptidases usually show Gly in P1 (contrasting with Gln in P1 for many cleavages by subclan PA(C) endopeptidases). In most of these viruses there is a single polyprotein containing a single peptidase, but there may be up to three peptidases in a single polyprotein. In the foot-and-mouth disease virus, potyviruses and togaviruses, the other cysteine peptidases are not related to papain (having the catalytic dyad in the order His, Cys). In the coronaviruses and arteriviruses, all the endopeptidases have a Cys, His catalytic dyad, suggesting papain-like tertiary folds, and are probably derived from one or two sequence duplications. Such duplications appear to have been as ancient as the speciation events that gave rise to the different viruses, because there is no significant sequence similarity, and hence the peptidases are in different families. The crystal structure of the L-peptidase from the foot-and-mouth disease virus (in family C28) shows a clear relationship to papain. Although aphthoviruses such as the foot-and-mouth disease virus possess picornain 3C (Chapter 537) as a general polyprotein-processing enzyme, they lack a picornain 2A, and the functions of that endopeptidase are performed by the L-peptidase or an autolytic activity (see Chapter 491). The L-peptidase is sited at the N-terminus of the polyprotein, from which it releases itself. The L-peptidase also cleaves the eukaryotic initiation factor eIF-4G (formerly known as p220 or eIF-4γ), thus preventing cap-dependent synthesis of host-cell proteins. Family C16 contains the mouse hepatitis virus p28 endopeptidase, which releases itself from the N-terminus of the polyprotein encoded by gene A. The catalytic dyad has been identified by mutagenesis [40].
The polyprotein encoded by gene A includes a picornain-like endopeptidase and a putative second papain-like endopeptidase, formerly assigned to a separate family, C29, but now considered to be a member of family C16. The compound E-64c inhibits polyprotein processing in the mouse hepatitis virus [41], presumably by inhibiting the action of one or both of the two papain-like endopeptidases. The tertiary structure of the papain-like endopeptidase from SARS virus has been solved (see Figure 404.19). Families C31, C32 and C33 include polyprotein-processing endopeptidases from arteriviruses. In the Lelystad (porcine reproductive and respiratory syndrome) virus, the three papain-like endopeptidases are consecutive in the pol polyprotein, with PCPα at the N-terminus, followed by PCPβ and then Nsp2, which are so different in sequence that they are assigned to the separate families C31, C32 and C33, respectively. PCPα and PCPβ together constitute the Nsp1 protein. The catalytic dyads have been determined by mutagenesis for all three endopeptidases [42, 43]. Besides the three papain-like cysteine endopeptidases, the arteriviruses possess a fourth, serine-type, endopeptidase (see Chapter 691) in the pol polyprotein. Not all arteriviruses require all four endopeptidases, because in the equine arteritis virus, PCPα is inactive, the catalytic Cys being replaced by Lys. The PCPα endopeptidase releases itself from the N-terminus of the Nsp1 protein [42]. The PCPβ endopeptidase cleaves between the Nsp1 and Nsp2 proteins in the pol polyprotein [42]. The Nsp2 endopeptidase cleaves between the Nsp2 and Nsp3 proteins, and, unusually, the catalytic Cys is followed by Gly. Cys-Gly is generally regarded as diagnostic for the subclan PA(C) viral endopeptidases, but it has also been seen in some calpain homologs [43] and in family C12.
In the archaean Methanothermobacter, the cell wall is analogous to that of bacteria, but the crosslinking peptide contains amino acids only in the normal, laevo, orientation. The crosslink is made between the ε-amino group of L-Lys and the α-carboxyl group of L-Ala. It is this bond that is broken by the pseudomurein endoisopeptidase (family C71) from Methanobacterium phage psiM2. The enzyme can be inhibited by EDTA, suggesting that metal ions are important for activity. However, a cysteine-type catalytic triad (Cys, His, Asp), similar to that of papain, has also been proposed [44], and family C71 is included in clan CA. A second clan of cysteine peptidases contains processing endopeptidases of RNA viruses. Formerly termed clan CB, this has the catalytic dyad in the order His, Cys in the sequence. A comparison of sequence motifs led Bazan and Fletterick [45] to suggest that the polyprotein processing cysteine endopeptidases of several single, positive-stranded RNA viruses were structurally related to chymotrypsin, and this was confirmed when the crystal structures of picornains 3C from human hepatitis A virus (Chapter 541) (Figure 404.20) and human rhinovirus type 14 (Chapter 537) were found to share the two-β-barrel fold of chymotrypsin. There now seems no doubt that these peptidases have evolved from a common ancestor with chymotrypsin. The catalytic His of picornain 3C (in family C3) is equivalent in position to His57 of chymotrypsin, and the Cys replaces Ser195 (see the representation of two-dimensional structures for clan PA in Figure 559.2). In rhinovirus picornain 2A a His, Asp, Cys catalytic triad is seen [46], but the Asp is replaced by Glu in some other members of family C3. The viral endopeptidases and chymotrypsin clearly belong in the same clan, and because the clan contains both cysteine-type and serine-type peptidases it has been named clan PA.
The clan is divided into subclan PA(S), for the serine peptidases, and subclan PA(C) for the cysteine peptidases. Families C4, C24, C30, C37 and C38 also contain endopeptidases from RNA viruses that are predicted to have the chymotrypsin fold, and are placed in subclan PA(C) together with family C3. The viral peptidases in subclan PA(C) contrast in several respects with those in clan CA. In subclan PA(C) the active site Cys is followed by Gly (like the Ser in chymotrypsin) whereas in clan CA the catalytic Cys is generally followed by a large hydrophobic residue. As regards function, the endopeptidases of subclan PA(C) tend to be general processing enzymes, cleaving several bonds in the polyprotein, whereas those in clan CA generally perform only a single cleavage. There is also a difference in P1 substrate specificity: in subclan PA(C) cleavage is commonly at a Gln↓Gly bond, whereas in clan CA cleavage usually occurs at a Gly↓Gly bond. The subclan PA(C) endopeptidase is usually positioned between the helicase and the RNA polymerase in the polyprotein, whereas the clan CA enzyme usually precedes both the helicase and the RNA polymerase. Taken together with other indications that the positive-stranded RNA viruses are evolutionarily related, the similar locations of the subclan PA(C) endopeptidases in the genomes are entirely consistent with a common evolutionary origin of all of the peptidases in the subclan. The high rate of mutation of RNA viruses, which lack proofreading in the replication of the genome, could rapidly have obscured relationships between the sequences. Family C3 contains the processing endopeptidases of picornaviruses, aphthoviruses, nepoviruses and comoviruses. The first group includes animal viruses such as those that cause polio and encephalomyocarditis.
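The P1 contrast between subclan PA(C) and clan CA can be illustrated with a short scan for candidate scissile bonds. This is only a sketch under stated assumptions: the function name `find_dipeptide_sites` and the toy sequence are hypothetical, and the presence of a Gln-Gly or Gly-Gly dipeptide is not by itself sufficient for cleavage by any real endopeptidase.

```python
def find_dipeptide_sites(sequence, p1, p1_prime):
    """Return 1-based positions of P1 residues for every p1/p1_prime dipeptide.

    A returned position n means a candidate scissile bond between
    residues n and n+1 of the sequence.
    """
    sites = []
    for i in range(len(sequence) - 1):
        if sequence[i] == p1 and sequence[i + 1] == p1_prime:
            sites.append(i + 1)  # convert 0-based index to 1-based residue number
    return sites


# Hypothetical toy polyprotein fragment (not a real viral sequence).
toy = "MAQGSLLGGTVKQGA"

gln_gly = find_dipeptide_sites(toy, "Q", "G")  # Gln-Gly, common in subclan PA(C)
gly_gly = find_dipeptide_sites(toy, "G", "G")  # Gly-Gly, usual in clan CA viral enzymes
print(gln_gly, gly_gly)
```

In practice such a scan would only narrow down candidates; the text notes that the two groups also differ in the number of bonds cleaved and in their position in the polyprotein.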
Picornaviruses encode one polyprotein that contains coat and core proteins, an RNA polymerase and two endopeptidases that are homologous to each other and responsible for excising the individual proteins. The larger of the endopeptidases, picornain 3C, is responsible for most of the cleavages, mainly at Gln↓Gly bonds, whereas the smaller endopeptidase, picornain 2A, releases itself by cleavage of a Tyr↓Gly bond. Picornain 2A has the further function of cleaving the eukaryote initiation factor 4G of the host cell [47]. Picornaviruses are unusual in possessing two homologous endopeptidases; in aphthoviruses, for example, cleavage of the eukaryote initiation factor 4G is performed by an endopeptidase unrelated to the picornains, the L-peptidase of family C28, clan CA (Chapter 491). Although the picornains 3C and 2A are homologous, the sequences are so different that they are assigned to the separate subfamilies C3A and C3B, respectively. The polyprotein processing enzyme from aphthoviruses, which include the foot-and-mouth disease virus, is so different in sequence from the picornains that it too is assigned to a separate subfamily, C3C. The processing endopeptidase for the hepatitis A virus is a member of a fourth subfamily, C3E, and a fifth subfamily, C3F, includes the protease from parechovirus 1. Not all members of family C3 are animal pathogens, and subfamily C3D includes processing peptidases for the polyprotein of cowpea mosaic virus and other plant pathogens. A seventh subfamily, C3G, includes a polyprotein processing endopeptidase from rice tungro spherical virus. Catalytic activity has been demonstrated and the cleavage sites determined by site-directed mutagenesis to be Gln2526↓Asp and Gln2852↓Ala [48]; the catalytic residues are predicted to be His2680, Asp2722 and Cys2811 (see Alignment C3G in MEROPS). A homolog is also known from maize chlorotic dwarf waikavirus.
Family C4 contains one of the three polyprotein processing endopeptidases from potyviruses, which are plant pathogens. The NIa endopeptidase has been shown by site-directed mutagenesis to have at least a catalytic dyad that is in the order His, Cys [49]. Another similarity with family C3 is the preference for cleavage of Gln↓Gly bonds. The endopeptidase is a multifunctional molecule, acting also as the Vpg protein, which is attached covalently to the viral RNA [50]. Family C24 includes the processing endopeptidase from caliciviruses. Catalytic residues have been identified in the rabbit hemorrhagic disease virus, and a His, Cys dyad is the most likely order in the sequence, but a histidine C-terminal to the cysteine is also important, probably for substrate binding [51]. The endopeptidase cleaves Glu↓Gly bonds in the polyprotein [52]. Family C30 includes the processing endopeptidase from coronaviruses. Again, there is a His, Cys catalytic dyad, and cleavage is preferentially at Gln↓Gly bonds [53]. A representation of the three-dimensional structure of the main protease (the processing peptidase) from porcine transmissible gastroenteritis virus is shown in Figure 404.21. Family C37 contains the processing endopeptidases of Southampton and Norwalk caliciviruses. The catalytic cysteine has been identified in Southampton virus, and there is a preference for cleavage at Gln↓Gly bonds [54]. Family C38 comprises a putative endopeptidase from parsnip yellow fleck virus identified by comparison with the picornavirus polyprotein [55]. Family C74 includes the NS2 protein, one of the three polyprotein processing peptidases from pestiviruses. The catalytic triad has been established by mutagenesis as His, Glu, Cys, consistent with other families in subclan PA(C) [56]. Clan CD contains endopeptidases in which only a catalytic dyad appears to exist, in the order His, Cys in the sequence. Chen et al.
[57] recognized that sequence motifs around the catalytic residues in mammalian legumain were similar to those in the caspases (family C14). Moreover, they saw similar motifs in the cysteine peptidases of families C11 (clostripain) and C25 (gingipain R). They proposed the existence of a clan CD of cysteine peptidases with a protein fold similar to that already known for the caspases (family C14). The proposal soon received support from the crystal structure of gingipain R2. The structure of caspase 1 (see Figure 404.22, Chapter 503) shows a distinctive α/β fold quite unlike those of clans CA and PA. In the sequence motif described by Chen et al. [57], the catalytic histidine is preceded by a block of hydrophobic residues and a glycine, and the catalytic cysteine is preceded by a second block of hydrophobic residues, a glutamine and an alanine (see Figures 404.23 and 404.24). A similar motif has been recognized in separase (family C50), RTX self-cleaving toxin (family C80) and prtH peptidase (family C84), and all three families have been added to clan CD. Similar motifs are present in proteins not yet known to be peptidases [58], so the clan may expand still further. The peptidases in clan CD have some biochemical similarities. They are all endopeptidases with restricted specificity dominated by selectivity for the P1 residue of the substrate. Thus, in family C14 cleavage occurs predominantly after Asp, in C13 after Asn, in C11 and C50 after Arg, and in C25 after Arg or Lys depending on the peptidase. This P1-dominated type of specificity resembles that of the unrelated chymotrypsin-like serine peptidases in subclan PA(S), and is quite unlike that of clan CA. The endopeptidases in the clan are not irreversibly inhibited by E-64, in contrast to most enzymes in clan CA. Clostripain and legumain react more rapidly with iodoacetamide than with iodoacetate, whereas iodoacetate is the faster inhibitor for papain and other members of family C1 (Chapter 419).
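The P1-dominated specificity of the clan CD families lends itself to a simple lookup. The following is a minimal sketch, assuming one-letter amino acid codes; the names `CLAN_CD_P1` and `candidate_p1_positions` and the toy peptide are hypothetical, and the table records only the P1 preferences stated in the text, ignoring every other determinant of cleavage.

```python
# P1 preferences for clan CD families, as stated in the text.
CLAN_CD_P1 = {
    "C14": {"D"},       # caspases: after Asp
    "C13": {"N"},       # legumain: after Asn
    "C11": {"R"},       # clostripain: after Arg
    "C50": {"R"},       # separase: after Arg
    "C25": {"R", "K"},  # gingipains: after Arg or Lys, depending on the peptidase
}


def candidate_p1_positions(sequence, family):
    """Return 1-based positions whose residue matches the family's P1 preference."""
    allowed = CLAN_CD_P1[family]
    return [i + 1 for i, residue in enumerate(sequence) if residue in allowed]


toy = "MDNRAKDG"  # hypothetical peptide, for illustration only
for family in sorted(CLAN_CD_P1):
    print(family, candidate_p1_positions(toy, family))
```

A dictionary of sets keeps the family C25 case (Arg or Lys) in the same form as the single-residue families.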
CrmA, a serpin from cowpox virus, is an inhibitor of the caspases [59] that also inhibits the serine endopeptidase granzyme B (Chapter 598) [60]. Rather than reflecting any structural relationship, however, this is undoubtedly the result of a similar specificity of the enzymes for aspartyl bonds. Family C14 comprises a number of cytosolic endopeptidases that have strict specificity for hydrolysis of aspartyl bonds, the best known of which are caspase-1 (Chapter 505) and caspase-3 (Chapter 509). As can be seen in the alignment in MEROPS, there is a catalytic dyad in the order His, Cys in the sequence, as expected in clan CD. Caspase-1 is synthesized as a single-chain precursor, and the activation by cleavage of four aspartyl bonds is presumably autocatalytic. The mature endopeptidase is a heterodimer of a 22 kDa heavy chain and a 10 kDa light chain [61]. Caspase-1 also mediates the processing of interleukin 1β at aspartyl bonds [62], and was formerly known as interleukin 1β-converting enzyme (ICE). The discovery that the product of the ced3 gene in Caenorhabditis elegans, which when defective increases the lifespan of the nematode [63], is a homolog of caspase-1 led to the discovery of a proteolytic cascade that leads to apoptosis in mammalian cells. Apoptosis is receptor mediated, and once the system is activated, proteins associate with one another by means of 'death domains' and 'death-effector domains'. Death-effector domains occur in a number of intracellular proteins and in caspases-8 and -10 [64]. FLIP, a newly reported inhibitor of caspase-8, occurs in two alternatively spliced forms, the longer of which, FLIP L, contains two death-effector domains and a caspase-like domain in which the catalytic Cys is replaced by Tyr, and the His by Arg (human) or Leu (mouse) [65] (see the alignment for family C14 in MEROPS). The evolutionary tree shows a pronounced divergence within the family, which is therefore divided into two subfamilies.
Subfamily C14A includes caspases, whereas subfamily C14B includes metacaspases (see Chapter 517) and paracaspases (see Chapter 515). Paracaspase is known from a variety of animal species, including human, and provides an important link to a number of bacterial species, including homologs in the completed genomes of Nostoc, Mesorhizobium loti and Xylella fastidiosa [58]. The family is generally absent from archaea, but homologs have been detected in the proteomes of Methanosarcina barkeri, Methanothermus fervidus and Thermococcus onnurineus, which are presumably derived from horizontal gene transfers. There are also homologs in two insect ascoviruses, which are double-stranded DNA viruses. Family C13 includes a number of endopeptidases that specifically cleave asparaginyl bonds. An asparaginyl endopeptidase was first identified in seeds of leguminous plants and was named 'legumain' [66]. The legumain of plant seeds has now been found in a wide variety of dicotyledonous plants and is proposed to be responsible for the post-translational processing of seed proteins prior to storage [67]. The processing of a few of these proteins, such as concanavalin A, also involves protein splicing at asparaginyl bonds, and this transamidase reaction also can be mediated by legumain [68]. Homologous asparaginyl endopeptidases have been characterized from the blood flukes Schistosoma mansoni and S. japonicum; they were initially thought to be responsible for hemoglobin breakdown and were termed 'hemoglobinases', but this name is no longer appropriate. Mammalian legumain has been characterized and shown to be a lysosomal enzyme [69]. Legumain is not inhibited by compound E-64 and, like clostripain, it reacts more rapidly with iodoacetamide than with iodoacetate.
The enzyme is inhibited by egg-white cystatin, which is also an inhibitor of peptidases in family C1 (clan CA), but it has been found that cystatin possesses two reactive sites, one for legumain and one for the papain-like enzymes, and is able to inhibit both simultaneously and independently [1] (Chapter 518). Not a conventional endopeptidase, but also in the legumain family, is glycosylphosphatidylinositol:protein transamidase. This has been identified in Saccharomyces and humans, and is responsible for attaching glycosylphosphatidylinositol (GPI) anchors to the C-termini of newly synthesized proteins in the endoplasmic reticulum. A C-terminal peptide is removed before the preformed GPI anchor is attached, apparently in a single reaction. The transamidase differs considerably in sequence from the legumains, and the Cys residue that had been predicted to be catalytic in Schistosoma legumain [70] is replaced by Leu. The evolutionary tree demonstrates the marked difference between the forms of legumain and the GPI8 transamidase (Chapter 520). Members of family C13 are known principally from animals, plants and fungi: the family is absent from the genome of the microsporidian Encephalitozoon cuniculi and not known from any other protozoa. However, despite generally being absent from prokaryotes, homologs are present in the gram-negative bacteria Pseudomonas aeruginosa and Caulobacter crescentus. The Pseudomonas homolog is the most divergent member of the family and appears to be derived from a divergence that predates that of legumain and glycosylphosphatidylinositol:protein transamidase. Family C11 contains only clostripain (Chapter 521). This is an endopeptidase discovered in Clostridium histolyticum that has a preference for hydrolysis of arginyl bonds.
Clostripain is synthesized as a proprotein, and activation requires loss of a 23-residue propeptide and autolytic removal of an internal nonapeptide to produce a heterodimer consisting of a 131-residue light chain and a 336-residue heavy chain that contains the residues of the catalytic dyad. Clostripain differs from enzymes of the papain family in a number of respects: it is calcium dependent, it is more rapidly inactivated by iodoacetamide than by iodoacetate, and E-64 gives only reversible inhibition attributable to the presence of an arginine-like sidechain in the compound. Forms of clostripain are known from Clostridium perfringens and Thermotoga maritima. Family C25 is represented by gingipains R and K (Chapters 522 and 523), secreted endopeptidases from the pathogenic bacterium Porphyromonas gingivalis. Gingipain is a multidomain protein, containing not only a peptidase unit but also a C-terminal hemagglutinin domain (Figure 404.25). Gingipain R has specificity for arginyl bonds, whereas gingipain K cleaves after lysine residues. Both enzymes are thought to contribute to the disease processes in gingivitis. Family C50 contains separase (Chapter 524), an endopeptidase that is involved in the separation of sister chromatids during the anaphase periods of mitosis or meiosis. In mitosis, the chromatids are bound together by the multi-subunit protein cohesin, and cleavage of the Scc1 subunit of cohesin by separase promotes chromatid separation. Cleavage occurs at the Arg180↓Arg and Arg268↓Arg bonds of the Scc1 subunit. Separase also cleaves the kinetochore-associated protein Slk19 at the start of anaphase, which is necessary for the development of a stable spindle. In meiosis, the Rec8 protein performs the same role as cohesin and is also susceptible to cleavage by separase. A simple consensus for cleavage is an acidic residue at P4, Gly in P2, Arg in P1 and Arg, Lys or Ser in P1′.
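The simple separase consensus stated above (acidic at P4, any residue at P3, Gly at P2, Arg at P1, and Arg, Lys or Ser at P1′) can be written as a regular expression. This is a sketch, not a validated predictor: the names `SEPARASE_CONSENSUS` and `separase_like_sites` and the test peptides are hypothetical, and "acidic" is assumed here to mean Asp or Glu.

```python
import re

# P4 acidic (D/E), P3 any, P2 Gly, P1 Arg, followed by P1' Arg, Lys or Ser.
# The lookahead keeps P1' out of the match, so a site starting at P1' is not skipped.
SEPARASE_CONSENSUS = re.compile(r"[DE].GR(?=[RKS])")


def separase_like_sites(sequence):
    """Return 1-based positions of P1 Arg residues that fit the simple consensus."""
    # match.end() is the 0-based index just past P1, which equals P1's 1-based position.
    return [match.end() for match in SEPARASE_CONSENSUS.finditer(sequence)]


print(separase_like_sites("AAEVGRRAA"))   # hypothetical peptide with one matching site
print(separase_like_sites("AAEVGRAAA"))   # P1' is Ala, so no site is reported
```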
Members of family C50 are known only from eukaryotes, and are found in every eukaryote genome that has so far been completely sequenced. Family C80 contains self-cleaving proteins that are precursors of bacterial toxins, including Vibrio cholerae RTX toxin (Chapter 525), and Clostridium difficile toxins A and B. However, peptidase activity is not restricted to self-cleavage and the mature RTX toxin has been shown to degrade the leucine-rich protein YopM [71]. The tertiary structure of the cholera RTX toxin has been solved [72] and shows a structure similar to that of caspase-1 (see Figure 404.26). Family C80 is therefore included in clan CD. The PrtH protein, a virulence factor from Tannerella (formerly Bacteroides) forsythensis, is the only characterized peptidase in family C84. However, very little is known about PrtH, except that it hydrolyzes Bz-Val-Gly-Arg-p-nitroanilide, an activity that can be inhibited by standard cysteine peptidase inhibitors and the metal chelator EDTA [73]. The prtH gene is associated with the transition from a commensal to pathogenic organism, and increased levels are associated with release of cells from the substratum [74]. Although there is no information about structure or active site residues, a caspase-like structure has been predicted along with a His, Cys catalytic dyad. The His and Cys residues occur in motifs similar to those around the active site residues of caspases and legumains [75]. Clan CE contains four families: C5, C48, C55 and C57. Three-dimensional structures have been solved for adenain (Chapter 526) in family C5 and the Ulp1 endopeptidase (Chapter 527) in family C48. The families C55, containing the YopJ protease from Yersinia, and C57, containing the vaccinia virus I7 protein, are included in clan CE because the order of putative catalytic residues (His, Glu or Asp, Gln, Cys) is the same in each. Several of the peptidases in clan CE show specificity for cleavage in the Gly-Gly↓Xaa motif.
An alignment showing conservation of sequence around the catalytic residues is shown in Figure 404.27, and secondary structures are compared in Figure 404.28. Family C5 contains adenain, an endopeptidase from adenoviruses. Adenoviruses are double-stranded DNA viruses that do not encode polyproteins but have a separate gene for each protein. Activation of the proteins involves processing at the N-terminus to remove a propeptide, and it is the adenovirus endopeptidase that is responsible for this [76]. The tertiary structure of adenain from human adenovirus type 2 (Figure 404.29) shows an α/β fold with some remarkable similarities to that of papain but with the catalytic residues in the order His, Cys in the sequence. The catalytic dyad is complemented by two other residues: Glu71 (most commonly Asp, in the clan as a whole; numbering as in Figure 404.28), which has the same role of orientating the imidazolium ring of the catalytic His54 as does Asn308 of papain, and Gln115, which helps form the oxyanion hole, similarly to Gln152 of papain. The catalytic Cys122 is also positioned at the start of a long helix, as in papain [77]. However, the structures are so different that we conclude that they have resulted from convergent evolution. The adenovirus endopeptidase is synthesized as an active enzyme without a propeptide. The endopeptidase is highly selective for Xaa-Xbb-Gly-Xbb↓Xbb bonds, in which Xaa is either Met, Leu or Ile, and Xbb is any amino acid. A virally encoded, eleven-residue cofactor is also required for activity [78], and the enzyme is apparently further activated by interaction with DNA. In family C48 the tertiary structure of the yeast Ulp1 endopeptidase (Chapter 527) shows a similar protein fold to adenain (Figure 404.30). The Ulp1 endopeptidase is responsible for the release of the Smt3 proteins.
Smt3 and the animal equivalent SUMO-1 (also known as sentrin) are proteins that target other proteins for export from the nucleus via the nuclear pore complex protein RANBP2. Smt3 and SUMO-1 are distantly related to ubiquitin, and like ubiquitin possess C-terminal glycine residues that can be linked in isopeptide bonds. Unlike ubiquitin, Smt3 and SUMO-1 do not target proteins for degradation, and do not form homopolymers, because the lysine essential to that process is not present. The Ulp1 endopeptidase releases Smt3 from its precursor by removing three C-terminal residues to expose the -Gly-Gly C-terminus; the mammalian equivalent, SENP1 endopeptidase, removes four C-terminal residues. Ulp1 endopeptidase can also release Smt3 from targeted proteins. The enzyme is normally localized to the periphery of the nucleus, but is found within the nucleus during mitotic anaphase. Family C55 is that of the YopJ endopeptidase from Yersinia (Chapter 69). The YopJ protein contributes to the disease process by causing host macrophages to undergo apoptosis and suppressing the action of cytokines such as TNFα and NF-κB. YopJ is thought to act by cleaving the Bid protein, which then induces apoptosis by translocating to the mitochondrion, causing the release of cytochrome c and the activation of caspases 9, 3 and 7 [79]. YopJ is an outer membrane protein that is encoded on a plasmid. There are a number of homologs in this family that are not known to be peptidases (even though all the catalytic residues are conserved), including avirulence protein AvrBsT from Xanthomonas. The family is known only from Proteobacteria. Although no catalytic residues have been identified for any member of the family, they are predicted to occur in the order His, Glu, Gln, Cys. Family C57 contains the I7 protein from the vaccinia virus (Chapter 535), which is believed to be a polyprotein processing endopeptidase.
Although no catalytic residues have been identified for any member of the family, they are predicted to occur in the order His, Asp/Asn, Gln, Cys. The polyprotein processing endopeptidase present in the African swine fever virus (Chapter 534) is included in family C63. Although no tertiary structure has been solved, the order of active site residues (His, Asn, Gln, Cys) is consistent with membership of clan CE, as is the restricted specificity for Gly-Gly↓Xaa bonds [80]. Bacterial homologs are known from Chlamydia, Xanthomonas, Pseudomonas and Mesorhizobium but have low similarity to the eukaryotic sequences. The Pseudomonas and Mesorhizobium homologs are multidomain, mosaic proteins. No peptidase activity has yet been demonstrated for any of the bacterial homologs, but the gene has been shown to be encoded on a plasmid and essential for virulence in Pseudomonas [81]. No homologs are known from archaea. Not all deubiquitinating enzymes are proteins with a papain-like fold. The ElaD peptidase from Escherichia coli is able to release ubiquitin from synthetic constructs with a preference for Lys63-linked ubiquitin chains, but also cleaves at Lys48 [82, 83]. This peptidase is included in family C79. This family is included in clan CE because we propose a His, Cys catalytic dyad (see the alignment in MEROPS). Clan CF, family C15. The crystal structure of pyroglutamyl-peptidase I from Thermococcus (Figure 404.31) shows a catalytic triad with the residues in the order Glu, Cys, His in the sequence, which is different from that in any other clan of cysteine peptidases. The enzyme is an intracellular omega peptidase that removes an N-terminal pyroglutamyl residue from a polypeptide. Cys144 and His166 have been identified as catalytic residues [84]. Human pyroglutamyl-peptidase I has been cloned and expressed [85] (Chapter 548), and appears to be structurally very similar to the archaean and bacterial forms of the enzyme. Family C46 contains the Hedgehog proteins.
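Because clan assignment in the text leans repeatedly on the order of catalytic residues in the sequence, it may help to see that order derived mechanically from residue numbers. The sketch below is illustrative only: the function name `catalytic_order` is hypothetical, and the residue numbers are those quoted in this chapter for adenain (His54, Glu71, Gln115, Cys122, in alignment numbering) and for pyroglutamyl-peptidase I (Cys144, His166).

```python
def catalytic_order(residues):
    """Return residue names sorted by their position in the sequence.

    `residues` is a list of (three-letter name, position) pairs.
    """
    return [name for name, _ in sorted(residues, key=lambda pair: pair[1])]


# Residue numbers as quoted in the text; input order is deliberately shuffled.
adenain = [("Cys", 122), ("Gln", 115), ("His", 54), ("Glu", 71)]
pyroglutamyl_peptidase_i = [("His", 166), ("Cys", 144)]

print(catalytic_order(adenain))                   # ['His', 'Glu', 'Gln', 'Cys']
print(catalytic_order(pyroglutamyl_peptidase_i))  # ['Cys', 'His']
```

Residue order alone does not determine the clan, as the reassignment of family C9 from clan CA to clan CN shows; it is one line of evidence used when no tertiary structure is available.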
The Hedgehog proteins are formed of two structural domains, and the only peptidase activity is an autolytic reaction that separates the two domains. The reaction is cleavage of a conserved Gly257↓Cys258 bond, and a cholesterol moiety is simultaneously attached to the new C-terminal glycyl residue of the N-terminal domain. The 18-kDa N-terminal domain is structurally related to a zinc D-Ala-D-Ala carboxypeptidase in clan MD; it is not known to have any peptidase activity but has a function in the embryological development of dorsal-ventral patterning. Mutational studies have shown that residues Cys258 and His329 in the 25-kDa C-terminal domain are essential for the autolytic cleavage [86]. The reaction is not inhibited by any standard peptidase inhibitors, but requires DTT in vitro. The cleavage mechanism is believed to be similar to that of the activation of prohistidine decarboxylase [87], involving activation of Cys258 by His329 for a nucleophilic attack on the carbonyl of Gly257. There is a motif around the cleavage site that is similar to that found at the N-terminal splice junction of self-splicing proteins (Chapter 76), and splicing may involve a thioester intermediate [88]. The crystal structure of the Sonic hedgehog C-terminal domain (Figure 404.32) shows that it consists of two tandem intein-like repeats [89], and family C46 is now included in a clan of mixed catalytic types, PD, along with the three intein families (see Chapter 76) [90]. Peptidases within clan CL have active site residues in the order His, Cys in the sequence, and a tertiary fold consisting of a closed β-barrel surrounded by helices (see Figure 404.33). The active site is at the end of the barrel. The fold is similar to that of the ykuD transpeptidase from Bacillus subtilis [91]. Members of the clan are involved in hydrolysis of bacterial cell wall peptides.
Family C60 includes peptidases known as sortases (Chapter 551) from Gram-positive bacteria. Sortase A from Staphylococcus aureus cleaves proteins at a Leu-Pro-Xaa-Thr↓Gly motif, and then catalyzes the formation of an amide bond between the newly exposed C-terminal Thr and the cross-linking pentapeptide of the bacterial cell wall [91]. The proteins that are attached in this way to the exterior of the cell wall can be virulence factors, and inhibition of sortases has been shown to decrease virulence. Surprisingly, given a lack of structural relationship to members of clan CA, sortases are inhibited by E-64. The evolutionary tree shows a deep divergence within the family, which is divided into two subfamilies. Subfamily C60A includes sortase A, whereas C60B includes sortase B. Sortase B, also from Staphylococcus aureus, recognizes a different cleavage motif from that of sortase A [92]. The only other family in clan CL is C82. This also includes peptidases involved in bacterial cell wall metabolism. The L,D-transpeptidase from Enterococcus faecium (Chapter 552) cleaves the L-Lys↓D-Ala bonds from the crosslinking tetra- and pentapeptides, and also possesses cell-wall crosslinking activity [93]. The structure of a fragment of the peptidase has been solved and shows a fold similar to that of sortases [94]. The hepatitis C polyprotein is processed by virally encoded endopeptidases as well as by host endopeptidases. Most of the cleavages are performed by hepacivirin (Chapter 688), which is a serine-type endopeptidase with a trypsin-like fold. Cleavage of the NS2/3 site at a Leu↓Ala bond is mediated by the second virally encoded endopeptidase, endopeptidase 2, often termed the 'NS2-3 endopeptidase', not because of the cleavage site but because the endopeptidase consists of the NS2 protein and one-third of the NS3 protein (Chapter 554). Endopeptidase 2 is included in family C18. Site-directed mutagenesis had identified His952 and Cys993 as the catalytic dyad [95].
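The first step of the sortase A reaction described above, cleavage between Thr and Gly of a Leu-Pro-Xaa-Thr↓Gly motif, can be sketched as a split of the substrate sequence. The names `SORTASE_A_MOTIF` and `split_at_sortase_a_site` and the example peptide are hypothetical, and the subsequent transpeptidation to the cell wall pentapeptide is not modeled.

```python
import re

# Leu-Pro-Xaa-Thr followed by Gly; the scissile bond lies between Thr and Gly.
SORTASE_A_MOTIF = re.compile(r"LP.T(?=G)")


def split_at_sortase_a_site(sequence):
    """Split at the first LPXT|G motif.

    Returns (n_fragment, c_fragment), where n_fragment ends in the newly
    exposed Thr, or None if the motif is absent.
    """
    match = SORTASE_A_MOTIF.search(sequence)
    if match is None:
        return None
    cut = match.end()  # index of the Gly that follows the scissile bond
    return sequence[:cut], sequence[cut:]


print(split_at_sortase_a_site("AAALPETGEE"))  # hypothetical substrate
```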
However, evidence that the NS2-3 endopeptidase is zinc dependent, including inhibition by EDTA and increased activity with the addition of zinc [96], had cast doubt on the identification of the catalytic type. The tertiary structure has been determined, and shows a fold unrelated to that of any other peptidase, hence family C18 is the only family in clan CM (see Figure 404.34). The active peptidase is a homodimer, and each monomer comprises an N-terminal, helical subdomain, a C-terminal subdomain with a β-sheet and an extended linker region. The active site His and Glu are situated in the N-terminal domain and the Cys in the C-terminal domain. An active site is only formed in the dimer, when the His and Glu from one monomer oppose the Cys from the other; thus the NS2 dimer has two active sites [97]. The formation of an active site between different subunits in a dimer is unusual amongst peptidases, but does occur in retropepsins (see Chapter 44) and has been proposed for the Pyrococcus furiosus protease 1 (see Chapter 549). The tertiary structure of the NS2 dimer also revealed the presence of a structural zinc ion, which explains the metal dependence.

Family C9 includes the non-structural polyprotein-processing endopeptidase from togaviruses, the nsP2 endopeptidase. The catalytic dyad has been identified by site-specific mutagenesis [98]. The nsP2 endopeptidase is a bifunctional molecule with the peptidase unit at the C-terminus and an N-terminal unit that is probably involved in RNA-binding during replication. The N-terminal domain is homologous to a similarly located domain in the cucumber green mottle mosaic tobamovirus. The crystal structure of the nsP2 endopeptidase from Venezuelan equine encephalitis virus has been solved, and shows a fold unrelated to that of any other peptidase [99] (see Figure 404.35). Hence C9 is the only family in clan CN.
Until the structure was solved, it had been expected that members of family C9 would have a fold similar to that of papain.

FIGURE 404.34 The image was prepared from the Protein Data Bank entry (2HD0) as described in the Introduction (p. li). One homodimer is shown. Catalytic residues are shown in ball-and-stick representation for each monomer: His952 in purple, Glu972 in blue and Cys993 in yellow (numbering as in PDB entry).

Clan PC is another clan containing peptidases of different catalytic types, and subclan PC(C) contains the cysteine-type peptidases in the clan. Members of the clan have a fold similar to that of class I glutamine amidotransferase, which consists of an α/β/α sandwich. The nucleophile can be either a cysteine or a serine, and these are in equivalent positions.

Family C26 contains γ-glutamyl hydrolase (Chapter 550). The tertiary structure of human γ-glutamyl hydrolase (Figure 404.36) shows a homotetramer. A Cys, His catalytic dyad is evident, and the histidine is hydrogen-bonded to a glutamate and a tyrosine, either of which could be a third member of a catalytic triad [100].

Family C56 contains only protease I from Pyrococcus (see Chapter 549). Although the crystal structure has been solved (Figure 404.37), the catalytic mechanism is not known with certainty. The endopeptidase is a homotrimer, and Cys100 and His101 have been implicated in the catalytic mechanism, with the suggestion that a catalytic triad is formed between residues from opposing monomers, with Glu74 completing the active site. This would be a unique catalytic mechanism amongst the peptidases. His101 is poorly conserved, however, which may mean that very few members of the protein family are peptidases. His101 is also not in an equivalent position to His220 from γ-glutamyl hydrolase.

FIGURE 404.36 The image was prepared from the Protein Data Bank entry (1L9X) as described in the Introduction (p. li). One monomer of the tetramer is shown. Catalytic residues are shown in ball-and-stick representation: Cys110 and His220 (numbering as in PDB entry).

FIGURE 404.37 Richardson diagram of Pyrococcus horikoshii PfpI endopeptidase. The image was prepared from the Protein Data Bank entry (1G2I) as described in the Introduction (p. li). One monomer of the homotrimer is shown. Catalytic residues are shown in ball-and-stick representation: Cys100 and His101 (numbering as in PDB entry).

Family C40 contains dipeptidyl-peptidase VI from Bacillus sphaericus. This is an enzyme that is expressed during sporulation, and is responsible for degradation of bacterial cell wall components. Because the enzyme is cytoplasmic and is synthesized without signal and propeptides, it is presumably acting at a late stage in cell wall component turnover. There are a number of homologs of the Bacillus endopeptidase, but similarity is restricted to a C-terminal domain of 110 residues. These include nlpC lipoprotein from Escherichia coli, invasion-associated protein p60 from Listeria species, a starch-degrading enzyme from Clostridium acetobutylicum, and a phosphatase-associated protein from Bacillus subtilis. There are single conserved Cys and His residues. The crystal structures of several members of the family have been solved, and the fold has been described as 'papain-like' but primitive [101]. However, unlike papain, where the catalytic Cys is at the N-terminus of a helix, in the structures from family C40 the active site Cys is at the C-terminus of a helix, which means the sequence must run in the opposite direction to that in papain (see Figure 404.38). No similarity between the structures can be detected using the DALI algorithm [102]. The structural resemblances are therefore assumed to be derived from convergent rather than divergent evolution, and proteins in C40 are assigned to the clan CO. All members of this family have either a signal peptide or a transmembrane region, indicating a cell-surface location.
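Several of the cleavage sites quoted in this chapter are short sequence motifs, such as the Leu-Pro-Xaa-Thr-Gly motif given above for sortase A (family C60), which is cleaved between Thr and Gly. As an illustration only, occurrences of such a motif can be located in a one-letter protein sequence with a regular expression. The following is a minimal sketch: the function name, the example sequence and the 0-based position convention are assumptions introduced here and are not taken from the Handbook.

```python
import re

# Sortase A motif from the text: Leu-Pro-Xaa-Thr-Gly, cleaved between Thr and Gly.
# In one-letter code this is L-P-x-T-G, where x is any residue.
# A lookahead is used so that overlapping candidates would also be reported.
SORTASE_A_MOTIF = re.compile(r"(?=(LP[A-Z]TG))")


def find_sortase_a_sites(sequence):
    """Return (motif, cut_index) pairs for every LPxTG occurrence.

    cut_index is the 0-based index of the Gly residue, so the bond in
    question lies between sequence[cut_index - 1] (Thr) and
    sequence[cut_index] (Gly).
    """
    seq = sequence.upper()
    sites = []
    for match in SORTASE_A_MOTIF.finditer(seq):
        start = match.start(1)
        sites.append((match.group(1), start + 4))
    return sites


# Hypothetical sequence, invented purely for illustration.
example = "MKKAALPETGEEN"
print(find_sortase_a_sites(example))  # [('LPETG', 9)]
```

A match only indicates a candidate site; whether a given protein is actually a sortase substrate depends on factors that a motif scan does not capture.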
Clan PB contains N-terminal nucleophile hydrolases, amongst which the most important peptidase is the proteasome. In the proteasome the N-terminal nucleophile is threonine, but there are members of the clan in which the nucleophile is cysteine. These are exclusively self-processing proenzymes that carry out a single peptide bond cleavage to form the active enzyme, but in contrast to the proteasome, the enzyme that is formed is not a peptidase. The cysteine-type enzymes of clan PB form a subclan PB(C), and are discussed more fully in Chapter 808.

There are a few other families of cysteine peptidases for which no tertiary structures are available and too little is known about the catalytic machinery to permit any conclusions about their relationships.

Family C6 contains one of the two cysteine endopeptidases found in potyviruses, the helper-component endopeptidase. (The second is the NIa peptidase, Chapter 544.) The only cleavage performed by the helper-component endopeptidase is its own release from the polyprotein by cleavage of a Gly↓Gly bond; further processing is performed by the serine-type P1 endopeptidase (Chapter 689). The helper-component endopeptidase is a two-part molecule with the peptidase unit at the C-terminus, and the helper component, required for virus transmission from plant to plant by aphids, at the N-terminus. The catalytic residues have been identified by site-directed mutagenesis in tobacco etch virus [103].

Families C7 and C8 contain the chestnut blight hypovirus endopeptidases p29 and p48. Chestnut blight is caused by the fungus Cryphonectria parasitica, but the symptoms are reduced if the fungus is infected by the double-stranded RNA hypovirus. The virus encodes two polyproteins, and the enzymes responsible for their processing constitute the smaller polyprotein. The p29 endopeptidase cleaves a single Gly↓Gly bond in the polyprotein to release itself and the p48 endopeptidase [104].
The p48 endopeptidase cleaves a single Gly↓Ala bond in the larger polyprotein [105]. Catalytic residues have been identified by mutagenesis for both endopeptidases [105, 106]. The endopeptidases show no significant sequence similarity to any other peptidase, and each is considered a representative of a distinct family, although there is conservation of a Gly-Tyr-Cys-Tyr motif containing the catalytic Cys between the p29 endopeptidase (family C7) and family C6.

Family C21 contains the polyprotein-processing endopeptidase from tymoviruses. The 206 kDa polyprotein is cleaved at an Ala↓Thr bond [107] to release the 70 kDa polymerase from the C-terminus and a 150 kDa protein that includes the endopeptidase and a helicase from the N-terminus. The catalytic residues have been identified by mutagenesis [108].

FIGURE 404.38 Richardson diagram of the NLP/P60-like putative peptidase from Nostoc punctiforme. The image was prepared from the Protein Data Bank entry (2EVR) as described in the Introduction (p. li). Catalytic residues are shown in ball-and-stick representation: His184 and His196 in purple and Cys134 in yellow (numbering as in PDB entry).

Family C23 includes the p223 polyprotein-processing endopeptidase from the blueberry scorch virus. Cleavage is at a single Gly↓Ala bond, and site-directed mutagenesis has been used to identify the potential catalytic residues [109].

Family C27 includes the non-structural polyprotein-processing endopeptidase from rubella virus. Processing occurs at a single Gly↓Gly bond, and the Cys, His catalytic dyad has been identified [110].

Family C36 is represented by a polyprotein-processing endopeptidase from beet necrotic yellow vein virus [111].

Family C42 includes a polyprotein-processing endopeptidase from the beet yellows closterovirus. Processing occurs at a single Gly↓Gly bond, and the catalytic dyad has been identified by site-directed mutagenesis [112].

Family C53 contains the NPro endopeptidase of pestiviruses (Chapter 557).
Pestiviruses possess two polyprotein-processing endopeptidases: the serine-type NS2-3 endopeptidase (Chapter 554), which is the more general processing activity; and the NPro endopeptidase, which is the N-terminal protein and releases itself from the polyprotein by autolytic cleavage of a Cys↓Ser bond. The residue Cys69 has now been identified as a catalytic residue [113].

Family C75 includes the AgrB protein from Staphylococcus aureus. The agr operon includes four genes whose products are involved in quorum sensing. The AgrA protein is a response regulator, AgrC is a sensor kinase, and AgrD is a polypeptide that is integrated into the cytoplasmic membrane via its N-terminal region and is the precursor for an autoinducing peptide (AIP) that is the ligand for AgrC. The AgrB protein is the peptidase that releases AIP from AgrD [114], following the removal of an N-terminal peptide by type I signal peptidase [115]. The active site residues His77 and Cys84 were identified in Staphylococcus aureus AgrB [114]. AgrB is an integral membrane protein, with the active site close to the cytoplasm [116].

Inhibition of mammalian legumain by some cystatins is due to a novel second reactive site
BJ46a, a snake venom metalloproteinase inhibitor.
Isolation, characterization, cloning and insights into its mechanism of action
Inhibition of cathepsin L-like cysteine proteases by cytotoxic T-lymphocyte antigen-2 beta
A primitive enzyme for a primitive cell: the protease required for excystation of Giardia
Molecular cloning, expression, and chromosomal localization of a human tubulointerstitial nephritis antigen
Crystal structure of calpain reveals the structural basis for Ca2+-dependent protease activity and a novel mode of enzyme activation
The crystal structure of calcium-free human m-calpain suggests an electrostatic switch mechanism for activation by calcium
CalpA, a Drosophila calpain homolog specifically expressed in a small set of nerve, midgut, and blood cells
Molecular cloning and analysis of small optic lobes, a structural brain gene of Drosophila melanogaster
On the mechanism of action of streptococcal proteinase. II. Comparison of the kinetics of proteinase- and papain-catalyzed hydrolysis of N-acylamino acid esters
L-trans-Epoxysuccinyl-leucylamido(4-guanidino)butane (E-64) and its analogues as inhibitors of cysteine proteinases including cathepsins B, H and L
Crystal structure of a deubiquitinating enzyme (human UCH-L3) at 1.8 Å resolution
Crystal structure of a UBP-family deubiquitinating enzyme in isolation and in complex with ubiquitin aldehyde
Burkholderia mallei tssM encodes a putative deubiquitinase that is secreted and expressed inside infected RAW 264.7 murine macrophages
A novel superfamily of predicted cysteine proteases from eukaryotes, viruses and Chlamydia pneumoniae
A novel type of deubiquitinating enzyme
Structure of the A20 OTU domain and mechanistic insights into deubiquitination
Crystal structure of human otubain 2
Evidence for bidentate substrate binding as the basis for the K48 linkage specificity of otubain 1
The tumour suppressor CYLD negatively regulates NF-kappaB signalling by deubiquitination
The structure of the CYLD USP domain explains its specificity for Lys63-linked polyubiquitin and reveals a B Box module
DUBA: a deubiquitinase that regulates type I interferon production
The solution structure of the Josephin domain of ataxin-3: structural determinants for molecular recognition
Structural basis for ubiquitin recognition by the OTU1 ovarian tumor domain protein
Structure of a herpesvirus-encoded cysteine protease reveals a unique class of deubiquitinating enzymes
Ovarian tumor domain-containing viral proteases evade ubiquitin- and ISG15-dependent innate immune responses
Two novel ubiquitin-fold modifier 1 (Ufm1)-specific proteases, UfSP1 and UfSP2
Structural basis for Ufm1 processing by UfSP1
The crystal structure of human Atg4b, a processing and de-conjugating enzyme for autophagosome-forming modifiers
Structural basis for the specificity and catalysis of human Atg4B responsible for mammalian autophagy
Crystal structure of a thiol proteinase from Staphylococcus aureus V8 in the E-64 inhibitor complex
Crystal structure of the peptidase domain of streptococcus ComA, a bifunctional ATP-binding cassette transporter involved in the quorum-sensing pathway
Multiple enzymatic activities of the murein hydrolase from staphylococcal phage phi11. Identification of a D-alanyl-glycine endopeptidase activity
Cysteine proteases in phytopathogenic bacteria: identification of plant targets and activation of innate immunity
Initiation of RPS2-specified disease resistance in Arabidopsis is coupled to the AvrRpt2-directed elimination of RIN4
A Yersinia effector and a Pseudomonas avirulence protein define a family of cysteine proteases functioning in bacterial pathogenesis
HopPtoN is a Pseudomonas syringae Hrp (type III secretion system) cysteine protease effector that suppresses pathogen-induced necrosis associated with both compatible and incompatible plant interactions
Structure of the streptococcal endopeptidase IdeS, a cysteine proteinase with strict specificity for IgG
A papain-like enzyme at work: native and acyl-enzyme intermediate structures in phytochelatin synthesis
Identification of the catalytic sites of a papain-like cysteine proteinase of murine coronavirus
Coronavirus protein processing and RNA synthesis is inhibited by the cysteine proteinase inhibitor E64d
Processing and evolution of the N-terminal region of the arterivirus replicase ORF1a protein: identification of two papain-like cysteine proteases
The arterivirus Nsp2 protease. An unusual cysteine protease with primary structure similarities to both papain-like and chymotrypsin-like proteases
Pseudomurein endoisopeptidases PeiW and PeiP, two moderately related members of a novel family of proteases produced in Methanothermobacter strains
Viral cysteine proteases are homologous to the trypsin-like family of serine proteases: structural and functional implications
The structure of the 2A proteinase from a common cold virus: a proteinase responsible for the shut-off of host-cell protein synthesis
A single amino acid change in protein synthesis initiation factor 4G renders cap-dependent translation resistant to picornaviral 2A proteases
Rice tungro spherical virus polyprotein processing: identification of a virus-encoded protease and mutational analysis of putative cleavage sites
Characterization of the catalytic residues of the tobacco etch virus 49-kDa proteinase
Autocatalytic activity of the tobacco etch virus NIa proteinase in viral and foreign protein sequences
Identification and characterization of a 3C-like protease from rabbit hemorrhagic disease virus
3C-like protease of rabbit hemorrhagic disease virus: identification of cleavage sites in the ORF1 polyprotein and analysis of cleavage specificity
Characterisation and mutational analysis of an ORF 1a-encoding proteinase domain responsible for proteolytic processing of the infectious bronchitis virus 1a/1b polyprotein
Polyprotein processing in Southampton virus: identification of 3C-like protease cleavage sites by in vitro mutagenesis
Sequence analysis of the parsnip yellow fleck virus polyprotein: evidence of affinities with picornaviruses
Temporal modulation of an autoprotease is crucial for replication and pathogenicity of an RNA virus
Identification of the active site of legumain links it to caspases, clostripain and gingipains in a new clan of cysteine endopeptidases
Classification of the caspase-hemoglobinase fold: detection of new families and implications for the origin of the eukaryotic separins
Viral inhibition of inflammation: cowpox virus encodes an inhibitor of the interleukin-1 beta converting enzyme
Granzyme B is inhibited by the cowpox virus serpin cytokine response modifier A
Activation of the native 45-kDa precursor form of interleukin-1-converting enzyme
Interleukin-1β converting enzyme: A novel cysteine protease required for IL-1β production and implicated in programmed cell death
The Caenorhabditis elegans cell-death protein CED-3 is a cysteine protease with substrate specificities similar to those of the human CPP32 protease
Apoptosis. Placing death under control
Inhibition of death receptor signals by cellular FLIP
The two cysteine endopeptidases of legume seeds: purification and characterization by use of specific fluorometric assays
Vacuolar processing enzyme responsible for maturation of seed proteins
In vitro splicing of concanavalin A is catalyzed by asparaginyl endopeptidase
Cloning, isolation, and characterization of mammalian legumain, an asparaginyl endopeptidase
Expression and partial characterization of a cathepsin B-like enzyme (Sm31) and a proposed 'haemoglobinase' (Sm32) from Schistosoma mansoni
Structural and molecular mechanism for autoprocessing of MARTX toxin of Vibrio cholerae at multiple sites
Small molecule-induced allosteric activation of the Vibrio cholerae RTX cysteine protease domain
Cloning, expression, and sequencing of a protease gene from Bacteroides forsythus ATCC 43037 in Escherichia coli
A 5-year longitudinal study of Tannerella forsythia prtH genotype: association with loss of attachment
Prediction of a caspase-like fold in Tannerella forsythia virulence factor PrtH
Adenovirus endopeptidases
Crystal structure of the human adenovirus proteinase with its 11 amino acid cofactor
Activation of adenovirus-coded protease and processing of preterminal protein
Yersinia enterocolitica YopP-induced apoptosis of macrophages involves the apoptotic signaling cascade upstream of bid
Gly-Gly-X, a novel consensus sequence for the proteolytic processing of viral and cellular proteins
Isolation and characterization of virulence gene psvA on a plasmid of Pseudomonas syringae pv. eriobotryae
ElaD, a deubiquitinating protease expressed by E. coli
SseL, a Salmonella deubiquitinase required for macrophage killing and virulence
Mutational analysis of the active site of Pseudomonas fluorescens pyrrolidone carboxyl peptidase
Pyroglutamyl-peptidase I: cloning, sequencing, and characterisation of the recombinant human enzyme
Autoproteolysis in hedgehog protein biogenesis
Site-directed alteration of serine 82 causes nonproductive chain cleavage in prohistidine decarboxylase
The product of hedgehog autoproteolytic cleavage active in local and long-range signalling
Crystal structure of a hedgehog autoprocessing domain: homology between hedgehog and self-splicing proteins
Asparagine peptide lyases: a seventh catalytic type of proteolytic enzymes
Crystal structures of Staphylococcus aureus sortase A and its substrate complex
An iron-regulated sortase anchors a class of surface protein during Staphylococcus aureus pathogenesis
Specificity of L,D-transpeptidases from gram-positive bacteria producing different peptidoglycan chemotypes
Crystal structure of a novel β-lactam-insensitive peptidoglycan transpeptidase
A second hepatitis C virus-encoded proteinase
Two distinct proteinase activities required for the processing of a putative nonstructural precursor protein of hepatitis C virus
Structure of the catalytic domain of the hepatitis C virus NS2-3 protease
Identification of the active site residues in the nsP2 proteinase of sindbis virus
The crystal structure of the Venezuelan equine encephalitis alphavirus nsP2 protease
Three-dimensional structure of human γ-glutamyl hydrolase. A class I glutamine amidotransferase adapted for a complex substrate
Dali: a network tool for protein structure comparison
Identification of essential residues in potyvirus proteinase HC-Pro by site-directed mutagenesis
Cotranslational autoproteolysis involved in gene expression from a double-stranded RNA genetic element associated with hypovirulence of the chestnut blight fungus
Gene expression by a hypovirulence-associated virus of the chestnut blight fungus involves two papain-like protease activities. Essential residues and cleavage site requirements for p48 autoproteolysis
The autocatalytic protease p29 encoded by a hypovirulence-associated virus of the chestnut blight fungus resembles the potyvirus-encoded protease HC-Pro
Identification of the cleavage site recognized by the turnip yellow mosaic virus protease
Identification of the essential cysteine and histidine residues of the turnip yellow mosaic virus protease
Autocatalytic processing of the 223-kDa protein of Blueberry scorch carlavirus by a papain-like proteinase
Characterization of the rubella virus nonstructural protease domain and its cleavage site
Evidence for in vitro and in vivo autocatalytic processing of the primary translation product of beet necrotic yellow vein virus RNA 1 by a papain-like proteinase
Beet yellows closterovirus: complete genome structure and identification of a leader papain-like thiol protease
N-terminal protease of pestiviruses: identification of putative catalytic residues by site-directed mutagenesis
Identification of the putative staphylococcal AgrB catalytic residues involving the proteolytic cleavage of AgrD to generate autoinducing peptide
A role for type I signal peptidase in Staphylococcus aureus quorum sensing
Identification of a staphylococcal AgrB segment(s) responsible for group-specific processing of AgrD by gene swapping

Rawlings The Wellcome Trust Sanger Institute Email: ab9@sanger.ac.uk Handbook of Proteolytic Enzymes, 3rd Edn ©