key: cord-0000431-iejfgkst
authors: Chen, YanYi; Xue, ShengHui; Zhou, YuBin; Yang, Jenny Jie
title: Calciomics: prediction and analysis of EF-hand calcium binding proteins by protein engineering
date: 2010-02-07
journal: Sci China Chem
DOI: 10.1007/s11426-010-0011-5
sha: d0c9e6bf8bbaac865997bebbe5426ac5ea26f764
doc_id: 431
cord_uid: iejfgkst

Ca(2+) plays a pivotal role in the physiology and biochemistry of prokaryotic and mammalian organisms. Viruses also utilize the universal Ca(2+) signal to create a specific cellular environment to achieve coexistence with the host, and to propagate. In this paper we first describe our development of a grafting approach to understand site-specific Ca(2+) binding properties of EF-hand proteins with a helix-loop-helix Ca(2+) binding motif, then summarize our prediction and identification of EF-hand Ca(2+) binding sites on a genome-wide scale in bacteria and viruses, and next report the application of the grafting approach to probe the metal binding capability of predicted EF-hand motifs within the streptococcal hemoprotein receptor (Shr) of Streptococcus pyogenes and the nonstructural protein 1 (nsP1) of Sindbis virus. When methods such as the grafting approach are developed in conjunction with prediction algorithms, we are better able to probe continuous Ca(2+)-binding sites that have been previously underrepresented due to the limitations of conventional methodology.

Ca2+, a signal for "life and death", is involved in almost every aspect of cellular processes. Due to its abundant bioavailability, Ca2+ was selected through evolution to perform multiple biochemical roles, acting as a second messenger inside mammalian cells to regulate a myriad of important cellular processes, from triggering life during fertilization to facilitating apoptosis [1, 2].
As best exemplified by fast responses controlled by highly localized Ca2+ spikes and slow responses regulated by repetitive global Ca2+ transient oscillations or intracellular Ca2+ waves, Ca2+ signals exhibit diverse spatio-temporal patterns to meet the varying demands of cellular processes [3] (Figure 1). Errors in any step of the calcium signaling pathway can be critical, resulting in uncontrolled cell death or abnormal gene expression [4, 5]. Ca2+ is able to bind to hundreds of cellular proteins over a 10^6-fold range of affinities (nM to mM) (Figure 2(a)), depending on the nature of the Ca2+-modulated events. Ca2+ binding has been shown to be essential for stabilizing proteins as well as for maintaining proper cellular free Ca2+ concentrations, as seen in buffer proteins such as calbindin D9k and parvalbumin (Figure 2). Generally, Ca2+-modulated activity is achieved through Ca2+-dependent conformational changes in Ca2+-binding proteins. For example, one of the ubiquitous intracellular trigger (modulating) proteins, calmodulin (CaM), has been shown to interact with over 300 proteins [6] (Figure 2). Interestingly, this protein has recently been shown to regulate a large class of membrane proteins that are essential for cell signaling and cell-cell communication, such as gap junctions [6] and voltage-dependent Ca2+ channels.

Although bacterial cells do not have complex subcompartments or organelles, there is strong evidence that Ca2+ plays an essential role in bacterial signaling, communication and stability similar to that observed in eukaryotic cells (Figure 1(a)) [7-11]. Bacterial cells also have a well-regulated cytosolic free Ca2+ concentration (approximately 0.1-2 μM) that is significantly lower than that observed in the extracellular medium (mM) due to Ca2+ transporters and channels [8-11]. Similar to the eukaryotic systems, P-type ATPase Ca2+ efflux pumps have been characterized from Synechococcus and Flavobacterium. A Ca2+ transporter of S. pneumoniae is involved in Ca2+-DNA uptake, lysis, and competence [12, 13]. Uptake of Ca2+ and other divalent cations can also accompany uptake of phosphate by the phosphate transport system of E. coli. Furthermore, it has been reported that bacteria contain Ca2+-binding proteins that are essential for cell adhesion and communication [14-17].

Figure 1. Viruses avidly perturb the intracellular Ca2+ signaling network to meet their own demands. A number of viral proteins (oval shape) from different families of viruses disrupt Ca2+ signaling by targeting various Ca2+ signaling components. Adapted with permission from refs. [3, 6].

Viruses, on the other hand, utilize the universal Ca2+ signal to create a specific cellular environment to achieve their own purposes (Figure 1(b)). Ca2+ plays important roles in viral gene expression, post-translational processing of viral proteins, virion structure formation, virus entry, and virion maturation and release. As shown in Figure 1(b), the interplay between viruses and Ca2+ in the infected cell falls generally into three major categories: (1) viral proteins directly or indirectly disturb Ca2+ homeostasis by altering membrane permeability and/or manipulating key components of the Ca2+-signaling apparatus; (2) viral proteins directly bind to Ca2+ for structural integrity or functionality; and (3) critical virus-host interactions depend on cellular Ca2+-regulated proteins or pathways.

According to their structural features, Ca2+-binding sites in proteins are classified as either non-continuous or continuous. In non-continuous sites the Ca2+ ligand residues are located remotely from one another in the protein sequence. Most Ca2+-binding proteins, such as cadherins, C2 domains, site I of thermitase, phospholipase A2, and D-galactose binding protein (GBP), belong to this class.
Continuous Ca2+-binding sites have binding pockets formed by a stretch of contiguous amino acids in the primary sequence (e.g. EF-hand proteins) (Figure 2). EF-hand proteins have a conserved Ca2+-binding loop flanked by two helices [18, 19]. Based on the conserved features of the Ca2+-binding loop, EF-hand proteins have been divided into two major groups: the canonical EF-hands, as seen in CaM, and the pseudo EF-hands, found exclusively in the N-termini of S100 and S100-like proteins [18]. Their major difference lies in the Ca2+-binding loop: the 12-residue canonical EF-hand loop binds Ca2+ mainly via sidechains (loop positions 1, 3, 5, 12), whereas the 14-residue pseudo EF-hand loop chelates Ca2+ mostly via backbone carbonyls (positions 1, 4, 6, 9) (Figure 2). Each type of EF-hand loop has a bidentate Ca2+ ligand (Glu or Asp) that functions as an anchor at the C-terminal end of the binding loop. Among all the structures reported to date, the majority of EF-hand sites have been found to be paired, either within multiple canonical EF-hand motifs or through the interaction of one pseudo EF-hand motif with one canonical motif [18] (Figure 2). For proteins with odd numbers of EF-hands, such as the penta-EF-hand calpain, EF-hand motifs are coupled through homodimerization or heterodimerization [20-22].

Due to the spectroscopically silent nature of calcium and its physiological abundance, determining the calcium binding capability of proteins is challenging. First, most experimental methods, such as dialysis, are only sensitive to the total calcium content. In addition, overcoming the persistent background contamination of calcium during the preparation of calcium-free samples of proteins with strong calcium binding affinities is a non-trivial task.
Further, since most calcium-binding proteins contain multiple calcium-binding sites that bind calcium cooperatively and undergo induced conformational changes (e.g., CaM) (Figure 2), obtaining site-specific calcium binding affinities is complicated by contributions from cooperativity and conformational entropy [23]. Hence, understanding the molecular mechanism of biological functions related to calcium is largely hampered by the lack of site-specific information about calcium-binding properties, especially for the ubiquitous EF-hand calcium-binding motif.

Progress in understanding the molecular mechanism of calcium-modulated biological processes requires us to answer several important questions. First, what are the site-specific calcium binding affinities of calcium-binding proteins, particularly those that utilize multiple coupled calcium-binding sites to respond to sharp changes in cellular calcium concentration? Next, how can we predict or identify calcium-binding sites in proteins using genomic and structural information? Finally, how can we verify calcium binding capabilities predicted in bacterial and viral genomes?

In this paper we first describe our effort in developing a grafting approach to understand site-specific calcium binding affinities using calmodulin as an example. Next we discuss our progress in predicting EF-hand calcium-binding sites in various biological systems such as bacteria and viruses. We then report our results following application of the grafting approach to probe calcium binding capabilities in the streptococcal hemoprotein receptor (Shr) of Streptococcus pyogenes and the nonstructural protein of Sindbis virus.
To overcome the above-mentioned barriers and limitations associated with naturally occurring Ca2+-binding proteins, we have developed a grafting approach for engineering a single Ca2+-binding site in order to dissect the key structural factors that control Ca2+-binding affinity, conformational change and cooperativity. In principle, the key determinants for Ca2+ affinity can be systematically introduced into a stable host protein frame and evaluated while eliminating or minimizing the contribution of conformational change. The key factors that are essential for Ca2+-dependent conformational change can be further revealed by analyzing the folding, stability, dynamics, and conformations of the host protein upon Ca2+ binding to a designed Ca2+-binding site, without the complication of cooperativity. The cooperativity of two coupled Ca2+-binding sites can then be estimated, based on the energetics relationship, once the intrinsic Ca2+-binding affinities of both sites are obtained.

Figure 3 shows our grafting approach for obtaining site-specific calcium binding affinity using domain 1 of CD2 as a scaffold protein. We have shown that CD2 is an excellent scaffold protein [23-40]. It retains its native structure following insertion of the EF-hand motif, both in the absence and presence of Ca2+ ions. This provides the foundation for measuring the intrinsic Ca2+ binding affinity with minimized contribution of protein conformational change. In addition, the aromatic residues in CD2 enable us to obtain the Tb3+ affinity of the grafted Ca2+-binding loop using FRET. Ca2+ and its analog La3+ are able to compete with Tb3+ for the grafted metal binding site. We have also optimized the length of the two glycine linkers that connect the Ca2+-binding loop and CD2 to provide sufficient freedom for the loop. The grafted EF-loop III of CaM in different protein environments and scaffolds (such as CD2) has similar metal binding affinities for La3+ and Tb3+, which implies that the grafted EF-hand loop is largely solvated and functions independently of the host protein or the protein environment. More importantly, using high resolution NMR and 15N-labeled protein, we have shown that both Ca2+ and La3+ specifically interact with the residues in the grafted EF-loop [25], suggesting that the grafted loop retains its native Ca2+ binding property.

Figure 3. Grafting approach to probe site-specific metal binding properties of Ca2+-binding proteins. (a) Schematic representation of the grafting approach. Any predicted linear Ca2+-binding sequence can be inserted into the host protein CD2 domain 1 (CD2.D1) between residues S52 and G53 without disrupting the integrity of the host protein. Metal binding to the engineered protein can be monitored by taking advantage of a potential LRET pair, the buried Trp (W32) within the two layers of beta-sheets and the terbium ion bound to the inserted sequence. (b) Flow chart showing the application of the grafting approach to confirm metal binding of predicted Ca2+-binding sites. Adapted with permission from refs. [23, 29].

In addition, to dissect the contributions of the EF-loop and its flanking segments to Ca2+ affinity, we have inserted the EF-loop, the loop with the exiting F-helix, and the loop with both EF-helices of site III of CaM into CD2. In contrast to the largely unfolded structure of the isolated peptide fragment, the inserted flanking helices are partially formed, as revealed by both CD and NMR. Ca2+ affinity is enhanced about 3- to 10-fold when the flanking helices are attached. Further, we first estimated the intrinsic Ca2+ affinities of the four EF-hand loops of CaM (I-IV) by individually grafting them into CD2. EF-loop I exhibits the strongest and EF-loop IV the weakest binding affinity for Ca2+, La3+, and Tb3+.
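As a rough illustration of the energetics relationship mentioned above, a dissociation constant can be converted to a standard binding free energy with the textbook relation ΔG° = RT ln(Kd / 1 M). The sketch below applies it to the Ca2+ dissociation constants reported in this section for the grafted EF-loops I-IV of CaM (34, 245, 185, and 814 μM). The temperature of 298.15 K and the function name are assumptions made for illustration; this is not the authors' cooperativity calculation.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kJ/mol, from Kd relative to a 1 M standard state."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R * temperature_k * math.log(kd_molar) / 1000.0


# Ca2+ dissociation constants of the grafted CaM EF-loops I-IV (from the text), in μM.
KD_UM = {"I": 34.0, "II": 245.0, "III": 185.0, "IV": 814.0}

# 298.15 K is an assumed temperature; the source does not state one here.
DELTA_G = {loop: binding_free_energy(kd * 1e-6) for loop, kd in KD_UM.items()}

if __name__ == "__main__":
    for loop, dg in DELTA_G.items():
        print(f"EF-loop {loop}: Kd = {KD_UM[loop]:.0f} uM, dG = {dg:.1f} kJ/mol")
```

On this scale a tenfold change in Kd corresponds to about 5.7 kJ/mol at 298 K, which gives a sense of how far apart the strongest (loop I) and weakest (loop IV) sites lie energetically.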
EF-loops I-IV of CaM have dissociation constants for Ca2+ of 34, 245, 185, and 814 μM, respectively. Based on these results, we proposed a charge-ligand-balanced model in which both the number of negatively charged ligand residues and the balanced electrostatic dentate-dentate repulsion by the adjacent charged residues are major determinants of the Ca2+ binding affinities of the EF-loops in CaM. Our grafting method provides a new strategy to obtain site-specific Ca2+ binding properties and to estimate the contributions of cooperativity and conformational change in coupled EF-hand motifs. We have shown that the contribution of cooperativity and conformational change to the Ca2+ affinity for the C-terminal domain is 40% greater than that for the N-terminal domain. The same approach will be used to probe the site-specific Ca2+ affinity of bacterial proteins.

Furthermore, we have applied high resolution pulsed-field-gradient diffusion NMR (PFG NMR) and analytical ultracentrifugation to investigate the oligomeric state of the isolated EF-loop III of CaM in CD2 with and without the flanking helices. The loop without the helices (CaM-CD2-III-5G) remains unpaired in solution in the absence and presence of Ca2+. However, the loop with the flanking helices (CaM-CD2-III-5G-EF) is a dimer in the presence of Ca2+ [34]. Our findings suggest that hydrophobic residues on the flanking helices play an essential role in dimerization and in the coupling of two EF-hand motifs for stronger Ca2+ affinity.

By taking advantage of the sequence homology of currently available EF-hand loops and their flanking structural elements, we generated a series of patterns for the prediction of EF-hand proteins. We have modified the pattern PS00018 by allowing more choices (Glu, Gln, and Ser) at position 1 and adding constraints at the flanking helical regions for canonical EF-hand motifs. In addition, several patterns have been developed to identify EF-hand-like sites with different structural elements flanking the loop.
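To make the pattern-based search concrete, the sketch below converts a PROSITE-style pattern into a regular expression and scans a sequence for candidate EF-hand loops. The pattern string is an illustrative assumption: it is loosely modeled on the canonical 12-residue loop with position 1 relaxed to Asp, Glu, Gln, or Ser as described above, and it is neither the exact PS00018 entry nor the authors' modified pattern, which also constrains the flanking helices. The function names are hypothetical.

```python
import re

# Illustrative 12-residue loop pattern in PROSITE syntax (an assumption, not the authors' pattern).
LOOP_PATTERN = "[DEQS]-x-[DNS]-{ILVFYW}-[DENSTG]-[DNQGHRK]-{GP}-[LIVMC]-[DENQSTAGC]-x(2)-[DE]"

# One pattern element: x, a residue, [allowed], or {excluded}, with an optional (n) or (n,m) repeat.
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a subset of PROSITE syntax into a Python regular expression."""
    pieces = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        pieces.append(core)
    return "".join(pieces)


def find_loops(sequence: str, pattern: str = LOOP_PATTERN):
    """Return (1-based start, matched segment) for every hit, including overlapping ones."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]
```

For instance, `find_loops("DKDGDGTITTKE")`, the first Ca2+-binding loop of calmodulin, returns a single hit starting at position 1. A sequence-only match of this kind is a candidate site, which is why the text pairs prediction with experimental verification by grafting.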
Further, to circumvent the problem of identifying the pseudo EF-hand loop, a pattern has been developed by moderately loosening the constraints at the paired C-terminal canonical EF-hand and incorporating conserved residues in the N-terminal pseudo EF-hand. Compared with the original pattern PS00303, the new pattern reflects conserved genomic information in both EF-motifs and significantly improves predictive accuracy and sensitivity [41].

To understand the role of Ca2+ in bacteria, we have predicted and analyzed potential bacterial EF-hand and EF-hand-like Ca2+-binding motifs on a genome-wide scale using our bioinformatic tool (http://www.chemistry.gsu.edu/faculty/Yang/Calciomics.htm). A total of 390 putative Ca2+-binding proteins have been predicted. Of these, 40 proteins were identified with multiple EF-hands, ranging from 2 to 6, and 16 of these 40 proteins have been reported previously [32]. The other 350 proteins contain mononuclear EF-hands. Several examples in three classes of these predictions, with diversity in the Ca2+-binding loop and flanking structural regions, together with one class of prediction from other methods, are shown in Table 1. These proteins are implicated in a variety of cellular activities, including Ca2+ homeostasis [42-44], chemotaxis [8, 45, 46], binding to scaffold proteins [47], and resistance to acid stress [48, 49]. According to their sequence homology, and based on the assumption that they evolved from a common ancestor, these proteins can be further classified into several major phylogenetic groups [41].

A notable example is the streptococcal hemoprotein receptor (Shr), a surface protein with a role in iron uptake that has no significant homologues in other bacteria but shares partial homology with eukaryotic receptors such as Toll and G-protein dependent receptors (gi 15675635, GenBank).
Additional sequence analysis identified a leucine-rich repeat domain, an EF-hand Ca2+ domain, and two NEAT domains [50]. As shown in Figure 4, the single EF-hand motif identified in Shr has significant homology to that of CaM, with all the conserved Ca2+-binding ligand residues and two flanking helices.

Though EF-hands have been found abundantly in eukaryotes and bacteria, literature reporting EF-hand or EF-hand-like Ca2+-binding motifs in virus proteins is scarce, possibly due to the lack of accurate prediction methods and robust validation methodologies. A thorough search in PubMed with the key words "EF-hand and virus" returns only 4 examples [...] In addition, the functions of almost 20% of these matched proteins remain uncharacterized. We hope that our prediction will serve as a prelude to more extensive searching for additional viral Ca2+-binding proteins that are closely associated with virus-host interacting events (Figure 1(b)).

Figure 4. [...] (1, 3, 5, 7, 9, 12) and the hydrophobic residues (n, blue). The predicted EF-hands from Shr (streptococcal hemoprotein receptor, S. pyogenes) and PlcR (phospholipase accessory protein, P. aeruginosa) are aligned with some EF-hands known to form oligomers: CaM EF3, the third EF-hand from calmodulin; TnC EF3, the third EF-hand from troponin C; PV EF3, the third EF-hand from parvalbumin; D9K EF2, the canonical EF-hand from calbindin D9k. The search patterns used for the identification of the EF-hand loop and flanking helices (Helix E and Helix F) are also shown at the bottom.

Rubella virus (RUB), the only member of the genus Rubivirus in the Togaviridae family, is the causative agent of a disease called rubella or German measles. The nonstructural protein (NS) open reading frame (ORF) of RUB encodes a polypeptide precursor which is able to cleave itself into two replicase components involved in viral RNA replication.
A putative EF-hand Ca2+-binding motif in the nonstructural protease that cleaves the precursor was successfully predicted across different genotypes of RUB and verified by the established grafting approach [51]. The grafted EF-loop bound Ca2+ and its trivalent analogs Tb3+ and La3+ with dissociation constants of 214, 47, and 14 μM, respectively. The NS protease containing mutations that eliminate the calcium-binding site (D1210A and D1217A) was less efficient at precursor cleavage than the wild-type NS protease at 35°C, and the mutant NS protease was temperature sensitive at 39°C, confirming that the Ca2+-binding loop plays a structural role in the NS protease and is specifically required for optimal stability under physiological conditions.

Interestingly, the same bioinformatics algorithm that successfully predicted the Ca2+-binding loop in the RUB NS protease also predicted an EF-hand Ca2+-binding motif in nsP1 of alphaviruses (Figure 4(b)). NsP1 is one of the four nonstructural proteins produced by alphaviruses; it is involved in membrane binding and has methyl/guanylyl transferase activity.

Next, we grafted two predicted 29-residue EF-hand motifs, one from the Shr of S. pyogenes (CD2.Shr.EF) and one from the nsP1 of Sindbis virus (CD2.Sin.EF), into CD2.D1 to examine their Ca2+ binding capability by using aromatic residue-sensitized Tb3+ luminescence resonance energy transfer (Tb3+-LRET) (Figure 5). Circular dichroism studies of both engineered proteins showed a notable trough at ~216 nm, which is characteristic of β-sheet structure. More negative signals were observed below 240 nm due to the contribution of the inserted helix-loop-helix sequences. Both proteins were able to bind the Ca2+ analog Tb3+, with affinities of 25.1 μM for CD2.Shr.EF and 16.4 μM for CD2.Sin.EF (Figure 5(c)-(d)). The biological relevance of these EF-hand Ca2+-binding motifs will be further investigated.
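The Tb3+ titrations above are described as being fitted with a 1:1 binding stoichiometry. A minimal sketch of the standard 1:1 binding isotherm is given below; it computes the fraction of protein with metal bound from the total concentrations and Kd. The protein concentration and the titration points are assumed values chosen for illustration, and the function is not the authors' fitting routine.

```python
import math


def fraction_bound(metal_total: float, protein_total: float, kd: float) -> float:
    """Fraction of protein occupied for 1:1 binding; all arguments in the same unit (e.g. μM)."""
    if metal_total < 0 or protein_total <= 0 or kd <= 0:
        raise ValueError("metal must be non-negative; protein and Kd must be positive")
    b = metal_total + protein_total + kd
    product = metal_total * protein_total
    # Smaller root of [PM]^2 - b*[PM] + P_t*M_t = 0, rearranged to avoid cancellation.
    complex_conc = 2.0 * product / (b + math.sqrt(b * b - 4.0 * product))
    return complex_conc / protein_total


if __name__ == "__main__":
    kd_um = 25.1      # Tb3+ Kd reported in the text for CD2.Shr.EF, in μM
    protein_um = 2.0  # assumed protein concentration, not taken from the source
    for tb_um in (0.0, 5.0, 25.0, 100.0, 400.0):
        occupied = fraction_bound(tb_um, protein_um, kd_um)
        print(f"[Tb3+] = {tb_um:6.1f} uM -> fraction bound = {occupied:.3f}")
```

When the protein concentration is much lower than Kd, the expression reduces to the familiar hyperbola [M] / ([M] + Kd), so half saturation occurs at a metal concentration close to Kd.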
Overall, based on sequence homology, we have developed a straightforward and fast method to detect linear Ca2+-binding motifs from genomic information. EF-hand Ca2+-binding motifs in bacteria and viruses have been analyzed on a genome-wide scale with this methodology. Experimentally, we have also developed a robust and reliable grafting approach to study the Ca2+-binding properties of continuous Ca2+-binding sites. This novel approach has been successfully used to dissect site-specific Ca2+ binding affinity and cooperativity among the four canonical EF-hands in the prototypical Ca2+-binding protein, calmodulin. The combination of these two approaches is expected to enable us to explore more Ca2+-binding sites that are underrepresented due to the limitations of available methodology.

Figure 5. [...] EF) and from the nsP1 of Sindbis virus (CD2.Sin.EF). (a) Modeled structure of the engineered protein with the insertion of a 29-residue helix-loop-helix EF-hand motif. (b) Far ultraviolet circular dichroism spectra of CD2 wild type and the engineered proteins. (c) and (d) Tb3+-binding curves of the engineered proteins CD2.Shr.EF and CD2.Sin.EF.
The titration curve is fitted for a 1:1 binding stoichiometry.

[1] Calcium: a life and death signal
[2] Calcium and cell death
[3] Calcium signalling: dynamics, homeostasis and remodelling
[4] Calcium and cell death
[5] Calcium transport proteins in the nonfailing and failing heart: gene expression and function
[6] Viral calciomics: interplays between Ca2+ and virus
[7] A novel calcium-dependent bacterial phosphatidylinositol-specific phospholipase C displaying unprecedented magnitudes of thio effect, inverse thio effect, and stereoselectivity
[8] Maintenance of intracellular calcium in Escherichia coli
[9] Energetics of calcium efflux from cells of Escherichia coli
[10] Free calcium transients in chemotactic and non-chemotactic strains of Escherichia coli determined by using recombinant aequorin
[11] Poly-3-hydroxybutyrate/polyphosphate complexes form voltage-activated Ca2+ channels in the plasma membranes of Escherichia coli
[12] Characterization of a calcium porter of Streptococcus pneumoniae involved in calcium regulation of growth and competence
[13] Mutations which alter the kinetics of calcium transport alter the regulation of competence in Streptococcus pneumoniae
[14] Cell adhesion receptors: signaling capacity and exploitation by bacterial pathogens
[15] Calcium in bacteria: a solution to which problem?
[16] Calcium signalling in bacteria
[17] An extracellular calcium-binding domain in bacteria with a distant relationship to EF-hands
[18] Classification and evolution of EF-hand proteins
[19] Chimeric HTH motifs based on EF-hands
[20] Dissociation and aggregation of calpain in the presence of calcium
[21] Binding and aggregation of human mu-calpain by terbium ion
[22] Homodimerization of calpain 3 penta-EF-hand domain
[23] Probing site-specific calmodulin calcium and lanthanide affinity by grafting
[24] Predicting calcium binding sites in proteins - a graph theory and geometry approach
[25] Calcium and lanthanide affinity of the EF-loops from the C-terminal domain of calmodulin
[26] Design of a calcium-binding protein with desired structure in a cell adhesion molecule
[27] The effects of Ca2+ binding on the dynamic properties of a designed Ca2+-binding protein
[28] The Encyclopedia of Inorganic Chemistry, Second Edition
[29] A grafting approach to obtain site-specific metal-binding properties of EF-hand proteins
[30] Rational design of a calcium-binding protein
[31] Obtaining site-specific calcium-binding affinities of calmodulin
[32] Structural analysis, identification, and design of calcium-binding sites in proteins
[33] Metal-binding studies for a de novo designed calcium-binding protein
[34] Isolated EF-loop III of calmodulin in a scaffold protein remains unpaired in solution using pulsed-field-gradient NMR spectroscopy
[35] Design, synthesis, and characterization of a calcium-sensitive near infrared dye
[36] Calcium binding properties of EF-loops in a beta-sheet protein
[37] Metal binding affinity and structural properties of an isolated EF-loop in a scaffold protein
[38] Criteria for Designing a Calcium Binding Protein
[39] Peptide analogs from E-cadherin with different calcium-binding affinities
[40] Identifying and designing of calcium binding sites in proteins by computational algorithm. In: Computational Studies, Nanotechnology, and Solution Thermodynamics of Polymer Systems
[41] Prediction of EF-hand calcium-binding proteins and analysis of bacterial EF-hand proteins
[42] Calcium signalling in Bacillus subtilis
[43] Maintenance of intracellular calcium in Escherichia coli
[44] NMR assignments, secondary structure, and global fold of calerythrin, an EF-hand calcium-binding protein from Saccharopolyspora erythraea
[45] NMR assignments, secondary structure, and global fold of calerythrin, an EF-hand calcium-binding protein from Saccharopolyspora erythraea
[46] A novel calcium binding site in the galactose-binding protein of bacterial transport and chemotaxis
[47] Cellulosome assembly revealed by the crystal structure of the cohesin-dockerin complex
[48] Protein synthesis in Brucella abortus induced during macrophage infection
[49] Characterization of heat, oxidative, and acid stress responses in Brucella melitensis
[50] NEAT: a domain duplicated in genes near the components of a putative Fe3+ siderophore transporter from Gram-positive pathogenic bacteria
[51] Identification of a Ca2+-binding domain in the rubella virus nonstructural protease