key: cord-1010056-7rqt8uyk
authors: Zhang, Yuan; Zheng, Nan; Hao, Pei; Cao, Ying; Zhong, Yang
title: A molecular docking model of SARS-CoV S1 protein in complex with its receptor, human ACE2
date: 2005-06-23
journal: Comput Biol Chem
DOI: 10.1016/j.compbiolchem.2005.04.008
sha: 69054df4eca171e384599fa7c01967e17512943f
doc_id: 1010056
cord_uid: 7rqt8uyk

The exact residues within the severe acute respiratory syndrome coronavirus (SARS-CoV) S1 protein and its receptor, human ACE2, that are involved in their interaction remain largely undetermined. Identifying the amino acid residues that are crucial for the interaction of S1 with ACE2 could provide working hypotheses for experimental studies and might help the development of antiviral inhibitors. In this paper, a molecular docking model of the SARS-CoV S1 protein in complex with human ACE2 was constructed. The interacting residue pairs within this complex model and their contact types were also identified. Our model, supported by significant biochemical evidence, suggests that receptor-binding residues are concentrated in two segments of the S1 protein. In contrast, the interfacial residues in ACE2, though close to each other in tertiary structure, were found to be widely scattered in the primary sequence. In particular, the S1 residue ARG453 and the ACE2 residue LYS341 might be the key residues in complex formation.

As a structural glycoprotein on the virion surface, the spike protein of coronavirus is responsible for binding to host cellular receptors and for the subsequent fusion between the viral envelope and the cellular membrane (Hofmann and Pohlmann, 2004). The identification of angiotensin-converting enzyme 2 (ACE2) as a functional receptor for severe acute respiratory syndrome coronavirus (SARS-CoV) has provided insights into the host range, cell tropism and pathogenesis of this newly identified etiological agent. Binding analyses (Babcock et al., 2004; Wong et al., 2004) localized a 193-amino-acid receptor-binding domain (RBD; residues 318-510) in the S1 domain of the SARS-CoV spike glycoprotein. This fragment was shown to bind ACE2 more potently than the full-length S1 domain, and an initial search for S-protein residues critical to receptor binding also pinpointed GLU452 and ASP454 within this fragment. However, the exact residues within S1 and ACE2 involved in their interaction remain largely undetermined (Hofmann and Pohlmann, 2004). Since structural bioinformatics has already shown its power in the study of SARS-CoV (Cai et al., 2003; Yu et al., 2003; Liu et al., 2004a,b; Zhang and Yap, 2004), we expect that the molecular docking model of the S1/ACE2 complex presented here (PDB code: 1XJP) will help to uncover amino acid residues potentially crucial for the association between S1 and ACE2.

The theoretical model of the SARS-CoV S1 domain (PDB code: 1Q4Z) (Spiga et al., 2003) and the native crystal structure of the human ACE2 extracellular domain (PDB code: 1R42) (Towler et al., 2004) were downloaded from the Protein Data Bank (PDB) (Berman et al., 2000) and used as inputs to the fully automatic ZDOCK server (Chen and Weng, 2003) (http://zdock.bu.edu/) for protein docking computation. Relying on a composite scoring function that combines pairwise shape complementarity with desolvation and electrostatics (Chen and Weng, 2003), ZDOCK uses a fast Fourier transform based algorithm (Chen and Weng, 2002) to perform a global search in translational and rotational space without any assumption about binding sites. As shown in previous Critical Assessment of PRedicted Interactions (CAPRI) challenges, ZDOCK is among the best protein docking algorithms (Chen et al., 2003a,b). However, because protein docking is a difficult task, the ZDOCK server, like other approaches, always generates multiple predictions for one target, ranked in descending order of score. Consequently, the identification and incorporation of crucial biological knowledge must play a significant role in the subsequent manual inspection to choose the most likely complex model (Chen et al., 2003a,b).

Fig. 1. Ribbon diagram of the SARS-CoV S1 (yellow)/ACE2 (blue) complex model (PDB code: 1XJP). The theoretical model of the S1 domain (PDB code: 1Q4Z) and the native crystal structure of the human ACE2 extracellular domain (PDB code: 1R42) were downloaded from the Protein Data Bank (PDB). This model was generated by the fully automatic ZDOCK protein-protein docking server and manually selected on the basis of structural biology knowledge.

Table 1. The interacting residue pairs in the S1/ACE2 complex model

S1 residue    ACE2 residue    Contact type
ASP312        THR334          Hydrogen bond

In this case, a previous study based on a homology model of ACE2 (Prabakaran et al., 2004) indicated that the distal ridge of this molecule was likely to participate in binding, because there was no room for the S1 protein to associate with its receptor at the membrane-proximal face of the ACE2 ectodomain. Moreover, a recent study suggesting that the S protein-binding site of ACE2 is topologically separated from its catalytic site, which is surrounded by the two distal ridges, further strengthened this view. Furthermore, the previously identified RBD, and especially the two key residues within this fragment, must lie on the interface of the selected complex model. Our criterion for model selection was therefore to choose the highest-ranked prediction in agreement with the biological information mentioned above. On this basis, the fourth model out of the 1000 predictions generated was proposed as the best model (Fig. 1). Subsequently, interface forming residue graphical contacts (IFRgc), a web-based tool integrated in the STING Millennium Suite (http://asparagin.cenargen.embrapa.br/SMS/), was used to identify and analyze amino acid contacts across the protein interface within the proposed complex model. The resulting list of interacting residue pairs and their contact types is shown in Table 1.
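The selection criterion described above (take the highest-ranked prediction that agrees with the prior biological knowledge) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Pose` class, the `select_pose` function and the `toy_poses` list are hypothetical names, the interface residue sets are invented toy data rather than ZDOCK output, and in the study itself this step was a manual inspection rather than a script.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Pose:
    """One docking prediction (hypothetical container, not a ZDOCK data type)."""
    rank: int                      # 1 = best docking score
    s1_interface: FrozenSet[int]   # S1 residue numbers at or next to the interface
    contacts_distal_ridge: bool    # does S1 sit on the membrane-distal ridge of ACE2?


def select_pose(poses: Iterable[Pose], key_residues: FrozenSet[int]) -> Optional[Pose]:
    """Return the best-ranked pose that passes both biological filters, or None."""
    for pose in sorted(poses, key=lambda p: p.rank):
        # Filter 1: binding must involve the distal ridge of ACE2.
        # Filter 2: every key RBD residue must be at or near the interface.
        if pose.contacts_distal_ridge and key_residues <= pose.s1_interface:
            return pose
    return None


# The two key S1 residues reported by earlier binding studies.
KEY_RESIDUES = frozenset({452, 454})

# Invented toy data, constructed so that only the fourth pose passes both filters.
toy_poses = [
    Pose(1, frozenset({120, 121, 130}), True),
    Pose(2, frozenset({449, 452, 454}), False),
    Pose(3, frozenset({454, 500}), True),
    Pose(4, frozenset({312, 449, 450, 452, 453, 454}), True),
]

best = select_pose(toy_poses, KEY_RESIDUES)
print(best.rank if best else "no pose satisfies the filters")
```

Encoding the filters as explicit predicates makes the selection reproducible, but deciding whether a pose "contacts the distal ridge" still requires structural judgment that this sketch simply takes as an input flag.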

In our selected complex model, a positively charged cavity at the distal end of the S1 protein envelops one highly negatively charged ridge on the top of ACE2, which is consistent with the earlier speculation stated above. The contacting residues are concentrated in two segments of the S1 tertiary structure. One segment, comprising four residues (ARG449, PRO450, ARG453 and ASP454) nested in the previously identified RBD, contained all three of the residues that form attractive charged contacts. In particular, the residue ARG453 interacted in several ways (electrostatic complementarity, hydrogen bonding and hydrophobic interactions) with the residues LYS341 and GLU56 of ACE2. This suggests that it might be the key residue in complex formation. Moreover, the negative charge of ASP454 complemented the positive charge of the ACE2 residue LYS341, in agreement with the observation that the substitution of ASP454 with alanine completely inhibited ACE2 binding. Similarly, another important residue (GLU452) was adjacent to the interface of this complex, hinting at a possible effect on receptor binding. All of this evidence suggests that this segment is probably the primary determinant of complex formation and hence could be an attractive target for antiviral inhibitors. In contrast, the other segment, comprising the remaining three residues, might provide only a secondary contribution to complex formation because it is not nested in the RBD. Notably, the electrostatic repulsion at the residue ASP312, even though it might be partially counteracted by the hydrogen bond at the same site, could serve as a likely explanation for the low receptor-binding affinity of the whole S1 protein relative to that of the RBD.
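The notion of attractive versus repulsive charged contacts used above can be illustrated with a minimal Python sketch. It assigns formal side-chain charges at neutral pH and compares their signs; the `CHARGE` table and the `electrostatic_contact` function are hypothetical names introduced here, histidine is treated as neutral by assumption, and the sketch ignores distance and geometry, so it is not a reimplementation of the IFRgc analysis.

```python
# Formal side-chain charges at neutral pH (assumption: histidine counted as neutral).
CHARGE = {"ARG": +1, "LYS": +1, "ASP": -1, "GLU": -1}


def electrostatic_contact(res_a: str, res_b: str) -> str:
    """Classify a residue pair such as ("ASP454", "LYS341") by charge sign only."""
    qa = CHARGE.get(res_a[:3].upper(), 0)
    qb = CHARGE.get(res_b[:3].upper(), 0)
    if qa == 0 or qb == 0:
        return "none"          # at least one partner carries no formal charge
    return "attractive" if qa * qb < 0 else "repulsive"


# The S1/ACE2 pair highlighted in the text: ASP454 (negative) with LYS341 (positive).
print(electrostatic_contact("ASP454", "LYS341"))
# A pair involving an uncharged residue yields no charged contact.
print(electrostatic_contact("PRO450", "LYS341"))
```

A sign comparison of this kind only says which pairs could attract or repel if they were close enough; whether they actually are in contact has to come from the three-dimensional model.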

Unlike those of the S1 protein, the interfacial residues in the ACE2 ectodomain, though close to each other in tertiary structure, were found to be widely scattered in the primary sequence. One instance was the hydrophobic patch formed by the residues VAL298, THR334, LYS341 and VAL364. Another was that two neighboring acidic residues (GLU56 and GLU57) and a distant basic residue (LYS341) together formed attractive electrostatic interactions with three adjacent residues (ARG449, ARG453 and ASP454) in the S1 tertiary structure. The residue LYS341, like its counterpart, the S1 residue ARG453, made multiple contacts with residues of the S1 protein, playing a central role in the association between ACE2 and S1. Also, recent research showed that murine ACE2 bound the S1 domain of SARS-CoV with lower affinity than the human receptor and allowed less efficient spike protein-mediated cellular entry. Consequently, a comparison between the human and murine ACE2 sequences could provide valuable information for mapping the S protein-binding region onto ACE2. In fact, the pairwise alignment of human ACE2 (GenBank accession number: AAT45083) and its murine homolog (GenBank accession number: AAH26801) revealed two point substitutions at the interfacial sites. One was the substitution of human VAL298 with methionine in the murine sequence, and the other was the substitution of human ASP335 with glutamate. Both substitutions lead to an increase in side chain volume that may cause steric hindrance or clashes with contacting residues. Finally, it is reasonable to expect that the substitution of ASP335 with a basic amino acid, such as lysine, will result in greatly increased S1 protein-binding affinity. In other words, a fragment containing THR334, LYS341 and the mutated residue 335 might have the potential to effectively block the receptor binding and cellular entry of SARS-CoV.
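The statement that both human-to-murine substitutions increase side chain volume can be checked with simple arithmetic. The sketch below uses residue volumes in cubic angstroms as commonly tabulated in the literature (values of this kind are often attributed to Zamyatnin); the numbers, the `RESIDUE_VOLUME` table and the `volume_change` function are assumptions introduced for illustration and are not taken from the paper.

```python
# Approximate residue volumes in cubic angstroms (assumed, commonly tabulated values).
RESIDUE_VOLUME = {"VAL": 140.0, "MET": 162.9, "ASP": 111.1, "GLU": 138.4}

# Interfacial substitutions reported in the text: (position, human, murine).
SUBSTITUTIONS = [(298, "VAL", "MET"), (335, "ASP", "GLU")]


def volume_change(human: str, murine: str) -> float:
    """Return the murine-minus-human residue volume difference in cubic angstroms."""
    return round(RESIDUE_VOLUME[murine] - RESIDUE_VOLUME[human], 1)


for position, human, murine in SUBSTITUTIONS:
    delta = volume_change(human, murine)
    print(f"{human}{position} -> {murine}: {delta:+.1f} cubic angstroms")
```

Under these assumed volumes, both differences are positive, in line with the steric argument; a larger residue does not by itself prove a clash, which depends on the local packing in the model.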

In summary, the results presented here reveal amino acid residues potentially crucial for the interaction of S1 with ACE2, and offer an opportunity to test our hypothesis by site-directed mutagenesis and to develop effective antiviral therapies. In fact, our hypothesis has been strengthened by a recently published docking study (Huentelman et al., 2004) in which a novel ACE2 inhibitor that potently blocked SARS-CoV spike protein-mediated cell fusion was successfully identified and experimentally confirmed. Moreover, the roles of the two S1 segments implied by our model are supported by preliminary results [Hsiao et al., unpublished data, abstract available at http://www.egms.de/en/meetings/sars2004/04sars058.shtml] that identified two separate ACE2-binding domains on the S1 protein. The low-affinity binding domain was mapped within the N-terminal 333 residues, while the high-affinity binding domain was located between residues 334 and 666. Furthermore, rabbit antisera raised against a peptide fragment corresponding to S1 residues 433-467 (which contains our predicted primary determinant) could completely block S protein binding to VERO E6 cells, which use ACE2 as the receptor for SARS-CoV. Taken together, this evidence indicates that our complex model is very likely to shed light on the development of oligopeptide competitive inhibitors.
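The correspondence between the predicted S1 contact residues and the experimentally mapped regions can be made explicit with a small range check. The residue numbers and region boundaries below are those quoted in the text; the `REGIONS` and `S1_CONTACTS` tables and the `regions_containing` function are hypothetical names, and the "N-terminal 333 residues" is read here as positions 1-333 by assumption.

```python
# Region boundaries quoted in the text (inclusive residue ranges on S1).
REGIONS = {
    "low-affinity binding domain": (1, 333),
    "high-affinity binding domain": (334, 666),
    "receptor-binding domain (RBD)": (318, 510),
    "antiserum peptide": (433, 467),
}

# S1 contact residues named in the text, keyed by label.
S1_CONTACTS = {"ASP312": 312, "ARG449": 449, "PRO450": 450, "ARG453": 453, "ASP454": 454}


def regions_containing(position: int) -> list:
    """List the names of all regions whose inclusive range covers the position."""
    return [name for name, (start, end) in REGIONS.items() if start <= position <= end]


for label, position in S1_CONTACTS.items():
    print(label, "->", ", ".join(regions_containing(position)))
```

Read this way, the four residues of the first segment fall inside the RBD, the high-affinity domain and the antiserum peptide, whereas ASP312 falls only in the low-affinity N-terminal domain.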

Amino acids 270 to 510 of the severe acute respiratory syndrome coronavirus spike protein are required for interaction with receptor

The protein data bank

Putative caveolin-binding sites in SARS-CoV proteins

ZDOCK: an initial-stage protein-docking algorithm

ZDOCK predictions for the CAPRI challenge

Docking unbound proteins using shape complementarity, desolvation, and electrostatics

A novel shape complementarity scoring function for protein-protein docking

STING Millennium Suite: integrated software for extensive analyses of 3D structures of proteins and their complexes

Cellular entry of the SARS coronavirus

Structure-based discovery of a novel angiotensin-converting enzyme 2 inhibitor

Efficient replication of severe acute respiratory syndrome coronavirus in mouse cells is limited by murine angiotensin-converting enzyme 2

Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus

Homology models and molecular dynamics simulations of main proteinase from coronavirus associated with severe acute respiratory syndrome (SARS)

Molecular dynamics simulations of various coronavirus main proteinases

STING contacts: a web-based application for identification and analysis of amino acid contacts within protein structure and across protein interfaces

Retroviruses pseudotyped with the severe acute respiratory syndrome coronavirus spike protein efficiently infect cells expressing angiotensin-converting enzyme 2

A model of the ACE2 structure and function as a SARS-CoV receptor

Molecular modelling of S1 and S2 subunits of SARS coronavirus spike glycoprotein

ACE2 X-ray structures reveal a large hinge-bending motion important for inhibitor binding and catalysis

A 193-amino acid fragment of the SARS coronavirus S protein efficiently binds angiotensin-converting enzyme 2

Putative hAPN receptor binding sites in SARS CoV spike protein

The 3D structure analysis of SARS-CoV S1 protein reveals a link to influenza virus neuraminidase and implications for drug and antibody discovery

Reconstruction of the most recent common ancestor sequences of SARS-CoV S gene and detection of adaptive evolution in the spike protein

This work was supported in part by grants from the National Key Projects for Basic Research (973) (2002CB512801, 2003CB715904).

The Editors and the Authors decided to publish this paper as soon as possible because of a very high potential medical importance of the findings. We are aware of the fact that the evidence based on molecular docking alone may not be strong enough to fully justify the conclusions reached in this commentary paper.