key: cord-0752072-pgiqxd7v
authors: Bernini, Andrea; Spiga, Ottavia; Ciutti, Arianna; Chiellini, Stefano; Bracci, Luisa; Yan, Xiyun; Zheng, Bojian; Huang, Jiandong; He, Ming-Liang; Song, Huai-Dong; Hao, Pei; Zhao, Guoping; Niccolai, Neri
title: Prediction of quaternary assembly of SARS coronavirus peplomer
date: 2004-12-24
journal: Biochem Biophys Res Commun
DOI: 10.1016/j.bbrc.2004.10.156
sha: 4a691acb8d49f816a5f377c8ada003ae2d2347c1
doc_id: 752072
cord_uid: pgiqxd7v

The tertiary structures of the S1 and S2 domains of the spike protein of the coronavirus responsible for severe acute respiratory syndrome (SARS) have recently been predicted. Here a molecular assembly of the SARS coronavirus peplomer that accounts for the available functional data is proposed. The interaction between S1 and S2 appears to be stabilised by a large hydrophobic network of aromatic side chains present in both domains. This feature appears to be common to all coronaviruses, suggesting a potential target for drugs preventing coronavirus-related infections.

We are now in a post-epidemic period of the severe acute respiratory syndrome (SARS), caused by the coronavirus henceforth called SARS-CoV. Nevertheless, since the mode of transmission, spread, and mechanisms of virulence of SARS-CoV are not fully understood, all the possible weapons that immunology and pharmacology can provide should be prepared against the virus, so that we can defend ourselves better when it rears its infecting crown again [1]. For a pharmacological approach, the structural characterisation of the molecular repertoire of the target organism is of fundamental importance. In this respect, not much is available yet for SARS-CoV: only two crystallographic determinations [2, 3] and a few predictive models [4] [5] [6] are available so far.
Among the above-mentioned structures, the predicted structures of the S1 and S2 domains of the viral spike glycoprotein [5] can provide a rational basis for designing specific antiviral drugs and diagnostic kits. Indeed, this protein has been found to be the viral membrane protein responsible for SARS-CoV cell entry, interacting with the receptor of the target cell and causing subsequent virus-cell fusion [7]. Ultra-high resolution images of SARS-CoV obtained by scanning electron microscopy (SEM) [8] indicate that the spike glycoprotein is organised as a trimer. This finding offers a fundamental hint for investigating the overall assembly of the outer viral particles, the peplomers, which give the virion the characteristic crown-like aspect for which it is classified in the Coronaviridae family [9]. A stable quaternary structure without covalent crosslinking has been proposed, in general, for coronavirus peplomers [7]. This feature is consistent with our previous structural predictions, as neither model of the SARS-CoV spike glycoprotein domains contains a Cys residue lacking a corresponding cystine-bridged counterpart. The distribution of N-glycosylation and mutation sites has also been considered, to fine-tune the structural features of the peplomer against functional data.

Peplomer model building was performed on the basis of the structures of the S1 and S2 domains of the SARS-CoV spike protein available in the Protein Data Bank [10]: the structure models 1Q4Z and 1U4K were used for S1 and S2, respectively. Docking of the two domains was performed manually, and the reliability of each possible peplomer assembly was evaluated with the ProsaII software package [11]. The quaternary structures exhibiting the lowest energies for atom pair and solvent interactions were then considered for further optimisation by molecular dynamics simulations with Gromacs [12].
After a PROCHECK analysis, the final refined peplomer structure was deposited in the Protein Data Bank with the ID code 1T7G. All displays of structures, as well as exposed surface area (ESA) calculations, were carried out with the program MOLMOL [13].

The shape and dimensions of the SARS coronavirus peplomer are now well defined by recent SEM determinations [8]: the club-shaped protrusions of the trimeric glycoprotein appear to extend approximately 200 Å from the virion envelope membrane, with a maximum width of 100-200 Å. It has been shown that in coronaviruses the S1 domain forms the globular head of the spike, which carries the receptor-binding activity, and that the S2 domain forms the stalk portion of the spike [14]. In this respect, the clear suggestion from SEM images that the spike glycoprotein is present as a trimer in the viral peplomers [8] is a fundamental starting point for our model building procedure. It is also in accord with the general rule that coronavirus spike proteins form three-stranded, left-handed coiled-coils. Moreover, the fact that the 320-518 fragment of the S1 domain has been identified as the SARS-CoV peplomer binding site for the ACE2 cellular receptor [15] implies that the residues most involved in the interaction with the receptor have to be positioned on the external top side of S1. These first morphological and functional hints have been coupled to the results of a systematic search for surface hot spots of the S1 and S2 SARS-CoV domains, i.e., potential drug binding and/or protein-protein interaction sites, to gain structural information on the relative orientations of the two domains. This analysis has been performed on the basis of the available S1 and S2 molecular models [5].
Furthermore, a Clustal W [16] analysis of all the coronavirus spike proteins present in the SwissProt protein sequence data bank has been carried out, and 236 sequences have been found for comparison with that of SARS-CoV (SwissProt Accession No. P59594), originally used for our model building of the S1 and S2 domains of the S glycoprotein.

To build the molecular model of the SARS-CoV peplomer, the modelled structures of its S1 and S2 domains have been used together with homology criteria based on the quaternary assembly of other viral systems [14, 17, 18]. In the first step of the model building procedure, the three S2 domains were positioned with respect to one another by assembling the long α-helix spanning residues 904-968, which constitutes the first heptad repeat (HR1, according to the prediction by Multicoil [19]), into a three-stranded, left-handed coiled-coil. The three HR1s were first aligned parallel along the major axis, and then rotated about the center of mass by the same amount to bring the amino acids at positions a and d into juxtaposition. The structure was refined by minimization followed by simulated annealing dynamics. In the second step, possible interfaces between the S1 and S2 domains, and among the S1 + S2 components, needed for the assembly of a trimeric structure, have been systematically searched. Thus, in spite of the limited sequence homology, ranging from 20.39% to 27.63%, found for all these spike glycoproteins, a high level of residue conservation is present in the S1 hydrophobic pocket delimited by F187, F334, F253, and W423. In this respect SARS-CoV, when compared with all the other coronaviruses, is unique in its W/F swapping between positions 253 and 423; see Table 1.
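As an illustration of how a pairwise figure such as the 20.39% to 27.63% range quoted above can be obtained from two rows of a multiple alignment, a minimal Python sketch is given below. It is not the procedure used by Clustal W or by the authors: the function name `percent_identity`, the choice of denominator (columns where neither row has a gap), and the toy sequences are all assumptions made for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of a sequence alignment.

    Columns in which either row has a gap ('-') are skipped, so the
    denominator is the number of aligned residue pairs. This is an
    assumed convention; alignment tools normalise in different ways.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment rows, not taken from Table 1.
row_a = "MFIF-LLFLT"
row_b = "MFLFQLL-LS"
print(f"{percent_identity(row_a, row_b):.2f}%")
```

Skipping gapped columns keeps the value independent of how many insertions separate two distantly related spike sequences; dividing by the full alignment length instead would give systematically lower figures.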
From Table 1 it can also be noticed that in the S2 domain the hydrophobic residues L803 and F805, totally conserved among the available SARS-CoV genomes and fully exposed in the S2 molecular model, are located in a sequence position where only hydrophobic residues are found. The large difference in the pathologies induced by coronavirus infections suggests that the role of these two hydrophobic moieties of the S1 and S2 domains might lie in peplomer assembly rather than in the interaction with the host cell. Residues F187, F334, F253, and W423 could, indeed, form the S1 hydrophobic pocket into which S2 inserts the hydrophobic finger formed by residues L803 and F805. The remarkable agreement between the steric requirements for the S1-S2 interaction and the surface position of the proposed S1 binding site for the ACE2 receptor points towards a well-defined orientation of these peplomeric domains; see Fig. 1. From this starting point, to recompose the full SARS-CoV peplomer from its components, we used the following simple criteria: (i) orienting each S1 domain and the S2 stalk domains so that they could dock through the interaction described above, (ii) keeping all the potential N-glycosylation sites as surface exposed as possible, and (iii) positioning the largest hydrophobic surface patches in subunit interfaces. The fact that in the S2 trimer the side chains of residues L803 and F805 are still surface accessible after coiled-coil formation supports the hypothesis that the peplomer reaches its structural stability through the hydrophobic interactions of the S1 pocket with the S2 finger, as depicted in Fig. 2. Geometrical and energetic considerations then converge towards possible solutions for the structure of the SARS-CoV peplomer. In Fig. 3, three molecules of S1-S2 adducts are positioned after their assembly, in a way which is consistent with the overall size of the peplomer [8].
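Criterion (ii) above presupposes a list of potential N-glycosylation sites. The text does not say how these were located; a common approach, sketched here in Python as an assumption, is to scan the sequence for the N-X-S/T sequon, where X is any residue except proline. The function name `find_sequons`, the `first_residue` parameter, and the example sequence are hypothetical.

```python
import re

# N-X-S/T sequon, X being any residue except proline. The lookahead
# consumes only the Asn, so overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str, first_residue: int = 1) -> list[int]:
    """Return residue numbers of Asn in potential N-glycosylation sequons.

    first_residue is the number assigned to sequence[0], which allows a
    fragment to keep the numbering of the full-length protein.
    """
    return [m.start() + first_residue for m in SEQUON.finditer(sequence.upper())]


# Hypothetical fragment, not the SARS-CoV spike sequence.
fragment = "ANQTNPSANGS"
print(find_sequons(fragment))
```

A sequon match only marks a potential site; whether it is exposed in the assembled model still has to be checked on the structure, for instance through the exposed surface area of the Asn side chain.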
It should also be noted that none of the mutations found in the spike protein region of all the available SARS-CoV genomes, see Table 2, occurs in the S1-S2 interfaces identified here.

Fig. 1. The S1 domain oriented to fit the morphological SEM images of [8]. The potential N-glycosylation sites are colored in yellow, and the residues involved in the interaction with ACE2 [15] in red. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this paper.)

For the S2 moiety of the S glycoprotein, which is composed of one well-structured subdomain containing HR1 and another subdomain spanning the 1027-1195 segment of the S glycoprotein and containing HR2 (residues 1148-1193), ambiguity remains about the structure of the second subdomain and its location in the peplomer. This subdomain is critical for the interaction with the viral envelope, owing to its proximity to the transmembrane region, and for the overall structural stability of the peplomer. In fact, a peptide reproducing the C-terminal heptad repeat fragment 1161-1187 of S2 exhibits antiviral activity [20]. Thus, 94% of the SARS-CoV peplomer structure has been modelled and deposited with the PDB ID code 1T7G. Accordingly, the most interesting regions to be reproduced in synthetic peptides for mimotope design have been identified, as well as the hydrophobic sites distributed at the S1/S1, S2/S2, and S1/S2 interfaces for the targeting of potential antiviral drugs (patent RM2004A000162). The fact that we could not model the 665-736 sequence of the S glycoprotein does not affect the exposed surface on top of the SARS-CoV peplomer, as the missing moiety can be placed in a lateral region of the peplomer, where a deep groove is found. This groove could be filled by the non-modelled part of the sequence, which consistently exhibits an extensive hydrophobic character [21].
In the present peplomer model, the so-called HR1 and HR2 moieties, i.e., the N- and C-terminal heptad repeat regions, spanning the sequences 904-975 and 1148-1193, respectively, are not bound together. This feature is consistent with the extensive conformational change that the fusogenic mechanism requires in this and several other viruses [22] [23] [24]. On the basis of the obtained peplomer structure, correlations can then be explored between viral genome mutations and possible interactions with the host cell receptor(s). As reported in Table 2, a systematic topological analysis of the mutation sites occurring in the 36 genomes so far available for the SARS-CoV spike glycoprotein has been carried out on the basis of the proposed quaternary assembly. It can be observed that (i) most of the mutations are found in exposed sites; (ii) the only mutation involving a position relevant for the receptor interaction, i.e., 344 K/R, is a conservative one; and (iii) a non-conservative substitution, Y/D, is found at the buried position of residue 778, where it does not induce steric conflicts.

The structural characterisation of SARS-CoV spike glycoprotein domains described here also suggests a general scheme for the peplomer assembly of all coronaviruses. In fact, from the sequence alignment of the spike glycoproteins of all known coronaviruses, as shown in Table 1, it appears that the above-described hydrophobic interaction between the S1 pocket and the S2 finger is highly conserved. It could, therefore, represent a critical region for the interaction between the S1 and S2 domains in all coronaviruses, opening new perspectives for the design of small molecules that can efficiently interfere with viral replication. Hence, SARS-CoV as well as all the other members of the Coronaviridae family could be made to put down their infecting crown with the same type of antiviral drug, which could protect against possible transmission of coronavirus infections from wild animals.
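The mutation analysis above calls 344 K/R conservative and 778 Y/D non-conservative without stating the criterion. One simple way to make such a call, sketched below in Python, is to ask whether the two residues fall in the same physicochemical class. The six-class grouping, the names `GROUPS` and `is_conservative`, and the resulting labels are assumptions for illustration; a substitution matrix would be an equally defensible choice and may classify borderline pairs differently.

```python
# Assumed physicochemical classes; not taken from the paper.
GROUPS = {
    "aliphatic": "AVLIM",
    "aromatic": "FWY",
    "polar": "STNQ",
    "basic": "KRH",
    "acidic": "DE",
    "special": "CGP",
}
# Reverse lookup: one-letter residue code -> class name.
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def is_conservative(residue_a: str, residue_b: str) -> bool:
    """True if both residues belong to the same assumed class."""
    return GROUP_OF[residue_a.upper()] == GROUP_OF[residue_b.upper()]


# The two substitutions discussed in the text.
substitutions = {344: ("K", "R"), 778: ("Y", "D")}
for position, (wild_type, variant) in substitutions.items():
    label = "conservative" if is_conservative(wild_type, variant) else "non-conservative"
    print(f"{position} {wild_type}/{variant}: {label}")
```

Under this grouping K and R are both basic, whereas Y is aromatic and D is acidic, which matches the labels given in the text for these two positions.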
Table 2
Mutation site topology occurring in all the available strains of SARS-CoV spike glycoprotein

aa   Mutation topology        TOR2  BJ01  HZS2_C  CUHK_LC2  SOD  HZS2_FC  GZ_A  HGZ8L1_A
49   Exposed top              S     S     S       S         S    S        L     S
74   Exposed top              H     H     H       H         H    H        H     F
75   Exposed top              T     T     T       T         T    T        R     T
77   Exposed top              G     D     D       G         G    G        D     D
239  Partially exposed side   S     S     S       S         S    S        S     L
244  Buried                   I     T     T       I         I    I        T     T
311  Exposed side             G     G     G       G         G    G        G     R
344  Partially exposed side
     Buried, interface S1-S1

[1] Comparative analysis of the SARS coronavirus genome: a good start to a long journey
[2] The crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor
[3] The severe acute respiratory syndrome-coronavirus replicative protein nsp9 is a single-stranded RNA-binding subunit unique in the RNA virus world
[4] Small envelope protein E of SARS: cloning, expression, purification, CD determination, and bioinformatics analysis
[5] Molecular modelling of S1 and S2 subunits of SARS coronavirus spike glycoprotein
[6] Molecular model of SARS coronavirus polymerase: implications for biochemical functions and drug design
[7] The Coronavirus Surface Glycoprotein
[8] Probing the structure of the SARS coronavirus using scanning electron microscopy
[9] Evidence for a coiled-coil structure in the spike proteins of coronaviruses
[10] The protein data bank
[11] Recognition of errors in three-dimensional structures of proteins
[12] GROMACS: a message-passing parallel molecular dynamics implementation
[13] MOLMOL: a program for display and analysis of macromolecular structures
[14] Quaternary structure of coronavirus spikes in complex with carcinoembryonic antigen-related cell adhesion molecule cellular receptors
[15] A 193-amino acid fragment of the SARS coronavirus S protein efficiently binds angiotensin-converting enzyme 2
[16] CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice
[17] Binding of influenza virus hemagglutinin to analogs of its cell-surface receptor, sialic acid: analysis by proton nuclear magnetic resonance spectroscopy and X-ray crystallography
[18] Assembly of coronavirus spike protein into trimers and its role in epitope expression
[19] MultiCoil: a program for predicting two- and three-stranded coiled coils
[20] Protection of SARS-associated coronavirus infection by peptides targeting the viral spike protein
[21] A simple method for displaying the hydropathic character of a protein
[22] Following the rule: formation of the 6-helix bundle of the fusion core from severe acute respiratory syndrome coronavirus spike protein and identification of potent peptide inhibitors
[23] Interaction between heptad repeat 1 and 2 regions in spike protein of SARS-associated coronavirus: implications for virus fusogenic mechanism and identification of fusion inhibitors
[24] Structural basis for coronavirus-mediated membrane fusion: crystal structure of MHV spike protein fusion core

Thanks are due to the University of Siena for financial support. N.N. also thanks Matteo Bernini for helpful discussions.