key: cord-0751035-tj3e4rqq authors: Margiotta, Enrico; Fonseca Guerra, Célia title: SARS-CoV spike proteins can compete for electrolytes in physiological fluids according to structure-based quantum-chemical calculations date: 2021-08-05 journal: Comput Theor Chem DOI: 10.1016/j.comptc.2021.113392 sha: 5a9fcc78340fd7a692e58df4cb58a26ede5897f2 doc_id: 751035 cord_uid: tj3e4rqq The trimeric spike (S) glycoprotein is the trojan horse and the stronghold of the severe acute respiratory syndrome coronaviruses. Although several structures of the S-protein have been solved, a complete understanding of all its functions is still lacking. Our multi-approach study, based on the combination of structural experimental data and quantum-chemical DFT calculations, led to identify a sequestration site for sodium, potassium and chloride ions within the central cavity of both the SARS-CoV-1 and SARS-CoV-2 spike proteins. The same region was found as strictly conserved, even among the sequences of the bat-respective coronaviruses. Due to the prominent role of the main three electrolytes at many levels, and their possible implication in the molecular mechanisms of COVID-19 disease, our study can take the lead in important discoveries related to the SARS-CoV-2 biology, as well as in the design of novel effective therapeutic strategies. The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike (S) protein is a trimeric class I fusion protein [1] . It triggers the envelope fusion with the human (h) host cell membrane by h-ACE-2 receptor mediated endocytosis [2] , [3] , [4] . The S-protein represents the trojan horse of coronavirus infection and, consequently, the most promising candidate for the development of vaccines and drugs [5] , [6] . Each S-protein monomer weighs ~180 kDa and is made up of two subunits, S1 and S2 [7] . 
S1 contains the N-terminal domain (NTD) and the receptor binding domain (RBD), which interacts with h-ACE-2; S2 contains the fusion system, made up of the central helix (CH), the heptad repeat region 1 (HR1), the transmembrane heptad repeat region 2 (HR2), the fusion peptide (FP), the connecting domains (CD1, CD2), the S1/S2 cleavage site and several glycosylation sites (see Supplementary Information 1.1 (SI-1.1)). Full cryo-electron microscopy (cryo-EM) structures of the 2019-nCoV S-protein have been solved in both its infectious and non-infectious pre-fusion states [1], [8]. However, the structural mechanism that triggers infection, necessary for h-ACE-2 binding, is not completely understood, and many factors could play a significant role in this process. It has been well established since 1977 that many viruses can take advantage of inorganic cations in the aqueous environment for replication [9]. The intracellular and extracellular concentration of Na⁺, K⁺, Ca²⁺ and Cl⁻ ions is crucial for the survival of human cells, because it dictates their physiological Donnan potential [10], [11]. Coronaviruses are enveloped; by logical extension, SARS-CoV-2, like other enveloped viruses, has to cope with osmolarity [12]. It has been demonstrated that the envelope proteins of SARS-CoV-1 behave as ion channels, showing selectivity towards Na⁺ over K⁺ and Cl⁻ ions [13], [14]. Not surprisingly, commercially available drugs such as ion channel blockers have been proposed for targeting both SARS-CoV-1 and SARS-CoV-2 [15], [16]. These findings suggest that the human SARS-related CoVs could exploit the three major physiological electrolytes (sodium, potassium and chloride ions) not only to adapt to the aqueous environment, but also to properly engage with the host receptor.
In favour of this hypothesis, Duquerroy found in 2005 [17] that the S2 subunit of the SARS-CoV-1 S-protein, as in its post-fusion hairpin conformation (the conformation assumed after fusion between the virus envelope and the host membrane), was stabilized by two chloride atoms, respectively reported as chloride 1 (CL1) and chloride 2 (CL2): CL1 interacted directly by hydrogen bonds (HBs) with a triad of glutamine (Gln/Q) "zippers", while CL2 interacted indirectly with a triad of asparagine (Asn/N) residues via water molecules (hereinafter, CL1 and CL2 terms will be adopted as referring to the respective chloride atom interaction pattern, direct or indirect). Starting from the structural analysis of the central funnel-shaped domain within available pre-fusion cryo-EM structures of the SARS-CoV-1 and SARS-CoV-2 S-glycoprotein, we identified a conserved Gln-rich structural region, potentially able to bind inorganic ions. The latter were structurally unsolved however, due, probably, to the low resolution of the respective structures (R ≥ 3.0Å), at which water molecules and ions are generally unobserved. The putative interaction between the experimentally solved protein site and either alkali metal cations or chloride was modelled quantum chemically, using density functional theory (DFT), with the aim to assess if they could be accommodated efficiently despite their lack of resolution. 3D molecular representations were obtained using VMD 1.9.3 [18] , Chimera 1.1.2 [19] and ADF 2018.105 software [20] (https://www.scm.com/). Structures were selected from the coronavirus 3D structure database (https://cov3d.ibbr.umd.edu/spike) and retrieved from the Protein Data Bank [21] . S-protein CH domain sequences of human and bat SARS-CoVs were obtained from the Uniprot web server [22] . The FTmap software [23] was used to detect interesting small cavities within the CH-funnel of the SARS-CoV-2 spike glycoprotein. Default settings were adopted. 
Two S-protein cryo-EM structures were chosen for SARS-CoV-2 and SARS-CoV-1 because their main Gln rotamers in the CH domain (Q1002/Q984) appear conducive to ion binding: PDB IDs 6VSB (SARS-CoV-2) [1], in the open state conformation, taken as the cation binding form P⁺, and 5XLR (SARS-CoV-1) [24], which represents the anion binding form P⁻ (see Figs. 4-5 in section 3.1). The X-ray crystal structure of the SARS-CoV-1 spike protein in its post-fusion hairpin conformation (PDB ID 1WYY) [17] was used as a reference for subsequent calculations. Hydrogens were added using Chimera 1.1.2 [19] and ionization states were checked with the PropKa server tool (https://www.ddl.unimi.it/vegaol/propka.htm). CH residues within 5 Å of the glutamine ion "zipper" in each structure (Q1002, SARS-CoV-2, pre-fusion; Q984, SARS-CoV-1, pre-fusion; Q902, SARS-CoV-1, post-fusion) were selected using the VMD engine [18] (T998→T1006 and T980→T988 in the SARS-CoV-2 and SARS-CoV-1 pre-fusion S-proteins, respectively; L898→A906 in the SARS-CoV-1 post-fusion hairpin conformation). Residues were then extracted from the respective PDB structure. C/N-terminal residues were capped by retaining the peptide bond shared with the first adjacent residue to be removed, together with the α-carbon of the latter (in SARS-CoV-2, I997 and Y1007, respectively). Sidechains pointing out of the binding cavity do not participate in the putative ion-protein interaction pattern; they were removed and the corresponding residues treated as Gly, which preserves the integrity of the helical backbone. Missing atoms were added, and all hydrogen atoms were subjected to minimization with the DFTB method as provided in the ADF package [20]. With ions included, the final pre-fusion and post-fusion models consisted of 376 and 358 atoms, respectively. All the calculations were performed with the Amsterdam Density Functional 2018.105 (ADF) program [20]. The dispersion-corrected BLYP-D3(BJ) functional was used [25], [26].
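The distance-based residue selection described above was performed by the authors in VMD. As a rough, library-free illustration of the same idea, the sketch below keeps every residue that has at least one atom within a cutoff of a set of reference atoms. The function name `residues_within` and the toy coordinates are hypothetical and are not taken from any PDB entry; a real workflow would read the atoms from the structure file.

```python
import math

def residues_within(atoms, centre_atoms, cutoff=5.0):
    """Return the sorted residue numbers that have at least one atom
    within `cutoff` angstroms of any atom in `centre_atoms`."""
    selected = set()
    for residue_number, xyz in atoms:
        if any(math.dist(xyz, centre) <= cutoff for centre in centre_atoms):
            selected.add(residue_number)
    return sorted(selected)

# Hypothetical toy coordinates (angstroms), NOT taken from any PDB entry.
toy_atoms = [
    (998, (0.0, 0.0, 4.0)),
    (1002, (0.0, 0.0, 0.0)),
    (1006, (0.0, 3.0, -3.0)),
    (1010, (0.0, 0.0, 12.0)),
]
zipper_atoms = [(0.0, 0.0, 0.0)]  # stand-in for the Gln "zipper" sidechain atoms

selected = residues_within(toy_atoms, zipper_atoms)
print(selected)
```

The cutoff is applied per atom, so a residue is kept as soon as any one of its atoms falls inside the sphere; whether whole residues or only the matching atoms are retained is a modelling choice that this sketch does not settle.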
The TZ2P basis set was employed along with the Becke integration grid [27], [28]. Neither the frozen core approximation nor symmetry was adopted [29]. Complexes between the cryo-EM pre-fusion structures and ions were optimized in the gas phase with the proteins constrained to their original conformation, in order to evaluate whether the latter may be conducive to ion binding without further refinement. Single point energy calculations on the SARS-CoV-1 spike protein, as in the post-fusion CL1 ion complex (1WYY) [17], were performed in the gas phase for qualitative comparison with the pre-fusion complexes. As shown in Fig. 1, the interaction energy, ΔE_int, between the ion (I = Na⁺, K⁺, Cl⁻) and the respective S-protein cavity (P = P⁺, P⁻ or P_1WYY) was calculated as the difference in energy between the optimized complex and the free ligands (protein and ion) in their final state:

$$\Delta E_{int} = E_{P \cdot I} - \left( E_{P} + E_{I} \right)$$

Implicit solvent effects were not included in the calculation because the putative pre-fusion ion binding site is not directly exposed to the water solution, being located deep in the CH funnel and surrounded by the HR1 domains. Explicit solvent effects were included in separate geometry optimizations for the sodium and potassium ion complexes, to explore the potential coordination of axial water molecules. The interaction energy, ΔE_int, was also calculated for the hydrated sodium-protein complex, as the difference in energy between the latter and the free ligands in the gas phase. In the case of chloride, water was not included, because water molecules do not participate in the interaction pattern of the experimental post-fusion chloride complex CL1, used here as a reference, in which the Gln sidechain arrangement correlates specifically with direct binding.
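The definition of the interaction energy given above (energy of the complex minus the energies of the free fragments) can be written as a short helper. This is a minimal sketch under stated assumptions: the function name and the placeholder energies are hypothetical, the numbers are not results from the study, and no counterpoise correction is applied here.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509  # standard unit conversion factor

def interaction_energy(e_complex, fragment_energies):
    """Interaction energy as E(complex) minus the sum of the energies of
    the free fragments, all evaluated in the geometry of the complex."""
    return e_complex - sum(fragment_energies)

# Hypothetical placeholder energies in hartree, NOT values from the study.
e_complex = -100.150
e_protein = -99.000
e_ion = -1.000

delta_e = interaction_energy(e_complex, [e_protein, e_ion])
delta_e_kcal = delta_e * HARTREE_TO_KCAL_PER_MOL
print(f"dE_int = {delta_e_kcal:.1f} kcal/mol")
```

Passing the fragments as a list lets the same helper cover the hydrated complex, where the free fragments are the protein, the ion and the water molecule.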
The first solved cryo-EM structure of the SARS-CoV-2 S-protein in its closed state (PDB ID 6VXX, reported hereafter as P⁰) [8] reveals that the RBDs and CHs together form a central tunnel extending axially from S1 to S2. Specific residues shape the tunnel, extruding into the cavity and forming triangular planes with their sidechains: E988, V991, Q992, D994, R995, T998, Q1002, Q1005, T1006 and T1009 (E970, V973, Q974, D976, R977, T980, Q984, Q987, T988 and T991 in SARS-CoV-1). Interestingly, the CH domain (45 aa) is shared with 100% sequence identity by h-SARS-CoV-1, h-SARS-CoV-2 and the bat SARS-CoVs Rf1, Rp3 and HKU3, as shown in Table 1. The CH funnel residues (24 aa) are either charged (10 aa, 22.2% of the CH domain) or polar neutral (14 aa, 31.1% of the CH domain). Importantly, 9 out of the 10 residues that are directly exposed to the inner cavity by their sidechains are polar (90%). The predominantly polar composition of the CH funnel suggests that the latter may promote ion binding [30]. Unfortunately, the resolution of the full cryo-EM structures precludes ions and water molecules from being resolved. To preliminarily identify suitable binding cavities within the CH domain, we docked fragment probes to three representative SARS-CoV-2 structures (PDB IDs 6VXX, P⁰, in the RBD-closed state; 6VYB [8] and 6VSB [1], P⁺, in the RBD-open state) (section 2), using the FTmap software [23]. The results revealed differences between the structures at the bottom of the funnel. This region is characterized by 4 triads of polar residues, namely Q1002, Q1005, T1006 and T1009 (Figs. 2-3). [...] (P⁺) and 6VYB (Fig. 4A). [...] Fig. 4C), 52 showed a Gln sidechain arrangement comparable to the closed state 6VXX (P⁰, Fig. 4B), not ideal for direct ion binding, while 7 structures showed another rotamer state, which may be conducive to direct chloride binding (P⁻, Fig. 5). [...] P⁺ and P⁻, respectively).
Thus, it is reasonable to assume that the selected amide rotamers lie in a region of sufficiently low thermal motion to be considered structurally reliable for further analysis. In light of these findings, we decided to assess the ability of the CH site to bind physiological alkali cations (Na⁺ and K⁺) and chloride (Cl⁻) in the representative pre-fusion conformations of the SARS-CoV-2 and SARS-CoV-1 S-proteins, by means of Density Functional Theory (DFT) [31]. Complexes with sodium (6VSB, P⁺Na⁺), potassium (6VSB, P⁺K⁺) and chloride (5XLR, P⁻Cl⁻) ions were optimized quantum-chemically. The optimized complexes are shown in Fig. 6 (see SI-2 for structure coordinates). The sodium cation interacts preferentially with two Gln oxygens at distances below 3 Å and gives rise to a distorted trigonal planar complex. Potassium, having a larger radius than sodium, sits in the CH site interacting almost equally with all three Q1002 amide oxygen atoms and forms a fairly regular trigonal planar complex. Chloride is bound directly by the Gln Nε atoms, leading to a tetrahedral coordination geometry that resembles the CL1 interaction state described by Duquerroy in 2005 [17]. The fourth coordination ligand is absent in the present case as well, being replaced by a triad of threonine alkyl groups (T998 and T980, in SARS-CoV-2 and SARS-CoV-1, respectively). The interaction energy between each ion and the S-protein pre-fusion CH site, ΔE_int, was calculated (see section 2, Fig. 1). The same was done for the X-ray crystal structure of the SARS-CoV-1 post-fusion ion complex, as a reference for our predictions (1WYY-Cl⁻). Results are summarized in Table 2 (see SI-3 for total energies and basis set superposition errors, BSSE).
The counterpoise-corrected interaction energies calculated for the modelled pre-fusion ion complexes and for the reference complex 1WYY-Cl⁻ are of the same order of magnitude, which supports the general reliability of our models in terms of the predicted interaction pattern. All the ions interact favourably with the S-protein, suggesting the ion binding nature of this site. The ΔE_int trend for the pre-fusion S-protein ion complexes follows the order Na⁺ > K⁺ > Cl⁻, making sodium the best putative bound ion. Comparing the interaction energy of the CL1 pre-fusion (P⁻Cl⁻) and post-fusion (1WYY-Cl⁻) complexes, the latter is more favourable. This can be related to the tighter HBs formed with the chloride ion in the post-fusion structure (Nε-H···Cl⁻, 2.17 Å vs 2.66 Å), to the involvement of the Gln902 γ-carbons in the interaction pattern of the latter, which is not observed in the P⁻ pre-fusion complex, and probably to a better dispersion contribution from the fourth coordination ligand, isoleucine (I905, Fig. 5), which is more hydrophobic than threonine. The P⁺Na⁺ complex of Fig. 6 was further optimized after inclusion of a water molecule above the trigonal sodium-protein coordination plane, yielding a distorted tetrahedral geometry (Fig. 7). The insertion of another water ligand below the same plane, conversely, did not lead to any energy minimum. For the P⁺K⁺ complex, no axial coordination was predicted. In the P⁺Na⁺·H₂O model, the water molecule interacts with Na⁺ and forms HBs with two pre-coordinated oxygen atoms, increasing the stability of the complex. Indeed, the ΔE_int calculated for the hydrated complex is -110.9 kcal·mol⁻¹ (BSSE = -0.9 kcal·mol⁻¹), corresponding to a gain in stabilization of -19.0 kcal·mol⁻¹ after inclusion of a single water molecule (P⁺Na⁺, ΔE_int = -91.9 kcal·mol⁻¹).
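The stabilization gain quoted above is simply the difference between the two interaction energies reported for the sodium complex with and without water. The short check below reproduces that arithmetic using only the two values given in the text; the variable and function names are illustrative.

```python
# Interaction energies quoted in the text, in kcal/mol.
DE_INT_NA = -91.9            # sodium complex without water
DE_INT_NA_HYDRATED = -110.9  # sodium complex with one axial water molecule

def stabilization_gain(de_hydrated, de_dry):
    """Change in interaction energy on hydration, rounded to one decimal
    to match the precision of the reported values."""
    return round(de_hydrated - de_dry, 1)

gain = stabilization_gain(DE_INT_NA_HYDRATED, DE_INT_NA)
print(f"{gain:.1f} kcal/mol")
```

A negative value means that the hydrated complex is more strongly bound than the dry one. Because the hydrated value is taken relative to three free fragments (protein, ion and water), the gain includes the water-protein hydrogen bonds as well as the water-sodium contact.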
Water coordination is sterically allowed because the sodium cation coordinates preferentially with two out of three oxygen atoms, i.e., it interacts asymmetrically with the protein and leaves enough space to accommodate a water molecule close to the third (weakly bound) atom. Consistently, upon hydration, Na⁺ is predicted to coordinate one amide oxygen more strongly than the other two (the Na⁺···O distance decreases from 2.36 to 2.25 Å in the presence of water; Figs. 6-7) and to further loosen the weakest pre-existing interaction (the Na⁺···O distance increases from 3.25 to 3.39 Å in the presence of water). Similar considerations suggest that K⁺-water coordination is not feasible, because neither the symmetric interaction pattern of potassium within the cavity nor its radius, larger than that of sodium, sterically favours it. [...] (Table 2). Furthermore, the resulting complex geometries and energies are qualitatively consistent with the interaction pattern observed in the experimental X-ray reference structure (for which ions have been resolved). We also explored the axial coordination of explicit water molecules by the cations, which proved favourable only for the top-ranked sodium ion complex, lending further support to our assumptions. Indeed, the better ΔE_int of the sodium ion compared to potassium and chloride in the pre-fusion S-protein structures (Table 2), along with the prediction of a stable water complex only for Na⁺, can be linked to the respective concentrations of these electrolytes in the extracellular solution. In fact, the sodium ion is more concentrated than potassium and chloride outside the cell membrane, while the opposite is observed in the intracellular solution, due to the Donnan potential.
This result is of special importance because it is consistent both with our hypothesis that SARS-CoVs can take advantage of the extracellular medium to attack the host through their spike proteins, and with the experimental evidence of chloride sequestration in the intracellular environment (where the anion is more present) after membrane fusion. Accordingly, the post-fusion chloride complex (1WYY-Cl⁻) is more stable than the one predicted for the pre-fusion state (P⁻Cl⁻). In other words, the SARS-CoV spike proteins may exploit both the extracellular and the intracellular solution, by cation and anion binding respectively, in order to drive the first stages of the infection process. Based on the present study, the existence of an ion binding site shared by SARS-CoVs is reasonable and highly expected. Although further investigations are required to confirm the potential role of physiological electrolytes in spike proteins, our results suggest that they may contribute to the structural stabilization of the central helix trimer before infection is triggered, in line with the evidence that chloride ions play such a role in the helical structure of the SARS-CoV-1 spike protein after fusion with the host cell. Subsequent computational and experimental studies may, therefore, give valuable insights into the observed high conservation of the CH domain in bat and human SARS-CoVs, in terms of evolutionary biofunction. Moreover, small molecules could be designed to specifically bind and interfere with the CH funnel, preventing membrane fusion and infection. Indeed, other investigations are ongoing in our laboratories for this purpose.
From a biochemical and physio-pathological perspective, the favourable interaction of the three major electrolytes with the S-protein, in either the pre-fusion or the post-fusion state, correlates well with their presence in the extracellular solution, reinforcing the idea that SARS-CoV-2 (and, in general, SARS-CoVs) may realistically sequester electrolytes from physiological body fluids and take advantage of them in order to succeed, depriving the human cells of the salt balance required for their homeostasis. Based on our results, the three main electrolytes should therefore represent a landmark for translational investigations on the structural functions of the SARS-CoV-2 spike glycoprotein and on the way the latter can be impaired by ad hoc therapeutic strategies, ultimately helping clinical practice to treat SARS-related coronavirus diseases efficiently. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. E.M. conceived the study, performed the calculations, analysed the results and wrote the manuscript. C.F.G. contributed to the accurate setting of the quantum-chemical calculations and to the evaluation of the results, providing support at every level of the present research.
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
Structure, Function, and Evolution of Coronavirus Spike Proteins
The coronavirus spike protein is a class I virus fusion protein: structural and functional characterization of the fusion core complex
Coronavirus membrane fusion mechanism offers a potential target for antiviral development
Oxford COVID Vaccine Trial Group, Safety and immunogenicity of the ChAdOx1 nCoV-19 vaccine against SARS-CoV-2: a preliminary report of a phase 1/2, single-blind
COVID-19 Coronavirus spike protein analysis for synthetic vaccines, a peptidomimetic antagonist, and therapeutic drugs, and analysis of a proposed achilles' heel conserved region to minimize probability of escape mutations and drug resistance
Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV
Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
The inhibition of cell functions after viral infection. A proposed general mechanism
A model for membrane potential and intracellular ion distribution, Chemistry and Physics of Lipids
Effect of Osmotic Pressure on the Stability of Whole Inactivated Influenza Vaccine for Coating on Microneedles
SARS coronavirus E protein forms cation-selective ion channels
Analysis of SARS-CoV E protein ion channel activity by tuning the protein and lipid charge
Hexamethylene amiloride blocks E protein ion channels and inhibits coronavirus replication
Pharmaceutical Targeting the Envelope Protein of SARS-CoV-2: the Screening for Inhibitors in Approved Drugs
Central ions and lateral asparagine/glutamine zippers stabilize the post-fusion hairpin conformation of the SARS coronavirus spike glycoprotein
VMD: Visual molecular dynamics
UCSF Chimera-A visualization system for exploratory research and analysis
Chemistry with ADF
RCSB Protein Data Bank: powerful new tools for exploring 3D structures of biological macromolecules for basic and applied research and education in fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences
UniProt: the universal protein knowledgebase in 2021
The FTMap family of web servers for determining and characterizing ligand binding hot spots of proteins
Cryo-electron microscopy structures of the SARS-CoV spike glycoprotein reveal a prerequisite conformational state for receptor binding
Density-functional exchange-energy approximation with correct asymptotic behavior
A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu
The S66x8 benchmark for noncovalent interactions revisited: explicitly correlated ab initio methods and density functional theory
A post-Hartree-Fock model of intermolecular interactions
Differential stabilization of adenine quartets by anions and cations
Physical origin of selectivity in ionic channels of biological membranes
Towards an order-N DFT method

HIGHLIGHTS
• The central helix domain (CH) of the spike (S) proteins shows a funnel-like polar structure
• Peculiar glutamine sidechain patterns have been found within the CHs of the cryo-EM structures of both SARS-CoV-1 and SARS-CoV-2 S-proteins
• Ion-binding motifs are superimposable on the post-fusion state of the SARS-CoV-1 S-protein
• DFT calculations and subsequent comparison with the post-fusion X-ray crystal structure predict a favourable interaction between physiological electrolytes (Na⁺, K⁺ and Cl⁻) and the S-protein funnel

We thank the Netherlands Organization for Scientific Research (NWO) for financial support. We also thank Prof. Paolo Ruggerone and Dr. Giuliano Malloci from the University of Cagliari for insightful discussions about the web server algorithms adopted in the present study.