key: cord-0985912-7c64krh6 authors: Sharma, Divya; Rawat, Puneet; Janakiraman, Vani; Gromiha, M. Michael title: Elucidating important structural features for the binding affinity of spike ‐ SARS‐CoV‐2 neutralizing antibody complexes date: 2021-11-17 journal: Proteins DOI: 10.1002/prot.26277 sha: b74704145cf4248fc84e09c3da8b7064462efc0c doc_id: 985912 cord_uid: 7c64krh6 The coronavirus disease 2019 (COVID‐19) has affected the lives of millions of people around the world. In an effort to develop therapeutic interventions and control the pandemic, scientists have isolated several neutralizing antibodies against SARS‐CoV‐2 from the vaccinated and convalescent individuals. These antibodies can be explored further to understand SARS‐CoV‐2 specific antigen–antibody interactions and biophysical parameters related to binding affinity, which can be utilized to engineer more potent antibodies for current and emerging SARS‐CoV‐2 variants. In the present study, we have analyzed the interface between spike protein of SARS‐CoV‐2 and neutralizing antibodies in terms of amino acid residue propensity, pair preference, and atomic interaction energy. We observed that Tyr residues containing contacts are highly preferred and energetically favorable at the interface of spike protein–antibody complexes. We have also developed a regression model to relate the experimental binding affinity for antibodies using structural features, which showed a correlation of 0.93. Moreover, several mutations at the spike protein–antibody interface were identified, which may lead to immune escape (epitope residues) and improved affinity (paratope residues) in current/emerging variants. Overall, the work provides insights into spike protein–antibody interactions, structural parameters related to binding affinity and mutational effects on binding affinity change, which can be helpful to develop better therapeutics against COVID‐19. 
An in-depth analysis of virus-host interactions is essential to understand the pathophysiology of infection and to develop therapeutic interventions. 6 Studies on SARS-CoV-2 protein interaction networks have revealed various targets for drug discovery. 7, 8 The deep mutational scanning study by Greaney et al. 9 assessed all possible single amino acid variants in the spike protein and provided immune escape maps for mutations in the presence of antibodies. They applied these maps to five potent neutralizing antibodies (COV2-2094, COV2-2165, COV2-2479, COV2-2050, and COV2-2499) isolated from SARS-CoV-2 convalescent patients and found K378E, K378N, E484K, G446D, and Q498R to be immune escape mutations. Another study detected variants that escaped either neutralizing SARS-CoV-2 mAbs or convalescent plasma and observed that the mutations S477N and E484K rank prominently among mAb escape mutations. 10 Although these mutational studies have significantly improved our knowledge of immune escape variants, they lack a large-scale analysis of the biophysical factors that lead to decreased binding or loss of binding. The analysis of SARS-CoV-2 neutralizing antibodies would provide better insights, allowing the engineering of more potent antibodies for current/emerging strains. 11-13 In this work, we have analyzed the interface of SARS-CoV-2 spike protein and neutralizing antibody complexes with respect to ACE2 binding to understand the residue propensity, pair preference, and atomic interaction energy. Tyr residues show the highest propensity, and YY pairs have the highest pair preference at the interface of spike-antibody complexes. Further, we use structural features, including atomic interactions, interface area, surrounding hydrophobicity, hydrophobic interactions, and aromatic-aromatic interactions, to develop a regression model that relates these features to the binding affinity ($K_D$) of each antibody.
The comprehensive mutational study of the residues at the spike-antibody interface revealed the important sites in the spike protein at which mutations decrease binding affinity. The results agree with experimental studies performed on the selected neutralizing antibodies C121, C144, and C135. 14 We extended the analysis to the residues at the antibody interface to identify mutations that may improve the affinity of neutralizing antibodies. The analysis will be helpful to (a)

The experimentally reported structures of 77 SARS-CoV-2 neutralizing antibodies in complex with the spike protein were obtained from the literature. The dataset was further screened based on (a) structure resolution (the best resolution among the structures available for a given antibody complex), (b) a single antibody per structure, and (c) sequence identity (<95% for the whole sequence and <80% for the CDRH3 region). The final antibody dataset contained 29 spike protein-antibody complexes (Table 1). The ACE2-bound spike protein structure (PDB id: 6M0J) was used as a reference for comparison. The residues at the interface of the spike protein and antibody were identified using a distance cutoff of 4 Å between any pair of heavy atoms. For structural analysis, we removed heteroatom coordinates from the PDB files and retained one subunit of the spike protein RBD and the antibody (heavy and light chains). A set of 983 heterodimer complexes was collected from the PDB database 15 to compare the propensities of interface residues. An overview of the study is presented in Figure 1.

The experimental binding affinities ($K_D$) were obtained from the literature for 24 spike-antibody complexes. The binding free energy (ΔG) was calculated from $K_D$ using the following equation:

$$\Delta G = RT \ln K_D \qquad (1)$$

where R = 8.31 J/mol/K is the gas constant and T is the temperature (298 K). The experimental ΔG values are presented in Table 1.
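The two numerical steps described above, converting $K_D$ to ΔG and selecting interface residues with a 4 Å heavy-atom cutoff, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the relation ΔG = RT ln $K_D$ is the standard thermodynamic form, the conversion 1 kcal = 4184 J is assumed for reporting ΔG in kcal/mol, the cutoff is treated as inclusive, and the function names and the `(residue_label, (x, y, z))` atom representation are hypothetical rather than taken from the study.

```python
import math

# Gas constant from the text (8.31 J/mol/K), converted to kcal/mol/K.
# The factor 4184 J per kcal is an assumption made for this sketch.
R_KCAL_PER_MOL_K = 8.31 / 4184.0


def delta_g_from_kd(kd_molar, temperature_k=298.0):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("K_D must be a positive concentration in mol/L")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


def interface_residue_pairs(atoms_a, atoms_b, cutoff=4.0):
    """Residue pairs that have at least one heavy-atom pair within `cutoff` Å.

    Each atom is a hypothetical (residue_label, (x, y, z)) tuple; hydrogens
    and heteroatoms are assumed to have been removed beforehand.
    """
    pairs = set()
    for res_a, xyz_a in atoms_a:
        for res_b, xyz_b in atoms_b:
            # Inclusive cutoff; whether the study used < or <= is not stated.
            if math.dist(xyz_a, xyz_b) <= cutoff:
                pairs.add((res_a, res_b))
    return pairs
```

With these definitions, a $K_D$ of 1 nM corresponds to roughly −12.3 kcal/mol at 298 K. The brute-force double loop is adequate for a single RBD-antibody interface; a spatial index would be the usual choice for larger systems.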
We calculated the propensity of each of the 20 amino acids to be present at the interface ($P_{interface}$) using the following equation 16 :

$$P_{interface}(i) = \frac{n_{interface}(i)/N(i)}{n_{interface}/N} \qquad (2)$$

where $n_{interface}(i)$ and $N(i)$ are the numbers of residues of type $i$ at the interface and in the protein, respectively, $n_{interface}$ is the total number of residues at the interface, and $N$ is the total number of residues in the protein.

The pair preference (Pair[i,j]) for the spike-antibody residue pairs at the interface was computed as described previously 17 from the following quantities: $N_{i,j}$, the number of interacting residue pairs of type $i$ in spike and type $j$ in antibodies, and $\sum N_i$ and $\sum N_j$, the total numbers of residues of type $i$ in spike and type $j$ in antibodies, respectively. Here, $i$ and $j$ stand for the interface residues in spike and antibodies, respectively.

The atomic interaction free energy ($E_{inter}$) between interface residues of the spike and antibody was calculated using the AMBER potential. 18 It is given as Equation (4) 17 :

$$E_{inter} = \sum_{i,j}\left[\left(\frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}\right) + \frac{q_i q_j}{\varepsilon\, r_{ij}}\right] \qquad (4)$$

where $A_{ij} = \varepsilon^{*}_{ij}(R^{*}_{ij})^{12}$ and $B_{ij} = 2\varepsilon^{*}_{ij}(R^{*}_{ij})^{6}$; $R^{*}_{ij} = R^{*}_{i} + R^{*}_{j}$ and $\varepsilon^{*}_{ij} = (\varepsilon^{*}_{i}\varepsilon^{*}_{j})^{1/2}$; $R^{*}$ and $\varepsilon^{*}$ are, respectively, the van der Waals radius and well depth, $q_i$ and $q_j$ are, respectively, the charges of atom $i$ in spike and atom $j$ in antibodies, $r_{ij}$ is the distance between them, and $\varepsilon$ is the dielectric constant.

Accessible surface areas of the RBD region in the spike protein ($S_E$), the antibody ($S_{H+L}$), and the entire complex ($S_{complex}$) were calculated by rolling a water molecule of radius 1.4 Å on the protein/complex surface, as described in our previous study. 19 Further, the interface area ($S_{interface}$) was calculated using Equation (5):

$$S_{interface} = S_E + S_{H+L} - S_{complex} \qquad (5)$$

FIGURE 1 Overview of the workflow

The surrounding hydrophobicity of residues in spike protein-antibody complexes was computed using the following equation 20 :

$$H_p(i) = \sum_{j=1}^{20} n_{ij} h_j \qquad (6)$$

where $H_p(i)$ is the surrounding hydrophobicity of the $i$th residue of the protein, $n_{ij}$ is the total number of surrounding residues of type $j$ around residue $i$ within a distance of 8 Å (between C$_\alpha$ atoms), and $h_j$ is the hydrophobicity value for residue type $j$ (in kcal/mol) obtained from thermodynamic transfer experiments.
21, 22 The average hydrophobicity index ($H_p$) was calculated for the interface residues in each spike-antibody complex using PDBparam. 23 In addition, we computed the interaction energy and number of contacting residues using PRODIGY, 24 and various types of atomic interactions using the PIC (Protein Interactions Calculator) server. 25 The details of all the features for the spike-neutralizing antibody complexes considered in the present study are presented in Table S1.

We developed multiple regression equations to relate structure-based features to the binding affinity of spike-antibody complexes. The model is defined as

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \epsilon \qquad (7)$$

where $i$ indexes the observations, $y_i$ is the dependent variable (binding affinity), $x_{i1}, \ldots, x_{ip}$ are the structure-based parameters, $\beta_0, \beta_1, \ldots, \beta_p$ are the regression coefficients, and $\epsilon$ is the error term of the model. A systematic forward feature selection approach was used to select the optimum number of features with the best performance. The regression model was evaluated using the Pearson correlation coefficient, which measures the strength of the relationship between predicted and experimental values, and the mean absolute error (MAE), which is the average absolute difference between predicted and experimental affinity values. The model was further validated using a jackknife test, in which a regression equation is developed using (n − 1) data points and used to predict ΔG for the remaining data point, and the procedure is repeated for every data point.

Changes in affinity and stability upon mutation of the interface residues (paratope and epitope) for all antibody complexes were obtained using sequence- and structure-based methods. The ProAffiMuSeq server predicts the change in protein-protein binding affinity upon mutation using sequence-based features and functional class. 26 mCSM relies on graph-based signatures to study missense mutations and predicts changes in stability and affinity.
27 The CUPSAT prediction model uses amino acid-atom potentials and torsion angle distributions to assess the amino acid environment of the mutation site and predicts the change in stability upon mutation. 28 The FoldX software uses an empirical function to evaluate the effect of mutations on the stability, interaction, folding, and dynamics of proteins. 29 For FoldX, the "RepairPDB" command was used to repair the PDB files, the residues at the interface of the spike protein and antibody were mutated systematically using the "BuildModel" command, the interaction energy was calculated using the "AnalyseComplex" command, and the stability was calculated using the "Stability" command. The change in interaction energy was then calculated as the difference between the mutant and wild-type values. 30

3 | RESULTS AND DISCUSSION

To identify the residues that occur with high frequency at the spike-antibody interface, we calculated the propensity of interface residues (Equation (2)). The propensity of Tyr was observed to be the highest at the interface of spike-antibody complexes (Figure 2). Interestingly, Tyr also occurs predominantly at the interfaces of antigen-antibody 34 and other protein-protein complexes. 35 It has been reported that tyrosine residues are exceptionally versatile for mediating contacts at interfaces. 36 Nonpolar-π and polar-π interactions are also preferred, and the polar-π interactions are energetically significant for protein folding and function. 39 A previous study on 200 antibody-antigen complexes has also shown that Tyr residues are preferred in both the epitope and paratope regions. 40 In our previous mutational study, we also observed that Tyr residues in the ACE2-binding regions of the spike protein are highly conserved, along with Gly residues, in all ACE2-binding coronaviruses. 41 Hence, Tyr residue interactions are potentially important for developing highly specific antibodies or small drug molecules against SARS-CoV-2.
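The propensity calculation behind this result (Equation (2)) can be illustrated with a short Python sketch. It assumes the propensity of residue type i is the fraction of type-i residues found at the interface, divided by the overall fraction of residues at the interface, which is the form implied by the definitions of $n_{interface}(i)$, $N(i)$, $n_{interface}$, and $N$ in the methods. The function name and the one-letter sequence inputs are hypothetical, and the toy sequences in the usage note are not data from the study.

```python
from collections import Counter


def interface_propensity(protein_residues, interface_residues):
    """Per-residue-type interface propensity.

    Assumed form: (n_interface(i) / N(i)) / (n_interface / N), where the
    inputs are iterables of one-letter residue codes for the whole protein
    and for its interface residues (a subset of the protein).
    """
    total_counts = Counter(protein_residues)       # N(i)
    interface_counts = Counter(interface_residues)  # n_interface(i)
    n_total = sum(total_counts.values())            # N
    n_interface = sum(interface_counts.values())    # n_interface
    if n_interface == 0:
        raise ValueError("no interface residues supplied")
    overall_fraction = n_interface / n_total
    return {
        aa: (interface_counts[aa] / count) / overall_fraction
        for aa, count in total_counts.items()
    }
```

For a toy protein `"YYAAAAGGGG"` with interface residues `"YYA"`, the function returns a propensity above 1 for Tyr (enriched at the interface), below 1 for Ala, and 0 for Gly. Values above 1 indicate a residue type that is over-represented at the interface relative to its abundance in the protein.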
Of the 20 most preferred pairs at the spike-antibody interface (Table S2), 17 contain hydrophobic residues (including 9 hydrophobic-hydrophobic interactions), and these hydrophobic residues play a vital role at protein-protein interfaces. 42, 43 It has been shown that the dominance of hydrophobic contacts between SARS-CoV-2 and ACE2 enhances the binding affinity of SARS-CoV-2 for ACE2. 44 In the heterodimer dataset, CC is the most preferred pair, followed by KD, WI, RD, and VF; none of these pairs contains Tyr, in contrast to the spike-antibody interface. It is important to note that Cys has the lowest propensity at the neutralizing antibody interfaces, and therefore, such pairing is not preferred at the spike-antibody interface. The binding free energy values for the antibodies discussed in this work range from −9.7 to −16.3 kcal/mol (Table 1).

We also calculated the interface surface area and average surrounding hydrophobicity of the spike protein-ACE2 complex and the spike-antibody complexes. The surface area of the interacting region for antibodies ranged between 370 and 1300 Å² on the spike protein. The SARS-CoV-2 spike-ACE2 complex had a larger interface area (1860 Å²) than the antibody complexes. The contact surface area in the spike-antibody complexes shows a weak positive correlation with binding affinity (r = .13), which is consistent with previous studies. 47 Similarly, the average surrounding hydrophobicity of the interacting region in antibodies ranged between 10.24 and 14.84 kcal/mol, compared with 12.31 kcal/mol for ACE2. We have shown previously for coronaviruses that a more hydrophobic environment at the interface can improve the binding of the complex. 41 The surrounding hydrophobicity also correlates positively with the binding affinity (r = .26).

FIGURE 2 Propensity of interface residues for spike and antibody

FIGURE 3 Binding propensity of amino acid residues at the interfaces of ACE2, heterodimers, and neutralizing antibodies
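The way such structural features are combined into a multiple regression model and validated with a jackknife (leave-one-out) test, as outlined in the methods, can be sketched as follows. This is a minimal pure-Python illustration rather than the implementation used in the study: ordinary least squares is solved through the normal equations, forward feature selection is omitted, and all function names are hypothetical.

```python
import math


def fit_linear(features, targets):
    """Ordinary least squares; returns [b0, b1, ..., bp] (intercept first)."""
    rows = [[1.0] + list(x) for x in features]
    p = len(rows[0])
    # Augmented normal equations: (X^T X | X^T y).
    aug = [
        [sum(r[a] * r[b] for r in rows) for b in range(p)]
        + [sum(r[a] * y for r, y in zip(rows, targets))]
        for a in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear features?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(p):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * w for v, w in zip(aug[row], aug[col])]
    return [aug[k][p] / aug[k][k] for k in range(p)]


def predict(coeffs, x):
    """Evaluate b0 + b1*x1 + ... + bp*xp for one observation."""
    return coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], x))


def jackknife_predictions(features, targets):
    """Predict each observation from a model fitted on the other n - 1."""
    predictions = []
    for k in range(len(targets)):
        train_x = features[:k] + features[k + 1:]
        train_y = targets[:k] + targets[k + 1:]
        predictions.append(predict(fit_linear(train_x, train_y), features[k]))
    return predictions


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_absolute_error(predicted, observed):
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)
```

In use, `features` would be a list of per-complex feature tuples (for example interface area and average surrounding hydrophobicity) and `targets` the corresponding experimental ΔG values; `pearson_r` and `mean_absolute_error` applied to the output of `jackknife_predictions` give the validation metrics. With only 24 complexes, the number of selected features has to stay small for the leave-one-out fits to remain well conditioned.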
We performed mutational scanning of the epitope and paratope residues for each neutralizing antibody to identify mutants that improve the stability and binding affinity of the spike-antibody complex or vice versa. The analysis of changes in interaction energy upon point mutation showed that 85% of the residues in the epitope and paratope are important for binding, as mutating them reduces the binding affinity of the complex. At certain binding sites in the antibodies, however, mutation increases the binding affinity and stability of the paratope-RBD complex. The sites at which more than 10 amino acid substitutions improve affinity and stability are given in Table 3. These paratope residues could be considered for engineering more potent antibodies against COVID-19. The SARS-CoV-2 RBD is prone to many mutations that could escape neutralization. We identified several mutations in the RBD epitopes that decrease the binding affinity and stability of the antibodies (Table 3). The epitope residues at which more than 50% of mutations decrease the binding affinity and stability are F486, Y489, Q493, L455, and F456 (residues present in more than 50% of the antibodies). Mutations at these positions have also been reported to emerge upon exposure (co-incubation) to mAbs. 50

Note: (i) The residues shown in italics for paratope increase the binding affinity and bold residues increase the stability, while the residues shown for epitope decrease the binding affinity and stability. (ii) A residue is highlighted if two out of three methods satisfy the criterion that at least 50% of the mutations at that residue increase (for paratope) or decrease (for epitope) the affinity and stability. (iii) Notation for residues: wild-type residue followed by residue number.

COV2-2832. 9 decrease in binding affinity to antibodies in our study. We noticed that the residues K417, L452, and T478 within the epitopes are mutated frequently.
Further, mutations at these sites reduced the binding free energy (Table 3). Among the Gamma variant (P.1) mutations, 53 K417T and N501Y are observed frequently in the RBD of the spike protein, and these mutations decrease the binding affinity (Table 3). Our results are consistent with the observation that mutations in the variants of concern and variants of interest highlighted by WHO (Table 4) also decrease the binding free energy and stability. We illustrate the results with two examples: (a) E484K and (b) K417N. E484 in the spike protein of the complex (PDB: 7BZ5) forms hydrogen bonds with Y100 of the antibody heavy chain (Figure 5A); the E484K mutation disrupts these interactions (Figure 5B) and reduces the binding affinity by 1.83 kcal/mol. K417, on the other hand, forms hydrogen bonds and cation-π interactions with Y33 and Y52 of the antibody heavy chain and hydrogen bonds with N92 of the antibody light chain (Figure 5C). The K417N mutation abolishes the interactions with Y33 and N92 (Figure 5D) and decreases the binding affinity by 1.23 kcal/mol.

Our study revealed that Tyr is the key residue involved in the binding of the spike and antibodies, which contributes to aromatic and hydro-

The data that support the findings of this study are available from the corresponding author upon reasonable request.
1. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
2. A human monoclonal antibody blocking SARS-CoV-2 infection
3. A neutralizing human antibody binds to the N-terminal domain of the spike protein of SARS-CoV-2
4. SARS-CoV-2 neutralizing antibody structures inform therapeutic strategies
5. Monoclonal antibodies for the S2 subunit of spike of SARS-CoV-1 cross-react with the newly-emerged SARS-CoV-2
6. The current landscape of coronavirus-host protein-protein interactions
7. A SARS-CoV-2-human protein-protein interaction map reveals drug targets and potential drug-repurposing
8. Network-based drug repurposing for novel coronavirus 2019-nCoV/SARS-CoV-2
9. Complete mapping of mutations to the SARS-CoV-2 spike receptor-binding domain that escape antibody recognition
10. Identification of SARS-CoV-2 spike mutations that attenuate monoclonal and serum antibody neutralization
11. Neutralizing antibody responses to SARS-CoV-2 in symptomatic COVID-19 is persistent and critical for survival
12. Structural basis for broad sarbecovirus neutralization by a human monoclonal antibody
13. Resistance of SARS-CoV-2 variants to neutralization by monoclonal and serum-derived polyclonal antibodies
14. Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants
15. RCSB protein data bank: powerful new tools for exploring 3D structures of biological macromolecules for basic and applied research and education in fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences
16. Sequence and structural features of binding site residues in protein-protein complexes: comparison with protein-nucleic acid complexes
17. Energy based approach for understanding the recognition mechanism in protein-protein complexes
18. A second generation force field for the simulation of proteins, nucleic acids, and organic molecules
19. Exploring antibody repurposing for COVID-19: beyond presumed roles of therapeutic antibodies
20. Hydrophobic character of amino acid residues in globular proteins
21. The solubility of amino acids and two glycine peptides in aqueous ethanol and dioxane solutions
22. Amino acid properties and side-chain orientation in proteins: a cross correlation approach
23. PDBparam: online resource for computing structural parameters of proteins
24. PRODIGY: a web server for predicting the binding affinity of protein-protein complexes
25. PIC: protein interactions calculator
26. ProAffiMuSeq: sequence-based method to predict the binding free energy change of protein-protein complexes upon mutation using functional classification
27. mCSM: predicting the effects of mutations in proteins using graph-based signatures
28. CUPSAT: prediction of protein stability upon point mutations
29. FoldX 5.0: working with RNA, small molecules and a new graphical interface
30. The FoldX web server: an online force field
31. Cross-neutralization of a SARS-CoV-2 antibody to a functionally conserved site is mediated by avidity
32. Structural basis for neutralization of SARS-CoV-2 and SARS-CoV by a potent therapeutic antibody. Science
33. Cross-neutralization of SARS-CoV-2 by a human monoclonal SARS-CoV antibody
34. Local and global anatomy of antibody-protein antigen recognition
35. Protein functional epitopes: hot spots, dynamics and combinatorial libraries
36. The importance of being tyrosine: lessons in molecular recognition from minimalist synthetic binding proteins
37. Structural basis of receptor recognition by SARS-CoV-2
38. Protein subunit interfaces: heterodimers versus homodimers
39. Exploring and exploiting polar-π interactions with fluorinated aromatic amino acids
40. Use of amino acid composition to predict epitope residues of individual antibodies. Protein Eng Des Sel
41. Why are ACE2 binding coronavirus strains SARS-CoV/SARS-CoV-2 wild and NL63 mild? Proteins: Struct Funct Genet
42. Surface, subunit interfaces and interior of oligomeric proteins
43. Hydrophobic folding units at protein-protein interfaces: implications to protein folding and to protein-protein association
44. Dynamics of the ACE2-SARS-CoV/SARS-CoV-2 spike protein interface reveal unique mechanisms
45. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
46. Potent binding of 2019 novel coronavirus spike protein by a SARS coronavirus-specific human monoclonal antibody
47. Protein-protein interactions: general trends in the relationship between binding affinity and interfacial buried surface area
48. Aromatic-aromatic interaction: a mechanism of protein structure stabilization
49. Protein-protein binding affinity prediction from amino acid sequence
50. SARS-CoV-2 variants, spike mutations and immune escape
51. Affinity maturation of SARS-CoV-2 neutralizing antibodies confers potency, breadth, and resilience to viral escape mutations
52. Reduced sensitivity of SARS-CoV-2 variant Delta to antibody neutralization
53. Genomics and epidemiology of the P.1 SARS-CoV-2 lineage in Manaus

Additional supporting information may be found in the online version of the article at the publisher's website.