key: cord-0790980-szq24qlv
authors: Williams, Alexander H.; Zhan, Chang-Guo
title: Fast Prediction of Binding Affinities of the SARS-CoV-2 Spike Protein Mutant N501Y (UK Variant) with ACE2 and Miniprotein Drug Candidates
date: 2021-04-21
journal: J Phys Chem B
DOI: 10.1021/acs.jpcb.1c00869
sha: 501ee5f2671c73ce79ef2ea5b5d3630423b9b0d6
doc_id: 790980
cord_uid: szq24qlv

[Image: see text] A recently identified variant of SARS-CoV-2 virus, known as the United Kingdom (UK) variant (lineage B.1.1.7), has an N501Y mutation on its spike protein. SARS-CoV-2 spike protein binds with angiotensin-converting enzyme 2 (ACE2), a key protein for the viral entry into the host cells. Here, we report an efficient computational approach, including the simple energy minimizations and binding free energy calculations, starting from an experimental structure of the binding complex along with experimental calibration of the calculated binding free energies, to rapidly and reliably predict the binding affinities of the N501Y mutant with human ACE2 (hACE2) and recently reported miniprotein and hACE2 decoy (CTC-445.2) drug candidates. It has been demonstrated that the N501Y mutation markedly increases the ACE2-spike protein binding affinity (K(d)) from 22 to 0.44 nM, which could partially explain why the UK variant is more infectious. The miniproteins are predicted to have ∼10,000- to 100,000-fold diminished binding affinities with the N501Y mutant, creating a need for design of novel therapeutic candidates to overcome the N501Y mutation-induced drug resistance. The N501Y mutation is also predicted to decrease the binding affinity of a hACE2 decoy (CTC-445.2) binding with the spike protein by ∼200-fold. This convenient computational approach along with experimental calibration may be similarly used in the future to predict the binding affinities of potential new variants of the spike protein.

The novel SARS-CoV-2 virus that causes the well-known COVID-19 disease is responsible for the deaths of over one million people, with over 90 million confirmed cases. 1 SARS-CoV-2 enters host cells via its spike protein binding to angiotensin-converting enzyme 2 (ACE2). With such widespread infection, local variants of the virus have begun to be identified, with some showing increased infectivity. 2 One variant of SARS-CoV-2, that has gained national prominence, is known as the United Kingdom (UK) variant, early analysis of which has revealed a nearly 70% increase in infectivity when compared to original SARS-CoV-2 virus identified in 2019. 3 Of note with the UK variant is an N501Y mutation on the SARS-CoV-2 spike protein, which is responsible for the virus' identification and attachment to the target host cells. This mutation manifests itself on the receptor-binding domain (RBD) region of the spike protein, a change that has direct consequences on the binding of the protein with ACE2. This mutation has been identified not only within the UK variant of the virus but has arisen independently in South Africa (lineage B.1.351) and Brazil (lineage P.1), denoting this mutation's effect on transmissibility. 4−6 The amino-acid residue N501 of the SARS-CoV-2 spike protein is located directly on the interface between these two proteins ( Figure 1A,B) , and thus, any mutation on N501 has potential implications for any therapeutic that relies on the spike protein as its vector. Promising therapeutic candidates, such as miniprotein inhibitors, 7 have been reported with picomolar activities against the SARS-CoV-2 spike protein. However, these miniproteins mimic the α-helix binding domain of the ACE2 protein to bind with the same interface of the spike protein as ACE2 and are therefore not immune to the potential changes that the N501Y mutation on the spike protein might cause. With the increase in infectivity that the UK SARS-CoV-2 variant displays, it is essential to know whether the N501Y mutation on the spike protein has compromised the binding affinities of the spike protein-targeted therapeutics that are currently being developed.

To investigate the potential change in binding affinity associated with the N501Y mutation, we have modeled both the SARS-CoV-2 spike protein and its N501Y mutant binding with wild-type human ACE2 (hACE2) and a previously reported T27Y/L79T/N330Y mutant of ACE2 (known as ACE2.v2.4) which has a 40-fold increased binding affinity with the wild-type spike protein of SARS-CoV-2. 8 Additionally, we have modeled the wild-type and N501Y mutant spike proteins with a truncated ACE2 peptide to investigate the potential impacts of the N501Y residue change of the spike protein on ACE2 mimicking therapeutics such as the aforementioned miniprotein drug candidates, demonstrating that the N501Y mutation significantly improves binding affinity to soluble ACE2 derivatives but diminishes affinity to miniprotein drug candidates. The obtained binding and affinity data help to understand why the UK variant is more infectious and provide new insights concerning rational drug design targeting the variant.

The increasing widespread availability of high-resolution cryoelectron microscopy (EM) structures of the spike/ACE2 complex 9 have precluded the need to perform homology modeling from previously obtained coronavirus spike protein structures. Recent studies of protein/ligand, protein/nucleotide, and protein/protein binding systems have also revealed methodologies to accurately predict binding affinity without the need for long-timescale molecular dynamics (MD) simulations, instead only utilizing energy-minimized structures; even with residue mutations introduced, these energyminimized X-ray crystal or cryo-EM structures still produced results consistent with in vitro experimental data. 10−15 These methodologies allow for reliable and computationally efficient studies of this recently discovered variant of SARS-CoV-2 spike protein as long-timescale MD simulations of this complex could be subject to artifacts introduced when simulating the membrane-bound protein with periodic conditions. 16−20 In general, for predicting the mutation-caused shift of the protein/protein binding free energy, it should be reasonable to conceptually consider the amino-acid residue change as a "minor change" or "perturbation" to the entire protein/protein binding complex and, thus, computationally model the "perturbed" complex (i.e., the complex after the residue change) structure starting from the energy-minimized structure of the "unperturbed" complex (i.e., the complex prior to the residue change). Therefore, the energy-minimized structures of both the "perturbed" and "unperturbed" complexes were used, in this study, to perform the molecular mechanics-Poisson− Boltzmann surface area (MM-PBSA) binding free energy calculations to predict their difference in the binding free energy as we did previously for predicting mutation-caused binding free energy shifts in other protein/protein binding systems. 21 Notably, described here is a simplified computational approach for fast and reliable computational prediction of the mutation-caused binding free energy shifts, as compared to the popularly used free-energy perturbation (FEP) approach based on extensive MD simulations. 22−43 Usually, the MD simulations may help to appropriately account for the conformational dynamics of the binding structures, making the FEP simulation methods very attractive for use in computational prediction of relative binding free energies. On the other hand, extensive MD simulations sometimes induce artifacts as noted above. Therefore, when a reliable experimental structure of the "unperturbed" binding system is available, the simple energy minimization (without MD simulation) of the experimental structure will allow us to focus on the minor structural changes associated with the amino-acid substitution and their effects on the binding free energy without worrying about any computational artifacts associated with the imperfect force fields used in the MD simulations. The simple energy minimizations and MM-PBSA binding free energy calculations starting from a reliable experimental structure of the "unperturbed" binding system will ensure a qualitatively reasonable binding structure of the "perturbed" system and the corresponding binding free energy shift. The same simplified computational approach has led to successful design and discovery of a promising mutant of a long-acting protein drug candidate having a markedly improved binding affinity with its protein target. 21 This is why the same computational protocol (including the energyminimization and MM-PBSA calculation) 21 was used in this study to predict the mutation-caused shifts of the binding free energies. In brief, using the cryo-EM structure (RCSB: 7KMB) of the spike/ACE2 complex, after the energy-minimization, the reported mutant of the spike mutant (N501Y) or ACE2 mutant (ACE2v2 and ACE2.v2.4) in the spike/ACE2 complex was reconstructed using the PyMol mutation tool to replace each residue with the reported mutations (T27Y/L79T/ N330Y/A386L for ACE2.v2 and T27Y/L79T/N330Y for ACE2.v2.4). 8, 9, 44 These structures were energy-minimized using the Sander module of the Amber20 package, and their MM-PBSA binding free energies (ΔG PB ) were evaluated using the MM-PBSA.py module of Amber20, as previously reported. 45−47 Additionally, to estimate the change in binding free energy with the miniprotein inhibitors, the available crystal structures of miniproteins LCB1 (RCSB: 7JZU) 7 and LCB3 (RCSB: 7JZM) 7 were energy-minimized, and their binding free energies with both the wild-type spike protein and the N501Y variant were estimated using the same methodology. Further, as shown previously, 21,47 the MM-PBSA calculations overestimated the absolute binding free energies. Nevertheless, we have previously reported a methodology that utilizes known in vitro experimental data to correct these overestimations using a linear regression equation, allowing for both accurate and precise protein/ligand and protein/protein binding affinity predictions. 21,47,48 Therefore, in this study, the previously reported experimental binding affinities 8 for the wild-type spike protein of SARS-CoV-2 with wild-type ACE2, ACE2.v2, and ACE2.v2.4 (along with the binding affinities of the N501Y variant spike protein with both wild-type ACE2 and ACE2.v2.4, which were published while this article was under review 49 ) were used to calibrate the calculated binding free energies. Thus, we were able to obtain to empirical linear correlation relationship for prediction of binding free energies (see below).

The obtained binding structures for the wild-type ACE2 binding with the wild-type and N501Y spike proteins are depicted in Figure 1 . The computational and experimental data correlation graph is depicted in Figure 2 , and the binding modes for the miniproteins LCB1 and LCB3 are depicted in Figure 3 . The obtained energetic data are summarized in Tables 1 and 2 . Based on the directly calculated MM-PBSA binding free energies and the empirical correction analysis using the available experimental binding affinities summarized in Table 1 , we obtained the following empirical linear correlation relationship

Interestingly, in eq 1 (which was updated using the newly reported experimental data 49 for the N501Y mutant of SARS-CoV-2 spike, published while our article was under review), the correction coefficient of the ΔG PB value is 0.265 (see Figure 2 for the correlation between the directly calculated ΔG PB values and the experimental binding free energies), which is very close to our previously reported correction coefficient (0.3057) obtained for the cannabinoid 2 (CB 2 ) receptor binding with its ligands using the methodology. 47 Equation 1 was then used to correct all the ΔG PB values obtained, creating an estimated binding affinity (K d ) for the N501Y mutant spike protein at 0.44 nM (which is very close to the experimentally determined binding affinity of 0.8 nM, 49 reported while our article was under review). Summarized in Table 1 are the directly calculated MM-PBSA binding free energies (ΔG PB ), the corrected ones (ΔG corr ), and the corresponding experimental/predicted K d values.

The cause behind the increase in binding affinity from the wild-type spike protein to the N501Y mutant is clear when looking at interactions between the ACE2 and the spike protein when the N501Y mutation is introduced. Shown in with the wild-type spike protein; while the introduction of N330Y still contributes hydrophobic interactions with G496, the loss of the A386L mutation results in the loss of the strengthened hydrogen bonds between R403 and E37 seen in ACE2.v2. Figure 1 are the energy-minimized binding structures. Whereas N501 in the wild-type spike protein hydrogen-bonds with no ACE2 residues ( Figure 1C ), the hydroxyl group on the Y501 side chain in the N501Y mutant can hydrogen-bond with both the amine group on the K353 side chain of ACE2 and intramolecularly with the amine group on the Q498 side chain ( Figure 1D ). This additional intermolecular hydrogen bond accounts for the increased binding affinity for the N501Y mutant with ACE2.

While the introduced mutations in the engineered ACE2 mutants ACE2.v2 and ACE2.v2.4 do not directly introduce hydrogen bonding in the RBD region between ACE2 and wildtype spike, they do exert steric pressure on nearby residues, strengthening already existing hydrogen bonds. This can especially be seen with the hydrogen bonding between the guanidinium group of R403 of the spike protein and the carboxylic acid of E37 within ACE2. The A386L mutation introduces a sterically bulkier residue next to E37, pressing it closer to R403. This effect is lost between ACE2.v2 and ACE2.v2.4 due to the lack of the A386L residue change ( Figure 3B,F) . The N330Y mutation has an effect similar to the A386L mutation, a sterically bulkier residue presses upon G496 in the spike protein, pressing it closer Y41 and strengthening its hydrogen bond ( Figure 3D,H) . Future versions of these engineered ACE2 proteins may wish to focus on residue changes that directly implicate new hydrogen bonds with either the wild-type or N501Y spike protein.

The quantitative analysis the binding of the miniproteins with the wild-type spike protein is complicated by the lack of published K d values against the said spike protein by their authors. 7 This is primarily due to the authors' lack of more sensitive equipment for the bilayer interferometry (BLI) experiments, obfuscating the binding signal at miniprotein concentrations below 200 pM. 7 Additionally, the authors' use of viral neutralization to obtain an IC 50 value for these miniproteins does not reflect the actual binding affinity due to the long incubation time with the target cells (30 h). However, a rough approximation of the K d can still be obtained from the BLI values above 200 pM for LCB1 and LCB3. With these approximations in place, a more quantitative estimation of the effects of the N501Y spike mutation can be performed, allowing for an estimation of the N501Y spike K d with these miniproteins.

While the N501Y mutation appears to increase the binding affinity against the wild-type and engineered ACE2 proteins, the same cannot be said of the miniprotein designs. The top performing miniproteins LCB1 and LCB3, both with subnanomolar dissociation constants, are reliant on the N501 residue in the wild-type spike protein ( Figure 4D,H) . This change to Y501 removes a strong hydrogen bond with E22 and D3 in LCB1 and LCB3, respectively, leading to a large ΔΔG change between the wild-type and N501Y spike proteins ( Table 2 ). These decreases in the binding affinity implicate a change that could increase the K d of these miniproteins from picomolar to potentially micromolar when using the free energy change equation ΔG exp = −RT ln(K d ). This loss in affinity is also seen in the recently published hACE2 decoy, CTC-445.2, revealing an urgent need to find COVID therapeutics that are resistant to this residue change ( Table  2 and Figure S1 ). 50 

In conclusion, through the use of MM-PBSA binding free energy estimation in correlation with known experimental 50 c Directly calculated binding free energy of the system with the given protein and the wild-type spike protein without empirical correction. d Directly calculated binding free energy of the system with the given protein and the N501Y spike protein without empirical correction. e Difference between wild-type and N501Y spike binding affinities for a given protein. f Experimental binding affinity converted to Gibbs binding free energy: ΔG exp = −RT ln(K d ).

g Calculated binding free energy of the protein/wild-type spike system (corrected from the directly calculated binding free energy ΔG PB ) using eq 1. h Calculated binding free energy of the protein/N501Y spike system (corrected from the directly calculated binding free energy ΔG PB ) using eq 1. i Difference between wild-type and N501Y spike corrected binding affinities for a given protein: ΔΔG corr = ΔG corr (N501Y) − ΔG corr (WT). j Predicted fold change of K d (from the binding affinity of a given protein with the wild-type spike to that with the N501Y mutant) converted from ΔΔG corr . k Predicted K d for a given protein binding with the N501Y spike using the original K d for wild-type spike and the predicted fold change.

The Journal of Physical Chemistry B pubs.acs.org/JPCB Article binding affinities of both wild-type and mutant versions of hACE2 proteins, we have been able to reasonably predict the binding affinities of the recently reported N501Y mutant of SARS-CoV-2 spike protein with both the hACE2 and the recently reported miniproteins. Particularly, the N501Y mutation has markedly increased the ACE2-spike protein binding affinity (K d ) from 22 nM (for the wild-type spike) to 0.4 nM (for the N501Y mutant spike), which could partially explain why the UK variant of SARS-CoV-2 (with N501Y mutation on the spike protein) is more infectious. However, this same increase in binding affinity is not seen within the miniprotein inhibitors (LCB1 and LCB3) of the spike protein due to the loss of a critical hydrogen bond between the spike and miniproteins. The N501Y mutation is also predicted to decrease the binding affinity of a hACE2 decoy (CTC-445.2) binding with the spike protein. These decreases in affinity for recently published therapeutic candidates implicates a dire need to redesign them to lessen their dependence on interactions with the N501 residue due to the prevalence of variants where the residue is changed. These convenient computational methodologies, including the convenient energy minimizations and MM-PBSA binding free energy calculations, starting from an experimental structure of the "unperturbed" binding complex along with experimental calibration of the calculated binding free energies, may be similarly used in the future to predict the binding affinities of potential new mutants of the spike protein with ACE2 and potential therapeutics/ therapeutic candidates targeting the spike protein. 

Every body counts: measuring mortality from the COVID-19 pandemic

Covid-19: New coronavirus variant is identified in UK

Early empirical assessment of the N501Y mutant strains of SARS-CoV-2 in the United Kingdom

Emergence and rapid spread of a new severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) lineage with multiple spike mutations in South Africa

Sixteen novel lineages of SARS-CoV-2 in South Africa

Resurgence of COVID-19 in Manaus, Brazil, despite high seroprevalence

De novo design of picomolar SARS-CoV-2 miniprotein inhibitors

Cryo-EM Structures of SARS-CoV-2 Spike without and with ACE2 Reveal a pH-Dependent Switch to Mediate Endosomal Positioning of Receptor-Binding Domains

Evaluating the performance of MM/PBSA for binding affinity prediction using class A GPCR crystal structures

Assessing the performance of MM/PBSA and MM/ GBSA methods. 8. Predicting binding free energies and poses of protein-RNA complexes

Assessing the performance of MM/PBSA and MM/GBSA methods

Prediction reliability of binding affinities and binding poses for protein-peptide complexes

Fast and accurate predictions of binding free energies using MM-PBSA and MM-GBSA

Predicting binding free energy change caused by point mutations with knowledge-modified MM/ PBSA method

Protein-protein interactions: scoring schemes and binding affinity

Molecular dynamics simulations of lipid bilayers: major artifacts due to truncating electrostatic interactions

The reliability of molecular dynamics simulations of the multidrug transporter P-glycoprotein in a membrane environment

Automated cryo-EM structure refinement using correlation-driven molecular dynamics

A computer perspective of membranes: molecular dynamics studies of lipid bilayer systems

Structure-Based Design and Discovery of a Long-Acting Cocaine Hydrolase Mutant with Improved Binding Affinity to Neonatal Fc Receptor for Treatment of Cocaine Abuse

Absolute binding free energy calculation and design of a subnanomolar inhibitor of phosphodiesterase-10

Identify potent SARS-CoV-2 main protease inhibitors via accelerated free energy perturbation-based virtual screening of existing drugs

Perturbation Simulation on Transition States and Redesign of Butyrylcholinesterase

Free Energy Perturbation (FEP) Simulation on the Transition States of Cocaine Hydrolysis Catalyzed by Human Butyrylcholinesterase and Its Mutants

Free Energy Perturbation Simulation on Transition States and High-Activity Mutants of Human Butyrylcholinesterase for (−)-Cocaine Hydrolysis

Computational Prediction of Mutational Effects on SARS-CoV-2 Binding by Relative Free Energy Calculations

Efficient drug lead discovery and optimization

Alchemical free energy methods for drug discovery: progress and challenges

Relative Binding Free Energy Calculations in Drug Discovery: Recent Advances and Practical Considerations

Accurate and reliable prediction of relative ligand binding potency in prospective drug discovery by way of a modern free-energy calculation protocol and force field

Advancing Drug Discovery through Enhanced Free Energy Calculations

Monte Carlo simulation of differences in free energies of hydration

Accurate Binding Free Energy Predictions in Fragment Optimization

Determination of solvation free energy using molecular dynamics with solute Cartesian mapping: An application to the solvation of 18-crown-6

Free energy calculation methods: a theoretical and empirical comparison of numerical errors and a new method qualitative estimates of free energy changes

Computational science new horizons and relevance to pharmaceutical design

FEP-guided selection of bicyclic heterocycles in lead optimization for non-nucleoside inhibitors of HIV-1 reverse transcriptase

Computer-aided discovery of anti-HIV agents

Dynamics and design of enzymes and inhibitors

Computer-aided molecular design

The statistical-thermodynamic basis for computation of binding affinities: a critical review

Statistical mechanics and molecular dynamics in evaluating thermodynamic properties of biomolecular recognition

Using PyMOL as a platform for computational drug design

py: An Efficient Program for End-State Free Energy Calculations

AMBER, a package of computer programs for applying molecular mechanics, normal mode analysis, molecular dynamics and free energy calculations to simulate the structural and energetic properties of molecules

Binding Modes and Selectivity of Cannabinoid 1 (CB1) and Cannabinoid 2 (CB2) Receptor Ligands

Computational determination of binding structures and free energies of phosphodiesterase-2 with benzo[1,4]diazepin-2-one derivatives

An engineered decoy receptor for SARS-CoV-2 broadly binds protein S sequence variants

The Journal of Physical Chemistry B pubs.acs.org/JPCB Article