key: cord-0876695-g1vnagrl authors: Narang, Dominic; James, D. Andrew; Balmer, Matthew T.; Wilson, Derek J. title: Protein Footprinting, Conformational Dynamics, and Core Interface-Adjacent Neutralization “Hotspots” in the SARS-CoV-2 Spike Protein Receptor Binding Domain/Human ACE2 Interaction date: 2021-04-01 journal: J Am Soc Mass Spectrom DOI: 10.1021/jasms.0c00465 sha: 79a51b211db2ca04c16a92dbeb41a474ecfd1238 doc_id: 876695 cord_uid: g1vnagrl The novel severe respiratory syndrome-like coronavirus (SARS-CoV-2) causes COVID-19 in humans and is responsible for one of the most destructive pandemics of the last century. At the root of SARS-CoV infection is the interaction between the viral spike protein and the human angiotensin converting enzyme 2 protein, which allows the virus to gain entry into host cells through endocytosis. In this work, we apply hydrogen–deuterium exchange mass spectrometry (HDX-MS) to provide a detailed view of the functional footprint and conformational dynamics associated with this interaction. Our results broadly agree with the binding interface derived from high resolution X-ray crystal structure data but also provide insights into shifts in structure and dynamics that accompany complexation, including some that occur immediately outside of the core binding interface. We propose that dampening of these “binding-site adjacent” dynamic shifts could represent a mechanism for neutralizing activity in a multitude of spike protein-targeted mAbs that have been found to specifically bind these “peripheral” sites. Our results highlight the unique capacity of HDX-MS to detect potential neutralization “hotspots” outside of the core binding interfaces defined by high resolution structural data. Coronaviruses are a family of pathogens first identified by Dr. June Almeida in 1966, 1 characterized physically by crown-like protrusions that have the appearance of a corona. 
Coronaviruses are invariably highly transmissible, and most cause mild illnesses associated with the common cold. However, SARS variants, such as the SARS-CoV-1 virus that arose in 2003, can cause acute onset respiratory illness with a substantial mortality rate, especially in elderly patients. In the case of SARS-CoV-1, the overall mortality rate was 10%, one of the highest ever recorded for a highly contagious respiratory virus, and upward of 50% in patients over 60. 2 Ironically, it was the high rate of mortality and morbidity that prevented SARS-CoV-1 from becoming a widespread pandemic, as severe symptoms frequently developed concomitantly with transmissibility, facilitating effective isolation of transmissible individuals. However, the subsequent novel coronavirus, SARS-CoV-2, which emerged in 2019, had the necessary qualities to create a widespread pandemic, most critically a high degree of asymptomatic or weakly symptomatic infection with significant transmissibility during the weakly symptomatic period. 3, 4 Coupled to still-substantial death rates of 0.5−1% of known infections (and again, much higher for populations over 60) and lingering effects even from mild cases, SARS-CoV-2 represents a near "perfect-storm" pandemic virus that has wreaked havoc on social and economic networks. At the molecular level, SARS-CoV infection is driven by a crucial interaction between the viral spike protein (homotrimers that give the coronavirus its "corona" in electron micrographs) and the human ACE2 protein, whose normal function is to catalyze the hydrolysis of the vasoconstrictor peptide angiotensin II. 5, 6 As determined in a 2.2 Å X-ray crystal structure published in 2004 (PDB 1R42), hACE2 is highly glycosylated, with sites at Asn-53, 90, 103, 322, 432, and 546, all likely to be high occupancy based on electron density, and has three disulfides (133−141, 344−361 and 530−542). 
7 hACE2 is anchored to the cell membrane through a short, single-pass transmembrane domain at its C-terminus, followed by a large extracellular sequence consisting of two domains: a collectrin-like domain and a zinc metalloproteinase domain. 7 Some constructs of hACE2, including the one used in the current study, are known to dimerize in vitro; however, this does not impact the thermodynamic properties of the hACE2/spike protein interaction. 8 One of the most challenging aspects of SARS-CoV-2 infection is that hACE2 is a widely expressed protein, occurring in multiple tissues including type II alveolar cells in the lung, enterocytes in the small intestine, arterial and venous endothelial cells, arterial smooth muscle cells, and cortical neurons among others. 9 It is this widespread expression that is responsible for the highly varied symptoms of SARS-CoV-2 infection, including enteric illness (infection of small intestine enterocytes), increased risk of stroke (arterial/venous cell infection), and loss of taste/smell (infection of cortical neurons and glia). 10 However, it is largely the prevalence of hACE2 in alveolar cells in the lung that drives SARS-CoV mortality. 11 SARS-CoV-2 spike protein is a large (1208 residue), heavily glycosylated polypeptide that forms a homotrimer in the viral capsid. Each monomer consists of two subunits (S1 and S2), which are generated from a single polypeptide, but linked noncovalently after proteolytic processing to generate the functional form. The key receptor binding domain (RBD) corresponds to residues 319−541, which fall within the S1 subunit. 12 Intact, this protein represents an insurmountable challenge to classical structural methods; however, in a demonstration of the power of the new high resolution cryo-EM technologies, the full length structure was determined to 2.8 Å within months of the global onset of COVID-19 (PDB 6VXX, 6VYB). 
13 One of the most valuable insights to arise from these full-length structures was the occurrence of an "open" and "closed" configuration of the RBD relative to the rest of the protein, where only the "open" configuration is able to efficiently bind hACE2. 13, 14 Important structures have also been determined by X-ray crystallography, most notably the SARS-CoV-2 spike protein RBD in complex with hACE2 (PDB 6LZG, 6M0J). 15 This wealth of structural data provides key insights into the molecular mechanisms of viral complexation and cell entry in SARS-CoV-2 infection and can be used as a structural basis for accelerated drug and vaccine development. What they generally do not provide is a detailed picture of the shifts in conformational dynamics that also play a critical role in complexation. Adding a dynamic dimension to structural data can greatly expand the insights generated, especially in understanding allosteric effects and the subtle dynamic changes that modulate binding affinity and control specificity. There are a limited number of analytical methods that can provide direct, structurally resolved insights into protein conformational dynamics. The technique most closely linked to structural methods is biophysical NMR, which is uniquely powerful in the sense that it can provide direct, site specific measurements of protein dynamics. 16 However, NMR suffers from a fundamental upper limit on analyte size (normally around 70 kDa, well below the size of the spike/hACE2 complex) that arises from slow rotational diffusion of large analytes. While there are some techniques that can mitigate this drawback somewhat, they are often challenging to implement. 17 Hydrogen−deuterium exchange mass spectrometry, in which the exchange of amide hydrogens on the protein with deuterium from solvent is measured by the increase in protein mass, is quickly becoming the most commonly used approach to measure conformational dynamics in protein interactions. 
18−22 The physical basis of the relationship between hydrogen−deuterium exchange and protein conformational dynamics is that the observed rate of exchange depends heavily on (i) the extent to which the amide is hydrogen bonded and (ii) the degree to which the amide is accessible to solvent. 18 Both of these attributes are directly impacted by conformational flexibility, linking HDX "deuterium uptake" measurements to conformational dynamics. In the current work, we use a "bottom up" HDX-MS workflow in which the deuterium labeling step is followed immediately by acidification (which quenches the exchange) and digestion of the labeled proteins by an acid protease. 22 The result is a set of peptides whose deuterium uptake status reflects the extent to which their backbone amides were available for exchange in the "native" protein, providing a degree of structural resolution to the HDX-MS experiment in segments of typically 5−10 amino acids in length. In this work, we examine the interaction between hACE2 and the SARS-CoV-2 Spike RBD using HDX-MS. Our results provide a footprint for the protein/protein interface that agrees with the X-ray crystal structure and also reveal changes in conformational dynamics in regions adjacent to the binding site that are the epitopes for a host of spike-neutralizing antibodies. Purified receptor binding domain of Spike protein (SPD-C52H3) and hACE2 receptor protein (AC2-H52H8) were obtained from Acro Biosystems. Monopotassium phosphate (P0662), dipotassium phosphate (P3786), guanidine hydrochloride (G7294), tris(2-carboxyethyl)phosphine hydrochloride (TCEP, C4706), [Glu1]-fibrinopeptide B human (GluFib, F3261), formic acid (33015) and deuterium oxide (151882) were purchased from Sigma-Aldrich. Acquity UPLC peptide CSH C18 analytical (186006934) and Vanguard Precolumn CSH C18 trap (186005303) columns were purchased from Waters. Immobilized protease XIII/pepsin (NBA2014002) column was purchased from NovaBioassays. 
Hydrogen−Deuterium Exchange Mass spectrometry (HDX-MS). HDX-MS experiments were performed using Waters ultraperformance liquid chromatography (nano Acquity UPLC) and Synapt G2-S mass spectrometer (Waters Corp., MA) coupled with LEAP PAL (Trajan Automation, NC) automation technology for sample preparation as reported previously. 23 The final (monomeric) concentration of protein in each sample was 2.5 μM. For undeuterated and deuterated experiments, 7.5 μL of protein samples were diluted in 32.5 μL of 10 mM potassium phosphate, 150 mM NaCl buffer (Sigma-Aldrich), pH 7.5, and deuteration buffer (10 mM potassium phosphate, 150 mM NaCl (Sigma-Aldrich), pD 7.5), respectively. Five different incubation time points were used for labeling with deuterium: 0.5, 2, 10, 30, and 60 min. After the dilution, the protein samples were quenched with 40 μL of ice-cold quench buffer (100 mM potassium phosphate, 7.5 M GdnHCl, 0.5 M TCEP (Sigma-Aldrich), pH 2.5, 0°C). The quenched samples were then mixed with an equal volume of 0.1% formic acid solution. One hundred microliters of the quenched samples were injected into the HDX module harboring the inline pepsin protease XIII column for protein digestion. After digestion, peptides were desalted and purified using the analytical column with a 7 min gradient of acetonitrile with 0.1% formic acid. The purified peptides were electrosprayed into and detected by the Waters Synapt G2-Si mass spectrometer (Waters Corp., MA) with a mass/charge (m/z) acquisition window of 300−1700. Intermittent infusion of GluFib (785.8426 m/z, Sigma-Aldrich) was used for lock mass correction. The peptides were identified from MSE analysis of 10 μM of RBD and 5.8 μM of hACE2 protein samples, and data were analyzed using the ProteinLynx Global Server (PLGS) software (Waters Corp., MA). The peptides identified in PLGS were then used in the DynamX 3.0 software (Waters Corp., MA). 
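The volumes quoted in the labeling and quench steps above fix a few derived quantities, such as the deuterium fraction during labeling and the share of each quenched sample that is injected. The following is a minimal sketch of that arithmetic; the function name and the `d2o_purity` parameter are illustrative assumptions (the text does not state the isotopic purity of the deuteration buffer), and only the volumes are taken from the protocol.

```python
# Volumes in microlitres, taken from the protocol text.
PROTEIN_UL = 7.5        # protein sample added to each reaction
LABEL_BUFFER_UL = 32.5  # H2O buffer or deuteration buffer
QUENCH_UL = 40.0        # ice-cold quench buffer
INJECTED_UL = 100.0     # volume injected into the HDX module


def labeling_summary(d2o_purity=1.0):
    """Derived quantities for one labeling reaction.

    d2o_purity is an assumed deuterium fraction of the deuteration
    buffer (1.0 = fully deuterated); it is not given in the text.
    """
    labeling_volume = PROTEIN_UL + LABEL_BUFFER_UL
    quenched_volume = labeling_volume + QUENCH_UL
    # The quenched sample is mixed with an equal volume of 0.1% formic acid.
    final_volume = 2 * quenched_volume
    return {
        "labeling_volume_ul": labeling_volume,
        "deuterium_fraction": d2o_purity * LABEL_BUFFER_UL / labeling_volume,
        "protein_dilution_factor": labeling_volume / PROTEIN_UL,
        "final_volume_ul": final_volume,
        "injected_fraction": INJECTED_UL / final_volume,
    }


if __name__ == "__main__":
    for name, value in labeling_summary().items():
        print(f"{name}: {value:.4g}")
```

Under the full-deuteration assumption, the labeling mixture is roughly 81% deuterated, which is one reason measured uptake values are lower than the theoretical maximum even before back-exchange.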
The deuterium uptake for each peptide was calculated by comparing the centroids of the mass envelopes of the deuterated samples versus the undeuterated sample. HDX data were analyzed by calculating and summing the difference in deuterium uptake for identical peptides between the two states, free and complex state of RBD and hACE2, at all the HDX time points. A back-exchange correction factor was not applied in the HDX data analysis because comparisons were made on the same peptides from both protein states. A difference of 1.5 Da and 3σ (3× standard deviation of a given peptide over the five time courses) was set as the threshold for statistical significance. Sequence Coverage and Differential HDX. The system explored in the current work is similar to that used in the X-ray crystal study of the SARS-CoV-2 spike protein/hACE2 complex reported recently, 15 corresponding to residues 18−740 of the human ACE2 protein (Uniprot Q9BYF1-1) and 319−537 of the SARS-CoV-2 spike protein (Uniprot QHD43416.1). A schematic depiction of the hACE2 domain structure is shown in Figure 1A with the region contained in the dashed box corresponding to the construct used. Our SARS-CoV-2 spike construct was limited specifically to the RBD (residues 319−541). Both proteins were HDX-labeled with glycosylations intact (represented by cartoon glycosyl groups in Figure 1), with the aim of making the measurements as reflective as possible of the hACE2/SARS-CoV-2 interaction in vivo. This approach greatly simplifies the experiment but does come at a cost to sequence coverage. 24 We were nonetheless able to achieve reasonable coverage of 77% (hACE2) and 78.5% (spike RBD) in differential measurements (Figure 1B,C), where peptides must be observed in both the "free" and "bound" data sets. Regions where coverage is conspicuously absent are, as expected, in the vicinity of known glycosylations (Figure 1B,C). 
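The centroid-based uptake calculation and the 1.5 Da / 3σ significance criterion described in the data analysis above can be sketched as follows. This is an illustrative reconstruction, not the DynamX implementation: the function names are hypothetical, and combining per-time-point standard deviations in quadrature is an assumed form of the error propagation, which the text does not spell out.

```python
import math


def centroid_mz(mz, intensity):
    """Intensity-weighted mean m/z of an isotopic envelope."""
    return sum(m * i for m, i in zip(mz, intensity)) / sum(intensity)


def uptake(deut_env, undeut_env, charge):
    """Deuterium uptake (Da) from (mz, intensity) envelopes of one charge state."""
    return charge * (centroid_mz(*deut_env) - centroid_mz(*undeut_env))


def summed_difference(free_uptake, bound_uptake):
    """Sum over time points of (bound - free); negative means protection."""
    return sum(b - f for f, b in zip(free_uptake, bound_uptake))


def is_significant(sum_diff, sds, da_cutoff=1.5, n_sigma=3.0):
    """Apply both criteria: |difference| > 1.5 Da and > 3 sigma.

    sds holds the standard deviations of every measurement entering the
    sum (both states, all time points). Adding them in quadrature is an
    assumption made for this sketch.
    """
    propagated_sd = math.sqrt(sum(s * s for s in sds))
    magnitude = abs(sum_diff)
    return magnitude > da_cutoff and magnitude > n_sigma * propagated_sd


if __name__ == "__main__":
    # Made-up uptake values (Da) at 0.5, 2, 10, 30 and 60 min.
    free = [1.0, 2.0, 3.0, 4.0, 5.0]
    bound = [0.5, 1.5, 2.5, 3.0, 4.0]
    diff = summed_difference(free, bound)
    print(diff, is_significant(diff, [0.1] * 10))
```

Because the comparison is between the same peptide in two states, any back-exchange loss largely cancels in the difference, which is the rationale given in the text for omitting a back-exchange correction.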
Average redundancy, which reflects the extent to which overlapping peptides are detected and provides additional certainty and spatial resolution to uptake difference measurements, was relatively high at 3.78 (hACE2) and 3.98 (spike). Importantly, peptides corresponding to all regions of the protein−protein interface as defined in the X-ray cocrystal structure were detected, including the crucial 31−41 and 353−357 regions of hACE2 and the hACE2 binding motif of spike protein (residues 437−508). All of these regions showed significantly reduced uptake in the bound state, with "significance" being defined as uptake differences exceeding 1.5 Da and 3σ of the sum of all time points, calculated with error propagation from n = 6 technical replicates per time point (n = 3 technical replicates per state). 25 Differences are heavily focused in regions within the interface; however, significant reductions in uptake are also detected in "interface-adjacent" regions. These "adjacent" changes in conformational dynamics may once have been dismissed as functionally irrelevant consequences of the binding interaction. However, there is increasing evidence that shifts in conformational dynamics occurring outside the core interface can modulate target recognition, binding specificity and even binding affinity in protein−protein interactions. 26 Detecting and characterizing these shifts can therefore reveal additional submolecular targets for new therapeutics, indicators of potential vaccine potency or mechanistic insights for known therapeutics that bind regions outside of the interface. For instance, several neutralizing antibodies for SARS-CoV-2 spike protein isolated from COVID-19 patients (e.g., REGN10954, 10986, 1933, and 10964 among others) 27 have HDX-MS measured epitopes that include interface-adjacent regions we identify here as being impacted by hACE2 complexation. HDX-MS Footprinting of the hACE2/SARS-CoV2-Spike Protein Interaction. 
X-ray cocrystals are usually thought of as the "gold standard" for mapping binding interfaces. 28 This is because the high-resolution nature of the data allows for the determination of each specific residue that is in close contact with a partner. In Figure 2A, we use the available crystal structure to map site specific contacts using a van der Waals interaction cutoff of 4.5 Å (applied to all potential heavy atom pairs) and a hydrogen bond distance cutoff maximum of 3.5 Å (applied to all potential H-bond donors). We define the densely highlighted region along the N-terminal helix of hACE2 and the center of the spike RBD as the "core interfacial region" (Figure 2A). The spatial resolution of a typical bottom up HDX-MS experiment is 5−10 residues, depending on the size of the protein (larger proteins tend to generate longer peptides) and the extent to which the protein can be processed by the acid protease used; pepsin is relatively nonspecific but cleaves some sequences, particularly those rich in Val, Ala, and Gly, substantially less efficiently. 29 Spatial resolution can be improved by optimizing the digestion step (including the use of multiple acid proteases) and by using redundancy (peptide overlap) to isolate differently exchanging subsegments of each peptide. Careful optimization and use of redundancy can increase the effective resolution of bottom up HDX-MS data substantially, even to the point of allowing site-specific uptake measurements in some cases, 30 a feat that can also be achieved in top-down or middle-down HDX-MS. 31 However, in the most straightforward implementation of the bottom up regime, like the one used here, maximal spatial resolution is traded for expediency and broad applicability, the result being the detection of binding interfaces as segment-averaged "patches" rather than individual contacts. 
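The distance-cutoff contact mapping used for Figure 2A can be illustrated with a short sketch. The data layout, function name, and the restriction of hydrogen-bond candidates to N/O pairs are assumptions made here for illustration; only the 4.5 Å and 3.5 Å cutoffs come from the text, and the coordinates in the usage example are invented rather than taken from a PDB entry.

```python
from math import dist

VDW_CUTOFF = 4.5    # Angstrom, heavy-atom pairs (from the text)
HBOND_CUTOFF = 3.5  # Angstrom, donor/acceptor pairs (from the text)
POLAR = {"N", "O"}  # simplified donor/acceptor set (assumption)


def interface_contacts(atoms_a, atoms_b):
    """Map residue pairs across two chains to a contact type.

    Each atom is (residue_number, element, (x, y, z)). Returns a dict
    {(res_a, res_b): "hbond" | "vdw"}; "hbond" takes precedence when a
    residue pair satisfies both criteria.
    """
    found = {}
    for res_a, elem_a, xyz_a in atoms_a:
        for res_b, elem_b, xyz_b in atoms_b:
            if "H" in (elem_a, elem_b):
                continue  # heavy atoms only
            d = dist(xyz_a, xyz_b)
            if d <= HBOND_CUTOFF and elem_a in POLAR and elem_b in POLAR:
                found[(res_a, res_b)] = "hbond"
            elif d <= VDW_CUTOFF:
                found.setdefault((res_a, res_b), "vdw")
    return found


if __name__ == "__main__":
    # Invented coordinates; residue numbers are placeholders.
    chain_a = [(31, "N", (0.0, 0.0, 0.0)), (35, "C", (10.0, 0.0, 0.0))]
    chain_b = [(493, "O", (3.0, 0.0, 0.0)), (501, "C", (14.0, 0.0, 0.0))]
    print(interface_contacts(chain_a, chain_b))
```

A residue-level map of this kind is what the peptide-level HDX-MS footprint is compared against; a peptide is expected to show protection if it contains one or more contacting residues, although the peptide itself may extend several residues beyond them.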
In the current study we detect several discontinuous regions within the hACE2 and spike sequences that exhibit a significant decrease in deuterium uptake upon complexation. Mapped onto the X-ray cocrystal structure (Figure 2B), these sequences correspond to a region covering the core interface but also extending somewhat beyond it in both proteins. This raises the question of whether these "adjacent" regions are genuinely impacted by binding or are an artifact of the comparatively low structural resolution of HDX-MS data. Refining the HDX-MS Footprint Using Uptake Kinetics. To obtain a more detailed picture of structural and dynamic changes resulting from complexation, deuterium uptake vs labeling time profiles were plotted for all peptides (Figures S1 and S2), 32 examples of which are shown in Figure 3. In general, the observed kinetic plot differences can be binned into four categories: (i) peptides that exhibit no change when in the complex, (ii) peptides that exhibit "transient" changes in uptake when in the complex, (iii) peptides that exhibit apparently "permanent" changes in uptake when in the complex, and (iv) peptides that exhibit a mixture of type ii and type iii differences. Type ii differences occur when the uptake difference is exclusively the result of a decrease in the rate of deuterium uptake in the complex, with no change in the ultimate amount of uptake observed. Type iii differences occur when there is a decrease in the number of sites that undergo exchange in the complex (at least on the time scale of our HDX labeling step), and type iv differences occur when there is a change both to the rate and amount of deuterium uptake in the complex. Examples of these kinetic "archetypes" are provided in Figure 3. Peptides exhibiting type i kinetic differences, which appear as purple bars in Figure 1B,C, cover by far the majority of the sequence (outside of the interfacial region) in both proteins. 
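The four kinetic categories defined above can be expressed as a simple decision rule on paired uptake curves. The sketch below is a heuristic written for illustration only: the function name, the 0.5 Da tolerance, and the rule used to separate type iii from type iv (whether an earlier time point shows a larger difference than the final one) are assumptions, not the criteria applied in the study.

```python
def classify_kinetics(free, bound, tol=0.5):
    """Assign a kinetic type ("i" to "iv") to one peptide.

    free, bound: uptake values (Da) at the same labeling times, ordered
    from shortest to longest. tol is a hypothetical tolerance in Da.
    """
    diffs = [abs(f - b) for f, b in zip(free, bound)]
    peak = max(diffs)
    final = diffs[-1]
    if peak < tol:
        return "i"    # no change on complexation
    if final < tol:
        return "ii"   # rate change only; curves converge at long times
    if peak - final < tol:
        return "iii"  # difference persists and is largest at the end
    return "iv"       # persistent offset plus a larger transient difference


if __name__ == "__main__":
    # Made-up uptake curves at 0.5, 2, 10, 30 and 60 min.
    print(classify_kinetics([2, 3, 4, 4.5, 5], [0.5, 1.5, 3.0, 4.3, 4.9]))
    print(classify_kinetics([2, 3, 4, 5, 5], [1.5, 2, 2.5, 3, 3]))
```

With only five labeling times, the boundary between types iii and iv is necessarily approximate, so a rule like this is better suited to a first-pass triage of peptides than to a final assignment.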
Peptides exhibiting type iii and iv kinetic differences are almost exclusively detected within the core of the protein−protein interface. Category ii kinetic differences are detected exclusively in two peptides of the spike RBD, located just outside of the core interface. Deuterium uptake kinetics provide an additional layer of information above and beyond the uptake difference magnitudes reported in Figure 2. Specifically, close attention to uptake kinetics can yield insights into the nature of the binding interaction and/or the shifts in conformational dynamics that accompany binding. For instance, within the binding interface, relatively weak or high turnover complexation will tend to induce type ii kinetic differences because uptake will come to reflect the "off" (unbound) state on the time scale of the measurement. Low turnover interactions, on the other hand, will tend to cause type iii and/or type iv differences, because the "off" state will not be significantly populated on the time scale of the measurement. In the case of the hACE2/spike interaction, all peptides that map to the core interface exhibit type iii or type iv differences. This would imply a tight, low turnover interaction, which is consistent with the known Kd of 1.2 nM. 13 Outside of the binding interface, a type iii difference is indicative of a "structural rearrangement", where complexation induces a "permanent" reorganization of the hydrogen bond network (at least on the time scale of the measurement) in which one or more backbone amide hydrogens becomes entirely unavailable for exchange. We observe type iii differences outside the interface in a single region of hACE2, corresponding to residues 59−73, located within an α-helix that runs parallel and immediately adjacent to the N-terminal helix of the core interface (Figure 4A). 
This suggests substantial stabilization of this helix upon complexation with spike protein, or, put another way, that this helix is not exceptionally stable prior to spike binding. This helix is present, if somewhat shortened, in the ligand free crystal structure for hACE2; 7 however, it is important to recognize that the crystallization process is inherently biased toward a singular, "lowest free energy" configuration of the molecule, which often has the effect of overrepresenting what are in fact marginally stable secondary structures in solution. 33 Outside of binding interfaces, type ii differences are characteristic of a shift in conformational dynamics that causes one or more amide hydrogens to become less frequently available for exchange upon binding. In the current data set, we observe such shifts in dynamics for segments 442−452 and 471−486, located immediately "above" and "below" the core interface on the spike RBD (Figure 4B). Given their position at the periphery of the core binding interface, it is not clear if there are transient contacts between hACE2 and the spike RBD in this area. The X-ray structure provides little evidence for substantial interactions: 4 of the 25 residues (Gly446, Tyr449, Ala475, and Gly476) are technically within the van der Waals cutoff with an average distance of 4.2 Å, compared to 24 residues in the core binding region with an average distance of 3.5 Å. However, this does not rule out the possibility of transient interactions. A recent study by Yi et al. identified the same four peripheral region residues as having an impact on complexation when mutated (they also identified 14 sites in the "core" binding region); however, it is not clear that these residues are important specifically because they interact with hACE2 in the complex. 34 What the HDX data do unambiguously show is a distinct shift in conformational dynamics in these regions upon complexation. 
Regardless of whether this shift is linked to transient interactions with hACE2, these "type (ii)" sites represent potential neutralization "hotspots" that would be at best underemphasized in neutralizing mAb-target prediction based solely on the X-ray structure or mutational analysis. In fact, based on the epitopes of the many spike neutralizing antibodies discovered to date, the term "hotspot" is entirely appropriate for these peripheral sites. For example, in a study in which nine neutralizing antibodies against spike protein were identified from the plasma of acute COVID-19 patients, 27 virtually all of them, including the strongly neutralizing REGN10987 and REGN10933 variants, bind precisely at the "dynamics shift" sites we identify here (Figure 4C). Given the core-interface adjacent position of these sites, the authors attributed neutralization to "steric blockage" of the interface by the rest of the antibody, which is certainly possible, but assumes a highly specific (and entirely static) orientation for the incoming antibodies. Our results suggest an alternative neutralization mechanism: that mAb binding in these regions prevents shifts in conformational dynamics that are an integral part of the hACE2/spike protein interaction. Further, the consistent preference of neutralizing antibodies for these regions suggests that they provide a more effective (or perhaps more easily evolved) mechanism for neutralization in vivo than binding to the core interface itself. Given the current COVID-19 crisis, there has been intensive interest in the molecular processes that drive SARS-CoV-2 infection, with antibody discovery and vaccine development occurring at an unprecedented pace. In the current study, we have examined the human hACE2/SARS-CoV-2 spike protein interaction using hydrogen−deuterium exchange mass spectrometry. 
As expected, this approach provided a clear footprint for the binding interface on both proteins that was entirely consistent with the cocrystal structure of the complex. Careful examination of the HDX uptake kinetics also allowed for a more detailed analysis of the binding interaction, including the significant ordering of a binding interface-adjacent helix in hACE2 and dynamic shifts adjacent to the interface on the spike protein RBD. The dynamic shifts occur in regions of the RBD that are the preferred epitopes for neutralizing antibodies targeting SARS-CoV-2 spike protein. Thus, our results support the view that, in addition to "footprinting", HDX-MS analyses can identify "neutralizing hotspots" outside of core binding interfaces, where modulation of conformational dynamics will have a direct impact on complexation. When present, these hotspots represent additional targets for neutralizing antibodies, thereby increasing the likelihood of a protective humoral immune response.

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/jasms.0c00465. Raw uptake kinetics for human ACE2 peptides and deuterium uptake kinetics for spike protein RBD peptides (Figures S1 and S2) (PDF)

Comparing SARS-CoV-2 with SARS-CoV and Influenza Pandemics
Clinical and Immunological Assessment of Asymptomatic SARS-CoV-2 Infections
Receptor and Viral Determinants of SARS-Coronavirus Adaptation to Human ACE2
Angiotensin-Converting Enzyme 2 Is a Functional Receptor for the SARS Coronavirus
ACE2 X-Ray Structures Reveal a Large Hinge-Bending Motion Important for Inhibitor Binding and Catalysis
Spike Interacts with Dimeric ACE2 with Limited Intra-Spike Avidity
Tissue Distribution of ACE2 Protein, the Functional Receptor for SARS Coronavirus. A First Step in Understanding SARS Pathogenesis
How Does SARS-CoV-2 Cause COVID-19?
Structures and Distributions of SARS-CoV-2 Spike Proteins on Intact Virions
Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Structural and Functional Basis of SARS-CoV-2 Entry by Using Human ACE2
Protein Dynamics and Function from Solution State NMR Spectroscopy
Attenuated T-2 Relaxation by Mutual Cancellation of Dipole-Dipole Coupling and Chemical Shift Anisotropy Indicates an Avenue to NMR Structures of Very Large Biological Macromolecules in Solution
Hydrogen Exchange Mass Spectrometry for Studying Protein Structure and Dynamics
Contemporary Hydrogen Deuterium Exchange Mass Spectrometry
Conformational Analysis of Complex Protein States by Hydrogen/Deuterium Exchange Mass Spectrometry (HDX-MS): Challenges and Emerging Solutions
Hydrogen Deuterium Exchange Mass Spectrometry in Biopharmaceutical Discovery and Development - A Review
Bottom-up Hydrogen Deuterium Exchange Mass Spectrometry: Data Analysis and Interpretation
Probing Protein Ligand Interactions by Automated Hydrogen/Deuterium Exchange Mass Spectrometry
Removal of N-Linked Glycosylations at Acidic pH by PNGase A Facilitates Hydrogen/Deuterium Exchange Mass Spectrometry Analysis of N-Linked Glycoproteins
Recommendations for Performing
Hydrogen-Deuterium Exchange Mass Spectrometry Reveals Folding and Allostery in Protein-Protein Interactions
Inference of Macromolecular Assemblies from Crystalline State
A History of Pepsin and Related Enzymes
Mapping Residual Structure in Intrinsically Disordered Proteins at Residue Resolution Using Millisecond Hydrogen/Deuterium Exchange and Residue Averaging
Automated Hydrogen/Deuterium Exchange Electron Transfer Dissociation High Resolution Mass Spectrometry Measured at Single-Amide Resolution
Automatic Post-Processing Program for HDX-MS Data
Quality and Bias of Protein Disorder Predictors
Key Residues of the Receptor Binding Motif in the Spike Protein of SARS-CoV-2 That Interact with ACE2 and Neutralizing Antibodies