key: cord-0993881-42q5p6jh authors: Cantrelle, François‐Xavier; Boll, Emmanuelle; Brier, Lucile; Moschidi, Danai; Belouzard, Sandrine; Landry, Valérie; Leroux, Florence; Dewitte, Frédérique; Landrieu, Isabelle; Dubuisson, Jean; Deprez, Benoit; Charton, Julie; Hanoulle, Xavier title: NMR Spectroscopy of the Main Protease of SARS‐CoV‐2 and Fragment‐Based Screening Identify Three Protein Hotspots and an Antiviral Fragment date: 2021-10-27 journal: Angew Chem Int Ed Engl DOI: 10.1002/anie.202109965 sha: df6479f3e304d78a701b70dac9703aab98cf1cba doc_id: 993881 cord_uid: 42q5p6jh

The main protease (3CLp) of SARS‐CoV‐2, the causative agent of the COVID‐19 pandemic, is one of the main targets for drug development. To be active, 3CLp relies on a complex interplay between dimerization, active site flexibility, and allosteric regulation. Deciphering these mechanisms is a crucial step to enable the search for inhibitors. In this context, using NMR spectroscopy, we studied the conformation of dimeric 3CLp from SARS‐CoV‐2 and monitored ligand binding, based on NMR signal assignments. We performed a fragment‐based screening that led to the identification of 38 fragment hits. Their binding sites revealed three hotspots on 3CLp, two in the substrate binding pocket and one at the dimer interface. F01 is a non‐covalent inhibitor of 3CLp and has antiviral activity in SARS‐CoV‐2-infected cells. This study sheds light on the complex structure‐function relationships of 3CLp and constitutes a strong basis to assist in developing potent 3CLp inhibitors.

Since the end of 2019, the world has faced the global COVID-19 pandemic, a major health burden worldwide with strong societal and economic impacts. The etiological agent is the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), with a case fatality rate of ca. 2%.
[1] This virus is the seventh coronavirus known to infect humans and has caused the third β-coronavirus outbreak to emerge in the 21st century. Even though both vaccines [2] [3] [4] [5] and neutralizing antibodies [6] [7] [8] are now available to fight SARS-CoV-2, specific and efficient antivirals against β-coronaviruses are urgently needed to overcome limited vaccine coverage, variant escape from antibodies and future outbreaks. The RNA genome of SARS-CoV-2 encodes up to 27 different proteins: [9, 10] the structural proteins, the nonstructural proteins (Nsp) and several accessory proteins. The Nsp, corresponding to the replicase-transcriptase, are first translated as two polyproteins, pp1a and pp1ab, which are then cleaved by two viral proteases, the main protease (Mpro or 3CLp) and the papain-like protease, to release 16 functional proteins. 3CLp cleaves at 11 sites (Nsp4-Nsp16), including its own release. Native 3CLp (306 aa) is composed of three domains. [11] Domains I and II are chymotrypsin-like domains with a β-barrel fold, and domain III is a globular domain of five α-helices that is involved in the regulation of 3CLp dimerization. A long linker (L3) [12] connects domains II and III, whereas the N-terminal (N-ter) and C-terminal (C-ter) ends are located at the interface between the protomers (Figure S1). The functional and active SARS-CoV-2 3CLp is a homodimeric [13] cysteine protease with an unusual catalytic dyad (Cys145, His41). These residues are buried in a cleft between domains I and II that is highly conserved among coronaviruses. The recognition sequence, (L,F)Q↓(S,A,G), [14] for the proteolytic cleavage (↓) requires a Gln at position P1, a hallmark feature that is shared by the 3CLp of other coronaviruses [15, 16] and which, in contrast, is not present in human proteases.
[17] The substrate binding site is made of four pockets named S1', S1, S2 and S4 [11, 18] formed by residues from domains I and II and also by residues from the linker (L3). The active conformation of the active site is further stabilized by Ser1 from the second protomer, which stresses the functional importance of 3CLp dimerization. The function, conservation, substrate specificity, and absence of a human homologue all contribute to making 3CLp an attractive drug target. Structural biology plays a tremendous role here, as it helps to select the ligands, to obtain the molecular details of their interactions, and to find the proper ways to improve their potency. So far, the 3CLp inhibitors are either compounds that covalently react with the catalytic Cys145, or non-covalent molecules that bind either in the active site or at several allosteric sites. In the first category, there are boceprevir, [19, 20] GC376, [21] inhibitors 11a [22] and 13b, [11] and more recently PF-00835231. [23, 24] In the second, molecules that bind either in the active site (MUT056399, 23R) or elsewhere on the molecular surface, including two allosteric sites (Pelitinib, AT7519), have been identified. [25, 26] Moreover, fragment screening has also been performed to identify fragments [27] that can be grown, linked or merged to develop potent 3CLp inhibitors. [28] Numerous structural biology methods, including crystallography (X-ray [11, 18, 22, 25] or neutron [29]), mass spectrometry [13, 19, 27] and computational analyses, [30, 31, 32] have been used to gain a better understanding of the complex structure-function relationships in 3CLp, including the conformational flexibility of its active site, [11, 12, 18, 19, 25, 30, 33-36] and then to find or design inhibitors.
In this work, we used NMR spectroscopy to study the dimeric SARS-CoV-2 3CLp. We obtained its NMR backbone chemical shift assignment and used these data in a fragment-based screening that led to the identification of 38 fragment hits. Deciphering their binding sites and the conformational consequences they induce in 3CLp led to the identification of three protein hotspots: two located in the active site of the protease, with two different NMR signatures, and one at the dimerization interface. We further show that the fragment lead F01 binds in the active site and is, without optimization, a reversible 3CLp inhibitor with antiviral activity in SARS-CoV-2-infected Vero-81 cells. The crystal structure of F01-bound 3CLp that we have solved will help its optimization. These NMR data should help to gain a better understanding of the complex interplay between active site plasticity, dimerization and the enzymatic activity of 3CLp. They also constitute a new tool to assist the development of potent 3CLp inhibitors for the present or future outbreaks.

We produced SARS-CoV-2 3CLp samples with different isotopic labeling schemes for study by liquid-state NMR spectroscopy. The purified protease (306 aa, 67.6 kDa) has both native N- and C-terminal ends (SI and Figure S1), which is crucial for both its enzymatic activity and its proper dimerization. We obtained a good-quality ¹H,¹⁵N TROSY-HSQC spectrum, with ca. 280 resonances (Figure 1), and then recorded 3D ¹H,¹⁵N,¹³C TROSY-HNCACB, -HN(CO)CACB, -HNCO, -HN(CA)CO and -HN(CO)CA spectra. Due to unfavorable magnetic relaxation properties, some ¹³C signals were not observed, and we thus had to record additional data on other samples, including 3CLp bound to boceprevir and a monomeric 3CLp R298A mutant, to reduce the protein dynamics and the molecular weight, respectively.
To perform the NMR backbone assignments of SARS-CoV-2 3CLp, we used a combined and integrated strategy that includes classical sequential assignment, analyses of chemical shift perturbations (CSPs) upon boceprevir binding, chemical shift predictions and previous NMR assignments for the isolated N-ter and C-ter domains of SARS-CoV 3CLp [37] (see SI). We assigned 183 amide proton correlations (183/293, 63 %) and further obtained 239, 207 and 234 chemical shifts for Cα, Cβ and C', respectively (Figure 1, SI, BMRB entry 50780). Most of the unassigned amide protons lie in the first two β-barrel domains or at the dimerization interface (Figure S2). Whereas previous attempts to record multidimensional NMR data on SARS-CoV [38] and SARS-CoV-2 [39] 3CLp failed, these new NMR data open the field to a large range of future studies of the dimeric 3CLp in solution and at a temperature close to physiological, an important parameter when considering dynamics.

To assess the potential of our experimental system, we analyzed the 3CLp spectral perturbations upon binding of either boceprevir or GC376 (Figures S3, S4). In both cases, the induced perturbations are highest in the active site but also propagate further into the two catalytic domains, and even toward the C-terminal end with GC376. NMR perturbations may arise from ligand binding but also from the subsequent conformational changes. GC376 indeed induces perturbations both at the active site and at the dimerization interface, the two regions of the protease that are targeted to develop inhibitors. [13, 25, 27, 40] Moreover, in the presence of GC376, a few 3CLp NMR resonances split into two new ones (Figure S5), probably reflecting the two conformations of the P3 moiety of the bound inhibitor. [20] The split resonances notably correspond to Val42, Asn142, Gln192 and Gly2, the latter showing that we can detect inter-protomer conformational consequences. Interestingly, when using a monomeric R298A 3CLp mutant, we observed ca.
115 additional resonances in the 2D ¹H,¹⁵N NMR spectrum, that is, ca. 100 more than expected. This could be due to the two orientations of domain III that have been described for SARS-CoV 3CLp R298A. [41] These data highlight the potential for in-solution studies of 3CLp. Based on the NMR assignments, we are able not only to detect ligand binding and map the binding site(s), but also to analyze the conformational rearrangement(s) throughout the dimer, providing essential molecular detail for medicinal chemistry.

Fragment screening is widely used in drug discovery because it probes chemical space efficiently while keeping the number of molecules to be assessed reasonable. [28] The identified fragment hits (low MW) that bind to the target are then optimized to give lead compounds. We used a library of 960 commercially available fragments with physicochemical properties that mostly fulfill the "rule of three" criteria [42] (Figure S7a-d). We designed a strategy with a primary and a secondary screening using ligand- and protein-observed NMR methods, respectively (Figure 2). The screening steps were performed in the presence of DTT, a nucleophile and reducing agent, to minimize the selection of highly electrophilic and nonspecific compounds that would covalently bind to the protease. The 960 fragments were split into 192 cocktails of 5 fragments, as this strategy has already proved efficient. [43] All the cocktails were analyzed with ¹H Water-LOGSY [44] and, for 91 of them, additionally with ¹⁹F spectroscopy (Figure 2), as our library contains 427 fluorinated fragments in total. With Water-LOGSY, the detection of the hits is straightforward since their signals have opposite phase (Figure 3a). When using ¹⁹F spectroscopy, the spectra contain only one NMR signal for each ¹⁹F-fragment present in the cocktail, and we monitored both CSPs and signal broadening (Figure 3b).
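The library bookkeeping described above (960 fragments, cocktails of 5, 427 fluorinated fragments) can be checked with a minimal sketch. The fragment identifiers and the consecutive chunking are assumptions made for illustration only; the text does not say how fragments were assigned to cocktails, and a real design would typically also avoid signal overlap within a cocktail.

```python
from typing import List, Sequence


def make_cocktails(fragment_ids: Sequence[str], size: int = 5) -> List[List[str]]:
    """Split a fragment library into consecutive cocktails of `size` compounds.

    Consecutive chunking is an illustrative assumption, not the authors' procedure.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(fragment_ids[i:i + size]) for i in range(0, len(fragment_ids), size)]


# Hypothetical identifiers standing in for the 960 commercial fragments.
library = [f"FRAG{i:04d}" for i in range(1, 961)]
cocktails = make_cocktails(library, size=5)

# Fraction of fluorinated fragments in the library (427 of 960, from the text).
fluorinated_fraction = 427 / 960

print(f"number of cocktails: {len(cocktails)}")
print(f"fluorinated fraction of the library: {fluorinated_fraction:.1%}")
```

The fluorinated fraction computed this way is the library ratio against which the proportion of ¹⁹F-containing hits can be compared.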
The primary screening led to the identification of 159 binders (Scheme S1), corresponding to a 16.6 % hit rate (Figure 2). We performed the secondary screening using 2D ¹H,¹⁵N TROSY-HSQC spectra acquired on SARS-CoV-2 3CLp in the presence of each of the 159 binders identified in the primary screening. Using both CSPs and signal broadening (Figure 4), we confirmed 38 fragments as direct binders of 3CLp, corresponding to an overall hit rate of ca. 4 % (Figures 2 and 4, Scheme S2, Tables S1, S2). This value can be compared with the ca. 6 % obtained in a combined MS and X-ray approach. [27] The proportion of ¹⁹F-containing fragments among the hits (ca. 40 %) is close to that in the library used. In contrast, both the average MW and lipophilicity of the fragment hits are higher than those of the entire library (Figure S7a-d).

Using the backbone assignments, the analysis of the CSPs induced by the 38 hits shows that they can be grouped into three classes corresponding to three 3CLp hotspots (Figure 5; Figure S8). In Class I (24 hits), CSPs are observed for resonances assigned to residues distributed in the active site cleft, in the loop L3 and in the C-ter end, whereas residues from the N-ter end are only moderately affected. Class II comprises 8 hits that induce CSPs for only a restricted set of residues in the substrate binding site, which belong exclusively to either domain I or the tip of the loop L3 and correspond to the S2 and S3 binding sites. Class III (5 hits) is defined by CSPs for residues located at the dimerization interface of 3CLp (N-ter and C-ter ends). The remaining fragment, F27, induces a strong reduction in signal intensity all along the 3CLp sequence (Figure S8 and Table S2) and may correspond to a false positive.
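The hit rates and class counts quoted above follow from simple arithmetic, sketched here as a consistency check. All numbers are taken from the text; the function and variable names are illustrative, and the confirmation rate (confirmed hits over primary binders) is a derived quantity that the text does not state explicitly.

```python
LIBRARY_SIZE = 960      # fragments screened
PRIMARY_BINDERS = 159   # binders from the ligand-observed primary screening
CONFIRMED_HITS = 38     # binders confirmed by protein-observed NMR

# Hits per class as reported in the text; F27 is the unclassified possible false positive.
CLASS_COUNTS = {"Class I": 24, "Class II": 8, "Class III": 5, "unclassified (F27)": 1}


def hit_rate(hits: int, screened: int) -> float:
    """Return the hit rate as a percentage."""
    return 100.0 * hits / screened


primary_rate = hit_rate(PRIMARY_BINDERS, LIBRARY_SIZE)
overall_rate = hit_rate(CONFIRMED_HITS, LIBRARY_SIZE)
confirmation_rate = hit_rate(CONFIRMED_HITS, PRIMARY_BINDERS)

print(f"primary hit rate:  {primary_rate:.1f} %")
print(f"overall hit rate:  {overall_rate:.1f} %")
print(f"confirmation rate: {confirmation_rate:.1f} %")
print(f"hits accounted for by class: {sum(CLASS_COUNTS.values())}")
```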
The CSP pattern in Class I, illustrated by fragment F01, is similar to the ones observed in 3CLp upon binding of either boceprevir or GC376 (Figure S8), two potent inhibitors. The perturbed resonances (Figure 4) correspond to residues distributed all along the 3CLp active site cleft (S1-S4 pockets) and indeed match the residues involved in the binding of GC376 (Figure S9a). Moreover, the CSPs propagate toward the 3CLp dimerization interface, as with GC376 (Figure S4). These NMR data are fully supported by the crystal structure of fragment F01-bound 3CLp that we solved (Figure 6, Figure S10 and Table S3; PDB: 7P51). The 3-oxo-2,3-dihydro-indene ring and the 5-chloro-2-pyridyl group of F01 occupy the S1 and S2 pockets of 3CLp, respectively. Three hydrogen bonds are formed between F01 and 3CLp. One of them involves the ketone in the indene ring of F01, which is electrophilic and could covalently react with the catalytic Cys145. However, this group, located in a key position of the active site, instead behaves as an H-bond acceptor and interacts with His163 (see SI). The binding of F01 induces conformational changes throughout the active site of 3CLp (see SI, Figure 6 and Figure S10b). It displaces the α-helix (Ser46-Leu50) around the S2 pocket, the loop L3, and residues Asn142 and Glu166 around the S1 pocket. This last movement propagates to the 3CLp dimer interface, with Ser1 of protomer B being slightly displaced. It has been shown that in the 3CLp dimer, Ser1 from protomer B interacts with Glu166 of protomer A and stabilizes the active conformation of the S1 pocket. [11, 18] Thus, the CSPs observed in the 3CLp spectrum upon F01 binding match both its binding site and the conformational changes induced through allosteric pathways (Figure S10c).
Our data show that conformational plasticity [29, 36] and allosteric regulation [13, 25, 35] within 3CLp can be studied using NMR spectroscopy, especially the tight interplay between substrate binding, active site conformation and dimerization.

The hits from Class II, such as F30, induced CSPs consistent with binding into the S2 and S3 pockets located on the domain I side of the 3CLp substrate binding site, as for SEN1269 [25] (Figure S9b). This molecule binds to S2 and induces the displacement of the short α-helix (Ser46-Leu50), for which we observed the highest CSPs upon binding of Class II hits (Figure S8). The NMR CSPs induced upon binding of the Class III hits, which include F15, are localized at the 3CLp dimer interface, and their binding could be predicted to resemble that of x1086 and x1187 [13, 27] in the hydrophobic pocket formed by residues of both the N-ter (Met6, Phe8) and C-ter (Arg298, Gln299, Val303) ends (Figure S9c). With F15, we also observed a high CSP for the resonance corresponding to Gln127, which is at the dimer interface and has been shown to make a hydrogen bond with x1086. Interestingly, no NMR perturbations observed in our screening match fragment binding into the allosteric sites 1 and 2 identified by Günther et al. [25] It could be that binding in these two sites requires bigger and more complex molecular structures, or simply that the fragment library used did not probe all the possible binding sites. Looking at the chemical properties of the fragment hits according to their class, we found that on average Class II hits are smaller than Class I hits (avg. 233.3 Da vs. 245.7 Da), and that Class III hits are even smaller (avg. 206.85 Da) and also more hydrophobic (80 % with 2 < AlogP < 3) (Figure S7). Among the 38 hits identified in this work, F01 induced the highest CSPs in the NMR spectrum of SARS-CoV-2 3CLp (Table S2 and Scheme S2).
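The CSP-based comparison of hits used throughout this analysis can be illustrated with a short sketch of a combined ¹H,¹⁵N chemical shift perturbation. The exact formula and ¹⁵N weighting used by the authors are not given in this section (see SI); the weighted Euclidean form and the factor 0.2 below are common conventions assumed here for illustration, and the residue labels and peak positions are made-up values, not data from this study.

```python
import math
from typing import Dict, List, Tuple

# Assumed scaling of 15N shifts relative to 1H shifts (a common convention;
# the value actually used in the study is not stated in this section).
N_WEIGHT = 0.2

Peak = Tuple[float, float]  # (1H shift in ppm, 15N shift in ppm)


def combined_csp(delta_h_ppm: float, delta_n_ppm: float, n_weight: float = N_WEIGHT) -> float:
    """Weighted Euclidean combination of 1H and 15N shift changes, in ppm."""
    return math.sqrt(delta_h_ppm ** 2 + (n_weight * delta_n_ppm) ** 2)


def rank_residues(free: Dict[str, Peak], bound: Dict[str, Peak]) -> List[Tuple[str, float]]:
    """Rank residues by combined CSP, largest first.

    Residues missing from the bound spectrum (e.g. broadened beyond detection)
    are skipped here; in practice they would be tracked separately.
    """
    csps = []
    for residue, (h_free, n_free) in free.items():
        if residue not in bound:
            continue
        h_bound, n_bound = bound[residue]
        csps.append((residue, combined_csp(h_bound - h_free, n_bound - n_free)))
    return sorted(csps, key=lambda item: item[1], reverse=True)


# Hypothetical peak lists for the free and fragment-bound protein.
free_peaks = {"res_A": (8.20, 120.0), "res_B": (7.95, 115.5), "res_C": (9.10, 128.3)}
bound_peaks = {"res_A": (8.23, 120.2), "res_B": (7.95, 115.6)}

ranking = rank_residues(free_peaks, bound_peaks)
for residue, csp in ranking:
    print(f"{residue}: {csp:.3f} ppm")
```

Mapping the top-ranked residues onto the structure is what allows hits to be grouped by binding hotspot, while residues that disappear from the bound spectrum correspond to the signal broadening that was monitored alongside the CSPs.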
Figure 6. Crystal structure of the fragment F01-bound 3CLp. a) Close-up view of F01 binding in the active site. Protomer A is shown in grey with surface representation, whereas protomer B is displayed in white in cartoon representation. Three hydrogen bonds between F01 and 3CLp are displayed as yellow dashes. Residues from protomer A are labeled in black, and residue S1 from protomer B is marked with an asterisk. b) Conformational changes in the F01-bound 3CLp structure (PDB: 7P51) compared to the apo 3CLp structure (PDB: 7NTS). See Figure S10.

We further characterized F01, the main hit of our screening. First, using NMR titration experiments, we determined a dissociation constant KD = 73 ± 14 µM for the interaction between F01 and 3CLp (Figure 7a and Figure S11). This affinity is higher than expected, as initial hits from fragment-based screening usually bind to their target with a low affinity, in the 1-10 mM range. [45] Second, using an in vitro enzymatic assay, we showed that F01 is an inhibitor of 3CLp with a moderate potency (IC50 = 54 µM) (Figure 7b). Third, using a jump dilution assay, we showed that F01 is a reversible inhibitor of the protease (Figure S12), which agrees with the crystal structure (see Figure 6). Finally, Vero-81 cells were infected with SARS-CoV-2 in the presence of increasing concentrations of F01; the cellular content of the viral N protein was then assayed and the number of infectious viral particles was determined in the cell supernatants. The results showed that F01 has antiviral activity (EC50 = 150 µM) against SARS-CoV-2 (Figure 7c and Figure S13) while displaying low cytotoxicity (CC50 > 400 µM) (Figure 7d). Usually, initial fragment hits have neither in vitro nor biological activity, as they often are too small and bind to their target with very low affinity. In this work, we identified the fragment F01, which even without optimization has antiviral activity against SARS-CoV-2. Very recently, Bajusz et al.
have reported a fragment, SX013, that blocks SARS-CoV-2 replication in Vero E6 cells with an EC50 of 304 µM, [46] about twice that of F01 in Vero-81 cells. The ligand efficiency of F01 is 0.29-0.30 kcal mol⁻¹ per heavy atom, showing that F01 is a good fragment lead that deserves to be optimized in order to increase its potency and other drug-related properties. [28, 45]

Conclusion

Whereas structural biology plays a central role in drug discovery and drug development, NMR spectroscopy had not, until now, been successfully used to study the 3CLp from coronaviruses. [37] [38] [39] In this work, we used solution-state NMR spectroscopy to study the dimeric 3CLp protease of SARS-CoV-2, which is one of the main targets for developing efficient antivirals to fight the COVID-19 pandemic. Considering the high sequence conservation between the 3CLps, [20, 40] our data will also be valuable for other β-coronaviruses, such as MERS-CoV and SARS-CoV (67 % and 98 % sequence similarity, respectively), and possibly for future emerging β-coronaviruses. Although incomplete, the 3CLp backbone chemical shift assignment, obtained at pH and temperature close to physiological, has proved highly valuable in a medicinal chemistry project, as these new NMR data allowed the study of both the structure and conformation of the dimeric protease. As a complement to molecular dynamics, [12, 30, 35, 47] these data also provide, for future studies, an experimental means to assess the 3CLp dynamics in solution, an important point to consider in drug development. Since mid-2020, the world has faced the emergence of SARS-CoV-2 variants that may, at least partially, escape current vaccines. This stresses that there is a need for direct-acting antiviral(s) and also that there is a high risk of emergence of resistance mutations in 3CLp if targeted.
To help resolve this common issue in drug development, a promising strategy consists in combining orthosteric and allosteric drugs. [48, 49] In this way, our NMR data could be valuable to identify both the allosteric sites of SARS-CoV-2 3CLp and the molecules that bind to them, and to identify the allosteric pathways along which resistance mutations may also occur.

Using a two-step fragment screening, we identified 38 hits, including the promising fragment F01, and three binding sites, or hotspots, located in the active site and at the dimerization interface of 3CLp. It has been shown that 3CLp can indeed be efficiently targeted at its active site, at its dimerization interface and even at different allosteric sites. [11, 13, 14, 18, 22, 24, 25, 27, 29, 50] We showed that F01 binds to the 3CLp active site with a rather good affinity (KD = 73 µM), is a non-covalent reversible inhibitor of the protease (IC50 = 54 µM) and demonstrates antiviral activity against SARS-CoV-2 (EC50 = 150 µM), despite no optimization. Our results indicate that F01 is a promising fragment lead that deserves to be optimized to give more potent compounds. [28, 51] Structure-activity relationship studies, guided by the crystal structure, will help this process, and two approaches could be considered: first, F01 (Class I) could be linked or merged with Class II hits, and second, F01 could be studied in combination with fragments from Class III that bind at the dimerization interface. This work and our NMR results will contribute to a better understanding of the complex structure-function relationships in the 3CLp dimer and assist the rational design of potent 3CLp inhibitors that may both block its active site and interfere with its dimerization, to tackle current, or even future, coronavirus pandemics. [52]

Figure 7. F01 is an inhibitor of 3CLp and is active against SARS-CoV-2 in Vero-81 cells. a) Affinity of the interaction between F01 and 3CLp. NMR titration curves where the ¹H,¹⁵N-combined CSPs (Δδ, ppm) are plotted as a function of the F01/3CLp ratio. The KD value (µM) corresponds to the mean (± SD) calculated over 18 3CLp resonances (Figure S11). b) F01 inhibits the in vitro enzymatic activity of 3CLp. The half-maximal inhibitory concentration (IC50) was calculated using the initial velocities of the reactions. c) The antiviral activity of F01 against SARS-CoV-2 was tested on infected Vero-81 cells. After infection in the presence of increasing F01 concentrations, the cells were lysed (t = 16 h) and the viral N-protein content was quantified and used to determine the half-maximal effective concentration (EC50). Viral titers were also measured in the cell supernatants (Figure S13). d) The 50 % cytotoxic concentration (CC50) of F01 was assayed on Vero-81 cells (t = 20 h).

Crystal structures of apo 3CLp and 3CLp in complex with fragment F01 have been deposited in the Protein Data Bank as entries 7NTS and 7P51, respectively.

Accepted manuscript online: September 27, 2021 Version of record online

The NMR facilities were funded by the Nord Region Council, CNRS, Institut Pasteur de Lille, European Union (FEDER), French Research Ministry and University of Lille. Financial support from the IR-RMN-THC (FR 3050 CNRS) for the infrastructure is gratefully acknowledged. This study was supported by the I-site ULNE (project 3CLPRO-SCREEN-NMR), the CPER CTRL (Transdisciplinary Research Center on Longevity) program, and the Institut Pasteur de Lille. Prof. B. Luy (Karlsruhe Institute of Technology) and Dr. D. Sinnaeve are thanked for advice about the ¹⁹F BURBOP pulses. We would like to thank T. Isabet, S. Sirigu and W. Shepard for their valuable support during data collection at beamlines PX1 and PX2A at the SOLEIL synchrotron facility (Paris, France). We thank Dr V. Villeret and Dr E.
Dupre for their advice on crystallogenesis and data processing.

The authors declare no conflict of interest.

Keywords: drug discovery · fragment screening · NMR spectroscopy · protein structure · viruses