key: cord-0829397-qffqgd08 authors: Dai, Wenhao; Zhang, Bing; Su, Haixia; Li, Jian; Zhao, Yao; Xie, Xiong; Jin, Zhenming; Liu, Fengjiang; Li, Chunpu; Li, You; Bai, Fang; Wang, Haofeng; Cheng, Xi; Cen, Xiaobo; Hu, Shulei; Yang, Xiuna; Wang, Jiang; Liu, Xiang; Xiao, Gengfu; Jiang, Hualiang; Rao, Zihe; Zhang, Lei-Ke; Xu, Yechun; Yang, Haitao; Liu, Hong title: Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease date: 2020-04-22 journal: Science DOI: 10.1126/science.abb4489 sha: 6d94752db4d0d2ba98ecf06b3dc0b1cf8f97f342 doc_id: 829397 cord_uid: qffqgd08 SARS-CoV-2 is the etiological agent responsible for the global COVID-19 outbreak. The main protease (M(pro)) of SARS-CoV-2 is a key enzyme that plays a pivotal role in mediating viral replication and transcription. We designed and synthesized two lead compounds (11a and 11b) targeting M(pro). Both exhibited excellent inhibitory activity and potent anti-SARS-CoV-2 infection activity. The X-ray crystal structures of SARS-CoV-2 M(pro) in complex with 11a or 11b, both determined at 1.5 Å resolution, showed that the aldehyde groups of 11a and 11b are covalently bound to Cys145 of M(pro). Both compounds showed good PK properties in vivo, and 11a also exhibited low toxicity, suggesting that these compounds are promising drug candidates. (Page numbers not final at time of first release) 2 teins) and other accessory proteins (15, 16) . Therefore, these proteases, especially M pro , play a vital role in the life cycle of coronavirus. M pro is a three-domain (domains I to III) cysteine protease involved in most maturation cleavage events within the precursor polyprotein (17) (18) (19) . Active M pro is a homodimer containing two protomers. The CoV M pro features a noncanonical Cys-His dyad located in the cleft between domains I and II (17) (18) (19) . M pro is conserved among CoVs and several common features are shared among the substrates of M pro in different CoVs. 
The amino acids in substrates from the N terminus to the C terminus are numbered as follows (-P4-P3-P2-P1↓P1′-P2′-P3′-), and the cleavage site is between P1 and P1′. In particular, a Gln residue is almost always required in the P1 position of the substrates. There is no human homolog of M pro, which makes it an ideal antiviral target (20) (21) (22). The active sites of M pro are highly conserved among all CoVs and are usually composed of four sites (S1′, S1, S2 and S4) (22).

By analyzing the substrate-binding pocket of SARS-CoV M pro (PDB ID: 2H2Z), novel inhibitors targeting the SARS-CoV-2 M pro were designed and synthesized (Fig. 1). The thiol of a cysteine residue in the S1′ site anchors inhibitors by a covalent linkage that is important for the inhibitors to maintain antiviral activity. In our design of new inhibitors, an aldehyde was selected as a new warhead in P1 in order to form a covalent bond with cysteine. The reported SARS-CoV M pro inhibitors often have an (S)-γ-lactam ring that occupies the S1 site of M pro, and this ring was expected to be a good choice in P1 (23). Furthermore, the S2 site of coronavirus M pro is usually large enough to accommodate a bigger P2 fragment. To test the importance of different ring systems, a cyclohexyl or 3-fluorophenyl group was introduced in P2, with the fluorine expected to enhance activity. An indole group was introduced into P3 in order to form new hydrogen bonds with S4 and improve drug-like properties.

The synthetic route and chemical structures of the compounds (11a and 11b) are shown in scheme S1. The starting material (N-Boc-L-glutamic acid dimethyl ester 1) was obtained from commercial suppliers and used without further purification to synthesize the key intermediate 3 according to the literature (24). The intermediates 6a and 6b were synthesized from 4 and acids 5a and 5b. Removal of the t-butoxycarbonyl group from 6a and 6b yielded 7a and 7b.
Coupling 7a and 7b with the acid 8 yielded the esters 9a and 9b. The peptidomimetic aldehydes 11a and 11b were obtained through a two-step route in which the ester derivatives 9 were first reduced with NaBH4 to generate the primary alcohols 10a and 10b, which were subsequently oxidized into aldehydes 11a and 11b with Dess-Martin periodinane (DMP).

Recombinant SARS-CoV-2 M pro was expressed in and purified from Escherichia coli (E. coli) (18, 25). A fluorescently labeled substrate, MCA-AVLQ↓SGFR-Lys(Dnp)-Lys-NH2, derived from the N-terminal auto-cleavage sequence of the viral protease, was designed and synthesized for the enzymatic assay. Both 11a and 11b exhibited high SARS-CoV-2 M pro inhibition activity, which reached 100% for 11a and 96% for 11b at 1 µM. We used a fluorescence resonance energy transfer (FRET)-based cleavage assay to determine the IC50 values. The results revealed excellent inhibitory potency, with IC50 values of 0.053 ± 0.005 µM and 0.040 ± 0.002 µM for 11a and 11b, respectively (Fig. 2).

In order to elucidate the mechanism of inhibition of SARS-CoV-2 M pro by 11a, we determined the crystal structure of this complex at 1.5-Å resolution (table S1). The crystal of M pro-11a belongs to the space group C2, and an asymmetric unit contains only one molecule (table S1). Two molecules (designated protomer A and protomer B) associate into a homodimer around a crystallographic 2-fold symmetry axis (fig. S2). The structure of each protomer contains three domains, with the substrate-binding site located in the cleft between domains I and II. At the active site of SARS-CoV-2 M pro, Cys145 and His41 (Cys-His) form a catalytic dyad (fig. S2). The electron density map clearly showed compound 11a in the substrate-binding pocket of SARS-CoV-2 M pro in an extended conformation (Fig. 3A and fig. S3, A and B). Details of the interaction are shown in Fig. 3, B and C.
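As an aside on the assay substrate described above, the P/P′ numbering of residues around the cleavage site can be made concrete with a short sketch. The function name `label_substrate_positions` is hypothetical and only illustrates the numbering convention; the input is the peptide core AVLQ↓SGFR of the FRET substrate, without the MCA and Lys(Dnp) labels.

```python
def label_substrate_positions(sequence: str, marker: str = "↓") -> dict[str, str]:
    """Label residues around a cleavage marker as P1, P2, ... and P1', P2', ...

    Hypothetical helper for illustration only; it is not part of the paper's methods.
    """
    if sequence.count(marker) != 1:
        raise ValueError("expected exactly one cleavage marker in the sequence")
    n_side, c_side = sequence.split(marker)
    labels: dict[str, str] = {}
    # Non-primed positions count outward from the scissile bond toward the N terminus.
    for offset, residue in enumerate(reversed(n_side), start=1):
        labels[f"P{offset}"] = residue
    # Primed positions count outward from the scissile bond toward the C terminus.
    for offset, residue in enumerate(c_side, start=1):
        labels[f"P{offset}'"] = residue
    return labels


# Peptide core of the FRET substrate MCA-AVLQ↓SGFR-Lys(Dnp)-Lys-NH2.
positions = label_substrate_positions("AVLQ↓SGFR")
print(positions["P1"], positions["P1'"])  # prints "Q S"
```

For this substrate, P1 is Gln, in line with the requirement for Gln at P1 noted for M pro substrates.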
The electron density shows that the carbon atom of the aldehyde group of 11a and the catalytic-site Cys145 of SARS-CoV-2 M pro form a standard 1.8-Å C-S covalent bond. The oxygen atom of the aldehyde group also plays a crucial role in stabilizing the conformation of the inhibitor by forming a 2.9-Å hydrogen bond with the backbone of residue Cys145 in the S1′ site. The (S)-γ-lactam ring of 11a at P1 fits well into the S1 site. The oxygen of the (S)-γ-lactam group forms a 2.7-Å hydrogen bond with the side chain of His163. The main chain of Phe140 and the side chain of Glu166 also participate in stabilizing the (S)-γ-lactam ring by forming 3.2-Å and 3.0-Å hydrogen bonds with its NH group, respectively. In addition, the amide bonds on the chain of 11a are hydrogen-bonded with the main chains of His164 (3.2 Å) and Glu166 (2.8 Å), respectively. The cyclohexyl moiety of 11a at P2 inserts deeply into the S2 site, stacking with the imidazole ring of His41. The cyclohexyl group is also surrounded by the side chains of Met49, Tyr54, Met165, Asp187 and Arg188, producing extensive hydrophobic interactions. The indole group of 11a at P3 is exposed to solvent (S4 site) and is stabilized by Glu166 through a 2.6-Å hydrogen bond. The side chains of residues Pro168 and Gln189 interact with the indole group of 11a through hydrophobic interactions. Interestingly, multiple water molecules (named W1-W6) play an important role in binding 11a. W1 interacts with the amide bonds of 11a through a 2.9-Å hydrogen bond, whereas W2-W6 form a number of hydrogen bonds with the aldehyde group of 11a and with residues Asn142, Gly143, Thr26, Thr25, His41 and Cys44, which contribute to stabilizing 11a in the binding pocket.

The crystal structure of SARS-CoV-2 M pro in complex with 11b is very similar to that of the 11a complex and shows a similar inhibitor binding mode (Fig. 3D and figs. S3, C and D, and S4A). The difference in binding mode is most probably due to the 3-fluorophenyl group of 11b at P2.
Compared with the cyclohexyl group in 11a, the 3-fluorophenyl group undergoes a significant downward rotation (Fig. 3D). The side chains of residues His41, Met49, Met165, Val186, Asp187 and Arg188 interact with this aryl group through hydrophobic interactions, and the side chain of Gln189 stabilizes the 3-fluorophenyl group with an additional 3.0-Å hydrogen bond (Fig. 3, E and F). In short, these two crystal structures reveal a similar inhibitory mechanism in which both compounds occupy the substrate-binding pocket and block the enzyme activity of SARS-CoV-2 M pro. Compared with those of N1, N3 and N9 in SARS-CoV M pro complex structures reported previously, the binding modes of 11a and 11b in SARS-CoV-2 M pro complex structures are similar, and the differences among these overall structures are small (Fig. 4 and fig. S4, B to F) (22). The differences mainly lie in the interactions at the S1′, S2 and S4 subsites, possibly due to the various sizes of functional groups at the corresponding P1′, P2 and P4 sites in the inhibitors (Fig. 4, A and C).

To further substantiate the enzyme inhibition results, we evaluated the ability of these compounds to inhibit SARS-CoV-2 in vitro (Fig. 5 and fig. S5). As shown in Fig. 5, compounds 11a and 11b exhibited good anti-SARS-CoV-2 infection activity in cell culture, with EC50 values of 0.53 ± 0.01 µM and 0.72 ± 0.09 µM, respectively, in a plaque-reduction assay. Neither compound caused significant cytotoxicity, with half cytotoxic concentration (CC50) values of >100 µM, yielding selectivity indices (SI) for 11a and 11b of >189 and >139, respectively. Both immunofluorescence and quantitative real-time PCR were also employed to monitor the antiviral activity of 11a and 11b. The results show that 11a and 11b exhibit a good antiviral effect on SARS-CoV-2 (Fig. 5 and fig. S5).

To further explore the druggability of compounds 11a and 11b, both were evaluated for their pharmacokinetic (PK) properties.
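As a brief aside on the selectivity indices quoted above, the following sketch reproduces them from the reported values, assuming the usual definition SI = CC50/EC50. Because only a lower bound on CC50 (>100 µM) is reported, the results are lower bounds on SI. The function and variable names are illustrative and do not come from the paper.

```python
def selectivity_index(cc50_um: float, ec50_um: float) -> float:
    """Return SI = CC50 / EC50, with both concentrations in µM (assumed definition)."""
    if ec50_um <= 0:
        raise ValueError("EC50 must be positive")
    return cc50_um / ec50_um


# Reported values: CC50 > 100 µM for both compounds, so 100 µM acts as a lower bound.
CC50_LOWER_BOUND_UM = 100.0
ec50_um = {"11a": 0.53, "11b": 0.72}

si_lower_bounds = {
    name: selectivity_index(CC50_LOWER_BOUND_UM, ec50) for name, ec50 in ec50_um.items()
}

for name, si in si_lower_bounds.items():
    print(f"{name}: SI > {si:.0f}")
# prints "11a: SI > 189" then "11b: SI > 139"
```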
As shown in table S2, compound 11a given intraperitoneally (5 mg/kg) and intravenously (5 mg/kg) displayed a half-life (T1/2) of 4.27 hours and 4.41 hours, respectively. A high maximal concentration (Cmax = 2394 ng/mL) and a good bioavailability of 87.8% were observed when 11a was given intraperitoneally. Metabolic stability of 11a in mice was also good (clearance (CL) = 17.4 mL/min/mg). When administered intraperitoneally (20 mg/kg), subcutaneously (5 mg/kg) and intravenously (5 mg/kg), compound 11b also showed good PK properties (the bioavailability after intraperitoneal and subcutaneous administration was more than 80%, and a longer T1/2 of 5.21 hours was observed when 11b was given intraperitoneally). Considering the danger of COVID-19, we selected intravenous drip administration for further study because the area under the curve (AUC) is high and the effect is rapid. Compared with 11a administered intravenously, 11b has a shorter T1/2 (1.65 hours) and a faster clearance rate (CL = 20.6 mL/min/mg). Compound 11a was therefore selected for further investigation with intravenous drip dosing in Sprague-Dawley (SD) rats and Beagle dogs. The results (table S3) showed that 11a exhibited a long T1/2 (SD rat, 7.6 hours; Beagle dog, 5.5 hours), a low clearance rate (rat, 4.01 mL/min/kg; dog, 5.8 mL/min/kg) and a high AUC value (rat, 41500 hours*ng/mL; dog, 14900 hours*ng/mL). These PK results indicate that compound 11a warrants further study.

An in vivo toxicity study (table S4) of 11a was carried out on SD rats and Beagle dogs. The acute toxicity of 11a was measured on SD rats. No SD rats died after receiving 40 mg/kg by intravenous drip administration. When the dosage was raised to 60 mg/kg, one of four SD rats died. The dose range toxicity study of 11a was conducted for seven days at dosing levels of 2, 6, and 18 mg/kg on SD rats and at 10-40 mg/kg on Beagle dogs.
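To give a feel for the half-lives reported in table S3, the sketch below converts T1/2 into a first-order elimination rate constant and the fraction of compound remaining after a given time. It assumes simple one-compartment, first-order elimination, which is an illustrative simplification and not a model stated in the paper; the function names are hypothetical.

```python
import math


def elimination_rate_constant(half_life_h: float) -> float:
    """Return k = ln(2) / T1/2 in 1/h, assuming first-order elimination."""
    if half_life_h <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_h


def fraction_remaining(half_life_h: float, elapsed_h: float) -> float:
    """Return exp(-k * t), the fraction left after elapsed_h hours (assumed model)."""
    return math.exp(-elimination_rate_constant(half_life_h) * elapsed_h)


# Half-lives of 11a after intravenous drip dosing, as reported for table S3.
half_lives_h = {"SD rat": 7.6, "Beagle dog": 5.5}

for species, t_half in half_lives_h.items():
    k = elimination_rate_constant(t_half)
    left_24h = fraction_remaining(t_half, 24.0)
    print(f"{species}: k = {k:.3f} 1/h, fraction left after 24 h = {left_24h:.3f}")
```

The 24-hour interval is chosen only because it matches a once-daily schedule; any other interval can be passed to `fraction_remaining`.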
All animals received once-daily dosing (QD) by intravenous drip, and all animals were clinically observed at least once a day. No obvious toxicity was observed in either group. These data indicate that 11a is a good candidate for further clinical studies.

The cytotoxicity of these compounds in Vero E6 cells was also determined by using CCK8 assays. The left and right Y-axes of the graphs represent mean % inhibition of virus yield and mean % cytotoxicity of the drugs, respectively. (C and D) Viral RNA copy numbers in the cell supernatants were quantified by qRT-PCR. Data are mean ± SD, n = 3 biological replicates.

A Novel Coronavirus from Patients with Pneumonia in China
A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: A study of a family cluster
A pneumonia outbreak associated with a new coronavirus of probable bat origin
A new coronavirus associated with human respiratory disease in China
Genomic characterisation and epidemiology of 2019 novel coronavirus: Implications for virus origins and receptor binding
Severe acute respiratory syndrome-related coronavirus: The species and its viruses-a statement of the Coronavirus Study Group
WHO Director-General's opening remarks at the media briefing on COVID-19-11 /dg/speeches/detail/who-director-general-s-opening-remarks-atthe-media-briefing-on-covid
First Case of 2019 Novel Coronavirus in the United States
Remdesivir and chloroquine effectively inhibit the recently emerged novel coronavirus (2019-nCoV) in vitro
Can an anti-HIV combination or other existing drugs outwit the new coronavirus?
Emerging coronaviruses: Genome structure, replication, and pathogenesis
Identification of novel subgenomic RNAs and noncanonical transcription initiation signals of severe acute respiratory syndrome coronavirus
The newly emerged SARS-like coronavirus HCoV-EMC also has an "Achilles' heel": Current effective inhibitor targeting a 3C-like protease
Structure of coronavirus main proteinase reveals combination of a chymotrypsin fold with an extra α-helical domain
The crystal structures of severe acute respiratory syndrome virus main protease and its complex with an inhibitor
Coronavirus main proteinase (3CL pro) structure: Basis for design of anti-SARS drugs
Phase II, randomized, double-blind, placebo-controlled studies of ruprintrivir nasal spray 2-percent suspension for prevention and treatment of experimentally induced rhinovirus colds in healthy volunteers
Reversal of the Progression of Fatal Coronavirus Infection in Cats by a Broad-Spectrum Coronavirus Protease Inhibitor
Design of wide-spectrum inhibitors targeting coronavirus main proteases
α-Ketoamides as Broad-Spectrum Inhibitors of Coronavirus and Enterovirus Replication: Structure-Based Design, Synthesis, and Activity Assessment
Cyanohydrin as an Anchoring Group for Potent and Selective Inhibitors of Enterovirus 71 3C Protease
Production of authentic SARS-CoV M pro with enhanced activity: Application as a novel tag-cleavage endopeptidase for protein overproduction
Phaser crystallographic software
Features and development of Coot
PHENIX: A comprehensive Python-based system for macromolecular structure solution

collected the diffraction data and solved the crystal structure; Y. L. and X. C. performed the toxicity experiments

We thank Prof. James Halpert and LetPub (www.letpub.com) for linguistic assistance during the preparation of this manuscript.
We also thank the staff from beamlines BL17U1, BL18U1 and BL19U1 at Shanghai Synchrotron Radiation Facility (SSRF) for assistance during data collection.

Funding: We are grateful to the National Natural Science Foundation of China (Nos. 21632008, 21672231, 21877118, 31970165, 91953000 and 81620108027), the Strategic Priority Research Program

Competing interests: The Shanghai Institute of Materia Medica has applied for PCT and Chinese patents which cover 11a, 11b and related peptidomimetic aldehyde compounds.

Data and materials availability: All data are available in the main text or the supplementary materials. The PDB accession No. for the coordinates of SARS-CoV-2 M pro in complex with 11a is 6LZE, and the PDB accession No. for the coordinates of SARS-CoV-2 M pro in complex with 11b is 6M0K. The plasmid encoding the SARS-CoV-2 M pro will be freely available. Compounds 11a and 11b are available from H. L. under a material transfer agreement with Shanghai Institute of Materia Medica. There is currently an international effort to join forces to design better inhibitors of SARS-CoV-2 M pro, as described on the following website: https://covid.postera.ai/covid.

This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. This license does not apply to figures/photos/artwork or other content included in the article that is credited to a third party; obtain authorization from the rights holder before using such material.

science.sciencemag.org/cgi/content/full/science.abb4489/DC1
Materials and Methods
Scheme S1
Figs. S1 to S5
Tables S1 to S4
References (26) (27) (28) (29)