key: cord-1020457-1zygu325 authors: Iketani, Sho; Forouhar, Farhad; Liu, Hengrui; Hong, Seo Jung; Lin, Fang-Yu; Nair, Manoj S.; Zask, Arie; Huang, Yaoxing; Xing, Li; Stockwell, Brent R.; Chavez, Alejandro; Ho, David D. title: Lead compounds for the development of SARS-CoV-2 3CL protease inhibitors date: 2020-08-04 journal: bioRxiv DOI: 10.1101/2020.08.03.235291 sha: 415e50eb754c89ba439cb8a373c859ecbbbe43da doc_id: 1020457 cord_uid: 1zygu325

We report the identification of three structurally diverse compounds – compound 4, GC376, and MAC-5576 – as inhibitors of the SARS-CoV-2 3CL protease. Structures of each of these compounds in complex with the protease revealed strategies for further development, as well as general principles for designing SARS-CoV-2 3CL protease inhibitors. These compounds may therefore serve as leads for the basis of building effective SARS-CoV-2 3CL protease inhibitors.

[...] these compounds for inhibition of SARS-CoV-2 viral replication. We found that compound 4 and GC376 could block viral infection (EC50 values (mean ± s.d.): 3.023 ± 0.923 µM and 4.481 ± 0.529 µM, respectively), whereas MAC-5576 did not (Fig. 1c). Finally, we confirmed that these compounds did not result in cytotoxicity to the cells at the tested concentrations (Extended Data Fig. 2).

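To make the reported EC50 values concrete, the following minimal sketch evaluates a simple Hill-type dose-response curve at the mean EC50 values quoted above. The Hill slope of 1, the assumption of complete inhibition at saturating concentration, and the names `EC50_UM` and `fraction_inhibited` are illustrative assumptions; the fitting procedure actually used in the study is not described in this excerpt.

```python
# Illustrative only: a Hill-type dose-response model evaluated at the mean
# EC50 values reported in the text. The Hill slope (1.0) and full inhibition
# at saturation are assumptions, not details taken from the study.

# Mean EC50 values quoted in the text, in micromolar.
EC50_UM = {"compound 4": 3.023, "GC376": 4.481}


def fraction_inhibited(conc_um: float, ec50_um: float, hill_slope: float = 1.0) -> float:
    """Return the modeled fraction of infection blocked at a given concentration."""
    if conc_um < 0 or ec50_um <= 0:
        raise ValueError("concentration must be >= 0 and EC50 must be > 0")
    if conc_um == 0:
        return 0.0
    ratio = (conc_um / ec50_um) ** hill_slope
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    # Evaluate each compound below, at, and above its mean EC50.
    for name, ec50 in EC50_UM.items():
        for conc in (1.0, ec50, 10.0):
            print(f"{name}: {conc:.3f} uM -> {fraction_inhibited(conc, ec50):.2f}")
```

Under this model the function returns 0.5 when the concentration equals the EC50, which is the defining property of that value. At any fixed concentration, the compound with the lower EC50 gives the higher modeled inhibition.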
As the three compounds exhibited inhibitory activity against the SARS-CoV-2 3CL, we proceeded to solve the crystal structure of the apo 3CL protease alone and of each of these compounds in complex with the protease to understand their mechanism of binding as well as to guide future structure-based optimization efforts. We note that while MAC-5576 did not exhibit activity in the cellular assay, its low molecular weight and reasonable biochemical activity prompted us to pursue its crystallization as well, as our goal was to broadly investigate inhibitory scaffolds for the SARS-CoV-2 3CL protease. Crystals were obtained (see Methods for detailed information) and structures at 1.85 Å, 1.80 Å, 1.83 Å, and 1.73 Å resolution limits for apo 3CL and 3CL bound to compound 4, GC376, and MAC-5576, respectively, were solved (Fig. 2a, b, c, Extended Data Fig. 3, and Extended Data Fig. 4) [...] reported, allowing it to then react with Cys145 through nucleophilic addition and hemithioacetal formation (Fig. 2b) 8. MAC-5576 also covalently modified Cys145 by nucleophilic linkage, as expected (Fig. 2c).

As we solved the structures for multiple compounds, we hypothesized that general principles for the design of SARS-CoV-2 3CL protease inhibitors could be identified. We first overlaid all four crystal structures of the 3CL with or without inhibitors (Extended Data Fig. 5). We observed local conformational changes, with Thr45 to Pro52 distinct from the apo 3CL in all three inhibitor-bound structures, whereas Arg188 to Gln192 differed only in the compound 4-bound and GC376-bound structures, but not in the MAC-5576-bound structure. We then overlaid each of the inhibitors in the substrate binding pocket of the 3CL protease to find commonalities in their interactions (Fig. 2d). Most notably, we found that all of these compounds occupied the S2 site, with compound 4 and GC376 further anchored in the S1 and S3 sites.
The backbone NH of Gly143 points toward the ligand binding pocket, forming hydrogen bonds with the carbonyl oxygen of the ethyl ester of compound 4 and with the hemithioacetal of GC376 formed after Cys145 addition to the original aldehyde, although the former hydrogen bond is stronger than the latter. In both structures, the γ-lactam groups occupy the S1 site and are strongly anchored by two hydrogen bonds with the side chains of His163 and Glu166. The isobutyl groups are favorably embedded in the hydrophobic S2 site, surrounded by the alkyl portion of the side chains of His41, Met49, His164, Met165, Asp187, and Gln189. Extending into the S3 pocket, the amide bonds of compound 4 and GC376 are stabilized by hydrogen bond interactions with the side chain of Gln189. Similar interactions are also observed in reports of related compounds, suggesting that, overall, the binding modes of this class of substrate-mimetic inhibitors share remarkable similarities 5,6,10. Specifically, they all have a γ-lactam occupying the S1 pocket, preserving the dual hydrogen bonds with His163 and Glu166. Furthermore, they commonly contain a hydrophobic moiety occupying the S2 site. As shown in a structural overlay of compound 4 and GC376 with these related compounds (Extended Data Fig. 6), the S1-to-S2 segments of the inhibitors align closely on top of each other. Variations in binding start to emerge in the S3 and S4 region, which exhibits high degrees of freedom in terms of structural diversity as well as conformational flexibility. In our experiments, the S3 and S4 sites displayed weaker electron density, indicating flexibility in the inhibitor and/or the protease in these regions (Extended Data Fig. 4). These observations suggest that development of 3CL protease inhibitors may benefit from first establishing robust interactions within the S1, S2, and/or S1' sites, before extending into the S3 and S4 sites.
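The subsite occupancy described above can be summarized in a small lookup table. The sketch below is only a bookkeeping aid built from the sites and residues named in the text. The dictionary layout and the names `OCCUPIED_SUBSITES`, `SUBSITE_CONTACTS`, and `shared_subsites` are hypothetical, and no coordinates or additional interactions are implied.

```python
# Bookkeeping sketch of the subsite occupancy described in the text.
# All names and the data layout are illustrative assumptions.

# Subsites each inhibitor is described as occupying.
OCCUPIED_SUBSITES = {
    "compound 4": {"S1", "S2", "S3"},
    "GC376": {"S1", "S2", "S3"},
    "MAC-5576": {"S2"},
}

# Residues named in the text for compound 4 and GC376 in each subsite.
SUBSITE_CONTACTS = {
    "S1": ["His163", "Glu166"],
    "S2": ["His41", "Met49", "His164", "Met165", "Asp187", "Gln189"],
    "S3": ["Gln189"],
}


def shared_subsites(compounds):
    """Return the set of subsites occupied by every listed compound."""
    if not compounds:
        raise ValueError("at least one compound name is required")
    occupied = [OCCUPIED_SUBSITES[name] for name in compounds]
    return set.intersection(*occupied)


if __name__ == "__main__":
    # Subsites common to all three inhibitors, then to the two peptidomimetics.
    print(sorted(shared_subsites(list(OCCUPIED_SUBSITES))))
    print(sorted(shared_subsites(["compound 4", "GC376"])))
```

Taking the intersection across all three inhibitors leaves only S2, which matches the observation that S2 is the one site common to every compound. Restricting the comparison to compound 4 and GC376 recovers S1, S2, and S3.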
Possibly, compounds such as compound 4 and GC376 are not optimized for binding into the S3 and S4 sites, and there are ample opportunities to improve the inhibitory potencies against the 3CL by designing compounds that exploit the accessible contact points to strengthen the ligand-protein interactions.

On the other hand, the binding of MAC-5576, as a non-peptidic small molecule, displays unique features that differ from those of compound 4 or GC376. We observed that the thiophene group forms π-π stacking with the His41 side chain imidazole, which undergoes a conformational rotation around its beta-carbon to align parallel to the thiophene, as compared to the other peptide- [...]

Structural data for the apo SARS-CoV-2 3CL protease and 3CL in complex with compound 4, GC376, and MAC-5576 will be deposited in the Protein Data Bank (PDB) and made publicly available upon publication. Source data for Fig. 1, Extended Data Fig. 1b, Extended Data Fig. 2, and the unprocessed gel for Extended Data Fig. 1a are available with the paper online.

[...] apo form (gray), in complex with compound 4 (green for 3CL and magenta for compound 4), GC376 (cyan for 3CL and yellow for GC376), and MAC-5576 (orange for 3CL and purple for MAC-5576 fragment). One protomer for each structure is shown, with the inhibitors shown with stick models. The terminal residue of each structure, as well as two stretches of residues near the binding site that exhibit local conformational change between the apo and inhibitor-bound structures, are labeled.

Comparison of the binding modes of MAC-5576 with XP-59.
MAC-5576 (purple) bound to the SARS-CoV-2 3CL protease (orange) was overlaid with XP-59 bound to the SARS-CoV-1 [...]

A new coronavirus associated with human respiratory disease in China
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Remdesivir for the Treatment of Covid-19 - Preliminary Report
Remdesivir in adults with severe COVID-19: a randomised, double-blind, placebo-controlled, multicentre trial
Structure of M(pro) from SARS-CoV-2 and discovery of its inhibitors
Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved alpha-ketoamide inhibitors
Synthesis, crystal structure, structure-activity relationships, and antiviral activity of a potent SARS coronavirus 3CL protease inhibitor
Broad-spectrum antivirals against 3C or 3C-like proteases of picornaviruses
High-throughput screening identifies inhibitors of the SARS coronavirus main proteinase
Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease
A structural view of the inactivation of the SARS coronavirus main proteinase by benzotriazole esters
A deliberate approach to screening for initial crystallization conditions of biological macromolecules
Integration, scaling, space-group assignment and post-refinement
Molecular replacement with MOLREP
XtalView/Xfit--A versatile program for manipulating atomic coordinates and electron density
PHENIX: a comprehensive Python-based system for macromolecular structure solution
Electrostatics of nanosystems: application to microtubules and the ribosome
Cyanohydrin as an Anchoring Group for Potent and Selective Inhibitors of Enterovirus 71 3C Protease