key: cord-1035449-zwipydu8
authors: Li, Jian; Zhou, Xuelan; Zhang, Yan; Zhong, Fanglin; Lin, Cheng; McCormick, Peter J.; Jiang, Feng; Zhou, Huan; Wang, Qisheng; Duan, Jingjing; Zhang, Jin
title: Crystal structure of SARS-CoV-2 main protease in complex with a Chinese herb inhibitor shikonin
date: 2020-06-17
journal: bioRxiv
DOI: 10.1101/2020.06.16.155812
sha: 833716d3a336b2c2defd347d0788cf3197719428
doc_id: 1035449
cord_uid: zwipydu8
Main protease (Mpro, also known as 3CLpro) plays a major role in coronavirus replication and is one of the most important drug targets for anticoronavirus agents. Here we report the crystal structure of the main protease of SARS-CoV-2 bound to a previously identified Chinese herb inhibitor, shikonin, at 2.45 Å resolution. Although the structure shares a similar overall fold with other published structures, several key differences highlight features that could be exploited. The catalytic dyad His41-Cys145 undergoes dramatic conformational changes, and the structure reveals an unusual arrangement of the oxyanion loop stabilized by the substrate. Shikonin and covalent inhibitors show different binding modes, suggesting diversity in inhibitor binding. As we learn more about different binding modes and their structure-function relationships, it is probable that we can design more specific drugs with high potency that can serve as effective SARS-CoV-2 antiviral agents.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), an RNA virus, infects people of all ages and can cause severe acute respiratory syndrome in high-risk individuals [1]. The main protease (Mpro, also known as 3CLpro) is essential for the production of infectious virions and plays a critical role in the replication of SARS-CoV-2 [2]. Mpro is thus an attractive target for the development of drugs against SARS-CoV-2 and other coronavirus infections.
Mpro of SARS-CoV-2 is a cysteine protease with relatively high sequence homology to other coronavirus main proteases. The catalytic dyad of Mpro is formed by His41 and Cys145. A number of studies using either in silico ligand docking or drug discovery based on available structures have been performed to discover new Mpro binding agents. Current inhibitors designed for Mpro are covalent and peptidomimetic, two properties that carry a risk of toxicity through non-specific reactions with host proteins. One previously identified inhibitor is (±)-5,8-dihydroxy-2-(1-hydroxy-4-methyl-3-pentenyl)-1,4-naphthoquinone, termed shikonin, which derives from a Chinese herb and is a major active component of the roots of Lithospermum erythrorhizon [4]. To find new scaffolds and non-covalent inhibitors and to reveal further details of inhibitor binding, we determined the crystal structure of Mpro in complex with shikonin. The structure reveals a novel binding mode that opens new opportunities for future drug development targeting Mpro (Fig. 1a, b). Shikonin and its derivatives have been reported to have antiviral, antibacterial, anti-inflammatory and anti-tumor effects [4, 5]. Previous data have shown that shikonin inhibits Mpro with an IC50 of 15.75 μM [5]. These data, in combination with the structure revealed in this study, highlight shikonin as a starting point for developing novel non-covalent antiviral molecules.
The crystal structure of Mpro in complex with shikonin (ShiMpro) has been determined at 2.45 Å resolution using a previously described Mpro construct (Table 1, Fig. 2a). The ShiMpro structure shows the same overall fold observed for the previous apo state structure of Mpro at pH 7.5 (apoMpro) [3]. The r.m.s. difference in equivalent Cα positions between apoMpro and ShiMpro is roughly 0.3 Å over all residues (Fig. 1c).
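The r.m.s. difference quoted above is the root-mean-square deviation over paired Cα positions. The following is a minimal Python sketch of that calculation, not the procedure used in this study: it assumes the two structures have already been superposed and that the coordinate lists are paired residue by residue. The function name `ca_rmsd` and the toy coordinates are hypothetical.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """RMSD (in the input units, e.g. angstroms) between two equal-length
    lists of (x, y, z) positions. Assumes the structures are already
    superposed and that element i of each list is the same residue."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Toy data (hypothetical): every atom is displaced by 0.3 along x,
# so the RMSD should be close to 0.3.
apo_ca = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]
shi_ca = [(1.3, 2.0, 3.0), (4.3, 5.0, 6.0), (7.3, 8.0, 9.0)]
rmsd = ca_rmsd(apo_ca, shi_ca)
```

In practice the superposition step (for example a least-squares fit on Cα atoms) would be performed first by a structure-analysis program; this sketch covers only the final averaging.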
Some key residues in the oxyanion hole and N-finger were found to be disordered in the apo state structure. These residues are located near the active site of the enzyme and therefore participate in the binding of substrates or inhibitors. Unlike the apo state of Mpro at pH 7.5, however, the current structure shows clear density for these residues in both protomers of the protease in the crystal. Interestingly, shikonin binds to protomer A but not to protomer B (Fig. 2b, c). The reasons are unclear but may be related to the relatively low affinity of shikonin [5]. Structurally, the two monomers adopt essentially the same conformation, with a slight difference in the oxyanion loop (Fig. 3). There are, however, remarkable differences in the conformation of the protease monomer between this inhibitor-bound complex and the apo state we reported earlier [3]. Residues in the oxyanion loop become more ordered upon inhibitor binding, as residues 140-145 appear to interact with the inhibitor. Notably, an unprecedented conformational difference in the catalytic dyad His41 and Cys145 is observed, which would lead to a steric clash between the previously reported inhibitors and the His41-Cys145 catalytic site (Fig. 1d) [5-9].
Unique binding mode of the non-covalent inhibitor shikonin
An overlay of the ShiMpro structure with previously solved inhibitor-bound structures shows relatively high spatial conservation of the three domains (Fig. 1c, Fig. 4). Shikonin contains a 1,4-naphthoquinone (1,4-NQ) group, which consists of a benzene moiety and a fully conjugated cyclic diketone with the carbonyls arranged in the para position and is referred to here as the inhibitor head group, and a chiral six-carbon side chain with a hydroxy group at C-1, defined as the ligand tail. The inhibitor binding pocket can be described as a narrow cleft surrounded by the S1-S4 subsites (Fig. 1c). Shikonin establishes a hydrogen bond network with the protease polar triad Cys145, His164 and Met165 located on the S1 subsite.
The aromatic head group of shikonin forms a π-π interaction with His41 on the S2 subsite. The hydroxy and methyl groups of the isohexenyl side-chain tail form hydrogen bonds with Gln189 and Thr190, respectively, on the S3 subsite. Superimposing ShiMpro on other inhibitor-bound structures reveals a striking difference in the arrangement of the catalytic dyad His41-Cys145 and smaller, but substantial, differences in Phe140 and Glu166. In both covalent inhibitor-bound structures, the inhibitor binds to the Sγ atom of Cys145. In the current structure with shikonin, the side chain of Cys145 adopts a different configuration and forms a hydrogen bond with shikonin (Fig. 1d, e). In addition, the imidazole group of His41, which points towards the binding pocket in other structures, flips outward, opening a way for the entry of shikonin. The distance between His41 Nε2 and Cys145 Sγ is 5.3 Å in the ShiMpro structure, significantly longer than that observed in any other reported main protease structure (Fig. 1d) [5-9]. Phe140 no longer has a π-π interaction with His163, and the phenyl ring of Phe140 undergoes a dramatic conformational change and moves outward into the solvent (Fig. 1d). The side chain of Glu166 is flexible and adopts an inactive conformation in both the apoMpro and ShiMpro structures, but is well ordered in the other known structures (Fig. 1d). It has been shown that Glu166 is critical in keeping the substrate binding pocket in a suitable shape by forming hydrogen bonds with peptidomimetic inhibitors and with the N-terminal residue of the other protomer [8]. This may explain why Glu166 is strictly conserved among all main proteases.
Two water molecules in the active site of the protease
The apoMpro structure reveals the presence of two water molecules in the substrate binding site (Fig. 5a).
Water 1 forms a hydrogen bond network involving Phe140, His163 and Glu166 located in the S1 pocket, stabilizing the oxyanion hole in the apo state structure [3]. The position of the other water molecule (water 2), which is hydrogen bonded to His41 and Cys145 in the apo state structure, is occupied by shikonin in the ShiMpro structure (Fig. 5b). Neither of these two water molecules is observed in the ShiMpro structure. We propose that inhibitors able to displace these water molecules may show significantly improved potency against the protease, as was observed when the two water molecules in the substrate binding pocket of Mpro were replaced by inhibitors (Fig. 5c) [5-9].
The current global pandemic has increased the urgency for novel small-molecule inhibitors to slow or block SARS-CoV-2 viral propagation. Here we have shown that the naphthoquinone shikonin binds to Mpro in a unique mode. Our structure reveals three novel interactions in the substrate/inhibitor binding pocket: 1) the π-stacking interaction between the shikonin naphthoquinone core and the side chain of His41 from the S2 subsite, 2) hydrogen bonds with Cys145, His164 and Met165 in the S1 pocket and 3) hydrogen bonds with Gln189 and Thr190 in the S3 pocket (Fig. 1e, f). To date, the covalent and peptidomimetic inhibitors identified have been shown to bind to the S1/S2/S4 sites, while carmofur binds only to the S2 subsite and another natural product, baicalein, binds to the S1/S2 pocket (Fig. 1g) [5-9]. The ShiMpro structure highlights a new binding mode for non-covalent and natural product inhibitors, with distinct local conformational changes in the substrate binding pocket, and represents an exciting novel scaffold derived from a Chinese medicinal herb. The Mpro structure reported here in complex with the natural product shikonin provides an invaluable resource for designing improved antiviral drugs against this important therapeutic target.
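Interatomic distances such as the His41 Nε2 to Cys145 Sγ separation reported above can be recomputed from coordinates in PDB format. The sketch below is illustrative only and assumes standard fixed-column ATOM/HETATM records; the function names, atom serial numbers and coordinates in the two example records are hypothetical and are not taken from the deposited structure.

```python
import math

def parse_atoms(pdb_lines):
    """Map (chain, residue number, atom name) -> (x, y, z) using the
    fixed-column layout of PDB ATOM/HETATM records."""
    atoms = {}
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        key = (line[21], int(line[22:26]), line[12:16].strip())
        atoms[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return atoms

def distance(a, b):
    """Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Hypothetical records: coordinates are invented so the two atoms
# lie 5.3 apart along z.
example_lines = [
    "ATOM    316  NE2 HIS A  41      10.000  20.000  30.000  1.00 30.00           N",
    "ATOM   1120  SG  CYS A 145      10.000  20.000  35.300  1.00 30.00           S",
]
atoms = parse_atoms(example_lines)
d = distance(atoms[("A", 41, "NE2")], atoms[("A", 145, "SG")])
```

With real coordinates, the same lookup keys would be applied to the lines of the downloaded file for the relevant protomer.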
The cDNA of full-length SARS-CoV-2 main protease 3CL (NC_045512) was
The crystals were harvested with cryo-loops (Hampton Research, USA) and then flash-frozen in liquid nitrogen for X-ray data collection. The data set was collected at 100 K on the macromolecular crystallography beamline 17U1 (BL17U1) at the Shanghai Synchrotron Radiation Facility (SSRF, Shanghai, China). All collected data were processed with the HKL-2000 software package. The structure was determined by molecular replacement with the PHENIX software, using the structure 7C2Q as the search model. The program Coot was used to rebuild the initial model. The models were refined to a resolution limit of 2.45 Å using the PHENIX software. Complete data collection and refinement statistics are shown in Table 1. The structure has been deposited in the PDB (PDB code 7CA8).
Values in parentheses are for the highest-resolution shell.
1. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
2. Coronaviruses - drug discovery and therapeutic options
3. Structure of SARS-CoV-2 main protease in the apo state reveals the inactive conformation
4. phytocompounds from Lithospermum erythrorhizon, inhibit the transcriptional activation of human tumor necrosis factor alpha promoter in vivo
5. Structure of M(pro) from COVID-19 virus and discovery of its inhibitors
6. Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease
7. Structural basis for the inhibition of SARS-CoV-2 main protease by antineoplastic drug carmofur
8. Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors
9. Discovery of baicalin and baicalein as novel, natural product inhibitors of SARS-CoV-2 3CL protease in vitro
We thank the SSRF BL17U1 beamline for data collection and processing.
The authors declare that they have no conflict of interest.