key: cord-0733962-i5m0h330 authors: Semenov, Valentin A.; Krivdin, Leonid B. title: Combined Computational NMR and Molecular Docking Scrutiny of Potential Natural SARS-CoV-2 M(pro) Inhibitors date: 2022-03-10 journal: J Phys Chem B DOI: 10.1021/acs.jpcb.1c10489 sha: 155c5c6d99a3c037870ef2ec1d5da41161dc4cd0 doc_id: 733962 cord_uid: i5m0h330
In continuation of the search for potential drugs that inhibit the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), in this work, a combined approach based on the modeling of NMR chemical shifts and molecular docking is suggested to identify the possible suppressors of the main protease of this virus among a number of natural products of diverse nature. Primarily, with the aid of an artificial neural network, the problem of the reliable determination of the stereochemical structure of a number of studied compounds was solved. Complementary to the main goal of this study, theoretical modeling of NMR spectral parameters made it feasible to perform a number of signal reassignments together with introducing some missing NMR data. Finally, molecular docking formalism was applied to the analysis of several natural products that could be chosen as prospective candidates for the role of potential inhibitors of the main protease. The results of this study are believed to assist in further research aimed at the development of specific drugs based on the natural products against COVID-19.
A highly infectious novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was first identified in the city of Wuhan, Hubei, China. The World Health Organization declared the outbreak a "Public Health Emergency of International Concern" on January 30, 2020, and a pandemic on March 11, 2020.
Although a detailed understanding of the pathophysiology of SARS-CoV-2 has not yet been fully gained, the genetic structure of this virus has nevertheless been determined, and several potential drug targets have been identified. The virus is known to contain four non-structural proteins: papain-like (PL pro) and 3-chymotrypsin-like (3CL pro or M pro) proteases, RNA polymerase (RDRp), and helicase, 1 as shown in Figure 1. It was found that only proteases (PL pro and M pro) are involved in the transcription of this virus. 2 The main protease of SARS-CoV-2 has been shown to have about 96% sequence similarity to that of SARS-CoV. 3 Therefore, one of the important targets that antiviral molecules can potentially aim at is the main protease M pro. The crystal structure of M pro in the complex with the peptide inhibitor N3 was first deposited to the Protein Data Bank (PDB ID: 6LU7) in 2020. 4 Since then, more than 230 M pro structures have been deposited to PDB in combination with various fragments. The M pro active site contains the catalytic dyad Cys145−His41, located between the two domains of the protease. It is assumed that inhibitors targeting this site should exhibit a broad spectrum of anti-SARS-CoV-2 activity, because the M pro binding site is highly conserved among all types of coronaviruses. 4, 5 In the absence of proteases with similar cleavage site specificity, the M pro binding site is likely to remain constant for a long time, because it is less prone to drug resistance mutations. The Journal of Physical Chemistry B pubs.acs.org/JPCB Article In addition to the known drugs, several synthetic antiviral compounds are currently at the stage of preclinical and clinical trials.
At the same time, a possible negative modulation of the main protease of SARS-CoV-2, which could lead to new developments in therapeutic drugs, was demonstrated by several natural products, including a number of alkaloids, flavonoids, and terpenoids with high affinity to M pro. 6−11 Moreover, several theoretical approaches based on the calculations of free energy and molecular dynamics, as well as virtual screening of exposed ligand series, are useful in identifying the inhibitory activity against M pro. 12−24 It is well known that the theoretical calculation of NMR chemical shifts provides a powerful tool in structural elucidation of organic molecules, 25−29 such as natural products, carbohydrates, and biochemical species, including those with potential anti-SARS-CoV-2 activity. In continuation of our recent computational NMR survey of parent strychnine 30 with a related family of Strychnos alkaloids 31, 32 and their much larger derivatives 33, 34 together with a number of antimalarial plant and marine alkaloids, 35 in this paper we investigated a series of natural products of diverse chemical origins providing potential SARS-CoV-2 M pro inhibitory activities. Here, we report on the combination of the predictive power of computational NMR and molecular docking applied to the study of the compounds that were shown 36 to provide inhibitory activities against the main protease: flavonoids (1−3), quinones (4−6), coumarin (7), alkaloids (8−10), steroids (11, 12), tannins/ellagitannins (13−15), lignan (16), and other terpenes and terpenoids (17−24); see Scheme 1.
■ COMPUTATIONAL DETAILS
Geometry Optimization. Geometry optimization of 1−24 was performed with GAUSSIAN 09 37 code at the M06-2X/cc-pVTZ level (for nitrogen and oxygen atoms, the extended aug-cc-pVTZ basis set with diffuse functions was used to take into account the effect of the diffuse lone pairs). Evaluated structures were identified as minima on their potential energy surfaces.
Solvent effects were accounted for within the integral equation formalism polarizable continuum model (IEF-PCM). 38, 39 Cartesian coordinates of all optimized structures 1−24 are given in the Supporting Information. Calculation of Chemical Shifts. All calculations of 1 H and 13 C NMR isotropic magnetic shielding constants (the latter being converted into a chemical shift scale) were carried out at the gauge-including atomic orbital (GIAO) DFT level for a liquid phase by employing the DALTON package. 40 The well-known GIAO method is based on the approach that introduces local gauge origins to define the vector potential of the external magnetic field. In our calculations, we used the mPW1PW91/6-31G(d,p) computational scheme. In this scheme, the one-parameter hybrid functional mPW1PW91 (which is based on the Perdew−Wang exchange), as modified by Adamo and Barone, combined with PW91 correlation 41 was employed throughout in combination with Pople's double-zeta basis set. 42 Calculated proton and carbon isotropic magnetic shielding constants were converted into 1 H and 13 C NMR chemical shifts as recommended by the International Union of Pure and Applied Chemistry (IUPAC). 43 To take into account the systematic error of calculated chemical shifts, we established correlations between their calculated and experimental values. These correlations were then used to derive linear correlation equations of the y = ax + b type. The slope a and intercept b were then used for recalculating theoretical chemical shifts into the experimental δ-scale by using the equation δ_recalc = (δ_calc − b)/a. The mean absolute errors (MAE) and the corrected mean absolute errors (CMAE) were evaluated in all particular cases. Preparation of Protein for Docking and Grid Generation.
The crystal structure of the main protease (having resolution 2.16 Å, R-value free 0.24, R-value work 0.20) in a complex with a peptide-like inhibitor N3 was obtained from the Protein Data Bank 44 (PDB ID: 6LU7 4 ). The 6LU7 protein contains two chains, A and B, which form a homodimer. Chain A was used for macromolecule preparation with AutoDock-Tools. 45 The co-factor and water molecules were removed, and hydrogen atoms were added. The partial atomic charges were predicted via the Gasteiger−Marsili approach 46 after hydrogens had been added to all the required atoms of the protein. The receptor grid maps were calculated with AutoGrid 4.2, 45 mapping the receptor interaction energies using the ligand atom types as probes. The grid of 60 × 60 × 60 Å with 0.375 Å spacing was centered on the coordinates of the ligand originally present in the 6LU7 crystal. Molecular Docking Simulations. The molecular docking using the AutoDock 4.2 software 45 was employed to rapidly determine the ligand-binding pose and affinity to SARS-CoV-2 M pro. The energetically minimized structures of all studied compounds, which were obtained as a result of optimization in the first stage of this study, were used as ligand molecules in the docking simulations. A Lamarckian genetic algorithm 47 was used to generate docking simulations with an initial population size of 300 random individuals and 50 independent runs. Initial orientation and position of the ligands were randomly set. The best docking modes of all compounds were selected from their conformations based on the binding energy values as well as on significant non-valence interactions observed with the receptor. The docking analyses were performed by using Schrödinger Maestro 11.5, 48 MOE 2014, 49 and AutoDock-Tools. 45 Validation of the Docking Protocol. The docking procedure was validated by re-docking the peptide-like inhibitor N3 extracted from a crystallographic structure (6LU7) into the receptor at the same binding pocket.
These results indicated that the applied protocol was reliable because a good overlap was achieved between the ligand pose and the X-ray pose with a binding affinity of −9.0 kcal/mol and a root mean square deviation (RMSD) reference value of 1.90 Å. Identification of Possible Stereochemical Misassignments. All studied compounds were checked for possible inconsistencies with their originally established structures. For this purpose, the artificial neural network (ANN) pattern recognition methodology developed by Sarotti 50 was used throughout. Calculated shielding constants were compared with the experimental chemical shifts by using the two-layer feedforward ANN. Within the framework of the formalism of this neural network, three different sets of statistical descriptors were used as input vectors for the input layer, namely, correlation coefficient R², slope a, and intercept b (see the section "Calculation of Chemical Shifts"). The MAE, standard deviation, maximum error, and their corresponding corrected descriptors were also used for the statistical treatment of the results. Thus, for each reference standard used in the calculation of NMR chemical shifts, a total of nine statistical parameters were evaluated. As a result, two categories were used as the
■ RESULTS AND DISCUSSION
Molecular Objects. Herein, we will briefly comment on the isolation and general biological activity of natural products 1−24 selected for this study. Kaempferol (1) together with other strawberry flavonoids isolated from Fragaria ananassa Duch was found to effectively suppress degranulation in rat basophilic leukemia cells. 51 It was established that the suppression of the Ag-stimulated degranulation by kaempferol was mainly due to the suppression of the Ca 2+ elevation together with the spleen tyrosine kinase activation.
Baicalein-7-O-glucoside (2) has been isolated from the seeds of Oroxylum indicum and identified by high-speed countercurrent chromatography by Chen, Games, and Jones 52 and later by Yuan and coauthors. 53 O. indicum is a traditional herbal medicine, possessing analgesic, antitussive, and anti-inflammatory activity. Active ingredients characterized from the seeds and leaves of this plant are several flavonoids, including baicalein-7-O-glucoside, which were shown to reduce the total cholesterol level and have detoxification and chemo-preventative effects. Procyanidin B2 (3), extracted from hawthorn, belongs to the well-known proanthocyanidins, which are widely distributed in nature, provide a range of therapeutic effects, and are often considered the active ingredients of herbal medicines. Khan, Haslam, and Williamson 54 established its structure as a proanthocyanidin dimer and performed a detailed conformational study by applying heteronuclear NMR correlation. In a study to discover potential anticancer agents from rhizosphere fungi of Sonoran desert plants, a new metabolite, terrequinone A (4), was isolated from Aspergillus terreus, occurring in the rhizosphere of Ambrosia ambrosoides. 55 The structure of 4 was elucidated mainly by means of 1 H and 13 C NMR techniques involving one- and two-dimensional COSY, ROESY, NOESY, and HMBC together with UV, IR, and MS including APCIMS and HRFABMS methods. Terrequinone A was found to exhibit moderate selective cytotoxicity against cancer cell lines as compared with the normal fibroblast cells. Zeylanone (5) was obtained from the crude extracts of Diospyros maritima bark, which is traditionally used in Indonesia for the treatment of rheumatic diseases. This natural product displays a range of biological effects, such as ichthyotoxicity and antibacterial activity. The structure of zeylanone was most completely elucidated by Gu and coauthors 56 by means of diverse NMR experiments involving, in particular, COSY, ROESY, HSQC, and HMBC.
The structure of carminic acid (6), a degradation product of the well-known coloring matter of cochineal and the color pigment of the Middle and South American shield louse Coccus cacti coccinelliferi, was established by chemical means as early as 1964 by Overeem and van der Kerk. 57 It was reinvestigated much later by Schmitt et al., 58 who performed a detailed NMR investigation of this interesting natural product by applying a variety of one- and two-dimensional NMR techniques. Cinchonain-Ib (7) was isolated by Nonaka and Nishioka 59 as a result of the investigation of the relatively low molecular weight phenolic compounds in the bark of Cinchona succirubra. A direct coupling of caffeic acid with epicatechin leading to the formation of the cinchonains has also been documented in that study. The structure and stereochemistry of cinchonain-Ib were established based on the physicochemical methods available at that time. 2,2-Di(3-indolyl)-3-indolone (8) was originally isolated from a Vibrio sp. separated from the marine sponge Hyrtios altum. 60 This molecule has also been very recently produced from a clone based on the functional screening of the metagenome library generated from the marine sponge Discodermia calyx. 61 The 1 H and 13 C NMR spectra and COSY, HMBC, and HMQC correlations obtained in that study, together with high-resolution electrospray ionization mass spectrometry (ESIMS), enabled unambiguous identification of its structure. The dichloromethane/methyl alcohol extract of the stem bark of Oriciopsis glaberrima ENGL. (Rutaceae) afforded four new acridone alkaloids of the oriciacridone family including oriciacridone F (9). 62 All structures were established based on the MS, 1D, and 2D NMR experiments. Oriciacridone F showed potent activity against α-glucosidase together with moderate free radical scavenging activity against 1,1-diphenyl-2-picrylhydrazyl (known as DPPH).
Asperlicin (10), a competitive cholecystokinin antagonist produced from Aspergillus alliaceus, was originally isolated by Chang et al., 63 and later, its structure was revised by Sun, Byard, and Cooper 64 by means of mainly DEPT, HMQC, and HMBC experiments to yield the reestablished structure. Physalin D (11), extracted from the fractions of Physalis angulata L., was shown to play a relevant role in the antimycobacterial activity. 65 Structural elucidation of physalin D performed in that study was based on detailed 1 H and 13 C NMR spectral analyses with the aid of 2D-correlation spectroscopy (COSY, HSQC, and HMBC). The assignment of its 13 C NMR chemical shifts was reported in that study for the first time. Diosgenin glucoside (12), known also as diosgenin-3-O-β-D-glucopyranoside, was extracted from the natural steroidal saponin mixture; such saponins are widely distributed in plants, such as Paridis polyphylla. 66 Diosgenin glucoside is known to improve cardiovascular function and to show anti-platelet aggregation, antitumor, and anti-diabetic activities. The structure of that compound was deduced mainly from 1 H and 13 C NMR data. Strictinin (13) was extracted from the leaves of Camellia irrawadiensis and was found to show an antiallergic effect and to promote hair growth. 67 It was identified as 1-O-galloyl-4,6-O-(S)-hexahydroxydiphenoyl-β-D-glucose (13) by a direct comparison of its spectral data (NMR, MS, and circular dichroism) and specific rotation with those of an authentic sample, which was isolated from the green tea extract derived from C. sinensis. Pedunculagin (14), an active inhibitor of carbonic anhydrase, 68 was isolated much earlier from the pericarps of Punica granatum L. Its stereochemical structure was established by Tanaka, Nonaka, and Nishioka 69 by means of an extensive NMR study.
Phyllaemblicin B (15) was isolated from the roots of Phyllanthus emblica along with 15 tannins and related compounds, 70 the latter known for their anti-inflammatory and antipyretic effects in many local traditional medicinal systems, such as the Chinese herbal medicine, Tibetan medicine, and Ayurvedic medicine. The structure of phyllaemblicin B was established by means of NMR and other spectral and chemical methods. Berchemol (16), a representative of the epoxylignans family, was extracted from the seeds of Centaurea cyanus and identified by means of UV, ESIMS, high-resolution MS, 1 H and 13 C NMR, and a series of 2D NMR experiments. Unambiguous and complete assignment of its 1 H and 13 C NMR signals was reported. 71 Betulinic acid (17) was isolated earlier from the leaves of Nerium oleander 72 and most recently from the stems of Combretum laxum, 73 the former showing strongest activities against Candida albicans and Cryptococcus neoformans with a minimum inhibitory concentration of 100 μg/mL. Limonin (18) was isolated from the chloroform layer of the bark of Phellodendron amurense and was found to show little cytotoxicity against the human cancer cell lines. 74 The structure of limonin was identified by a comparison of its spectral data (UV, IR, MS, 1 H, and 13 C NMR) with those reported in the literature. Copaene (19), a potent attractant for male Mediterranean fruit flies, Ceratitis capitata, was found to be a minor component in the oils of various plant species, including its hosts such as orange, guava, and mango. 75 It was also shown in that study that copaene affected virgin females, provoking "pseudomale" courtship behavior in the short-range bioassay. Mating occurred exclusively on the artificial leaves treated with copaene, suggesting that the compound potentially served as a chemical cue to facilitate orientation of flies to the "rendezvous" site. Historically, copaene was first synthesized by Wenkert, Bookser, and Arrhenius.
76 Iguesterin (20), which belongs to the quinone-methide triterpene family, was isolated from Tripterygium regelii 77 and provided potent inhibitory activities against SARS. The worldwide outbreak of SARS was caused by infection with the novel coronavirus SARS-CoV, whose 3CL pro 78 mediates the proteolytic processing of replicase polypeptides into functional proteins. Iguesterin was found to be an effective drug against SARS coronavirus. N-Feruloyltyramine (21) was isolated and characterized by Park 79 as the P-selectin expression suppressor from garlic (Allium sativum), a medicinal and culinary plant reported to have several positive health effects on cardiovascular diseases, particularly via the suppression of platelet activation. It was shown that N-feruloyltyramine was able to effectively suppress P-selectin expression on platelets. Potential effects of N-feruloyltyramine on the cyclooxygenase enzymes were found to be of major importance. Mulberrofuran G (22) was isolated from the ethyl acetate extract of the root bark of cultivated mulberry tree Morus lhou Koidz. 80 The structure of mulberrofuran G was established based on the spectral and chemical evidence. It was found that intravenous injection of mulberrofuran G caused a marked depression effect in rabbits. Itoaic acid (23), which belongs to seco-friedelolactones, was isolated from the bark and twigs of Itoa orientalis. 81 The structure of this compound was elucidated by means of MS together with different 1D and 2D NMR techniques. Anti-inflammatory activity was evaluated for several compounds extracted from the plant Xylosma controversum. Anabsinthin (24) and accompanying natural products were isolated from Artemisia absinthium L., 82 the latter commonly known as wormwood, a yellow-flowering perennial plant distributed throughout Europe and Siberia and used for its antiparasitic effects and for the treatment of anorexia and indigestion.
The structure of anabsinthin was elucidated by means of various physicochemical methods, including optical rotation, Fourier transform infrared spectroscopy, NMR, and high-resolution mass spectrometry. Structural Validation. Within the framework of the approach based on ANN, 50 the correctness of the initially established stereochemical structure of compounds 1−24 was verified. For this purpose, calculated shielding constants together with experimental 1 H and 13 C NMR chemical shifts were subjected to correlation analysis. Further, the resulting set of 18 statistical descriptors for each of the 24 compounds was used as input vectors for the ANN input layer. Based on the pattern recognition results, it was revealed that stereochemical structures of all studied compounds were correctly established with a confidence of 59.4−99.9% with a mean value of 95.4%. Spectral Analyses and Reassignments. A comparison of calculated and experimental NMR chemical shifts allowed us to perform a number of spectral assignments and possible reassignments in the series of studied natural products, as presented in Schemes 2−4. Among those are the reassignments of C-20 and C-33 in asperlicin (10) (17), as well as a number of predicted 1 H chemical shifts of those two compounds, are presented in Scheme 3. Finally, shown in Scheme 4 are the predicted additional assignments of 1 H NMR chemical shifts of limonin (18) and α-copaene (19), which were not reported before this study. Shown in Figure 2 are the final correlation plots of calculated and experimental 1 H and 13 C NMR chemical shifts of the studied natural products of diverse chemical origin 1−24, providing anti-SARS-CoV-2 M pro activity. For protons, the RMSD was 0.37 ppm for the range of about 10 ppm, while MAE and CMAE were found to be 0.80 and 0.22 ppm (405 points), respectively.
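The MAE and CMAE values reported in this paragraph derive from the linear correction described under "Calculation of Chemical Shifts". A minimal sketch of that correction is given here, assuming NumPy and reading the correlation as δ_calc = a·δ_exp + b. The function name `scale_shifts` and the four sample shifts are hypothetical and are not data from this study.

```python
import numpy as np


def scale_shifts(delta_calc, delta_exp):
    """Fit delta_calc = a * delta_exp + b and rescale the calculated shifts.

    Returns slope a, intercept b, the rescaled shifts (delta_calc - b) / a,
    the MAE of the raw shifts, and the CMAE of the rescaled shifts.
    """
    delta_calc = np.asarray(delta_calc, dtype=float)
    delta_exp = np.asarray(delta_exp, dtype=float)
    # Least-squares line with experiment on x and calculation on y
    # (an assumed reading of "y = ax + b" in the text).
    a, b = np.polyfit(delta_exp, delta_calc, 1)
    delta_recalc = (delta_calc - b) / a
    mae = float(np.mean(np.abs(delta_calc - delta_exp)))
    cmae = float(np.mean(np.abs(delta_recalc - delta_exp)))
    return float(a), float(b), delta_recalc, mae, cmae


# Hypothetical 13C shifts in ppm, constructed for illustration only.
delta_exp = [20.0, 55.0, 110.0, 160.0]
delta_calc = [1.05 * x + 2.0 for x in delta_exp]  # purely systematic error

a, b, delta_recalc, mae, cmae = scale_shifts(delta_calc, delta_exp)
print(f"a = {a:.3f}, b = {b:.3f}, MAE = {mae:.3f} ppm, CMAE = {cmae:.3f} ppm")
```

Because the sample error is purely systematic, the correction removes it entirely; with real data the CMAE stays above zero, as in the values quoted for 1−24.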
For carbons, RMSD was 3.0 ppm, while MAE and CMAE were 3.1 and 2.1 ppm, respectively, for the range of about 200 ppm and 625 data points. Presented correlations indicate the adequate level of theory [mPW1PW91/6-31G(d,p)] applied for the calculation of 1 H and 13 C NMR chemical shifts in this series. Molecular Docking Simulations. To analyze structural features that are critical for binding to the SARS-CoV-2 main protease, we selected 24 compounds (Scheme 1), which had previously demonstrated high M pro inhibition potential. Compounds 1−24 were all docked with the selected three-dimensional (3D) structure of the M pro protease (PDB ID: 6LU7). The results of these simulations were analyzed based on the ability of the docked ligands to reproduce the N3−M pro interaction pattern. Table 1 shows the results of the performed calculations for the top-ranked binding position, arranged in the order of decreasing predicted free binding energy, ΔG. It is seen that docking energies in this series of ligands are in the range from −11.33 to −4.66 kcal/mol. In addition, for comparison with the studied ligands, redocking into the SARS-CoV-2 M pro was also performed to validate the used docking protocol and to estimate the binding affinity of the native ligand N3 to the catalytic dyad of Cys145−His41. The performed analysis showed that compounds of the selected series are predicted to fit into the M pro active site, repeating the docked pose, which significantly matches the interaction pattern of the native ligand, as illustrated in Figures 3 and 4. It followed that four compounds, namely, berchemol (16), terrequinone A (4), betulinic acid (17), and 2,2-di(3-indolyl)-3-indolone (8), have a binding energy lower than that of the native ligand N3. As a result, they showed good coordination with the binding affinity of −11.33, −10.39, −9.73, and −9.59 kcal/mol, respectively.
Analysis of their interactions with M pro residues showed that they are located deeply within the binding pocket of the M pro active site, in the cleft between the two protease domains that hosts the catalytic dyad Cys145−His41. This finding indicates that they have the potential to covalently bind to amino acid residues in this region of 6LU7. This ability to interact with the main protease provides additional benefits in suppressing viral activity. Figures 3 and 4 also show, in 2D diagrams, the extensive interactions between these four compounds and the amino acid residues that form the binding cavity. Binding diagrams for the rest of the studied compounds are given in the Supporting Information. These contacts include hydrogen bonds (HB) and van der Waals, π-alkyl, and π−π stacking interactions. In a more extended manner, the main interactions of compounds 1−24 with the M pro active site are presented in Table 1. It should be noted that among the studied series of potential SARS-CoV-2 M pro inhibitors 1−24, berchemol (16) showed the best coordination with the main protease, with the lowest binding energy and, accordingly, the lowest value of the inhibition constant. The molecule of berchemol (16) has five HB, including that between the pyrrole hydrogen atom of His163 and oxygen of the hydroxy group of one of the methoxyphenyl moieties. The second HB is formed between the amino acid Glu166 and the hydrogen of the same hydroxy group. The third HB is between the carbonyl oxygen of Asn142 and one of the aromatic protons of the methoxyphenyl fragment, while the fourth HB is formed between the oxygen atom of the methoxytetrahydrofuranol ring and the proton of the amino group of Gly143. The fifth HB is located between the hydrogen atom of the hydroxy group of the methoxytetrahydrofuranol cycle and one of the carbonyl oxygen atoms of the Gln189 amino acid.
In addition, this compound has two weak π−sulfur interactions between Met49 and Met165 and one of the methoxyphenyl moieties. Likewise, the amino acid His41 has a π-alkyl contact with the aromatic moiety of berchemol. Other weaker van der Waals interactions are formed by amino acids Cys145, Gly143, Gln189, Asp187, His172, Arg188, and others of the binding pocket. The ligand−receptor coordination of berchemol is illustrated on the left in the 2D diagram in Figure 3 and is shown in general in Figure 5. The detailed H-bonding interactions with the amino acid residues Asn142, Gly143, His163, Glu166, and Gln189 of the M pro active site are presented in Figure 6a. Terrequinone A (4) forms an interaction network of four HBs, with the strongest ones located between the proton of the amino group of Gly143 and the nitrogen atom of one of the indole fragments, as well as between the hydroxyl hydrogen of the quinone core and the carbonyl oxygen atom of Gln189. The other two weaker HBs involve the oxygen atom of the quinone and the amine hydrogen of Glu166, as well as the carbonyl oxygen atom of Arg188 and one of the pyrrole protons of terrequinone A. The hydrogen bonding is presented in more detail in Figure 6b. It should be noted that terrequinone A forms two π-alkyl interactions with the catalytic dyad Cys145−His41. There are also π-alkyl interactions with the amino acid residues Glu166 and Gln189. Terrequinone A forms other non-covalent contacts of the hydrophobic type with Met49 and Met165. The remaining van der Waals-type interactions are observed with Thr26, Asn142, His163, and other amino acids of the M pro binding site. As for betulinic acid (17), as in the case of the previous compound, there are four HBs, two of which are formed by the amino group and a carbonyl oxygen atom of Thr26 with the hydroxy group of the dimethylcyclohexyl fragment; see Figure 7a.
The third strongest HB is formed between the hydrogen atom of the amino group of Glu166 and the carboxyl oxygen atom of betulinic acid. The last HB is located between the proton of the same carboxyl group and the carbonyl oxygen atom of His164. It should be noted that π-alkyl interactions are formed in this molecule only by His41. Among the nonpolar interactions, the influence of Leu141 can be noted. In the molecule of 2,2-di(3-indolyl)-3-indolone (8), only one HB is observed, namely, that between the carbonyl oxygen of His164 and the pyrrolidine hydrogen atom of the indolone core of 8, as can be seen in Figure 7b. Here, it should be noted that the stabilization of the considered complex originates mainly in the formation of two π-alkyl contacts between indolyl residues and Cys145 and Glu166 simultaneously. It is also due to the attractive π-stacking interaction between the aromatic systems of the imidazole moiety of His41 and the benzyl ring of indolone (8). It is interesting to note that in all cases considered herewith (as has already been indicated by many authors), the basis for the stabilization of potential inhibitors in the M pro binding pocket is a distributed network of HBs, predominantly those involving residues Gly143, His163, Glu166, and Gln189. A second contribution comes from the non-valence coordination of the aromatic systems of the ligands with the catalytic dyad Cys145−His41 through the formation of stable π-contacts. Finally, in a more detailed presentation, the frequencies (by way of the corresponding fractions) of the main types of interactions of the active site residues with ligands of the studied series 1−24 (namely, HBs together with the van der Waals, hydrophobic, and π-alkyl interactions) are shown in the share diagram in Figure 8.
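The link between the docking free energies discussed above and the corresponding inhibition constants can be illustrated with a short sketch. It assumes the standard thermodynamic relation ΔG = RT ln Ki at 298.15 K, which, to our understanding, is also the convention behind the estimated inhibition constant reported by AutoDock 4. The function name `inhibition_constant` is hypothetical, and the energies are those quoted in the text for the four best ligands and for redocked N3, with all signs taken as negative in line with the stated range of −11.33 to −4.66 kcal/mol.

```python
import math

R_KCAL = 1.98720e-3  # gas constant in kcal/(mol*K)
T_KELVIN = 298.15    # assumed temperature in K


def inhibition_constant(delta_g_kcal, temperature=T_KELVIN):
    """Estimate Ki (mol/L) from a binding free energy via dG = RT ln Ki."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature))


# Binding energies in kcal/mol as quoted in the text (signs assumed negative).
binding_energies = {
    "berchemol (16)": -11.33,
    "terrequinone A (4)": -10.39,
    "betulinic acid (17)": -9.73,
    "2,2-di(3-indolyl)-3-indolone (8)": -9.59,
    "N3 (redocked)": -9.0,
}

# Rank from the most to the least favorable binding energy.
ranked = sorted(binding_energies.items(), key=lambda item: item[1])
for name, delta_g in ranked:
    ki_nanomolar = inhibition_constant(delta_g) * 1e9
    print(f"{name:34s} dG = {delta_g:7.2f} kcal/mol   Ki ~ {ki_nanomolar:9.2f} nM")
```

The exponential dependence means that a difference of roughly 1.4 kcal/mol in ΔG corresponds to about one order of magnitude in the estimated Ki, which is why the ligand with the lowest binding energy also has the lowest inhibition constant.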
Against the background of the ongoing COVID-19 pandemic, quantum-chemical methods used for the elucidation of chemical structures can be effectively applied to target potential inhibitors of this virus, accelerating the development of appropriate drugs. In the present study, we proposed and used a combined approach, based on the modeling of NMR chemical shifts and molecular docking, to identify the potential SARS-CoV-2 M pro inhibitors among a series of natural products of diverse origins. The well-known challenge related to the incorrect initial assignment of the stereochemical structure was addressed in this study within the framework of the formalism of the ANN, which makes it possible, based on the correlations of calculated and experimental chemical shifts, to determine the true stereochemistry of the molecule with high reliability. The modeling of spectral NMR parameters, performed in the present study, made it possible to predict potential NMR reassignments in some of the studied natural products. The molecular docking of the established structures was used to study binding interactions in the active site of the main protease. For several compounds with high binding energies, the nonpolar and electrostatic interactions were found to stabilize these ligands in the binding pocket. Undoubtedly, one of the main roles was played by the distributed networks of the ligand−receptor HBs. As a result of this study, it was found that berchemol, which is a natural extract from the seeds of C. cyanus, showed the best affinity with the SARS-CoV-2 main protease. Several other tested natural products, such as terrequinone A, betulinic acid, and 2,2-di(3-indolyl)-3-indolone, also showed better binding affinity with the protease than the native ligand.
Selected potential inhibitor candidates identified in this work showed improved interaction energies in relation to SARS-CoV-2 M pro and have increased specificity due to additional HBs with the active site residues. The presented results are expected to stimulate further research aimed toward the development of specific drugs against COVID-19.
The PDB files were downloaded from the RCSB protein data bank (https://rcsb.org). All structures of the investigated compounds are provided in the main text of the article and in the Supporting Information. The following free and commercial software packages were used in this article: Gaussian 09 (http://gaussian.com), Dalton 2016 (https://daltonprogram.org), Schrodinger Maestro
Favorsky Irkutsk Institute of Chemistry, Siberian Branch of the Russian Academy of Sciences
) and at the A. E. Favorsky Irkutsk Institute of Chemistry using the facilities of the