key: cord-0864417-woj05vq6 authors: Ren, Xia; Shao, Xin-Xin; Li, Xiu-Xue; Jia, Xin-Hua; Song, Tao; Zhou, Wu-Yi; Wang, Peng; Li, Yang; Wang, Xiao-Long; Cui, Qing-Hua; Qiu, Pei-Ju; Zhao, Yan-Gang; Li, Xue-Bo; Zhang, Feng-Cong; Li, Zhen-Yang; Zhong, Yue; Wang, Zhen-Guo; Fu, Xian-Jun title: Identifying potential treatments of COVID-19 from Traditional Chinese Medicine (TCM) by using a data-driven approach date: 2020-05-04 journal: J Ethnopharmacol DOI: 10.1016/j.jep.2020.112932 sha: b045e751b69d92a236fe3ed62074c0ff6c614c9e doc_id: 864417 cord_uid: woj05vq6

Abstract

Ethnopharmacological relevance: Traditional Chinese Medicine (TCM) is widely used as a therapeutic approach worldwide. Chinese Medicines (CMs) have been used to treat and prevent viral pneumonia for thousands of years, and a large body of clinical experience and effective prescriptions has accumulated.

Aim of the study: This research aimed to systematically mine the classical prescriptions of Chinese Medicine (CM) that have long been used in China to prevent and treat Pestilence (Wenbing, Wenyi, Shiyi or Yibing), in order to identify prescriptions and ingredients that could serve as alternative treatments for COVID-19.

Materials and methods: We developed a screening system based on data mining, molecular docking and network pharmacology. Data mining and association networks were used to mine the high-frequency herbs and formulas from ancient prescriptions. Virtual screening of the effective components of the high-frequency CMs and compatible CM combinations was performed by molecular docking. Furthermore, network pharmacology was used to preliminarily uncover the molecular mechanism.

Results: 574 prescriptions were obtained from 96,606 classical prescriptions using the key words “Warm diseases (Wenbing)”, “Pestilence (Wenyi or Yibing)” or “Epidemic diseases (Shiyi)”.
Meanwhile, 40 kinds of CMs, 36 CMs-pairs and 6 triple-CMs-groups occurred with high frequency among the 574 prescriptions. Additionally, the key targets of SARS-COV-2, namely 3CL hydrolase (Mpro) and angiotensin-converting enzyme 2 (ACE2), were docked with the main ingredients of the 40 kinds of CMs by the LigandFit docking method. A total of 66 compounds with higher frequency docked with the COVID-19 targets; they were distributed in 26 kinds of CMs, among which Gancao (Glycyrrhizae Radix Et Rhizoma), HuangQin (Scutellariae Radix), Dahuang (Rhei Radix Et Rhizome) and Chaihu (Bupleuri Radix) contained comparatively more potential compounds. Network pharmacology results showed that the Gancao (Glycyrrhizae Radix Et Rhizoma) and HuangQin (Scutellariae Radix) CMs-pair could also interact with targets involved in immune and inflammatory diseases.

Conclusions: These results may provide potential candidate CM formulas or active ingredients against COVID-19. Going forward, animal experiments and rigorous clinical studies are needed to confirm the potential preventive and therapeutic effects of these CMs and compounds.

Corona Virus Disease 2019 (COVID-19), which is caused by the newly identified coronavirus SARS-COV-2, has spread to more than 200 countries and regions around the world and poses a significant threat to public health. Unfortunately, it is still raging, with no effective drugs clinically approved. Given the severity of SARS-COV-2, the discovery and clinical application of specific drugs against it are critical to alleviating the current epidemic. In special cases such as the outbreak of SARS-COV-2, it is particularly important to screen possible blockers of the potential viral target proteins by computational chemical biology techniques such as molecular docking ("dry method" research). This approach is conducive to large-scale screening in a short period of time.
Two main proteins, 3C-like protease (3CLpro) and angiotensin-converting enzyme 2 (ACE2), have been recommended as available targets for screening drugs that inhibit the replication and proliferation of SARS-COV-2, benefiting from the rapid sequencing of SARS-COV-2 coupled with molecular modelling based on the genomes of related viral proteins (Chai et al., 2020).

Chinese Medicines (CMs), a system of medicine with a long history and distinctive theories and practices, have been used for thousands of years (Qiu et al., 2015). CM prescriptions embody the principles of system theory and act on multiple cellular targets in multiple pathways to exert therapeutic effects (Hou et al., 2016; Liao et al., 2018). COVID-19 belongs to the category of pestilence or epidemic in CM (Li et al., 2019). CMs have been used to treat and prevent viral pneumonia for thousands of years, and a large body of clinical experience and effective prescriptions has accumulated (Luo et al., 2020). The "Diagnosis and Treatment Program for Corona Virus Disease 2019 (COVID-19)" issued by the National Health Commission of China recommends treatment with CMs, which has achieved good clinical effects. Thus, it is worthwhile to explore and mine the experience of CMs in treating pestilence or epidemic diseases, drawing on the abundant historical classics of CM combined with modern medical research methods.

In the present study, data mining and association networks were used to mine the high-frequency CMs and formulas from ancient prescriptions. Furthermore, a molecular docking approach was used to explore the binding between the main ingredients of the high-frequency CMs and the key targets of SARS-COV-2. We then preliminarily explored the molecular mechanism by network pharmacology. These results are expected to provide candidate CM formulas or active compounds for reference in combating COVID-19.
In our study, the Dictionary of Traditional Chinese Medicine Prescriptions (Peng et al, 1996) and the Pharmacopoeia of the People's Republic of China (Pharmacopoeia Commission China, 2015) were used to screen for prescriptions containing "Warm diseases (Wenbing)", "Pestilence (Wenyi or Yibing)" or "Epidemic diseases (Shiyi)". Data processing included the following three steps. Firstly, the retrieved prescriptions were entered into a Word document to obtain the original literature file. Secondly, the key prescription information (number, name, source, formula, efficacy, therapy, dosage form and number of natural medicines) and the classification information of the ancient literature were entered into an Excel file. Finally, the prescription data were standardized on the basis of the Pharmacopoeia of the People's Republic of China (Peng, 1996) and the Chinese Materia Medica (Zhu, 1998). The standardized data were then imported into a database for the subsequent data mining and association network analysis.

In the present study, frequency analysis, association rule mining and association knowledge network construction were used to analyze the collected prescriptions. The high-frequency CMs were mined by frequency analysis, and the compatibility rules of the prescriptions were analyzed by association rules. The arules package was applied to the formula basket data on the R software platform, and the Apriori algorithm was used to mine association rules with Confidence, Support and Promotion (commonly called lift) as the criteria (CSBTS, 1997). Support is the proportion of all prescriptions in which the items of a rule occur together and measures how general the rule is. Confidence is the proportion of prescriptions containing the precondition that also contain the conclusion and mainly measures accuracy. Promotion evaluates the degree to which the appearance of one item set increases the appearance of another (Zhan and Fu, 2016).
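As a minimal sketch of the three criteria defined above, the following Python snippet computes Support, Confidence and Promotion (lift) for a single rule. It is not the authors' R/arules workflow: the function names and the five toy prescriptions are hypothetical and serve only to make the definitions concrete.

```python
from typing import FrozenSet, Iterable, List

Prescription = FrozenSet[str]


def support(prescriptions: List[Prescription], items: Iterable[str]) -> float:
    """Fraction of prescriptions that contain every herb in `items`."""
    wanted = frozenset(items)
    hits = sum(1 for p in prescriptions if wanted <= p)
    return hits / len(prescriptions)


def confidence(prescriptions: List[Prescription],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Among prescriptions containing the antecedent, the share that also contain the consequent."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    base = support(prescriptions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs in the data")
    return support(prescriptions, antecedent | consequent) / base


def lift(prescriptions: List[Prescription],
         antecedent: Iterable[str],
         consequent: Iterable[str]) -> float:
    """Confidence divided by the baseline frequency of the consequent."""
    consequent = frozenset(consequent)
    base = support(prescriptions, consequent)
    if base == 0:
        raise ValueError("consequent never occurs in the data")
    return confidence(prescriptions, antecedent, consequent) / base


# Hypothetical toy data: each prescription is a set of herb names.
toy_prescriptions: List[Prescription] = [
    frozenset({"Gancao", "Huangqin", "Chaihu"}),
    frozenset({"Gancao", "Huangqin"}),
    frozenset({"Gancao", "Dahuang"}),
    frozenset({"Huangqin", "Dahuang"}),
    frozenset({"Chaihu"}),
]

rule_support = support(toy_prescriptions, {"Huangqin", "Gancao"})
rule_confidence = confidence(toy_prescriptions, {"Huangqin"}, {"Gancao"})
rule_lift = lift(toy_prescriptions, {"Huangqin"}, {"Gancao"})
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f} lift={rule_lift:.2f}")
```

A lift above 1 means the two herbs co-occur more often than would be expected if they were prescribed independently, which is the sense in which a rule suggests a compatibility relationship.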
The mined association knowledge was screened and visualized with the arulesViz package to construct the association knowledge network (Hahsler et al., 2011).

The chemical compositions of the high-frequency CMs were obtained from TcmSP™ (Traditional Chinese Medicine System Pharmacology Database, http://tcmspw.com/tcmsp.php). Important pharmacology-related parameters of the compounds, including drug-likeness (DL) and oral bioavailability (OB), were also obtained from TcmSP™. Compounds with OB >30% and DL >0.18 were selected as candidate compounds for further analysis. In addition, some compounds with low OB or DL values were also selected as candidates because of their excellent pharmacological activities or high contents. The structures of the main active ingredients were downloaded in sdf format from PubChem (https://pubchem.ncbi.nlm.nih.gov/) as candidates for molecular docking.

The high-resolution crystal structure of the SARS-COV-2 3CL hydrolase (Mpro) was obtained from PDB (PDB_ID: 6LU7) (Jin et al., 2015) (Fig 2). The active site of the protein was centered on the active amino acid site of the original ligand in the crystal structure, and the corresponding "active pocket" was constructed. The system searched for the "active pocket" near the active site, and the site at coordinates -8.669631, 12.384467 and 67.029640, with a point count of 8538, was finally defined as the active pocket. The high-resolution crystal structure of angiotensin-converting enzyme 2 (ACE2) was obtained from PDB (PDB_ID: 2AJF) (Fig 2). The two active sites of the protein were centered on the active amino acid site of the original ligand in the crystal structure, and the corresponding active pockets (Fig 4) were constructed for docking. The Receptor-Ligand Interactions module in Discovery Studio 2017 R2 was used to explore the binding between the main ingredients of the high-frequency CMs and the key targets of SARS-COV-2. The LigandFit docking parameters were left at their default settings.
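The compound pre-screening rule described above (OB >30% and DL >0.18, with exceptions for compounds retained on pharmacological grounds) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Compound` record, the `select_candidates` function, the placeholder compound names and all numeric values are hypothetical and are not taken from TcmSP™ or from the authors' pipeline.

```python
from dataclasses import dataclass
from typing import AbstractSet, Iterable, List


@dataclass(frozen=True)
class Compound:
    name: str
    herb: str
    ob: float  # oral bioavailability, in percent
    dl: float  # drug-likeness index


def select_candidates(compounds: Iterable[Compound],
                      ob_min: float = 30.0,
                      dl_min: float = 0.18,
                      keep_anyway: AbstractSet[str] = frozenset()) -> List[Compound]:
    """Keep compounds that pass both thresholds (strictly greater than),
    plus any compound explicitly whitelisted in `keep_anyway`."""
    return [
        c for c in compounds
        if (c.ob > ob_min and c.dl > dl_min) or c.name in keep_anyway
    ]


# Hypothetical records; the values are invented for illustration only.
library = [
    Compound("compound_A", "herb_1", ob=45.0, dl=0.30),  # passes both thresholds
    Compound("compound_B", "herb_1", ob=12.0, dl=0.40),  # fails on OB
    Compound("compound_C", "herb_2", ob=8.0, dl=0.10),   # fails, but whitelisted below
    Compound("compound_D", "herb_2", ob=35.0, dl=0.18),  # fails: DL is not above 0.18
]

selected = select_candidates(library, keep_anyway={"compound_C"})
selected_names = [c.name for c in selected]
print(selected_names)
```

The whitelist argument mirrors the manual exception stated in the text; in practice such exceptions would need to be documented compound by compound.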
To ensure the accuracy of the results, scoring was performed with seven scoring functions: DockScore, LigScore1, LigScore2, PLP1, PLP2, Jain and PMF. A consensus score was then used to combine the seven scoring functions selected for LigandFit docking; it produces a single consensus score value for each ligand, rather than for each pose, to measure the result. The threshold was set at greater than four.

The Huangqin and Gancao CMs-pair was selected to explore its possible molecular mechanism. The chemical compositions were obtained as described in "2.2.1 Screening of active components in CMs". We then predicted the potential targets using the target prediction approach developed by Fu et al. (2017). Notably, only Homo sapiens targets with a reliability score greater than 0.9 were retained for further analysis. Afterwards, we established the Protein-Protein Interaction (PPI) network for the targets using the STRING database (https://string-db.org/), and the PPI networks were visualized using Cytoscape (Version 3.5.0, available at http://www.cytoscape.org/) (Shannon et al., 2004). DAVID, a database for annotation, visualization and integrated discovery, was used to perform GO enrichment and KEGG pathway enrichment analyses for the potential targets (Huang et al., 2009). The P-value was calculated and corrected using the Benjamini-Hochberg method, and a P-value <0.05 was selected as the cutoff criterion. Subsequently, the compound-target network and the target-pathway network were constructed and visualized using Cytoscape (Version 3.5.0).

Chinese medicine prescriptions (fang ji in Chinese) are the main form in which CMs are applied in clinical practice (Ren et al., 2019). The Dictionary of Traditional Chinese Medicine Prescriptions is a summary of research achievements in CM prescriptions and contains more than 1800 kinds of CMs and 90,000 prescriptions from the related literature (Peng, 1996).
In our study, the prescriptions for the treatment of pestilence or epidemic diseases were mined from the Dictionary of Traditional Chinese Medicine Prescriptions. 574 prescriptions were selected for the treatment of "Wenbing", "Wenyi", "Yibing" or "Shiyi". The age distribution of the prescriptions showed that the use of CMs to prevent epidemics could be traced back to Jin dynasty ( for the first time, which mean the evil epidemic pathogenic factors in Wen Yi Lun (Wu, 1991) . Wu emphasized that Wen diseases (pestilence) were totally different from febrile disease and clearly pointed out that "The wenyi was a disease, not feng (wind), han (cold), shu (heat) or shi (damp), but a strange feeling between heaven and earth". In addition, Wu established the thinking mode of syndrome differentiation and created the effective prescription named "Dayuanyin" for treating pestilence diseases. (Wu, 1972) . Abundant publications and excellent prescriptions for Wen diseases were also created in that period.

To screen the high-frequency CMs, we counted the frequency of the herbs used in the screened CM formulae by frequency analysis. The results showed the 40 kinds of CMs with the highest frequency in the above prescriptions (Table 1). CMs-pairs and triple-CMs groups are the basic forms of compatibility of CMs. There were 36 CMs-pairs with a frequency of more than 5% in the above prescriptions. Association rule mining is often used to find possible associations or connections between substances, and it has been applied to study the compatibility of CM prescriptions. We mined the association relationships (Supplementary Table S1), performed matrix analysis (Fig 6) of the compatibility relationships of CMs, and built the association knowledge network of CMs (Fig 6). From these results, Gancao (Glycyrrhizae Radix Et Rhizoma) and HuangQin (Scutellariae Radix) were the key medicines of these prescriptions for the treatment of pestilence or epidemic diseases.
Most of the CM prescriptions were designed on the basis of these two CMs.

The chemical compositions obtained for the high-frequency CMs included 431 chemicals, which were molecularly docked with the SARS-COV-2 targets 3CL hydrolase and angiotensin-converting enzyme 2 (ACE2) using LigandFit. Consensus scoring, the combination of multiple scoring functions, makes false positives easier to identify than a single scoring function does. The higher the consensus score, the higher the binding rate of the molecule to the target. A score greater than 4 indicates a better docking result. In our study, compounds with consensus scores greater than 4 were screened for analysis. In total, 66 compounds were screened, of which 27 docked with the 3CL hydrolase target and 48 docked with the ACE2 target. The screened compounds were distributed in 27 kinds of CMs, among which Gancao (Glycyrrhizae Radix Et Rhizoma), HuangQin (Scutellariae Radix), Dahuang (Rhei Radix Et Rhizome) and Chaihu (Bupleuri Radix) contained comparatively more potential compounds (Fig 7, Supplementary Table S2). The results of molecular docking were consistent with the frequency results for the high-frequency CMs.

3C-like protease (3CLpro) plays an important role in the replication of the virus and is considered an attractive target for drug development. Acetoside (consensus score = 7), which comes from Shengdi (Rehmanniae Radix), had the strongest binding activity to 3CL hydrolase. In Acetoside, hydrogen bonds were formed between the phenolic hydroxyl groups and residues THR and PHE, and a hydrophobic interaction was formed between the benzene ring and the target protein residue GLU (Fig 8A). In addition, various components of high-frequency CMs such as Gancao (Glycyrrhizae Radix Et Rhizoma), Dahuang (Rhei Radix Et Rhizome) and Chaihu (Bupleuri Radix) also showed potential inhibitory activity against the 3CL protein (Supplementary Table S2).
Based on the two binding regions, grid3 and grid4, between ACE2 and the viral protein conformation (Niu et al, 2020), components that may block the binding of the two proteins were screened. Glyasperin F in Gancao (Glycyrrhizae Radix Et Rhizoma) had the strongest binding to ACE2 site 1 (consensus score = 6), and a hydrogen bond and a σ-p hyperconjugated system were formed between its phenolic hydroxyl group and the target protein residue LEU (Fig 8B). Isorhamnetin in Gancao (Glycyrrhizae Radix Et Rhizoma) and Chaihu (Bupleuri Radix) had the strongest binding ability to ACE2 site 2 (consensus score = 6), mainly through a hydrophobic interaction between its benzene ring and the target protein residue PRO (Fig 8C). Besides, various ingredients in CMs such as HuangQin (Scutellariae Radix), Chaihu (Bupleuri Radix) and Zhimu (Anemarrhenae Rhizoma) could bind to the ACE2 protein.

According to the frequency analysis, association rule analysis and molecular docking results for the high-frequency CMs, Gancao (Glycyrrhizae Radix Et Rhizoma) and HuangQin (Scutellariae Radix) were the key medicine pair of these prescriptions for the treatment of pestilence or epidemic diseases. A systems pharmacology model based on chemical, pharmacokinetic and pharmacological data was constructed to explore the molecular mechanisms. In the present work, 85 and 34 active compounds were selected from Gancao (Glycyrrhizae Radix Et Rhizoma) and HuangQin (Scutellariae Radix), respectively. Detailed information about these molecules is provided in Supplementary Table S3. An integrated in silico approach was introduced to identify the target proteins of the active compounds of the CMs (Fu et al., 2017). In total, we obtained 286 potential therapeutic targets for the 119 candidate compounds from Gancao (Glycyrrhizae Radix Et Rhizoma) and HuangQin (Scutellariae Radix) (Table S4).
To represent the complex relationships between the active compounds and their targets directly, a compound-target (C-T) network was constructed (Fig 9B). Among the compounds, those with high degrees were responsible for the high interconnectedness of the C-T network, especially Quercetin (degree = 10), 5,7,2',6'-Tetrahydroxyflavone (degree = 12), Kaempferol (degree = 12), 4'-Hydroxywogonin (degree = 11), Ganhuangenin (degree = 11), Baicalein (degree = 9), Gancaonin O (degree = 8) and Norwogonin (degree = 8). As shown in the C-T network (Fig 9B), a single compound regulated multiple targets, while multiple compounds possibly regulated the same target. The significant targets interacting with the active ingredients were mapped onto the KEGG pathways, and the target-pathway (T-P) network was generated as shown in Fig 9C. Among the results of KEGG pathway enrichment, we selected the pathways in the basic biological processes of metabolism, genetic information processing, environmental information processing, cellular processes and organismal systems. There are 20 target-enriched pathways (Table 5), which act on the immune system, inflammation, cellular processes and the endocrine system. Thus, we postulated that the Gancao (Glycyrrhizae Radix Et Rhizoma) and HuangQin (Scutellariae Radix) CMs-pair exerts therapeutic effects on multiple targets and pathways of the human body through its complex active components.

Using network-based computational methods, an integrated systems pharmacology approach was applied to predict targets, construct networks and explore the molecular action of the high-frequency Gancao (Glycyrrhizae Radix Et Rhizoma) and HuangQin (Scutellariae Radix) (GH) CMs-pair. In the present study, 85 and 34 active ingredients with favorable bioactivities and contents were selected from Gancao (Glycyrrhizae Radix Et Rhizoma) and HuangQin (Scutellariae Radix), respectively, by ADME filtering, providing some foundational clues for a thorough investigation of this CMs-pair.
By analyzing the network topology of the targets, 30 important targets were identified. Network systematic analysis indicated that the GH CMs-pair could regulate proteins related to the immune system, inflammation, cellular processes and the endocrine system. COVID-19 leads to a strong immune response and an inflammatory storm, in which a large number of cytokines are activated. The GH CMs-pair may regulate immune-related pathways (the Toll-like receptor signaling pathway and the T-cell and B-cell receptor signaling pathways) as well as cytokine-related pathways such as the TNF signaling pathway, the NF-κB signaling pathway and the PI3K-Akt signaling pathway, thereby inhibiting the activated cytokines, relieving the excessive immune response and eliminating inflammation. From the perspective of the molecular network, the GH CMs-pair exerted overall regulation through multi-ingredient and multi-target synergistic effects.

In conclusion, based on the experience of ancient prescriptions and modern pharmacy research methods, 40 kinds of high-frequency CMs, 36 high-frequency CMs-pairs and 6 kinds of high-frequency triple-CMs-groups were identified. In addition, the molecular mechanism of the selected key CMs-pair was preliminarily discussed. The CMs-pair with the highest frequency showed potential anti-SARS-COV-2 activity by binding to ACE2 and 3CL hydrolase and regulating targets related to the immune system, inflammation, cellular processes and the endocrine system. Our results provide candidate CM combinations and active ingredients against SARS-COV-2 for reference. The results reflect the synergistic mechanism of the multiple components and multiple targets of CMs. In view of the limitations of virtual screening, further in vivo and in vitro experiments are needed at a later stage to verify the results of this study and to provide an experimental basis for the research and development of antiviral natural drugs.

The authors declare no competing financial interest.
Specific ACE2 Expression in Cholangiocytes May Cause Liver Damage After 2019-nCoV Infection
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
Coronaviruses: Genome Structure, Replication, and Pathogenesis
Terminology of TCM clinical diagnosis and treatment: disease
Toward Understanding the Cold, Hot, and Neutral Nature of Chinese Medicines Using in Silico Mode-of-Action Analysis
The arules R-package ecosystem: analyzing interesting patterns from large transaction data sets
Qingfei Xiaoyan Wan, a traditional Chinese medicine formula, ameliorates Pseudomonas aeruginosa-induced acute lung inflammation by regulation of PI3K/AKT and Ras/MAPK pathways
Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources
Si Sheng Xin Yuan. Chinese Press of Traditional Chinese Medicine
PDB-Explorer: a web-based interactive map of the protein data bank in shape space
Treatment of Corona Virus Disease 19 by Stages Based on Syndrome Differentiation of Traditional Chinese Medicine
Therapeutic Drugs Targeting 2019-nCoV Main Protease by High-Throughput Screening
Network pharmacology study reveals energy metabolism and apoptosis pathways-mediated cardioprotective effects of Shenqi Fuzheng
Can Chinese Medicine Be Used for Prevention of Corona Virus Disease Research Evidence and Current Prevention Programs
Application of Chinese medicine in acute and critical medical conditions
Rapid establishment of traditional Chinese medicine prevention and treatment for the novel coronavirus pneumonia based on clinical experience and molecular docking
Dictionary of traditional Chinese medicine prescriptions. People's Medical Publishing House
Pharmacopoeia of the People's Republic of China
When the East Meets the West: The Future of Traditional Chinese Medicine in the 21st Century
Hepatoprotective effects of a traditional Chinese medicine formula against carbon tetrachloride-induced hepatotoxicity in vivo and in vitro
Cytoscape: a software environment for integrated models of biomolecular interaction networks
Tianfoshen oral liquid: a CFDA approved clinical traditional Chinese medicine, normalizes major cellular pathways disordered during colorectal carcinogenesis
Difficult to Know. People's Medical Publishing House
Wen Bing Tiao Bian. People's Military Medical Publisher
System Pharmacology-Based Dissection of the Synergistic Mechanism of Huangqi and Huanglian for Diabetes Mellitus
Herb pair Danggui-Honghua: mechanisms underlying blood stasis syndrome by system pharmacology approach
Clinical Application Experience Mode of Marine Chinese Medicine Sepiae Endoconcha by Ancient Physicians
Treatise on Febrile and Miscellaneous Diseases
Chinese Materia medica: chemistry, pharmacology and applications

The authors thank the Natural Science