key: cord-0022718-1ot9slgm authors: Zhang, Le; Zhang, Lei; Guo, Yue; Xiao, Ming; Feng, Lu; Yang, Chengcan; Wang, Guan; Ouyang, Liang title: MCDB: A comprehensive curated mitotic catastrophe database for retrieval, protein sequence alignment, and target prediction date: 2021-06-07 journal: Acta Pharm Sin B DOI: 10.1016/j.apsb.2021.05.032 sha: 52dcf413e933724b33223cc41c3a365790b787a9 doc_id: 22718 cord_uid: 1ot9slgm

Mitotic catastrophe (MC) is a form of programmed cell death induced by disorders of the mitotic process, and it plays an important role in tumor prevention, development, and drug resistance. Because the rapidly growing body of MC data is driving tumor-related biomedical and clinical research, a specialized and comprehensive database for curating MC-related data is urgently needed. The Mitotic Catastrophe Database (MCDB) consists of 1214 genes/proteins and 5014 compounds collected and organized from more than 8000 research articles. MCDB also defines confidence levels, classification criteria, and uniform naming rules for MC-related data, which greatly improve data reliability and retrieval convenience. Moreover, MCDB provides protein sequence alignment and target prediction functions. The former can be used to predict new potential MC-related genes and proteins, and the latter can facilitate the identification of potential target proteins of unknown MC-related compounds. In short, MCDB is a dedicated, standardized, and comprehensive database for MC-related data that will facilitate the exploration of MC by researchers ranging from chemists to biologists in the fields of medicinal chemistry, molecular biology, bioinformatics, oncology, and so on. MCDB is available at http://www.combio-lezhang.online/MCDB/index_html/.

Mitotic catastrophe (MC), originally proposed by Molz Lisa et al. [1] in 1989, was named for the first time by the International Nomenclature Committee on Cell Death in 2012 [2].
MC is a form of programmed cell death caused by deregulation of the mitotic process, and it acts as an intrinsic onco-suppressive mechanism [2,3]. Studies have shown that DNA lesions, mitotic defects, and failure of cytokinesis can cause MC, and that tumor cells are more susceptible to this mitotic abnormality than normal cells [4,5]. At present, in addition to photon and proton radiotherapy, a variety of chemotherapeutic drugs exert anti-tumor effects by inducing MC, including microtubule regulators, CHK1 inhibitors, PARP inhibitors, WEE1 inhibitors, PLK inhibitors, and so on [6–10]. With the in-depth exploration of MC, its significance in tumor prevention, treatment, drug resistance, and radiosensitivity has gradually become apparent [11–14], attracting widespread attention from chemists and biologists in the fields of medicinal chemistry, molecular biology, and bioinformatics [5,15–19]. Recently, the rapidly increasing volume of MC-related data (genes, proteins, and compounds) has greatly promoted MC-related drug design, discovery, synthesis, and repositioning. Many commonly used public databases, such as PubMed [20], PubChem [21], the Universal Protein Resource (UniProt) [22], and the Protein Data Bank (PDB) [23,24], contain a large amount of data on MC-related genes, proteins, and compounds. However, these databases do not establish confidence levels or classification criteria specifically for MC-related data, which makes data reliability ambiguous and retrieval inconvenient. They also lack uniform naming rules for MC-related data, which severely restricts accessibility and usability.
Lastly, previous databases neither provide MC-related tools or functions for discovering new MC-related genes, proteins, and compounds, nor help users understand MC-related biological functions, signal transduction mechanisms, and biological processes [25–27].

To overcome these shortcomings, we developed the Mitotic Catastrophe Database (MCDB) with three major innovations: (1) MCDB is the first comprehensive database for MC-related data curation. It not only consists of 1214 genes/proteins and 5014 compounds collected and organized from more than 8000 research articles, but also provides a data upload function through which the curated MC-related data can be updated. (2) MCDB defines confidence levels, classification criteria, and uniform naming rules for MC-related data, which significantly improve data reliability and retrieval convenience. (3) MCDB offers protein sequence alignment and target prediction functions. The former can be used to predict new potential MC-related genes and proteins, and the latter can facilitate the identification of potential target proteins of unknown compounds for MC. In general, MCDB provides a dedicated, standardized, and comprehensive data retrieval and analysis platform that can facilitate the design and discovery of MC-based antitumor drugs and promote tumor biomedical and clinical treatment research in the long term.

MCDB is deployed on a cloud Linux server (CentOS 7.5.1804) [28], with Nginx (version 1.18) [29] as the web server and SQLite (version 3.32.3) [30] as the database server. The back end and front end of the website are based on the Django framework (version 3.0.8) [31] and the Bootstrap framework (version 4.4.1) [32], respectively. The website is openly accessible at http://www.combio-lezhang.online/MCDB/index_html/ (Fig. 1).

As shown in Fig.
2A, we manually identified 188 MC-related genes/proteins from 1012 original publications by searching PubMed with the keyword "mitotic catastrophe". Meanwhile, we identified 1126 MC-related genes/proteins in the GO knowledgebase by analyzing the items involved in the mitosis biological process [22,33]. By merging the results from GO and PubMed, we obtained 1214 genes/proteins. Next, we unified gene names, protein names, synonyms, and UniProt accessions according to the standards of UniProt [34]. For example, when the same gene/protein has multiple names and abbreviations, we annotated its gene name and protein name by the UniProt standard and placed its other names and abbreviations in the synonyms field. When different genes/proteins share the same name, we distinguished them by their corresponding UniProt accessions.

As shown in Fig. 2B, we obtained 5014 compounds from more than 7000 published articles by text mining. Next, we used ChemBioDraw Ultra (version 14.0.0.117) to draw the compound structures and then obtained the Simplified Molecular Input Line Entry Specification (SMILES) expression of each compound by SMILES conversion [35]. Subsequently, we calculated the International Chemical Identifier (InChI) and its hashed form (InChI Key) from the SMILES according to the International Union of Pure and Applied Chemistry (IUPAC) standard.

To improve the reliability of the data on MC-related genes and proteins, we defined confidence level criteria for them (Table 1 and Fig. 2A). The first level (94 proteins, 7.74%) has literature support and GO item association; the second level (89 proteins, 7.33%) has literature support, but GO items are not mentioned; the third level (106 proteins, 8.73%) has GO item association, and homologous enzymes have been reported in the literature; the fourth level (925 proteins, 76.20%) has GO items associated with mitosis, which may be involved in MC.
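The four confidence levels above amount to a small decision rule over two kinds of evidence. The following is a minimal sketch of that rule, assuming each record carries three boolean flags; the function name, the flag names, and the behavior for records with no evidence are illustrative assumptions, not part of MCDB.

```python
def confidence_level(has_literature, has_go, homolog_in_literature=False):
    """Return a confidence level (1-4) following the criteria in the text.

    All parameter names are hypothetical; MCDB's internal representation
    of evidence is not described in the paper.
    """
    if has_literature and has_go:
        return 1  # literature support and GO item association
    if has_literature:
        return 2  # literature support, GO items not mentioned
    if has_go and homolog_in_literature:
        return 3  # GO association, homologous enzymes reported
    if has_go:
        return 4  # only associated with mitosis-related GO items
    return None   # no evidence: assumed here to be excluded from the database


# Entry counts per level as reported in the text.
REPORTED_COUNTS = {1: 94, 2: 89, 3: 106, 4: 925}
TOTAL_ENTRIES = sum(REPORTED_COUNTS.values())  # matches the 1214 genes/proteins
```

The order of the checks matters: literature evidence is tested first, so a record with both literature support and a GO association is never demoted to a GO-only level.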
To resolve the unnecessary duplication and confusion caused by inconsistent descriptions of a compound's effect on a protein in the literature, we defined classification criteria for compounds (Table 2 and Fig. 2B). An inhibitor is a molecule that inhibits the function of a protein without inducing a conformational change in the protein; this class includes inhibitors, antagonists, inactivators, destabilizing agents, and so on. An activator is a molecule that activates the function of a protein without inducing a conformational change in the protein; this class includes activators, agonists, enhancers, stabilizing agents, and so on. An allosteric regulator is a molecule that induces a conformational change in the protein.

BLAST+ is used to compare the similarity of protein sequences [36–38]. After the user submits a sequence in FASTA format, sequence comparisons [39–41] are carried out automatically. Each entry in the library is paired with the submitted sequence to calculate the attributes Length, Score, Expect, Identities, Positives, and Gaps [42], which are detailed in Table 3. Generally, the threshold of Expect should be less than or equal to 1 × 10⁻⁵, meaning that 1 × 10⁻⁵ matches would be expected to occur by chance [43].

Because compounds with similar molecular structures usually have the same or similar target proteins and similar biological properties [44–47], we employ the Tanimoto coefficient to compute the two-dimensional similarity score between compound structures [48,49], described by Eq. (1). Here, X and Y represent the binary molecular fingerprints of the two compounds. The Tanimoto coefficient ranges over [0, 1], and the threshold is generally set to 0.8 [50–52].

We also employ SHAFTS to compute the three-dimensional HybridScore between structures [53,54], described by Eqs. (2)–(6). For the ShapeScore, i and j denote atoms of molecules A and B, respectively, d_ij is the interatomic distance between atoms i and j, and g is the width of a Gaussian related to the van der Waals radii. The final ShapeScore is normalized to the range between 0 and 1. For the FeatureScore, i and j denote feature points of A and B with the same type f, d_ij is the distance between points i and j, and R_f is the overlap tolerance, with a default value of 0.8 Å. The final FeatureScore is normalized to the range between 0 and 1. HybridScore is the sum of ShapeScore and FeatureScore, so it ranges between 0 and 2, and two compounds are considered to have a certain similarity when HybridScore is equal to or greater than 1.

Figure 1 The homepage of the MCDB.

Researchers usually focus on proteins that have associated compounds or structures, since these are very useful for pharmacological and pharmacochemical research [55–57]. Here, we define ratios for compounds (Ratio_c) and structures (Ratio_s) to show the distribution of these types of proteins in MCDB. We use Ratio_c to describe the proportion of proteins that have associated compounds in the MCDB-curated MC-related data, by Eq. (7). Here, N_m is the number of compounds that MC-related protein m has in MCDB, and Max(N_m) is the number of compounds of the MC-related protein that has the most compounds in MCDB. We use Ratio_s to describe the proportion of proteins that have structures in the MCDB-curated MC-related data, by Eq. (8). Here, N_i is the number of structures that MC-related protein i has in MCDB, and Max(N_i) is the number of structures of the MC-related protein that has the most structures in MCDB.

MCDB is composed of seven functional modules, which allow users to browse, search, and analyze MC-related data.
These functional modules are "Data Visualization", "MC-related Gene and Protein Search", "MC-related Compound Search", "PDB Search", "Protein Sequence Alignment", "Target Prediction", and "Upload". MCDB is available at http://www.combio-lezhang.online/MCDB/index_html/ (Fig. 1).

As shown in Fig. 2, MCDB collected and organized 1214 MC-related genes/proteins and 5014 compounds from more than 8000 original articles and the Gene Ontology (GO) knowledgebase. Because irregular abbreviations and duplicate names of genes and proteins make retrieval and use inconvenient, we unified all gene names, protein names, synonyms, and UniProt accessions according to the standards of UniProt [34]. When the same gene/protein has multiple names and abbreviations, we annotated its gene name and protein name by the UniProt standard and placed its other names and abbreviations in the synonyms field. When different genes/proteins share the same name, we distinguished them by their corresponding UniProt accessions.

To improve the reliability of the data on MC-related genes and proteins, as well as the convenience of storage and retrieval, these genes and proteins are divided into 4 confidence levels according to the reliability of the evidence for their correlation with MC (Fig. 2 and Table 1). The first confidence level accounts for 94 entries (7.74%), the second for 89 entries (7.33%), the third for 106 entries (8.73%), and the fourth is the largest with 925 entries (76.20%) (Fig. 3 and Table 1). MCDB also collected 5014 compounds targeting the MC-related genes and proteins of the first three confidence levels. To eliminate the unnecessary duplication and confusion caused by inconsistent descriptions of a compound's effect on a protein across different articles, we divided these compounds into 3 classifications, as listed in Table 2. Fig.
3 shows that there are 4630 inhibitors, 319 activators, and 65 allosteric regulators, which account for 92.34%, 6.36%, and 1.30% of the total compound entries, respectively.

The Data Visualization module shows users the statistics and distribution of MC-related genes, proteins, and compounds in the database. After clicking the "Data Visualization" link on the homepage, users can view the distribution of MCDB-curated MC-related genes/proteins and compounds under different confidence levels or classifications, as well as the distribution of proteins that have associated compounds or structures. Fig. 4A shows the data distribution for MC-related genes and proteins under different confidence levels, which are described in Table 1 and Fig. 2A. Here, the first, second, third, and fourth confidence levels have 94 entries (7.74%), 89 entries (7.33%), 106 entries (8.73%), and 925 entries (76.20%), respectively. Fig. 4B shows the data distribution for MC-related compounds under different classifications, which are described in Table 2 and Fig. 2B. Here, the numbers (percentages) of inhibitors, activators, and allosteric regulators are 4630 entries (92.34%), 319 entries (6.36%), and 65 entries (1.30%), respectively. As in previous studies, researchers often focus on proteins that have associated compounds or structures [55–57]. Fig. 4C and D show the distributions of genes and proteins with compounds and structures, respectively. For example, after we input "30" and click the "Submit" button at the top of Fig. 4C, the bottom of Fig. 4C shows Ratio_c, defined by Eq. (7), for the top 30 proteins with associated compounds. When the mouse hovers over a circle, the purple box of Fig. 4C shows the corresponding gene name "GSK3B" and its 263 compounds. Similarly, after we input "30" and click the "Submit" button at the top of Fig. 4D, the bottom of Fig. 4D shows Ratio_s, defined by Eq. (8), for the top 30 proteins that have structures. When the mouse hovers over a circle, the purple box of Fig. 4D shows the corresponding gene name "INS" and its 284 structures.

Table 3 Attributes for protein sequence alignment.
Attribute: Description
Length: Represents the sequence length of the protein.
Score: Derived from the raw score, calculated as the sum of substitution and gap scores, taking the statistical properties of the scoring system into account.
Expect: Represents the number of different alignments with scores equivalent to or better than the raw score that are expected to occur in a database search by chance.
Identities: The extent to which two sequences have the same residues at the same positions in an alignment.
Positives: The extent to which two sequences are related in an alignment.
Gaps: Calculated as the sum of the gap opening penalty and the gap extension penalty.

The MC-related Gene and Protein Search module lets users query MC-related genes and proteins and returns standardized information, including UniProt accession, gene name, GO identifier, GO term, protein name, synonyms, confidence level, and the PubMed identifier (PMID) of the relevant MC literature. A hyperlink for data download is also provided so that users can obtain all the data on MC-related genes and proteins. This module helps users determine which genes and proteins are related to MC and obtain standardized, detailed information about them, so as to further study the MC-related biological processes and molecular mechanisms in which they participate. After clicking the "MC-related Gene and Protein Search" link on the homepage (Fig. 1), we can query the information for MC-related genes and proteins by inputting the UniProt Accession or Gene Name (Fig. 5A).
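Since MCDB stores its data in SQLite, a lookup by UniProt accession or gene name can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table name `gene_protein`, its columns, and the helper `search_gene_protein` are hypothetical, because the paper does not describe the actual schema; the single sample row uses the BRD4 record reported in the text.

```python
import sqlite3

# Hypothetical schema: MCDB's real table and column names are not given in the paper.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE gene_protein (
           uniprot_accession TEXT PRIMARY KEY,
           gene_name TEXT,
           protein_name TEXT,
           synonyms TEXT,
           confidence_level INTEGER
       )"""
)
# One sample row, taken from the BRD4 record reported in the text.
conn.execute(
    "INSERT INTO gene_protein VALUES (?, ?, ?, ?, ?)",
    ("O60885", "BRD4", "Bromodomain-containing protein 4", "HUNK1", 1),
)


def search_gene_protein(connection, query):
    """Match the query against UniProt accession or gene name, ignoring case."""
    q = query.strip().upper()
    cursor = connection.execute(
        "SELECT uniprot_accession, gene_name, protein_name, synonyms, confidence_level "
        "FROM gene_protein "
        "WHERE upper(uniprot_accession) = ? OR upper(gene_name) = ?",
        (q, q),
    )
    return cursor.fetchall()


rows = search_gene_protein(conn, "brd4")
```

Parameter binding (the `?` placeholders) keeps user input out of the SQL string, which matters for any search box exposed on a public website.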
For example, after we input the Gene Name "BRD4" in the search box and click the "Submit" button in Fig. 5A, Fig. 5B shows that its UniProt Accession is "O60885", Gene Name is "BRD4", GO Identifier is "GO:0043123", GO Term is "positive regulation of I-kappaB kinase/NF-kappaB signaling", Protein Name is "Bromodomain-containing protein 4", Synonyms is "HUNK1", Confidence Level is "1" (described in Table 1), and MC-PMID is "31199520,19596781". When the mouse hovers over the help marks next to "GO Identifier", "Synonyms", "Confidence Level", or "MC-PMID", MCDB automatically shows the related explanations or instructions. Fig. 5B also shows that the GO identifier "GO:0043123" and the PMIDs "31199520,19596781" have hyperlinks. By clicking the "GO:0043123" hyperlink, users can obtain the full GO-related information; by clicking the "31199520" hyperlink, users can go to the PubMed website for further study.

The MC-related Compound Search module lets users query MC-related compounds and returns detailed information, including UniProt accession, gene name, confidence level, SMILES, InChI, InChI Key, molecular formula, molecular weight, and classification. A hyperlink for data download is also provided in this module so that users can obtain all the data on MC-related compounds. This module helps users determine whether compounds are related to MC and obtain detailed information about MC-related compounds, so as to further understand the roles these compounds play in regulating MC and MC-related genes/proteins. After clicking the "MC-related Compound Search" link on the homepage (Fig. 1), we can query the information for MC-related compounds by inputting SMILES, InChI, InChI Key, UniProt Accession, or Gene Name (Fig. 6A). Additionally, users can fill in the desired number of records per page before submitting the query; otherwise, ten records are displayed per page by default. For example, after inputting the SMILES "O=C1C(CO)(CO)N2CCC1CC2" and clicking the "Submit" button, Fig.
6B shows that its UniProt Accession is "P04637", Gene Name is "TP53", Confidence Level is "1" (described in Table 1), SMILES is "O=C1C(CO)(CO)N2CCC1CC2", InChI is "1S/C9H15NO3/c11-5-9(6-12)8(13)7-1-3-10(9)4-2-7/h7,11-12H,1-6H2", InChI Key is "RFBVBRVVOPAAFS-UHFFFAOYSA-N", Molecular Formula is "C9H15NO3", Molecular Weight is "185.22", and Classification is "activator" (described in Table 2). When the mouse hovers over the help marks next to "Confidence Level" or "Classification", MCDB automatically shows the related explanations. Users can also enter P04637 (UniProt accession) or TP53 (gene name) to query all of its inhibitors, activators, and allosteric regulators. Notably, the structures can be displayed as in Fig. 6C, and the SMILES, InChI, and InChI Key of each compound in this module make it more convenient for users to run extended queries and research with external software or links. Additionally, users can click the sorting markers next to "Molecular Weight" or "Classification" to sort the retrieval results.

The PDB Search module lets users query structural information on MC-related proteins, including UniProt accession, gene name, confidence level, PDB code, released date, and method. Users can obtain the structural information of MC-related proteins through the PDB Search module to facilitate further biophysics research and structure-based drug design and discovery. After clicking the "PDB Search" link on the homepage (Fig. 1), we can query the structures of MC-related proteins by inputting a UniProt Accession (Fig. 7A). For example, after inputting the UniProt Accession "Q969H0", the first row of Fig. 7B shows that UniProt Accession is "Q969H0", Gene Name is "FBXW7", Confidence Level is "2" (described in Table 1), PDB Code is "2OVP", Released Date is "2007-04-24", and Method is "X-RAY DIFFRACTION 2.9 Å".
Note that "Released Date" refers to the release date of the protein structure and "Method" refers to the experimental method used to determine the protein structure. Additionally, Fig. 7B shows that PDB Codes such as "2OVP", "2OVR", "2OVQ", "5V4B", and "5IBK" have hyperlinks. Clicking these hyperlinks takes users to the PDB [24] website for further research. In this module, users can sort the results by "PDB Code" or "Released Date".

The Protein Sequence Alignment module compares the protein sequence submitted by users with the protein sequences in the database. Users can evaluate the results by considering both the "Confidence Levels" defined in Table 1 and the "Scores" shown in Fig. 8. Since proteins with similar sequences may have similar functions and structures, users can use the protein sequence alignment results for protein structure simulation, function prediction, evolutionary research, and especially the prediction of novel potential MC-related genes and proteins. After clicking the "Protein Sequence Alignment" link on the homepage (Fig. 1), we can compute the similarity between the submitted protein sequence and the MC-related protein sequences of MCDB by inputting the protein sequence into the Protein Sequence Alignment module (Fig. 8A). For example, after submitting the sequence of CDK12 in Fig. 8A, Fig. 8B shows the protein sequence alignments with their key attributes (Table 3) in decreasing order of the attribute "Score". The first row of Fig. 8B shows that UniProt Accession is "Q14004", Gene Name is "CDK13", Confidence Level is "3", Length is "1512", Score is "803 bits (2074)", Expect is "0.0", Identities is "408/596 (68%)", Positives is "466/596 (78%)", and Gaps is "31/596 (5%)". When the mouse hovers over the help marks next to "Confidence Level", "Length", "Score", "Expect", "Identities", "Positives", or "Gaps", MCDB automatically shows the related explanations.
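To make the alignment attributes concrete, the sketch below parses a fraction field such as "408/596 (68%)" and applies the Expect threshold of 1 × 10⁻⁵ mentioned in the methods. The function names and the dictionary layout are illustrative assumptions; only the field values come from the CDK13 hit reported above.

```python
import re

E_VALUE_THRESHOLD = 1e-5  # threshold on Expect stated in the text


def parse_fraction(field):
    """Parse a field like '408/596 (68%)' into (count, aligned_length, percent).

    The percentage is recomputed from the two integers rather than read
    from the parenthesized value.
    """
    match = re.fullmatch(r"\s*(\d+)/(\d+)\s*\(\d+%\)\s*", field)
    if match is None:
        raise ValueError(f"unrecognized alignment field: {field!r}")
    count, aligned_length = int(match.group(1)), int(match.group(2))
    return count, aligned_length, 100.0 * count / aligned_length


def is_significant(expect, threshold=E_VALUE_THRESHOLD):
    """Return True when the Expect value is at or below the threshold."""
    return expect <= threshold


# First row of the CDK12 query result reported in the text.
hit = {
    "accession": "Q14004",
    "gene": "CDK13",
    "expect": 0.0,
    "identities": "408/596 (68%)",
    "positives": "466/596 (78%)",
    "gaps": "31/596 (5%)",
}
identical, aligned, identity_pct = parse_fraction(hit["identities"])
```

For this hit, all three fraction fields share the same denominator (596), which is the length of the aligned region rather than the full protein length of 1512.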
Users can also sort the results by "Score", "Expect", "Identities", "Positives", or "Gaps", which helps in analyzing the alignment results. Fig. 8B also shows that each UniProt Accession has a hyperlink. Once we click the hyperlink of a UniProt Accession, MCDB automatically runs the "MC-related Gene and Protein Search" module to retrieve the detailed gene and protein information. Additionally, users can click the "download" button to obtain the full data for further research (Fig. 8B).

The Target Prediction module compares the compound submitted by users with the compounds in the database to predict potential targets. This module provides two target prediction options. One is the Tanimoto Score [48], which predicts targets based on molecular fingerprints; the other is SHAFTS, which predicts targets based on 3D structure [53]. The Target Prediction module not only helps users identify potential target proteins of unknown compounds and predict their impacts on MC, but also facilitates ligand-based drug design and discovery based on MC. After clicking the "Target Prediction" link on the homepage (Fig. 1), we can compute the similarity between the submitted compound and the curated MC-related compounds of MCDB by inputting SMILES into the Target Prediction module (Fig. 9A). For example, after we submit an example SMILES and choose SHAFTS(3D), the first row of Fig. 9B shows that UniProt Accession is "P42345", Gene Name is "MTOR", Classification is "inhibitor", Confidence Level is "2" (described in Table 1), Hybrid Score is "1.045", Shape Score is "0.7931", Feature Score is "0. . When the mouse hovers over the help marks next to "Classification", "Confidence Level", "Hybrid Score", "Shape Score", or "Feature Score", MCDB automatically shows the related explanations.
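The two scoring options can be illustrated with a short sketch. It assumes the standard set-based form of the Tanimoto coefficient for binary fingerprints, |X ∩ Y| / |X ∪ Y|, since Eq. (1) is not reproduced in this text; fingerprints are represented as sets of "on" bit positions, and all function names are hypothetical. The thresholds (0.8 for Tanimoto, 1 for HybridScore) are the ones stated in the methods.

```python
TANIMOTO_THRESHOLD = 0.8  # 2D similarity threshold stated in the text
HYBRID_THRESHOLD = 1.0    # HybridScore threshold stated in the text


def tanimoto(x, y):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits.

    Assumes the standard set-based definition: shared bits divided by
    bits set in either fingerprint.
    """
    x, y = set(x), set(y)
    union = len(x | y)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(x & y) / union


def is_similar_2d(score):
    """Apply the Tanimoto threshold to a 2D similarity score."""
    return score >= TANIMOTO_THRESHOLD


def is_similar_3d(hybrid_score):
    """Apply the HybridScore threshold to a SHAFTS result."""
    return hybrid_score >= HYBRID_THRESHOLD


# Toy fingerprints (illustrative only): 2 shared bits out of 5 distinct bits.
toy_score = tanimoto({1, 2, 3, 4}, {3, 4, 5})
```

Under these thresholds, the MTOR hit above, with a Hybrid Score of 1.045, would count as similar to the submitted compound.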
Users can also sort the results by "UniProt Accession", "Gene Name", "Classification", "Confidence Level", "Hybrid Score", "Shape Score", or "Feature Score", which helps in analyzing the prediction results. As another example, after we submit an example SMILES and choose the Tanimoto Coefficient(2D), the first row of Fig. 9C shows that UniProt Accession is "P61073", Gene Name is "CXCR4", Classification is "inhibitor", Confidence Level is "2", Similarity Score is "0.5469", and SMILES is "C(CCNCCCN)CNCCCN". Here, users can sort the results by "UniProt Accession", "Gene Name", "Classification", "Confidence Level", or "Similarity Score".

The Upload module is designed to keep pace with the constantly increasing experimental data: users can upload MC-related genes, proteins, and compounds that are not yet included in the database, and we can then maintain the database as new MC-related data arrive. After clicking the "Upload" link on the homepage (Fig. 1), users can upload candidate MC-related genes, proteins, and compounds to MCDB. The form in Fig. 10A is used to upload candidate MC-related gene and protein data by filling in the UniProt Accession, Gene Name, Evidence, and Description. The form in Fig. 10B is used to upload a candidate MC-related compound by filling in the SMILES, UniProt Accession, Gene Name, Evidence, and Description. The MCDB curator validates uploaded data periodically. The confirmed data are then saved in MCDB and assigned a corresponding confidence level and classification, as described in Tables 1 and 2.

Currently, MC-related data is increasing very rapidly because of its importance in tumor prevention, treatment, and drug resistance.
However, to the best of our knowledge, there is no specialized database that facilitates the retrieval and analysis of these MC-related data. Therefore, we built the Mitotic Catastrophe Database (MCDB) for extensive MC research (Fig. 1). As indicated by Fig. 2, we have collected a large amount of MC-related data from more than 8000 articles, which offers users a comprehensive MC-related database for further study. Moreover, the upload module of MCDB (Fig. 10) helps us periodically update the rapidly increasing MC-related data. Compared with previous public databases that contain MC-related data, MCDB defines confidence levels and classifications (Tables 1 and 2) for MC-related data, which not only effectively reduces the duplication and confusion resulting from inconsistent descriptions, but also helps users judge the reliability and usefulness of the similarity scores in the Protein Sequence Alignment and Target Prediction modules (Figs. 8 and 9). As described in Figs. 4–10, MCDB offers seven functional modules to search, update, and analyze MC-related data. The Protein Sequence Alignment module (Fig. 8) can contribute to protein structure simulation, function prediction, evolutionary research, and especially the prediction of new potential MC-related proteins by computing the similarity between protein sequences. The Target Prediction module (Fig. 9) can help users explore the potential target proteins of unknown compounds, predict their possible effects on MC, and support ligand-based drug design by computing the similarity between compounds. Although MCDB already fills a gap in MC knowledge among existing databases and offers specific MC-related online services, it can neither update MC-related data automatically nor compute the similarity between MC-related compounds in real time.
To overcome these shortcomings, we will employ natural language processing technology [58,59] to improve the efficiency of manual updates and use high-performance computing [60–62] to reduce the processing time of similarity computing. Finally, we will develop MCDB into a highly integrated web-based MC-related data platform by integrating more advanced bioinformatics applications and algorithms [27,63–69] in the future.

CDC2 and the regulation of mitosis: six interacting mcs genes
Targeting the mitotic catastrophe signaling pathway in cancer
Targeting programmed cell death using small-molecule compounds to improve potential cancer therapy
Molecular mechanisms of radiation-induced cancer cell death: a primer
Biology of glioblastoma multiforme - exploration of mitotic catastrophe as a potential treatment modality
Current advances of tubulin inhibitors as dual acting small molecules for cancer therapy
Comparison of the different mechanisms of cytotoxicity induced by checkpoint kinase I inhibitors when used as single agents or in combination with DNA damage
Opening a door to PARP inhibitor-induced lethality in HR-proficient human tumor cells
Synthetic lethal targeting of mitotic checkpoints in HPV-negative head and neck cancer
Targeted inhibition of Polo-like kinase 1 by a novel small-molecule inhibitor induces mitotic catastrophe and apoptosis in human bladder cancer cells
Impaired Notch signaling leads to a decrease in p53 activity and mitotic catastrophe in aged muscle stem cells
The CHK1 inhibitor MU380 significantly increases the sensitivity of human docetaxel-resistant prostate cancer cells to gemcitabine through the induction of mitotic catastrophe
Wee1 kinase inhibitor AZD1775 effectively sensitizes esophageal cancer to radiotherapy
Mec1 is activated at the onset of normal S phase by low-dNTP pools impeding DNA replication
Discovery of a ruthenium complex for the theranosis of glioma through targeting the mitochondrial DNA with bioinformatic methods
Targeting tubulin-colchicine site for cancer therapy: inhibitors, antibody–drug conjugates and degradation agents
Insight into the selective binding mechanism of DNMT1 and DNMT3A inhibitors: a molecular simulation study
Structure-based drug design and identification of H2O-soluble and low toxic hexacyclic camptothecin derivatives with improved efficacy in cancer and lethal inflammation models in vivo
The ups and downs of poly(ADP-ribose) polymerase-1 inhibitors in cancer therapy - current progress and future direction
Database resources of the national center for biotechnology information
PubChem 2019 update: improved access to chemical data
The Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still going strong
RCSB Protein Data Bank: biological macromolecular structures enabling research and education in fundamental biology, biomedicine, biotechnology and energy
The protein Data Bank
Computed tomography angiography-based analysis of high-risk intracerebral haemorrhage patients by employing a mathematical model
EZH2-, CHD4-, and IDH-linked epigenetic perturbation and its association with survival in glioma patients
Using game theory to investigate the epigenetic control mechanisms of embryo development: comment on
F5 incorporation. Nginx. 2020
SQLite release 3.32
Django Software Foundation. Django 3.0.8 release notes
Bootstrap Team
Gene Ontology: tool for the unification of biology
UniProt: a worldwide hub of protein knowledge
Autophagic compound database: a resource connecting autophagy-modulating compounds, their potential targets and relevant diseases
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
BLAST+: architecture and applications
Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements
CGIDLA: developing the Web Server for CpG island related density and LAUPs (lineage-associated underrepresented permutations) study
Comprehensively benchmarking applications for detecting copy number variation
CpG-island-based annotation and analysis of human housekeeping genes
BLAST+: architecture and applications
Relating protein pharmacology by ligand chemistry
SwissTargetPrediction: updated data and new features for efficient prediction of protein targets of small molecules
Similarity-based machine learning methods for predicting drug–target interactions: a brief review
Application of negative design to design a more desirable virtual screening library
Why is tanimoto index an appropriate choice for fingerprint-based similarity calculations?
Molecular fingerprint similarity search in virtual screening
Unsupervised data base clustering based on daylight's fingerprint and Tanimoto similarity: a fast and automated way to cluster small and large data sets
SuperPred: drug classification and target prediction
ETCM: an encyclopaedia of traditional Chinese medicine
SHAFTS: a hybrid approach for 3D molecular similarity calculation. 1. Method and assessment of virtual screening
ChemMapper: a versatile web server for exploring pharmacology and chemical structure association based on molecular 3D similarity method
Understanding the fabric of protein crystals: computational classification of biological interfaces and crystal contacts
GeoMine: interactive pattern mining of protein–ligand interfaces in the Protein Data Bank
EasyVS: a user-friendly web-based tool for molecule library selection and structure-based virtual screening
GenCLiP 3: mining human genes' functions and regulatory networks from PubMed based on co-occurrences and natural language processing
Integrating semantic query function into D-NetWeaver
Employing graphics processing unit technology, alternating direction implicit method and domain decomposition to speed up the numerical diffusion solver for the biomedical engineering research
Novel 3D GPU based numerical parallel diffusion algorithms in cylindrical coordinates for health care simulation
Developing a multiscale, multi-resolution agent-based brain tumor model by graphics processing units
Exploring the dynamics and interplay of human papillomavirus and cervical tumorigenesis by integrating biological data into a mathematical model
Lineage-associated underrepresented permutations (LAUPs) of mammalian genomic sequences based on a Jellyfish-based LAUPs analysis application (JBLA)
Therapeutic target database 2020: enriched resource for facilitating research and early development of targeted therapeutics
Bioinformatic analysis of chromatin organization and biased expression of duplicated genes between two poplars with a common whole-genome duplication
Exploring the underlying mechanism of action of a traditional Chinese medicine formula, Youdujing ointment, for cervical cancer treatment
2019nCoVAS: developing the web service for epidemic transmission prediction, genome analysis, and psychological stress assessment for 2019-nCoV
Robust needle localization and enhancement algorithm for ultrasound by deep learning and beam steering methods

This work was supported by grants from National Natural Science

The authors have no conflicts of interest to declare.