key: cord-0076974-5hw7euta
authors: Reza, Md. Selim; Harun-Or-Roshid, Md.; Islam, Md. Ariful; Hossen, Md. Alim; Hossain, Md. Tofazzal; Feng, Shengzhong; Xi, Wenhui; Mollah, Md. Nurul Haque; Wei, Yanjie
title: Bioinformatics Screening of Potential Biomarkers from mRNA Expression Profiles to Discover Drug Targets and Agents for Cervical Cancer
date: 2022-04-02
journal: Int J Mol Sci
DOI: 10.3390/ijms23073968
sha: c1774c310ba8073104cf83e980ed13797da6100a
doc_id: 76974
cord_uid: 5hw7euta

Bioinformatics analysis has been playing a vital role in identifying potential genomic biomarkers more accurately from an enormous number of candidates by reducing time and cost compared to the wet-lab-based experimental procedures for disease diagnosis, prognosis, and therapies. Cervical cancer (CC) is one of the most malignant diseases seen in women worldwide. This study aimed at identifying potential key genes (KGs), highlighting their functions, signaling pathways, and candidate drugs for CC diagnosis and targeting therapies. Four publicly available microarray datasets of CC were analyzed for identifying differentially expressed genes (DEGs) by the LIMMA approach through GEO2R online tool. We identified 116 common DEGs (cDEGs) that were utilized to identify seven KGs (AURKA, BRCA1, CCNB1, CDK1, MCM2, NCAPG2, and TOP2A) by the protein–protein interaction (PPI) network analysis. The GO functional and KEGG pathway enrichment analyses of KGs revealed some important functions and signaling pathways that were significantly associated with CC infections. The interaction network analysis identified four TFs proteins and two miRNAs as the key transcriptional and post-transcriptional regulators of KGs. Considering seven KGs-based proteins, four key TFs proteins, and already published top-ranked seven KGs-based proteins (where five KGs were common with our proposed seven KGs) as drug target receptors, we performed their docking analysis with the 80 meta-drug agents that were already published by different reputed journals as CC drugs. We found Paclitaxel, Vinorelbine, Vincristine, Docetaxel, Everolimus, Temsirolimus, and Cabazitaxel as the top-ranked seven candidate drugs. Finally, we investigated the binding stability of the top-ranked three drugs (Paclitaxel, Vincristine, Vinorelbine) by using 100 ns MD-based MM-PBSA simulations with the three top-ranked proposed receptors (AURKA, CDK1, TOP2A) and observed their stable performance. Therefore, the proposed drugs might play a vital role in the treatment against CC.

Cervical cancer (CC) is a type of malignancy that arises from the cervix (lower part of the uterus) and is characterized by expulsion and irregular bleeding in the vagina, pelvic pain, and pain during sexual intercourse [1] . It has been reported that the human papillomavirus (HPV) infection causes CC in almost all cases [2] . Currently, CC is placed as the fourth most common type of malignancy among females with a high mortality rate worldwide and the second most prevalent cancer among females in middle-and lowincome countries (MLICs) [3, 4] . According to the 2018 Globocan report, about 569,847 new CC cases with 311,365 mortalities are identified annually [3] . About 14,065 females are diagnosed with CC every year in the United States, causing 5266 deaths [5] . In China, CC is considered the eighth most common cancer in women but, surprisingly, the second most common cancer in women between 15 and 44 years old. About 43% of CC patients are <45 years of age, and 20-28% are <40 years of age [6] . However, cervical cancer has been relatively well controlled for several decades in many high-income countries due to the cervical screening initiatives and effective cancer treatment services, but it remains the most prevalent cause of cancer-related mortality among women in MLICs [4, 7, 8] . There is still debate about the precise molecular pathways between chronic high-risk HPV infection and the CC pathological phase [9] . Growing research has shown that the irregular expression of multiple genes is convoluted in the pathogenesis of CC [10] . Since tumorigenesis involves various complex genetic and epigenetic events, including the overexpression of oncogenes or the inactivation of suppressor genes [11] , the revolution of dysregulated genes in oncogenic pathways might highlight the molecular events underlying tumor formation, which helps insight into CC treatment strategies. Therefore, it is crucial to elucidate the potential molecular mechanisms underlying CC to offer novel therapeutic targets and prognostic biomarkers of CC [12, 13] .

However, de novo (new) drug discovery is a tremendously challenging, time-consuming, and expensive task due to several steps involved in this process, from the target-based drug selection to clinical validation. Drug repurposing (DR) is a promising approach to overcome many of those obstacles in discovering and developing new drugs by exploring the new therapeutic applications of approved drugs that are established for different diseases [14] . It is considered as a supporting process to the conventional drug discovery. Exploring more suitable repurposable drugs for a new disease requires identifying appropriate target proteins associated with the disease. Hub genes/study genes mediated proteins have been considered as the key drug target receptors. Transcriptomics analysis is a widely used popular approach to explore genomic biomarkers [15] [16] [17] [18] [19] [20] [21] . By this time, several authors have suggested several sets of hub genes/study genes to explore molecular mechanisms and pathogenetic processes of CC [22] [23] [24] [25] . CDK1 and TOP2A might play a critical role in controlling the genetic network related to cervical cancer incidence, progression, and metastasis [17, 26, 27] . Among them, some authors suggested candidate drugs for the treatment against CC [16, 28] . Nevertheless, none of them have investigated the resistance of their suggested drugs against the independent receptors proposed by others. A question may be raised, namely, how can a drug be effective globally for all people around the world. Therefore, our main objectives are (i) computational identification of genomic biomarkers (drug targets) for CC, highlighting their functions, pathways, and regulatory factors, (ii) exploring proposed genomic biomarkers-guided candidate drugs for the treatment against CC, and (iii) in silico validation of the proposed drugs against the state-of-the-art alternative top-ranked independent receptors proposed by others. The workflow of the present research is displayed in Figure 1 . 

The datasets GSE6791, GSE27678, GSE63514, and GSE9750 were analyzed to identify DEGs between CC infections and control samples, and the DEGs in each dataset were presented using the volcano plots (Figure 2a-d) , where blue and red dots represented the up-regulated and down-regulated genes, respectively. In GSE6791, a total of 4743 DEGs with 4232 up-regulated and 511 down-regulated genes; in GSE27678, a total of 596 DEGs with 154 up-regulated and 442 down-regulated genes; in GSE63514, a total of 4091 DEGs with 2631 up-regulated and 1460 down-regulated genes; in GSE9750, a total of 2640 DEGs with 711 up-regulated and 1929 down-regulated genes were identified by GEO2R online tool with |logFC| > 1.0 and adjusted p-value < 0.05. Then, we found 78 up-regulated cDEGs and 38 down-regulated cDEGs for CC patients (see Figure 2e ; Table S3 ). 

The datasets GSE6791, GSE27678, GSE63514, and GSE9750 were analyzed to identify DEGs between CC infections and control samples, and the DEGs in each dataset were presented using the volcano plots (Figure 2a-d) , where blue and red dots represented the up-regulated and down-regulated genes, respectively. In GSE6791, a total of 4743 DEGs with 4232 up-regulated and 511 down-regulated genes; in GSE27678, a total of 596 DEGs with 154 up-regulated and 442 down-regulated genes; in GSE63514, a total of 4091 DEGs with 2631 up-regulated and 1460 down-regulated genes; in GSE9750, a total of 2640 DEGs with 711 up-regulated and 1929 down-regulated genes were identified by GEO2R online tool with |logFC| > 1.0 and adjusted p-value < 0.05. Then, we found 78 up-regulated cDEGs and 38 down-regulated cDEGs for CC patients (see Figure 2e ; Table S3 ). 

The datasets GSE6791, GSE27678, GSE63514, and GSE9750 were analyzed to identi DEGs between CC infections and control samples, and the DEGs in each dataset we presented using the volcano plots (Figure 2a-d) , where blue and red dots represented th up-regulated and down-regulated genes, respectively. In GSE6791, a total of 4743 DEG with 4232 up-regulated and 511 down-regulated genes; in GSE27678, a total of 596 DEG with 154 up-regulated and 442 down-regulated genes; in GSE63514, a total of 4091 DEG with 2631 up-regulated and 1460 down-regulated genes; in GSE9750, a total of 2640 DEG with 711 up-regulated and 1929 down-regulated genes were identified by GEO2R onlin tool with |logFC| > 1.0 and adjusted p-value < 0.05. Then, we found 78 up-regulate cDEGs and 38 down-regulated cDEGs for CC patients (see Figure 2e ; Table S3 ). ; red dots and blue dots represented the significantly down-regulated and up-regulated DEGs, respectively. (e) Common up-regulated and down-regulated differentially expressed genes from CC visualized through a Venn diagram. Seventy-eight genes were founded common-up-regulated and thirty-eight genes were founded common down-regulated in CC patients.

The PPI network of cDEGs was constructed using STRING database (Figure 3a) , which contained 92 nodes and 1790 edges. We selected top-ranked eleven (11) cHubGs {AR, AURKA, BRCA1, CCNB1, CDK1, ECT2, ESR1, EZH2, MCM2, NCAPG2, and TOP2A}, applying four topological measures in the PPI network. Then, using MCODE, clusters were selected from the PPI network. It was shown that the most significant cluster had 42 nodes and 850 edges. MCODE analysis demonstrated that the most significant cluster contained the seven hub genes {AURKA, BRCA1, CCNB1, CDK1, MCM2, NCAPG2, and TOP2A} (see Figure 3b ). So, we considered these seven key genes (KGs) for further analysis. visualized through a Venn diagram. Seventy-eight genes were founded common-up-regulated and thirty-eight genes were founded common down-regulated in CC patients.

The PPI network of cDEGs was constructed using STRING database (Figure 3a ), which contained 92 nodes and 1790 edges. We selected top-ranked eleven (11) cHubGs {AR, AURKA, BRCA1, CCNB1, CDK1, ECT2, ESR1, EZH2, MCM2, NCAPG2, and TOP2A}, applying four topological measures in the PPI network. Then, using MCODE, clusters were selected from the PPI network. It was shown that the most significant cluster had 42 nodes and 850 edges. MCODE analysis demonstrated that the most significant cluster contained the seven hub genes {AURKA, BRCA1, CCNB1, CDK1, MCM2, NCAPG2, and TOP2A} (see Figure 3b ). So, we considered these seven key genes (KGs) for further analysis. , and TOP2A) were found to be common across 11 hub genes, so we considered these 7 genes as the key genes. The network represents highly interconnected regions of the PPI network. The network holds 42 nodes and 850 edges.

The network analysis of KGs with TFs detected top-ranked four significant TFs (SP1, TP53, NFYA, and E2F1) as the key transcriptional regulatory factors for KGs (see Figure  4 ). We found TP53 as key TFs for six KGs (AURKA, BRCA1, CCNB1, CDK1, MCM2, and TOP2A), SP1 for four KGs (AURKA, BRCA1, CCNB1, and CDK1), NFYA for four KGs (AURKA, BRCA1, CCNB1, and NCAPG2), and E2F1 for four KGs (AURKA, CDK1, MCM2, and TOP2A). Similarly, the network analysis of KGs with miRNAs identified top-ranked two significant miRNAs, denoted as hsa-mir-24 and hsa-let-7b, that were considered as the key post-transcriptional regulatory factors for KGs sets {AURKA, BRCA1, and CDK1}, {CCNB1, CDK1, and NCAPG2}, respectively (see Figure 4 ). , and TOP2A) were found to be common across 11 hub genes, so we considered these 7 genes as the key genes. The network represents highly interconnected regions of the PPI network. The network holds 42 nodes and 850 edges.

The network analysis of KGs with TFs detected top-ranked four significant TFs (SP1, TP53, NFYA, and E2F1) as the key transcriptional regulatory factors for KGs (see Figure 4 ). We found TP53 as key TFs for six KGs (AURKA, BRCA1, CCNB1, CDK1, MCM2, and TOP2A), SP1 for four KGs (AURKA, BRCA1, CCNB1, and CDK1), NFYA for four KGs (AURKA, BRCA1, CCNB1, and NCAPG2), and E2F1 for four KGs (AURKA, CDK1, MCM2, and TOP2A). Similarly, the network analysis of KGs with miRNAs identified top-ranked two significant miRNAs, denoted as hsa-mir-24 and hsa-let-7b, that were considered as the key post-transcriptional regulatory factors for KGs sets {AURKA, BRCA1, and CDK1}, {CCNB1, CDK1, and NCAPG2}, respectively (see Figure 4 ). Int. J. Mol. Sci. 2022, 23, x FOR PEER REVIEW 5 of 20 Table 1 displayed the top five significantly enriched GO terms and KEGG pathways by involving KGs of CC diseases that are also supported by the literature review [17, 19, 23, [29] [30] [31] [32] [33] [34] [35] [36] . Table S4 . Table 1 displayed the top five significantly enriched GO terms and KEGG pathways by involving KGs of CC diseases that are also supported by the literature review [17, 19, 23, [29] [30] [31] [32] [33] [34] [35] [36] . Table S4 . 

The log-rank test was used to test the significant difference between two survival curves corresponding to low-and high-risk groups based on the seven KGs (AURKA, BRCA1, CCNB1, CDK1, MCM2, NCAPG2, and TOP2A). We observed a significant difference between the two survival probability curves ( Figure 5 ), which indicates that the proposed KGs have strong prognostic power in detecting CC. 

To explore candidate drugs by molecular docking simulation, we considered m= 11 drug target proteins (receptors) corresponding to our proposed seven KGs and their 

To explore candidate drugs by molecular docking simulation, we considered m = 11 drug target proteins (receptors) corresponding to our proposed seven KGs and their regulatory four key TFs and 80 meta-drug agents (ligands) as mentioned in Section 4.3. We downloaded the 3D structure of our proposed 11 receptors (AURKA, BRCA1, CCNB1, CDK1, MCM2, TOP2A, SP1, TP53, NFYA, E2F1) from the Protein Data Bank (PDB) [37] with source codes 1mq4, 1n5o, 2b9r, 4y72, 4uuz, 1zxm, 1sp1, 1aie, 6qmq, 2aze, respectively. The 3D structure of the NCAPG2 target protein was downloaded from AlphaFold source using UniProt [38] ID of Q86XI2. The 3D structures of 80 drugs (see Table S1 ) were downloaded from the PubChem database [39] as mentioned previously.

On the other hand, we reviewed 52 published articles associated with CC infections that provided transcriptome-guided hub proteins (genomic biomarkers) for cross-validation of the proposed key genes and the candidate drug agents. There were 255 hub genes reported in those 52 articles, with 7 hub proteins (AURKA, PCNA, CCNB1, CDC45, MCM2, TOP2A, CDK1) appearing in at least 5 articles (Table S2 ) [15] [16] [17] [18] [19] [20] [21] [23] [24] [25] [26] [27] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50] [51] . Five (AURKA, CCNB1, MCM2, TOP2A, and CDK1) of the seven reported hub proteins were found to be similar to our suggested seven KGs. We downloaded the 3D structure of the published remaining receptors (CDC45, PCNA) from PDB with source codes 5dgo, 1u76, respectively. Then, molecular docking was carried out between total m = 13 receptors (proposed and published) and n = 80 meta-drug agents to calculate the binding affinity scores (kcal/mol) for each pair of receptors and agents. Then, we ordered the target receptors in descending order of row sums of the binding affinity matrix A = (A ij ) and drug agents, according to the column sums, to select a few drug agents as the candidate drugs (see Figure 6 ). Thus, we selected top-ranked three drug agents (Paclitaxel, Vinorelbine, Vincristine) as candidate drugs with binding affinity scores −7.5 kcal/mol ≤ against the 13 receptors. The docked complexes of the top three virtual hits from AutoDock-Vina docking were further considered for protein-ligand interaction profiling. As shown in Figure 7a , AURKA_paclitaxel complex showed three hydrogen bonds with lys141, lys162, asp274 residues. Although the ligand formed major hydrophobic interactions with leu139, val147, leu210, thr217, tyr219, glu260, leu263 residues, and lys162 residue showed additional salt bridges interactions with the ligand. On the other hand, CDK1_vinorelbine (Figure 7b ) formed hydrophobic interactions with tyr15, lys88, leu135 residues. In the case of the TOP2A_vincristine complex, vincristine showed hydrogen bonding to the active site, where the interaction was maintained by glu66 residue. Vincristine also formed hydrophobic interactions with glu66, pro111, lys233, leu257 residues. Furthermore, vin- The docked complexes of the top three virtual hits from AutoDock-Vina docking were further considered for protein-ligand interaction profiling. As shown in Figure 7a , AURKA_paclitaxel complex showed three hydrogen bonds with lys141, lys162, asp274 residues. Although the ligand formed major hydrophobic interactions with leu139, val147, leu210, thr217, tyr219, glu260, leu263 residues, and lys162 residue showed additional salt bridges interactions with the ligand. On the other hand, CDK1_vinorelbine (Figure 7b ) formed hydrophobic interactions with tyr15, lys88, leu135 residues. In the case of the TOP2A_vincristine complex, vincristine showed hydrogen bonding to the active site, where the interaction was maintained by glu66 residue. Vincristine also formed hydrophobic interactions with glu66, pro111, lys233, leu257 residues. Furthermore, vincristine also formed π-cation interactions with lys233 residue (see Figure 7c ). 

Among the proposed candidate drugs, Paclitaxel, Vinorelbine, and Vincristine were the top-ranked three candidate drugs ( Figure 6 ). Therefore, these top three drug agent were selected for their stability analysis through 100 ns MD-based MM-PBSA simula tions.

From Figure 8 , we observed that all the three systems were significantly stable be tween the variations of moving and initial drug-target complexes. We calculated thei RMSD (root mean square deviation). Figure 8a represented the RMSD corresponding to the proposed receptors (AURKA, CDK1, TOP2A). All the systems projected the RMSD around 1 Å to 2.5 Å, except TOP2A complex, which shows the RMSD around 2 Å to 3.7 Å. The average RMSD for AURKA, CDK1, TOP2A complexes were 1.59 Å, 2.11 Å, and 2.80 Å, respectively. CDK1 complex showed slight fluctuation in around 20,000 ps to 28,000 ps and was stabilized in the remaining simulation. As can be seen from the plot AURKA showed a more rigid conformation than the other proteins, also achieved equi 

Among the proposed candidate drugs, Paclitaxel, Vinorelbine, and Vincristine were the top-ranked three candidate drugs ( Figure 6 ). Therefore, these top three drug agents were selected for their stability analysis through 100 ns MD-based MM-PBSA simulations.

From Figure 8 , we observed that all the three systems were significantly stable between the variations of moving and initial drug-target complexes. We calculated their RMSD (root mean square deviation). Figure 8a represented the RMSD corresponding to the proposed receptors (AURKA, CDK1, TOP2A). All the systems projected the RMSD around 1 Å to 2.5 Å, except TOP2A complex, which shows the RMSD around 2 Å to 3.7 Å. The average RMSD for AURKA, CDK1, TOP2A complexes were 1.59 Å, 2.11 Å, and 2.80 Å, respectively. CDK1 complex showed slight fluctuation in around 20,000 ps to 28,000 ps and was stabilized in the remaining simulation. As can be seen from the plot, AURKA showed a more rigid conformation than the other proteins, also achieved equilibrium at 3 ns, and remained stable afterward. In contrast, TOP2A showed a dramatic increase in flexibility, with RMSD values rising gradually from 2 Å to 3.5 Å over time. Here, we calculated the MM-PBSA binding energy for three drug agents as mentioned previously, Figure 8b 

CC is the second most common type of malignant tumor in females with a 5-year survival rate of only about 52%. Thus, more research is required to discover potential biomarkers and candidate drugs for improving the 5-year survival rate and reducing the mortality rate of CC patients [16] .

To investigate the genetic influence of CC infections, we identified 116 shared cDEGs. Among them, we detected seven cDEGs as KGs highlighting their functions, pathways, regulatory factors, and candidate drugs. The literature review suggested that AURKA was an enhanced potential target for cervical cancer treatment [15, 25, 27, 43, 45] . BRCA1 enhanced the sensitivity of cervical squamous cell carcinoma (CSCC) patients to cisplatin-based CCRT by up-regulating STAT1 to activate the JAK/STAT pathway [50] . CCNB1 played vital roles in the progression of CC through different signaling pathways [17, 18, 26, 46, 51] . CDK1 contributed to the occurrence and development of CSCC [18, 23, [25] [26] [27] 40, 41, 44, 46] . MCM2 involved in the carcinogenesis of cervical cancer [16, [19] [20] [21] 23, 27, 42] . TOP2A is regarded as a biomarker for the improved diagnosis of CC [16, 17, 19, 23, 24, 26, 27, 41, 43, 45] . We also investigated the multivariate survival analysis of CC patients with KGs and found a substantial difference between two survival probability curves, indicating that the suggested KGs have a better prognostic capacity for CC detection. Four transcriptional (TFs) and two post-transcriptional (miRNAs) regulatory factors of KGs were introduced in Section 2.3. The four TFs proteins (SP1, TP53, NFYA, E2F1) were further utilized as the drug target receptors. Previous studies also supported that SP1 is an important biomarker for CC [52] . Khan M.A et al. revealed that TP53 has an association with cervical cancer pathogenesis [53] . NFYA promoted cell proliferation and 

CC is the second most common type of malignant tumor in females with a 5-year survival rate of only about 52%. Thus, more research is required to discover potential biomarkers and candidate drugs for improving the 5-year survival rate and reducing the mortality rate of CC patients [16] .

To investigate the genetic influence of CC infections, we identified 116 shared cDEGs. Among them, we detected seven cDEGs as KGs highlighting their functions, pathways, regulatory factors, and candidate drugs. The literature review suggested that AURKA was an enhanced potential target for cervical cancer treatment [15, 25, 27, 43, 45] . BRCA1 enhanced the sensitivity of cervical squamous cell carcinoma (CSCC) patients to cisplatin-based CCRT by up-regulating STAT1 to activate the JAK/STAT pathway [50] . CCNB1 played vital roles in the progression of CC through different signaling pathways [17, 18, 26, 46, 51] . CDK1 contributed to the occurrence and development of CSCC [18, 23, [25] [26] [27] 40, 41, 44, 46] . MCM2 involved in the carcinogenesis of cervical cancer [16, [19] [20] [21] 23, 27, 42] . TOP2A is regarded as a biomarker for the improved diagnosis of CC [16, 17, 19, 23, 24, 26, 27, 41, 43, 45] . We also investigated the multivariate survival analysis of CC patients with KGs and found a substantial difference between two survival probability curves, indicating that the suggested KGs have a better prognostic capacity for CC detection. Four transcriptional (TFs) and two post-transcriptional (miRNAs) regulatory factors of KGs were introduced in Section 2.3. The four TFs proteins (SP1, TP53, NFYA, E2F1) were further utilized as the drug target receptors. Previous studies also supported that SP1 is an important biomarker for CC [52] . Khan M.A et al. revealed that TP53 has an association with cervical cancer pathogenesis [53] . NFYA promoted cell proliferation and tumorigenic properties by transcriptional activation of SOX2 in cervical cancer [54] . The E2F1 protein is considered as a potential biomarker for CC [55] .

To investigate the common pathogenetic processes of KGs, we selected the top five common GO terms for each of BPs, MFs and cellular components, and KEGG pathways that were significantly associated with cervical cancer disease through the cDEGs, including KGs. Among them, the association of the top five common BPs (DNA replication, cell division, G1/S transition of mitotic cell cycle, DNA replication initiation, mitotic nuclear division) with CC was supported by some other individual studies [17, 19, 30] . The top five common MFs (ATP binding, protein binding, DNA helicase activity, chromatin binding, and DNA binding) that were significantly associated with CC disease also received support from some individual studies [17, 19, [31] [32] [33] . Similarly, the association of the top four cellular components (nucleoplasm, midbody, MCM complex, nucleus, and spindle) with CC disease was supported by the literature review [19, [34] [35] [36] . We selected the top five significantly enriched common KEGG pathways (DNA replication, cell cycle, p53 signaling pathway, Oocyte meiosis, and Fanconianemia pathway) that were also reported by some other studies [17, 23, 29] .

To explore effective candidate drugs for the treatment against CC disease, we considered top-ranked proposed receptors and their regulatory four key TFs proteins as the drug target receptors and performed their docking analysis with 80 meta-drug agents (see Table S1 ). For cross-validation, we considered seven published hub proteins, where five proteins were common with our proposed KGs (see Figure 9 ). On the other hand, five articles at a time explored four hub genes individually, two articles explored three hub genes individually, etc. Moreover, TOP2A gene supported by 11 articles, CDK1 gene was supported by 10 articles, MCM2 gene was supported by 7 articles, CDC45 gene was supported by 6 articles, and (AURKA, CCNB1, PCNA) genes were supported by 5 articles (see Figure 9 ). Then, we selected top-ranked three drugs as the candidate drugs, where the first three drugs showed strong binding affinities with all target proteins (Figure 6) , where red * denoted published and proposed receptors, and blue ** denoted published receptors. Some other independent studies also recommended our suggested drugs, including Vincristine [56, 57] , Vinorelbine [57, 58] , and Paclitaxel [56] [57] [58] [59] [60] , for the treatment against CC. Finally, we examined the stability of top-ranked three drugs (Vincristine, Vinorelbine, Paclitaxel) by using 100 ns MD-based MM-PBSA simulations for three top-ranked proposed receptors (AURKA, CDK1, TOP2A) and observed their stable performance according to the laws of physics [61, 62] . Therefore, the proposed candidate drugs might play a vital role in the treatment against CC with comorbidities, since our proposed target proteins are also associated with several comorbidities. The present study emphasizes the further wet-lab experimental validation for both the proposed target proteins and candidate drugs. 

We used both original data and meta-data to reach the goal of this study as described below.

We collected microarray profiles for cervical cancer (CC) disease from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) website (http://www.ncbi.nlm.nih.gov/geo/, (accessed on 10 March 2022)). CC patients microarray datasets of GSE6791 [63] , GSE27678 [64] , and GSE63514 [65] were all based upon the GPL570 Platforms ((HG-U133_Plus_2) Affymetrix Human Genome U133 Plus 2.0 Array), which included 20 CC tissues and 8 normal cervical tissues, 28 CC tissues and 3 normal cervical tissues, and 28 CC tissues and 24 normal cervical tissues, respectively. The GSE9750 [66] dataset used the GPL96 Affymetrix Human Genome U133A Array platform and included 33 CC tissue samples that were primarily marked by HPV16 or HPV18 and 21 normal cervical samples.

We collected host-transcriptome-guided 80 meta-drug agents by the literature review of CC disease (see Table S1 ) for exploring candidate drugs. Thus, we considered 80 drug agents to explore candidate drugs by molecular docking with our proposed receptors. 

We used both original data and meta-data to reach the goal of this study as described below.

We collected microarray profiles for cervical cancer (CC) disease from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) website (http://www.ncbi.nlm.nih.gov/geo/, (accessed on 10 March 2022)). CC patients microarray datasets of GSE6791 [63] , GSE27678 [64] , and GSE63514 [65] were all based upon the GPL570 Platforms ((HG-U133_Plus_2) Affymetrix Human Genome U133 Plus 2.0 Array), which included 20 CC tissues and 8 normal cervical tissues, 28 CC tissues and 3 normal cervical tissues, and 28 CC tissues and 24 normal cervical tissues, respectively. The GSE9750 [66] dataset used the GPL96 Affymetrix Human Genome U133A Array platform and included 33 CC tissue samples that were primarily marked by HPV16 or HPV18 and 21 normal cervical samples.

We collected host-transcriptome-guided 80 meta-drug agents by the literature review of CC disease (see Table S1 ) for exploring candidate drugs. Thus, we considered 80 drug agents to explore candidate drugs by molecular docking with our proposed receptors.

To select the top-ranked hub genes (independent meta-receptors) associated with CC disease, we reviewed 52 published articles and selected the top-ranked 7 target proteins as the independent meta-receptors (see Table S2 ).

The identification of cDEGs was a key step of this study. To identify cDEGs, we identified DEGs for each of GSE6791, GSE27678, GSE63514, and GSE9750 datasets separately, by using linear models for microarray (LIMMA) approach through GEO2R online tool [67] with |logFC| > 1.0 and adjusted p-value < 0.05. The LIMMA approach calculates the p-value by using the modified t-statistics to test the significance of differential gene expressions between two conditions, and p-value is then adjusted by using the Benjamini-Hochberg procedure [68] . Finally, we selected the common DEGs by using four DEGs sets derived from four publicly available microarray datasets.

The PPI network of cDEGs was constructed through the STRING online database (https://string-db.org/, (accessed on 10 March 2022)) [69] . To improve the quality of PPI network, we used the Cytoscape software [70] . The Cytoscape plugin cytoHubba was used to select the common Hub Genes (cHubGs) or common Hub Proteins (cHubPs) from PPI network [70, 71] . The PPI network provides several nodes and edges, which indicate proteins and their interactions, respectively. A node with the largest number of significant interactions/connections/edges with other nodes is considered as the top-ranked cHubGs. The cHubGs were selected by using four topological analyses (Degree [72] , BottleNeck [73] , Betweenness [74] , and Stress [75] ) of the PPI network. Molecular Complex Detection (MCODE) (http://apps.cytoscape.org/apps/mcode, (accessed on 10 March 2022)) plugin of Cytoscape software was used to detect the most profound modules from the PPIs network. Highly interconnected portions were identified through MCODE clustering that assists the research in effective drug designing. For representing molecular complexes in the PPI network, MCODE was used by detecting the densely connected areas [76] . Then, we selected the important key genes (KGs) that were shared by both cHubGs and MCODE clustering genes.

To explore key transcriptional regulatory transcription factors (TFs) and post-transcriptional regulatory micro-RNAs (miRNAs) of KGs, we performed TFs-TGs and miRNAs-KGs interaction network analysis, respectively, by using the NetworkAnalyst web platform [77] . The TFs-KGs and miRNAs-KGs interaction networks were constructed by using the ENCODE (https://www.encodeproject.org/, (accessed on 10 March 2022)) [78] and Reg-Network repository [79] databases, respectively. The Cytoscape software [70] was used to improve the quality of networks. The key regulators were selected by using two topological analyses (Degree [72] and Betweenness [74] ) of networks.

Gene ontology (GO) functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment/annotation/over-representation analysis [80, 81] is a widely used approach to determine the significantly annotated/enriched/over-represented functions/classes/terms and pathways by the identified cDEGs/cHubGs. It is an important part for revealing the molecular mechanisms of actions and cellular roles of genes. The GO terms are categorized into Biological Process, Cellular Component, and Molecular Function [82] . We performed GO and KEGG enrichment analysis using DAVID web tool (https://david.ncifcrf.gov/tools.jsp, (accessed on 10 March 2022)) [83] . The significance level was set to p-value < 0.05.

To examine the prognostic performance of KGs in detecting CC, we performed a multivariate survival analysis of CC patients based on expressions of KGs by using SurvExpress online tool [84] . The significance level was set to p-value < 0.05.

To propose in silico validated efficient candidate drugs for the treatment against CC, we employed a molecular docking study of our proposed receptor proteins with the drug agents. We considered our proposed KGs-based key proteins (KPs) and their regulatory key TFs proteins as the drug target proteins and 80 drug agents as mentioned earlier in Section 4.3 (see Table S1 ). The molecular docking study requires three-dimensional (3D) structures of both receptor proteins and drug agents/ligands. We downloaded the 3D structure of all targeted proteins from the Protein Data Bank (PDB) [37] and SWISS-MODEL [85] . The 3D structures of all drug agents were downloaded from PubChem database [39] . The 3D structures of the target proteins were visualized using Discovery Studio Visualizer 2019 [86] , and protein chains that were not part of the gene were removed. PDB2PQR and H++ servers were utilized to assign the protonation state of target proteins [87, 88] . All the missing hydrogen atoms were also appropriately added. The pKa for target proteins residues were investigated under the physical conditions of salinity = 0.15, internal dielectric = 10, pH = 7, and external dielectric = 80. Further, the protein was prepared for molecular docking by removing ligand heteroatoms and water molecules and by addition of polar hydrogens on AutoDock tools 1.5.7 (developed by ADT, The Scripps Research Institute, La Jolla, CA, USA) [89] . The drug agents/ligands were prepared for molecular docking simulation by setting the torsion tree and rotatable/non-rotatable bonds present in the ligand through AutoDock tools 1.5.7. Then, binding affinities between the target proteins and drug agents were calculated using AutoDock Vina [90] . The exhaustiveness parameter was set to 10. Discovery Studio Visualizer 2019 [86] , PLIP [91] , and PyMol [92] were used to analyze the docked complexes for surface complexes, types, and distances of non-covalent bonds. Let A ij denotes the binding affinity between ith target protein (I = 1, 2, . . . , m) and jth drug agent (j = 1, 2, . . . , n). Then, target proteins were ordered according to the descending order of row sums ∑ n j=1 A ij , j = 1, 2, . . . , m, and drug agents were ordered according to the descending order of column sums ∑ m i=1 A ij , j = 1, 2, . . . , n, to select the few top-ranking drug agents as the candidate drugs. Then, we validated the proposed repurposed drugs by molecular docking study with the top-listed receptor proteins associated with CC infections that were obtained by the literature review. To select the top-listed receptor proteins associated with CC infections, we reviewed 52 recently published articles and selected the top-ranked seven receptor proteins (see Table S2 ).

MD simulations were carried out by using YASARA Dynamics software [93] and the AMBER14 force field [94] to study the dynamic behavior of the top-ranked protein-ligand complexes. A total of six different systems were used to run MD simulations. The systems included the top three hits, AURKA_paclitaxel, CDK1_vinorelbine, and TOP2A_vincristine complexes corresponding to our proposed receptors.

For the complexes, the parameters of ligands were assigned through AutoSMILES [95] algorithms, where unknown organic molecules are parameterized fully automatically by the calculation of semi-empirical AM1 Mulliken point charges with the COSMO solvation model, assigning of AM1BCC [96] atom and bond types, and also assigning general AMBER force field (GAFF) [97] atom types and the remaining force field parameters. Before the simulation, the protein-ligand complexes hydrogen bonding network was optimized and solvated by a TIP3P [98] water model in a simulation cell. Periodic boundary conditions were maintained with a solvent density of 0.997 g L −1 . Titratable amino acids in the protein complex were subjected to pKa calculation during solvation. The initial energy minimization process of each simulation system, consisting of 55,410 ± 10, 72,287 ± 10, and 96,252 ± 10 atoms for AURKA_paclitaxel, CDK1_vinorelbine, and TOP2A_vincristine complexes, was performed by a simulated annealing method, respectively, using the steepest gradient approach (5000 cycles). Each simulation was run with a multiple-time-step algorithm [99] , using a time-step interval of 2.50 fs under physiological conditions (298 K, pH 7.4, 0.9% NaCl) [100] . All bond lengths were constrained using the linear constraint solver (LINCS) [101] algorithm, and SETTLE [102] was used for water molecules. Longrange electrostatic interactions were described by the PME [103] methods, and finally, 100 ns MD simulation was accomplished at Berendsen thermostat [104] and constant pressure. The trajectories were recorded every 250 ps for further analysis, and subsequent analysis was implemented by the default script of YASARA [105] macro and SciDAVis software available at http://scidavis.sourceforge.net/, (accessed on 10 Mar 2021). All snapshots were then subjected to YASARA software's MM-Poisson-Boltzmann surface area (MM-PBSA) binding free energy calculation using the formula below [106] :

Here, MM-PBSA binding energy was calculated using YASARA built-in macros using AMBER 14 as a force field, with larger positive energies indicating better binding [107] .

The present study utilized various well-established bioinformatics tools to reveal KGs, highlighting their regulatory factors and dysregulated molecular functions and pathways that were responsible for CC progression. The five KGs were common between our proposed seven KGs and the top-ranked seven KGs published by others, which indicated that our proposed KGs received more support by the literature review compared to any other individual studies. Finally, we suggested potential drugs, such as vincristine, vinorelbine, paclitaxel, and investigated their stability performance by using 100 ns MD-based MM-PBSA simulations for three top-ranked proposed receptors (AURKA, CDK1, TOP2A), also observing their stable performance. Thus, the proposed molecular biomarkers and repurposing candidate drugs presented in this study have merit for diagnosis and therapies of CC disease.

Knowledge of cervical cancer and screening practices of nurses at a regional hospital in Tanzania

Cervical cancer: A global health crisis

Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries

Estimates of incidence and mortality of cervical cancer in 2018: A worldwide analysis

ICO/IARC Information Centre on HPV and Cancer (HPV Information Centre). Human Papillomavirus and Related Diseases in India

Mortality impact of achieving WHO cervical cancer elimination targets: A comparative modelling analysis in 78 low-income and lower-middle-income countries

Cervical cancer in Africa, Latin America and the Caribbean and Asia: Regional inequalities and changing trends

TCGA Integrated genomic and molecular characterization of cervical cancer The Cancer Genome Atlas Research Network

Human Papillomavirus E6/E7 and Long Noncoding RNA TMPOP2 Mutually Upregulated Gene Expression in Cervical Cancer Cells

Cancer genes and the pathways they control

Aberrant methylation of MEG3 functions as a potential plasma-based biomarker for cervical cancer

MiR-34a and miR-206 act as novel prognostic and therapy biomarkers in cervical cancer

Drug Repurposing (DR): An Emerging Approach in Drug Discovery

Cervical Cancer and Gene Expression Analysis with Key Genes Identification by Computational Method

Screening and Discovery of New Potential Biomarkers and Small Molecule Drugs for Cervical Cancer: A

Identification of key genes and pathways in cervical cancer by bioinformatics analysis

Comprehensive gene and pathway analysis of cervical cancer progression

xi Identification of key genes and pathways of diagnosis and prognosis in cervical cancer by bioinformatics analysis

A Novel Four-Gene Prognostic Signature as a Risk Biomarker in Cervical Cancer

Identification of hub genes as potential prognostic biomarkers in cervical cancer using comprehensive bioinformatics analysis and validation studies

Country Quarantine during COVID-19: Critical or Not?

Bioinformatics analysis of differentially expressed genes and pathways in the development of cervical cancer

Bioinformatics analysis and identification of potential genes related to pathogenesis of cervical intraepithelial neoplasia

Prediction of a miRNA-mRNA functional synergistic network for cervical squamous cell carcinoma

Bioinformatics analysis shows that top2a functions as a key candidate gene in the progression of cervical cancer

Investigation of differentially-expressed microRNAs and genes in cervical cancer using an integrated bioinformatics analysis

Identification of miRNA-mRNA Regulatory Network and Construction of Prognostic Signature in Cervical Cancer

Identification of Core Prognosis-Related Candidate Genes in Cervical Cancer via Integrated Bioinformatical Analysis

Cdc6 contributes to abrogating the G1 checkpoint under hypoxic conditions in HPV E7 expressing cells

Rucaparib antagonize multidrug resistance in cervical cancer cells through blocking the function of ABC transporters

FOXA1 Mutations Reveal Distinct Chromatin Profiles and Influence Therapeutic Response in Breast Cancer

Integrative miRNA-mRNA functional analysis identifies miR-182 as a potential prognostic biomarker in breast cancer

The midbody interactome reveals unexpected roles for PP1 phosphatases in cytokinesis

Nuclear erk: Mechanism of translocation, substrates, and role in cancer

Its Role in DNA Replication and Implications for Cancer Therapy

The Protein Data Bank

UniProt: A worldwide hub of protein knowledge The UniProt Consortium

PubChem 2019 update: Improved access to chemical data

Bioinformatics prediction and analysis of hub genes and pathways of three types of gynecological cancer

Comprehensive analysis of differential gene expression to identify common gene signatures in multiple cancers

A gene interaction network-based method to measure the common and heterogeneous mechanisms of gynecological cancer

Integrated analysis of key microRNAs /TFs /mRNAs/ in HPV-positive cervical cancer based on microRNA sequencing and bioinformatics analysis

The dysregulation of microarray gene expression in cervical cancer is associated with overexpression of a unique messenger rna signature

Network analysis revealed aurora kinase dysregulation in five gynecological types of cancer

Comprehensive Analysis of circRNA Expression Profiles During Cervical Carcinogenesis

Identification of key pathways and genes in the progression of cervical cancer using bioinformatics analysis

Expression and prognosis of CDC45 in cervical cancer based on the GEO database

Bioinformatics analysis of the transcriptional expression of minichromosome maintenance proteins as potential indicators of survival in patients with cervical cancer

Effect of BRCA1 on the Concurrent Chemoradiotherapy Resistance of Cervical Squamous Cell Carcinoma Based on Transcriptome Sequencing Analysis

Ccnb1 expedites the progression of cervical squamous cell carcinoma via the regulation by foxm1

Sp1 contributes to radioresistance of cervical cancer through targeting g2/m cell cycle checkpoint CDK1

Exploring the p53 connection of cervical cancer pathogenesis involving north-east Indian patients

NF-YA promotes the cell proliferation and tumorigenic properties by transcriptional activation of SOX2 in cervical cancer

Feedback between E2F1 and CIP2A regulated by human papillomavirus E7 in cervical cancer: Implications for prognosis

Analysis of Nuclear Encoded Mitochondrial Gene Networks in Cervical Cancer

Chemotherapy for advanced or recurrent cervical cancer

Chemotherapy for recurrent and metastatic cervical cancer

Investigational drugs for the treatment of cervical cancer

Current Treatment for Cervical Cancer: An Update

Theoretical Nuclear Physics

Mechanistic and structural analysis of a family 31 α-glycosidase and its glycosyl-enzyme intermediate

Fundamental differences in cell cycle deregulation in human papillomavirus-positive and human papillomavirus-negative head/neck and cervical cancers

Tissue transglutaminase mediates the pro-malignant effects of oncostatin M receptor over-expression in cervical squamous cell carcinoma

Molecular transitions from papillomavirus infection to cervical precancer and cancer: Role of stromal estrogen receptor signaling

Identification of copy number gain and overexpressed genes on chromosome arm 20q by an integrative genomic approach in cervical cancer: Potential role in progression

Linear models and empirical bayes methods for assessing differential expression in microarray experiments

Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

The STRING database in 2011: Functional interaction networks of proteins, globally integrated and scored

Cytoscape: A software Environment for integrated models of biomolecular interaction networks

Identifying hub objects and sub-networks from complex interactome

Lethality and centrality in protein networks

Functional topology in a network of protein interactions

A Set of Measures of Centrality Based on Betweenness

Structural parameters of communication networks

An automated method for finding molecular complexes in large protein interaction networks

NetworkAnalyst 3.0: A visual analytics platform for comprehensive gene expression profiling and meta-analysis

The ENCODE (ENCyclopedia of DNA Elements) Project. Science

RegNetwork: An integrated database of transcriptional and post-transcriptional regulatory networks in human and mouse

GO:: TermFinder-open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes

Kyoto Encyclopedia of Genes and Genomes

GoPubMed: Exploring PubMed with the gene ontology

Database for Annotation, Visualization, and Integrated Discovery

SurvExpress: An Online Biomarker Validation Tool and Database for Cancer Gene Expression Data Using Survival Analysis

SWISS-MODEL: Homology modelling of protein structures and complexes

Visualizer. Accelrys Software Inc. D Sv4

Expanding and upgrading automated preparation of biomolecular structures for molecular simulations

H++: A server for estimating pKas and adding missing hydrogens to macromolecules

AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility

AutoDock Vina: Improving the Speed and Accuracy of Docking with a New Scoring Function, Efficient Optimization, and Multithreading

Expanding the scope of the protein-ligand interaction profiler to DNA and RNA

PyMOL User's Guide

The amber lipid force field

MOPAC: A semiempirical molecular orbital program

Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation

Development and testing of a general Amber force field

Comparison of simple potential functions for simulating liquid water

New ways to boost molecular dynamics simulations

Fast empirical pKa prediction by Ewald summation

LINCS: A Linear Constraint Solver for molecular simulations

An analytical version of the SHAKE and RATTLE algorithm for rigid water models

A smooth particle mesh Ewald method

Molecular dynamics with coupling to an external bath

Increasing the precision of comparative models with YASARA NOVA-A selfparameterizing force field

Structural dynamics and quantum mechanical aspects of shikonin derivatives as CREBBP bromodomain inhibitors

Computational investigation of curcumin, a natural polyphenol that inhibits the destabilization and the aggregation of human SOD1 mutant (Ala4Val)

Agents in Cervical Cancer: Beyond Bevacizumab

Irinotecan for the treatment of cervical cancer

Osthole promotes the suppressive effects of cisplatin on NRF2 expression to prevent drug-resistant cervical cancer progression

Topotecan in cervical cancer

BRM270 suppresses cervical cancer stem cell characteristics and progression by inhibiting SOX2

Advances in cervical cancer pharmacotherapies

Are bioactive molecules from seaweeds a novel and challenging option for the prevention of HPV infection and cervical cancer therapy?-a review

Chemosensitivity of uterine cervical cancer demonstrated by the histoculture drug response assay

Potential for use of amifostine in cervical cancer

Bioinformatics analysis reveals potential candidate drugs for cervical cancer

Bácskay, I. The enhanced inhibitory effect of different antitumor agents in self-microemulsifying drug delivery systems on human cervical cancer HeLa cells

Pharmacodynamics of current and emerging treatments for cervical cancer

Identification of EPHX2 and RMI2 as two novel key genes in cervical squamous cell carcinoma by an integrated bioinformatic analysis

Emerging drugs for the treatment of cervical cancer

Chemoradiotherapy for cervical cancer in 2010

Treatment of metastatic cervical cancer: Future directions involving targeted agents

Concomitant and Neoadjuvant Chemotherapy for Cervical Cancer

Construction of an immune-related gene signature for prediction of prognosis in patients with cervical cancer

Predicting Hub Genes Associated with Cervical Cancer through Gene CoExpression Networks

A microRNAMessenger RNA Regulatory Network and Its Prognostic Value in Cervical Cancer

Comprehensive analysis of prognostic alternative splicing signature in cervical cancer

Bioinformatics analysis of key genes and pathways of cervical cancer

Identification of crucial aberrantly methylated and differentially expressed genes related to cervical cancer using an integrated bioinformatics analysis

A linear regression and deep learning approach for detecting reliable genetic alterations in cancer using dna methylation and gene expression data

Co-expression network analysis identified atypical chemokine receptor 1 (ACKR1) association with lymph node metastasis and prognosis in cervical cancer

Screening of cervical cancer-related hub genes based on comprehensive bioinformatics analysis

A prognostic signature based on immunerelated genes for cervical squamous cell carcinoma and endocervical adenocarcinoma

Construction of Gene Modules and Analysis of Prognostic Biomarkers for Cervical Cancer by Weighted Gene Co-Expression Network Analysis

Identification of hub genes and key pathways associated with the progression of gynecological cancer

Identification of differentially expressed miRNAs in early-stage cervical cancer with lymph node metastasis across the cancer genome atlas datasets

Determination of a six-gene prognostic model for cervical cancer based on WGCNA combined with LASSO and Cox-PH analysis

Probing pathway-related modules in invasive squamous cervical cancer based on topological centrality of network strategy

Identification of Key Genes in Association with Progression and Prognosis in Cervical Squamous Cell Carcinoma

Screening of Significant Biomarkers Related to Prognosis of Cervical Cancer and Functional Study Based on lncRNA-associated ceRNA Regulatory Network

Integrative systems biology approaches to identify potential biomarkers and pathways of cervical cancer

Dysregulation of KIF14 regulates the cell cycle and predicts poor prognosis in cervical cancer: A study based on integrated approaches

Comprehensive circular RNA expression profile in radiation-treated HeLa cells and analysis of radioresistance-related circRNAs

High olive oil diets enhance cervical tumour growth in mice: Transcriptome analysis for potential candidate genes and pathways

Multi-Omics Data Analyses Construct a Six Immune-Related Genes Prognostic Model for Cervical Cancer in Tumor Microenvironment

SNX10 and PTGDS are associated with the progression and prognosis of cervical squamous cell carcinoma

Identification of potential miRNAs and candidate genes of cervical intraepithelial neoplasia by bioinformatic analysis

Radiogenomic Analysis of Papillary Thyroid Carcinoma for Prediction of Cervical Lymph Node Metastasis: A Preliminary Study

Identification of biomarkers associated with cervical lymph node metastasis in papillary thyroid carcinoma: Evidence from an integrated bioinformatic analysis