key: cord-0863292-5a544bqh authors: Liu, Hengrui; Weng, Jieling title: A Pan-Cancer Bioinformatic Analysis of RAD51 Regarding the Values for Diagnosis, Prognosis, and Therapeutic Prediction date: 2022-03-10 journal: Front Oncol DOI: 10.3389/fonc.2022.858756 sha: c27b4ac6ef5931021bd8edb61b81f228cae73cb6 doc_id: 863292 cord_uid: 5a544bqh
BACKGROUND: RAD51, a critical protein for DNA repair, has been associated with multiple cancer types, but a systematic pan-cancer analysis of RAD51 has not yet been done. METHODS: Data were obtained from multiple open databases, and the genetic alteration, gene expression, survival association, functional enrichment, stemness, mutation association, immunity association, and drug therapy association of RAD51 were analyzed. A prognostic model of RAD51 for overall glioma was constructed as an example application of RAD51 as a biomarker. RESULTS: RAD51 was overexpressed in 28 types of cancer and was associated with worse overall survival in 11 cancer types. RAD51-correlated genes were enriched in cell cycle terms. RAD51 was associated with cancer stemness, tumor mutational burden, and multiple immunomodulators in different cancer types. RAD51 expression differed across immune subtypes in 11 cancer types. RAD51 was closely associated with the cancer immune microenvironment in some cancer types. Proliferating T cells were the cell type with the highest RAD51 expression across most of the cancer samples analyzed. RAD51 expression had an AUC of over 0.5 in 12 of the 23 ICB subcohorts. The Tumor Immune Dysfunction and Exclusion scores of 9 cancer types differed between RAD51 high and low groups. RAD51 expression showed negative correlations with sensitivity to most drugs. A prognostic nomogram was constructed with high confidence. CONCLUSION: RAD51 is a clinically valuable biomarker for multiple cancer types, given its potential for diagnosis, prognosis, and therapeutic prediction.
Cancer is a major public health issue worldwide. During the COVID-19 pandemic, the diagnosis and treatment of cancer were hampered and delayed, resulting in a short-term decrease in cancer incidence this year, which might also lead to an increase in advanced-stage cancers and higher mortality in the next few years (1). Given the complexity of cancer development, there may be common mechanisms shared across different cancer types; hence, pan-cancer analysis of genes of interest, especially genes that might play common roles in multiple cancer types, can contribute to clinical cancer diagnosis, prognosis, and therapy. The Cancer Genome Atlas (TCGA), Genotype-Tissue Expression (GTEx), and the Chinese Glioma Genome Atlas (CGGA) (2-4), as well as other open databases, provide gene expression and clinical data for different cancer types, enabling pan-cancer analysis of these genes across multiple cancer types. Cancer arises from mutation. Genome instability and mutation are considered a hallmark of cancer (5). In non-cancer cells, homologous recombination (HR) is essential for the maintenance of genome stability. HR repairs most DNA lesions through the complementarity of the DNA sequence. A critical step of HR repair is the binding of single-stranded DNA (ssDNA) to RAD51 protein near the repair sites (6, 7). This process has been found to be critical in the tumorigenesis of some cancer types. For instance, first found in breast cancer, the product of the breast cancer-associated gene 2 (BRCA2) mediates the chaperoning of RAD51 onto replication protein A (RPA)-coated ssDNA (8), thereby promoting cancer development. Therefore, HR deficiency in normal tissues has been suggested to be a potential mechanism for tumorigenesis (9).
An early clinical study showed that, in lung cancer patients, overexpression of RAD51 was associated with significantly worse survival (10), which implied a potential prognostic value of RAD51 for cancer patients. Data suggested that the overexpression of RAD51 might promote cancer resistance to chemotherapy and radiotherapy (11-13). RAD51 was found to mediate the resistance of triple-negative breast cancer stem cells to a PARP inhibitor (14). However, it remains unknown whether this altered resistance explains the survival association of RAD51, although bioinformatic studies of the general prognostic power of RAD51 in some cancer types have been reported. For example, RAD51 was reported as a prognostic biomarker for colon cancer (15) and pancreatic cancer (16). In addition, data have suggested that RAD51 might be associated with cancer immunity (17). In breast cancer and liver cancer, the role of RAD51 as a biomarker for immune cell infiltration has been reported (18, 19). However, a systematic pan-cancer analysis of RAD51 has not yet been done. Therefore, this study aimed to investigate the clinical value of RAD51 for 33 cancer types, regarding the potential of RAD51 as a diagnostic, prognostic, and immune therapy predictive biomarker. The graphical abstracts are shown in the Supplementary Materials.
Clinical and genomic data of glioma cohorts were downloaded from The Cancer Genome Atlas (TCGA), Genotype-Tissue Expression (GTEx), and the Chinese Glioma Genome Atlas (CGGA) in May 2021; data acquisition and use complied with the relevant guidelines and policies. Mutation analyses were conducted using cBioPortal (20) and the Open Targets Platform (21). The mutation and variant data were obtained from the TCGA PanCancer Atlas Studies and UniProt. The 3D structure of the RAD51 protein was obtained from the RCSB PDB/PDB-101 [PDB 5nwl (22)].
All analyses and plotting, including ROC plots, Kaplan-Meier (KM) survival plots, and nomogram construction, were implemented in R version 4.0.3 (R Foundation for Statistical Computing, 2020) with ggplot2 (v3.3.2). Multiple data sets on gene overexpression and DNA copy number gain of RAD51 were accessed and analyzed using Oncomine (23). The top 100 RAD51-correlated genes were identified using GEPIA (24). The protein-protein interaction network of the top 100 RAD51-correlated genes was constructed using STRING (25). The minimum required interaction score was set at "high confidence" (>0.9). The active interaction source was set at "Experiments and Databases". All enrichment analyses were conducted using Metascape (26). Immunofluorescence staining images showing the subcellular distribution of RAD51 within the nucleus, endoplasmic reticulum (ER), and microtubules of A431 squamous carcinoma cells, U-2 OS osteosarcoma cells, and GBM cells were obtained from the Human Protein Atlas (HPA) (27). Immunohistochemistry staining images of RAD51 in cancer and non-cancer tissues were accessed from the HPA. Antibody HPA039310 was used to stain RAD51, except in stomach and stomach cancer tissues (antibody CAB010381). The sample details and the general pathological annotations and results were provided by the HPA. Plots of single-cell RNA-sequencing data from the Fluorescent Ubiquitination-based Cell Cycle Indicator (FUCCI) U-2 OS osteosarcoma cell line were accessed and analyzed using the HPA; these data were used to characterize the temporal RAD51 mRNA expression patterns in individual cells. The one-class logistic regression (OCLR) algorithm was used to calculate the mRNA expression-based stemness index (mRNAsi) for the evaluation of cancer stemness. The tumor mutational burden (TMB) and microsatellite instability (MSI) were used to evaluate the mutation levels of samples. Associations of RAD51 with immunomodulators across cancer types were analyzed using TCGA data and TISIDB (28).
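As an illustration of the correlated-gene ranking step, the following minimal Python sketch ranks genes by their Pearson correlation with RAD51 in a small expression table. It is not the GEPIA implementation: the use of Pearson's coefficient here, the function names `pearson` and `top_correlated`, the placeholder gene names, and the toy values are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant profile
    return sxy / math.sqrt(sxx * syy)


def top_correlated(expr, target="RAD51", k=100):
    """Return the k genes most positively correlated with `target`.

    `expr` maps gene name -> list of expression values, one per sample,
    with samples in the same order for every gene (hypothetical layout).
    """
    reference = expr[target]
    scores = []
    for gene, values in expr.items():
        if gene == target:
            continue
        r = pearson(reference, values)
        if not math.isnan(r):
            scores.append((gene, r))
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:k]


# Toy data only: placeholder gene names and made-up values, five samples.
toy_expr = {
    "RAD51": [1, 2, 3, 4, 5],
    "GENE_A": [2, 4, 6, 8, 10],
    "GENE_B": [5, 4, 3, 2, 1],
    "GENE_C": [1, 3, 2, 5, 4],
    "GENE_D": [7, 7, 7, 7, 7],  # constant profile, skipped
}

for gene, r in top_correlated(toy_expr, k=2):
    print(f"{gene}\t{r:.3f}")
```

With a real expression matrix, the same function with `k=100` would yield a gene list that could then be submitted to a network or enrichment tool.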
Associations between RAD51 expression and immune subtypes across human cancers were analyzed using TCGA data and TISIDB. The immune cell infiltration level was calculated using the TCGA cohort. The xCell algorithm was used to estimate the immune cell infiltration levels (29). The single-cell data were accessed and analyzed using TISCH (30). Immune checkpoint blockade (ICB) response was compared between RAD51 low (0-25%) and high (75-100%) expression groups across multiple cancer types. Potential ICB response was predicted using the Tumor Immune Dysfunction and Exclusion (TIDE) algorithm (31). TCGA data were analyzed. GSCALite (32) was used to evaluate the area under the dose-response curve (AUC) values for drugs and the gene expression profiles of RAD51 in different cancer cell lines. Drug sensitivity and gene expression profiling data of cancer cell lines in GDSC and CTRP were integrated for investigation. Spearman correlation analysis was performed to analyze the association between RAD51 expression and small-molecule drug sensitivity (IC50). The ROC plotter (33) was used to analyze associations of RAD51 transcriptome levels with therapeutic responses in breast cancer, ovarian cancer, glioma (female), and colon cancer (non-chemotherapy) cohorts. The Wilcoxon test or the Kruskal-Wallis test was applied to compare gene expression differences. Kaplan-Meier analysis, the log-rank test, and Cox regression were used to conduct survival analysis. Pearson's correlation test was used to evaluate the correlation of two variables, except in the drug sensitivity analysis. P<0.05 was considered statistically significant.
The first section of this study explored whether RAD51 genetic alterations were associated with cancers. The alteration frequency bar plot showed that the total alteration frequencies of most cancer types were lower than 2.5%.
Only cervical adenocarcinoma (6.52% of 46 cases), pleural mesothelioma (5.15% of 87 cases), mature B-cell neoplasms (4.17% of 48 cases), and endometrial carcinoma (3.24% of 586 cases) had alteration frequencies of over 2.5%, but these cancer types, except for endometrial carcinoma, had relatively low case numbers (Figure 1A). Survival data suggested that altered RAD51 was associated with worse overall survival in cancer patients, but the case number for the altered group was relatively small and the p-value was relatively large (Figure 1B). To further investigate the mutation of RAD51 in cancers, the TCGA mutation data were plotted, and only 48 mutations were found, comprising 46 missense and 2 truncating mutations (Figure 1C), as shown in the 3D protein structure (Figure 1D). We also collected variant data from the UniProt database and found 10 disease-associated variants, only one of which was associated with cancers (Figure 1E). Based on these results, this study suggested that RAD51, as an essential protein for the maintenance of genome stability, had a low frequency of gene alterations. Thus, gene alterations of RAD51 might not be a major driver of cancers. This study hypothesized that, although gene alterations of RAD51 might not be a major driver of cancers, the expression of RAD51 might be associated with cancers. This study compared the expression of RAD51 across all tumor types and normal tissues using TCGA and GTEx data to determine whether RAD51 was overexpressed in cancers. A list of the cancer type abbreviations can be found in S-Table 1. Results showed that RAD51 was significantly overexpressed in 28 types of cancer. Mesothelioma (MESO) and uveal melanoma (UVM) had no comparable normal tissue, while acute myeloid leukemia (LAML) was the only cancer type that expressed lower RAD51 in cancer than in normal tissues (Figure 2A).
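The rank-based group comparison named in the Methods (the Wilcoxon test for unpaired groups) can be sketched as follows. This minimal Python example computes the Mann-Whitney U statistic for a tumor group against a normal group and converts it to a ROC AUC through the standard identity AUC = U / (n_tumor × n_normal). The function names and expression values are hypothetical, and the p-value calculation is deliberately omitted.

```python
def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def mann_whitney_u(tumor, normal):
    """U statistic for the tumor group: pairs with tumor > normal, ties count 0.5."""
    ranks = average_ranks(list(tumor) + list(normal))
    n_tumor = len(tumor)
    rank_sum_tumor = sum(ranks[:n_tumor])
    return rank_sum_tumor - n_tumor * (n_tumor + 1) / 2


def auc_from_u(tumor, normal):
    """ROC AUC of expression as a tumor-vs-normal classifier, derived from U."""
    return mann_whitney_u(tumor, normal) / (len(tumor) * len(normal))


# Made-up expression values for illustration only.
tumor = [5.1, 6.3, 4.8, 7.0]
normal = [3.2, 4.8, 2.9]

print("U =", mann_whitney_u(tumor, normal))
print("AUC =", round(auc_from_u(tumor, normal), 3))
```

An AUC near 1 means that tumor samples almost always show higher expression than normal samples, while an AUC near 0.5 means that expression does not separate the two groups.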
To compare cancer and non-cancer tissues under better control, paired cancer and non-cancer samples from the same patients were also compared for the available cancer types. Results showed that 15 cancer types significantly overexpressed RAD51 (Figure 2B). The anatomy plot of the gene expression profile of RAD51 across all tumor samples and paired normal tissues in females and males showed that RAD51 was overexpressed in cancer in most organs (Figure 2C). To evaluate the diagnostic value of RAD51 in the 28 cancer types that significantly overexpressed RAD51, diagnostic ROC curves were plotted for these cancer types, except for cholangiocarcinoma (CHOL), pheochromocytoma and paraganglioma (PCPG), and uterine carcinosarcoma (UCS), which had too few normal tissue samples. Results showed that the AUCs of 15 cancer types were over 0.9. The AUCs of 5 cancer types were between 0.8 and 0.9. The AUCs of 3 cancer types were between 0.7 and 0.8. The AUC of testicular germ cell tumors (TGCT) was 0.685 (Figure 2D). These results suggested that RAD51 had excellent diagnostic value for 18 cancer types (AUC>0.8), acceptable diagnostic value for 3 cancer types (0.7<AUC<0.8), and acceptable prognostic value for the 4 cancer types (0.7