key: cord-0075776-p15c4r7q
authors: Khaleel, Anas; Alkhawaja, Bayan; Al-Qaisi, Talal Salem; Alshalabi, Lubna; Tarkhan, Amneh H.
title: Pathway analysis of smoking-induced changes in buccal mucosal gene expression
date: 2022-03-17
journal: Egypt J Med Hum Genet
DOI: 10.1186/s43042-022-00268-y
sha: 8ddb7af28a3c3dd6ca8014a65d6c40043ba10d5d
doc_id: 75776
cord_uid: p15c4r7q
BACKGROUND: Cigarette smoking is the leading preventable cause of death worldwide, and it is the most common cause of oral cancers. This study aims to provide a deeper understanding of the molecular pathways in the oral cavity that are altered by exposure to cigarette smoke.
METHODS: The gene expression dataset (accession number GSE8987, GPL96) of buccal mucosa samples from smokers (n = 5) and never smokers (n = 5) was downloaded from the National Center for Biotechnology Information's (NCBI) Gene Expression Omnibus (GEO) repository. Differential expression was ascertained via NCBI's GEO2R software, and Ingenuity Pathway Analysis (IPA) software was used to perform a pathway analysis.
RESULTS: A total of 459 genes were found to be significantly differentially expressed in smoker buccal mucosa (p < 0.05). A total of 261 genes were over-expressed while 198 genes were under-expressed. The top canonical pathways predicted by IPA were nitric oxide and reactive oxygen production at macrophages, macrophages/fibroblasts and endothelial cells in rheumatoid arthritis, and thyroid cancer pathways. The IPA upstream analysis predicted that the TP53, APP, SMAD3, and TNF proteins as well as the drug dexamethasone would be top transcriptional regulators.
CONCLUSIONS: IPA highlighted critical pathways of carcinogenesis, mainly nitric oxide and reactive oxygen production at macrophages, and confirmed widespread injury in the buccal mucosa due to exposure to cigarette smoke.
Our findings suggest that cigarette smoking significantly impacts gene pathways in the buccal mucosa and may highlight potential targets for treating the effects of cigarette smoking.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s43042-022-00268-y.
Tobacco smoking is responsible for one in six of all deaths from non-communicable diseases, leading experts to identify tobacco control as the highest priority public health intervention [1, 2]. The prevalence of smoking has fallen around the world over the past three decades, but the absolute number of people who smoke has increased [3]. Despite a coordinated worldwide effort against smoking, there are around 1.1 billion current smokers, and this number is expected to reach 1.9 billion by 2025 if current smoking patterns are maintained [4]. Cigarette smoke contains over 5000 chemicals, of which 98 have been identified as carcinogenic or probably carcinogenic to humans [5]. The plethora of carcinogens in cigarette smoke perturbs biological pathways related to cellular proliferation, inflammation, and tissue injury, with strong links to various types of cancer [6, 7]. In cancer patients, cigarette smoking has been associated with an increased symptom burden as well as a reduced efficacy of chemotherapy [6, 8].
Smoking-induced differential gene expression has been well documented in previous studies. In fact, smoking has a characteristic impact on the transcriptome, as it activates inflammatory and oxidative responses, changes airway structures, and alters gene expression across tissue types [9]. Previous studies have shown that cigarette smoking significantly alters the gene expression profiles of adipose tissue, buccal cells, nasal epithelial cells, lung tissue, and whole blood [10-14].
The aim of the current study is to broaden the understanding of the molecular pathways that are altered in buccal mucosa after exposure to cigarette smoke. Gene expression data from smokers and never smokers were analyzed via Ingenuity Pathway Analysis (IPA), which is a web-based software application that identifies new targets within the context of biological systems.
The microarray dataset investigated in the present study was obtained from the National Center for Biotechnology Information's (NCBI) Gene Expression Omnibus (GEO) repository (accession number GSE8987). This dataset included gene expression data of buccal mucosa samples from smokers (n = 5) and never smokers (n = 5) [15]. Smokers were classified as those who had smoked at least 10 cigarettes per day and who had a cumulative smoking history of at least 10 pack years [15]. Table 1 shows the gene expression data samples included in the current study.
As per the original study by Sridhar et al., buccal mucosa samples were collected from the study participants by scraping the inside of their mouths with a concave plastic tool with serrated edges. Total RNA was extracted from buccal mucosa samples using TRIzol reagent (Invitrogen, Carlsbad, CA), and RNA integrity was assessed using a denaturing agarose gel. The Affymetrix Human Genome U133A (HG-U133A) Array (Affymetrix, Santa Clara, CA) was then used to profile the gene expression of the extracted total RNA samples [15].
The demographics of the 10 subjects varied with regard to sex, age, and race. Among the 5 smokers, the mean age was 36 years (± 8 years), with 1 male and 4 females. Similarly, the mean age of the 5 never smokers was 31 years (± 9 years), with 2 males and 3 females. In terms of race, the smoker group comprised 3 Caucasians and 2 African Americans, while the never-smoker group consisted of 2 Caucasians and 3 African Americans.
Demographic data for individual subjects were not provided in the dataset, but statistical comparisons of the smoker and never-smoker groups yielded non-significant p values for sex (p = 0.42), age (p = 0.36), and race (p = 0.40) [15].
The GEO2R software, which is available on the NCBI website, was used to create a list of 15,000 differentially expressed genes between smoker and never-smoker buccal samples. The 15,000 genes were entered into a Microsoft Excel spreadsheet and sorted by significance (Additional file 1: Table S1). After applying strict cut-off criteria (p < 0.05 and absolute fold change between −0.5 and 1.5), the list of DE genes was narrowed down to 459 genes. The Bioconductor package EnhancedVolcano was used to visualize the 459 DE genes in the form of a labelled volcano plot [16].
The list of DE genes was entered into IPA software (QIAGEN, Hilden, Germany), where the 'core analysis' function of the software was used to interpret the data in terms of canonical pathways and upstream regulators. The Bioconductor package clusterProfiler was used to carry out an over-representation analysis of the DE genes [17, 18]. Similarly, the SIGnaling Network Open Resource 2.0 (SIGNOR 2.0) was used to explore the signaling networks that exist between the DE genes [19].
Figure 1 displays a volcano plot of the full list of DE genes. However, only 459 genes exhibited significant differential expression, with 261 genes found to be over-expressed and 198 found to be under-expressed. Figure 2 illustrates the chromosomal location, molecular class, and cellular location of the 459 DE genes. Chromosome 1 had the highest number of significantly DE genes (n = 63), followed by chromosome 6 (n = 30), chromosome 2 (n = 29), and chromosome 19 (n = 27). Similarly, the most represented molecular classes among the significantly DE genes were enzymes (19.6%) and transcription regulators (12%).
Lastly, the majority of the significantly DE genes were located either in the cytoplasm (40.5%) or the nucleus (25.7%). Table 2 lists the most significantly DE genes between smoker and never-smoker buccal mucosa samples, showing that protein-coding genes occupy the top ranks in terms of significance.
Figure 3A demonstrates the interplay between the DE oncological pathways, cytokines, and genes in smoker buccal mucosa, namely the IL2, EGFR, and ESR2 genes. Other than TIMP3, all the proteins in the pathway were predicted to be inhibited in smoker buccal mucosa. Figure 3B illustrates the results of an interaction network analysis of the DE genes in smoker buccal mucosa. Interestingly, the RPA1 gene was shown to have the highest number of interactions with the other DE genes in smoker buccal mucosa, but it did not have a significant level of differential expression (p > 0.05).
The top 20 regulators predicted by IPA included the TP53, APP, SMAD3, and TNF proteins as well as the drug dexamethasone, among other molecules (Table 3). Figure 4 illustrates the data in Table 3 and emphasizes the predicted activation status of the top upstream regulators as revealed by IPA. As can be seen from Fig. 4, the most inhibited upstream regulator in smoker buccal mucosa is predicted to be the TP63 protein. Dexamethasone was predicted to be a top upstream regulator and affected a total of 78 genes via indirect interactions (Fig. 5A). Likewise, microRNA-8 (miR-8) was found by IPA to be among the top upstream regulators to be activated, as miR-8 targeted 7 of the DE genes between smokers and never smokers (Fig. 5B). Of those genes, 5 (CCND2, ITGAV, QKI, RPS6KB1, and SMAD2) were under-expressed and 2 (BMP2 and CLDN3) were over-expressed.
Further analysis of the top upstream regulator proteins resulted in the construction of gene-gene (Fig. 6) and protein-protein (Fig. 7) interaction networks.
Figure 6 shows that 36.04% of the top upstream regulator proteins were predicted to have interactions with one another, 26.19% had shared protein domains, and 22.85% were co-expressed. Similarly, Fig. 7 shows that the TP53 and TNF proteins had the highest number of interactions with the other top upstream regulator proteins.
The most significant canonical pathway was identified as the nitric oxide and reactive oxygen production at macrophages (Table 4). The DE genes in smoker buccal mucosa are significantly associated with cancer and organismal injury, among other diseases (Table 5).
Pathway and functional enrichment analysis
Figure 8 illustrates the most over-represented biological processes in smoker buccal mucosa. Interestingly, craniosynostosis and fibroid tumors were revealed to be the most significantly over-represented biological processes. Figure 9 shows the results of signaling network analysis of the 459 significantly DE genes, with the SMAD2 gene having the most interactions. SMAD2 is directly downregulated by the CTDSPL and SKIL genes and indirectly upregulated by the BMP2 gene.
The most significantly differentially expressed (DE) protein-coding genes in smoker buccal mucosa were the CHD5, QKI, BATF3, and IL6R genes, which have previously reported associations with smoking and related diseases. The CHD5 gene, which is a tumor suppressor gene that is preferentially expressed in the nervous system and testis, was significantly upregulated in smoker buccal mucosa [20, 21]. CHD5 is believed to serve as a master regulator in tumor-suppressive networks, and CHD5 expression levels are strongly associated with the prognosis of several cancers, including hepatocellular carcinoma and non-small cell lung cancer [20, 22-24]. One study found that a rare CHD5 variant, rs12564469-rs9434711, contributed to the risk of hepatocellular carcinoma, a risk effect which was statistically significant in alcohol drinkers but not smokers [25].
The QKI gene contributes to a number of human diseases, including cancers, myelin disorders, and schizophrenia, and it is a critical regulator of alternative splicing in cardiac myofibrillogenesis and contractile function [26]. QKI has also been identified as a master regulator of alternative splicing in human lung cancer cell lines, but no significant statistical association was found between QKI expression and smoking status in lung tumors [27, 28]. Moreover, QKI was identified as a significantly altered gene in the ciliated epithelial cells of lungs affected by chronic obstructive pulmonary disease (COPD), a disease that is primarily caused by tobacco smoking [29].
The BATF3 gene belongs to the AP-1 transcription factor family, whose members respond to a range of pathological and physiological stimuli by mediating gene expression [30]. BATF3 controls the differentiation of dendritic cells, inhibits the differentiation of regulatory T cells, and critically regulates the development of memory T cells [31, 32]. BATF3 expression in the lungs was necessary to induce protection against allergic airway inflammation through tolerization with Helicobacter pylori extract [33]. Moreover, the acute inhalation of electronic cigarette smoke by healthy never smokers led to the significant upregulation of BATF3, among other genes that play a role in promoting tumorigenesis [34].
The IL6R gene is a pleiotropic regulator of both acquired and innate immune responses, and it is believed to be expressed in the lungs [35]. There have been conflicting findings regarding the benefits of anti-IL-6R therapy for COVID-19-induced acute respiratory distress syndrome [36, 37]. In the context of smoking, exposure to cigarette smoke led to increased IL6R mRNA levels in primary bronchial epithelial cell lines [38].
Moreover, a certain IL6R haplotype (rs6684439-rs7549250-rs4129267-rs10752641-rs407239) has been associated with a lower COPD risk in a Mexican Mestizo population, while the IL6R variant Asp358Ala did not show any association with COPD [39, 40].
Pseudogene expression was also altered in smoker buccal mucosa, most notably in the upregulation of FMO6P, ZNF259P1, and ZNF702P and the downregulation of ALDOAP2 and PNLIPRP2. FMO6P has significant sequence homology with the FMO3 gene, the latter of which functions to metabolize a small amount of nicotine [41]. A single nucleotide variation in the FMO6P pseudogene, rs6608453, was associated with nicotine dependence in African Americans [42]. Likewise, ALDOAP2 was over-expressed in both healthy and non-healthy smokers compared to non-smokers, while exposure to cigarette smoke resulted in the upregulation of the PNLIPRP2 polymorphic pseudogene in a murine model [43, 44]. In contrast, ZNF259P1 and ZNF702P did not have previously reported associations with smoking. ZNF259P1 was significantly correlated with the tumor size of primary lung adenocarcinomas, while ZNF702P was found to be upregulated after BCL2L10 knockdown in two ovarian cell lines [45, 46].
Analysis of upstream regulators revealed that the tumor protein 53 (TP53) gene was the most significantly DE regulator in smoker buccal mucosa. TP53 restrains cellular proliferation by guarding against genomic mutation, and TP53 mutations are among the most common genetic alterations in human cancers [47]. Tobacco smoking is known to influence TP53 mutation patterns and frequencies in lung cancer and urothelial cell carcinoma patients [48, 49]. In fact, a large proportion of TP53 mutations in the lung cancers of smokers were G → T transversions, a primary mutagenic signature that is caused by DNA damage from tobacco smoke [50].
The most significant canonical pathway identified by IPA was the "nitric oxide and reactive oxygen production at macrophages".
Nitric oxide and reactive oxygen species are essential for maintaining redox balance, but they also act in pathological processes [51]. Tobacco smoke contains large numbers of free radicals, including nitric oxide and reactive oxygen species (ROS), that cause oxidative stress at the cellular and subcellular levels [52, 53]. In turn, smoking-induced oxidative stress activates inflammatory response pathways that produce endogenous ROS at the site of oxidative stress, potentially causing further oxidative damage to that site [53]. Smoking also reduces the production of nitric oxide while elevating the production of ROS in endothelial cells [54, 55]. Smoking-induced ROS production is especially concerning as it may contribute to the progression of endometrial adenocarcinoma [56].
Among the DE genes, those associated with craniosynostosis and fibroid tumors were over-represented in smoker buccal mucosa. Craniosynostosis, which is caused by the premature fusion of cranial sutures, is the second-most common cranio-facial anomaly [57]. Smoking during pregnancy was associated with an increased risk of craniosynostosis, while exposure to secondhand smoke modestly increased the risk of this birth defect [58]. Maternal smoking impacts cranio-facial development by acting upon variant alleles of the transforming growth factor alpha (TGF-α) gene, and genetic variation of the TGF-α gene is associated with increased risk of cranio-facial defects [59, 60].
Fibroid tumors are non-cancerous growths that develop inside or on the uterus and are the most common type of pelvic tumor detected in women [61]. Previous studies that investigated the impact of smoking on fibroid tumors yielded conflicting results. Earlier studies suggested that smoking had a protective effect against fibroid tumors, but subsequent studies have shown either a negative effect or no relationship at all [61, 62].
It is worthwhile to note that smoking has been shown to have an anti-estrogenic effect in women, resulting in an earlier natural menopause as well as protective associations with the risk of estrogen-related cancers [63, 64].
Pathway network analysis revealed that the SMAD2 gene had the highest number of interactions with other DE genes, and it was also a target of miR-8. SMAD3 was predicted by IPA to be an inhibited upstream regulator. The SMAD Family Member 2 (SMAD2) gene encodes a protein that is vital for early development, and SMAD2 mutations were associated with complex cranio-facial defects in a murine model [65]. SMAD2, SMAD3, and SMAD4 mediate the signal transduction of transforming growth factor-β (TGF-β) superfamily members, the latter of which induce a range of effects that involve cellular differentiation, proliferation, migration, and apoptosis [66].
The present study is affected by a few limitations. The sample size was relatively small, and the patient samples differed in terms of sex and race, which could confound the interpretation of the genetic variation. Additionally, several differentially expressed genes in smoker buccal mucosa were uncharacterized or unmapped to pathways, meaning that their effects are not considered in the current analysis.
The current findings underscore the importance of the inflammatory response and oxidative stress as major components of smoking-induced tissue injury. Most significantly, nitric oxide-related inflammation stands as one of the canonical pathways underlying the genetic and molecular pathway changes associated with exposure to cigarette smoke. Future lines of research should focus on validating the results of the current study in a larger population to ascertain potential therapeutic targets in the context of smoking-induced damage.
[1] Effective tobacco control is key to rapid progress in reduction of non-communicable diseases
[2] Priority actions for the non-communicable disease crisis
[3] Spatial, temporal, and demographic patterns in prevalence of smoking tobacco use and attributable disease burden in 204 countries and territories, 1990-2019: a systematic analysis from the Global Burden of Disease Study
[4] Lung cancer 2020: epidemiology, etiology, and prevention
[5] Hazardous compounds in tobacco smoke
[6] Effects of tobacco smoke on gene expression and cellular pathways in a cellular model of oral leukoplakia
[7] Cigarette smoking: cancer risks, carcinogens, and mechanisms
[8] The effect of cigarette smoking on cancer treatment-related side effects
[9] Effect of smoking on gene expression profile - overall mechanism, impact on respiratory system function, and reference to electronic cigarettes
[10] Tobacco smoking-response genes in blood and buccal cells
[11] Smoking induces coordinated DNA methylation and gene expression changes in adipose tissue with consequences for metabolic health
[12] Tobacco smoking increases the lung gene expression of ACE2, the receptor of SARS-CoV-2
[13] Tobacco-related alterations in airway gene expression are rapidly reversed within weeks following smoking-cessation
[14] A whole-blood transcriptome meta-analysis identifies gene expression signatures of cigarette smoking
[15] Smoking-induced gene expression changes in the bronchial airway are reflected in nasal and buccal epithelium
[16] EnhancedVolcano: publication-ready volcano plots with enhanced colouring and labeling
[17] clusterProfiler: an R package for comparing biological themes among gene clusters
[18] 0: a universal enrichment tool for interpreting omics data
[19] SIGNOR 2.0, the SIGnaling network open resource 2.0 2019 update
[20] Role of CHD5 in human cancers: 10 years later
[21] CHD5, a new member of the chromodomain gene family, is preferentially expressed in the nervous system
[22] CHD5 a tumour suppressor is epigenetically silenced in hepatocellular carcinoma
[23] The single-nucleotide polymorphisms in CHD5 affect the prognosis of patients with hepatocellular carcinoma
[24] CHD5 is a potential tumor suppressor in non small cell lung cancer (NSCLC)
[25] A rare CHD5 haplotype and its interactions with environmental factors predicting hepatocellular carcinoma risk
[26] QKI is a critical pre-mRNA alternative splicing regulator of cardiac myofibrillogenesis and contractile function
[27] The RNA-binding protein QKI suppresses cancer-associated aberrant splicing
[28] A large-scale analysis of alternative splicing reveals a key role of QKI in lung cancer
[29] Single cell RNA sequencing identifies IGFBP5 and QKI as ciliated epithelial cell genes associated with severe COPD
[30] AP-1 subunits: quarrel and harmony among siblings
[31] The transcription factor Batf3 inhibits the differentiation of regulatory T cells in the periphery
[32] Cutting edge: Batf3 expression by CD8 T cells critically regulates the development of memory populations
[33] Effective treatment of allergic airway inflammation with Helicobacter pylori immunomodulators requires BATF3-dependent dendritic cells and IL-10
[34] Altered lung biology of healthy never smokers following acute inhalation of E-cigarettes
[35] Framingham Heart Study genome-wide association: results for pulmonary function measures
[36] Genetic IL-6R variants and therapeutic inhibition of IL-6 receptor signalling in COVID-19
[37] Anti-IL6R role in treatment of COVID-19-related ARDS
[38] ADAM17 and EGFR regulate IL-6 receptor and amphiregulin mRNA expression and release in cigarette smoke-exposed primary bronchial epithelial cells from patients with chronic obstructive pulmonary disease (COPD)
[39] Genetic variants in IL6R and ADAM19 are associated with COPD severity in a Mexican mestizo population
[40] Neutrophil-mediated IL-6 receptor trans-signaling and the risk of chronic obstructive pulmonary disease and asthma
[41] Genetic susceptibility to nicotine and/or alcohol addiction: a systematic review
[42] Targeted sequencing identifies genetic polymorphisms of flavin-containing monooxygenase genes contributing to susceptibility of nicotine dependence in European American and African American
[43] Modulation of atherogenic lipidome by cigarette smoke in apolipoprotein E-deficient mice
[44] Genomic mechanisms of transformation from chronic obstructive pulmonary disease to lung cancer
[45] Bcl2l10 induces metabolic alterations in ovarian cancer cells by regulating the TCA cycle enzymes SDHD and IDH1
[46] Hidden treasures in "ancient" microarrays: gene-expression portrays biology and potential resistance pathways of major lung cancer subtypes and normal tissue
[47] TP53 mutations in human cancers: origins, consequences, and clinical use
[48] TP53 mutation spectrum in smokers and never smoking lung cancer patients
[49] Mutations in TP53, but not FGFR3, in urothelial cell carcinoma of the bladder are influenced by smoking: contribution of exogenous versus endogenous carcinogens
[50] TP53 mutation spectrum in lung cancers and mutagenic signature of components of tobacco smoke: lessons from the IARC TP53 mutation database
[51] Oxidative damage in alveolar macrophages exposed to cigarette smoke extract and participation of nitric oxide in redox balance
[52] Active smoking causes oxidative stress and decreases blood melatonin levels
[53] Relationships among smoking, oxidative stress, inflammation, macromolecular damage, and cancer
[54] Cigarette smoke adversely influences nitric oxide bioavailability by effects on L-arginine transport and oxidative stress in endothelial cells
[55] Reactive oxygen species are involved in smoking-induced dysfunction of nitric oxide biosynthesis and upregulation of endothelial nitric oxide synthase: an in vitro demonstration in human coronary artery endothelial cells
[56] The cigarette smoke components induced the cell proliferation and epithelial to mesenchymal transition via production of reactive oxygen species in endometrial adenocarcinoma cells
[57] Craniosynostosis as a clinical and diagnostic problem: molecular pathology and genetic counseling
[58] Craniosynostosis and maternal smoking
[59] Association of genetic variation of the transforming growth factor-alpha gene with cleft lip and palate
[60] Gene/environment interactions in craniosynostosis: a brief review
[61] Epidemiology and risk factors of uterine fibroids
[62] Environmental tobacco smoke and risk of late-diagnosis incident fibroids in the Study of Women's Health across the Nation (SWAN)
[63] The antiestrogenic effect of cigarette smoking in women
[64] Cigarette smoking and estrogen-related cancer
[65] Smad2 role in mesoderm formation, left-right patterning and craniofacial development
[66] TGF-beta receptor-mediated signalling through Smad2, Smad3 and Smad4
Publisher's Note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The authors would like to acknowledge the efforts of the University of Petra's Faculty of Pharmacy and Medical Sciences, its dean, and its Department Head of Pharmacology and Biomedical Sciences.
The online version contains supplementary material available at https://doi.org/10.1186/s43042-022-00268-y. Additional file 1. List of DE genes downloaded from GEO2R.
A complete list of gene names and their abbreviations are delivered in the supplementary data files.
AK was involved in conceptualization, writing (review & editing), formal analysis, validation, and visualization. BK and LS were involved in formal analysis, methodology, and writing (original draft). AT and TQ were involved in visualization and writing (review & editing). All authors have read and approved the manuscript.
This study was not funded.
The current report utilized a previously published dataset for the analysis.
The dataset used in this work was acquired from The National Center for Biotechnology Information's (NCBI) Gene Expression Omnibus (GEO) depository (accession number GSE8987).
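For readers who wish to retrace the gene-level filtering on a GEO2R export such as Additional file 1, the significance cut-off (p < 0.05) and the split into over- and under-expressed genes might be sketched as below. This is a minimal illustration in Python and not the authors' workflow, which relied on GEO2R, Microsoft Excel, and Bioconductor packages. The column names `P.Value`, `logFC`, and `Gene.symbol`, the helper names `read_table` and `split_de_genes`, and the toy rows are all assumptions made for the example. The fold-change criterion is deliberately left out because its exact form is not fully specified in the text, and the sign convention of `logFC` depends on how the two groups were defined in GEO2R.

```python
import csv
import io


def read_table(text):
    """Parse a tab-separated table (header row first) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def split_de_genes(rows, p_cutoff=0.05, p_col="P.Value", fc_col="logFC"):
    """Keep rows with p below the cut-off and split them by logFC sign.

    Assumption: a positive logFC means higher expression in smokers.
    Rows with missing or non-numeric values are skipped.
    """
    over, under = [], []
    for row in rows:
        try:
            p_value = float(row[p_col])
            log_fc = float(row[fc_col])
        except (KeyError, TypeError, ValueError):
            continue
        if p_value >= p_cutoff:
            continue
        if log_fc > 0:
            over.append(row)
        elif log_fc < 0:
            under.append(row)
    return over, under


# Toy table with made-up probes and values, for illustration only.
TOY_TABLE = "\n".join(
    "\t".join(fields)
    for fields in [
        ("ID", "P.Value", "logFC", "Gene.symbol"),
        ("probe_a", "0.001", "1.2", "GENE_A"),
        ("probe_b", "0.03", "-0.8", "GENE_B"),
        ("probe_c", "0.20", "2.0", "GENE_C"),
        ("probe_d", "NA", "0.5", "GENE_D"),
        ("probe_e", "0.049", "0.0", "GENE_E"),
    ]
)

if __name__ == "__main__":
    over, under = split_de_genes(read_table(TOY_TABLE))
    print("over-expressed:", [r["Gene.symbol"] for r in over])
    print("under-expressed:", [r["Gene.symbol"] for r in under])
```

Applied to the real export, the two returned lists would correspond to the over- and under-expressed gene sets reported in the results, provided the same p value column and group ordering are used.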