key: cord-0917779-culfxw48 authors: Gu, H.; Yuan, G. title: Identification of potential biomarkers and inhibitors for SARS-CoV-2 infection date: 2020-09-18 journal: nan DOI: 10.1101/2020.09.15.20195487 sha: 88f71ccd1915e2a676e2e28a3561f9adc28adbb7 doc_id: 917779 cord_uid: culfxw48

The COVID-19 pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has overwhelmed many health systems globally. Here, we aim to identify biological markers and associated biological processes of COVID-19 using a bioinformatics approach to elucidate its potential pathogenesis. The gene expression profile of the GSE152418 dataset was originally produced using the high-throughput Illumina NovaSeq 6000. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Gene Ontology (GO) enrichment analyses were applied to identify functional categories and biochemical pathways. The KEGG and GO results suggested that biological pathways such as cancer pathways and insulin pathways were the most affected in the development of COVID-19. Moreover, we found that several genes, including EP300, CREBBP, and POLR2A, were involved in viral activity in COVID-19 patients. Based on the L1000FWD analysis, we further predicted that some inhibitors may have the potential to block SARS-CoV-2 infection. Therefore, our study provides further insights into the underlying pathogenesis of COVID-19.

COVID-19, caused by SARS-CoV-2 infection, has affected a large number of countries, with increasing morbidity and mortality [1]. Most COVID-19 patients exhibit mild-to-moderate symptoms, and a small group of patients progress to a severe stage, typically within a week. Early reports showed that 21% of COVID-19 patients died in New York City during March 2020 [2]. Aged patients and those with medical comorbidities such as diabetes, hypertension, lung diseases and cardiovascular diseases have a higher mortality rate [3]. Currently, there are no curative therapies for COVID-19.
Therefore, understanding SARS-CoV-2 pathogenesis is critical to the development of therapeutics. Recent studies have suggested that uncontrolled inflammation drives disease severity in COVID-19 [4]. Most COVID-19 patients are characterized by increased numbers of neutrophils and exhibit increased plasma levels of pro-inflammatory cytokines, including IL6, IL1 and MCP-1 [5]. Uncontrolled pro-inflammatory cytokines may lead to shock, respiratory failure and multiple organ failure in COVID-19 patients [6]. However, little is known about the mechanisms underlying COVID-19, and whether individuals in different parts of the world respond differently to SARS-CoV-2 remains unknown.

SARS-CoV-2 is an RNA virus with spike-like glycoproteins [7]. The development of vaccines for COVID-19 is largely dependent on the specific RNA and protein structure [8, 9]. Genomic analysis is expected to have a dramatic impact on modern antiviral drug discovery [10]. High-throughput microarray methodologies and advanced drug development, such as remdesivir, have drawn extensive attention [11]. Thus, there is an urgent need to identify potential targets or biomarkers for COVID-19 patients through genomics.

In this study, we investigated the effect of SARS-CoV-2 on human peripheral blood mononuclear cells (PBMCs). We identified several differentially expressed genes (DEGs), inhibitors, and relevant biological processes of COVID-19 through comprehensive bioinformatics analysis. We performed functional enrichment, pathway analysis, and protein-protein interaction (PPI) analysis to find COVID-19 gene nodes in PBMCs. These key genes and pathogenetic factors could be critical to guide future clinical and therapeutic interventions.

Gene expression profile dataset GSE152418 was obtained from the GEO database (http://www.ncbi.nlm.nih.gov/geo/).
The data were produced using an Illumina NovaSeq 6000 (Homo sapiens) (Developmental and Cognitive Neuroscience, Yerkes National Primate Research Center, Atlanta, GA 30329-4208, US). The GSE152418 dataset contained data from 17 COVID-19 subjects and 17 healthy controls.

CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint

Data acquisition and preprocessing

The raw gene expression data for the SARS-CoV-2-positive samples and negative controls were subsequently processed with R scripts. We used a classical t test to identify DEGs, with P<.01 and fold change ≥1.5 considered statistically significant.

Gene Ontology (GO) and pathway enrichment analysis

Gene Ontology (GO) analysis is a widely used approach to annotate genomic data and identify characteristic biological information. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database is commonly used for systematic analysis of gene functions and annotation of biological pathways. GO analysis and KEGG pathway enrichment analysis of the DEGs in this study were performed with the Database for Annotation, Visualization, and Integrated Discovery (DAVID) (http://david.ncifcrf.gov/). P<.05 and gene counts >10 were considered statistically significant.

Molecular Complex Detection (MCODE) was used to analyze the densely connected regions in the PPI networks. The significant modules were extracted from the constructed PPI networks using MCODE. The function and pathway enrichment analyses were performed using DAVID, with P<.05 as the cutoff criterion.

L1000FWD, an NIH Common Fund program, was used to identify potential novel inhibitors. L1000FWD calculates the similarity between an input gene expression signature and the LINCS-L1000 data to rank inhibitors that can regulate the transcriptional signature [12].
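As a rough illustration of ranking compounds by signature similarity, the sketch below scores each compound signature against a query signature with a simple signed-overlap measure. This measure is an assumption chosen for illustration: the text does not specify how L1000FWD computes its scores, and the function names, gene labels, and compound labels are hypothetical placeholders rather than real data.

```python
def signed_overlap_score(query_up, query_down, sig_up, sig_down):
    """Score in [-1, 1]: positive if a compound signature mimics the query
    signature, negative if it reverses it (illustrative measure only)."""
    query_up, query_down = set(query_up), set(query_down)
    sig_up, sig_down = set(sig_up), set(sig_down)
    # Genes moving in the same direction in both signatures.
    same = len(query_up & sig_up) + len(query_down & sig_down)
    # Genes moving in opposite directions.
    opposite = len(query_up & sig_down) + len(query_down & sig_up)
    total = len(query_up) + len(query_down)
    return (same - opposite) / total if total else 0.0


def rank_reversers(query_up, query_down, signatures):
    """Return (compound, score) pairs, strongest reversers first.

    signatures: dict mapping a compound label to a (up_genes, down_genes) pair.
    """
    scored = [
        (name, signed_overlap_score(query_up, query_down, up, down))
        for name, (up, down) in signatures.items()
    ]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical placeholder signatures, not taken from LINCS-L1000.
    signatures = {
        "compound_A": ({"G3", "G4"}, {"G1"}),
        "compound_B": ({"G1"}, {"G3"}),
    }
    for name, score in rank_reversers({"G1", "G2"}, {"G3", "G4"}, signatures):
        print(name, score)
```

A compound whose signature opposes the disease signature receives a negative score and sorts to the top, which is the sense in which a candidate inhibitor "reverses" the input signature in this sketch.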
For the inhibitor ranking, an adjusted p-value of 0.05 was used as the threshold for statistical significance.

RAW 264.7 cells were purchased from the American Type Culture Collection (ATCC® TIB-71™). Cells were cultured in DMEM supplemented with 10% FBS and 1% penicillin/streptomycin and incubated at 37 °C under 5% CO2. The cells were induced with 1 µg/mL LPS for 24 hours and treated with different potential inhibitors for 24 hours: anisomycin (20 µM, Sigma), BRD-K60870698 (20 µM, Santa Cruz), QL-X-138 [...]

Total RNA was extracted from the cells using TRIzol reagent (Invitrogen, USA) as described previously [13]. cDNA was obtained using a reverse transcription kit according to the manufacturer's instructions (TAKARA, US). PCR amplification was carried out for a total of 40 cycles, and expression was normalized to GAPDH. All reactions were performed in triplicate, and the relative expression was determined using the 2^(-ΔΔCt) method.

For statistical analysis, Prism 7 software (GraphPad Software, USA) was used. The data are expressed as the mean ± S.E.M. A two-tailed Student's t-test was performed to determine the significance of the difference between two groups. One-way analysis of variance (ANOVA) with Dunnett's post hoc test was used to compare more than two groups. P values < 0.05 were considered significant.

To gain insight into host responses to SARS-CoV-2 infection, the modular transcriptional signature of COVID-19 patients was compared with that of the healthy controls. A total of 1254 genes were identified as differentially expressed in COVID-19 samples at a threshold of P<0.001. The top 10 up- and down-regulated genes for COVID-19 and negative samples are listed in Table 1.
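The DEG selection rule from the methods (a classical t test combined with a fold-change cutoff) can be sketched as follows. This is a minimal illustration, not the authors' R script: it assumes log2-transformed expression values, a pooled-variance Student's t test, and a fold-change cutoff applied in both directions. The function names and the default thresholds (P<.01, fold change ≥1.5, taken from the methods) are choices made for this example.

```python
import math
from statistics import mean, variance


def _betacf(a, b, x, max_iter=200, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        even = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        odd = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        for coef in (even, odd):
            d = 1.0 + coef * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + coef / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def _betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log(1.0 - x)
    )
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def t_test_p(t, df):
    """Two-sided p-value for a Student's t statistic with df degrees of freedom."""
    return _betainc(df / 2.0, 0.5, df / (df + t * t))


def student_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom (needs >= 2 values per group)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    se = math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    if se == 0.0:  # no within-group variation: treat the gene as untestable
        return 0.0, df
    return (mean(group_a) - mean(group_b)) / se, df


def select_degs(expr_case, expr_ctrl, p_cutoff=0.01, fc_cutoff=1.5):
    """Return {gene: (log2_fold_change, p)} for genes passing both cutoffs.

    expr_case / expr_ctrl: dict mapping gene -> list of log2 expression values.
    """
    degs = {}
    for gene, case in expr_case.items():
        ctrl = expr_ctrl[gene]
        log2_fc = mean(case) - mean(ctrl)  # difference of log2 means
        t, df = student_t(case, ctrl)
        p = t_test_p(t, df)
        if p < p_cutoff and abs(log2_fc) >= math.log2(fc_cutoff):
            degs[gene] = (log2_fc, p)
    return degs
```

The p-value is computed from scratch only to keep the sketch dependency-free; where SciPy is available, `scipy.stats.ttest_ind` performs the same two-sample test.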
To further analyze the biological roles and potential mechanisms of the DEGs in COVID-19 samples versus healthy controls, we performed KEGG pathway and GO category enrichment analysis. KEGG (http://www.genome.jp/kegg/) is an encyclopedia of genes and genomes for understanding high-level functions. Our study showed the top ten enriched KEGG pathways: "Pathways in cancer", "Insulin signaling pathway", "Neurotrophin signaling pathway", "T cell receptor signaling pathway", "Fc gamma R-mediated phagocytosis", "Pancreatic cancer", "Phosphatidylinositol signaling system", "Inositol phosphate metabolism", "Acute myeloid leukemia", and "Chronic myeloid leukemia" (Figure 1).

Gene Ontology (GO) analysis is a major bioinformatics initiative to unify the representation of genes and gene products, covering cellular components, molecular functions, and biological processes. We identified the top ten cellular components: "Non-membrane-bounded organelle", "Intracellular non-membrane-bounded organelle", "Membrane-enclosed lumen", "Organelle lumen", "Intracellular organelle lumen", "Nucleolus", "Nuclear lumen", "Microtubule cytoskeleton", "Nucleoplasm", and "Nucleoplasm part" (Figure 1).
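The idea behind this kind of enrichment analysis can be sketched with a plain hypergeometric over-representation test. Note the assumptions: DAVID itself reports a modified Fisher's exact p-value (the EASE score), so the test below is a simplified stand-in, and the function names and input structures are hypothetical. The default thresholds mirror the P<.05 and gene count >10 criteria stated in the methods.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn without replacement from
    n_background genes, of which n_term are annotated to the term."""
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def enrich(deg_set, background, term_to_genes, p_cutoff=0.05, min_count=10):
    """Return (term, overlap_count, p) tuples passing both cutoffs, sorted by p."""
    background = set(background)
    deg_set = set(deg_set) & background
    results = []
    for term, genes in term_to_genes.items():
        genes = set(genes) & background
        overlap = len(deg_set & genes)
        if overlap <= min_count:  # keep only terms with gene counts > min_count
            continue
        p = hypergeom_enrichment_p(len(background), len(genes), len(deg_set), overlap)
        if p < p_cutoff:
            results.append((term, overlap, p))
    return sorted(results, key=lambda row: row[2])
```

In practice the p-values from many terms would also be corrected for multiple testing; that step is omitted here to keep the sketch short.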
We then identified the top ten biological processes: "Cellular macromolecule catabolic process", "Regulation of transcription", "Regulation of RNA metabolic process", "Regulation of transcription, DNA-dependent", "Protein catabolic process", "Cellular protein catabolic process", "Proteolysis involved in cellular protein catabolic process", "Modification-dependent protein catabolic process", "Modification-dependent macromolecule catabolic process", and "Transcription" (Figure 1). We also identified the top ten molecular functions: "Metal ion binding", "Transcription regulator activity", "Transcription factor binding", "Transition metal ion binding", "DNA binding", "Protein serine/threonine kinase activity", "Zinc ion binding", "Transcription cofactor activity", "Transcription activator activity", and "Transcription coactivator activity" (Figure 1).

The PPI network was constructed to further explore the relationships of the DEGs at the protein level. We set the criterion of a combined score >0.7 and created the PPI network, comprising 1198 nodes and 3137 interactions, between negative controls and COVID-19-positive samples. Among these nodes, the top ten genes with the highest scores are shown in Table 2. The top two significant modules of COVID-19 versus control samples were selected to depict the functional annotation of genes (Figure 2). We identified the top ten signaling [...] (Table S1).

We highlighted the top ten inhibitors (Table 3) and further selected the six potential anti-COVID-19 inhibitors with the highest scores identified by the L1000FWD analysis (Figure 3).
Among the six selected inhibitors, anisomycin is used as a DNA synthesis inhibitor; BRD-K60870698 is a protein synthesis inhibitor; QL-X-138 is used for PARP inhibition; BMS-345541 is used as an IKK inhibitor; homoharringtonine is used as a protein synthesis inhibitor; and kinetin riboside is used for EGFR inhibition. Macrophages are key players in innate immunity during SARS-CoV-2 infection: they produce cytokines and drive the activation and regulation of the immune response [14, 15]. We then determined the anti-inflammatory effects of the six predicted inhibitors in LPS-induced macrophages (Figure 4). We found that anisomycin, QL-X-138 and BMS-345541 inhibited IL1, IL6 and TNFα expression during LPS induction (Figure 4). This suggests that the potential inhibitors anisomycin, QL-X-138, and BMS-345541 may block SARS-CoV-2 infection and inflammation in COVID-19 patients.

The COVID-19 disease caused by SARS-CoV-2 has turned into a worldwide catastrophe. Understanding the pathogenesis of COVID-19 is urgent and critical for diagnosis and treatment. In the study by Lee LY et al. [16], the phenotype of COVID-19 in over half of the cancer patients was mild, but mortality was higher than in the general non-cancer population. Based on our KEGG analysis, we found that COVID-19 was closely associated with cancer-related pathways, including acute myeloid leukemia and chronic myeloid leukemia. Besides cancer, diabetes is also associated with decreased host defense immunity and disordered glucose metabolism, which increases susceptibility to COVID-19 [17].
Additionally, our KEGG analysis indicated that COVID-19 is involved in the regulation of insulin pathways. It is probable that COVID-19 aggravates disorders of insulin and glucose metabolism in patients. Thus, protecting patients with cancer-related diseases or diabetes from exposure to SARS-CoV-2 is crucial. Wearing masks, self-isolating, keeping a safe distance and avoiding crowded work environments are the best ways to minimize the risk of COVID-19.

In addition, based on our GO analysis, COVID-19 infection is also associated with the cellular macromolecule catabolic process, including the regulation of RNA metabolic process and proteolysis involved in cellular protein catabolic process. Entry of SARS-CoV-2 is mainly dependent on proteolytic activation of the spike protein [18]. During viral infection, the spike protein is cleaved into the S1 and S2 subunits, and the S2 subunit is released [19]. Other evidence indicates that the SARS-CoV-2 S protein can activate protease-independent and receptor-dependent cellular fusion to promote viral spreading [20].

In our study, the subsequent construction of the PPI network identified several DEGs as potential critical genes during COVID-19, which could be considered as active targets. EP300 and CREBBP target a significant number of proteins for acetylation, including cytosolic proteins involved in essential metabolic processes [21]. POLR2A is a key virus polymerase-interacting protein and is required for viral replication and transcriptional activity [22].
Cell division cycle 20 (CDC20) encodes a regulatory protein and plays important roles in tumorigenesis [23]. Autophagy and the ubiquitin-proteasome system (UPS) are two major intracellular quality control pathways responsible for cellular homeostasis in eukaryotes [24]. Here, we found that the UPS-related genes UBE2C, BTRC, CUL3, FBXW11, and DCAF and the autophagy-related gene ATG7 were related to SARS-CoV-2 infection.

In our study, we identified a number of anti-COVID-19 inhibitors using L1000FWD. Interestingly, among these inhibitors, we found that anisomycin, QL-X-138, and BMS-345541 can also block inflammation in macrophages, which may further inhibit the cytokine storm in COVID-19. Anisomycin inhibits protein synthesis and substantially depresses the levels of the conventional early mRNAs [25].

The inhibitors with a high significance p-value and combined score were selected.
Comparing SARS-CoV-2 with SARS-CoV and influenza pandemics
Presenting Characteristics, Comorbidities, and Outcomes Among 5700 Patients Hospitalized With COVID-19 in the
Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72314 Cases From the Chinese Center for Disease Control and Prevention
An inflammatory cytokine signature predicts COVID-19 severity and survival
SARS-CoV-2 infection: The role of cytokines in COVID-19 disease
COVID-19: immunopathology and its implications for therapy
Mechanisms of coronavirus cell entry mediated by the viral spike protein
An Evidence Based Perspective on mRNA-SARS-CoV-2 Vaccine Development
Immunological considerations for COVID-19 vaccine strategies
Functional genomics in virology and antiviral drug discovery
L1000FWD: fireworks visualization of drug-induced transcriptomic signatures
The Circadian Gene Clock Regulates Bone Formation Via PDIA3
Macrophage cytokines: involvement in immunity and infectious diseases
Pathological inflammation in patients with COVID-19: a key role for monocytes and macrophages
COVID-19 mortality in patients with cancer on chemotherapy or other anticancer treatments: a prospective cohort study
Prevention and management of COVID-19 among patients with diabetes: an appraisal of the literature
Gene of the month: the 2019-nCoV/SARS-CoV-2 novel coronavirus spike protein
The spike protein of severe acute respiratory syndrome (SARS) is cleaved in virus infected Vero-E6 cells
Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV
Exploitation of EP300 and CREBBP Lysine Acetyltransferases by Cancer. Cold Spring Harb Perspect Med
Eleutheroside B1 mediates its antiinfluenza activity through POLR2A and N-glycosylation
Increased CDC20 expression is associated with development and progression of hepatocellular carcinoma
Relationship between the proteasomal system and autophagy
Control of adenovirus early gene expression: a class of immediate early products
Antiviral activity of the natural alkaloid anisomycin against dengue and Zika viruses
Discovery of a BTK/MNK dual inhibitor for lymphoma and leukemia
The nuclear factor NF-kappaB pathway in inflammation
Clock mutant promotes osteoarthritis by inhibiting the acetylation of NFkappaB
RGS12 Is a Novel Critical NF-kappaB Activator in Inflammatory Arthritis
Protein tyrosine phosphatase N2 regulates TNFalpha-induced signalling and cytokine secretion in human intestinal epithelial cells
Two specific drugs, BMS-345541 and purvalanol A induce apoptosis of HTLV-1 infected cells through inhibition of the NF-kappaB and cell cycle pathways