key: cord-0836608-z15vnfll
authors: Wong, Henry Sung-Ching; Guo, Chin-Lin; Lin, Gan-Hong; Lee, Kang-Yun; Okada, Yukinori; Chang, Wei-Chiao
title: Transcriptome network analyses in human coronavirus infections suggest a rational use of immunomodulatory drugs for COVID19 therapy
date: 2021-01-20
journal: Genomics
DOI: 10.1016/j.ygeno.2020.12.041
sha: f48d0cc1632493679370c0c2029267f9e2fd576d
doc_id: 836608
cord_uid: z15vnfll

The recent outbreak of coronavirus disease 2019 (COVID-19) by SARS-CoV-2 has led to uptodate 24.3 M cases and 0.8 M deaths. It is thus in urgent need to rationalize potential therapeutic targets against the progression of diseases. An effective, feasible way is to use the pre-existing ΔORF6 mutant of SARS-CoV as a surrogate for SARS-CoV-2, since both lack the moiety responsible for interferon antagonistic effects. By analyzing temporal profiles of upregulated genes in ΔORF6-infected Calu-3 cells, we prioritized 55 genes and 238 ligands to reposition currently available medications for COVID-19 therapy. Eight of them are already in clinical trials. Of particular importance is the emergence of immune checkpoint inhibitors targeting PD-L1 that have not been used in COVID-19 treatment. We also pinpointed 16 drug groups from the Anatomical Therapeutic Chemical classification system, with the potential to mitigate symptoms of SARS-CoV-2 infection and thus to be repositioned for COVID-19 therapy.

The COVID-19 pandemic, which has led to huge economic losses by the Great Lockdown, is an ongoing global threat caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1] . As of August 2020, ~24.3M cases with ~828k deaths have been reported across 188 countries and territories, leading to an ~3.4% mortality rate, 34 times higher than that of H1N1. Unfortunately, no drug or vaccine has been approved to treat human coronaviruses, and new interventions are likely to take months to years to develop. In this regard, rationalizing semiempirical or empirical strategies to protect currently hospitalized victims (~8.4M uptodate) e.g.) [2] . As such, one plausible way to rationalize empirical strategies in a timely manner is to use available SARS-CoV/MERS-CoV data and perform fast, effective in silico analyses, thereby identifying candidate remedies from currently available medications.

In addition to genomics, SARS-CoV-2 and SARS-CoV share a certain degree of similiarity. It has been shown that SARS-CoV-2 shares similar transmission mechanisms and pathogenic sequelae with SARS-CoV.

First, both viruses are group B coronaviruses utilizing the same receptor, angiotensin-converting enzyme 2 (ACE2), to bind to the host cell surface [3] . Second, the pathology of SARS-CoV-infected lung shows features of acute injury and fibrinous organizing pneumonia patterns [4] , which were also found in SARS-CoV-2 infected lung, including diffusive acute alveolar injury with edema, hyaline membrane formation, proteinaceous exudate, focal reactive hyperplasia, immune cell infiltration, and intra-alveolar fibroblastic plug formation [5] .

The consequence is the respiratory failure, the leading cause for the mortality, as no gas exchange can be performed in such foamy, fibroblastic alveolar space. Intervening in the disease progression requires an explicit molecular cellular understanding of the pathogenesis, which is currently limited. What has been noted is the involvement of highly inflammatory responses that produce enormous chemokines implicated in cytokine storm [6, 7] . Futhermore, a key difference between SARS-CoV and SARS-CoV-2 is a truncation of the open reading frame (ORF) 3b, ORF6, in SARS-CoV-2, which affects the viral antagonistic effect toward interferon response [8] . These lines of evidence suggest that it is plausible to use available SARS-CoV data as a surrogate for devising the empirtical strategy and that immune modulation might be involved in such strategy. Fortunately, a mutant of SARS-CoV with the truncation in ORF6 has been found and used to infect a human lung cell line, Calu-3, for empirical study. Here, we took advantage of these preexisting data and applied bioinformatic analysis on SARS-CoV-infected cells (as the reference) and SARS-CoV ΔORF6-infected cells (as the surrogate to mimic SARS-CoV-2 infection) to identify medications that can be repurposed for COVID-19 therapy.

J o u r n a l P r e -p r o o f Journal Pre-proof

Transcriptomes of Calu-3 cells infected by infectious clone-derived (ic)SARS CoV and icSARS ΔORF6 mutant (GSE33267) were queried using GEOquery. The downloaded data had already been normalized. Probe-level expression values (n=30772) were log 2 -transformed and were aggregated to become gene-level expression values (n=18113) by maximum value. Gene symbols were further checked and corrected using the HGNChelper package.

Differentially expressed tests were conducted on the time-series microarray data (ΔORF6-infected cells vs.

control cells or SARS-infected cells vs. control cells) based on the following post-treatment time-points: 0, 3, 7, 12, 24, 30, 36, 48, 54 , 60 and 72 hours. To ensure the reliability of identified DEGs, we conducted three different methods, and defined differential expressed genes (DEGs) as genes that showed consensus (both on significance and effect direction) across methods. The three methods used different ways of normalization (normexp method and quantile normalization for background correction and RMA for method 1 and 2; normexp method and quantile normalization for background correction, RMA and between-sample normalization for method 2), different ways of gene summarization (maximum value aggregation for methods 1 and 3; Tukey Biweight estimator for method 2), and different ways of identifying significant genes across groups, as following: Method (1): Limma method. By assuming smooth expression changes over time, we fitted a regression spline with effective degrees of freedom of five to capture the temporal trends, and then compared the differences in curves between groups (∆ORF6 vs. mock and wild-type vs. mock). Then for each gene, a linear model that implemented in limma [9] was fitted and an empirical Bayes moderated F-statistic was calculated for testing the change in expression level over the time points. Five coefficients corresponded to differences in two curves were retrieved from the fitted model and then averaged to produce a mean coefficient value. To control the false positive rate, raw p-values were adjusted using the Benjamini-Hochberg (B-H) J o u r n a l P r e -p r o o f Journal Pre-proof method. Significant genes (out of 18113 genes) of the method (1) were defined as genes with Benjamini-Hochberg (BH)-adjusted p-value <0.05. Up or downregulated DEGs were defined by the average coefficient of >0 or <0, respectively. Method (2): Rnits method. In this method, probe expression values were further undergone between sample normalization and collapsed into gene-level using a robust Tukey Biweight estimator, which performs penalization on outlier values. Next, significant genes were identified by testing whether each gene expression trajectory from the treatment group (∆ORF6 or wild-type) and mock (control) group are the realizations of the same underlying basis B-spline curve [10] . The significant genes (out of 18109 genes) of the method (2) were defined as genes that passed a BH-adjusted p-value threshold of 0.05. Up or downregulated significant genes were defined by ratio statistic of >0 or <0, respectively. Method (3): timecourse (tc) method. This method was implemented in timecourse package, which calculated a ̃ statistic for each gene using multivariate empirical Bayes approach. This method incorporated the correlation coefficient information across genes to diminish false-positivity and false-negativity. The significance of each gene was calculated by F-statistic that derived from the ̃ statistic. The significant genes (from total 18113 genes) of the method (3) were defined as genes that passed a Bonferroni-adjusted p-value threshold of 0.05. Since this method didn't output information regarding the up or downregulation of significant genes, we thus calculated the logFC as ( scale-free R 2 fit of 0.85. The pairwise gene-to-gene topological overlap (TO) matrix was then calculated. The pre-clusters were identified by applying a top-down dynamic tree cut method (module gene size >200 and deepSplit parameter 2), and pre-clusters were defined as branches of the hierarchical cluster tree. Pre-clusters with pairwise cluster eigengene (PC1) correlation values (r) >0.85 were further merged into clusters.

Enrichment test for DEGs across multiple time points in co-expressed clusters was conducted using two-tailed Fisher's exact test. The odds ratios (ORs) were calculated. We defined OR >1 and p-value <0.05 as significant overrepresentation (enrichment).

Depending on the initial data (either PWM information or target genes of a TF) retrieved from databases, two different strategies were adopted as follows: (1) sequence-based enrichment test. Given a set a gene with size N G and several candidate transcription factors (TFs) with known position weight matrix (PWM) or motif, we want to identify whether the TFs' binding motif was enriched in the promoter sequence of genes in the given gene set. To do this, we scanned the promoter region (2000 basepair (bp) upstream from the transcription start site (TSS)) for the N G genes. We retrieved TFs' human PWM information from MotifDb package. For each PWM, a threshold-free log-normal distribution was calculated by fitting 500bp chunks of 2000bp promoters from all human genes (hg19). The PWM enrichment test of each TF was performed using PWMEnrich package.

We further conducted Bonferroni correction to reduce false positivity, and (2) gene-based permutation test.

Given a set of gene with size N G and several candidate TFs with known target genes from database(s) (such as ChIP-seq results from the ENCODE), we conducted gene-based permutation test by counting the overlapped genes of TF-regulated genes to the given gene set and performed permutation test to assess the enrichment of overlapped count number as following: For each TF, we counted the overlapped target number (n) in the given gene set with size N. Next, we randomly selected N genes from the background gene set and then counted the J o u r n a l P r e -p r o o f Journal Pre-proof overlapped number of targets of this TFs. This process was repeated for 99,999 times to produce a null distribution n 1 , …, n 99999 of target number for the given TFs. We then calculated the single-tailed p-value as (no. of n, n 1 , …, n 99999 ≥ n)/(size of n, n 1 , … , n 99999 ).

Genes in co-expressed clusters were characterized using Gene Ontology (GO) biological processes (BPs) and REACTOME pathways using clusterProfiler [13] and ReactomePA [14] , respectively. Q-values were calculated for multiple testing corrections. The top 20 enriched terms were selected for visualization.

Targetable genes were defined as genes with it (protein) products that could be targeted by specific ligands (drugs) from the IUPHAR/BPS Guide to PHARMACOLOGY [15] . Further data sanitization steps including (i) limited only drugs with target species of human; (ii) ligands that annotated as non-endogenous agents; (iii) ligands with clear and explicit "action" and "type" annotations; and (iii) ligands with explicit target gene (symbol). Finally, the drug-gene pairs were collapsed across time points and further visualized using circlize, VennDiagram, RColorBrewer, and igraph packages.

The Anatomical Therapeutic Chemical (ATC) classification of each ligand was annotated from the DrugBank database [16] . Given the size of N G for the final targetable genes, we randomly selected N G genes from the targetable gene list and mapped the ligands that may target the genes. Next, we summarized the number of drugs that belong to a specific ATC group. We repeated this process for n=99,999 times and calculated the probability of obtaining a number of drugs that equal to or greater than original data in the particular ATC group. We further computing B-H-adjusted p-values for multiple testing corrections and selected a B-Hadjusted p-value of 0.1 as a threshold of significance.

J o u r n a l P r e -p r o o f

In this study, we used R (http://www.r-project.org/ and http://cran.r-project.org/) or Bioconductor (http://www.bioconductor.org/) to perform all statistical association tests, enrichment analyses, and visualization tasks.

In contrast to SARS-CoV, SARS-CoV-2 shows ablated antagonism to interferon (IFN) due to an amino acid truncation in the viral ORF6 gene [17] . We therefore leveraged a previously published dataset (GSE33267) [18] that contains time-dependent transcriptomic profiling of infectious clone-derived (ic)SARS-CoV ΔORF6 (henceforth called "ΔORF6") in human Calu-3 cells. Since ΔORF6 is a mutant form of SARS-CoV that cannot express ORF6, it is reasonable to use ΔORF6 as a surrogate for SARS-CoV-2 (which also contains truncated ORF6) to reveal its pathophysiology and explore drug repositioning (Fig. 1) .

To identify genes that respond to ΔORF6 infection in Calu-3 cells (ΔORF6 infection-related genes), we analyzed the differential expression of genes by comparing ΔORF6-infected cells (n=39; 3 replicates for 11 different time points including 0, 3, 7, 12, 24, 30, 36, 48, 54, 60 and 72 hours) to control cells (n=39; 3 replicates for 11 different time points) by applying three complementary methods (i.e. method (1), (2) and (3) as described in "Materials and Methods" section) that used different strategies to identify differentially expressed genes (DEGs) of time-series expression data. In parallel, we conducted similar tests on wild-type icSARS-CoVinfected Calu-3 cells (henceforth called "wild-type" or "SARS") for comparison (Fig. 2a) . We revealed a wide range of percentages of up or downregulated DEGs (0% to 77.74%), suggesting different sensitivity and specificity across methods.

For significant genes detected by each method, we defined DEGs as genes identified by all three methods and with the same effect of direction (upregulated and downregulated) across methods (Fig. 2b) .

Journal Pre-proof Therefore, we detected no down-regulated DEGs in this study. Especially, we detected 3373 and 848 upregulated DEGs in ΔORF6-infected cells and SARS-infected cells, respectively. However, none of the ΔORF6 group or SARS group shows consensus results regarding downregulated genes.

Significance values of genes were further transformed into 0~1, and the outcome similarity of three methods was quantified using Pearson's product-moment correlation coefficients (r P ). We observed moderateto-high correlations (r P =0.7 and 0.69 in ∆ORF6-infected cells and SARS-infected cells, respectively; Figure S1 ) between the results of methods (1) and (3); while the results from the method (2) deviated from that of methods (1) and (3) (r P ranged from 0.25~0.60; Fig. 2c and Figure S2 ).

To investigate the effect of ∆ORF6 or SARS infection on cytokinome profiles, we assessed the significance of 121 cytokines/chemokines (ligands) and 144 corresponding receptor genes (in total 265 cytokines/chemokines-related genes, a.k.a. cytokinome genes; Figure S3 ) from the results of the differentially expressed test across multiple time points. We quantified the number (and percentage) of ligands or receptors.

The p-values were calculated by using a one-tailed binomial test (with a "greater" alternative hypothesis and confidence level of 95%) by comparing it to the background gene list (excluding cytokinome genes). False positivity was controlled by using Bonferroni correction. As a result, we found that the numbers of ligand genes (cytokines/chemokines) were significantly higher than the background genes ( Figure S4 ). However, we didn't identify any significance in receptor genes. These data suggested that ∆ORF6 or SARS infection has a great effect on the cytokine (and chemokine) levels.

To depict the co-expression dynamics of ΔORF6 infection-related genes in human lung cells after viral infection, we constructed time-dependent co-expression networks from the GSE33267. Post-infection transcriptomic profiles of ΔORF6-infected Calu-3 cells (from 0 to 72 hours, across 11 time-points, each with 3 replicates) were found to be partitioned into 6 clusters based on pairwise correlations between gene expression values; the number of genes in each cluster ranged from 232 to 4517 (Fig. 3a) . In the following steps, we J o u r n a l P r e -p r o o f Journal Pre-proof categorized the significantly upregulated DEGs in ΔORF6-infected cells into two groups: (i) specifically in ΔORF6-infected cells (denoted by "ΔORF6-specific" or "specific"; which containing 2613 genes) and shared in both ΔORF6and wild-type-infected cells (denoted by "shared"; which containing 760 genes).

We posited that additional mapping of ΔORF6 infection-related DEGs in an identified cluster should be observed if the particular cluster reflects a pathophysiological process in ΔORF6-infected Calu-3 cells. To determine the convergence of identified co-expressed clusters of ΔORF6 infection-related genes, we conducted enrichment tests for ΔORF6-specific and shared DEGs using an over-representation analysis (ORA). Among up-regulated DEGs (both ΔORF6-specific and shared genes) in response to ΔORF6 infection, significant enrichment was detected for clusters 1 (Fig. 3b) . We next depicted the temporal trajectory of cluster 1 by calculating the first principal component (PC1; i.e., cluster eigengene) and assessing the value of eigengene across time. Eigengene can be viewed as a summarized value of gene expression in the specific cluster. In particular, genes in cluster 1 were up-regulated and sustained at approximately 30 to 72 hours upon ΔORF6 infection (Fig. 3c) . In details, the eigengene of cluster 1 from ΔORF6-infected cells gradually increased from 0 to 54 hours post-infection, and then slightly dropped from 60 to 72 hours.

To characterize the key factor(s) in up-regulated DEGs, we hypothesized that if there exists a key factor that causes the up-regulation and sustainability of genes from approximately 30 to 72 hours after ∆ORF6 infection, the key factor gene(s) should be also continuously expressed, and exert its/their regulatory functions. Since key factors should be able to regulate-turn on and off-genes' expression (upregulation), we thence focused on transcription factor(s) (TFs). Therefore, we derived a sequence-based enrichment test (please see "Materials and Methods" section) to identify TFs that may regulate the expression of ∆ORF6-specific or shared (with wildtype/SARS) DEGs. As a result, we identified 57 and 18 TFs that passed the significance threshold in ∆ORF6specific and shared groups, respectively ( Figure S5) . To validate our findings in ∆ORF6-specific up-regulated DEGs, we leveraged the chromatin immunoprecipitation sequencing (ChIP-seq) data from the Encyclopedia of J o u r n a l P r e -p r o o f DNA Elements (ENCODE) database. Of the 10 available TFs data in ENCODE, only the FOSL1 was overlapped with the 57 identified TFs. We, therefore, adopted the FOSL1 ChIP-seq results for validation. Given the information on regulatory interactions regarding FOSL1, we conducted a gene-based permutation test (please see "Materials and Methods" section) to assess the enrichment of (∆ORF6-specific) up-regulated DEGs in FOSL1 targets. As a result, we successfully replicated the significance of enrichment regarding FOSL1 using ENCODE ChIP-seq evidence (p-value <0.00001; Figure S6 ).

We next sought to determine whether correlated genes in cluster 1 share biological functions or regulatory mechanisms. The top 20 most significantly enriched Reactome pathways and Gene Ontology (GO) biological process (BP) annotation terms suggested that the genes in cluster 1 were implicated in cellular defensive response to the virus, which includes interferon (especially, type I) signaling pathways (Fig. 3d) .

Based on the fact that SARS-CoV-2 possesses a truncated ORF6 gene, which is critical for viral sensitivity to type I IFN and may influence host immune response [18] , we utilized ΔORF6 as a model for SARS-CoV-2 to compare specific and/or shared biological functions of DEGs in human lung cancer cell line infected with wildtype or ΔORF6 SARS-CoV. Specific and shared genes induced in ΔORF6-infected cells were of interest for our analysis, as both groups may provide clues for drug repositioning. By restricting our study to upregulated genes responsive to ΔORF6 infection, we were able to pinpoint candidate drugs that may inhibit the gene products.

DEGs upregulated at different time-points were classified into two cluster categories (i.e., "Cluster 1" and "Other"). Together, DEGs that were specifically induced by ΔORF6 (denoted as "Specific") or up-regulated in both ΔORF6 and SARS treatment (denoted as "Shared") were subjected to drug repositioning analysis using the drug (ligand)-gene pairing information from the IUPHAR PHARMACOLOGICAL database. In total, 2396 drug-gene pairs (305 unique targetable genes and 1931 unique ligands) were identified, with 1365 (56.97%) pairs from the "Specific" group and 1031 (43.03%) from the "Shared" group (Fig. 4a) . Among them, 1811 (75.58%) pairs were from "Cluster 1" and the remaining 585 (24.42%) pairs were from "Other". Therefore, we J o u r n a l P r e -p r o o f Journal Pre-proof identified a substantial number of candidate drug-gene pairs in "Cluster 1" and "Other" groups. The identified 3373 up-regulated genes were considered "targetable" or "druggable" and may be potentially useful to guide repositioning of drugs for the treatment of SARS-CoV-2 (Fig. 4b) .

From the gene perspective, for the initial 3373 up-regulated DEGs, we consequently identified 305 unique targetable (druggable) genes by intersecting the DEGs with druggable genes from the database, with 200 (65.57%) genes were "Specific" and 105 (34.43%) genes were "Shared" (Fig. 4c) . Moreover, 236 (77.38%) genes were from "Cluster 1" and 69 (22.62%) genes were from "Other". From the ligand perspective, we ascertained in total 1931 unique candidate ligands (Fig. 4d) , with some duplicated ligands been identified in different categories due to multiple target genes (Fig. 4e) . We thus investigated the number of multiple gene targets of each ligand and found that most of the ligands (1557 out of 1931, 80.63%) may target unique upregulated DEGs (Fig. 4f) [19] , we sought to verify our primary findings by examining the candidate drugs identified and discussed in the study. Among previously identified drug candidates, our analysis successfully identified baricitinib (targeting JAK1 and JAK2), sarilumab (targeting IL6R) and ritonavir (targeting CYP3A4) as potential drugs for the treatment of SARS-CoV-2 infection (Fig. 5) . Since the upstream (IL6R) and downstream (JAK1/2) genes in IL6 signaling pathway were up-regulated, we thus checked the significance of IL6 gene. We found that IL6 has successfully been identified as up-regulated DEGs (i.e., passed the significant threshold of three methods, Figure S7 ). After prioritization, we have identified two drugs (siltuximab and sirukumab) as candidate ligands for the treatment COVID-19. Furthermore, we also detected dexamethasone (targeting NC3R1 and NC3R2) as a candidate ligands for SARS-CoV-2, which was proved to reduce mortality in COVID-19 patients who received respiratory support [20] . These results suggested that our pipeline could be empirically supported by current therapeutic options.

We reasoned that our transcriptomic-driven drug discovery approach may provide a novel perspective regarding the pharmacological mechanisms of drugs that could be considered for repositioning to SARS-CoV-2 J o u r n a l P r e -p r o o f infection. To assess the pharmacological actions of the candidate ligands, we next annotated the drugs with their pharmacological and chemical properties using the Anatomical Therapeutic Chemical (ATC) classification code.

For the 345 (17.87%) ligands with available ATC code, which may target 92 druggable genes. We thus conducted a permutation test to assess the enrichment of specific ATC groups for the 92 unique targetable genes and revealed significant enrichment in 16 out of 57 (28.07%) ATC groups (FDR <0.1; Fig. 6a) . Especially, the top 3 most significantly enriched ATC groups with FDR<0.05 were "ophthalmologicals" (FDR = 0.00057), "immunosuppressants" (FDR = 0.00086), and "drugs for obstructive airway diseases" (FDR = 0.0268).

Since we identified "immunosuppressants" as an enriched ATC term, implying the roles of immunerelated functions for drug repositioning to COVID-19. Therefore, it is plausible to investigate whether other immune-relevant molecules could be exploited for drug repositioning in SARS-CoV-2 infection, which exemplified by immune checkpoint markers, such as PDCD1 and HAVCR2. However, in this study, differentially expressed tests and following drug repositioning analyses were derived from virus-infected human Calu-3 cells. Since T cells data was not being included in our research, we could not analyze the differential expression pattern of PDCD1 and HAVCR2, which mainly be expressed on T cells. We thus investigated the DEG results of CD274 and PDCD1LG2 genes due to the following reasons: (1) CD274 and PDCD1LG2 were druggable by immune checkpoint inhibitors (ICIs) and (2) CD274 and PDCD1LG2 could bind to PDCD1 gene product to trigger T-cell exhaustion. We found a significant up-regulation of CD274 (but not for PDCD1LG2) as DEG, and thus identified atezolizumab, avelumab, and durvalumab (targeting CD274) as candidate ligands.

Given the substantial number of identified drugs with the potential to be repositioned for treatment of SARS-CoV-2 infection, we utilized several drug-based (a.k.a. ligand-based) and gene-based criteria to prioritize our candidate drug-gene pairs. For six identified clusters, we first restricted the results to "Cluster 1" genes based on following reasons ( Figure S8) We then leveraged a network (cluster) property, so-called within-cluster connectivity (WCC; i.e. the sum of weights connecting to the gene) to perform gene-based prioritization. We compared the distributions of WCC scores between druggable genes and the remaining genes and found that WCC scores of druggable genes were significantly lower than that of the remaining genes in Cluster 1 (asymptotic general independence test Z value = -2.85; p-value =0.0043; Fig. 6b ). This result indicated that genes with lower connectivity (non-hub) tend to be developed as a pharmacotherapeutic target. We therefore selected the genes with a WCC score of ≤ 2177.606, which was the WCC score of NR3C1 (targeted by dexamethasone), and resulting in 188 druggable genes (Fig. 6c) . These 188 genes corresponded to 1165 candidate ligands. Notably, JAK2 and IL6R were filtered in this step due to higher WCC scores.

Additionally, in consideration of some candidate ligands from IUPHAR PHARMACOLOGY database were under investigation and thus may not be suitable for drug repositioning, we performed ligand-based prioritization by limiting our candidate drugs to those with ATC annotation. Finally, 55 druggable genes and 238 candidate ligands were discovered for drug repositioning to SARS-CoV-2 infection (Figure S9) .

To validate our findings, we next asked whether some of the candidate drugs for repositioning to SARS-CoV-2 infection identified from our study are in ongoing clinical trials for COVID-19. Indeed, our analysis identified seven drugs (including ritonavir, baricitinib, tofacitinib, naproxen, budesonide, ciclesonide and formoterol) in ongoing clinical trials. These seven drugs target 7 genes found in our study, including CYP3A4, J o u r n a l P r e -p r o o f JAK1, PTGS1, PTGS2, NR3C1, NR3C2 and ADRB2. It is noteworthy that dexamethasone was also been identified as candidate ligands, which targets PTGS1 and PTGS2 genes (Fig. 7) .

With the pandemic threat of COVID-19 as it currently stands and the lack of time to develop new medical intervention for over millions of hospitalized victims, devising an empirical yet rationably effective therapeutic strategy is in urgent need. Unfortunately, no preclinical omic-level data from SARS-CoV-2-infected human tissues, especially for lung transcriptomics, are currently available, posing substantial challenges for drug development or repositioning. A feasible, appealing approach is to use pre-existing data that closely resembled SARS-CoV-2-infection for bioinformatics analyses, thereby identifying candidates of currently available medications to be repositioned for SARS-CoV-2 therapy. In this study, we performed such task by analyzing the gene expression profile of a human lung cell line, Calu-3, infected with SARS-CoV ΔORF6, a mutant closely resembling SARS-CoV-2. The rationale for using the mutant as a surrogate is justified by the fact that both the mutant and SARS-CoV-2 lack the antagonistic effect toward interferon response due to a truncation in ORF6. Our method of transcriptomics-driven drug discovery ultimately provided a candidate list of marketed drugs (approved for other indications) to be used in empirical clinical therapy for the urgent need of SARS-CoV-2 infection.

Specifically, we prioritized 55 targetable genes that respond to ΔORF6 infection. These genes can be modulated by 238 candidate drugs that are currently available and hence can be repositioned for COVID -19 theraphy. The reliability of our analyses were justified by the following facts. First, we predicted several drugs including ritonavir, baricitinib, tofacitinib, naproxen, budesonide, ciclesonide and formoterol that are currently under clinical trials. Second, despite of an intensive discussion on the empirical use of several drugs including hydroxychloroquine/chloroquine, remdesivir, lopinavir and ritonavir, interleukin (IL)-1 and -6 inhibitors, and Janus kinase (JAK) inhibitors, no consistent or conclusive evidence for the safety and efficacy is available, except for dexamethasone, which remarkably was on our list of candidates. As such, the analyses we provided J o u r n a l P r e -p r o o f Journal Pre-proof support the idea that analyzing transcriptomic data from virus-infected cells can yield valuable information for potential treatments, which can further be leveraged to identify potential candidates for drug repositioning.

Without any preclinical data or clinical experiments, current therapeutic options on COVID-19 infection are lacking in solid evidence and empirically limited to antiviral agents and host modifiers. In light of recent studies and reviews on potential drugs for COVID-19 therapy [19, [21] [22] [23] [24] , our list of candidates contains unexpected drugs. These candidate drug categories can expand the scope of COVID-19 drug targets, open new avenues for future study, and include agents with solid track records in clinical treatment of infectious disease for empirical tests. For instance, we identified several repositioned candidates that belong to other drug categories (e.g., ophthalmological agents, immunosuppressive agents, drugs for obstructive airway diseases, etc.), which have proven useful in treating other viral infections. Also identified were drugs for obstructive airway diseases (e.g. formoterol and ciclesonide) as candidate for repositioning to COVID-19. Notably, a previous in vitro study has shown that combination use of formoterol with budesonide (both were identified as candidate ligands) and glycopyrronium can inhibit the viral replication and cytokine production of HCoV-229E in human respiratory epithelial cells [25, 26] . Besides, several studies have suggested that ciclesonide may block the replication of SARS-CoV-2 ribonucleic acid and thus reduce the virus-induced cytopathogenic effect in COVID-19 patients [27, 28] . Thus, it appears reasonable to reposition and include drugs from other categories for COVID-19 treatment.

One intriguing feature in our prediction was the emergence of drugs for immune modulation. Yet several lines of molecular clinical evidence suggest that taking into account immune modulation might be necessary for COVID-19 therapy. First, the pathology of the SARS-CoV-2 infected lungs showed mixed features of acute alveolar injury and diffusive immune responses including immune infiltration, focal reactive hyperplasia, and intra-alveolar fibroblastic plug formation [5] . Second, the clinical severity of COVID-19 infection is linked to highly inflammatory responses that produce enormous chemokines implicated in a cytokine storm [6, 7] . Third, a key difference between SARS-CoV and SARS-CoV-2 is a truncation of the open reading frame (ORF) 3b, ORF6, in SARS-CoV-2, which affects the viral antagonistic effect toward interferon response [8] . On our list, J o u r n a l P r e -p r o o f we found a significant enrichment in the category of "anti-inflammatory and antirheumatic products", which may alleviate immune-inflammatory injury that is known to cause increased oxidative stress and organ failure in virus-infected patients [29, 30] . Also identified was β 2 -adrenergic signaling (elicited by some drugs for obstructive airway diseases such as formoterol, salbutamol, etc.) to be critical in norepinephrine-mediated processes and in the seretion of TNFα and IL-10, which may relieve tissue-damaging inflammation (and thus exerts a protective effect to patients) during viral infection [31] .

Most striking was CD274 (PD-L1) identified as druggable gene in this study, suggesting the plausibility of repurposing ICIs (immune checkpoint inhibitors) for COVID-19 therapy. These drugs include atezolizumab, avelumab and durvalumab. The idea of implimenting ICIs for COVID-19 therapy is supported by a previous study showing that T-cell exhaustion markers (PDCD1 and HAVCR2) were overexpressed in severe cases of COVID-19 patients [32] . Two recent commentary articles also addressed the possibility of using ICIs to reinvigorate exhuausted T-cell in the early phase of COVID-19 infection, yet the discussion was mainly focused on cancer patients receiving ICI treatment [33, 34] . Regarding the issue of whether ICIs could be prioritized for the treatment of COVID-19, we note that drugs targeting immune checkpoints may lead to immune checkpoint inhibitor-related pneumonitis (ICI-P) at a relatively low occurrence rate (4-13%) [35] [36] [37] [38] . Given that immunostimulants such as ICIs may exacerbate the clinical symptoms (especially pneumonitis) of SARS-CoV-2 infection, we suggest that repositioning of these drugs to COVID-19 should be evaluated with great caution.

Moreover, it may be useful to consider combining ICIs with other drugs that mitigate unwanted ICI-induced inflammation, such as IL6 and JAK2 inhibitors [39] [40] [41] . Consistently, the potential of targeting IL6 signaling pathway for COVID-19 therapy has already been considered [19, 42] . Of note, our analyses pinpointed several ligands targeting IL6, IL6R, JAK1 and JAK2, suggesting the plausibility of using ICIs with IL6 signaling pathway blockers in a combination therapy.

For COVID-19, concerns have been raised on enhancing SARS-CoV-2 virulence by treatment of widely used drugs, such as angiotensin-converting enzyme 2 (ACE2) inhibitors and angiotensin-receptor blockers (ARBs) [43] [44] [45] [46] . Consistent with this concern, we did not detect significant enrichment for the category "agents J o u r n a l P r e -p r o o f Journal Pre-proof acting on the renin-angiotensin system". To maximize the accuracy of our drug repositioning predictions, we prioritized the initially identified 2396 drug-gene pairs into a list of 296 (12.35%) drug-gene pairs. The final list was restricted to (i) drugs with gene target(s) in "Cluster 1", (ii) drugs with target gene(s) with lower WCC compared to dexamethasone, and (iii) drugs with an available ATC code. Excluding candidate drug-gene pairs that lack ATC annotations, we identified high-priority drug-gene pairs with likely disease relevance. Note that our analysis may be undermined by a lack of data from preclinical cases of SARS-CoV-2 infection and a lack of clinical specimens from COVID-19 patients. It is possible that the DEGs identified from the surrogate system may not fully reflect genes that respond to SARS-CoV-2 and promote disease progression in COVID-19

infection. Another potential caveat is that we did not take into account physical binding affinities in drug-gene interactions. Moreover, since our analyses were based on data from infected cells, we might not be able to identify candidate drugs that act on viral proteins or human gene products involved in the process of virus entry.

For example, ACE2 and TMPRSS2 were identified as the entry factors for SARS-CoV-2 [47] . Among those genes we analyzed, TMPRSS2 was identified as a differentially-expressed gene, and TMPRSS2-upregulating variants were more frequently found in European and American populations. However, we failed to detect the corresponding candidate ligand targeting TMPRSS2. Neither was ACE2 identified as a DEG in our study. Based on the ATC enrichment test, we indeed failed to identify enrichment of "Agents acting on the renin-angiotensin system" in the candidate drug list. Thus, it is likely that the renin-angiotensin system may play an important role in the early stage of infection but not associate with the ongoing pathogenesis after viral infection. Although our initial attempt was to predict drugs for empirical clinical use, we should point out that the clinical feasibility of using candidate drugs for repositioning was not accessed in our study. As such, comorbidities and potential drug-drug interactions arising from polypharmacy in COVID-19 patients should be carefully considered and investigated before repositioning specific drugs and regimens for COVID-19 therapy. Likewise, preclinical experiments are needed to demonstrate the potential efficacy of identified drugs, and preclinical evidence from cellular experiments should be gathered before moving to clinical application.

In summary, our results reveal potential candidates for drug repositioning to SARS-CoV-2 therapy. The candidates we found include drugs from categories other than "antivirals for systemic use" (based on ATC code), suggesting that drugs with various indications should be considered in future studies. Given the scarcity of omics-level data from SARS-CoV-2 infected lung tissue, our surrogate transcriptomic-driven drug discovery approach has prioritized the list of potential candidate drugs to be experimentally examined in preclinical studies, to be considered for empirical clinical use when it is needed, and to open novel therapeutic avenues for the treatment of COVID-19.

The authors declare that they have no known conflicts of interest associated with this work. 

Coronavirus Infections-More Than Just the Common Cold

Host Factors in Coronavirus Replication

Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses

Pathology and pathogenesis of severe acute respiratory syndrome. The American journal of pathology

Pathological findings of COVID-19 associated with acute respiratory distress syndrome

Molecular immune pathogenesis and diagnosis of COVID-19

The landscape of lung bronchoalveolar immune cells in COVID-19 revealed by single-cell RNA sequencing

SARS-CoV-2 sensitive to type I interferon pretreatment

limma powers differential expression analyses for RNA-sequencing and microarray studies

Significance analysis of time course microarray experiments

GENCODE: the reference human genome annotation for The ENCODE Project

A general framework for weighted gene co-expression network analysis

clusterProfiler: an R package for comparing biological themes among gene clusters

ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization

The IUPHAR/BPS Guide to PHARMACOLOGY in 2018: updates and expansion to encompass the new guide to IMMUNOPHARMACOLOGY

DrugBank 5.0: a major update to the DrugBank database

SARS-CoV-2 sensitive to type I interferon pretreatment

Release of severe acute respiratory syndrome coronavirus nuclear import block enhances host transcription in human lung cells

Coronavirus Disease 2019 Treatment: A Review of Early and Emerging Options. Open Forum Infect Dis 2020

Effect of Dexamethasone in Hospitalized Patients with COVID-19: Preliminary Report

Antiviral treatment of COVID-19

Drug targets for corona virus: A systematic review

Some drugs for COVID-19

Inhaled corticosteroids and COVID-19: a systematic review and clinical perspective

Inhibitory effects of glycopyrronium, formoterol, and budesonide on coronavirus HCoV-229E replication and cytokine production by primary cultures of human nasal and tracheal epithelial cells

The inhaled corticosteroid ciclesonide blocks coronavirus RNA replication by targeting viral NSP15

Identification of Antiviral Drug Candidates against SARS-CoV-2 from FDA-Approved Drugs

Bird Flu"), inflammation and anti-inflammatory/analgesic drugs

The development of anti-inflammatory drugs for infectious diseases

The β2-adrenergic receptor controls inflammation by driving rapid IL-10 secretion

Reduction and Functional Exhaustion of T Cells in Patients with Coronavirus Disease 2019 (COVID-19)

Immune checkpoint blockade: releasing the breaks or a protective barrier to COVID-19 severe acute respiratory syndrome?

Immune Checkpoint Inhibitors for Cancer Therapy in the COVID-19 Era

Immune checkpoint inhibitors in non-small cell lung cancer -towards daily practice

Immune Checkpoint Inhibitor Therapy-related Pneumonitis: Patterns and Management

Clinical and Histopathologic Features of Immune Checkpoint Inhibitor-related Pneumonitis

Characteristics, incidence, and risk factors of immune checkpoint inhibitor-related pneumonitis in patients with non-small cell lung cancer

COVID-19 and immune checkpoint inhibitors: initial considerations

Immune checkpoint inhibitors: a physiology-driven approach to the treatment of coronavirus disease 2019

Tocilizumab for the management of immune mediated adverse events secondary to PD-1 blockade

IL-6 Inhibitors in the Treatment of Serious COVID-19: A Promising Therapy

Preventing a covid-19 pandemic

Can angiotensin receptor-blocking drugs perhaps be harmful in the COVID-19 pandemic?

Hypothesis: angiotensin-converting enzyme inhibitors and angiotensin receptor blockers may increase the risk of severe COVID-19

Renin-Angiotensin-Aldosterone System Inhibitors in Patients with Covid-19

SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor

is displayed by using the Fruchterman-Reingold (FR) force-directed algorithm. The drugs (and corresponding genes) that belonged to immune checkpoint inhibitors (ICIs), that had undergoing clinical trials and that have been discussed in the review article of EK McCreary et al. [19] and it's targeting gene(s) are illustrated.

The authors declare that they have no known conflicts of interest associated with this work.