key: cord-0019736-cx73pkid
authors: Li, Albert; Huang, Hsuan-Ting; Huang, Hsuan-Cheng; Juan, Hsueh-Fen
title: LncTx: A network-based method to repurpose drugs acting on the survival-related lncRNAs in lung cancer
date: 2021-07-10
journal: Comput Struct Biotechnol J
DOI: 10.1016/j.csbj.2021.07.007
sha: 9d43577101be90a3ce6b6b9c388f9b18525d39a4
doc_id: 19736
cord_uid: cx73pkid

Although an increasing number of survival-related lncRNAs have been found in cancer, few drugs that target lncRNAs are approved for treatment. Here, we developed a network-based algorithm, LncTx, to repurpose medications that potentially act on survival-related lncRNAs in lung cancer. We used eight survival-related lncRNAs derived from our previous study to test the efficacy of this method. LncTx calculates the shortest path length (proximity) between the drug targets and the lncRNA-correlated proteins in the protein–protein interaction network (interactome). LncTx contains seven different proximity measures, which are calculated in the unweighted or weighted interactome. First, to test the performance of LncTx in predicting the correct indication of drugs, we benchmarked the proximity measures on how accurately they differentiate anticancer drugs from non-anticancer drugs. The closest proximity weighted by clustering coefficient (closestCC) has the best performance (AUC around 0.8) compared to other proximity measures across all survival-related lncRNAs. The majority of the other six proximity measures also perform well, with AUC greater than 0.7. Second, to evaluate whether LncTx can repurpose drugs that effectively act on the lncRNAs, we grouped the drugs according to their proximities by hierarchical clustering. The drugs with smaller proximity (proximal drugs) were found to be more effective than the drugs with larger proximity (distal drugs). In conclusion, LncTx enables us to accurately identify anticancer drugs and can potentially serve as an index to repurpose effective agents acting on survival-related lncRNAs in lung cancer.

Non-small cell lung cancer (NSCLC) is a heterogeneous disease with different molecular drivers and genetic aberrations [1, 2]. Surveying the druggable targets in each patient can provide better care and prolong survival time [3] [4] [5]. For instance, patients with an EGFR mutation, one of the most common genetic aberrations in lung adenocarcinoma [6], can benefit from gefitinib, an EGFR tyrosine kinase inhibitor (TKI), and have a favorable treatment response compared to those without the drug target [3, 7, 8, 9]. Other examples include alectinib and crizotinib in ALK-positive NSCLC [10] and atezolizumab for the first-line treatment of PD-L1-selected NSCLC patients [11]. However, only certain patient populations can benefit from precision therapy. The discovery of more therapeutic targets is thus necessary. Recent years have seen an explosion of studies on non-coding RNAs [12]. Whether non-coding RNAs can become therapeutic targets is worth further investigation.

Long non-coding RNAs (lncRNAs) are expected to be promising drug targets because of their tissue specificity, rapid turnover and low expression abundance [13, 14]. These features may be associated with lower doses of medication and fewer adverse effects in other organs [14]. Antisense oligonucleotides (ASOs) have been used to target lncRNAs for years [15]. However, since an ASO is a large and highly charged molecule, its poor drug-likeness makes the delivery of ASOs into the human body a challenging task [16]. Few, if any, ASO-based therapies have been approved for treating lung cancer patients. In lncRNAs, a sophisticated pocket structure suitable for binding drug-like molecules can be found [16]; such pockets enable small molecules to target special domains of the lncRNAs [17]. However, because of the dynamic structure of lncRNAs, effective small molecules targeting lncRNAs are still limited [18].

Although the roles of most lncRNAs in lung cancer have not been fully elucidated, numerous published studies have revealed crucial lncRNAs that are clearly associated with the pathogenesis and prognosis of lung cancer, such as MALAT1 [19] [20] [21] and HOTAIR [22] [23] [24]. In one of our recent works [25], we constructed lncRNA association networks to investigate functionally similar and coregulated lncRNAs in lung cancer. Some lncRNA modules within the lncRNA association networks were correlated with the overall survival of the patients. In addition, we showed that the modular signature could be used as a novel prognostic biomarker in lung cancer. In terms of biological functions, we found that the survival-related lncRNA modules were significantly associated with cancer hallmarks and pathways. However, it was uncertain whether these modules can become therapeutic targets as well.

Although it is undeniable that few drugs can directly target lncRNAs [18], the rationale of this study is based on the hypothesis that drugs can indirectly act on a lncRNA by influencing proteins that are highly correlated with that lncRNA.

With the availability of public pharmacogenomic data [26] [27] [28], predictive modeling techniques, particularly machine learning [29] [30] [31] and deep learning [32] [33] [34], have been widely applied to the prediction of drug response. Combining in vitro datasets and patients' samples is even more powerful in analyzing pharmacogenomic data [35]. Network science is another pivotal discipline dealing with complex biological systems [36] [37] [38] [39]. The application of network biology in pharmacology brings new insights into early drug discovery, optimal drug combination regimens [40] and drug repurposing [41, 42]. The advantages of network pharmacology include unraveling disease mechanisms through topology-based pathways and prioritizing candidate targets according to their network effects [43] [44] [45] [46] [47]. Network proximity was shown to be an effective measure for predicting drug efficacy, particularly in Parkinson's disease and several inflammatory disorders [41]. Protein targets of effective drugs tend to localize close to the disease-related proteins in the interactome [41]. Novel drug-disease associations can be predicted from the proximity as well [41].

The network-based drug repurposing strategy is expected to identify potentially effective drugs acting on cancer-related lncRNAs. Meanwhile, it is also unclear whether properties of the network (e.g., node degree) are important factors in predicting the effectiveness of a drug. Therefore, we developed a network-based method, LncTx, which measures the proximity between the drug targets and lncRNA-correlated proteins in the interactome. We accentuated the effects of the network properties by weighting the interactome with different network parameters. Eight survival-related lncRNAs discovered in our previous work [25] were selected as the therapeutic targets. LncTx was used to repurpose the drugs that are potentially effective at targeting these lncRNAs.

The interactome was derived from the open-access data provided by Guney et al [41] who collected the experimentally validated protein-protein interactions from various databases [48] [49] [50] [51] [52] [53] [54] [55] .

Two drug lists were used in this study. One was adapted from the public data created by Guney et al [41], which included 237 drugs. Since both anticancer and non-anticancer drugs were included in this list, it was used to assess the performance of predicting the indication of anticancer drugs. The other drug list was derived from Genomics of Drug Sensitivity in Cancer (GDSC) [27, 56], which consists of two datasets (GDSC1 and GDSC2). All drugs within this list are anticancer drugs, either clinically approved or under pre-clinical assessment. Because improved screening techniques and procedures were used in GDSC2, we removed the duplicates present in GDSC1, combined the two datasets, and selected the drug response of the non-small cell lung cancer cell lines. This list contains 429 anticancer drugs in total, together with their protein targets. We referred to DRUGBANK [57] for the protein targets of chemotherapy agents because such information was not provided in GDSC.

The basal gene expression of the cancer cell lines was derived from the Cancer Cell Line Encyclopedia (CCLE) [26]. Reads per kilobase million (RPKM) values of the cancer cell lines were retrieved. The preprocessing was detailed in our previous work [58]. In brief, genes with empty expression values in more than 20% of the samples were removed. The remaining empty expression values were imputed with the minimal expression value in the dataset. RPKM was then log2-transformed. Finally, 188 lung cancer cell lines were selected for subsequent analysis.
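The preprocessing steps above can be illustrated with a minimal Python sketch. It is not the pipeline of [58]: the function name `preprocess_expression`, the use of `None` to mark an empty value, and the pseudocount of 1 added before the log2 transform are all assumptions made here for illustration.

```python
import math

def preprocess_expression(expr, max_missing_frac=0.2, pseudocount=1.0):
    """expr maps gene -> list of RPKM values (None marks an empty value).

    Returns gene -> list of log2-transformed values after filtering and imputation.
    """
    # 1) Drop genes whose values are empty in more than 20% of the samples.
    kept = {
        gene: values
        for gene, values in expr.items()
        if sum(v is None for v in values) / len(values) <= max_missing_frac
    }
    # 2) Impute the remaining empty values with the minimum observed in the dataset.
    observed = [v for values in kept.values() for v in values if v is not None]
    floor = min(observed)
    # 3) log2-transform; the pseudocount is an assumption that keeps log2(0) finite.
    return {
        gene: [math.log2((floor if v is None else v) + pseudocount) for v in values]
        for gene, values in kept.items()
    }
```

Whether a pseudocount was used, and which value, is not stated in the text; setting `pseudocount=0.0` reproduces a plain log2 transform when no zero values are present.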

The analyses were implemented with R packages dplyr and ggplot2 [59] .

The clinical data of lung cancer patients were downloaded from the lung adenocarcinoma (LUAD) project in the Cancer Genome Atlas (TCGA-LUAD) [60]. The log-rank test was used to assess the effect of lncRNA expression on the survival probability in the subgroups (lncRNA = high vs. lncRNA = low) of TCGA-LUAD patients. The median of the lncRNA expression was used to define the patient subgroups. The univariate survival analysis was visualized with a Kaplan-Meier plot. The Cox proportional hazards (Cox PH) model [61] was used for multivariable analysis to adjust for factors that potentially influence survival, including age and cancer stage. We used a binary classification of cancer stage in which patients with stage I and II were assigned to early stage, and those with stage III and IV were assigned to late stage. The hazard ratio of the effect of high lncRNA expression on overall survival was calculated and visualized with a forest plot.

The biological functions of the lncRNAs were deduced from functional enrichment analysis. Under the principle of "guilt by association" [62], which is commonly used in inferring the functions of unknown genes [63], the top 200 lncRNA-correlated mRNAs were selected to predict the functions of the lncRNAs. The hypergeometric test was used for hypothesis testing. We implemented the analysis on a web-based platform, The Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.8 [64]. To account for multiple hypothesis testing, the Benjamini-Hochberg procedure was used to correct the p-values. The significance level of the adjusted p-value was set at 0.05.
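For readers who want to see the statistics behind this step, the following is a minimal sketch of a one-sided hypergeometric enrichment test followed by Benjamini-Hochberg adjustment. The analysis in this study was run on DAVID, whose exact scoring may differ; the function names and argument conventions below are assumptions for illustration only.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Here `n` would be the 200 lncRNA-correlated mRNAs, `k` the number of them carrying a given annotation, and terms whose adjusted p-value falls below 0.05 would be reported.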

The following network properties were used in the calculation of edge weights in the interactome: degree, betweenness centrality, and clustering coefficient. In addition, we integrated information from differential expression analysis (DEA) into the edge weights. DEA was conducted by comparing the gene expression of tumor and adjacent normal tissues in TCGA-LUAD. The analysis was implemented with limma [65] and edgeR [66]. The HTSeq counts were downloaded from TCGA-LUAD with TCGAbiolinks [67]. The fold change and p-value derived from DEA were used to calculate the edge weights. We used $-1 \times \log_{10}(p\text{-value})$ as the p-value (PVal) weight and $\log_2(\text{fold change})$ as the fold change (LogFC) weight. The edge weights ($w$) between node $i$ and node $j$ ($E_{ij(w)}$) were defined as follows:

where i and j are nodes in the interactome.

The proximity was defined as the shortest path length between the protein target(s) of a drug and the proteins of interest in the interactome. To deduce the proximity between the drug targets and the survival-related lncRNAs, we used the top 20 correlated proteins of each survival-related lncRNA. Specifically, the choice of negatively or positively correlated proteins is based on the association network from which the lncRNAs were derived [25]. Seven different proximity measures are defined below. Of note, the shortest proximity and the closest proximity were adapted from Guney et al [41].

Let $T = \{t \mid t \in \text{drug targets}\}$; $G = \{g \mid g \in \text{lncRNA-correlated proteins}\}$; $W = \{w \mid w \in \text{weighting methods (BC, CC, PValue, LogFC, Degree)}\}$; and $l(a, b)_w$ = the proximity between node $a$ and node $b$ under weighting method $w$.

$$\text{Closest proximity} = \frac{1}{|T|} \sum_{t \in T} \min_{g \in G} l(t, g)_{\text{unweighted}}$$

$$\text{ClosestBC proximity} = \frac{1}{|T|} \sum_{t \in T} \min_{g \in G} l(t, g)_{\text{BC}}$$

$$\text{ClosestDegree proximity} = \frac{1}{|T|} \sum_{t \in T} \min_{g \in G} l(t, g)_{\text{Degree}}$$

$$\text{ClosestPVal proximity} = \frac{1}{|T|} \sum_{t \in T} \min_{g \in G} l(t, g)_{\text{PVal}}$$

$$\text{ClosestLogFC proximity} = \frac{1}{|T|} \sum_{t \in T} \min_{g \in G} l(t, g)_{\text{LogFC}}$$

$$\text{ClosestCC proximity} = \frac{1}{|T|} \sum_{t \in T} \min_{g \in G} l(t, g)_{\text{CC}}$$

where closestBC is the closest proximity weighted by betweenness centrality; closestDegree is the closest proximity weighted by degree of nodes; closestCC is the closest proximity weighted by clustering coefficient; closestPVal is the closest proximity weighted by p-value from DEA; closestFC is the closest proximity weighted by fold-change from DEA.

The network analysis was implemented with pandas and NetworkX [68] in Python.
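To make the closest proximity concrete, the sketch below computes it on a small weighted graph using only the Python standard library, so that it runs without NetworkX. It is an illustration under stated assumptions rather than the LncTx implementation: the adjacency-dictionary representation and the names `build_graph`, `shortest_lengths` and `closest_proximity` are hypothetical, and edge weights are taken as given because their definition depends on the chosen weighting method.

```python
import heapq

def build_graph(edges):
    """Build an undirected adjacency dict from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph

def shortest_lengths(graph, source):
    """Dijkstra's algorithm: weighted shortest path length from source to each node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist

def closest_proximity(graph, targets, genes):
    """Mean over drug targets of the distance to the nearest lncRNA-correlated protein."""
    total = 0.0
    for t in targets:
        dist = shortest_lengths(graph, t)
        total += min(dist.get(g, float("inf")) for g in genes)
    return total / len(targets)
```

With all weights set to 1 this reduces to the unweighted closest proximity; passing weights derived from a node property gives the corresponding weighted variant.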

The drug list from Guney et al [41] contains 237 drugs with diverse indications. In this study, the indications were grouped into two categories (anticancer or non-anticancer drugs) or three categories (anti-lung cancer, anti-other cancer or non-anticancer drugs) by the consensus of two licensed medical doctors based on the latest clinical guidelines [57]. We then built the receiver operating characteristic (ROC) curve by varying the proximity threshold and calculated the area under the ROC curve (AUC). The analysis was implemented with R packages pROC [69] and ROSE [70].
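As a rough illustration of how an AUC can be obtained from proximities, the sketch below uses the rank-based identity that the AUC equals the probability that a randomly chosen anticancer drug has a smaller proximity than a randomly chosen non-anticancer drug (ties count as one half). The study itself used pROC; the function name and the direction of the score (smaller proximity predicts anticancer) are assumptions made for this example.

```python
def auc_from_proximity(proximities, is_anticancer):
    """AUC for 'smaller proximity => anticancer', via pairwise comparisons."""
    pos = [p for p, y in zip(proximities, is_anticancer) if y]
    neg = [p for p, y in zip(proximities, is_anticancer) if not y]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p < n:
                wins += 1.0   # anticancer drug is closer: correct ordering
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to proximities that carry no information about the drug label, and 1.0 to perfect separation.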

The proximity is correlated with the mean degree of drug targets (Supplementary Fig. S2A-H). To reduce the effect of the degree on the proximity, we transformed the proximity into a z-score. To begin with, we randomly selected nodes with degree similar to that of the drug targets, and calculated the proximity between these nodes and the lncRNA-correlated proteins. More specifically, to obtain nodes with degree similar to that of the drug targets, the nodes in the interactome were sorted by node degree and collected in bins, where the number of nodes within one bin did not exceed 100. We then randomly selected nodes from the bins that the drug targets belong to. By doing so, the degree of the selected nodes is very close (or equal) to that of the drug targets. We repeated the procedure 100 times and calculated the mean and standard deviation of these degree-adjusted proximities. A z-score of the proximity can then be derived by

$$z = \frac{p - \mu(p_{\text{rand}})}{\sigma(p_{\text{rand}})}$$

where $p$ is the proximity of interest; $p_{\text{rand}}$ is the proximity derived from the randomly selected nodes of similar degree; $\mu$ is the mean of the samples; $\sigma$ is the standard deviation of the samples.
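The degree-adjusted z-score can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: bins are formed as consecutive chunks of the degree-sorted node list, the sample standard deviation is used, and `proximity_fn` is a hypothetical callable that returns the proximity between a list of nodes and the lncRNA-correlated proteins.

```python
import random
import statistics

def degree_bins(degree, bin_size=100):
    """Sort nodes by degree and split them into bins of at most bin_size nodes."""
    nodes = sorted(degree, key=lambda n: (degree[n], n))
    return [nodes[i:i + bin_size] for i in range(0, len(nodes), bin_size)]

def proximity_zscore(observed, targets, degree, proximity_fn,
                     n_iter=100, bin_size=100, seed=0):
    """z-score of an observed proximity against degree-matched random node sets."""
    rng = random.Random(seed)
    bins = degree_bins(degree, bin_size)
    bin_of = {node: b for b in bins for node in b}
    null = []
    for _ in range(n_iter):
        # Replace each drug target by a random node from the same degree bin.
        sampled = [rng.choice(bin_of[t]) for t in targets]
        null.append(proximity_fn(sampled))
    mu = statistics.mean(null)
    sigma = statistics.stdev(null)  # assumption: sample standard deviation
    return (observed - mu) / sigma
```

If every random draw yields the same proximity, the standard deviation is zero and the z-score is undefined; a production implementation would need to handle that case explicitly.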

To visualize the relation between seven proximity measures and drug categories, we reduced the seven dimensions (proximities) into three dimensions by principal component analysis (PCA). Top three principal components were selected. The 3-D PCA analysis was performed with R packages rgl and plot3D [71] .

Drug effectiveness based on lncRNA expression was quantified by the correlation between lncRNA expression and drug IC50 or AUC under IC50. We took the absolute value of Spearman's correlation coefficient (SCC) to accommodate two possible biological mechanisms: (1) if the SCC is positive (i.e., lncRNA expression is positively correlated with drug IC50 or AUC under IC50), the drugs may act on the proteins that are negatively correlated with these lncRNAs; and (2) if the SCC is negative, the drugs might potentially act on these lncRNAs via off-target effects, since no lncRNAs were known to be direct targets of the drugs when we conducted this study. Therefore, the absolute value of the SCC between drug IC50 and lncRNA expression (absIC50SCC) as well as the absolute value of the SCC between drug AUC under IC50 and lncRNA expression (absAUCSCC) were used to quantify the sensitivity of the drug toward a given lncRNA.
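A minimal sketch of the absolute Spearman correlation is given below, written with the standard library only. Spearman's coefficient is computed as the Pearson correlation of average ranks, which handles ties; the function names are hypothetical and the inputs are assumed to be two equal-length lists (lncRNA expression and drug response across the same cell lines).

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def abs_spearman(x, y):
    """Absolute Spearman correlation: |Pearson correlation of the ranks|."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return abs(cov / (var_x * var_y) ** 0.5)
```

Taking the absolute value means a strongly negative and a strongly positive association both score close to 1, matching the two mechanisms described above.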

To categorize the drugs, we conducted hierarchical clustering (HC) to identify drug clusters based on the seven proximity measures. Specifically, the Euclidean distance between the proximities of drugs was used as the distance measure, and complete linkage was used as the agglomerative method. Drugs within the clusters that have smaller proximities are defined as proximal drugs, while those with larger proximities are defined as distal drugs.
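The clustering step can be illustrated with a small, deliberately naive implementation of agglomerative clustering with Euclidean distance and complete linkage. It is a sketch for intuition only: the function name, the stopping rule (a fixed number of clusters) and the row-per-drug input layout are assumptions, and an optimized library routine would be preferable for hundreds of drugs.

```python
import math

def complete_linkage(points, n_clusters):
    """Merge rows of `points` until n_clusters remain; returns lists of row indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]
    return [sorted(c) for c in clusters]
```

In the setting of this study, each row would hold the seven proximity values of one drug, and clusters with small values would be labeled proximal.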

To test whether the proximities can predict drug sensitivity, we arranged the drug clusters in the same order as in the hierarchical clustering and visualized the absIC50SCC or absAUCSCC in each cluster with box plots. A cubic polynomial was fitted to observe the trend between proximity and drug sensitivity. Next, we compared the absIC50SCC or absAUCSCC between proximal drugs (clusters with lower proximities) and distal drugs (clusters with higher proximities) by the Wilcoxon rank-sum test. The significance level was set at 0.05.

The flowchart of this study is shown in Fig. 1. To predict effective drugs acting on the survival-related lncRNAs in lung cancer, we derived eight lncRNA candidates from our previous study [25], basal gene expression of cancer cell lines from CCLE [26], and two public drug lists from Guney et al [41] and GDSC [27]. We first examined the accuracy of proximity in predicting the correct drug indication (e.g., anticancer or non-anticancer drugs), and benchmarked seven different proximity measures. Second, we used hierarchical clustering to categorize drug clusters into proximal drugs and distal drugs. Finally, we validated the results by analyzing the correlation between the drug response and lncRNA expression.

We selected eight lncRNAs from the modular signature of lung adenocarcinoma in our previous study [25], and examined the association between lncRNA expression and patients' overall survival (Fig. 2A, Supplementary Fig. S1). The results showed that higher expression of the lncRNAs is associated with favorable prognosis. We also investigated the roles of the lncRNAs in lung cancer with functional enrichment analysis, which revealed crucial cancer hallmarks such as cell cycle and cell-cell adhesion (Fig. 2B-H).

We calculated the proximity by seven different measures (Shortest, Closest, ClosestBC, ClosestDegree, ClosestPValue, ClosestLogFC, and ClosestCC). When categorizing drugs into anticancer or non-anticancer drugs, closestCC had the best performance compared to other proximity measures, with AUC greater than 0.8 across eight survival-related lncRNAs. The result implied that closestCC may be an optimal proximity measure to predict antineoplastic agents (Fig. 3A).

Degree of nodes in the interactome is an important factor in cancer network biology. Nodes with high degree tend to be hubs and play pivotal roles in the network [36] [37] [38] [63]. Considering the importance of hub genes in cancer, we speculated that the degree of drug targets may influence the prediction of antineoplastic drugs. In other words, a drug that can target hub proteins in the interactome may be a more potent antineoplastic agent than a drug that targets low-degree proteins. In fact, we found that the diagnostic accuracy of the mean degree of drug targets was acceptable (AUC = 0.743) (Fig. 3B). Further, degree was found to be negatively correlated with proximity (Supplementary Fig. S2). To investigate the net effect of proximity on the diagnostic accuracy of drug indication without the influence of degree, we transformed the proximity into the z-score (ClosestZScore) (see Materials and Methods). We then re-examined the correlation between ClosestZScore and the mean degree of drug targets (Supplementary Fig. S2). The degree effect diminished in all lncRNAs. In particular, for lncRNA ENSG00000232611, the degree effect was completely removed (Supplementary Fig. S2J). Compared to other proximity measures, ClosestZScore proximity has the lowest AUC in predicting drug indication, particularly in ENSG00000232611, whose degree effect was completely removed (Fig. 3A). We further compared the ClosestCC proximity and ClosestZScore proximity in anticancer and non-anticancer drugs. In ENSG00000232611, the anticancer drugs had significantly lower closestCC proximity than the non-anticancer drugs (Wilcoxon P = 3.9e-11) (Fig. 3C), but showed no significant difference in the ClosestZScore proximity (Fig. 3D). Similar results were found when classifying the drug indications into three categories (i.e., anti-lung cancer, anti-other cancer, or non-anticancer drugs) (Fig. 3E and F).
3D-PCA also revealed distinct distribution of the drugs in these three categories ( Supplementary Fig. S3 ). The above findings suggest that the degree may interact with the proximity and have an effect on predicting the indication of anticancer drugs.

To quantify the effectiveness of the drugs that act on lncRNAs, we computed the absolute value of Spearman correlation coefficient (SCC) between the lncRNA expression and the cell response to drugs (absIC50SCC or absAUCSCC). Here, we assume that the drug response will be significantly correlated with the lncRNA expression if that drug is acting (either directly or indirectly) on that lncRNA. To prove that the proximity can be an index in surveying the drugs acting on a lncRNA, we examined the relation between the proximity and the absIC50SCC or absAUCSCC of eight survival-related lncRNAs across seven proximity measures with scatter plots. A cubic polynomial was fitted to reveal the trend ( Supplementary Fig. S4-11 ). We found that the drugs with higher absIC50SCC or absAUCSCC tend to have smaller proximity, while the drugs with larger proximity tend to have lower absIC50SCC or absAUCSCC. In other words, many drugs with low proximity tend to be effective, while few drugs with high proximity are effective. Most of the fitted lines showed a decreasing trend, suggesting that there may exist a negative correlation between proximity and drug efficacy.

However, not all proximity measures show the same trend. For instance, the fitted line for the shortest proximity is concave up (Supplementary Fig. S6G). Hence, to analyze the similarity of the proximities calculated by different measures, the correlations between proximity measures were compared (Fig. 4). The results showed that the shortest proximity has relatively low correlation with the other measures, implying that the shortest proximity is dissimilar from them.

According to the above findings, the drugs with higher absAUCSCC or absIC50SCC tend to have lower proximity. However, whether proximity can be a predictor of drug sensitivity based on lncRNA expression still needs to be tested. In ENSG00000268650, we used hierarchical clustering (HC) to define new drug clusters according to their proximities derived from seven different measures (Fig. 5A). The scaled proximities and clusters are shown in the heatmap. Clusters 1 and 3 had smaller proximities and were defined as proximal drugs, while clusters 7, 8, and 9 were defined as distal drugs due to larger proximities. Since the proximities calculated from the seven measures are highly collinear (Fig. 4), we used principal component analysis (PCA) to reduce the seven proximity measures (dimensions) to two main principal components. We found that all clusters identified by HC can be clearly distinguished. Specifically, proximal drugs (clusters 1 and 3) localize in the area with smaller values in both PC1 and PC2, while distal drugs (clusters 7 to 9) have larger values (Fig. 5B).

We next examined the distribution of absAUCSCC in the different drug clusters identified by HC (Fig. 5C). We arranged the clusters in the same order as in HC and found a clear decreasing trend from cluster 1 through cluster 9 (Fig. 5C). This result suggests that drug clusters with lower proximity tend to be more effective in treating cell lines with aberrant expression of the lncRNA (ENSG00000268650). To validate whether proximal drugs are more effective than distal drugs, we further compared the absAUCSCC of these drug clusters. We found that proximal drugs have significantly higher correlation with lncRNA expression (absAUCSCC) than distal drugs (Wilcoxon p = 0.0094) (Fig. 5D). We also noticed that a larger proportion of proximal drugs have a significant correlation between AUC of IC50 and lncRNA gene expression (Fig. 5E and F). Among the significant proximal drugs, some are currently used in treating lung cancer, such as topotecan [72] [73] [74] (Fig. 6A); others are not currently available for treating lung cancer but have the potential to be applied in the future, such as saracatinib [75] and voxtalisib [76] (Fig. 6B and C). The results from the other survival-related lncRNAs are shown in Supplementary Fig. S12-15. In some lncRNAs, such as ENSG00000271646, proximal drugs do not have higher absAUCSCC or absIC50SCC compared to distal drugs (data not shown), implying that not all survival-related lncRNAs in our study are ideal therapeutic targets. All in all, the above results suggest that some, but not all, prognostic lncRNAs can be therapeutic targets as well. Furthermore, drug candidates within proximal clusters may be effective medications acting on the lncRNA (Table 1).

With the accessibility of next-generation sequencing technology in clinical practice, it is expected that more and more genes, including lncRNAs, will be found to be involved in cancer progression [77], patients' prognosis [78], pathological subgroups [79], and drug resistance [80]. It is also clear that the number of new anticancer drugs approved annually will lag far behind the number of new targets or biomarkers being discovered. Therefore, drug repurposing is important for providing more treatment opportunities to patients.

A study by Guney et al [41] in 2016 revealed that proximity between drug targets and disease-related proteins can accurately predict drug indication across different diseases and provide a new insight into drug repurposing. We adapted and applied their method to discover drug targets and the lncRNA-related proteins in the interactome in order to reveal effective drugs acting on the survival-related lncRNAs in lung cancer.

Previous studies showed that the hub genes in biological networks may be important targets in cancer [36, 37]. In this study, we found that the proximity is negatively correlated with the degree. Further, both proximity and degree of drug targets were involved in the prediction of drug indication, providing another clue that degree is an important network property in cancer biology. Apart from degree, parameters in the networks, including clustering coefficient and betweenness centrality, quantify the characteristics of the nodes or the neighbors of the nodes. Under the assumption that network structure plays important roles in cancer biology, we integrated these parameters by adding weights in the interactome. In terms of drug indication prediction, we found that the weighted proximity measures outperformed the non-weighted measures, particularly the closest proximity weighted with clustering coefficient (closestCC).

Fig. 3. Prediction of the indication of the anticancer drugs using seven different proximity measures across eight survival-related lncRNAs. (A) The proximity is the shortest path length between survival-related lncRNAs and the drug targets. The label of the drug is dichotomous (i.e., anticancer or non-anticancer). A receiver operating characteristic (ROC) curve was constructed by setting different thresholds of the proximity to assess the diagnostic accuracy. Area under the receiver operating characteristic curve (AUC) was then calculated by summing the area under the ROC curve. The AUC of different proximity measures were compared across eight survival-related lncRNAs. (B) ROC curve was constructed considering the mean degree of the drug targets. (C) Comparison of the ClosestCC between anticancer drugs and non-anticancer drugs in ENSG00000232611. (D) Comparison of the ClosestZScore between anticancer drugs and non-anticancer drugs in ENSG00000232611. (E) Comparison of the ClosestCC between anti-lung cancer drugs, anti-other cancer drugs and non-anticancer drugs in ENSG00000232611. (F) Comparison of the ClosestZScore between anti-lung cancer drugs, anti-other cancer drugs and non-anticancer drugs in ENSG00000232611. closestBC: the closest proximity weighted by betweenness centrality; closestDegree: the closest proximity weighted by degree of nodes; closestCC: the closest proximity weighted by clustering coefficient; closestPVal: the closest proximity weighted by p-value from DEA; closestFC: the closest proximity weighted by fold-change from DEA; closestZScore: the closest proximity transformed to z-score.

When assessing the performance of proximity in predicting effective drugs targeting lncRNA expression (data not shown), however, we found ambiguous results. First, given the same proximity measure, the value of AUC varies across different lncRNAs, and most of the AUCs are close to 0.5. Second, closestCC proximity was not better than other proximity measures. The above findings suggest that, when predicting drug sensitivity toward lncRNA, it may be less appropriate to rely on the proximity calculated from a single measure, because the results may not be robust.

Therefore, considering seven different proximity measures, we performed hierarchical clustering to define clusters of proximal and distal drugs.

This study was not designed to systematically reveal novel lncRNA therapeutic biomarkers, for which comparing the genome between responsive and non-responsive cohorts may be necessary. Rather, we were more interested in investigating whether the known survival-related lncRNAs can be therapeutic targets as well. Considering the dynamic structure and the metabolism of lncRNAs, it is not expected that all prognostic lncRNA biomarkers can be therapeutic biomarkers. We conducted a literature search for some of the identified proximal drugs. For example, six effective drugs within the proximal drugs (clusters 1 and 3) were revealed to act on ENSG00000268650. Among them, topotecan in cluster 1 was found to have positive results in treating NSCLC [74]. Being a cytotoxic agent, topotecan was shown to kill cancer cells by interrupting the cell cycle through upregulating CDKN1A-encoded p21 in a lung cancer cell line [81]. Despite the low objective response rate as a single agent, topotecan still showed a high stable disease rate and, when combined with other chemotherapy agents and radiotherapy, showed encouraging results [82]. Hence, according to Vennepureddy et al [74], topotecan in combination with other agents was recommended as first-line therapy for advanced NSCLC if the patient cannot tolerate the standard platinum-based therapy. Saracatinib is a Src-kinase inhibitor used to treat NSCLC in vivo [83]. Saracatinib seemed to be able to restore the epithelial-mesenchymal transition and disrupt spheroidogenesis in ovarian cancer cell lines and the subcutaneous xenograft model [84]. A phase II trial showed that 4 out of 37 patients had stable disease for at least four months, and an additional two patients had a partial response. This result suggested that a subset of NSCLC patients may have benefitted from saracatinib, although the molecular subtypes remain unclear [75]. Therefore, using ENSG00000268650 as a biomarker of therapeutic response to saracatinib may be worth further investigation in the future. Voxtalisib is another drug in the proximal drug cluster acting on ENSG00000268650. It is a dual PI3K/mTOR kinase inhibitor and showed promising results in follicular lymphoma [76]. In glioblastoma multiforme, voxtalisib combined with temozolomide can significantly decrease the intracranial xenograft tumor size compared with temozolomide alone [85]. Although limited anti-solid-tumor activity was revealed in a phase Ib trial [86], its brain-penetrant character [87] may still be promising in treating lung cancer with brain metastasis.

For ENSG00000272402, we found four effective drugs within the proximal drug cluster acting on this lncRNA (Supplementary Fig. S12G-J). Vinblastine and vinorelbine are chemotherapy agents originally used in NSCLC [88]. We also identified two potential agents targeting ENSG00000272402 in lung cancer. Serdemetan, a novel tryptamine derivative, showed potent in vitro and in vivo antiproliferative activity. A phase I clinical trial [89] suggested an association with p53 induction and modest clinical activity, with exposure-related QTc prolongation as a side effect. AZD7969 is a potent inhibitor of glycogen synthase kinase 3 (GSK3b) undergoing preclinical toxicity assessment [90]. These results show that some of the proximal drugs are currently used to treat NSCLC, while many others are under investigation. Our findings can provide further insight into these drugs, particularly in cell lines or patients with over- or under-expression of these survival-related lncRNAs. More detail regarding the drug clusters for each survival-related lncRNA can be found in Supplementary Table 1.

In conclusion, LncTx is a network-based method to repurpose drugs acting on survival-related lncRNAs. In this study, we showed that the proximity between drug targets and lncRNA-correlated proteins can be a decent predictor of anticancer drugs (Table 1). Furthermore, we found that some of the survival-related lncRNAs are more susceptible to proximal drugs, suggesting that proximity can be used to predict treatment response (Table 1). To our knowledge, this is the first study to use a weighted biological network to repurpose drugs targeting survival-related lncRNAs. Although our results were validated only on the GDSC/CCLE datasets, we expect that the method can be applied to other cancers and diseases to select effective drugs for lncRNA-based treatment in the future.
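To make the notion of proximity concrete, the following is a minimal sketch of an unweighted "closest" proximity: the mean, over a drug's targets, of the shortest path length to the nearest lncRNA-correlated protein. This definition is an assumption based on the common usage of closest proximity in network medicine and is not taken from the LncTx implementation; the weighted variants (such as closestCC) are not reproduced. The toy interactome, the node names, and the functions `shortest_path_lengths` and `closest_proximity` are hypothetical.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_proximity(adjacency, drug_targets, lncrna_proteins):
    """Mean distance from each drug target to its nearest lncRNA-correlated protein.

    Targets that cannot reach any lncRNA-correlated protein are skipped;
    if no target can reach one, the proximity is infinite.
    """
    nearest = []
    for target in drug_targets:
        dist = shortest_path_lengths(adjacency, target)
        reachable = [dist[p] for p in lncrna_proteins if p in dist]
        if reachable:
            nearest.append(min(reachable))
    if not nearest:
        return float("inf")
    return sum(nearest) / len(nearest)


# Toy undirected interactome (hypothetical protein names, assumption only).
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("B", "F")]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

lncrna_proteins = {"A", "F"}   # proteins correlated with one lncRNA
proximal_drug = {"B", "C"}     # targets lying close to those proteins
distal_drug = {"D", "E"}       # targets lying farther away

proximal_score = closest_proximity(adjacency, proximal_drug, lncrna_proteins)
distal_score = closest_proximity(adjacency, distal_drug, lncrna_proteins)
print(proximal_score, distal_score)
```

In this toy graph the first drug receives the smaller score, so it would be grouped with the proximal drugs. A real analysis would replace the toy edges with the full interactome and could substitute edge-weighted path lengths.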

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Non-small cell lung cancer: Epidemiology, screening, diagnosis, and treatment

Precision diagnosis and treatment for advanced non-small-cell lung cancer

Systemic therapy for locally advanced and metastatic non-small cell lung cancer: A Review

Changes in lung cancer treatment as a result of the coronavirus Disease 2019 Pandemic

Lung cancer: Current therapies and new targeted treatments

The effect of advances in lung-cancer treatment on population mortality

EGFR mutation and resistance of non-small-cell lung cancer to gefitinib

Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib

Gefitinib or carboplatin-paclitaxel in pulmonary adenocarcinoma

Alectinib versus Crizotinib in Untreated ALK-Positive Non-Small-Cell Lung Cancer

Atezolizumab for First-Line Treatment of PD-L1-Selected Patients with NSCLC

A wealth of discovery built on the Human Genome Project - by the numbers

Genome regulation by long noncoding RNAs

Gene regulation by long non-coding RNAs and its biological functions

Drugging the lncRNA MALAT1 via LNA gapmeR ASO inhibits gene expression of proteasome subunits and triggers anti-multiple myeloma activity

Principles for targeting RNA with drug-like small molecules

SHAPE reveals transcript-wide interactions, complex structural domains, and protein interactions across the Xist lncRNA in living cells

Non-coding RNAs as drug targets

MALAT1: A druggable long non-coding RNA for targeted anti-cancer approaches

MALAT-1, a novel noncoding RNA, and thymosin beta4 predict metastasis and survival in early-stage non-small cell lung cancer

The long noncoding MALAT-1 RNA indicates a poor prognosis in non-small cell lung cancer and induces migration and tumor growth

Functions of lncRNA HOTAIR in lung cancer

Roles of HOTAIR in lung cancer susceptibility and prognosis

HOTAIR lifts noncoding RNAs to new levels

Modular signature of long non-coding RNA association networks as a prognostic biomarker in lung cancer

The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity

A landscape of pharmacogenomic interactions in cancer

Harnessing connectivity in a large-scale small-molecule sensitivity dataset

Machine learning and feature selection for drug response prediction in precision oncology applications

Therapeutic targeting of non-oncogene dependencies in high-risk neuroblastoma

Machine learning approaches to drug response prediction: challenges and recent progress

Deep learning for drug response prediction in cancer

Predicting drug response of tumors from integrated genomic profiles by deep neural networks

Predicting drug response and synergy using a deep learning model of human cancer cells

Discovering long noncoding RNA predictors of anticancer drug sensitivity beyond protein-coding genes

Network medicine: A network-based approach to human disease

Network biology: Understanding the cell's functional organization

Network geometry

Computational network biology: Data, models, and applications

Network-based prediction of drug combinations

Network-based in silico drug efficacy screening

Target identification among known drugs by deep learning from heterogeneous networks

Network-based technologies for early drug discovery

Molecular networking as a drug discovery, drug metabolism, and precision medicine strategy

Network pharmacology: The next paradigm in drug discovery

Identification and construction of combinatory cancer hallmark-based gene signature sets to predict recurrence and chemotherapy benefit in stage II Colorectal Cancer

Identification of high-quality cancer prognostic markers and metastasis network modules

Transcriptional regulation, from patterns to profiles

The IntAct molecular interaction database in 2010

MINT, the molecular interaction database: 2009 update

The BioGRID Interaction Database: 2011 update

Human Protein Reference Database-2009 update

From genomics to chemical genomics: new developments in KEGG

Global reconstruction of the human metabolic network based on genomic and bibliomic data

The comprehensive resource of mammalian protein complexes

Genomics of Drug Sensitivity in Cancer (GDSC): A resource for therapeutic biomarker discovery in cancer cells

DrugBank: A comprehensive resource for in silico drug discovery and exploration

Identification of lncRNA functions in lung cancer based on associated protein-protein interaction modules

Elegant Graphics for Data Analysis

Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma

Analysis of Survival Data under the Proportional Hazards Model

Guilt by association: contextual information in genome analysis

Using networks to measure similarity between genes: association index selection

Database for annotation, visualization, and integrated discovery

limma powers differential expression analyses for RNA-sequencing and microarray studies

edgeR: A Bioconductor package for differential expression analysis of digital gene expression data

TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data

Exploring Network Structure, Dynamics, and Function using NetworkX

pROC: an open-source package for R and S+ to analyze and compare ROC curves

ROSE: A package for binary imbalanced learning

Computing and Displaying Isosurfaces in R

Topotecan for relapsed small-cell lung cancer: Systematic Review and Meta-Analysis of 1347 Patients

Update on the role of topotecan in the treatment of non-small cell lung cancer

Role of topotecan in non-small cell lung cancer: a review of literature

A phase II trial of saracatinib, an inhibitor of src kinases, in previously-treated advanced non-small-cell lung cancer: the princess margaret hospital phase II consortium

Voxtalisib (XL765) in patients with relapsed or refractory non-Hodgkin lymphoma or chronic lymphocytic leukaemia: an open-label, phase 2 trial

Long non-coding RNA RAMS11 promotes metastatic colorectal cancer progression

LncRNA profile study reveals a three-lncRNA signature associated with the survival of patients with oesophageal squamous cell carcinoma

SPSNet: subpopulation-sensitive network-based analysis of heterogeneous gene expression data

Long non-coding RNAs regulate drug resistance in cancer

Identification of key genes and pathways associated with topotecan treatment using multiple bioinformatics tools

A prospective phase II study of induction carboplatin and vinorelbine followed by concomitant topotecan and accelerated radiotherapy (ART) in locally advanced non-small cell lung cancer (NSCLC)

Combination treatment of Src inhibitor Saracatinib with GMI, a Ganoderma microsporum immunomodulatory protein, induce synthetic lethality via autophagy and apoptosis in lung cancer cells

An EMT spectrum defines an anoikis-resistant and spheroidogenic intermediate mesenchymal state that is sensitive to e-cadherin restoration by a src-kinase inhibitor, saracatinib (AZD0530)

Inhibition of PI3K/mTOR pathways in glioblastoma and implications for combination therapy with temozolomide

A phase Ib dose-escalation and expansion study of the oral MEK inhibitor pimasertib and PI3K/MTOR inhibitor voxtalisib in patients with advanced solid tumours

First-in-Human Phase I Study to Evaluate the Brain-Penetrant PI3K/mTOR Inhibitor GDC-0084 in patients with progressive or recurrent high-grade glioma

Cisplatin, vinblastine, and hydrazine sulfate in advanced, non-small-cell lung cancer: a randomized placebo-controlled, double-blind phase III study of the Cancer and Leukemia Group B

A phase I first-in-human pharmacokinetic and pharmacodynamic study of serdemetan in patients with advanced solid tumors

Preclinical toxicity of AZD7969: Effects of GSK3b inhibition in adult stem cells

Supplementary data to this article can be found online at https://doi.org/10.1016/j.csbj.2021.07.007.