key: cord-0950688-46jj5day authors: Laise, Pasquale; Stanifer, Megan L.; Bosker, Gideon; Sun, Xiaoyun; Triana, Sergio; Doldan, Patricio; La Manna, Federico; De Menna, Marta; Realubit, Ronald B.; Pampou, Sergey; Karan, Charles; Alexandrov, Theodore; Kruithof-de Julio, Marianna; Califano, Andrea; Boulant, Steeve; Alvarez, Mariano J. title: A model for network-based identification and pharmacological targeting of aberrant, replication-permissive transcriptional programs induced by viral infection date: 2022-02-04 journal: Res Sq DOI: 10.21203/rs.3.rs-1287631/v1 sha: 360377e75d41d7c4325c8bc3d4a22eb3a55247b8 doc_id: 950688 cord_uid: 46jj5day Precise characterization and targeting of host cell transcriptional machinery hijacked by viral infection remains challenging. Here, we show that SARS-CoV-2 hijacks the host cell transcriptional machinery to induce a phenotypic state amenable to its replication. Specifically, analysis of Master Regulator (MR) proteins representing mechanistic determinants of the gene expression signature induced by SARS-CoV-2 in infected cells revealed coordinated inactivation of MRs enriched in physical interactions with SARS-CoV-2 proteins, suggesting their mechanistic role in maintaining a host cell state refractory to virus replication. To test their functional relevance, we measured SARS-CoV-2 replication in epithelial cells treated with drugs predicted to activate the entire repertoire of repressed MRs, based on their experimentally elucidated, context-specific mechanism of action. Overall, >80% of drugs predicted to be effective by this methodology induced significant reduction of SARS-CoV-2 replication, without affecting cell viability. This model for host-directed pharmacological therapy is fully generalizable and can be deployed to identify drugs targeting host cell-based MR signatures induced by virtually any pathogen. Single cell analysis revealed highly conserved differential protein activity signatures, as defined 141 by the top 50 most differentially active candidate MRs, by analogy to tumor MRs 7 . We will refer 142 to this repertoire of virus-induced MRs as the Viral CheckPoint. The analysis identified a highly 143 conserved MR core induced by SARS-CoV-2 infection, within each available cellular model, 144 across all post-infection time-points for which data was available (p < 10 -40 , by 2-tailed aREA test, 145 Fig. 1a and Supplementary Fig. 2a) . 146 When comparing equivalent time-points, we observed significant conservation of the differentially 147 active protein signature across lineage-related cell models (e.g., Calu-3 vs. H1299, at 12h, p < 148 10 -40 , Supplementary Fig. 2a) . Interestingly, the virus-mediated MR signature was highly 149 conserved even across unrelated lineages, when equivalent time-points were considered (e.g., 150 H1299 vs. colon non-transformed organoid at 24h, p < 0.01, Supplementary Fig. 2a) . Taken 151 together, these findings suggest the existence of a highly reproducible, SARS-CoV-2-mediated 152 8 The MR activity signatures detected by single cell analyses were also recapitulated by bulk-tissue 158 analysis of SARS-CoV-2-infected epithelial cells (ST1), albeit at a slightly lower statistical 159 significance, as we expected. These findings applied to both transformed models, including lung 160 (Calu-3, H1299, and A549) and colon (Caco-2) adenocarcinoma, and normal human bronchial 161 epithelial (NHBE) primary cells, as well as to more physiologic models, including lung organoids. 162 As should be expected, MR conservation was more significant for models characterized by high 163 infection rates ( Supplementary Fig. 2a) , likely due to signature dilution/contamination by a high 164 proportion of non-infected cells in other models. 165 Gene Set Enrichment Analysis (GSEA) 20 demonstrated a critical dichotomy of biological hallmark 167 programs enriched in activated vs. inactivated MRs (Fig. 1b) . Specifically, biological hallmarks 168 enriched in activated MRs included inflammatory response, epithelial-to-mesenchymal transition 169 (EMT) and interferon response. Indeed, among the top aberrantly activated MRs, we identified 170 MX1, a protein induced by interferon I and II 21 , the interferon regulator IRF9, and additional 171 transcriptional regulators that mediate cellular response to interferons, such as STAT1 and 172 STAT2 22 (Fig. 1a) . 173 In contrast, our model shows that biological hallmarks enriched in inactivated MRs were strongly 174 related to virus-mediated host-cell hijacking programs, such as PI3K signaling, unfolded protein 175 response, DNA repair, and metabolic-related processes 23,24 (Fig. 1b) . Consistent with this 176 observation, the most significantly inactivated MRs included several ribosomal subunit members 177 (such as RPS27A, RPS3, RPL3, RPS6, RPS14), as well as proteins involved in cell cycle arrest 178 (UBA52) 25 , translational regulation, and cellular metabolism (GABPB1) 26 (Fig. 1a) . To assess whether activated vs. inactivated MRs in our model may represent a more effective 181 target for drug-mediated reversal, we proceeded to assess whether either class was enriched in 182 host proteins previously identified as cognate binding partners of SARS-CoV-2 proteins. For this 183 analysis, we leveraged a collection of 332 host proteins previously reported to be involved in 184 protein-protein interactions (PPIs) with 26 of the 29 proteins encoded by the SARS-CoV-2 185 genome, as determined by mass-spec analysis of pull-down assays 2 . Of these interactions, 90 186 were with proteins included in the 5,734 we analyzed by VIPER. GSEA 20 revealed statistically 187 significant enrichment of these 90 proteins in SARS-CoV-2 inactivated but not activated MRs, 188 across all the evaluated single-cell protein activity signatures (p < 10 -3 , 2-tailed GSEA, 189 Supplementary Fig. 3 ). This suggests that host cell proteins that physically interact with CoV-2 proteins are mostly inactivated in response to the infection. 191 To further confirm the functional duality of the inferred MRs, we also assessed their enrichment 193 in genes previously reported as essential to the virus infectious cycle. Specifically, we evaluated 194 their enrichment in genes identified by functional CRISPR screens from two different studies, 195 including using SARS-CoV-2 infected Vero 6 and Huh-7.5 4 cells. Consistent with our original 196 observation and definition of the SARS-CoV-2 induced MR signature, the 50 most inactivated 197 candidate MRs-as determined by integrating results from both lung and GI models-were 198 significantly enriched in infection-essential genes identified in both CRISPR screen (p < 10 -4 and 199 p < 10 -3 , respectively), as well as in the integrated set ( Supplementary Fig. 4a -c, p < 10 -4 ). In 200 contrast, the 50 most activated MRs were not significantly enriched in infection essential genes 201 (Supplementary Fig. 4d-f) . 202 To test the dependence of SARS-CoV-2 replication on inactivation of the MR proteins-termed 204 Viral Checkpoint for analogy to tumor Checkpoints 7 -, we adapted the OncoTreat algorithm 8 to 205 identify small molecule compounds capable of activating such MRs (ViroTreat, Fig. 2) . We 206 hypothesize that such drug-induced effects would keep the host cell phenotype in a "viral 207 contraception" regulatory state that effectively reduces viral replication rate. 208 We have shown that drug Mechanism of Action (MoA)-as represented by the proteins that are 209 differentially activated/inactivated-is an effective predictor of drug activity in vivo and in explants 210 defined as the repertoire of proteins differentially activated/inactivated at 24h following drug 226 perturbation. While this was done specifically in colon epithelial cells for this study, the analysis 227 can be easily extended to assess drug MoA in other cellular contexts. Specifically, the RNA-seq 228 profiles used in this analysis were generated at 24h (by PLATE-Seq assays 31 ), following 229 treatment of a colon adenocarcinoma cell line (LoVo) with a library of FDA-approved drugs and 230 vehicle control (DMSO). To avoid assessing cell death or stress mechanisms, rather than drug 231 MoA effects, drugs were titrated at their highest sublethal concentration (i.e., their 48h IC20), as 232 assessed by 10-point dose response curves (see methods for additional details). Resulting 233 profiles were then used to assess the differential activity of regulatory proteins in drug vs. vehicle 234 control-treated cells with the VIPER algorithm 9 . Finally, drugs were prioritized based on their 235 ability to activate the MR proteins inactivated by SARS-CoV-2 infection, as assessed by their 236 enrichment in proteins differentially activated by each drug, using the aREA algorithm 8,9 (Fig. 2) . 237 ViroTreat predictions were averaged across available GI organoid models and across all 238 evaluated time points. Among the 154 FDA-approved drugs profiled in LoVo cells, ViroTreat 239 prioritized 22 (13 orally available and 9 intravenous) at a highly conservative statistical threshold 240 (p < 10 -5 , Bonferroni corrected (BC)), see Fig. 3 and Supplementary Table 2) . 241 To provide proof-of-concept validation for the ViroTreat predictions in our model, we first assessed 243 drug-mediated inhibition of SARS-CoV-2 replication by ViroTreat-predicted vs. control drugs in 244 the colon adenocarcinoma cell line (Caco-2) known to support SARS-CoV-2 infection 30 . 245 For this assay, we considered all 13 ViroTreat-inferred orally-available drugs, as a more clinically 246 relevant group, and the top 5 most significant intravenous (IV) drugs. As candidate negative 247 controls, we selected 12 drugs-including 8 orally available agents and 4 IV drugs-not inferred 248 as statistically significant by ViroTreat (p ³ 0.01, Fig. 3 For each drug, the viability-normalized effect on SARS-CoV-2 replication (antiviral effect) was 253 quantified as the log-ratio between viral replication and cell viability reduction relative to vehicle-254 treated (DMSO) controls ( Supplementary Fig. 6) . Since multiple concentrations were tested, the 255 lowest concentration corresponding to a significant antiviral effect was reported (Supplementary 256 Table 2 ). As a proof-of-concept for the ability of this model to identify drugs capable of reducing 257 replication of SARS-CoV-2, we considered drugs to be validated only if their antiviral effect was 258 statistically significant (FDR < 0.05) and they induced a decrease in virus replication of at 259 least 20%. This additional condition was used to further increase the stringency when considering 260 the antiviral effect of a drug (see Methods). 261 Of 18 drugs predicted to activate the MR proteins inactivated by SARS-CoV-2 infection, 15 (83%) 262 showed statistically significant antiviral effect. In contrast, none of the 12 drugs selected as 263 potential negative controls showed significant antiviral effect (Fig. 4b and Supplementary Table 264 2), demonstrating a significant enrichment of ViroTreat results in drugs with antiviral activity (p < 265 10 -5 , 1-tailed Fisher's exact test (FET)). Consistently, the Receiver Operating Characteristic 266 (ROC) had an Area Under the Curve AUC = 0.907 (95% Confidence Interval: 0.77-0.91), which 267 is highly statistically significant (p < 10 -4 , Fig. 4c ), demonstrating the predictive power of ViroTreat 268 in this proof-of-concept. 269 To further assess the pathogen-specific nature of ViroTreat predictions, we tested the ability of Table 297 DISCUSSION 300 We report here a model characterizing the regulatory biology of virus-host interaction, in which 301 viral infection induces a phenotypic transition in the host cell toward a state that is promotive of 302 viral replication. We applied Master Regulator (MR) inference analysis 9,16 to systematically 303 dissect the transcriptional regulators (MR proteins) hijacked by the virus (Viral CheckPoint) and 304 demonstrated, using a model of SARS-CoV-2 infection in gastrointestinal epithelial cells, that 305 pharmacologically blocking this transition is sufficient to maintain the host cell in a state of 306 "transcriptional contraception" that is adverse to virus replication. We adapted the OncoTreat 307 framework, originally developed to prioritize drugs for cancer 8 , to identify drugs with concerted 308 activity on the Viral Checkpoint. 309 We propose that the approach employed in this model, which we call ViroTreat, can be used as 311 a mechanism-based framework for repurposing drugs, based on their ability to reprogram host 312 cells to a state refractory to virus hijacking. In contrast to previous host cell-centric approaches 313 aimed at targeting single host cell proteins that directly interact with the viral proteome, the 314 ViroTreat model was designed to target the entire MR protein module, whose concerted 315 regulatory activity is responsible for implementing and maintaining a virus replication-permissive 316 transcriptional state in the host cell. Thus, ViroTreat expands the one disease/one target/one 317 drug paradigm to targeting an entire protein module (i.e, Viral Checkpoint) based on the accurate 318 assessment of each drug's proteome-wide MoA, as dissected from perturbational profile data. 319 Such a holistic approach to matching disease dependencies to drug MoA overcomes the inherent 320 limitations of drug repurposing efforts that focus on inhibitors of individual proteins or single 321 pathways to thwart viral replication as part of a host cell-targeting strategy. Viral Checkpoint MR identification requires availability of gene expression signatures of virus-323 infected cells. Therefore., to avoid model-specific confounding effects and to identify a more 324 universal and reproducible MR signature of viral infection, we performed MR analysis in multiple, 325 complementary cellular models, including both transformed cell lines and normal 3D-organoid 326 cultures representing both airway and GI epithelium lineages. In addition, to avoid confounding 327 effects from a heterogeneous combination of infected and non-infected cells-representing the 328 majority of the cell population-MR analysis was also performed at the single cell level, using 329 SARS-CoV-2 genome mapped reads to unequivocally identify infected cells. Finally, we avoided 330 confounding effects from single cell transcriptional state heterogeneity by comparing each 331 infected cell to a small pool of the closest non-infected cells, based on MR analysis, as controls. 332 Finally, to achieve cell context-specific elucidation of drug MoA, we analyzed drug perturbations 333 in cell lines that recapitulate the biology of the infected cells, based on conservation of their most 334 differentially active/inactive MRs, as previously described 27 . 335 The ViroTreat framework prioritizes drugs from a predefined library used to generate 336 perturbational assays. For this proof-of-concept, we maximized the translational potential of drug 337 predictions, by focusing our analysis on FDA-approved drugs used primarily in an oncology 338 setting; with particular emphasis on orally available drugs. However, the approach can be easily 339 extended to explore a much larger library of pharmacological compounds. Moreover, the 340 database of drug context-specific MoA can be generated independently and prior to the 341 identification, isolation and characterization of a viral pathogen of interest, making it readily 342 available for current as well as future pandemics. 343 In addition, while most studies have focused on drugs that act as high affinity inhibitors of target 344 proteins 2-6,32,33 , to our knowledge, this is the first study to focus on pharmacologic agents 345 predicted to activate, rather than inhibit, an entire protein module of Master Regulator proteins 346 whose inactivation by the virus was found to be necessary for viral hijack and replication. By inducing drug-mediated reversion of the Viral Checkpoint activity, we successfully reprogrammed 348 host cells to a regulatory state of "viral contraception," thereby significantly buffering the virus's 349 ability to hijack the host cell machinery required for its infective cycle. Ileum Colon MX1 SP110 IRF9 PARP14 STAT1 IFI6 SP100 ETV7 IDO1 CXCL10 BST2 TNFSF13B STAT2 NOTCH1 TBC1D24 SOCS1 Lung GI a b Organoids were cultured in 24-well plates in basal medium for 5-7 days following the original 469 protocol of Sato and co-workers 41 . To obtain human colon organoids-derived 2D primary cell 470 cultures, the medium was removed from the 24-well plates, organoids were washed 1X with cold 471 PBS and spun (450g for 5 mins). PBS was removed and organoids were digested with 0.5% Fig. 9 ). This analysis identified four groups of drugs-i.e. components of 499 the GMM analysis. The first two groups, based on their mean, showed an average decrease in 500 infectivity of 65% and 30%; the third group showed an average decrease in infectivity close to 501 zero (3.5%); and the forth group showed an average increased in infectivity of 29% 502 ( Supplementary Fig. 9 ). Based on this analysis, we empirically estimated 20% as a reasonable 503 threshold distinguishing drugs that inhibit viral replication (first and second groups) from drugs 504 that showed no effect or increased replication (third and fourth groups, see Supplementary Fig. 505 9). The GMM analysis was performed using the mixtools package available on CRAN 506 (https://cran.r-project.org/web/packages/mixtools/index.html) (Supplementary Fig. 9) . 507 RNA isolation, cDNA, and RT-qPCR 508 RNA was harvested from cells using RNAeasy RNA extraction kit (Qiagen) as per manufactures 509 instructions. cDNA was made using iSCRIPT reverse transcriptase (BioRad) from 250 ng of total 510 RNA as per manufactures instructions. RT-qPCR was performed using iTaq SYBR green 511 (BioRad) as per manufacturer's instructions, TBP or HPRT1 were used as normalizing genes. 512 See Supplementary Table 4 for primers used. 513 The source for all the datasets is listed in Supplementary Table 1 normalized gene expression profiles. The optimal number of clusters was estimated by silhouette-527 score analysis as implemented in the "fviz_nbclust" function of the "factoextra" package 528 (https://cran.r-project.org/web/packages/factoextra/index.html). Cluster solutions were evaluated 529 from k=2 to k=10 and the solution with the highest average of silhouette scores was considered 530 as optimal. Based on the optimal cluster solution, we selected as reference for each infected 531 sample the centroid of the mock samples within the same cluster. In cases of clusters constituted 532 by infected samples only, the centroid of the mock samples in the closest cluster were used as 533 reference. Because a two clusters solution was estimated as optimal for all cluster analysis, the 534 other cluster was the trivial closest cluster solution in all cases. Cluster solutions with less than 535 two samples per cluster were considered ineffective. For Calu-3 cell line, we noticed that samples 536 associated to the two series (series-1 and series-2) clustered separately-i.e. samples clustered 537 according to series memberships. To avoid possible batch effects in the analysis, the samples of 538 these two series were re-clustered separately to identify the best matched mock control samples in each series independently. For series-1, the mock samples at 4h and 24h clustered together 540 and were used as reference to compute the differential expression signatures of all the Calu-3 541 SARS-CoV-2 infected samples. For series-2, three mock samples, including one mock sample at 542 4h and two mock samples at 12h clustered together and were used as reference to compute the 543 differential expression signatures for all the Calu-3 SARS-CoV-2 infected samples. Of note, in 544 series-2, one mock sample at 4h (GSM4477923) clustered separately from all the other samples 545 with a silhouette score of zero which indicates no clear cluster assignment. This sample was 546 considered as outlier and excluded from the downstream analysis. For the Caco-2 cell line, the 547 centroid of the 4h mock samples was used as reference to compute the differential expression 548 signatures of the SARS-CoV-2 infected samples at 4h and 12h, while the centroid of 24h mock 549 samples was used as reference to compute the differential expression signatures of the 24h 550 SARS-CoV-2 infected samples. For the H1299 cell line, the centroid of the 4h mock samples was 551 used as reference to compute the differential expression signatures of the SARS-CoV-2 infected 552 samples at 4h and 12h; and the centroid of the 36h mock samples was used as reference to 553 compute the differential expression signatures of the 36h SARS-CoV-2 infected samples. To account for confounding effects and gene expression profile heterogeneity associated with 594 mechanisms that are independent of viral infection 18,42 -such as cell cycle and the use of models 595 derived from cancer cell lines 47 -differential expression signatures between infected and non-596 infected single cells were computed by comparing each infected cell to its k = 50 closest non-597 infected ones (Supplementary Fig. 1 ). This approach significantly improved accuracy and 598 reproducibility of differential gene expression signatures, including across different cell lines, by 599 minimizing confounding effects not associated with viral infection. To identify mock controls cells 600 for each individual infected cell we transformed the count matrices to count per million (CPM) and 601 subsequently to VIPER-inferred protein activity signatures. Briefly, gene expression profiles were 602 transformed to differential gene expression signatures using the "scale" method-i.e. z-score 603 transformation-as implemented in the VIPER package 9 . Then, using lung adenocarcinoma 604 context-specific models of transcriptional regulation, we transformed the single-cell gene 605 expression signature matrices for Calu-3 and H1299 cell lines to VIPER-inferred protein activity 606 signature matrices. Similarly, using colon and rectal adenocarcinoma context-specific networks, 607 we transformed the single-cell gene expression signature matrices for ileum and colon organoids 608 to the corresponding metaVIPER-inferred protein activity signature matrices. 609 The phenotypic state similarity between cells of the same dataset was quantified by the euclidean The differential gene expression signatures of SARS-CoV-2 infected cells were transformed to 629 inferred protein activity signatures by VIPER and metaVIPER algorithms, as described above. 630 Single-cell protein activity signatures of each data set were integrated by arithmetic mean at each 631 available time point for each cell line. 632 The conservation of MR proteins between VIPER-inferred protein activity signatures was inactivated proteins in signature S1 in proteins differentially active in signature S2 and vice versa 636 Proteomics of SARS-CoV-2-infected host cells reveals therapy targets A SARS-CoV-2 protein interaction map reveals targets for drug 693 repurposing Identification of Required Host Factors for SARS-CoV-2 Infection in 695 Genome-Scale Identification of SARS-CoV-2 and Pan-coronavirus 697 Genetic Screens Identify Host Factors for SARS-CoV-2 and Common Cold 699 Genome-wide CRISPR Screens Reveal Host Factors Critical for SARS The recurrent architecture of tumour initiation, progression 703 and drug sensitivity An Integrated Systems Biology Approach Identifies TRIM25 as a Key 747 Determinant of Breast Cancer Metastasis The ubiquitin hybrid gene UBA52 regulates ubiquitination of 750 ribosome and sustains embryonic development Viral hijacking of cellular metabolism Unbiased Assessment of H-STS cells as high-fidelity models for gastro-755 enteropancreatic neuroendocrine tumor drug mechanism of action analysis. bioRxiv Reply to 'H-STS, L-STS and KRJ-I are not authentic GEPNET cell lines A Community Challenge for Pancancer Drug Mechanism of Action 760 Inference from Perturbational Profile Data. bioRxiv Critical Role of Type III Interferon in Controlling SARS-CoV-2 Infection 763 in Human Intestinal Epithelial Cells PLATE-Seq for genome-wide regulatory network analysis of high-766 throughput screens Multilevel proteomics reveals host perturbations by SARS-CoV-2 and 768 SARS-CoV Bepridil is potent against SARS-CoV-2 in vitro Heparan sulfate assists SARS-CoV-2 in cell entry and can be targeted by 780 approved drugs in vitro Identifying SARS-CoV-2 Entry Inhibitors through Drug Repurposing 782 Screens of SARS-S and MERS-S Pseudotyped Particles Identification of SARS-CoV-2 entry inhibitors among already approved drugs A Review of Repurposed Cancer Drugs in Clinical Trials for Potential 787 Treatment of COVID-19 Long-term expansion of epithelial organoids from human colon, adenoma, 789 adenocarcinoma, and Barrett's epithelium Single-cell analyses reveal SARS-CoV-2 interference with intrinsic immune 792 response in the human gut Imbalanced Host Response to SARS-CoV-2 Drives Development of 795 COVID-19 Differential expression analysis for sequence count data ARACNe-AP: gene network reverse 799 engineering through adaptive partitioning inference of mutual information Supplementary Table 2: Drugs library, ViroTreat and focused validation screen results < See supplementary file Table-S2.xlsx > 954