key: cord-0957463-0dl58zld
authors: Wang, Ruofan; Simoneau, Camille R.; Kulsuptrakul, Jessie; Bouhaddou, Mehdi; Travisano, Katherine A.; Hayashi, Jennifer M.; Carlson-Stevermer, Jared; Zengel, James R.; Richards, Christopher M.; Fozouni, Parinaz; Oki, Jennifer; Rodriguez, Lauren; Joehnk, Bastian; Walcott, Keith; Holden, Kevin; Sil, Anita; Carette, Jan E.; Krogan, Nevan J.; Ott, Melanie; Puschnik, Andreas S.
title: Genetic screens identify host factors for SARS-CoV-2 and common cold coronaviruses
date: 2020-12-09
journal: Cell
DOI: 10.1016/j.cell.2020.12.004
sha: 9256eec0a105a3b19d8e83a219396e671df72c16
doc_id: 957463
cord_uid: 0dl58zld

The Coronaviridae are a family of viruses that cause disease in humans ranging from mild respiratory infection to potentially lethal acute respiratory distress syndrome. Finding host factors common to multiple coronaviruses could facilitate the development of therapies to combat current and future coronavirus pandemics. Here, we conducted genome-wide CRISPR screens in cells infected by SARS-CoV-2 as well as two seasonally circulating common cold coronaviruses, OC43 and 229E. This approach correctly identified the distinct viral entry factors ACE2 (for SARS-CoV-2), aminopeptidase N (for 229E) and glycosaminoglycans (for OC43). Additionally, we identified phosphatidylinositol phosphate biosynthesis and cholesterol homeostasis as critical host pathways supporting infection by all three coronaviruses. By contrast, the lysosomal protein TMEM106B appeared unique to SARS-CoV-2 infection. Pharmacological inhibition of phosphatidylinositol kinases and cholesterol homeostasis reduced replication of all three coronaviruses. These findings offer important insights for the understanding of the coronavirus life cycle and the development of host-directed therapies.

between the genes identified in our screens (Cowen et al., 2017).
This approach allowed us to identify molecular networks that emerge from our datasets even if certain gene members fell below our top-scoring threshold. Network propagation is a powerful technique that uses a 'guilt-by-association' approach to propagate biological signal within large networks (e.g. Pathway Commons) to identify interconnected neighborhood clusters or pathways. In addition to revealing the functional networks underlying a particular dataset, this approach can be especially useful for identifying converging molecular networks across datasets. Here, we used an integrative network propagation approach to identify subnetworks and pathways that were common across the three coronavirus screens (Figure 2B). Briefly, we propagated the unthresholded CRISPR screen enrichment scores from each coronavirus screen and utilized a statistical permutation test paired with network clustering methods to extract network neighborhoods implicated across all three coronavirus screens.

To validate the candidate genes from the SARS-CoV-2 screen, we generated individual KO cells in three cell types. We introduced gene deletions for several top hits in A549 lung epithelial cells transduced with ACE2 (A549-ACE2) using Cas9 ribonucleoproteins (RNPs), resulting in high indel frequencies (Table S4). SARS-CoV-2 RNA levels were markedly reduced in A549-ACE2 cells that contained mutations in ACE2, ADP Ribosylation Factor 5 (ARF5), multiple subunits of the exocyst (EXOC2, EXOC6, EXOC8), the cholesterol homeostasis genes SCAP, MBTPS1 and MBTPS2, the phosphatidylinositol kinase complex genes PIKFYVE and VAC14, or TMEM106B (Figure 3A).
Next, we lentivirally introduced Cas9 and sgRNAs against a subset of these genes (TMEM106B, VAC14, SCAP, MBTPS2, EXOC2) into Calu-3 lung epithelial cells with endogenous ACE2 levels and also observed reduced viral replication compared to control cells harboring a non-targeting sgRNA (Figure 3B).

decrease of SARS-CoV-2 RNA levels (Figure 3E). When we infected the same Huh7.5.1 KO cells with OC43 and 229E, we observed reduced viral replication in SCAP, MBTPS2 and EXOC2 KO cells but not in TMEM106B KO and only moderately in VAC14 KO cells (Figure 3F). This suggests that the latter genes are more rate-limiting in SARS-CoV-2 infection.

Next, we probed Huh7.5.1 cells lacking genes involved in endosome maturation or the PI3K complex, which were initially found in the common cold coronavirus screens. We saw reduced viral replication for OC43 and 229E (Figures 3G and 3H). Additionally, we observed increased cell viability in all KO cells relative to WT Huh7.5.1 cells 8 dpi (Figures S4C and S4D), indicating that these genes are important for infection by the common cold viruses and for virus-induced cell death. We then tested whether the hits shared between OC43 and 229E affect SARS-CoV-2. Indeed, SARS-CoV-2 infection was reduced in cells lacking certain endosomal or PI3K genes in the context of Huh7.5.1 without ACE2-IRES-TMPRSS2, similar to the common cold coronaviruses (Figure 3I). Complementation of PIK3R4 and VPS16 KO cells with respective cDNAs restored SARS-CoV-2 and 229E, and to a lesser degree, OC43 replication levels (Figures 3J-O and S4B). To rule out the possibility that decreased viral replication is due to severe cellular growth defects, we measured proliferation of RNP-edited A549-ACE2 and clonal Huh7.5.1 KO cells. Apart from SCAP KO cells, we did not observe any notable growth differences compared to WT cells (Figures S4E and S4F).
Together, these experiments confirm that the host factors identified in our screens in Huh7.5.1 cells have functional roles for Coronaviridae, which are also relevant in lung epithelial cells. Furthermore, we demonstrated that important aspects of SARS-CoV-2 biology can be revealed by studying the common cold coronaviruses.

Host factors important for virus infection are potential targets for antiviral therapy. Host-directed therapy is advantageous as it allows pre-existing drugs to be repurposed, may provide broad-spectrum inhibition against multiple viruses, and is generally thought to be more refractory to viral escape mutations than drugs targeting viral factors (Bekerman and Einav, 2015). We therefore explored whether the cellular pathways identified in our screens could serve as targets for therapy against coronavirus infection.

infection levels at higher doses (Figure S5B). We confirmed on-target activity of the SREBP pathway modulators by measuring reduced expression of SREBP-regulated genes upon drug treatment (Figure S5C).

We also tested Bardoxolone, an activator of the KEAP1-NRF2 complex (Liby and Sporn, 2012), since KEAP1 scored highly in both common cold coronavirus screens. Bardoxolone potently inhibited 229E and OC43 replication and also reduced SARS-CoV-2 RNA levels at slightly higher concentrations (Figures 4J-L

Next, we tested whether some of the identified genes affect viral entry. We generated a clonal Huh7.5.1-ACE2/TMPRSS2 overexpression cell line to facilitate efficient infection with a SARS-CoV-2 spike pseudotyped vesicular stomatitis virus (VSV-SARS-CoV-2-S) expressing GFP, which can be utilized to specifically probe effects on spike-mediated entry of SARS-CoV-2. We then introduced Cas9 RNPs and created knockout lines for our genes of interest.
Editing efficiencies were high and loss of protein was confirmed for TMEM106B (Figures 5A and Table S4). As expected, knockout of ACE2 drastically reduced infection with VSV-SARS-CoV-2-S (Figure 5B). By contrast, we did not observe a decrease of viral entry in TMEM106B and VAC14 KO cells, suggesting that they do not play a role in spike-mediated entry (Figure 5B). We saw reduced uptake of

In this study, we performed genome-scale CRISPR KO screens to identify host factors important for SARS-CoV-2, 229E and OC43. Our data highlight that while the three coronaviruses exploit distinct entry factors, they also depend on a convergent set of host pathways, with potential roles for the entire Coronaviridae family.

calculated by hypergeometric test and a false-discovery rate was used to account for multiple hypothesis testing. The top GO terms of each screen were selected for visualization. A complete list of significant GO terms can be found in Table S2.
(B) Data integration pipeline for network propagation of identified host factor genes. Unthresholded CRISPR screen enrichment scores served as initial gene labels for network propagation using Pathway Commons. Separately propagated networks were integrated gene-wise (via multiplication) to identify biological networks that are shared between all three datasets. Genes found to be significant in the propagation were extracted, clustered into smaller subnetworks, and annotated using GO enrichment analysis (see Methods).
(C) Selected biological subnetwork clusters from network propagation. Cluster title indicates the most significant biological function(s) for each cluster. Circle size represents p-value from network propagation permutation test (see STAR Methods and Table S3). The original enrichment score of a gene in each CRISPR screen is indicated by color scale within the circle.
The entire set of identified clusters is displayed in Figure S3A. (#) is the cluster number, which refers to the GO enrichment analysis of biological processes in Figure S3B and Table S2.

Table S3). The CRISPR screen enrichment score of a gene from each screen is indicated by color scale within the circle.
(B) Gene ontology (GO) enrichment analysis was performed on each subcluster from the network propagation. P values were calculated by hypergeometric test and a false-discovery rate was used to account for multiple hypothesis testing. The entire set of enriched biological processes for each subcluster is listed in Table S2.

Table S5.

For the 229E and OC43 CRISPR screens, 100 million cells (per screen) of Huh7.5.1-Cas9-blast GeCKO library cells were infected with 229E and OC43 at an MOI of 0.05 and 3, respectively. Cells were incubated at 33°C to increase CPE, which was apparent after 3-4 days. Surviving cells were collected after 10 days for 229E and 14 days for OC43. Each screen was performed in two replicates. For all CRISPR screens, genomic DNA

Table S5.

is the unnormalized adjacency matrix, and D is the diagonal degree matrix of the network), I is the identity matrix, and α denotes the restart probability (here, α=0.2), which is the probability of returning to the previously visited node, thus controlling the spread through the network.

We performed three independent propagations, one for each CRISPR dataset (i.e. each virus). After propagation, each propagated network was integrated by multiplying gene-wise. Such an operation is used to create a gene list ranked to prioritize genes with high scores from all propagated datasets. To control for nodes with high degree (i.e. many connections), which due to their heightened connectivity are biased to receive higher propagation scores, we conducted a permutation test.
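The propagation, gene-wise integration, and permutation steps can be illustrated with a minimal sketch. It is not the authors' pipeline: because the propagation equation itself is missing from this text, the sketch assumes the standard closed-form random walk with restart, p = α(I − (1 − α)W)⁻¹p₀ with W = AD⁻¹, and the function names (`propagate`, `integrate`, `empirical_p_values`) are hypothetical. For simplicity the permutation step shuffles all scores among genes for a single screen, whereas the text describes shuffling the positive scores.

```python
import numpy as np


def propagate(adjacency, scores, alpha=0.2):
    """Random walk with restart in closed form (assumed formulation).

    Solves p = alpha * (I - (1 - alpha) * W)^-1 * p0, where W = A D^-1 is the
    column-normalized adjacency matrix and alpha is the restart probability.
    """
    A = np.asarray(adjacency, dtype=float)
    degree = A.sum(axis=0)
    degree[degree == 0] = 1.0          # isolated nodes: avoid division by zero
    W = A / degree                     # divides column j by degree[j]
    identity = np.eye(A.shape[0])
    p0 = np.asarray(scores, dtype=float)
    return alpha * np.linalg.solve(identity - (1.0 - alpha) * W, p0)


def integrate(propagated_scores):
    """Combine several propagated score vectors by gene-wise multiplication."""
    return np.prod(np.vstack(propagated_scores), axis=0)


def empirical_p_values(adjacency, scores, n_perm=1000, alpha=0.2, seed=0):
    """Fraction of shuffled-score propagations >= the observed value, per gene."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    observed = propagate(adjacency, scores, alpha)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        shuffled = rng.permutation(scores)   # reassign scores to random genes
        exceed += propagate(adjacency, shuffled, alpha) >= observed
    return exceed / n_perm
```

In this sketch, calling `propagate` once per screen and passing the three resulting vectors to `integrate` yields a ranking that favors genes scoring highly in all screens, while `empirical_p_values` indicates how often a gene reaches its propagated score by chance given its connectivity. Solving the linear system avoids forming the matrix inverse explicitly, which matters for a network the size of Pathway Commons.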
Specifically, we simulated random propagations by shuffling the positive scores to random genes, repeating this 20,000 times per CRISPR screen. Next, we calculated an empirical p-value by calculating the fraction of random propagation runs greater than or equal to the true propagation run for each gene.

The network was created by extracting a subnetwork from the same Pathway Commons network corresponding to genes possessing a significant p-value (p ≤ 0.01) from the propagation (n=378). Of these, interconnected genes were visualized using Cytoscape (n=284). The resulting network was clustered into subnetworks using the GLay Cytoscape plugin (Su et al., 2010). Three large clusters (1, 3, and 5) were further clustered using GLay into additional subclusters (denoted with letters), resulting in a total of 25 subnetwork clusters (see Figure S3A and Table S3)

Table S5.

Nucleofections were performed on a Lonza HT 96-well nucleofector system using programs CM-120 and CM-104 for A549-ACE2 and Huh7.5.1-ACE2/TMPRSS2, respectively. All transfections were performed in Lonza SE buffer. Immediately following nucleofection, each reaction was divided evenly between two wells of a tissue-culture

Table S1: CRISPR screen results. MAGeCK output for positive gene enrichment analysis of SARS-CoV-2, 229E and OC43 host factor screens. Related to Figure 1.
Table S2: Gene ontology enrichment analysis of individual CRISPR screens and network propagation clusters. Related to Figure 2.
Table S3: Network propagation results. Related to Figure 2.

To identify host factors required for the infection with SARS-CoV-2 and the common cold coronaviruses OC43 and 229E, Wang et al. conduct genome-wide CRISPR knockout screens.
In addition to virus-specific entry factors they uncover shared host pathways, including cholesterol homeostasis and phosphatidylinositol kinases, required for the infection with all three viruses, and demonstrate that pharmacological inhibition of these pathways exhibits pan-coronavirus antiviral activity.

Identification of the neuroblastoma-amplified gene product as a component of the syntaxin 18 complex implicated in Golgi-to-endoplasmic reticulum retrograde transport
Identification of TMEM106B as proviral host factor for SARS-CoV-2
CORVET and HOPS tethering complexes - coordinators of endosome and lysosome fusion
Anterograde or retrograde transsynaptic labeling of CNS neurons with vesicular stomatitis virus vectors
Combating emerging viral threats
Coronavirus 229E for Cathepsin-Independent Host Cell Entry and Is Expressed in Viral Target Cells in the Respiratory Epithelium
PI3K isoforms in cell signalling and vesicle trafficking
The Global Phosphorylation Landscape of SARS-CoV-2 Infection
Retrospective on Cholesterol Homeostasis: The Central Role of Scap
A versatile viral system for expression and depletion of proteins in mammalian cells
Ebola virus entry requires the cholesterol transporter Niemann-Pick C1
Pathway Commons, a web resource for biological pathway data
Cellular cholesterol abundance regulates potassium accumulation within endosomes and is an important determinant in bunyavirus entry
SARS-CoV-2 Infection Depends on Cellular Heparan Sulfate and ACE2
Network propagation: a universal amplifier of genetic associations
Therapeutic targeting of the NRF2 and KEAP1 partnership in chronic diseases
Can Activation of NRF2 Be a Strategy against COVID-19?
Relation of Statin Use Prior to Admission to Severity and Recovery Among COVID-19 Inpatients
Identification of Required Host Factors for SARS-CoV-2 Infection in Human Cells
Stomatitis Virus for Studies of SARS-CoV-2 Spike-Mediated Cell Entry and Its Inhibition
An interactive web-based dashboard to track COVID-19 in real time
Identification of a Novel Coronavirus in Patients with Severe Acute Respiratory Syndrome
Survey of Human Chromosome 21 Gene Expression Effects on Early Development in Danio rerio
A genome-wide CRISPR screen identifies N-acetylglucosamine-1-phosphate transferase as a potential antiviral target for Ebola virus
Human Coronavirus: Host-Pathogen Interaction
A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
Comparative host-coronavirus protein interaction networks reveal pan-viral disease mechanisms
High-Resolution CRISPR Screens Reveal Fitness Genes and Genotype-Specific Cancer Liabilities
Pharmacologic inhibition of site 1 protease activity inhibits sterol regulatory element-binding protein processing and reduces lipogenic enzyme gene expression and lipid synthesis in cultured cells and experimental animals
The Ccz1-Mon1-Rab7 module and Rab5 control distinct steps of autophagy
Human coronaviruses: what do they cause?
TMEM41B is a pan-flavivirus host factor
A Multibasic Cleavage Site in the Spike Protein of SARS-CoV-2 Is Essential for Infection of Human Lung Cells
SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor
Human coronavirus NL63 employs the severe acute respiratory syndrome coronavirus receptor for cellular entry
NPC1 regulates ER contacts with endocytic organelles to mediate cholesterol egress
Inference of CRISPR Edits from Sanger Trace Data
Human coronaviruses OC43 and HKU1 O-acetylated sialic acids via a conserved receptor-binding site in spike protein domain A
A selective PIKfyve inhibitor blocks PtdIns(3,5)P(2) production and disrupts endomembrane transport and retroviral budding
The HOPS complex mediates autophagosome-lysosome fusion through interaction with syntaxin 17
COT drives resistance to RAF inhibition through MAP kinase pathway reactivation
A small molecule that blocks fat synthesis by inhibiting the activation of SREBP
Inhibition of PIKfyve kinase prevents infection by Zaire ebolavirus and SARS-CoV-2
Tollip and Tom1 Form a Complex and Recruit Ubiquitin-conjugated Proteins onto Early Endosomes
Loss of TMEM106B Ameliorates Lysosomal and Frontotemporal Dementia-Related Phenotypes in Progranulin-Deficient Mice
Haploid Genetic Screen Reveals a Profound and Direct Dependence on Cholesterol for Hantavirus Membrane Fusion
Binding of Vac14 to neuronal nitric oxide synthase: Characterisation of a new internal PDZ-recognition motif
Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses
Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus
MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens
Synthetic oleanane triterpenoids: multifunctional drugs with a broad range of applications for prevention and treatment of chronic disease
The FTLD Risk Factor TMEM106B Regulates the Transport of Lysosomes at the Axon Initial Segment of Motoneurons
GPHR is a novel anion channel critical for acidification and functions of the Golgi apparatus
Ambra1 regulates autophagy and development of the nervous system
Electroporation and RNA interference in the rodent retina in vivo and in vitro
Enhanced isolation of SARS-CoV-2 by TMPRSS2-expressing cells
The cytoplasmic tail of the severe acute respiratory syndrome coronavirus spike protein contains a novel endoplasmic reticulum retrieval signal that binds COPI and promotes interaction with membrane protein
The exocyst complex
TMEM41B is a novel regulator of autophagy and lipid mobilization
Identification of SARS-CoV2-mediated suppression of NRF2 signaling reveals a potent antiviral and anti-inflammatory activity of 4-octyl-itaconate and dimethyl fumarate
Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV
Comparing SARS-CoV-2 with SARS-CoV and influenza pandemics
The Major Cellular Sterol Regulatory Pathway Is Required for Andes Virus Infection
A CRISPR toolbox to study virus-host interactions
Genome engineering using the CRISPR-Cas9 system
A Genetic Screen Identifies a Critical Role for the WDR81-WDR91 Complex in the Trafficking and Degradation of Tetherin
Discovery of SARS-CoV-2 antiviral drugs through large-scale compound repurposing
A highly potent and selective Vps34 inhibitor alters vesicle trafficking and autophagy
Spinster is required for autophagic lysosome reformation and mTOR reactivation following starvation
TRIM28 regulates the nuclear accumulation and toxicity of both alpha-synuclein and tau
Lysosome biogenesis and lysosomal membrane proteins: trafficking meets function
Improved vectors and genome-wide libraries for CRISPR screening
Genome-scale identification of SARS-CoV-2 and pan-coronavirus host factor networks
Middle East Respiratory Syndrome Coronavirus Infection Mediated by the Transmembrane Serine Protease TMPRSS2
PIKfyve and its Lipid Products in Health and in Sickness
Inhibitors of cathepsin L prevent severe acute respiratory syndrome coronavirus entry
Lentivirus-delivered stable gene silencing by RNAi in primary cells
Glycan Engagement by Viruses: Receptor Switches and Specificity
Multi-level proteomics reveals host-perturbation strategies of SARS-CoV-2 and SARS-CoV
GLay: Community structure analysis of biological networks
SARS-CoV-2 entry factors are highly expressed in nasal epithelial cells together with innate immune genes
Determination of host proteins composing the microenvironment of coronavirus replicase complexes by proximity-labeling
Genome-wide CRISPR Screens Reveal Host Factors Critical for SARS-CoV-2 Infection
A new coronavirus associated with human respiratory disease in China
Human aminopeptidase N is a receptor for human coronavirus 229E
Isolation of a Novel Coronavirus from a Man with Pneumonia in Saudi Arabia
Cholesterol 25-hydroxylase suppresses SARS-CoV-2 replication by blocking membrane fusion
In-Hospital Use of Statins Is Associated with a Reduced Risk of Mortality among Individuals with COVID-19
Robust hepatitis C virus infection in vitro
The S1/S2 boundary of SARS-CoV-2 spike protein modulates cell entry pathways and transmission

treated 96-well plate containing 100 µL normal culture media. Two days post-nucleofection, DNA was extracted using DNA QuickExtract (Lucigen). Amplicons for indel analysis were generated by PCR amplification. PCR products were cleaned up and analyzed by Sanger sequencing.
Sanger data files and sgRNA target sequences were input into Inference of CRISPR Edits (ICE) analysis (ice.synthego.com) to determine editing efficiency and to quantify generated indels (Hsiau et al., 2019). A list of all used sgRNA sequences and genotyping primers can be found in Table S5.

DNA oligos (IDT) containing sgRNA sequences (see Table S5) were annealed and ligated into lentiCRISPRv2 (Addgene, #52961, gift from Feng Zhang) (

RNaseP were displayed in figures. All qPCR primer/probe sequences are listed in Table S5.

Replicates are displayed as mean ± s.e.m. or mean ± s.d. as specified in the figure legends. Mean ± s.e.m. for RT-qPCR data was determined using CFX Maestro Software (Bio-Rad) and then visualized in GraphPad Prism 8. Mean ± s.e.m. or mean ± s.d. for remaining data was calculated and visualized using GraphPad Prism 8. Dose-response curves for drug treatments were generated by applying a non-linear curve fit with least squares regression and default parameters using GraphPad Prism 8. No