key: cord-1043117-y53b2sl2
authors: Olson, Erika J.; Brown, David M.; Chang, Timothy Z.; Ding, Lin; Ng, Tai L.; Weiss, H. Sloane; Koide, Yukiye; Koch, Peter; Rollins, Nathan; Mach, Pia; Meisinger, Tobias; Bricken, Trenton; Rollins, Joshua; Zhang, Yun; Molloy, Colin; Queenan, Bridget N.; Mitchison, Timothy; Marks, Debora; Way, Jeffrey C.; Glass, John I.; Silver, Pamela A.
title: High-content screening of coronavirus genes for innate immune suppression reveals enhanced potency of SARS-CoV-2 proteins
date: 2021-03-02
journal: bioRxiv
DOI: 10.1101/2021.03.02.433434
sha: 41220166837abdd7a2226bb64ba8aa05885e3f5d
doc_id: 1043117
cord_uid: y53b2sl2

Suppression of the host intracellular innate immune system is an essential aspect of viral replication. Here, we developed a suite of medium-throughput high-content cell-based assays to reveal the effect of individual coronavirus proteins on antiviral innate immune pathways. Using these assays, we screened the 196 protein products of seven coronaviruses (SARS-CoV-2, SARS-CoV-1, 229E, NL63, OC43, HKU1 and MERS). This includes a previously unidentified gene in SARS-CoV-2 encoded within the Spike gene. We observe immune-suppressing activity both in known host-suppressing genes (e.g., NSP1, Orf6, NSP3, and NSP5) and in other coronavirus genes, including the newly identified SARS-CoV-2 protein. Moreover, the genes encoded by SARS-CoV-2 are generally more potent immune suppressors than their homologues from the other coronaviruses. This suite of pathway-based and mechanism-agnostic assays could serve as the basis for rapid in vitro prediction of the pathogenicity of novel viruses based on provision of sequence information alone.

(Maher et al., 2007). Patients who receive interferon α for hepatitis or interferon β for multiple sclerosis experience flu-like symptoms from these proteins alone.
We therefore designed a cell-based assay to reveal the effect of individual coronavirus

In the present analysis, the data were aggregated to reveal the most clear-cut

Orf7a; proteins NSP4, NSP10, NSP12, Env, and CaORF15 had more modest effects.

We determined which genes had consistent effects across coronaviruses. The NSP10 and NSP12 genes from several different viruses also inhibited both IRF-3 and NFkB nuclear accumulation. While the effects of these genes are modest, the fact that we observe them for genes from multiple different viruses suggests that they are not due to noise in the assay.

In contrast to the broad suppression of IRF-3 and NFkB translocation, coronaviral proteins largely did not act upon the STAT1 pathway in our assay, with the notable exception of two proteins: NSP1 and Orf6 (Figure 3C). Both of these are known inhibitors of host functions: NSP1 binds to the ribosome and inhibits translation of a subset of host genes (Tidu et al., 2020), while Orf6 protein binds to and functionally inhibits Nup98, a nuclear transport protein that carries certain proteins into the nucleus (Miorin et al., 2020) and whose expression is induced by inflammatory stimuli (Enninga et al., 2002).

A number of viral proteins appeared to stimulate, rather than inhibit, the STAT1 pathway (Figure 3C). Such stimulation may represent situations where the innate immune system

(i.e., the less pathogenic strains have fewer C-terminal ORFs, see Figure 1A). However, the immune-suppressing proteins encoded by SARS-CoV-2 within this region also generally showed stronger effects than their SARS-CoV-1 homologues. In particular, Orf3a, Orf6, and Orf7a from SARS-CoV-2 showed stronger immune-suppressing effects (Figures 3A-3C).
These proteins have no homologues in MERS-CoV, HCoV 229E, HCoV NL63, HCoV HKU1 and HCoV OC43, which encode a different set of accessory proteins that generally seem to be less active in our assays. Collectively, we conclude that the individual and cumulative ability of coronaviral proteins to suppress the Type I interferon-mediated innate immune response is strongest in SARS-CoV-2.

We determined which coronaviral proteins act by suppressing the expression of inflammatory genes within the host cell, using a second cell-based assay that tests whether coronavirus genes interfere with expression of a TNFα-inducible reporter element (Figure 4A). Specifically, we constructed a stable pool of HEK293 cells (which have a high transfection efficiency) with a DNA construct in which a degradation-tagged fluorescent mScarlet protein was expressed from an artificial promoter with five NFkB binding sites upstream of a 'minimal

expressing and non-expressing cells (Figures 4B-4D). The NSP1 protein from several of the coronaviruses inhibited reporter gene expression, with the effect being strongest in SARS-CoV-2 (Figures 4D-4E). Figure

Because of the complex nature of human antiviral innate immunity, we developed multiple assays that could evaluate whether a viral protein was able to interfere with antiviral defenses. The assays above test the capacity of viral proteins to affect specific pathways involved in innate immunity. In another approach, we created a mechanism-agnostic test of

in the assays of Figure

including a papain-like protease. We also found that NSP3 genes from most of the viruses tested had inhibitory effects on IRF-3 and NFkB. NSP3 has a number of additional domains that may suppress innate immunity based on modulation of the ubiquitin ligase system (Lei et

We also found that NSP9, NSP10 and NSP12 had innate immune-suppressing activity.
NSP9 from SARS-CoV-1, MERS, and NL63 promoted replication of HSV-1 in the yeast fusion assay, as did NSP10 from NL63 and SARS-CoV-2. NSP10 and NSP12 inhibited IRF-3 and the IFN-beta promoter. All of these proteins are involved with mRNA metabolism. NSP12 is the RNA-dependent RNA polymerase, so at a mechanistic level it is not apparent how this protein might modulate IRF-3 and NFkB. It is possible that, by analogy to influenza NS1 with its multiple surfaces that bind to unrelated host proteins, NSP12 has surfaces that are not involved with RNA polymerization and can therefore carry out an unrelated function of titrating a host factor. Alternatively, these proteins may bind to host RNAs that encode proteins involved in innate immunity.

As part of our preparation of expression vectors for coronavirus genes, we identified a new ORF in SARS-CoV-2, which we provisionally term "CaORF15" and which falls into the

In sum, we characterized the behavior of proteins from seven different coronaviruses in three different assays for suppression of intracellular innate immune signaling. We found that a number of proteins showed inhibitory activity. At least two of these, NSP9 and SARS-CoV-2 "CaORF15," do not appear to have been previously identified as such.

Innate immune suppression may correlate with asymptomatic spread rather than pathogenicity per se. Infection with MERS, for example, is fatal much more often than infection with SARS-CoV-2 or SARS-CoV-1; here the severe symptoms, presumably a result of a robust Type I interferon response, prevent spread of the virus within the human population. In carrying out these experiments, we sought to address whether the pandemic potential of a virus could be estimated based on medium-throughput analysis of an entire viral genome for suppression of the human intracellular innate immune system.

Emerging viruses generally fall into known families.
To set the stage for inferring pandemic potential in future emerging viruses, we compared the genes of SARS-CoV-2 with several other coronaviruses of known pathogenicity. The results of this analysis indicate that SARS-CoV-2 genes cumulatively appear to have a greater potential for immune pathway suppression than other coronaviruses, including SARS-CoV-1. This is admittedly an approximate statement, since the relative importance of each gene in these viruses is not generally known, but the data collectively are likely to suffice for a rapid assessment of whether to initiate a vaccine program. Taken together, these results suggest that rapid testing of viral genes in assays for innate immune suppression, performed with genes from related viruses, could be used for early-stage evaluation of the pandemic potential of emerging viruses.

Further information and requests for resources and reagents should be directed to and will be

All image datasets generated during this study are available upon request via OMERO and

Of the translations, 15 correspond to known SARS-CoV-2 proteins and each resulted in significant alignments to one or more PFAM profiles (all E-values < 1E-14). At the time, ORF14 was missing from the available SARS-CoV-2 annotation; nevertheless, our approach identified the translation as a likely protein because of its significant alignment to PF17635, now also called bCoV_Orf14 (E-value 6.2E-35). In addition, while ORF3b is split in SARS-CoV-2 relative to SARS-CoV-1 by an early stop codon, we found alignments to the resulting fractional translations (E-values = 0.0024 and 0.17). There remained 11 candidate ORFs that were not known to be SARS-CoV-2 proteins and that aligned to some PFAM profile (E-values = 6.4E-5 to 0.94). Of these candidates, one stood out as the longest (87 aa) and most significant (E-value = 6.4E-5) by a large margin.
We dubbed this translation "Candidate ORF 15" (CaORF15) and decided to test the sequence in our assays. CaORF15 encodes 87 amino acids at genome coordinates 21936-

aligned Spike regions, computing the non-gapped identity of each genome versus that of SARS-CoV-2 (Fig 1C, bottom). To emphasize local identity, we smoothed the windows by weighting matches with a centered normal distribution (sigma=8).

construct as whole genes using DH5α, so we used E. coli PY1182 (an MM294-derived strain with a pcnB80 mutation to reduce copy-number of ColE1 vectors; a gift of R. Losick) as a host.

were each co-transfected with Cas9-containing plasmid (Synthego) into BJ-5ta cells (ATCC CRL-4001) using Lipofectamine 3000. After 48 hours, samples were removed from each knockout pool for Inference of CRISPR Edits (ICE) analysis to assess gRNA efficiency, which was between 1% and 6%. The knockout pool with 6% gRNA efficiency (gRNA 2) was diluted to a density of 0.5 cells/100 µL and plated into 96-well plates for clonal expansion. Colonies grown from a single cell were visually identifiable after 3 weeks. After 8 weeks, the cGAS locus was sequenced in each clonal colony to identify colonies with homozygous indels. One

(cGAMP) (Invivogen #tlrl-nacga23-5) at a concentration of 1 mg/mL (final concentration in the well 100 µg/mL) was used to stimulate STING pathway activity by incubation at 37 °C, 5% CO2 for 2 h. Interferon α1 (Cell Signaling #8927) or α2b (PBL Assay Science #11100-1) at a concentration of 50 ng/mL (final concentration in the well 5 ng/mL) was used to stimulate IFNAR activity by incubation at 37 °C, 5% CO2 for 45-50 min. Cell signaling was stopped by fixation as

4x with 60 µL PBS using an automated plate washer and sealed using impermeable black plate seals.
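The Gaussian-weighted smoothing of alignment identity described above for the aligned Spike regions can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `smoothed_identity`, the gap character, and the choice to evaluate the weighted identity at every alignment column are assumptions made for illustration. Only the use of non-gapped columns and a centered normal weighting with sigma = 8 comes from the text.

```python
import math


def smoothed_identity(ref, other, sigma=8.0, gap="-"):
    """Gaussian-weighted local identity at each column of a pairwise alignment.

    ref, other: aligned sequences of equal length (hypothetical inputs).
    sigma: standard deviation of the centered normal weighting (8 in the text).
    Columns in which either sequence has a gap are excluded (non-gapped identity).
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")

    # Keep only non-gapped columns as (column index, 1.0 for match / 0.0 for mismatch).
    cols = [
        (i, 1.0 if a == b else 0.0)
        for i, (a, b) in enumerate(zip(ref, other))
        if a != gap and b != gap
    ]

    profile = []
    for center in range(len(ref)):
        num = 0.0
        den = 0.0
        for i, match in cols:
            # Weight each column by a normal density centered on `center`.
            w = math.exp(-0.5 * ((i - center) / sigma) ** 2)
            num += w * match
            den += w
        # No usable columns near this position: report NaN rather than guess.
        profile.append(num / den if den > 0 else float("nan"))
    return profile
```

Each value is a weighted mean of match indicators, so the profile stays between 0 and 1, and a larger sigma gives a smoother curve. The quadratic loop is kept for readability; a convolution would be the usual choice for genome-length alignments.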
If not imaged immediately, fixed and stained cells were stored at 4 °C for a maximum of

cGAMP, poly(I:C) HMW, or poly(I:C) LMW were stained simultaneously for IRF-3 (Cell Signaling Technology #11904) and NFκB (Santa Cruz Biotechnology #sc-8008). IRF-3, phospho-STAT1, and STAT1 primary antibodies were detected using an Alexa-Fluor 647-conjugated goat anti-rabbit IgG antibody (ThermoFisher #A21245). NFkB primary antibody was detected using an

five experiments were averaged; in general, genes that showed no effects in the first two

roughly 0 to 1. The purpose of these calculations is simply to make the inhibition scores more intuitive. The rationale for changing the sign is that, for example, the response to an immune stimulus will be reduced relative to negative controls if a virus gene inhibits that response, so the sign of the fold change score will be negative. The rationale for re-scaling the scores is that the primary fold change scores are very small; this is in part a result of the calculations performed by the proprietary Columbus software, and likely in part results from the fact that we do not perform a background subtraction when assessing signal levels in the initial image processing. The primary fold change scores are thus highly artificial numbers; our confidence in their meaning results from the fact that differences from controls are statistically significant, that positive control genes such as parainfluenza virus V protein score correctly in our assays, and that coronavirus genes identified by others as major inhibitors of innate immune signaling also behave as expected in our assays.

The authors declare that they have no conflicts of interest.

SARS-CoV-1. Numerical values presented here correspond to the bar graph in Figure 1 which indicate that the 1095 nucleotide sequence of the 5'-most region of the Spike gene in SARS-CoV-2 (which also 21541