key: cord-0703314-doelyoww authors: Cao, Yingying; Xu, Xintian; Kitanovski, Simo; Song, Lina; Wang, Jun; Hao, Pei; Hoffmann, Daniel title: Comprehensive comparison of transcriptomes in SARS-CoV-2 infection: alternative entry routes and innate immune responses date: 2021-01-11 journal: bioRxiv DOI: 10.1101/2021.01.07.425716 sha: f413f3d59f859b1b473831d865eb7022fabacb2a doc_id: 703314 cord_uid: doelyoww
The pathogenesis of COVID-19 emerges as complex, with multiple factors leading to injury of different organs. Several studies on underlying cellular processes have produced contradictory claims, e.g. on SARS-CoV-2 cell entry or innate immune responses. However, clarity in these matters is imperative for therapy development. We therefore performed a meta-study with a diverse set of transcriptomes under infections with SARS-CoV-2, SARS-CoV and MERS-CoV, including data from different cells and COVID-19 patients. Using these data, we investigated viral entry routes and innate immune responses. First, our analyses support the existence of cell entry mechanisms for SARS-CoV and SARS-CoV-2 other than the ACE2 route, with evidence of inefficient infection of cells without expression of ACE2; expression of TMPRSS2/TMPRSS4 is unnecessary for efficient SARS-CoV-2 infection, with evidence of efficient infection of A549 cells transduced with a vector expressing human ACE2. Second, we find that innate immune responses in terms of interferons and interferon-stimulated genes are strong in relevant cells, for example Calu3 cells, but vary markedly with cell type, virus dose, and virus type.
Coronaviruses are non-segmented positive-sense RNA viruses with a genome of around 30 kilobases. The genome has a 5' cap structure along with a 3' poly(A) tail, which acts as mRNA SARS-CoV-2, emerging in late 2019 (6), which has caused many millions of confirmed cases and > 1 million deaths worldwide (7).
Infection with SARS-CoV, MERS-CoV or SARS-CoV-2 can cause a severe acute respiratory illness with similar symptoms, including fever, cough, and shortness of breath. SARS-CoV-2 is a new coronavirus, but its similarity to SARS-CoV (amino acid sequences about 76% identical (8)) and MERS-CoV suggests comparisons to these earlier epidemics. Despite the difference in the total number of cases caused by SARS-CoV and SARS-CoV-2 (3, 7) due to different transmission rates, the outbreak caused by SARS-CoV-2 resembles the outbreak of SARS: both emerged in winter and were linked to exposure to wild animals sold at markets. Although MERS-CoV has high morbidity and mortality rates, lack of autopsies from MERS-CoV cases has hindered our understanding of MERS-CoV pathogenesis in humans. To date, no specific anti-SARS-CoV-2, anti-SARS-CoV or anti-MERS-CoV therapeutics have been approved for human use. There are several points of attack for potential anti-SARS-CoV-2/SARS-CoV/MERS-CoV therapies, e.g. intervention on cell entry mechanisms to prevent virus invasion, or acting on the host immune system to kill the infected cells and thus prevent replication of the invading viruses. A better understanding of virus entry mechanisms and the immune responses can therefore guide the development of novel therapeutics.
Virus entry into host cells is the first step of the viral life cycle. It is an essential component of cross-species transmission and an important determinant of virus pathogenesis and infectivity (9, 10), and also constitutes an antiviral target for treatment and prevention (11). It seems that SARS-CoV and SARS-CoV-2 use similar virus entry mechanisms (12). Infection of target cells by SARS-CoV or SARS-CoV-2 was initially identified to occur by cell-surface membrane fusion (13, 14). Some later studies have shown that SARS-CoV can infect cells through receptor-mediated endocytosis (15, 16) as well.
Both mechanisms require the S protein of SARS-CoV or SARS-CoV-2 to bind to angiotensin converting enzyme 2 (ACE2), and the S protein of MERS-CoV to bind to dipeptidyl peptidase 4 (DPP4) (17), through their receptor-binding domain (RBD) (18). In addition to ACE2 and DPP4, some recent studies suggest that there are possible other coronavirus-associated receptors and factors that facilitate the infection of SARS-CoV-2 (19), including the cell surface proteins Basigin (BSG or CD147) (20) and CD209 (21). Recently, clinical data have revealed that SARS-CoV-2 can infect several organs where ACE2 expression could not be detected in healthy individuals (22, 23), which highlights the need for closer inspection of virus entry mechanisms. The binding of S protein to a cell-surface receptor is not sufficient for infection of the host cell (24). In the cell-surface membrane fusion mechanism, after binding to the receptor, the S protein requires proteolytic activation by cell surface proteases like TMPRSS2, TMPRSS4, or other members of the TMPRSS family (14, 25, 26), followed by the fusion of virus and target cell membranes. In the alternative receptor-mediated endocytosis mechanism, the endocytosed virion is subjected to an activation step in the endosome, resulting in the fusion of virus and endosome membranes and the release of the viral genome into the cytoplasm. The endosomal cysteine proteases cathepsin B (CTSB) and cathepsin L (CTSL) (27) might be involved in the fusion of virus and endosome membranes. Availability of these proteases in target cells largely determines whether viruses infect the cells through cell-surface membrane fusion or receptor-mediated endocytosis. How the presence of these proteases impacts the efficiency of infection with SARS-CoV-2, SARS-CoV and MERS-CoV still remains elusive.
When the virus enters a cell, it may trigger an innate immune response, a crucial component of the defense against viral invasion.
Compounds that regulate innate immune responses can be introduced as antiviral agents (10). The innate immune response is initiated when pattern recognition receptors (PRRs) such as Toll-like receptors (TLRs) and cytoplasmic retinoic acid-inducible gene I (RIG-I) like receptors (RLRs) recognize molecular structures of the invading virus (28, 29). This pattern recognition activates several signaling pathways and then downstream transcription factors such as interferon regulatory factors (IRFs) and nuclear factor κB (NF-κB). Transcriptional activation of IRFs and NF-κB stimulates the expression of type I (α or β) and type III (λ) interferons (IFNs). IFN-α (IFNA1, IFNA2, etc.), IFN-β (IFNB1) and IFN-λ (IFNL1-4) are important cytokines of the innate immune responses. IFNs bind and induce signaling through their corresponding receptors (IFNAR for IFN-α/β and IFNLR for IFN-λ), and subsequently induce expression of IFN-stimulated genes (ISGs) (e.g. MX1, ISG15 and OASL) and pro-inflammatory chemokines (e.g. CXCL8 and CCL2) to suppress viral replication and dissemination (30, 31). Dysregulated inflammatory host response results in acute respiratory distress syndrome (ARDS), a leading cause of COVID-19 mortality (32).
One attractive therapy option to combat COVID-19 is to harness the IFN-mediated innate immune responses. Clinical trials with type I and type III IFNs for treatment of COVID-19 have been conducted and many more are still ongoing (33, 34). In this regard, the kinetics of the secretion of IFNs in the course of SARS-CoV-2 infection needs to be defined. Unfortunately, some results on the host innate immune responses to SARS-CoV-2 are apparently at odds with each other (35-39), e.g. it is unclear whether SARS-CoV-2 infection induces low IFNs and moderate ISGs (35), or robust IFN responses and markedly elevated expression of ISGs (36-39). This has to be clarified.
The use of IFNs as a treatment in COVID-19 is now a subject of debate as well (40). Thus, the kinetics of IFN secretion relative to the kinetics of virus replication need to be thoroughly examined to better understand the biology of IFNs in the course of SARS-CoV-2 infection and thus provide guidance to identify the temporal window of therapeutic opportunity.
We have collected and analyzed a diverse set of publicly available transcriptome data (35, (Table 1 and Table 2). Using this collection, we systematically evaluated the replication and transcription status of virus in these cells, expression levels of coronavirus-associated receptors and factors, as well as the innate immune responses of these cells during virus infection.
Our analysis shows that the infection efficiency of viruses can be both cell type dependent and virus dose dependent (Fig. 1). MERS-CoV can efficiently infect MRC5 and Vero E6 cells. However, the infection efficiency is influenced strongly by MOI in the same type of cells. Cells infected at low MOI, e.g. 0.1, have significantly lower mapping rates than those infected at high MOI, e.g. 3 (Fig. 1). For SARS-CoV and SARS-CoV-2, the infection efficiency is influenced strongly by cell type. For SARS-CoV-2, there is efficient virus infection in A549-ACE2, Calu3, Caco2, and Vero E6 cells, but not in A549, H1299, or NHBE cells (Fig. 1 and Table S1). The mapping rates in A549, H1299, and NHBE cells are low even at high MOIs (Fig. 1 and Table S1). Similar to SARS-CoV-2, the infection by SARS-CoV is also cell type dependent: Vero E6 cells and Calu3 cells show high mapping rates to the SARS-CoV genome, but the mapping rates of SARS-CoV in MRC5 and H1299 cells are close to zero even at the high MOI of 3 (Fig. 1 and Table S1). Since "total RNA" (see Methods/Data collection) includes additional negative-strand templates of virus, the mapping rates are usually much higher than those obtained with the polyA+ selection method under the same conditions (Fig.
1 and Table S1).
To examine the detailed replication and transcription status of these viruses in the cells, we calculated the number of reads (depth) mapped to each site of the corresponding virus genome (Fig. 2). For better comparison, these read numbers were log10-transformed. The replication and transcription of MERS-CoV, SARS-CoV-2 and SARS-CoV share an uneven pattern of expression along the genome, typically with a minimum depth in the first half of the viral genome, and the maximum towards the end. Among the parts with very high levels, there are especially coding regions for structural proteins, including S, E, M, and N proteins, as well as the first coding regions with nsp1 and nsp2. Interestingly, there is an exception for BALF samples in COVID-19 patients, which show a more irregular, fluctuating behavior along the genome (Fig. 2B). The deviation from the cellular expression pattern is not surprising because BALF is not a well-organized tissue but a mixture of many components, some of which will probably digest viral RNA. Interestingly, the mentioned uneven transcription pattern of efficient infections with SARS-CoV-2, SARS-CoV, and MERS-CoV is also visible for inefficient infection with SARS-CoV-2 in A549, NHBE, and H1299 cells, and SARS-CoV in H1299 and MRC5 cells (Fig. 2C, D), although there the total mapping rates to their corresponding virus genomes are much lower (Fig. 1).
To further elucidate the corresponding entry mechanisms for different types of cells, we examined the expression levels of those receptors and proteases that have already been described as facilitating target cell infection (Fig. 3). Our analysis shows that MERS-CoV can efficiently infect MRC5 and Vero E6 cells (Fig. 1 and In A549, H1299, and MRC5 cells, which do express small amounts of SARS-CoV-2 and SARS-CoV virus (Fig. 1, Fig. 2C, D), there is no ACE2 expression at all (Fig. 3B).
This could point to an alternative ACE2-independent entry mechanism for SARS-CoV-2 and SARS-CoV.
As a virus enters a cell, it may trigger an innate immune response, i.e. the cell may start expression of various types of innate immunity molecules at different strengths. There is currently an intense debate about which of these molecules, especially IFNs and ISGs, are expressed and how strongly (35-39). We therefore focused in our analysis on innate immunity molecules such as IFNs, ISGs, and pro-inflammatory cytokines. To broaden the basis for conclusions, we analyzed, apart from cell lines, bulk RNA-Seq data of lung, PBMC, and BALF samples of COVID-19 patients, and single-cell RNA-Seq data of BALF samples from moderate and severe COVID-19 patients; for each type of patient data, we also included healthy controls. Gene expression was compared quantitatively in terms of TPM (transcripts per million), as well as log fold changes (logFC) with respect to healthy controls (human samples) or mock-infected cultures (cell lines) (Fig. S1, Fig. S2). (Fig. S8), which explains why there are almost no virus reads in these tissues. One of the two lung samples (accession number: SAMN14563387) has slightly upregulated IFNL1 (Fig. S6), which had been ignored in the original publication (35), although the total mapping rates to the virus genome are both 0.00% for these two lung samples. We then checked the detailed coverage along the virus genome. There were a small number of virus reads aligned to the SARS-CoV-2 genome in this sample (Fig. S7). Unlike other lung samples that did not express ACE2, this lung sample expressed ACE2 at a considerable level (5.45 TPM, Table S2). This result implies that when SARS-CoV-2 enters the lung successfully, or when the lung tissue chosen for sequencing is successfully infected by SARS-CoV-2, IFNs (at least IFNL1) can be upregulated. with SARS-CoV-2 at a high MOI of 2 have upregulated IFNB1, IFNL1, IFNL2 and IFNL3 (Fig.
4B-E). A549, H1299, NHBE (Fig. 4B-E), and MRC5 cells (Fig. S3)
One attractive potential anti-SARS-CoV-2 therapy is intervention in the cell entry mechanisms (12). However, the entry mechanisms of SARS-CoV-2 into human cells are partly unknown. During the last few months scientists have confirmed that SARS-CoV-2 and SARS-CoV both use human ACE2 as entry receptor, and human proteases like TMPRSS2 and TMPRSS4 (8, have shown that SARS-CoV-2 can infect several organs where ACE2 expression could not be detected (22, 23), urging us to explore other potential entry routes.
First, our analyses here have shown that even without expression of TMPRSS2 or TMPRSS4, high SARS-CoV-2 infection efficiency in cells is possible (Fig. 1A, C) with considerable expression levels of CTSB and CTSL (Fig. 2E, F). This suggests receptor-mediated endocytosis (15, 16, 27) as an alternative major entry mechanism. Given this TMPRSS-independent route, TMPRSS inhibitors will likely not provide complete protection. The studies designed to predict the tropism of SARS-CoV-2 by profiling the expression levels of ACE2 and TMPRSS2 across healthy tissues (48, 49) may need to be reconsidered as well.
Second, the evidence presented in our study suggests a further, possibly undiscovered, entry mechanism for SARS-CoV-2 and SARS-CoV (Fig. 2). Although BSG/CD147 has been recently proposed as an alternate receptor (20), later experiments reported there was no evidence supporting the role of BSG/CD147 as a putative spike-binding receptor (50). The expression patterns of BSG/CD147 in different types of cells observed in our study could not explain the difference in virus loads observed in these cells either. CD209 and CD209L were recently reported as attachment factors that contribute to SARS-CoV-2 infection in human cells as well (21). However, CD209 expression in the cell lines included here is low.
Another reasonable hypothesis is that the inefficient ACE2-independent entry mechanism we observed could be macropinocytosis, an endocytic pathway that does not require receptors (51). Until now there is still no direct evidence for macropinocytosis involvement in the SARS-CoV-2 and SARS-CoV entry mechanism. To confirm such an involvement, specific experiments are needed. Moreover, this ACE2-independent entry mechanism only enables inefficient infection by SARS-CoV and SARS-CoV-2 (Fig. 2) and therefore cannot be a major entry mechanism.
Another attractive potential anti-SARS-CoV-2 point of attack is supporting the human innate immune system to kill the infected cells and thus disrupt viral replication. Not surprisingly, research in this area is flourishing but sometimes generates conflicting results, especially on the involvement of type I and III IFNs and ISGs (35-39). The results of our analyses could help to resolve the confusion on the involvement of IFNs and ISGs. We found that immune responses in Calu3 cells infected with SARS-CoV and SARS-CoV-2 resemble those of BALF samples of moderate and severe COVID-19 patients, with elevated levels of type I and III IFNs, robust ISG induction as well as markedly elevated pro-inflammatory cytokines, in agreement with recent studies (36-39). However, this picture differs from the one reported by (35) with low levels of IFNs and moderate ISGs. This latter study was partially based on A549 cells and NHBE cells with nearly no ACE2 expression and very low mapping rate to the viral genome, and lung samples of two patients (both show 0.00% mapping rate to the virus genome). Hence, given that there was no efficient virus infection in these cells, the low levels of IFNs and ISGs were to be expected. However, in one of the lung samples sequenced by (35) (accession number: SAMN14563387), we observed a slight upregulation of IFNL1 (Fig.
S6), which was ignored in the original publication, together with considerable ACE2 expression (5.45 TPM, Table S2) and a few virus reads aligned to the SARS-CoV-2 genome (Fig. S7). This result suggests that levels of IFNs and ISGs are associated with viral load and severity of virus infection. We found low induction of IFNs and moderate expression of ISGs in PBMC samples and BALF samples of COVID-19 patients (Fig. 4, Fig. S5). In these PBMC samples, there are no
After the successful release of the virus genome into the cytoplasm, a negative-strand genomic-length RNA is synthesized as the template for replication. Negative-strand subgenome-length mRNAs are formed as well from the virus genome as discontinuous RNAs, and used as the templates for transcription. In the public data we collected for the analysis, there are two main library preparation methods to remove the highly abundant ribosomal RNAs (rRNA) from total RNA before sequencing. One is polyA+ selection, the other is rRNA-depletion (56). It is known that coronavirus genomic and subgenomic mRNAs carry a polyA tail at their 3' ends, so in the polyA+ RNA-Seq, we have (1) virus genomic sequence from virus replication, i.e. replicated genomic RNAs from negative-strand as template, and (2) subgenomic mRNAs from virus transcription; in the rRNA-depletion RNA-Seq we have (1) virus genomic sequence from virus replication: both replicated genomic RNAs from negative-strand as template and the negative-strand templates themselves, and (2) subgenomic mRNAs from virus transcription. PolyA+ selection was used if not specifically stated in this study; "total RNA" is used to specify that the rRNA-depletion method was used to prepare the sequencing libraries. The workflow of this study is summarized in Fig. S1 and was then calculated for each condition. The single cell RNA-Seq data were summarized across all cells to obtain "pseudo-bulk" samples.
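The pseudo-bulk step can be illustrated with a minimal Python sketch. It assumes that "summarized across all cells" means summing raw counts per gene over all cells of a sample; the text does not state the aggregation function. The data layout, the function name pseudo_bulk, and the toy counts are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def pseudo_bulk(counts, cell_to_sample):
    """Sum per-cell gene counts into one count profile per sample.

    counts: dict mapping cell barcode -> {gene: count} (hypothetical layout)
    cell_to_sample: dict mapping cell barcode -> sample label
    Assumption: aggregation is a plain sum of raw counts.
    """
    bulk = defaultdict(lambda: defaultdict(int))
    for cell, gene_counts in counts.items():
        sample = cell_to_sample[cell]
        for gene, n in gene_counts.items():
            bulk[sample][gene] += n
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {sample: dict(genes) for sample, genes in bulk.items()}


# Toy data: three cells from two samples (values are made up).
cells = {
    "cell1": {"ACE2": 0, "MX1": 3},
    "cell2": {"ACE2": 1, "MX1": 5},
    "cell3": {"MX1": 2, "ISG15": 4},
}
labels = {"cell1": "moderate", "cell2": "moderate", "cell3": "severe"}

profiles = pseudo_bulk(cells, labels)
print(profiles)
```

The resulting per-sample count profiles can then be handled like bulk RNA-Seq samples in the downstream steps.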
R packages EDASeq (66) and org.Hs.eg.db (67) were used to obtain gene length, and TPM was calculated with the "calculateTPM" function of R package scater (68). logFC was then calculated for each patient.
The clean RNA-Seq data were also aligned to the virus genome with Bowtie 2 (69) (version 2.2.6) to create aligned BAM files and to obtain the mapping rates to the virus genomes. SAMtools (70) (version 1.5) was then used for sorting and indexing the aligned BAM files. The "samtools depth" command was used to produce the number of aligned reads per site along the virus genome.
The heatmap in Fig. 3I was made with the pheatmap R package (71); the "complete" clustering method was used for clustering the rows and "euclidean" distance was used to measure the cluster distance. The heatmap in Fig. 4A was made with the ComplexHeatmap R package (72); the "complete" clustering method was used for clustering the rows and columns and "euclidean" distance was used to measure the cluster distance.
Fig. 1.
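As a rough illustration of the TPM and logFC quantities used in this study, the following Python sketch computes length-normalized TPM values from raw counts and a log fold change against a control. It is not the scater "calculateTPM" implementation. The log base (2) and the pseudocount (1) are assumptions, since the text does not specify them, and the gene names, lengths, and counts are made up.

```python
import math


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize counts, then scale to 1e6."""
    # Reads per kilobase of gene length.
    rates = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    total = sum(rates.values())
    return {g: r / total * 1e6 for g, r in rates.items()}


def log_fc(tpm_case, tpm_control, pseudocount=1.0):
    """log2 fold change of case over control.

    Assumption: a pseudocount of 1 avoids division by zero for genes
    that are not expressed; the source does not state its choice.
    """
    return {
        g: math.log2((tpm_case[g] + pseudocount) / (tpm_control[g] + pseudocount))
        for g in tpm_case
    }


# Hypothetical gene lengths (bp) and raw counts.
lengths = {"geneA": 1000, "geneB": 2000}
infected = {"geneA": 10, "geneB": 20}
mock = {"geneA": 30, "geneB": 20}

tpm_infected = tpm(infected, lengths)
tpm_mock = tpm(mock, lengths)
lfc = log_fc(tpm_infected, tpm_mock)
print(tpm_infected, tpm_mock, lfc)
```

Because TPM values sum to one million within each sample, a change in one gene shifts the TPM of all others, which is one reason to inspect TPM and logFC side by side.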
ACE2 BSG CD209 CTSB CTSL DPP4 TMPRSS11A TMPRSS11B TMPRSS11D TMPRSS11E TMPRSS11F TMPRSS13 TMPRSS15 TMPRSS2 TMPRSS3 TMPRSS4 TMPRSS5 TMPRSS6 IFIH1 DHX58 TLR1 TLR2 TLR3 TLR4 TLR5 TLR6 TLR7 TLR8 TLR10 IRF1 IRF2 IRF3 IRF4 IRF5 IRF6 IRF7 IRF8 IRF9 TBK1 NFKB1 NFKB2 IFNA14 IFNA2 IFNA6 IFNA8 IFNB1 IFNE IFNG IFNK IFNL1 IFNW1 IFNAR1 IFNGR1 IFNGR2 IFNLR1 JAK1 JAK2 JAK3 TYK2 STAT1 STAT2 STAT3 STAT4 STAT5A STAT5B STAT6 ISG15 ISG20 ISG20L2 MX1 OAS1 OAS2 OAS3 OASL IFIT1 IFIT1B IFIT2 IFIT3 IFIT5 IFITM5 IFITM10 CCL2 CCL5 CCL16 CCL20 CCL24 CCL26 CCL27 CCL28 CXCL1 CXCL2 CXCL3 CXCL6 CXCL8 CXCL9 CXCL10 CXCL11 CXCL12 CXCL13 CXCL14 CXCL16
Newly discovered coronavirus as the primary cause of severe acute respiratory syndrome
The life cycle of SARS coronavirus in Vero E6 cells
SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor
Nabel, pH-dependent entry of severe acute respiratory syndrome coronavirus is mediated by the spike glycoprotein and enhanced by dendritic cell transfer through DC-SIGN
SARS coronavirus entry into host cells through a novel clathrin- and caveolae-independent endocytic pathway
Host determinants of MERS-CoV transmission and pathogenesis
Structure, function, and evolution of coronavirus spike proteins
A single-cell RNA expression map of human coronavirus entry factors
SARS-CoV-2 invades host cells via a novel route: CD147-spike protein
CD209L/L-SIGN and CD209/DC-SIGN act as receptors for SARS-CoV-2 and are differentially expressed in lung and kidney epithelial and endothelial cells
The protein expression profile of ACE2 in human tissues
SARS-CoV-2 viral load in upper respiratory specimens of infected patients
Characterization of severe acute respiratory syndrome-associated coronavirus (SARS-CoV) spike glycoprotein-mediated viral entry
TMPRSS2 and TMPRSS4 promote SARS-CoV-2 infection of human small intestinal enterocytes
TMPRSS11A activates the influenza A virus hemagglutinin and the MERS coronavirus spike protein and is insensitive against blockade by HAI-1
Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV
Immune signaling by RIG-I-like receptors
The role of Toll-like receptors in the host response to viruses
Post-translational control of intracellular pathogen sensing pathways
Type I and type III interferons - induction, signaling, evasion, and application to combat COVID-19
Clinical predictors of mortality due to COVID-19 based on an analysis of data of 150 patients from Wuhan, China
Triple combination of interferon beta-1b, lopinavir-ritonavir, and ribavirin in the treatment of patients admitted to hospital with COVID-19: an open-label, randomised
COVID-19: lambda interferon against viral load and hyperinflammation
Heightened innate immune responses in the respiratory tract of COVID-19 patients
Type III interferons disrupt the lung epithelial barrier upon viral recognition
Viral invasion and type I interferon response characterize the immunophenotypes during COVID-19 infection
Single-cell landscape of immunological responses in patients with COVID-19
Type 1 interferons as a potential treatment against COVID-19
Bulk and single-cell gene expression profiling of SARS-CoV-2 infected human cell lines identifies molecular targets for therapeutic intervention
Transcriptomic characteristics of bronchoalveolar lavage fluid and peripheral blood mononuclear cells in COVID-19 patients. Emerging Microbes & Infections 9
Obesity and disease severity magnify disturbed microbiome-immune interactions in asthma patients
Single-cell landscape of bronchoalveolar immune cells in patients with COVID-19
Proliferating SPP1/MERTK-expressing macrophages in idiopathic pulmonary fibrosis
Angiotensin-converting enzyme 2 (ACE2) is a key modulator of the renin angiotensin system in health and disease
Inhibitors of cathepsin L prevent severe acute respiratory syndrome coronavirus entry
SARS-CoV-2 receptor ACE2 and TMPRSS2 are primarily expressed in bronchial transient secretory cells
Expression of ACE2 and TMPRSS2 proteins in the upper and lower aerodigestive tracts of rats
No evidence for basigin/CD147 as a direct SARS-CoV-2 spike binding receptor
Virus entry by macropinocytosis
Candidate drugs against SARS-CoV-2 and COVID-19
Chemokine up-regulation in SARS-coronavirus-infected, monocyte-derived human dendritic cells
Cytokine responses in severe acute respiratory syndrome coronavirus-infected macrophages in vitro: possible relevance to pathogenesis
SARS-coronavirus replicates in mononuclear cells of peripheral blood (PBMCs) from SARS patients
Comparison of RNA-seq by poly (A) capture, ribosomal RNA depletion, and DNA microarray for expression profiling
Database resources of the National Center for Biotechnology Information
The European Nucleotide Archive
Database resources of the National Genomics Data Center in 2020
FastQC: a quality control tool for high throughput sequence data
Trimmomatic: a flexible trimmer for Illumina sequence data
Near-optimal probabilistic RNA-seq quantification
Differential analysis of RNA-seq incorporating quantification uncertainty
GC-content normalization for RNA-seq data
Genome wide annotation for human. R package version
Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R
Fast gapped-read alignment with Bowtie 2
The sequence alignment/map format and SAMtools
pheatmap: Pretty Heatmaps
Complex heatmaps reveal patterns and correlations in multidimensional genomic data
Acknowledgements: The authors thank professor Ke Xu from Wuhan University and professor
(A-H) Each dot represents the expression value in each sample. (I) Heatmap of the expression levels of coronavirus associated receptors and factors of different cell types. Labels 1a, 1b, 1c mark cell clusters that likely share entry routes sketched in panel J. (J) Entry mechanisms involved in SARS-CoV-2 entry into cells