key: cord-0872392-tkyngspy
authors: Qi, Furong; Qian, Shen; Zhang, Shuye; Zhang, Zheng
title: Single cell RNA sequencing of 13 human tissues identify cell types and receptors of human coronaviruses
date: 2020-03-19
journal: Biochem Biophys Res Commun
DOI: 10.1016/j.bbrc.2020.03.044
sha: 5f6a4c37c7c20e77dc239f09da7b00df8a07ed10
doc_id: 872392
cord_uid: tkyngspy

Abstract

The novel coronavirus (SARS-CoV-2) outbreak that began in December 2019 in Wuhan, Hubei, China, has been declared a global public health emergency. Angiotensin I converting enzyme 2 (ACE2) is the host receptor used by SARS-CoV-2 to infect human cells. Although ACE2 is reported to be expressed in lung, liver, stomach, ileum, kidney and colon, its expression levels are rather low, especially in the lung. SARS-CoV-2 may use co-receptors or auxiliary proteins as ACE2 partners to facilitate virus entry. To identify potential candidates, we explored a single cell gene expression atlas comprising 119 cell types from 13 human tissues and analyzed the single cell co-expression spectrum of 51 reported RNA virus receptors and 400 other membrane proteins. Consistent with other recent reports, we confirmed that ACE2 was mainly expressed in lung AT2 cells, liver cholangiocytes, colon colonocytes, esophagus keratinocytes, ileum ECs, rectum ECs, stomach epithelial cells, and kidney proximal tubules. Intriguingly, we found that the candidate co-receptors showing the most similar expression patterns to ACE2 across the 13 human tissues are all peptidases, including ANPEP, DPP4 and ENPEP. Among them, ANPEP and DPP4 are known receptors for human CoVs, suggesting ENPEP as another potential receptor for human CoVs. We also conducted "CellPhoneDB" analysis to understand the cell crosstalk between CoV targets and their surrounding cells across different tissues.
We found that macrophages frequently communicate with the CoV targets through chemokine and phagocytosis signaling, highlighting the importance of tissue macrophages in immune defense and immune pathogenesis.

In December 2019, a novel coronavirus (SARS-CoV-2) infection emerged in Wuhan. More than 80 thousand people had been infected with SARS-CoV-2 by March 6, showing that SARS-CoV-2 is highly contagious. Coronaviruses are single-stranded RNA (ssRNA) viruses [1], including the well-known Middle East respiratory syndrome coronavirus (MERS-CoV) and severe acute respiratory syndrome coronavirus (SARS-CoV). The symptoms caused by SARS-CoV-2 infection include acute respiratory distress syndrome (~29%), acute cardiac injury (~12%) and acute kidney injury (~7%) [2], implying that SARS-CoV-2 may infect various human tissues. Viruses bind to host receptors on the target cell surface to establish infection. Membrane fusion mediated by membrane proteins allows the entry of enveloped viruses [3]. As recently reported, both SARS-CoV-2 and SARS-CoV can use the ACE2 protein to gain entry into cells [4, 5]. Since the outbreak, many data analyses have shown a wide distribution of ACE2 across human tissues, including lung, liver, stomach, ileum, colon and kidney [6], indicating that SARS-CoV-2 may infect multiple organs. However, these data showed that AT2 cells (the main target cells of SARS-CoV-2) in the lung actually express rather low levels of ACE2 [6]. Hence, SARS-CoV-2 may depend on co-receptors or other auxiliary membrane proteins to facilitate its infection. It is reported that viruses tend to hijack co-expressed proteins as their host factors [7]. For example, Hoffmann et al. recently showed that SARS-CoV-2-S uses ACE2 for entry and depends on the cellular protease TMPRSS2 for priming [5], showing that SARS-CoV-2 infection also requires multiple factors. Understanding receptor usage by viruses could facilitate the development of intervention strategies.
Therefore, identifying the potential co-receptors or auxiliary membrane proteins for SARS-CoV-2 is of great significance. For this purpose, we collected single cell gene expression matrices from 13 relatively normal human tissues, consisting of lung [8], liver [9], ileum [10], rectum [10], blood [11], bone marrow [12], skin [13], spleen [14], esophagus [14], colon [15], eye [16], stomach [17] and kidney [18], from the published literature. We analyzed the single cell co-expression profiles of 51 known ssRNA viral receptors and 400 membrane proteins, including ACE2, in the 119 identified cell types across the 13 human tissues. We then ran "CellPhoneDB" to identify immune cells that frequently crosstalk with CoV target cells in multiple tissues.

The raw gene counts or normalized gene expression matrix for each single cell were downloaded from the GEO (https://www.ncbi.nlm.nih.gov/geo/) or Human Cell Atlas (https://www.humancellatlas.org) database (Table S1). In total, we collected single cell gene expression data for 13 tissues: liver, lung, colon, ileum, rectum, blood, spleen, bone marrow, eye, skin, stomach, esophagus and kidney. The data sources and sample information are as follows. Liver, GEO Accession No. GSE115469, 5 normal human donors; Lung, GEO Accession No. GSE130148, 4 human donors who died from hypoxic brain damage; Colon, GEO Accession No. GSE116222, 3 healthy volunteers; Ileum and rectum, GEO Accession No. GSE125970, a total of 4 intestinal mucosae sampled at least 10 cm away from the tumor border; Skin, GEO Accession No. GSE132802, 4 healthy volunteers; PBMC, GEO Accession No. GSE136103, 4 samples from cirrhotic patients; Spleen and esophagus, available from the Human Cell Atlas, a total of 11 cardiac death donors; Bone marrow, GEO Accession No. GSE120221, 5 healthy donors (A, E, J, R, U); Eye, GEO Accession No. GSE135922, 3 macular and 3 peripheral samples from human donor eyes; Stomach, GEO Accession No. GSE134520, 3 non-atrophic gastritis patients; Kidney, GEO Accession No. GSE131685, 3 normal kidney tissues obtained at least 2 cm away from tumor tissue.

The high-quality virus-host receptor interactions were downloaded from the Viral Receptor database (http://www.computationalbiology.cn:5000/viralReceptor), which curates 152 pairs of mammalian virus-host receptor interactions and 51 virus receptors from 9 mammal species. The membrane proteins were extracted from the Membranome database (https://membranome.org).

The raw count matrix (UMI counts per gene per cell) was processed with Seurat [19]. Cells with fewer than 100 expressed genes (UMI count > 0) and cells with more than 25% mitochondrial genome transcripts were removed. Genes expressed in fewer than three cells were removed. We then normalized the gene expression data using the "NormalizeData" function with default settings. Sources of cell-cell variation driven by batch were regressed out using the number of detected UMIs and mitochondrial gene expression, as implemented in the "ScaleData" function. The corrected expression matrix was used for cell clustering and dimensional reduction.

Cell clustering and dimensional reduction were performed with the Seurat package. Beforehand, we chose 2000 highly variable genes (HVGs) from the corrected expression matrix using the "FindVariableGenes" function in the Seurat package, and then centered and scaled them. We then performed principal component analysis (PCA) on the HVGs using the "RunPCA" function. To improve the signal-to-noise ratio, we selected the significant principal components with the "JackStraw" function, which is based on a permutation test. Specifically, we first computed 50 principal components and then selected the significant components for further analysis according to the p-values produced by the "ScoreJackStraw" function. Batch effects were removed with the harmony package [20].
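The filtering and normalization steps described above can be illustrated with a minimal sketch. It is not the Seurat code used in the study: the function names `qc_filter` and `log_normalize`, the nested-dictionary matrix layout, and the toy counts are all hypothetical, and mitochondrial genes are assumed to be recognizable by an "MT-" prefix. The thresholds are the ones stated in the text; applying the gene filter to the retained cells, and removing a cell when either cell criterion fails, are assumptions. The normalization follows Seurat's documented default for "NormalizeData" (log1p of counts scaled to 10,000 per cell).

```python
import math


def qc_filter(counts, min_genes=100, max_mito=0.25, min_cells=3):
    """Filter a toy UMI matrix given as {cell: {gene: count}}.

    A cell is dropped if it has fewer than `min_genes` expressed genes
    (count > 0) or a mitochondrial fraction above `max_mito`. A gene is then
    dropped if it is expressed in fewer than `min_cells` retained cells.
    """
    kept = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        n_expressed = sum(1 for c in genes.values() if c > 0)
        # Assumption: mitochondrial genes carry the "MT-" prefix.
        mito = sum(c for g, c in genes.items() if g.startswith("MT-"))
        if n_expressed < min_genes or total == 0 or mito / total > max_mito:
            continue
        kept[cell] = genes

    # Count, for every gene, the retained cells in which it is expressed.
    cells_per_gene = {}
    for genes in kept.values():
        for g, c in genes.items():
            if c > 0:
                cells_per_gene[g] = cells_per_gene.get(g, 0) + 1
    keep_genes = {g for g, n in cells_per_gene.items() if n >= min_cells}

    return {cell: {g: c for g, c in genes.items() if g in keep_genes}
            for cell, genes in kept.items()}


def log_normalize(cell_counts, scale_factor=10000):
    """Per-cell log-normalization: log1p(count / total * scale_factor)."""
    total = sum(cell_counts.values())
    if total == 0:
        return {g: 0.0 for g in cell_counts}
    return {g: math.log1p(c / total * scale_factor)
            for g, c in cell_counts.items()}


# Toy example with lowered thresholds so that four cells are enough.
toy = {
    "c1": {"ACE2": 3, "DPP4": 1, "MT-CO1": 1},
    "c2": {"ACE2": 1, "DPP4": 0, "MT-CO1": 0},              # too few genes
    "c3": {"ACE2": 1, "DPP4": 1, "MT-CO1": 2},              # 50% mitochondrial
    "c4": {"ACE2": 2, "DPP4": 2, "MT-CO1": 0, "ENPEP": 1},
}
filtered = qc_filter(toy, min_genes=2, max_mito=0.25, min_cells=2)
normalized = {cell: log_normalize(genes) for cell, genes in filtered.items()}
```

With the thresholds from the text (`min_genes=100`, `max_mito=0.25`, `min_cells=3`) the same function would apply to a real count matrix converted to this layout, although a sparse-matrix implementation would be preferable at that scale.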
Cells were then clustered with the "FindClusters" function, which embeds cells into a graph structure in PCA space. We set the resolution parameter to 0.8 to identify only major cell types, e.g. T cells, B cells or macrophages. The clustered cells were then projected onto a two-dimensional space using the "RunUMAP" function. The clustering results were visualized with the "DimPlot" function.

To annotate cell clusters, we first identified the differentially expressed genes of each cluster with the "FindMarkers" function. The cell clusters were then annotated according to curated known cell markers (Fig. S1). Cell clusters that consistently expressed the same cell markers were merged.

We conducted cell-cell interaction analysis using the cellphonedb function provided by the CellPhoneDB database [21]. Significant cell-cell interactions were selected with p-value < 0.01.

We collected single cell RNA sequencing data (raw count gene expression matrices or normalized gene expression matrices) from the published literature, which have been deposited in public databases, e.g. GEO (https://www.ncbi.nlm.nih.gov/geo/) or Human Cell Atlas (https://www.humancellatlas.org). In total, we curated single cell gene expression matrices of 13 human tissues, including lung [8], liver [9], ileum [10], rectum [10], blood [11], bone marrow [12], skin [13], spleen [14], esophagus [14], colon [15], eye [16], stomach [17] and kidney [18] (Table S1). For each tissue, we performed cell clustering and dimension reduction on the scaled gene expression matrix using the Seurat package [19]. After filtering out low quality cells, we obtained 8443; 43,474; 4248; 5282; 3279; 30,693; 97,695; 17,131; 4335; 11,552; 4871; 8880; and 20,197 cells from liver, lung, colon, ileum, rectum, blood, spleen, bone marrow, eye, skin, stomach, esophagus and kidney, respectively (Table S1). The cell clusters were then annotated using canonical markers collected from published articles (Fig. S1).
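The annotation and merging logic can be illustrated with a small sketch. In the study the clusters were annotated against curated markers after running "FindMarkers"; the scoring rule below (pick the cell type whose markers have the highest mean expression in the cluster) is a simplified stand-in, not the authors' procedure. The function names, the marker table (CD3E, CD79A and CD68 as commonly used markers for T cells, B cells and macrophages) and the expression values are assumptions chosen for illustration.

```python
def annotate_clusters(cluster_means, markers):
    """Assign each cluster the cell type with the highest mean marker expression.

    cluster_means: {cluster_id: {gene: mean expression in that cluster}}
    markers:       {cell_type: [marker genes]}
    """
    labels = {}
    for cluster, expr in cluster_means.items():
        def score(cell_type):
            genes = markers[cell_type]
            return sum(expr.get(g, 0.0) for g in genes) / len(genes)
        # sorted() makes tie-breaking deterministic (alphabetical order).
        labels[cluster] = max(sorted(markers), key=score)
    return labels


def merge_by_label(labels):
    """Group clusters that received the same cell-type label."""
    merged = {}
    for cluster, cell_type in labels.items():
        merged.setdefault(cell_type, []).append(cluster)
    return {cell_type: sorted(clusters) for cell_type, clusters in merged.items()}


# Hypothetical marker table and per-cluster mean expression values.
markers = {"T cells": ["CD3E"], "B cells": ["CD79A"], "Macrophages": ["CD68"]}
cluster_means = {
    0: {"CD3E": 2.1, "CD79A": 0.1, "CD68": 0.0},
    1: {"CD3E": 0.2, "CD79A": 1.8, "CD68": 0.1},
    2: {"CD3E": 1.7, "CD79A": 0.0, "CD68": 0.2},
    3: {"CD3E": 0.0, "CD79A": 0.1, "CD68": 2.5},
}
labels = annotate_clusters(cluster_means, markers)
cell_types = merge_by_label(labels)
```

In this toy input clusters 0 and 2 both score highest for the T cell marker, so they end up merged under one cell type, mirroring the merging step described in the text.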
We finally annotated 119 cell types from 13 human tissues. The lung belongs to the respiratory system, and 13 cell types were identified in it (Fig. S2). These cell types consist of macrophages, Alveolar

The eyes are sensory organs of the nervous system. Ten cell types were identified in the eyes (Fig. S2). Fibroblasts and immune cells made up ~31% and ~25% of total cells, respectively. The eyes also contain endothelial cells, melanocytes, pericytes, retinal pigment epithelium cells (RPEs) and Schwann cells. Bone marrow is the primary site of hematopoiesis. We identified large numbers of NK/NKT cells (~44%) and erythrocytes (~28%) in bone marrow (Fig. S2). B cells, hematopoietic stem cells, MK progenitors, monocytes, neutrophils and DCs were also detected in bone marrow. Blood circulates through the various tissues. Monocytes (~32%) and T cells (~55%) make up the largest proportions of blood cells (Fig. S2). In addition, we identified B cells, cDCs, macrophages, NK cells, pDCs and platelets in blood.

In the viral life cycle, viruses first bind to host receptors on the cell surface. Hence, the distribution of viral receptors in different cell types of diverse tissues can reveal viral tropism and potential transmission routes. We therefore explored the expression spectrum of host receptors. We first analyzed the expression pattern of ACE2 across the 13 tissues (Fig. 1). Our results reveal that ACE2 is expressed in lung AT2 cells, liver cholangiocytes, colon colonocytes, esophagus keratinocytes, ileum ECs, rectum ECs, stomach epithelial cells, and kidney proximal tubules, consistent with recent reports [6]. However, ACE2 expression levels are rather low in lung AT2 cells (4.7-fold lower than the average expression level of all ACE2-expressing cell types). We hypothesize that the presence of co-receptors or other auxiliary membrane proteins in AT2 cells may facilitate the binding and entry of the nCoV.
We then analyzed the co-expression features of the human ssRNA viral receptors and membrane proteins. We collected a total of 152 pairs of high quality virus-host receptor interactions from the Viral Receptor database [7], which contain 51 host receptors in 9 hosts and 96 viruses (Table S2). Furthermore, 400 membrane proteins were extracted from the Membranome database [22]. In total, 451 genes were curated, 95.7% (432/451) of which are expressed in at least one of the 13 tissues. To explore the potential relationship between ACE2 and other membrane proteins or viral receptors, we calculated the Pearson correlation coefficient between every pair of genes in the curated set. The results show that the expression of 94 genes is significantly correlated with that of ACE2 (P < 0.01). Of note, ANPEP, ENPEP and DPP4 are the top three genes correlated with ACE2 (R > 0.8) (Fig. 2).

ANPEP, alanyl aminopeptidase, is a host receptor targeted by porcine epidemic diarrhoea virus, human coronavirus 229E, feline coronavirus, canine coronavirus, transmissible gastroenteritis virus and infectious bronchitis virus. These viruses all belong to the Coronaviridae. ANPEP is mainly expressed in colon, ileum, rectum, kidney, liver and skin (Figs. S3 and S4), suggesting that coronavirus receptors may have similar expression profiles in the human body. ENPEP, glutamyl aminopeptidase, belongs to the peptidase M1 family, a group of mammalian type II integral membrane zinc-containing endopeptidases. ENPEP regulates blood pressure and blood vessel formation through the catabolic pathway of the renin-angiotensin system [23]. The relationship between ENPEP and viral infection is unknown. DPP4, the receptor of MERS-CoV, shows an expression pattern similar to that of ACE2, except that DPP4 is also expressed in some T cells of all the observed tissues (Figs. S3 and S4). All three genes encode peptidases, which are uniquely adopted by coronaviruses as their receptors [24].
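The correlation screen can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that each gene is summarized by one expression value per cell type and that genes are ranked by their Pearson correlation with ACE2. The function names and the toy expression profiles (four hypothetical cell types) are invented for the example, and the significance test (P < 0.01) is not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant profile
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(profiles, target="ACE2"):
    """Rank all other genes by their correlation with `target`, highest first."""
    reference = profiles[target]
    scored = []
    for gene, values in profiles.items():
        if gene == target:
            continue
        r = pearson(reference, values)
        if not math.isnan(r):
            scored.append((gene, r))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical mean expression per cell type (one value per cell type).
profiles = {
    "ACE2":  [0.1, 0.9, 0.8, 0.0],
    "ANPEP": [0.2, 1.0, 0.7, 0.1],
    "CD3E":  [0.9, 0.0, 0.1, 0.8],
}
ranked = rank_by_correlation(profiles)
```

In this toy input the ANPEP profile tracks ACE2 closely and ranks first, while the CD3E profile is anti-correlated and ranks last; genes with R above a chosen cutoff (0.8 in the text) would be retained as candidates.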
This result raised the possibility that ENPEP may be another, as yet unknown, receptor for coronaviruses. To further consolidate the findings, we calculated the Euclidean distance between all the curated proteins and constructed their hierarchical relationships across the 119 cell types. DPP4 was the first gene clustered with ACE2. Together, our data demonstrate that coronavirus receptors tend to share co-expression patterns across different tissues, consistent with the fact that CoVs infect similar types of cells and CoV-infected patients share similar clinical symptoms.

Virus-infected cells can recruit and modulate immune cells by secreting chemokines or other cytokines. We sought to identify potential immune cells crosstalking with CoV-targeted cells. The cell-cell interaction analysis was conducted with CellPhoneDB [21]. Interactions with p-value < 0.01 were used to construct the interaction relationships between cell types in each tissue. Using the cell types expressing ACE2 as ligand-secreting cells, we calculated the total number of interactions with each receptor-expressing cell type. As a result, we found that macrophages showed the most active interaction with ACE2-expressing cells in liver, lung and stomach (Fig. 3A), sharing the CD74-MIF signaling pair (Fig. 3B). CD74 is expressed on the cell surface of antigen-presenting cells and acts as a receptor for the cytokine MIF in immune cells. MIF, macrophage migration inhibitory factor, is a proinflammatory cytokine participating in inflammatory and immune responses. In addition, PROs, SCs and TAs showed high activity in response to the ACE2-expressing cells in ileum and rectum. In colon, ILCs were found to interact frequently with the ACE2-expressing cells. Glomerular parietal epithelial cells in kidney and epithelial basal cells in esophagus also interacted with the cells transcribing ACE2 at very high frequency.
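The interaction counting described above can be sketched in a few lines. The record layout used here (source cell type, partner cell type, interacting pair, p-value) is a hypothetical simplification and not the actual CellPhoneDB output format; the function name, cell type labels and p-values are likewise invented for illustration. Only the p < 0.01 cutoff is taken from the text.

```python
from collections import Counter


def count_partner_interactions(records, source_cell_type, alpha=0.01):
    """Count significant interactions from one source cell type to each partner.

    records: iterable of (source, partner, interacting_pair, p_value) tuples.
    """
    counts = Counter()
    for source, partner, _pair, p_value in records:
        if source == source_cell_type and p_value < alpha:
            counts[partner] += 1
    return counts


# Hypothetical records for one tissue; values are illustrative only.
records = [
    ("AT2", "Macrophage", "MIF_CD74", 0.001),
    ("AT2", "Macrophage", "pair_B", 0.004),
    ("AT2", "T cell", "pair_C", 0.008),
    ("AT2", "T cell", "pair_D", 0.200),          # not significant
    ("Fibroblast", "Macrophage", "pair_E", 0.001),  # different source
]
counts = count_partner_interactions(records, "AT2")
top_partner, top_count = counts.most_common(1)[0]
```

Repeating this count per tissue, with the ACE2-expressing cell type as the source, gives the kind of per-tissue ranking of partner cell types summarized in Fig. 3A.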
We conclude that the nCoV-targeted (ACE2-expressing) cells can interact with various cell types in different tissues, especially macrophages in lung, liver and stomach. Macrophages may be recruited by nCoV-targeted cells through the CD74-MIF interaction and other signaling pathways during infection, and may play both defensive and destructive roles.

The coronaviruses are a large family of ssRNA viruses causing respiratory diseases in humans. Most coronaviruses are associated with mild clinical symptoms, except SARS-CoV and MERS-CoV, which show fatality rates of 9.6% and 34%, respectively [25, 26]. In late December 2019, a novel coronavirus, named SARS-CoV-2, emerged in Wuhan, Hubei, China, and a total of 80,710 SARS-CoV-2 infected cases had been confirmed by March 06, 2020. The phylogenetic tree constructed from full-genome sequences indicated that SARS-CoV-2 forms a clade distinct from SARS-CoV and MERS-CoV [4]. The most common symptoms of patients infected with SARS-CoV-2 are fever and cough [27]. However, a proportion of patients show multi-organ damage and dysfunction, including acute respiratory distress syndrome (17%), acute respiratory injury (8%) and acute renal injury (3%). It is also increasingly recognized that SARS-CoV-2 can be transmitted via multiple routes.

Viruses target host cells by binding host receptors before starting the infection cycle. ACE2 was shown to be the cell receptor of SARS-CoV-2, the same receptor as that of SARS-CoV. The expression profiles of ACE2 across different cell types of different organs can reveal clues about the virus transmission routes and potential pathogenesis. In previous studies, ACE2 was found to be expressed in upper and stratified epithelial cells of the esophagus, absorptive enterocytes from ileum and colon, alveolar type II cells in lung, liver cholangiocytes and kidney proximal tubules.
These findings suggested that the clinical symptoms of hepatic failure, respiratory injury, acute kidney injury or diarrhoea may be associated with the pervasive ACE2-expressing cells in these tissues. However, we and others found that ACE2 is expressed at low levels, especially in the lung (the main target organ of nCoVs), raising the possibility that co-receptors facilitate nCoV infection. It is well recognized that ssRNA viruses tend to have multiple receptors [7]. For example, ACE2, CD209 (Dendritic Cell-Specific ICAM-3- . In addition, other membrane proteins may also assist virus entry [3]. Since the viral receptor and co-receptors should be co-expressed on the same cell types, we analyzed single cell co-expression patterns covering 400 membrane proteins and 51 known viral receptors in this study. After calculating their gene expression similarity, we found that ANPEP, ENPEP and DPP4 are the top three genes correlated with ACE2 (R > 0.8). Interestingly, both ANPEP and DPP4 are viral receptors of human coronaviruses [32], while ENPEP is also a peptidase, although its involvement in virus infection is unclear. For reasons that remain unclear, human coronaviruses use peptidases as their receptors [24]. Here, we showed the co-expression profiles of these molecules, indicating that different human CoVs actually target similar cell types across different human tissues. This also explains why patients infected with different human CoVs manifest similar clinical symptoms. We propose that further experimental validation should be performed to explore the role of these peptidases in infection by SARS-CoV-2 and other CoVs.

The host immune response plays a crucial role in the fight against viruses. Generally, virally infected cells release interferons to suppress viral activities [33-35]. Interferons also warn neighboring cells of virus attack.
They can signal nearby cells to upregulate MHC class I molecules, which alert CD8+ T cells to identify and eliminate the viral infection [36, 37]. Understanding the potential cell-cell communication mechanisms across different tissues is important for understanding immune reactions. In this study, we investigated the cells that communicate with CoV targets (ACE2-expressing cells) in each tissue. Our results illustrate that macrophages frequently crosstalk with the ACE2-expressing cells in lung, liver, stomach and other tissues. This suggests that macrophages play a sentinel role during human CoV infection. Future studies should investigate these signaling pairs in the setting of CoV infection in patients and animal models.

We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.

[1] Host factors in positive-strand RNA virus genome replication
[2] Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China
[3] Virus entry, assembly, budding, and membrane rafts, Microbiol
[4] Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
[5] The Novel Coronavirus 2019 (2019-nCoV) Uses the SARS-Coronavirus Receptor ACE2 and the Cellular Protease TMPRSS2 for Entry into Target Cells, bioRxiv
[6] The single-cell RNAseq data analysis on the receptor ACE2 expression reveals the potential risk of different human organs vulnerable to Wuhan 2019-nCoV infection, Front. Med
[7] Cell membrane proteins with high N-glycosylation, high expression and multiple interaction partners are preferred by mammalian viruses as receptors
[8] Single-cell transcriptomic analysis of human lung provides insights into the pathobiology of pulmonary fibrosis
[9] Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations
[10] Single-cell transcriptome analysis reveals differential nutrient absorption functions in human intestine
[11] Single-cell transcriptomics uncovers zonation of function in the mesenchyme during liver fibrosis
[12] Human bone marrow assessment by single-cell RNA sequencing, mass cytometry, and flow cytometry
[13] Targeted therapy guided by single-cell transcriptomic analysis in drug-induced hypersensitivity syndrome: a case report
[14] scRNA-seq assessment of the human lung, spleen, and esophagus tissue stability after cold preservation
[15] Colonic epithelial cell diversity in health and inflammatory bowel disease
[16] Single-cell transcriptomics of the human retinal pigment epithelium and choroid in health and macular degeneration
[17] Dissecting the single-cell transcriptome network underlying gastric premalignant lesions and early gastric cancer, Cell Rep
[18] Single-cell RNA sequencing of human kidney
[19] Comprehensive integration of single-cell data
[20] Fast, sensitive and accurate integration of single-cell data with harmony
[21] CellPhoneDB v2.0: Inferring Cell-Cell Communication from Combined Expression of Multi-Subunit Receptor-Ligand Complexes, bioRxiv
[22] Membranome: a database for proteome-wide analysis of single-pass membrane proteins
[23] Mammalian Glutamyl aminopeptidase genes (ENPEP) and proteins: comparative studies of a major contributor to arterial hypertension
[24] Receptor recognition mechanisms of coronaviruses: a decade of structural studies
[25] Responding to global infectious disease outbreaks: lessons from SARS on the role of risk perception, communication and management
[26] Middle East Respiratory Syndrome Coronavirus (MERS-CoV), World Health Organization
[27] Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
[28] Endocytosis of the receptor-binding domain of SARS-CoV spike protein together with virus receptor ACE2
[29] pH-dependent entry of severe acute respiratory syndrome coronavirus is mediated by the spike glycoprotein and enhanced by dendritic cell transfer through DC-SIGN
[30] LSECtin interacts with filovirus glycoproteins and the spike protein of SARS coronavirus
[31] DC-SIGN and DC-SIGNR interact with the glycoprotein of Marburg virus and the S protein of severe acute respiratory syndrome coronavirus
[32] Permissivity of dipeptidyl peptidase 4 orthologs to Middle East respiratory syndrome coronavirus is governed by glycosylation and other complex determinants
[33] Interferon activation and innate immunity
[34] The host type I interferon response to viral and bacterial infections
[35] Type I interferons in infectious disease
[36] MHC class I antigen presentation: learning from viral evasion strategies
[37] Impact of MHC class I diversity on immune control of immunodeficiency virus replication

Supplementary data to this article can be found online at https://doi.org/10.1016/j.bbrc.2020.03.044.