key: cord-0779266-anz96jld authors: Zhao, Yu; Zhao, Zixian; Wang, Yujia; Zhou, Yueqing; Ma, Yu; Zuo, Wei title: Single-Cell RNA Expression Profiling of ACE2, the Receptor of SARS-CoV-2 date: 2020-09-01 journal: Am J Respir Crit Care Med DOI: 10.1164/rccm.202001-0179le sha: e69bf38b5d3d9d141643a68bfc8c278fa9a96806 doc_id: 779266 cord_uid: anz96jld

remains a challenge for such analysis. The recently developed single-cell RNA-sequencing technology enables us to study ACE2 expression in each cell type and provides quantitative information at single-cell resolution. Previous work established an online database for single-cell RNA-sequencing analysis of eight normal human lung transplant donors (16). In the current work, we used updated bioinformatics tools to analyze these data. Some of the results of these studies have been previously reported in the form of a preprint (https://doi.org/10.1101/2020.01.26.919985) (16).

We analyzed 43,134 cells derived from the normal lung tissue of eight adult donors (Figure 1A). We performed unsupervised graph-based clustering (Seurat version 2.3.4), and for each individual, we identified 8-11 transcriptionally distinct cell clusters based on their marker gene expression profiles. Typically, the clusters include AT2 cells, AT1 cells, airway epithelial cells (ciliated cells and club cells), fibroblasts, endothelial cells, and various types of immune cells. The cell cluster map of a representative donor (a 55-yr-old Asian man) was visualized using t-distributed stochastic neighbor embedding (tSNE), as shown in Figure 1B.

Next, we analyzed the cell type-specific expression pattern of ACE2 in each individual. Across all donors, ACE2 is expressed in 0.64% of all human lung cells. The majority of the ACE2-expressing cells (83% on average) are AT2 cells. Other ACE2-expressing cells include AT1 cells, airway epithelial cells, fibroblasts, endothelial cells, and macrophages.
However, the proportion of ACE2-expressing cells among these other cell types is relatively low and varies among individuals. For the representative donor (Asian male, 55 yr old), the expression of ACE2 and of cell type-specific markers in each cluster is shown in Figure 2A. ACE2 was expressed in 1.4 ± 0.4% of AT2 cells.

To further understand this special population of ACE2-expressing AT2 cells, we performed a gene ontology (GO) enrichment analysis to study which biological processes are associated with this population by comparing it with the AT2 cells not expressing ACE2. Surprisingly, we found that multiple viral life cycle-related functions are significantly overrepresented in ACE2-expressing AT2 cells, including those relevant to viral replication and transmission (Figure 2B). We found an upregulation of the CAV2 and ITGB6 genes in ACE2-expressing AT2 cells. These genes encode components of caveolae, specialized subcellular structures on the plasma membrane that are critical to the internalization of various viruses, including SARS-CoV (17-19). We also found an enrichment of multiple members of the ESCRT (endosomal sorting complex required for transport) machinery (including CHMP3, CHMP5, CHMP1A, and VPS37B) in ACE2-expressing AT2 cells; these genes are related to virus budding and release (20, 21). These data suggest that this small population of ACE2-expressing AT2 cells is particularly prone to SARS-CoV-2 infection.

We further analyzed each donor and their ACE2 expression patterns. As the sample size was very small, no significant association was detected between the number of ACE2-expressing cells and any characteristics of the individual donors. However, we did notice that one donor had a fivefold higher ACE2-expressing cell ratio than average. This observation suggests that the ACE2 expression profile might be heterogeneous between individuals, which could make some individuals more vulnerable to SARS-CoV-2 than others.
However, these data need to be interpreted very cautiously because of the very small sample size of the current dataset, and a larger cohort study is necessary to draw conclusions.

Altogether, in the current study, we report the RNA expression profile of ACE2 in the human lung at single-cell resolution. Our analysis suggests that the expression of ACE2 is concentrated in a special small population of AT2 cells, which also expresses many other genes favoring the viral infection process. It seems that SARS-CoV-2 has evolved to hijack this population of AT2 cells for its reproduction and transmission. The targeting of AT2 cells could explain the severe alveolar damage and minimal upper airway symptoms after infection by SARS-CoV-2. Characterizing the number and distribution of ACE2-expressing cells in different cohorts can potentially help to identify susceptible populations in the future.

The shortcomings of the study are the small sample number and the fact that the current technique can only analyze the RNA level and not the protein level of single cells. Furthermore, although previous studies reported abundant ACE2 expression in pulmonary endothelial cells (14, 22), we did not observe high ACE2 RNA levels in this population. This inconsistency may be partly due to the fact that the number and proportion of endothelial cells in the current dataset are smaller than expected. Indeed, because of the limitations of sample collection and processing, the analyzed cells in this study may not fully represent the whole lung cell population. Future quantitative analysis at the transcriptomic and proteomic levels in a larger total population of cells is needed to further dissect the ACE2 expression profile, which could eventually lead to novel anti-infective strategies, such as ACE2 receptor blockade (23, 24), ACE2 protein competition (25), or ACE2-expressing cell ablation.
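
The cell ratios discussed above (the overall fraction of ACE2-expressing cells, the fraction within each cell type, and the share of ACE2-expressing cells contributed by each type) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes that a cell counts as "ACE2-expressing" when its ACE2 count is greater than zero, and the function name `ace2_summary`, the gene name `MARKER_A`, and the toy cells are hypothetical.

```python
from collections import Counter


def ace2_summary(cells, gene="ACE2"):
    """Summarize expression of `gene` across annotated cells.

    `cells` is an iterable of (cell_type, counts) pairs, where `counts`
    maps gene names to raw counts. Assumption: a cell "expresses" the
    gene when its count is greater than zero.
    """
    total_by_type = Counter()
    positive_by_type = Counter()
    for cell_type, counts in cells:
        total_by_type[cell_type] += 1
        if counts.get(gene, 0) > 0:
            positive_by_type[cell_type] += 1

    n_cells = sum(total_by_type.values())
    n_positive = sum(positive_by_type.values())
    return {
        # Fraction of all cells that express the gene.
        "overall_fraction": n_positive / n_cells if n_cells else 0.0,
        # Fraction of expressing cells within each cell type.
        "fraction_within_type": {
            t: positive_by_type[t] / n for t, n in total_by_type.items()
        },
        # Share of all expressing cells contributed by each cell type.
        "share_of_positive_cells": (
            {t: positive_by_type[t] / n_positive for t in total_by_type}
            if n_positive
            else {}
        ),
    }


# Hypothetical toy data, not taken from the study.
toy_cells = [
    ("AT2", {"ACE2": 2, "MARKER_A": 40}),
    ("AT2", {"MARKER_A": 35}),
    ("AT2", {"ACE2": 1}),
    ("AT1", {"MARKER_A": 5}),
    ("Macrophage", {"ACE2": 1}),
    ("Fibroblast", {}),
]
summary = ace2_summary(toy_cells)
print(summary)
```

In this toy input, three of the six cells carry ACE2 counts, two of them AT2 cells, so AT2 cells contribute two thirds of the ACE2-expressing cells. The figures reported in the text come from the real dataset and are not reproduced by this example.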
Public datasets (Gene Expression Omnibus GSE122960) were used for bioinformatics analysis. First, Seurat (version 2.3.4) was used to read a combined gene-barcode matrix of all samples. Low-quality cells with fewer than 200 or more than 6,000 detected genes were removed; cells were also removed unless their mitochondrial gene content was <10%. Only genes found to be expressed in more than three cells were retained. For normalization, the combined gene-barcode matrix was scaled by the total unique molecular identifier counts, multiplied by 10,000, and transformed to log space. The highly variable genes were identified using the function FindVariableGenes. Variation arising from the number of unique molecular identifiers and the percentage of mitochondrial genes was regressed out by specifying the vars.to.regress argument in the Seurat function ScaleData. The expression level of highly variable genes in the cells was scaled, centered along each gene, and subjected to principal component (PC) analysis.

The number of PCs to be included in downstream analysis was then assessed by 1) plotting the cumulative SDs accounted for by each PC using the function PCElbowPlot in Seurat to identify the "knee" point, that is, the PC number after which successive PCs explain diminishing degrees of variance, and 2) exploring primary sources of heterogeneity in the datasets using the PCHeatmap function in Seurat. Based on these two methods, the top significant PCs were selected for two-dimensional tSNE, which was implemented by the Seurat software with the default parameters. The Seurat function FindClusters was used to identify cell clusters for each sample. After clustering and visualization with tSNE, the initial clusters were inspected and merged based on the similarity of their marker genes and on phylogenetic identity as measured by the Seurat function BuildClusterTree. The identification of cell clusters was performed on the final aligned object, guided by marker genes.
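
The quality-control and normalization steps described above can be sketched in plain Python. This is an illustration only, not the Seurat implementation or the authors' code. It assumes that mitochondrial genes are recognizable by an `MT-` prefix, that cells are kept when their mitochondrial fraction is below 10%, and that "transformed to log space" means log(1 + x). The function names and toy cells are hypothetical.

```python
import math

MITO_PREFIX = "MT-"  # assumption: mitochondrial gene names start with this prefix


def passes_qc(counts, min_genes=200, max_genes=6000, max_mito_fraction=0.10):
    """Return True if a cell passes the gene-count and mitochondrial filters.

    `counts` maps gene names to raw UMI counts for a single cell.
    """
    total = sum(counts.values())
    if total == 0:
        return False
    n_genes = sum(1 for c in counts.values() if c > 0)
    mito = sum(c for g, c in counts.items() if g.startswith(MITO_PREFIX))
    return min_genes <= n_genes <= max_genes and mito / total < max_mito_fraction


def genes_in_enough_cells(cells, min_cells=3):
    """Return the genes detected (count > 0) in more than `min_cells` cells."""
    detected = {}
    for counts in cells:
        for gene, c in counts.items():
            if c > 0:
                detected[gene] = detected.get(gene, 0) + 1
    return {g for g, n in detected.items() if n > min_cells}


def log_normalize(counts, scale_factor=10_000):
    """Scale counts by the cell's total UMI count, multiply, and log-transform.

    Assumption: log(1 + x) is used so that zero counts stay at zero.
    """
    total = sum(counts.values())
    return {g: math.log1p(c / total * scale_factor) for g, c in counts.items()}


# Hypothetical toy cells; gene-count thresholds are lowered for illustration.
good_cell = {"GENE1": 10, "GENE2": 9, "MT-CO1": 1}
high_mito_cell = {"GENE1": 2, "GENE2": 2, "MT-CO1": 6}

kept = [c for c in (good_cell, high_mito_cell)
        if passes_qc(c, min_genes=2, max_genes=4)]
normalized = [log_normalize(c) for c in kept]
```

With the lowered toy thresholds, only `good_cell` survives filtering, because `high_mito_cell` has a mitochondrial fraction of 0.6. On real data the default thresholds (200, 6,000, and 10%) correspond to the values stated in the text.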
To identify the marker genes, differential expression analysis was performed by the function FindAllMarkers in Seurat with the Wilcoxon rank sum test. Differentially expressed genes that were expressed in at least 25% of cells within the cluster and had a fold change of >0.25 (log scale) were considered to be marker genes. tSNE plots and violin plots were generated using Seurat. For GO enrichment analysis, differentially expressed genes of the ACE2-expressing AT2 cells were calculated for each donor; genes were included when they were expressed in at least 25% of cells within the cluster and had a fold change of >0.25 (log scale) compared with all AT2 cells. All differentially expressed genes were combined into a gene list for GO analysis by the clusterProfiler R package. GO terms with a corrected P value of less than 0.05 were considered significantly enriched by differentially expressed genes. Dot plots were used to visualize enriched terms by the enrichplot R package.

Author disclosures are available with the text of this letter at www.atsjournals.org.

1. Clinical features of patients infected with 2019 novel coronavirus in Wuhan
2. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia
3. Clinical characteristics of 2019 novel coronavirus infection in China. medRxiv; 2020
4. Coronavirus disease 2019 (COVID-19) situation report-89
5. WHO Strategic and Technical Advisory Group for Infectious Hazards. COVID-19: towards controlling of a pandemic
6. Novel Wuhan (2019-nCoV) coronavirus
7. Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission
8. The S proteins of human coronavirus NL63 and severe acute respiratory syndrome coronavirus bind overlapping regions of ACE2
9. Crystal structure of NL63 respiratory coronavirus receptor-binding domain complexed with its human receptor
10. Expression of elevated levels of pro-inflammatory cytokines in SARS-CoV-infected ACE2+ cells in SARS patients: relation to the acute lung injury and pathogenesis of SARS
11. A new coronavirus associated with human respiratory disease in China
12. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
13. A pneumonia outbreak associated with a new coronavirus of probable bat origin
14. Tissue distribution of ACE2 protein, the functional receptor for SARS coronavirus: a first step in understanding SARS pathogenesis
15. Binding of SARS coronavirus to its receptor damages islets and causes acute diabetes
16. Single-cell transcriptomic analysis of human lung provides insights into the pathobiology of pulmonary fibrosis
17. Cigarette smoke increases uptake of influenza A virus by lung epithelial cells by increasing expression of caveolin-1
18. Lipid rafts are involved in SARS-CoV entry into Vero E6 cells
19. Integrin regulation of caveolin function
20. ESCRT-III protein requirements for HIV-1 budding
21. The mechanism of budding of retroviruses from cell membranes
22. AMP-activated protein kinase phosphorylation of angiotensin-converting enzyme 2 in endothelium mitigates pulmonary hypertension
23. Angiotensin receptor blockers as tentative SARS-CoV-2 therapeutics
24. COVID-19 with different severities: a multicenter study of clinical features
25. Inhibition of SARS-CoV-2 infections in engineered human tissues using clinical-grade soluble human ACE2

Acknowledgment: The authors thank Alexander Misharin's group for sharing their original single-cell RNA-sequencing dataset with the public. They also thank the medical workers who gave their lives in the fight against the COVID-19 pandemic.