key: cord-0801658-ytfkigku
authors: Bao, R.; Hernandez, K.; Huang, L.; Luke, J. J.
title: ACE2 and TMPRSS2 expression by clinical, HLA, immune, and microbial correlates across 34 human cancers and matched normal tissues: implications for SARS-COV-2 COVID-19
date: 2020-04-30
journal: nan
DOI: 10.1101/2020.04.29.20082867
sha: 6f1bb5e9a6a0e870bdb657b6bfecc63cb853f48f
doc_id: 801658
cord_uid: ytfkigku
Background: Pandemic COVID-19 by SARS-COV-2 infection is facilitated by the ACE2 receptor and protease TMPRSS2. Modestly sized case series have described clinical factors associated with COVID-19, while ACE2 and TMPRSS2 expression analyses have been described in some cell types. Cancer patients may have worse outcomes to COVID-19.
Methods: We performed an integrated study of ACE2 and TMPRSS2 gene expression across and within organ systems, by normal versus tumor, across several existing databases (The Cancer Genome Atlas, Census of Immune Single Cell Expression Atlas, The Human Cell Landscape, and more). We correlated gene expression with clinical factors (including but not limited to age, gender, race, BMI and smoking history), HLA genotype, immune gene expression patterns, cell subsets, and single-cell sequencing as well as commensal microbiome.
Results: Matched normal tissues generally display higher ACE2 and TMPRSS2 expression compared with cancer, with normal and tumor from digestive organs expressing the highest levels. No clinical factors were consistently identified to be significantly associated with gene expression levels though outlier organ systems were observed for some factors. Similarly, no HLA genotypes were consistently associated with gene expression levels. Strong correlations were observed between ACE2 expression levels and multiple immune gene signatures including interferon-stimulated genes and the T cell-inflamed phenotype as well as inverse associations with angiogenesis and transforming growth factor-β signatures.
ACE2 positively correlated with macrophage subsets across tumor types. TMPRSS2 was less associated with immune gene expression but was strongly associated with epithelial cell abundance. Single-cell sequencing analysis across nine independent studies demonstrated little to no ACE2 or TMPRSS2 expression in lymphocytes or macrophages. ACE2 and TMPRSS2 gene expression was associated with commensal microbiota in matched normal tissues, particularly from colorectal cancers, with distinct bacterial populations showing strong associations.
Conclusions: We performed a large-scale integration of ACE2 and TMPRSS2 gene expression across clinical, genetic, and microbiome domains. We identify novel associations with the microbiota and confirm host immunity associations with gene expression. We suggest caution in interpretation regarding genetic associations with ACE2 expression suggested from smaller case series.
Severe acute respiratory syndrome (SARS) coronavirus 2 (SARS-CoV-2), which causes the disease COVID-19, was initially described near the end of 2019 [1, 2] and has caused a global pandemic. SARS-CoV-2 is a positive-sense single-strand RNA virus related to the SARS and the Middle East respiratory syndrome (MERS) coronaviruses that have caused previous global health emergencies [3]. COVID-19 is characterized predominantly by fever, cough, and pneumonia, with some patients presenting with diarrhea and other symptoms [4, 5]. Mortality rates are described as approximately ten times higher than seasonal influenza in some clinical sub-groups [6]. Angiotensin-converting enzyme 2 (ACE2) has been identified as the receptor for the SARS-CoV family [7], and the SARS-CoV-2 spike protein binds ACE2 on host cells with greater affinity than previous SARS-CoV [8, 9]. Type II transmembrane serine protease TMPRSS2 is the primary human protease that mediates spike protein activation on infected cells, facilitating viral entry via receptor-mediated internalization [9, 10].
Multiple physiologic roles are known for ACE2 in the cardiovascular, renal, and immune systems [11] and, perhaps most notably in relation to SARS-CoV-2, in the pulmonary system, where ACE2 has been described to limit severe acute lung injury [12]. Analyses of ACE2 protein expression by organ system have suggested high levels in epithelia of the lung and small intestine, consistent with presenting symptoms of patients with COVID-19 [13]. However, these studies have not integrated analysis of TMPRSS2, and integrated analyses may better indicate which organ systems express both genes and may therefore be at greatest risk of infection. Gene expression studies by bulk RNA sequencing and single-cell approaches have attempted to delineate expression patterns of normal airway tract and other tissues [14-16]. These studies have suggested high ACE2 expression levels on the epithelia of oral and airway mucosa as well as small intestine. ACE2 has additionally been suggested as an interferon-response gene, suggesting a complicated interaction between viral infection and host antiviral response [15]. Further, one report has suggested that lymphocytes may be directly infected by SARS-CoV-2 [17], a finding also reported with MERS [18], although its clinical significance is unclear.
medRxiv preprint doi: https://doi.org/10.1101/2020.04.29.20082867; this version posted April 30, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Patients with cancer may be at particularly high risk for SARS-CoV-2 infection and deleterious outcomes to COVID-19 disease. In a single hospital study from Wuhan, China, patients with cancer accounted for 1% of COVID-19 cases [19], substantially higher than the overall incidence of cancer in the Chinese population at 0.29% [20].
Outcomes to COVID-19 appeared to be worse in patients with cancer, with increased intensive care unit admission, mechanical ventilation, and mortality, especially among those who had recently received chemotherapy or surgery [19]. A subsequent literature-based international meta-analysis of COVID-19 incidence in patients with cancer has suggested a prevalence of approximately 2.0% globally [21]. In particular, there is concern that patients being treated with cancer immunotherapy drugs may be at even greater risk given the possible overlapping immune-
To better inform considerations surrounding SARS-CoV-2 and COVID-19 in patients with cancer and more broadly in the general population, we performed an integrated analysis of ACE2 and TMPRSS2 gene expression across clinical, genetic, and microbiome domains. We identify novel associations with the commensal microbiota and confirm host immunity associations with gene expression. We suggest caution against over-interpretation of clinical or genetic associations from smaller case series, noting that these are not strongly associated with ACE2 or TMPRSS2 gene expression. We hope these data may better inform clinical considerations surrounding risk stratification and prevention approaches.
Sample metadata tables were downloaded from The Cancer Genome Atlas (TCGA)
Links to data files, single-cell cohorts, and bioinformatics software are provided in Supplementary Table 3. Data generated in this study are accessible in the GitHub repository https://github.com/riyuebao/ACE2_TMPRSS2_multicorrelates.
The gene expression of ACE2 (Entrez Gene ID 59272) and TMPRSS2 (Entrez Gene ID 7113) was retrieved from the RSEM-quantified RNAseq data and used for all analyses described in this study. Spearman's correlation was calculated between the expression of the two genes in tumor (n=10,024) and normal (n=708) samples across all tumor types and within individual tumor types. The expression percentile was calculated separately within each of the four analysis sets (ACE2 in normal, ACE2 in tumor, TMPRSS2 in normal, TMPRSS2 in tumor) in two steps. First, the median expression of ACE2 or TMPRSS2 was calculated within individual tumor types. Next, tumor types were ranked by the median expression of each gene from highest to lowest, and the position of each tumor type in the ranked list was scaled to 0 to 100, hereafter regarded as the "expression percentile" per tumor type, with smaller values indicating top-ranked tumor types. The same process was repeated for each gene in tumor samples (34 tumor types) and normal tissues (14 tumor types).
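The two-step percentile calculation described above can be sketched in a few lines. This is a minimal Python illustration, not the authors' code (the study's analyses were run in R). The function name `expression_percentile`, the toy expression values, and the linear mapping of rank position onto 0 to 100 are assumptions, because the text does not state the exact scaling formula.

```python
from statistics import median


def expression_percentile(expr_by_type):
    """Map each tumor type to a 0-100 percentile (0 = highest median expression).

    expr_by_type: dict of tumor type -> list of per-sample expression values.
    Assumption: rank positions are scaled linearly, so the top-ranked type
    gets 0 and the bottom-ranked type gets 100.
    """
    # Step 1: median expression within each tumor type.
    medians = {t: median(values) for t, values in expr_by_type.items()}
    # Step 2: rank tumor types from highest to lowest median.
    ranked = sorted(medians, key=medians.get, reverse=True)
    n = len(ranked)
    if n == 1:
        return {ranked[0]: 0.0}
    # Scale the 0-based rank position onto the 0-100 range.
    return {t: 100.0 * i / (n - 1) for i, t in enumerate(ranked)}


# Hypothetical toy values for one gene, not data from the study.
toy = {
    "COAD": [9.1, 8.7, 9.4],
    "LUAD": [6.2, 5.9, 6.5],
    "LAML": [0.3, 0.1, 0.2],
}
percentiles = expression_percentile(toy)
```

In this toy input the type with the highest median receives 0 and the lowest receives 100, matching the convention that smaller values indicate top-ranked tumor types. In the study the procedure would be run four times, once per gene and sample set.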
The expression of ACE2 and TMPRSS2 was compared between designated clinical groups, split by age (younger (<65 years) / older (≥65 years) in tumor or normal), gender (female / male in tumor or normal), race (African American (AA) / Asian / White in tumor or normal), menopause (not post / post in tumor or normal), BMI (level 1 (<25) / level 2 (25-30) / level 3 (30-35) / level 4 (>35) [32] in tumor or normal), smoking history (never / light / heavy [33] in tumor or normal), tumor stage (I / II / III / IV in tumor), and tumor grade (G1G2 / G3G4 in tumor). For tumor grade, G1 and G2 were collapsed to indicate low- to mid-grade (G1G2), and G3 and
HLA-A, HLA-B, and HLA-C genotypes were identified for 9,559 patients from TCGA across 34 tumor types using OptiType (v1.3.2) with WES BAM files. We performed two levels of analysis. In the allele-level analysis, considering that each patient carries two copies of HLA-A, B, or C alleles, both copies were counted towards the total number (19,118) of A, B, or C alleles in the entire cohort. In the patient-level analysis, only one HLA-A, B, or C allele was kept as the final label to assign to each patient, with the priority determined by the lexicographical ranking of all alleles present in the entire cohort. For each patient, between the two copies of HLA-A, B, or C alleles, if one was ranked before the other, then the first one was assigned to the patient. The calculation of HLA prevalence was performed at the allele level.
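The two levels of HLA analysis can be illustrated with a small sketch. It is written in Python for illustration and is not the authors' implementation. The function names, the toy genotypes, and the use of plain string comparison as the "lexicographical ranking" are assumptions. Counting both copies per patient is consistent with the reported totals, since 9,559 patients × 2 = 19,118 alleles per locus.

```python
from collections import Counter


def patient_label(allele_1, allele_2):
    """Patient-level label: the allele that ranks first lexicographically.

    Assumption: ordinary string ordering stands in for the cohort-wide
    lexicographical ranking described in the text.
    """
    return min(allele_1, allele_2)


def allele_prevalence(genotypes):
    """Allele-level prevalence: both copies of every patient are counted.

    genotypes: dict of patient ID -> (allele_1, allele_2) for one HLA locus.
    Returns a dict of allele -> fraction of all allele copies.
    """
    counts = Counter(a for pair in genotypes.values() for a in pair)
    total = sum(counts.values())  # equals 2 x number of patients
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical toy genotypes at HLA-B, not data from the study.
toy_genotypes = {
    "P1": ("B*46:01", "B*07:02"),
    "P2": ("B*07:02", "B*07:02"),
}
labels = {pid: patient_label(*pair) for pid, pair in toy_genotypes.items()}
prevalence = allele_prevalence(toy_genotypes)
```

One consequence of this labeling rule, visible in the toy example, is that a heterozygous patient carrying a later-sorting allele is labeled with the other allele. This is presumably why prevalence is computed at the allele level rather than from patient labels.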
The comparison of ACE2 and TMPRSS2 gene expression between HLA genotypes was performed at the patient level using two-way ANOVA, given that gene expression was estimated per sample.
Five immune responsive and suppressive signatures (interferon-stimulated genes (ISG), T cell-inflamed (Tinfl), myeloid, angiogenesis (angio), and transforming growth factor-β (TGF-β)) (Supplementary Table 7) were correlated with ACE2 and TMPRSS2 gene expression in tumor and normal tissues. The expression level of a signature was computed as the average expression of all genes in the signature after centering and scaling. Spearman's correlation was calculated between each signature and ACE2 or TMPRSS2. The full correlation metrics are provided in Supplementary Table 8.
FPKM estimates of RNAseq gene expression were used for quantifying enrichment of 64 tumor and stroma cell types using xCell (v1.1.0). xCell converts gene expression into rank-based metrics within each sample, hence normalization and batch correction were not required. To make data comparable across samples, xCell was run once using all samples (n=10,732). Spearman's correlation was computed between the enrichment score of each cell population and ACE2 or TMPRSS2 expression. The full correlation metrics are provided in Supplementary Table 9.
HPV, EBV, and HBV were selected for this study as those are the three most prevalent cancer-associated viruses in the cohort. Other viruses detected were excluded. Samples were set to HPV positive/negative by cutoff 10, EBV positive/negative by cutoff 5, and HBV positive/negative by cutoff 5 given previously recommended thresholds [31].
STAD, ESCA, LAML, and OV were reset to "negative" for HPV, EBV, and HBV after a manual inspection, which revealed no strong clinical support for viral presence in those tumor types. Within each tumor type, ACE2 and TMPRSS2 gene expression was compared between viral positive and negative tumors using the Welch two-sample t-test. Across all tumor types, two-way ANOVA was used to compare gene expression between viral positive and negative tumors, with tumor type and viral group as variables plus the interaction between the two.
The abundance of 1,093 genus-level microbial taxa was quantified from tissue RNAseq data after rigorous QC, batch correction, and contamination filtering, and normalized to 1 million reads to make data comparable across samples [30]. Seven hundred six normal tissues and 9,801 tumor samples were included in the analysis where data were available. Taxa were filtered to keep bacteria in the analysis; viruses and archaea were excluded. Nine hundred fifty taxa present in at least 20% of samples were kept for statistical testing. Within each tumor type, Spearman's correlation was computed between each bacterial taxon and ACE2 or TMPRSS2 gene expression in tumor and normal tissues. For each test, at least 15 samples with taxon abundance ≥ 1 were required. Seventy-five taxa passed FDR-adjusted P<0.05 and Spearman's ρ > 0.5 or < -0.5 in at least one pairwise correlation (Supplementary Table 10).
LASSO regression models of ACE2 or TMPRSS2 gene expression were built in tumor (n=10,024) and normal (n=708) samples separately using 90 features consisting of tissue type as well as clinical (age, gender, race), immune signatures (ISG, T cell-inflamed, myeloid, angiogenesis, TGF-β), immune cell subsets (macrophage M1, macrophage M2, CD8 T cell,
proinflammatory phenotype [34] in COVID-19 disease. Thirty-four tumor types were collapsed into 15 tissue types based on categorizations from The Human Protein Atlas to reduce complexity. Categorical variables were converted to dummy variables using the R function dummyVars with parameter fullRank set to TRUE. Data were preprocessed to remove features that had near-zero variance, high correlation (Spearman's ρ > 0.75), or high collinearity. Each feature was scaled and centered. Given that the purpose of this analysis was to evaluate the relative importance of features rather than to train and validate a predictive model, we did not split samples into training and test sets. Instead, we used all samples with 10-fold cross-validation. Variable importance was reported as raw values and as values scaled to 0-100 (Supplementary Table 11). R package caret (v6.0-84) was used for analysis.
In all analyses, a minimal sample size of 15 per group was required for statistical testing. For group-wise comparisons, two-way ANOVA was used with tumor type and a group of interest as variables plus the interaction. When more than two groups were present, Tukey's HSD test was used for pairwise comparisons. Within each tumor type, the two-sided Welch two-sample t-test was used to compare log2-transformed gene expression between groups; for paired samples, the two-sided paired t-test was used. Spearman's correlation was used to determine the relationship between two continuous variables. For multiple comparisons, p-values were corrected using the Benjamini & Hochberg (BH)-FDR method. LASSO regression models were used to evaluate variable importance with 10-fold cross-validation. Those analyses were performed using R (v3.6.1) and Bioconductor (release 3.10). P-values less than 0.05 were considered statistically significant.
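For readers unfamiliar with the Benjamini & Hochberg correction mentioned above, the following is a minimal, self-contained Python sketch of BH-adjusted p-values. It is an illustration under stated assumptions rather than the authors' code: the study used R, where this adjustment is available as `p.adjust(p, method = "BH")`. The function name `bh_adjust` and the toy p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is multiplied by m / rank (rank taken in ascending order),
    then a running minimum from the largest p-value downward enforces
    monotonicity, and results are capped at 1.
    """
    m = len(pvalues)
    # Indices ordered from the largest p-value to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical toy p-values, not results from the study.
raw = [0.01, 0.04, 0.03, 0.20]
adj = bh_adjust(raw)
significant = [p < 0.05 for p in adj]
```

In this toy example three raw p-values are below 0.05, but only the smallest remains below 0.05 after adjustment. This is the behavior the correction is meant to provide when many tumor-type-by-feature tests are run.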
Table 1), we compared gene expression between tumor and matched normal from the same patients. The expression of ACE2 and TMPRSS2 was significantly higher in normal tissues relative to tumors (Figure 1C). Within individual tumor types, this pattern was
Table 6). The results suggested that those clinical variables are not strongly associated with ACE2 or TMPRSS2 expression. In addition, we investigated the association between the presence of diabetes and gene expression in pancreatic adenocarcinoma (PAAD), where data were available. Similarly, no significant differences in gene expression were detected in tumor or normal tissues from patients with or without diabetes.
Two HLA genotypes (B*46:01, B*54:01) have been reported by other groups to be associated with severe clinical outcomes [36]. We investigated the prevalence of the two alleles and identified low prevalence across all tumor types (0.6% and 0.2%, respectively, out of 19,118 HLA-B alleles from 9,559 patients). When looking into individual tumor types, both alleles were
(Figure 3D).
With the observation of correlation with specific immune signatures, we sought to investigate whether ACE2 or TMPRSS2 were expressed directly by immune cells or other
Therefore, we concluded that ACE2 and TMPRSS2 are not expressed in immune cell populations, at least in the cohorts investigated. The expression of both genes in bulk RNAseq data was likely to be derived from non-immune cells, such as epithelial cells in the tissues.
Given associations between strong anti-tumor immune responses due to the presence of tumor-related virus and particular commensal microbiota, we sought to investigate associations between ACE2 and TMPRSS2 gene expression and the presence of virus or tissue microbiota. Across known viral positive and negative tumor types, we found an inconsistent pattern relative
Table 10).
To integrate all correlates and evaluate their relative importance in determining the gene expression of ACE2 and TMPRSS2, we built LASSO regression models in tumor and normal tissues separately utilizing features from the clinical, immune, and microbial domains (Supplementary Figure 2). Clinical features included were age, gender, and race, while menopause, BMI, and smoking history were excluded because >50% of the samples were missing information. HLA genotype was not included because of many categories and/or levels, which may lead to overfitting.
Immune gene expression signatures included ISG, T cell-inflamed, myeloid, angiogenesis, and TGF-β. Immune cell type features included macrophage M1/M2, CD8, and CD4 T cells, and non-immune cell type features included epithelial cells. Microbe features included the 75 bacterial taxa from the analysis above. In addition, we collapsed 34 tumor types into 15 tissue types and included these in the model to account for tissue-specific gene expression variations. We calculated the importance of each feature in the models with 10-fold cross-validation (Supplementary Table 11). After quality control and filtering, among the features kept in each model, immune and epithelial cells were the top-ranked features that predict ACE2 expression in normal tissues and tumors. Microbiota were observed to be important features for ACE2 in normal tissues but not in tumors (Figure 5A and 5B). For TMPRSS2 expression, epithelial cell abundance was the most important predictor in both normal and tumor samples (Figure 5C and 5D). Taken together, these results suggested that immune signatures, epithelial cells, and commensal microbiota were important predictors for ACE2 expression, while TMPRSS2 expression was primarily determined by epithelial cells.
We performed a pan-cancer analysis of the receptor that facilitates SARS-CoV-2 infection (ACE2) and the protease that mediates spike protein activation and viral entry (TMPRSS2) by integrating data across six resources including clinical, genetic, transcriptomic and microbiome domains. We found that ACE2 and TMPRSS2 are generally expressed at lower levels in tumors relative to matched normal tissues and that digestive organs (both tumor and normal samples) have the highest expression. Neither clinical factors nor HLA genotypes were consistently associated with gene expression levels. Multiple immune gene expression signatures such as
ISG and the T cell-inflamed tumor microenvironment did correlate with ACE2, and inverse correlations were seen with angiogenesis and TGF-β. ACE2 expression correlated with increased macrophage abundance in some tumors, while TMPRSS2 was strongly associated with epithelial cells. Regarding lymphocytes and macrophages, no ACE2 expression was observed in these cells across multiple single-cell sequencing studies. Microbiota contents are clearly associated with ACE2 and TMPRSS2 gene expression levels, possibly suggesting a causal role and the potential to be a modifiable biomarker.
The mortality of COVID-19 disease has been substantially greater than that seen with seasonal influenza, which has led to the identification of, or hypotheses about, certain clinical factors that may be associated with outcomes. The factors of particular focus have included advanced age, BMI, and possibly diabetes or other chronic health conditions such as cardio-pulmonary syndromes and immuno-suppression or cancer [19, 23, 24]. In addition, certain races or ethnicities have experienced greater morbidity and mortality due to the pandemic [25]. In our analysis of ACE2/TMPRSS2 gene expression in tumors and matched normal tissues, we observe no consistent association for these factors. Further work would be required to investigate other variables associated with these disease states, such as chronic inflammatory conditions, immuno-suppression, and other disparities that may be contributing factors [39, 40], and cellular context is important to interpret the complexity of those associations [41].
An initial hypothesis when considering the deleterious outcomes for patients with cancer and COVID-19 disease was that cancer tissues themselves might have higher expression of viral entry related genes. We found that gene expression levels did not support this hypothesis. Rather, cancer tissues broadly have lower expression of ACE2 and TMPRSS2, though the cancers of the digestive tract do have the highest relative level among cancer tissues. This suppressed expression level is consistent with that observed in immuno-oncology gene expression studies [42], in which the T cell-inflamed tumor microenvironment has been observed to be lower in cancer compared with matched normal. ACE2 has been described as a type I interferon-inducible gene [15]. Across our analysis, we see strong correlations of ACE2 with type I (ISG) and type II (T cell-inflamed) interferon signatures consistent with this. Observing higher ACE2 levels in T cell-inflamed tumors does suggest cautious consideration in the administration of cancer immunotherapy during the COVID-19 pandemic, especially in patients with tumors of the aerodigestive tract such as head and neck, lung and colorectal/anal tracts. T cell-inflamed gene expression is strongly correlated with treatment response to checkpoint immunotherapy [43] and has not been associated with immune-related adverse events (irAE) [44].
However, if ACE2 and TMPRSS2 levels are high, making viral infection potentially more likely, concomitant treatment with checkpoint blockade may potentially change anti-viral host response [45].
Direct infection or dysregulation of immune cell populations is an additional area of concern in patients with cancer and more broadly in infected patients. COVID-19 can manifest with lymphopenia, with some autopsy series suggesting lymph node or splenic atrophy [48]. Certainly, dysregulated macrophage activity, with the elaboration of IL-6 and other inflammatory cytokines, is a major component of the disease. Studies have raised the possibility that SARS-CoV-2 infects lymphocytes [17] or macrophages [48], leading to COVID-19-associated findings. In our study, we investigated the expression of ACE2 and TMPRSS2 across multiple single-cell sequencing databases encompassing nine independent studies. However, we found no evidence of expression in these cells. It must be noted that the possibility exists that type I interferon may induce ACE2 expression, which would not be captured in our analysis. We would note, however, that previous studies have not definitively determined that T cells or macrophages are infected by SARS-CoV-2, and direct viral culture from purified cell populations
Certainly, there may be virus infection-induced changes that are dynamic. However, we believe our analysis to be the most comprehensive catalog of ACE2 and TMPRSS2 correlates to date
(34 tumor types from 15 tissue types across 10,038 subjects including both tumor samples and matched normal tissues as well as scRNAseq databases consisting of patients with cancer and healthy donors). We also acknowledge that the microbiota we analyzed were identified from tissue RNAseq data, and the sample collection and preparation for tissue RNAseq was not originally designed to completely rule out potential contamination or confirm the viability of identified microbes. However, these source data constitute the largest collection of microbiota communities identified from patients with cancer, have previously been used in this manner to build prediction algorithms, and were optimized via rigorous methodology to control for noise across the data set [30]. We also note that we are unable in this analysis to comment on respiratory or fecal samples from patients infected with COVID-19, and we very much look forward to better understanding the functional mechanisms associated with those commensal and pathogenic microbiota related to COVID-19. Lastly, our work does not determine a causal role of those correlates in driving response or severity of COVID-19 disease; further mechanistic studies as well as prospective clinical trials in patients would be required to develop or investigate interventional approaches.
We have performed a multi-omic analysis of ACE2 and TMPRSS2 gene expression related to clinical, genetic, and microbiome covariates associated with COVID-19 infection. We have identified novel commensal microbiome associations and further described interferon-associated gene expression patterns in normal and tumor tissues relevant to SARS-CoV-2 infection. These data will hopefully inform sample collection, future analyses, and treatment of patients with cancer and others infected with COVID-19.
Coronaviridae Study Group of the International Committee on Taxonomy of V: The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2
Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
A novel coronavirus outbreak of global health concern
Clinical Characteristics of Coronavirus Disease 2019 in China
Clinical features of patients infected with 2019 novel coronavirus in
Estimating Risk for Death from 2019 Novel Coronavirus Disease, China
Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus
SARS-CoV-2 Cell Entry Depends on ACE2 and
9 Veesler D: Structure, Function, and Antigenicity of the SARS-CoV-2 Spike Glycoprotein
Angiotensin-converting enzyme in innate and adaptive immunity
The cell biology of receptor-mediated virus entry
Tissue distribution of ACE2 protein, the functional receptor for SARS coronavirus. A first step in understanding SARS pathogenesis
Single-cell RNA-seq data analysis on the receptor ACE2 expression reveals the potential risk of different human organs vulnerable to 2019-nCoV infection
SARS-CoV-2 Receptor ACE2 is an Interferon
High expression of ACE2 receptor of 2019-nCoV on the epithelial cells of oral mucosa
SARS-CoV-2 infects T lymphocytes through its spike protein-mediated membrane fusion
Middle East Respiratory Syndrome Coronavirus Efficiently Infects Human Primary T Lymphocytes and Activates the Extrinsic and Intrinsic Apoptosis Pathways
Cancer patients in SARS-CoV-2 infection: a nationwide analysis in China
COVID-19 and Cancer: Lessons From a Pooled Meta-Analysis. JCO Glob Oncol 2020, 6:557-559.
The Immune Landscape of Cancer. Immunity
High prevalence of obesity in severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) requiring invasive mechanical ventilation
Mutational analysis of head and neck squamous cell carcinoma stratified by smoking status
Cytokine release syndrome in severe COVID-19
Recurrent Fusion of TMPRSS2 and ETS Transcription Factor Genes in Prostate Cancer
Association of HLA class I with severe acute respiratory syndrome coronavirus infection
Single-Cell Transcriptomic Analysis of Primary and Metastatic Tumor Ecosystems in Head and Neck Cancer
Associations of Chronic Inflammation, Insulin Resistance, and Severe Obesity With Mortality, Myocardial Infarction, Cancer, and Chronic Pulmonary Disease
Integrated analyses of single-cell atlases reveal age, gender, and smoking status associations with cell type-specific expression of mediators of SARS-CoV-2 viral entry and highlights inflammatory programs in putative target cells
Density of immunogenic antigens does not explain the presence or absence of the T-cell-inflamed tumor microenvironment in melanoma
Pan-tumor genomic biomarkers for PD-1 checkpoint blockade-based immunotherapy
Managing toxicities associated with immune checkpoint inhibitors: consensus recommendations from the Society for Immunotherapy of Cancer (SITC) Toxicity Management Working Group
Influenza vaccination of cancer patients during
Ascierto PA, Fox B, Urba W, Anderson AC, Atkins MB, Borden EC, Brahmer J, Butterfield LH, Cesano A, Chen D et al: Insights from immuno-oncology: the Society
Construction of a human cell landscape at single-cell level. Nature 2020.

TMPRSS2 in tumor (n=33 tumor types; LAML not shown due to lack of TMPRSS2 expression in this tumor type). (B) Correlation between ACE2 and TMPRSS2 gene expression in normal (n=708 samples) and tumor (n=10,024 samples). (C and D) ACE2 and TMPRSS2 gene expression is higher in normal relative to tumor samples in (C) all tumor types pooled and in (D) individual tumor types. Lines connect tumor and matched normal samples from the same patient (n=692 patients). Spearman's correlation was used in B. Two-sided paired t-test was used in C and D. P-values shown are after FDR correction for multiple comparisons.

Table 5. Two-way ANOVA was used in A to F, with tumor type and clinical group as the variables plus the interaction between the two. For clinical factors that have more than two groups (C, E, F), Tukey's honestly significant difference (HSD) test was used with the fitted ANOVA model for pairwise comparisons while controlling for Type I errors. Two-way ANOVA p-values after BH-FDR correction are shown in

LAML does not have microbiota abundance data available and was therefore excluded from the analysis for both genes. Spearman's correlation was used in A and B.

Features are shown on the y-axis, colored by category (clinical, immune, non-immune, microbiota, and tissue type). The top 20 features, ranked from highest to lowest variable importance, are shown, and the full list is provided in Supplementary Table 11. LASSO regression was used in A to D with 10-fold cross-validation.
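To make the modeling step named in this legend concrete, the following is a minimal sketch of LASSO regression with 10-fold cross-validation, written in plain Python with coordinate descent. It is not the authors' pipeline: the synthetic data, the candidate penalty grid, the helper names (`lasso_fit`, `cv_lasso`, `simulate`), and the use of absolute coefficient size as a stand-in for "variable importance" are all assumptions made for illustration.

```python
import random

def soft_threshold(z, lam):
    """Shrink z toward zero by lam (proximal operator of the L1 penalty)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_fit(X, y, lam, max_sweeps=200, tol=1e-8):
    """Coordinate descent for (1/(2n)) * ||y - b0 - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered features
    resid = [yi - y_mean for yi in y]  # residual while all coefficients are zero
    beta = [0.0] * p
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(max_sweeps):
        biggest_step = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant feature carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            step = new_bj - beta[j]
            if step != 0.0:
                for i in range(n):
                    resid[i] -= step * Xc[i][j]
                beta[j] = new_bj
                biggest_step = max(biggest_step, abs(step))
        if biggest_step < tol:
            break  # coefficients stopped changing
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta

def predict(intercept, beta, X):
    return [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]

def cv_lasso(X, y, lambdas, k=10, seed=0):
    """Return the penalty with the lowest k-fold cross-validated mean squared error."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]  # k disjoint held-out sets
    cv_mse = []
    for lam in lambdas:
        sq_err = 0.0
        for held in folds:
            held_set = set(held)
            train = [i for i in idx if i not in held_set]
            b0, beta = lasso_fit([X[i] for i in train], [y[i] for i in train], lam)
            preds = predict(b0, beta, [X[i] for i in held])
            sq_err += sum((y[i] - yhat) ** 2 for i, yhat in zip(held, preds))
        cv_mse.append(sq_err / len(X))
    best = min(range(len(lambdas)), key=cv_mse.__getitem__)
    return lambdas[best], cv_mse

def simulate(n=60, p=6, seed=1):
    """Hypothetical data: only features 0 and 3 influence the response."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * r[0] - 1.5 * r[3] + rng.gauss(0, 0.1) for r in X]
    return X, y

X, y = simulate()
best_lam, cv_mse = cv_lasso(X, y, lambdas=[0.01, 0.05, 0.2, 1.0])
_, beta = lasso_fit(X, y, best_lam)  # refit on all samples at the selected penalty
ranking = sorted(range(len(beta)), key=lambda j: abs(beta[j]), reverse=True)
print("selected lambda:", best_lam)
print("features ranked by |coefficient|:", ranking)
```

In this sketch the penalty is chosen by minimum cross-validated error and the model is then refit on all samples. Ranking by absolute coefficient is only meaningful when features share a common scale, so real covariates would need standardizing first.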