key: cord-1000014-kxuapnul authors: Gómez-Carballa, Alberto; Rivero-Calle, Irene; Pardo-Seco, Jacobo; Gómez-Rial, José; Rivero-Velasco, Carmen; Rodríguez-Núñez, Nuria; Barbeito-Castiñeiras, Gema; Pérez-Freixo, Hugo; Cebey-López, Miriam; Barral-Arca, Ruth; Rodriguez-Tenreiro, Carmen; Dacosta-Urbieta, Ana; Bello, Xabier; Pischedda, Sara; Currás-Tuala, María José; Viz-Lasheras, Sandra; Martinón-Torres, Federico; Salas, Antonio title: A multi-tissue study of immune gene expression profiling highlights the key role of the nasal epithelium in COVID-19 severity date: 2022-02-22 journal: Environ Res DOI: 10.1016/j.envres.2022.112890 sha: 186bfe4af631db6764c93e8d93b42a430227f911 doc_id: 1000014 cord_uid: kxuapnul COVID-19 symptoms range from mild to severe illness; the cause for this differential response to infection remains unknown. Unravelling the immune mechanisms acting at different levels of the colonization process might be key to understand these differences. We carried out a multi-tissue (nasal, buccal and blood; n = 156) gene expression analysis of immune-related genes from patients affected by different COVID-19 severities, and healthy controls through the nCounter technology. We then used a gene expression approach and pathways analysis to detect tissue specific immune severity signals in COVID-19 patients. Mild and asymptomatic cases showed a powerful innate antiviral response in nasal epithelium, characterized by activation of interferon (IFN) pathway and downstream cascades, successfully controlling the infection at local level. In contrast, weak macrophage/monocyte driven innate antiviral response and lack of IFN signalling activity were shown in severe cases. Consequently, oral mucosa from severe patients showed signals of viral activity, cell arresting and viral dissemination to the lower respiratory tract, which ultimately could explain the exacerbated innate immune response and impaired adaptative immune responses observed at systemic level. Results from saliva transcriptome suggest that the buccal cavity might play a key role in SARS-CoV-2 infection and dissemination in patients with worse prognosis. Co-expression network analysis adds further support to these findings, by detecting modules specifically correlated with severity involved in the abovementioned biological routes; this analysis also provides new candidate genes that might be tested as biomarker in future studies. We found severity-related signatures in patient tissues mainly represented by genes involved in the innate immune system and cytokine/chemokine signalling. Local immune response could be key to determine the course of the systemic response and thus COVID-19 severity. Our findings provide a framework to investigate severity host gene biomarkers and pathways that might be relevant to diagnosis, prognosis, and therapy. The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes the coronavirus disease 19 , was first identified in patients suffering from a respiratory disease in Wuhan (Hubei province, China) in December 2019 (Gómez-Carballa et al. 2020b; Pipes et al. 2020) . Soon after, the virus quickly spread worldwide favored by two main catalyzers, namely, variants showing higher transmissibility (Davies et al. 2021 ) and superspreading events (Althouse et al. 2020; Gómez-Carballa et al. 2020a; Lemieux et al. 2020; Gómez-Carballa et al. 2021) . COVID-19 is mainly characterized by producing influenza-like illness Lamers et al. 2020) . A wide range of symptoms have been described, from absolutely no symptoms to severe pneumonia, lung damage, and progressive respiratory failure . Investigating the immunological processes underpinning SARS-CoV-2 pathogenesis is key for the development of efficient therapeutic strategies and vaccines against COVID-19. The immune system dysregulation described in severe patients is characterized by a reduce number of lymphocyte T subpopulations CD4+ and CD8+, pointing to an absence of T-cell specific response (Qin et al. 2020; Yang et al. 2020; Zheng et al. 2020b ). Impairment of the adaptative immune system, together with a non-effective host immune response by innate cells, could be the cause of a higher viral replication and the activation of inflammatory processes, resulting in tissue damage and a more severe disease course (Gomez-Rial et al. 2020; Qin et al. 2020) . The hyper-inflammatory response occurring during the acute phase is mainly driven by the monocyte-macrophage activation (Gómez-Rial et al. 2020) , the release of pro-inflammatory cytokines (McGonagle et al. 2020; Qin et al. 2020; Zhang et al. 2020 ) and increase of other mediators e.g. ferritin and Creactive protein (CRP). The infiltration of monocytes-derived macrophages from blood points to these cells as the main cause of the lung tissue damage in severe cases (Gomez-Rial et al. 2020) . Gene expression studies conducted on blood samples from COVID-19 patients indicated that genes associated with inflammatory and hypercoagulability pathways (Zhou et al. 2021) Most of the studies to date were carried out in vitro or nasal-derived epithelial cells (Gamage et al. 2020; Sajuthi et al. 2020; Sungnak et al. 2020; Alfi et al. 2021) . Only a few of them were performed on naso-pharyngeal (NP) swabs of infected patients; the results pointed to different expression patterns depending on the viral load, age, and sex (Lieberman et al. 2020) . Comparison of expression profiles of infected adults and children (Pierce et al. 2021) indicated an early and more efficient innate immune response in nasal mucosa from children, thus conditioning the final outcome of the disease. So far, only one study undertaken in only three severe patients has proposed candidate transcripts for severity and disease outcome in NP samples (Jain et al. 2021 ). Saliva has also been proposed as a good alternative to NP swabs for SARS-CoV-2 detection (Pasomsub et al. 2020; Wyllie et al. 2020 ). The study of the specific host gene expression profile produced by the infection in oral mucosa has not raised comparable interest; only a recent study by Huang et al. (2021) highlighted the importance of the oral cavity in viral transmission and dissemination. We carried out a multi-source gene expression study in blood, nasal, and saliva samples, collected on asymptomatic, mild, and severe patients. Our study aims at understanding the differential immune response that leads to different disease severities, and finding specific biomarkers of severity to improve patient management and prognostic prediction. We prospectively collected blood, saliva, and nasal epithelium samples from 52 COVID-19 patients with confirmed PCR-positive diagnosis of SARS-CoV-2 J o u r n a l P r e -p r o o f infection in the Hospital Clínico Universitario of Santiago de Compostela (Galicia; Spain) from March to June 2020. We also collected samples from healthy controls, defined as asymptomatic subjects with no diagnosis of COVID-19 disease at the time of sample collection. To avoid potential effects of ancestry on gene expression (Barral-Arca et al. 2019) , we assured that patients and healthy controls were of (self-reported) South-European ancestry. In addition, we also carried out a maximum likelihood estimation of individual ancestries from multilocus SNP data using whole exome sequencing data available from a significant proportion of the patients (35/52; data not shown). Procedures for exome data processing, quality control, variant annotation and estimation of ancestries were performed as in (Salas et al. 2017; Salas et al. 2018) . All the analyzed samples were from European genomic ancestry (Supplementary Table 1 ) corroborating self-reported expectations. Cases were classified at the time of sample collection by severity illness: severe (ICU admission), moderate (non-ICU but admitted to hospital) and mild (domiciliary lockdown patients with mild symptoms or asymptomatic; herein mild patients). Comparisons were carried out as follows: (i) whole blood from 41 patients and 13 controls, (ii) nasal epithelium from 38 patients and 11 controls, and (iii) saliva from 41 patients and 12 controls. The complete tissue sample set could be collected for 27 out of the 52 patients. Demographic and clinical data from the patients are summarized in Table 1 (see Supplementary Table 1 for more details). The study was approved by the Ethical Committee of Clinical Investigation of Galicia (CEIC ref. 2020/178) kit (Qiagen) following manufacturer recommendations and carrying out the oncolumn DNase I treatment during the extraction process. Saliva and nasal epithelium samples were collected in Oragene CP-190 kit (DNA Genotek). We used sponges (SC-2 kit from DNA Genotek) to assist saliva collection in ICU patients unable to spit. RNA from saliva and nasal epithelium specimens was isolated using 500l of sample and the RNeasy microkit (Qiagen). We used Oragene RNA neutralizer solution (DNA Genotek) to precipitate impurities and inhibitors and slightly modified the protocol provided by the extraction kit as recommended by the Oragene tubes supplier. After isolation, RNA amount and integrity was checked using TapeStation 4200 (Agilent), and DV200 values were calculated to both ensure that >50% of the of the RNA fragments were above 200nt and estimate the optimal sample input for the nCounter analysis. Universal Transport Medium tubes (UTM) nasopharyngeal samples were collected to check for the presence as well as the viral load of SARS-CoV-2 at the same time as collection of blood, saliva, and nasal epithelium samples. Molecular detection of viral particles was performed through a multiplex real-time PCR using the Allplex™SARS-CoV-2 Assay (Seegene). Nanostring nCounter expression assay Immunological gene expression patterns of different tissues from COVID-19 patients and controls were evaluated through the SPRINT nCounter system (NanoString Technologies) and the Immunology Panel (594 genes involved in several immunological processes). This panel covers the main immunological routes, including major classes of cytokines and their receptors, chemokine ligands and receptors, IFNs and their receptors, the TNF-receptor superfamily, and the KIR family genes. We used the standard gene expression protocol for 12x RNA hybridization with 5l of RNA as input, a hybridization time of 18h for all samples and mixing control and COVID-19 samples to avoid technical sample bias. We first carried out a quality control (QC) of the raw data checking some technical parameters following manufacturer recommendations to verify the absence of technical problems. Samples that did not pass technical QC, or with low number of genes detected, were excluded for downstream analysis. J o u r n a l P r e -p r o o f Sample normalization and differential expression analysis For the longitudinal multi-tissue experiment, normalization of the gene expression data was carried out for each tissue separately by selecting the optimal number and best reference candidate from the set of housekeeping (HK) genes included in the panel, as well as other endogenous genes that showed high levels of stability between samples. The GeNorm algorithm (Vandesompele et al. 2002) implemented in the R package CrtlGene (Zhong 2019 ) was used to test for the most stable and optimal number of genes for normalization. We excluded HK candidate genes with less than 50 raw counts in any of the samples compared from the reference selection analysis. Data normalization was performed through a combined approach using both DESeq2 (Love et al. 2014) and RUVSeq (Risso et al. 2014 ) as described in (Bhattacharya et al. 2020) . Genes with counts below the background (mean + 2 standard deviations [SD] of the negative control spikes in the code set) were not included in the differential expression (DE) analysis. After data normalization and background thresholding, we eliminated lower expressed genes from the list of DEG, retaining a total of 495 genes in blood samples, 544 genes in nasal epithelium samples and 545 genes in saliva samples for downstream analysis. DEG between cohorts were obtained using DESeq2 (Love et al. 2014) . We used age and sex as covariables to correct the model. Volcano plots and heatmaps were built with EnhancedVolcano (Blighe et al. 2020) and ComplexHeatmap (Gu et al. 2016 ) R packages, respectively. All graphics were created using R (www.r-project.org) software and the ggplot2 package (Wickham 2016) . Wilcoxon test was used to assess statistical significance between patient groups, and Spearman test for the correlation indices. We focused on highly statistically significant expression changes by considering as thresholds an adjusted P-value < 0.05, and log2FC  |1.5| for downstream analysis. For the transversal multi-tissue experiment we subselected 21 patients for which samples from the three tissues were available (mild: n = 11; severe: n = 10). We employed a complex model design including the tissue/severity interaction to find tissue-specific severity effects. As reference for the analyses, we used blood and the mild category. We followed the same procedure as in the longitudinal analysis but data from the three tissues were normalized together after selecting a set of adequate reference genes. J o u r n a l P r e -p r o o f We followed both over-representation (ORA) and gene-set enrichment (GSEA) (Subramanian et al. 2005) approaches to identify biological functions, cellular components and pathways related to the differences in gene expression. For ORA we used the DEG as input (threshold: log2FC  |1.5|), and the genes included in the expression panel as pool for statistical calculations. For GSEA we used all available molecular measurements (log2FC) from the genes included in the DE to detect coordinated changes in the expression of genes from same pathway. Analyses were carried out using the Clusterprofiler (Yu et al. 2012 ) R package. We applied the Benjamini-Hochberg (H-B) procedure for multiple test correction and thresholds were set to 0.05. We tested both the GO (Gene Ontology) and Reactome databases. Graphics were obtained with the R package enrichplot (Yu 2021 ). We used three additional datasets to validate and check the specificity of the transcriptomic findings using independent data: a) Thair et al. (Thair et al. 2021) included RNA-seq transcriptomes for 62 SARS-CoV-2 positive and 24 healthy controls; and b) Aschenbrenner et al. (Aschenbrenner et al. 2021 ) included RNAseq transcriptomes for 39 COVID-19 patients and 10 healthy controls. We additionally checked the specificity of the candidate genes obtained to differentiate COVID-19 from other viral infections using log2FC values reported in (Thair et al. 2021) , through the comparison of other mixed viral infections (n = 652) and healthy control samples (n = 672). Independently, to further test the status of our best candidate genes in bacterial and other viral infections, we carried out an independent validation study by meta-analyzing samples from bacterial (n = 382) and viral (n = 513) infections, and healthy controls (n = 215). Microarray and RNA-seq data were downloaded from the public gene expression microarray repository Gene Expression Omnibus (GEO) under accession numbers: GSE64456 (Mahajan et al. 2016) , GSE40012 (Parnell et al. 2012 ), GSE42026 (Herberg et al. 2013) , GSE60244 (Suarez et al. 2015) , those also included in Thair et al. (Thair et al. 2021 ) as viral cohorts; and GSE72829 (Herberg et al. 2016) , GSE69529 (DeBerg et al. 2018) . To merge and integrate J o u r n a l P r e -p r o o f independently viral vs. healthy controls and bacterial vs. healthy controls, we first normalized and pre-processed each dataset separately using Lumi (Du et al. 2008) for Illumina® microarrays data and Oligo (Carvalho and Irizarry 2010) for Affymetrix® datasets, while RNA-seq data were pre-processed as described previously . Then, we merged these databases retaining only the common genes. Subsequently, we used the R package COCONUT to combine all datasets into one and reduce batch effects in the meta-analysis (Sweeney et al. 2016 ). Next, we used Limma package (Ritchie et al. 2015) to detect DEG and calculate log2FC values between groups. Scatterplots of log2FC values and Pearson correlation indexes were obtained for the pseudo-validation analysis. We imputed immune cell subset abundances from nCounter expression data in different patient categories, including mild and healthy controls, from which real immune cell counts from blood tests were not available. We used the CIBERSORTX (Newman et al. 2019) tool and a leukocyte gene signature matrix (Newman et al. 2015) obtained from microarray data as reference. We applied a B-mode batch correction to account for cross-platform variation and 500 permutations for the analysis. We graphically represented immune cell proportions using R software and carried out a Wilcoxon test to assess statistical significance between categories. We generated a signed weighted correlation networks from gene expression data with the WGCNA R package (Langfelder and Horvath 2008) in order to find coexpression modules related to mild-severe phenotypes in the different tissues analyzed. We used normalized expression data (and corrected for differences in gender and age) from genes showing the most variant expression values between samples (top 50% with higher variance) as input, and we chose a softthresholding power based on the criterion of scale-free topology after testing a set of candidate powers (powers selected were 16, 9 and 18 for blood, nasal epithelium and saliva samples, respectively). The similarity matrix between each pair of genes across all samples was calculated using correlation values and J o u r n a l P r e -p r o o f transformed into an adjacency matrix. Subsequently, the topological overlap matrix (TOM) and the corresponding dissimilarity (1-TOM) values were computed. The resulting topological overlap is a biologically meaningful measure of gene similarity based on the co-expression relationship between two genes. As co-expression module detection parameters we chose a minimum module size of 30, a medium sensitivity for cluster splitting, and a 0.2 value as dendrogram cut heigh threshold for module merging. Correlation between module eigengenes and clinical traits were analyzed to identify modules of interest significantly associated with the mild-severe phenotype (gene significance [GS] ). Module membership (MM), as a measurement of intramodular connectivity, was also calculated. We have explored the correlation between GS and MM, and calculated the average absolute gene significance for all genes within a module in order to highlight the most important modules. The top hub genes within the most relevant modules were selected using both MM > 0.7 and a P-value for GS < 0.05 as thresholds. We studied biological significance of the hub genes from the phenotyperelated modules by performing an over-representation analysis and using the compare cluster function included in the Clusterprofiler R package (Yu et al. 2012 ). Expression data that support the findings of the present study have been deposited in Gene Expression Omnibus (GEO) database under the accession code GSE183071. Most of the clinical features were only available for hospitalized patients ( Table 1) . Similar median symptoms days to sampling were obtained for mild, moderate, and severe cases, as well as admission days to sampling in the case of moderate and severe patients (P-value = 0.236 and 0.704, respectively). During admission, oxygen support was required for 64.3% of the moderate cases and 100% of the severe cases (78.6% required invasive ventilation). All moderate cases were J o u r n a l P r e -p r o o f treated with a double therapy including antibiotic and antimalarial drugs (in 78.6% of the moderate cases, antiviral therapy was also administrated). Severe cases received a combined triple therapy of antibiotic, antiviral (lopinavir and ritonavir) and antimalarial (hydroxychloroquine) treatment, and 46.6% of them received tocilizumab. No statistical differences were observed in viral loads at the time of sample collection depending on COVID-19 severity, nor statistical correlation between days of symptoms to sample collection and Ct values from the SARS-CoV-2 qPCR assay (Supplementary Figure 1A) on nasopharyngeal swabs obtained at the point of sample collection. However, we obtained statistically significant differences in some analytical parameters between severe and moderate cases at the sampling timepoint: higher neutrophil counts (P-value = 0.012), ferritin (Pvalue = 0.002), D-Dimer (P-value = 0.006) and lactate (P-value = 0.019) values, but lower lymphocyte counts (P-value = 0.002) and fibrinogen (P-value = 0.004) values in severe cases. Maximum values for C-reactive protein (CRP) and neutrophils (P-values = 0.02 and 0.006, respectively) and minimum values for lymphocytes (P-value < 0.001) also showed statistically significant differences between these groups. Immune cell fractions inferred from gene expression data also pointed to a significantly lower proportion of T lymphocyte subpopulations CD4 and CD8 in severe cases than in other severity categories and healthy controls (Supplementary Figure 1B) . Likewise, the proportion of B cells was significantly higher in mild when compared to moderate and severe cases. Cell numbers of CD4 and NK subpopulation also showed a significant decrease with disease severity. Finally, a higher proportion of monocytes was also observed in moderate cases, correlating well with real monocyte counts detected in the blood tests from moderate and severe cases. Principal component analysis (PCA) of blood gene expression revealed a moderate differentiation by severity in the first principal component (PC1; Fig. 1A ). When we compared the whole COVID-19 cohort against controls we obtained nine DEG (Supplementary Table 2, Fig. 1A ), all of them overexpressed in patients except for FCER1A gene that is under-expressed (log2FC J o u r n a l P r e -p r o o f = -1.7; adjusted P-value = 2.3E-5). The most statistically significant gene was TNFRSF17 gene (log2FC = 1.9; adjusted P-value = 2.4E-10), followed by FCER1A and SERPING1 genes. Three out of the nine DEG were found to be still significant when using a model that accounted for differences in treatment with immunomodulators (steroids), administered only in some severe and moderate patients: TNFRSF17 (log2FC = 1.72; adjusted P-value = 4.1E-6), SERPING1 healthy controls). Notably, TNFRSF17 gene (log2FC = 0.16; adjusted P-value = 0.50) did not show appreciable differences between non-COVID-19 viral infections and controls. To further test the differential pattern of this gene in COVID-19 patients, we used a combined dataset from different microarray and RNA-seq studies that includes transcriptomes from viral and bacterial infected patients, and healthy controls. We consistently observed again a moderate correlation between our results and those from the viral infections vs. healthy control dataset (R = 0.81, P-value = 0.0078), and a weaker correlation with those from bacterial infections vs. healthy control dataset (R = 0.61, P-value = 0.081). Again, TNFRSF17 gene showed a log2FC value very close to 0, and nonsignificant results in the viral combined dataset (log2FC = -0.30; adjusted P-value = 0.28). Furthermore, in the bacterial combined dataset, the TNFRSF17 gene (and MX1) was under-expressed in bacterial infections vs. controls, even though this contrast was also statistically significant (log2FC = -0.81, adjusted P-value = 1.9E-4) as in COVID-19 datasets. We performed a more detailed analysis of the immunological processes underlying COVID-19 by comparing different severities ( Fig. 1B Table 2 ). Other significatively over-expressed genes in severe cases were those related to proinflammatory pathways IL-1 (IL1R1 and IL1R2) and IL-18 (IL18R1) and immunoregulatory functions (FKBP5 and S100A8); Supplementary Figure S5 . To further test this 14-transcript COVID-19 signature for severe patients, we compared our results to an independent cohort (Aschenbrenner et al. 2021) ; after excluding two genes (ZBTB16 and IL1RL1) that were not found to be DE between their severe cases vs. healthy controls (Aschenbrenner et al. 2021 ), we found a notable correlation between both datasets (R = 0.87; P-value = 0.00023); Analysis of moderate vs. healthy controls and mild cases vs. healthy controls returned a few DEG coinciding with the previous 14-transcript signature, e.g. TNFRSF17 and SERPING1. TNFRSF17 represents the strongest signal of COVID-19 disease of all the genes analyzed, with an increased gene expression associated with severity (log2FC values compared to healthy controls: 1.66 in mild, 1.95 in moderate, and 2.71 in severe cases); Supplementary Table 2 . Conversely, FCER1A gene was under-expressed in patients and shows a gradual decreased gene expression with severity (log2FC values compared to healthy controls: not DE in mild, -1.83 in moderate, and -3.06 in severe cases). Genes MX1 and C1QB showed an increased expression in mild (log2FC = 2.10; adjusted P-value= 1.32E-5) and moderate cases (log2FC = 1.78; adjusted Pvalue = 5.59E-5) respectively, with respect to controls (Supplementary Table 2 in severe vs. moderate cases, and even higher in severe vs. controls/mild cohorts. Other under-expressed genes in severe compared to mild/moderate cases included IFIT2 (Interferon Induced Protein with Tetratricopeptide Repeats 2) and IDO1, whose encoded protein is produced in response to inflammation and has autoimmune protective features (Supplementary Table 4 The other DEGs of the signature from hospitalized samples also exhibited expression levels that increased with severity (Supplementary Figure 22) . We found 18 common over-expressed DEGs that differentiate severe cases from the other phenotypic categories ( Fig. 3E; Supplementary Figure S23 ). CCL20 showed the highest log2FC value in all comparisons (Supplementary Table 6 ). As expected, higher differences were found in severe vs. mild and severe vs. controls (Fig. 3E) . Eleven out of the 18 genes were also included in the signature that differentiated hospitalized from both nonhospitalized and healthy controls, respectively ( Fig. 3E; Supplementary Figure S24 ). Pathway analysis in severe vs. controls using GSEA found some moderate up-regulation of processes and cell components, but only phagocytic vesicle as under-regulated cellular component (Supplementary Figure 25) . Some of these pathways were also highlighted as up-regulated in severe vs. mild patients but, in this case, a positive regulation of cell migration and motility was also detected (Supplementary Table 7 We selected a subset of severe and mild patients (n = 21) with samples available from all tissues collected at the same time. We explored for interactions between severity and tissue to elucidate immunological features that could illuminate the causes for the differential evolution observed in patients. Only when relaxing the thresholds for this interaction analysis (adjusted P-value < 0.05 and a cumulative log2FC > |1.4|) we found that the effect of severity for the CXCL9 gene is significant for all tissue comparisons, being much lower in nasal compared to blood tissue (P-adjusted = 1.41E-6; cumulative log2FC = -2.94) and moderately lower for both saliva vs. blood (P-adjusted = 0.03; cumulative log2FC = -1.42) and Table 8 ). Looking at the CXCL9 normalized data, we found higher expression in severe than in mild cases for blood, with the opposite pattern in nasal and saliva epithelium samples (Supplementary Figure 31) . When we increased the significance thresholds (P-adjusted < 10E-6; cumulative log2FC > |2.5|) a set of 17 genes were found as DE in all possible comparisons ( Supplementary Table 8; Fig. 4B; Fig. 4C ). Thirteen out of these 17 genes were also detected as DE between severe and mild cases in the longitudinal transcriptomic analysis (Fig. 4B) . The transversal multi-tissue analysis generates four clusters for these 17 genes according to their cumulative Log2FC values (Fig. 4C ). This analysis highlights that the main specific severity signals in blood were driven by gene cluster 1 represented by ARG1, CEACAM8, LTF and ITGA2B, and to a lesser extent by cluster 2 (specially IRAK3 and TLR5). This means that severity has a more marked effect in blood in comparison with nasal and saliva samples ( Fig. 4C; Supplementary Table 8; Supplementary Figure 31 ). Cluster 3 (Fig. 4C) , represented by CD22 and TRAF1, indicates that severity determines an opposite behavior in nasal (lower expression in mild and higher in severe) than in blood (higher expression in mild and lower in severe (Supplementary Figure 31) . Finally, cluster 4 ( Fig. 4C) represented by seven genes indicates a differential effect of severity in saliva with respect to blood and nasal epithelium (especially remarkable for MUC1 and LGALS3). Co-expression analysis related to the mild-severe phenotypes Through a co-expression network analysis, we aimed to detect modules of coexpressed genes that might be responsible for the extremely unbalance immune response to the SARS-CoV-2 infection between mild and severe phenotypes. We detected four co-expression clusters of immune genes in blood samples (Supplementary Figures 32A; Supplementary Table 9 ) constituting two meta-modules of highly related clusters (Supplementary Figures 32B) . Two of these modules were weakly significantly correlated with the mild-severe phenotype, one indicating over-expression ('blue' module in Supplementary (Nchioua et al. 2020 ). It has been described the importance of a solid T cell response in COVID-19 recovery (Zheng et al. 2020a) and, therefore, the suppression of the adaptative immune system could explain in part the characteristic lymphopenia observed in severe patients. In agreement with an over-activation of the innate and under-activation of adaptative immune response, an unbalanced activation of HLA class I and II antigen presentation was also found in severe cases. Overall our results are in the line with previous gene expression studies carried out in blood samples (Zheng et al. 2020a; Yan et al. 2021) , indicating that severe cases exhibit signals of a systemic impairment of IFN response characterized by a non-robust IFN production, a suppression of T and B responses and lymphopenia, neutrophil recruitment (with overexpression of ARG1 gene), abnormal increase of low density neutrophils (with high capacity to release NETs) and also enrichment in blood coagulation-related genes. The most prominently over-expressed genes in the nasal epithelium of severe cases were anti-inflammatory IL-1 family receptors IL-1R2 and IL-18R1 J o u r n a l P r e -p r o o f (mainly expressed by monocytes-macrophages). Both receptors are involved in the IL-1 and IL-18 cascades, which play a role in the initiation of the inflammation process after PAMPs recognition. Nevertheless, we did not detect DE of the proinflammatory cytokines IL1A, IL1B, IL1RAP or IL18 in severe samples, just a moderate up-regulation of IL1R1 gene. This imbalance between agonistsantagonists/soluble receptors in severe disease with respect to mild cases was previously noted in broncho-alveolar lavage fluids (BALF) from COVID-19 patients , and it could lead to exacerbated inflammatory responses (Italiani et al. 2020 ). In addition, pathways analysis detected an alteration of NAD+ metabolism in severe cases with respect to those with mild disease, probably pointing to a NAD (and ATP) depletion in host cells from severe patients. It is known that some intracellular pathogens can modulate host cell NAD, impairing several innate immune routes (Mesquita et al. 2016 ). Nasal samples from severe cases also showed similar expression patterns of CXCchemokines, GZMB and antiviral IFITM1/IFIT2 genes compared to controls. Unlike in mild cases, in which main signals came from cell-mediated immunity (Th1, NK and T cytotoxic cells responses) as well as from a robust IFN-II induction, severe cases show an over-expression of genes associated to monocyte-macrophage recruitment; these cells could play a key role in the cytokine storm observed in severe patients (Merad and Martin 2020) . For instance, CC pro-inflammatory chemokines genes CCL3 and CCL4 act as chemoattractant cytokines in response to tissue injury or inflammation. Some of them have already been highlighted as severity-related biomarkers of COVID-19 in airway epithelial cells (Chua et al. 2020) , and they were found highly expressed in BALF of severe compared to mild cases . They are ligands of chemokine receptors CCR1 (CCL3) and CCR5 (CCL3 and CCL4) and have a role in monocyte recruitment. CD163, CLEC4E and MARCO are expressed in macrophages, and the latter two can act as pattern recognition receptors (PRR) for pathogen-associated molecular patterns (PAMPs). The protein encoded by CLEC4E can interact with FCER1G which, unlike FCRE1A gene, is overexpressed in severe cases when compared to controls. Nasal epithelium response in moderate and mild patients was mainly characterized by an overexpression of CXCR3 receptor ligands chemokines CXCL9, CXCL10 and CXCL11. These chemokines are highly inducible by IFN gamma, and once joined J o u r n a l P r e -p r o o f to the receptor IFN is activated and triggers calcium release to the cytosol, acting as a second messenger to activate downstream signalling pathways (also highlighted in the pathway analysis of nasal samples from mild patients). These chemokines were previously described as biomarkers of respiratory viral infection in nasopharyngeal samples (Landry and Foxman 2018) . Similarly, other antiviral genes were also over-expressed in mild cases: IFITM1 has antiviral activity against several enveloped viruses, hampering the viral membrane fusion, and has demonstrated a strong inhibition to SARS-CoV-2 infection by blocking the virus entrance into cells (Buchrieser et al. 2020) ; and IFIT2 is involved in the inhibition of viral replication. Co-expression network analysis yielded comparable results to those observed in DE analysis, reporting two significant modules with opposite behavior in nasal epithelium from severe and mild cases. One of the modules consists of genes showing over-expression in the mild phenotype, while the other includes genes over-expressed in severe patients. Hub genes from the mild-related module are mostly implicated in CXCR chemokine receptor binding processes, interferon gamma signalling and adaptative T-cell immune response. By contrast, hub genes within the severe-related module were linked to IL-1 response, CCR chemokine receptor binding, NAD metabolism, and an innate response involving neutrophils and TLR cascades. Contrary to the expression profile observed in nasal epithelium samples, there is an absence of appreciable DE in the saliva of mild cases, suggesting a lack of oral immune response against viral infection. Nevertheless, in severe, and to a lesser extent in moderate samples, some up-regulated severity related hits were detected, many of them linked to the viral dynamics. The virus has mechanisms to hijack cellular machinery to their own benefit, altering the levels of several metabolites and cellular processes; one of these molecules are galectins, which have been proposed as biomarkers to monitor viral infections as well as therapeutic targets or antagonists . One of the most remarkable over-expressed hits in moderate and severe saliva samples was LGALS3 (galectin-3), a contributor to pro-inflammatory response against different viral infections , enhancing macrophage survival, and positively regulating macrophage recruitment (Liu and Hsu 2007) , even promoting a more severe course of the infection (Chen et al. 2018) . Elevated J o u r n a l P r e -p r o o f levels of galectin-3 were observed in macrophages, monocytes and dendritic cells in severe vs. mild COVID-19 patients (Caniglia et al. 2020) ; the regulatory role of galectin-3 in inflammatory response, fibrosis and disease progression, makes it a promising therapeutic target for severe cases (Garcia-Revilla et al. 2020 ). We did not observe systemic or nasal epithelium over-expression of this gene. Another important over-expressed gene in saliva was PLAU (urokinasetype plasminogen activator), which, together with CD59 (inhibitor of complementmediated lysis) and CD9 (tetraspanin), can facilitate virus-host cell interaction, host immune system viral evasion and, thus, dissemination and viral pathogenesis through buccal epithelium. Urokinase transforms plasminogen to the active form serine protease plasmin; this airway protease (like several others e.g. trypsin, cathepsins, elastase, and TMPRSS family members) are required for coronavirus entry enhancement (Millet and Whittaker 2015) . After viral interaction with ACE2 receptor, furin site in S protein of SARS-CoV-2 is proteolytically cleaved by TMPRSS2 to enable viral internalization (Devaux et al. 2020 ). Plasmin has demonstrated to cleave several viral proteins of many respiratory pathogens (Dubovi et al. 1983; Goto et al. 2001; Berri et al. 2013 ) and also S protein in SARS-CoV in vitro (Kam et al. 2009 ), increasing the ability of the virus to enter host cells. Even though the plasmin protein S cleavage in SARS-CoC-2 has not been demonstrated, there is growing evidence pointing towards a probable role of the plasmin-plasminogen system and fibrinolytic pathway in the COVID-19 severity (Gralinski et al. 2013; D'Alonzo et al. 2020; Ji et al. 2020; Kwaan and Lindholm 2021) . Co-expression and upregulation of PLAU, TMPRSS2, and ACE2 was observed in airway epithelial cells from severe/moderate patients and SARS-CoV-2 infected cell lines. This, together with a plasmin-mediated cleavage, could make cells more susceptible to infection (Hou et al. 2021) . Interestingly, we also observed an enhanced expression of PLAU in nasal epithelium from severe cases when compared to mild and controls, highlighting the crosslink between both tissues. In the same vein, tetraspanins are molecules associated to specific viruses and seem to play a role in different stages of viral infections (Martin et al. 2005) . Tetraspanin CD9 was involved in early MERS-CoV lung cell entry (Earnest et al. 2017) . Recently, it has also been proposed that extracellular vesicles may contribute to viral dissemination by transferring viral receptors, such as CD9 and ACE2, to other host cells making J o u r n a l P r e -p r o o f them more susceptible to the infection after exosomes endocytosis (Hassanpour et al. 2020) . Our functional analysis revealed down-regulation of some pathways related to cell communication, adhesion and signal transduction in mild cases, probably as a mechanism to minimize viral spread. Conversely, severe cases showed up-regulation of genes involved in cell junction, adhesion and extracellular exosomes, all processes that could facilitate the viral dissemination. CD59 is a non-IFN inducible protein that acts as inhibitor of complementmediated lysis by disruption of the membrane attack complex. Its implication in viral complement evasion of many enveloped viruses (Li and Parks 2018) , including coronaviruses (Wei et al. 2017) , has been demonstrated. This escape mechanism can be mediated by incorporating CD59 to the viral envelope, controlling cellular CD59 expression or producing a counterpart protein that mimics host CD59, or even use it as cell entry receptor (Maloney et al. 2020) . We have found an important co-expression module defining the severe phenotype in saliva samples containing all these aforementioned genes. In addition, most of them were also hub genes in the module, indicating that genes highly significantly associated with the severe phenotype were also key elements within this module. These hub genes were related to lymphocyte proliferation and activation, but also to apoptosis, cell-cell adhesion and cell cycle arrest. The gene with most connections of the module was the SRC gene, a member of the Src family kinases (SFKs) that demonstrated to have an important role in the life cycle of a many viruses and have emerged as an important factor in the interplay between cell functions and viral demands (Pagano et al. 2013) . It has been shown that inhibitors of SFKs can significantly reduce the replication of MERS-CoV (Shin et al. 2018 ) and some candidates are being proposed as treatment for COVID-19 because off its potential to effectively block the viral dissemination (Weisberg et al. 2020; Pillaiyar and Laufer 2021) . Overall, immune response in buccal cavity, due to its anatomical location and interplay between upper and lower respiratory tract (LRT), could play a key role in viral spread and transmission not only at inter-individual level ) but also in intra-individual viral dissemination to LRT, leading to a more severe phenotype. Transversal analysis of different tissue samples from the same patients showed a tissue-specific behavior depending on illness severity. Most of the J o u r n a l P r e -p r o o f tissue-specific severity signals observed were represented by genes involved in innate immune response and cytokine signaling. Genes that appeared differentially expressed with severities in nasal and blood samples are related to leukocyte chemotaxis, cytokine-cytokine signaling, cell adhesion and extracellular exosomes, with higher expression of severe cases (with respect to those with mild disease) in blood and, in contrast, lower expression in nasal samples; B-cell antigen receptor signaling genes showed an opposite pattern. Specific differential signatures of severity in saliva samples correspond to genes involved in inflammatory and NK activation routes, cell adhesion, and vesicle trafficking. Our results point to a set of genes and pathways that are tissue-specific and correlated to severity (Fig. 5) ; many of these are involved in the innate immune system and cytokine/chemokine signaling pathways. Disease severity at systemic level might be strongly dependent on local immune response. Thus, the nasal epithelium shows a robust activation of the antiviral immune machinery in mild cases, characterized by IFN pathway involvement, thus triggering the IFN induced genes, chemokine release and transduction signaling in the cells, leading to the activation of downstream cascades related to Th1, NK and Tcytotoxic adaptative immune response. Instead, severe cases showed these local antiviral reactions switched off; they develop an exacerbated innate response represented by an over-expression of genes related to monocytemacrophage recruitment in nasal epithelium (Fig. 5) . We did not find signals of immune viral response in buccal mucosa in mild patients, which probably indicates a successful local control of the infection and, therefore, prevention of viral dissemination and systemic colonization. On the contrary, buccal mucosa in moderate and, to a greater extent, severe cases, provided evident signals of viral activity, cell arresting and viral dissemination to the LRT (Fig. 5) ; ultimately generating an exacerbated innate and impaired adaptative systemic immune responses. In addition, the buccal cavity might play a key role in body infection dissemination: a severe phenotype seems to be associated to limited immunological resistance to the virus in the mouth, therefore facilitating the dissemination of the virus towards the LRT. This could make saliva not only a good prognostic proxy of severity, but also a source of therapeutic targets to contain viral dissemination in early stages of the disease. The present study has a few limitations. The number of genes that could be explored is limited by the technology employed (n-Counter from NanoString); in counterpart, this is the only one available that allows the analysis of RNA expression patterns from different tissues (usually involving limited and/or degraded RNA). Although we used a sample size substantially higher than usual in transcriptomic studies, results from the present study should be further validated using a similar design and other independent cohorts and, ideally, performing functional studies aimed at validating the severity related pathways detected. The growing number of high-throughput gene expression data available from COVID-19 patients could also help to validate tissue specific findings in a near future. Overall, our study highlights the key role of the nasal epithelium in COVID-19 severity, and include transcript biomarkers and pathways that signal severity at different tissue layers. The findings might illuminate new diagnostic, prognostic, or therapeutic targets in COVID-19. AS and FMT conceived, designed, and provided financial support to the study. Grupos con Potencial de Crecimiento Framework Partnership Agreement between the Consellería de Sanidad de la XUNTA de Galicia and GENVIP-IDIS -2021-2024 Human nasal and lung tissues infected ex vivo with SARS-CoV-2 provide insights into differential tissue-specific and virusspecific innate immune responses in the upper and lower respiratory tract Superspreading events in the transmission dynamics of SARS-CoV-2: Opportunities for interventions and control Disease severity-specific neutrophil signatures in blood transcriptomes stratify COVID-19 patients Ancestry patterns inferred from massive RNA-seq data A 2-transcript host cell signature distinguishes viral from bacterial diarrhea and it is influenced by the severity of symptoms Plasminogen controls inflammation and pathogenesis of influenza virus infections via fibrinolysis An approach for normalization and quality control for NanoString RNA expression data EnhancedVolcano: Publication-ready volcano plots with enhanced colouring and labeling. version 180 Syncytia formation by SARS-CoV-2-infected cells Immunopathology of galectin-3: an increasingly promising target in COVID-19 A framework for oligonucleotide microarray preprocessing Detection of SARS-CoV-2 in saliva and characterization of oral symptoms in COVID-19 patients Galectin-3 Enhances Avian H5N1 Influenza A Virus-Induced Pulmonary Inflammation by Promoting NLRP3 Inflammasome Activation COVID-19 severity correlates with airway epithelium-immune cell interactions identified by single-cell analysis Outcomes, and Inflammatory Profiles in Hospitalized COVID-19 Regulation of antiviral T cell responses by type I interferons COVID-19 and pneumonia: a role for the uPA/uPAR system Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England Shared and organism-specific host responses to childhood diarrheal diseases revealed by whole blood transcript profiling ACE2 receptor polymorphism: Susceptibility to SARS-CoV-2, hypertension, multi-organ failure, and COVID-19 disease outcome lumi: a pipeline for processing Illumina microarray Enhancement of respiratory syncytial virusinduced cytopathology by trypsin, thrombin, and plasmin The tetraspanin CD9 facilitates MERS-coronavirus entry by scaffolding host cell receptors and proteases Infection of human Nasal Epithelial Cells with SARS-CoV-2 and a 382-nt deletion isolate lacking ORF8 reveals similar viral kinetics and host transcriptional profiles Hyperinflammation and Fibrosis in Severe COVID-19 Patients: Galectin-3 Monocytes/Macrophages in Covid-19 Pathogenesis: Implications for Therapy Plasminogen-binding activity of neuraminidase determines the pathogenicity of influenza A virus Mechanisms of severe acute respiratory syndrome coronavirus-induced acute lung injury Complex heatmaps reveal patterns and correlations in multidimensional genomic data Phylogeography of SARS-CoV-2 pandemic in Spain: a story of multiple introductions, micro-geographic stratification, founder effects, and superspreaders Mapping genome variation of SARS-CoV-2 worldwide highlights the impact of COVID-19 super-spreaders Superspreading in the emergence of COVID-19 variants Indicate a Preponderant Role for Monocytes in COVID-19 Immunopathology Impaired type I interferon activity and inflammatory responses in severe COVID-19 patients The role of extracellular vesicles in COVID-19 virus infection Temporal dynamics in viral shedding and transmissibility of COVID-19 Transcriptomic profiling in childhood H1N1/09 influenza reveals reduced expression of protein synthesis genes Diagnostic test accuracy of a 2-transcript host RNA signature for discriminating bacterial vs viral infection in febrile children Fibrinolysis influences SARS-CoV-2 infection in ciliated cells Clinical features of patients infected with 2019 novel coronavirus in SARS-CoV-2 infection of the oral cavity and saliva Profiling the Course of Resolving vs Elevated Plasmin(ogen) as a Common Risk Factor for COVID-19 Susceptibility Epidemiological, clinical and virological characteristics of 74 cases of coronavirus-infected disease 2019 (COVID-19) with gastrointestinal symptoms Cleavage of the SARS coronavirus spike glycoprotein by airway proteases enhances virus entry into human bronchial epithelial cells in vitro The Central Role of Fibrinolytic Response in COVID-19-A Hematologist's Perspective SARS-CoV-2 productively infects human gut enterocytes Antiviral Response in the Nasopharynx Identifies Patients With Respiratory Virus Infection WGCNA: an R package for weighted correlation network analysis Phylogenetic analysis of SARS-CoV-2 in Boston highlights the impact of superspreading events Vascular occlusion by neutrophil extracellular traps in COVID-19 Relative Contribution of Cellular Complement Inhibitors CD59, CD46, and CD55 to Parainfluenza Virus 5 Inhibition of Complement-Mediated Neutralization In vivo antiviral host transcriptional response to SARS-CoV-2 by viral load, sex, and age The role of galectin-3 in promotion of the inflammatory response 2020. Type I Interferon Susceptibility Distinguishes SARS-CoV-2 from SARS-CoV Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Association of RNA Biosignatures With Bacterial Infections in Febrile Infants Aged 60 Days or Younger Interactions of viruses and the humoral innate immune response Antiviral activities of type I interferons to SARS-CoV-2 infection Tetraspanins in viral infections: a fundamental role in viral biology The Role of Cytokines including Interleukin-6 in COVID-19 induced Pneumonia and Macrophage Activation Syndrome-Like Disease Pathological inflammation in patients with COVID-19: a key role for monocytes and macrophages Exploring NAD+ metabolism in host-pathogen interactions Host cell proteases: Critical determinants of coronavirus tropism and pathogenesis SARS-CoV-2 Is Restricted by Zinc Finger Antiviral Protein despite Preadaptation to the Low-CpG Environment in Humans Robust enumeration of cell subsets from tissue expression profiles Determining cell type abundance and expression from bulk tissues with digital cytometry A Dynamic Immune Response Shapes COVID-19 Progression Viral proteins and Src family kinases: Mechanisms of pathogenicity from a "liaison dangereuse A distinct influenza infection signature in the blood transcriptome of patients with severe community-acquired pneumonia Saliva sample as a noninvasive specimen for the diagnosis of coronavirus disease 2019: a crosssectional study Natural mucosal barriers and COVID-19 in children Kinases as Potential Therapeutic Targets for Anti-coronaviral Therapy Assessing uncertainty in the rooting of the SARS-CoV-2 phylogeny Dysregulation of Immune Response in Patients With Coronavirus Normalization of RNA-seq data using factor analysis of control genes or samples limma powers differential expression analyses for RNA-sequencing and microarray studies Type 2 and interferon inflammation regulate SARS-CoV-2 entry factor expression in the airway epithelium Whole exome sequencing identifies new host genomic susceptibility factors in empyema caused by streptococcus pneumoniae in children: a pilot study Whole exome sequencing reveals new candidate genes in host genomic susceptibility to respiratory syncytial virus disease The role of galectins in virus infection -A systemic literature review CD59 association with infectious bronchitis virus particles protects against antibody-dependent complement-mediated lysis Repurposing of Kinase Inhibitors for Treatment of COVID-19 ggplot2: Elegant Graphics for Data Analysis Saliva or Nasopharyngeal Swab Specimens for Detection of SARS-CoV-2 The differential immune responses to COVID-19 in peripheral and lung revealed by single-cell RNA sequencing Longitudinal Peripheral Blood Transcriptional Analysis Reveals Molecular Signatures of Disease Progression in COVID-19 Patients Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study Visualization of Functional Enrichment Result clusterProfiler: an R package for comparing biological themes among gene clusters Cytokine release syndrome in severe COVID-19: interleukin-6 receptor antagonist tocilizumab may be the key to reduce mortality Longitudinal transcriptome analyses show robust T cell immunity during recovery from COVID-19 Elevated exhaustion levels and reduced functional diversity of T cells in peripheral blood may predict severe progression in COVID-19 patients ctrlGene: Assess the Stability of Candidate Housekeeping Genes Profiling of the immune repertoire in COVID-19 patients with mild, severe, convalescent, or retesting-positive status SARS-CoV-2 Viral Load in Upper Respiratory Specimens of Infected Patients Is FURIN gene expression in salivary glands related to SARS-CoV-2 infectivity through saliva? This study received support from Instituto de Salud Carlos III ( The authors declare that there are no competing interests.