title: SPINT2 controls SARS-CoV-2 viral infection and is associated to disease severity
authors: Alvarez, Carlos Ramirez; Sharma, Ashwini Kumar; Kee, Carmon; Thomas, Leonie; Boulant, Steeve; Herrmann, Carl
date: 2020-12-28
journal: bioRxiv
DOI: 10.1101/2020.12.28.424029
The COVID-19 outbreak is the biggest threat to human health in recent history. Currently, there are over 1.5 million related deaths and 75 million people infected around the world (as of 22/12/2020). The identification of virulence factors that determine disease susceptibility and severity in different cell types remains an essential challenge. The serine protease TMPRSS2 has been shown to be important for S protein priming and viral entry; however, little is known about its regulation. SPINT2 is a member of the family of Kunitz-type serine protease inhibitors and has been shown to inhibit TMPRSS2. Here, we explored the existence of co-regulation between SPINT2 and TMPRSS2 and found a tightly regulated protease/inhibitor expression balance across tissues. We found that SPINT2 negatively correlates with SARS-CoV-2 expression in Calu-3 and Caco-2 cell lines and was down-regulated in secretory cells from COVID-19 patients. We validated our findings using Calu-3 cell lines and observed a strong increase in viral load after SPINT2 knockdown. Additionally, we evaluated the expression of SPINT2 in datasets from comorbid diseases using bulk and scRNA-seq data. We observed its down-regulation in colon, kidney and liver tumors as well as in pancreatic islet alpha cells from patients with type 2 diabetes, which could have implications for the observed comorbidities in COVID-19 patients suffering from chronic diseases.
SARS-CoV-2 entry requires a two-step process: first, the spike (S) envelope protein binds to its cellular receptor, the membrane protein angiotensin-converting enzyme 2 (ACE2) 1, and is then proteolytically activated by cellular serine proteases such as TMPRSS2, TMPRSS4 and furin 2-4. TMPRSS2 has been proposed as a putative drug target 3,5,6 and as a biomarker for COVID-19 disease severity 7,8. Despite its central role, the regulation of TMPRSS2 is poorly understood, although its activation by androgen response elements has been documented in normal and tumor prostate tissues 9.
viral load in Calu-3 cells and its inhibition reduced viral infection 22. We also found several ribosomal proteins (RPL9, RPL23, RPL26, RPL28, RPL38, RPS7, RPS12 and RPS27A) and translation initiation factors (EIF3A, EIF4A2 and EIF4B), which could be related to viral protein translation and the endoplasmic reticulum (ER) stress response 30. To confirm that the permissivity signature does not merely reflect tissue-specific or immune signatures, a pathway enrichment analysis (PEA) was performed using the top-ranked genes. Interestingly, we found an enrichment of host-virus interaction processes, protein stabilization and ER trafficking pathways (Figure 2C). Next, we investigated whether these susceptibility genes, identified from lung-derived cell lines, are also expressed in other cell types. We therefore ranked the cell types in the Human Cell Landscape (HCL) dataset 25 based on the permissivity score derived from the top-ranked genes in our permissivity signature (Figure 2D) and found that stratified epithelial cells, basal cells, AT2 lung cells and enterocytes were among the top-ranked cell types, which correspond to cell types known to be infected by the virus 31,32. To further refine our permissivity signature beyond the transcriptional level, we used protein expression levels from a previously released proteomic dataset of SARS-CoV-2-infected Caco-2 cells 13.
We determined the Spearman correlations between the translation rates of the top-ranked genes and those of the viral N and S proteins (Figure 2E). Some of the highly correlated genes (both negatively and positively) have previously been reported to participate in viral infection processes. For example, LGALS3BP is a secreted glycoprotein with antiviral properties observed in HIV and Hantavirus infection 33,34 and in the regulation of LPS-induced endotoxin shock in murine models 35. CLIC1 has previously been identified as a virulence factor of Merkel cell polyomavirus. Interestingly, SPINT2 was consistently correlated with viral translation (Figure 2E). Furthermore, the correlation of SPINT2 with viral gene expression is negative, and this trend is consistent in both Caco-2 and Calu-3 cell lines, indicating a repressive role in SARS-CoV-2 infection (Supplementary Figure 1A and B). Hence, these findings suggest that SPINT2 represents a permissivity factor that negatively correlates with SARS-CoV-2 infection.
Given the negative correlation of SPINT2 expression with SARS-CoV-2 viral gene expression, we hypothesized that this gene could directly influence SARS-CoV-2 infection by impairing early steps of viral entry. To test this hypothesis experimentally, we knocked down SPINT2 using short hairpin RNA (shRNA) in the human lung carcinoma-derived Calu-3 cell line. SPINT2 expression was readily detectable in wild-type (WT) Calu-3 cells. When Calu-3 cells were transduced with a specific shRNA directed against SPINT2, SPINT2 levels were significantly decreased compared to WT cells or cells transduced with a scrambled shRNA (Figure 3A). To address the impact of SPINT2 knockdown on the permissivity of Calu-3 cells to SARS-CoV-2, WT, scrambled and SPINT2 knockdown cells were infected with SARS-CoV-2 using the same multiplicity of infection (MOI).
Knockdown of SPINT2 resulted in an almost two-fold increase in the number of cells positive for SARS-CoV-2 (Figure 3B-C). To test whether SPINT2 could modulate viral load by regulating TMPRSS2 expression, we measured the fold change in TMPRSS2 expression. Interestingly, TMPRSS2 gene expression was higher in SPINT2 knockdown cells than in WT or scrambled-shRNA cells (Figure 3D). Together, our results show that genetic depletion of SPINT2 results in an increased susceptibility of Calu-3 cells to SARS-CoV-2 infection, which is in agreement with our analysis suggesting that SPINT2 expression negatively correlates with infection.
Given the observed negative correlation between SPINT2 expression and SARS-CoV-2 infection in cell lines (Figure 2E, Supplementary Figure 1A, B), we next investigated whether SPINT2 expression is associated with disease severity in COVID-19 patients. We used a publicly available scRNA-seq dataset of nasopharyngeal swab samples from patients with severe and mild symptoms 38. We correlated a list of serine proteases and inhibitors (SPRGs, Supplementary Table 2) with the viral RNA reads and found that SPINT2 was the second most negatively correlated gene (Figure 4A). Then, we selected the cell cluster with the highest expression of SPINT2, which corresponds to secretory cells (Supplementary Figure 2A), and among these cells observed lower SPINT2 gene expression in cells from critical COVID-19 cases compared to moderate cases (Figure 4B). This finding is particularly relevant since secretory cells are primary targets of viral infection 39. We also evaluated data on peripheral blood mononuclear cells (PBMC) from severe COVID-19 patients 40. In this dataset, SPINT2 was found to be strongly expressed in dendritic cells (DC), plasmacytoid DC (pDC), stem cells (SC) and eosinophils (Supplementary Figure 2B).
Among these cells, we again observed lower SPINT2 expression in patients from intensive care units (ICU) (Figure 4C). Additionally, we could corroborate the negative correlation between SPINT2 and viral load using bulk RNA-seq data from lung autopsies of deceased COVID-19 patients 41. We calculated the correlations of SPINT2, ACE2 and TMPRSS2 gene expression with that of the E, M, N and S viral genes and observed a similar negative correlation (Supplementary Figure 2C). Collectively, this evidence suggests that the SPINT2 expression level could be associated with COVID-19 disease severity.
COVID-19 patients with previous records of chronic diseases like cancer or diabetes are considered at higher risk 42-46. In addition, SPINT2 gene silencing by promoter hypermethylation, which promotes tumor progression, has been reported in multiple tumor types 14,47-49. For this reason, we hypothesized that SPINT2 down-regulation in tumor cells would increase permissivity to viral infection, which could be one of the mechanisms behind the comorbidity observed in COVID-19 patients. We screened lung, colon, kidney and liver tumor datasets to evaluate the differences in SPINT2 gene expression between tumor and paired normal samples. We found statistically significant down-regulation of SPINT2 in kidney and liver tumors (Supplementary Figure 3A). Similarly, using comparable tumor scRNA-seq datasets 50-54, we observed a down-regulation of SPINT2 in colon adenocarcinoma (epithelial cells), renal clear cell carcinoma (endothelial cells) and hepatocellular carcinoma (hepatocytes) (Figure 5). Interestingly, we were able to detect SPINT2 down-regulation in colorectal tumor epithelial cells at the single-cell level but not in bulk RNA-seq data, suggesting that SPINT2 expression might be modulated in specific cell subtypes (Figure 5 and Supplementary Figure 4A).
In lung adenocarcinomas, we found SPINT2 up-regulation in tumors in both TCGA and scRNA-seq data, which might reflect the existence of different determinants of comorbidity in lung tissues that are independent of SPINT2 modulation. We also looked at the expression of SPINT2 in pancreatic cells from type 2 diabetes (DT2) patients 54. Islet cells have high SPINT2 expression compared to other cell types such as endothelial cells (see Supplementary Figure 4B). We observed a strong down-regulation of SPINT2 in alpha cells of DT2 patients, which have been shown to be primary targets of the SARS-CoV-2 virus. This down-regulation of the virulence-associated factor SPINT2 might contribute to the comorbidity between COVID-19 and DT2.
In this study, we describe a tight protease-inhibitor/protease balance at the gene expression level between SPINT2 and TMPRSS2, a major co-receptor of SARS-CoV-2. We found transcription factor binding sites (TFBS) for ten regulators, including IRF1, IRF3, JUNB, JUND and ELF3, whose TF activities were correlated with both SPINT2 and TMPRSS2 gene expression, which suggests a possible role as common regulators of both genes. Interestingly, ELF3 and IRF7 TF activities have been found to be modulated in SARS-CoV-2-infected versus bystander enterocytes from ileum 18, which could point to viral load modulation mediated by TMPRSS2 and SPINT2 through these TFs. We show that SPINT2 and TMPRSS2 gene expression levels are correlated across cell types and tissues. Interestingly, known SARS-CoV-2 target tissues show high correlation values and co-expression of both genes, which suggests that SPINT2 could play a role in SARS-CoV-2 viral entry. Currently, it is unclear which molecular signatures determine viral permissivity and how they are related to disease severity. We inferred a SARS-CoV-2 permissivity signature using differentially expressed genes between permissive and non-permissive cell lines, from which we removed virus-induced genes.
We found SPINT2 in this permissivity signature and observed a negative correlation with SARS-CoV-2 viral load in Calu-3 cells. We also corroborated this trend at the protein level in Caco-2 cells. During the preparation of this manuscript, a study by Bojkova et al. (2020) was published suggesting a possible role of SPINT1, SPINT2 and SERPINA1 in viral infection, based on the down-regulation of their protein levels in infected cells and on the effect of aprotinin, a non-specific serine protease (SP) inhibitor, on viral load 6. Here, however, by knocking down SPINT2, we provide for the first time direct causal evidence that SPINT2 is indeed able to modulate SARS-CoV-2 infection. SPINT2 inhibits TMPRSS2 enzymatic activity through its KD1 and KD2 domains 12. Interestingly, we observed an up-regulation of TMPRSS2 mRNA expression in the SPINT2 knockdown Calu-3 cells. Further investigation is needed to explore this regulation at the gene expression level. However, it has previously been reported that SPINT2 can modulate ST14 protein activity by regulating its shedding from the cell membrane of mouse intestinal epithelial cells 16, which suggests that SPINT2 could modulate serine protease activity through different mechanisms. Interestingly, SPINT2 has been reported to regulate transcription of certain genes like CDK1A via histone methylation 55. Our findings suggest that SPINT2 regulates SARS-CoV-2 viral infection through the inhibition of TMPRSS2; however, we cannot rule out the possibility of an indirect interaction. We found lower expression of SPINT2 in secretory cells from COVID-19 patients with severe symptoms 38. This could have implications for COVID-19 disease severity, since secretory cells have been shown to be the target of SARS-CoV viral infection using organotypic human airway epithelial cultures 39.
We found SPINT2 in the permissivity signature, from which we had filtered out virus-induced genes, suggesting that this gene could be used as a marker for predicting COVID-19 disease susceptibility prior to infection; however, this needs to be further evaluated. Serine proteases (SPs) have been reported to be abnormally regulated in diverse chronic diseases 14,56-58. For example, during carcinogenic development SPs influence metastasis and cancer progression 59,60, while in the context of diabetes they control fibrinolysis, coagulation and inflammation, which in turn affect disease severity 57. This led us to hypothesize that shared molecular mechanisms between some chronic diseases and COVID-19 could be explained in part by the regulation of SPINT2. We observed SPINT2 down-regulation in hepatocellular carcinoma (HCC), colon adenocarcinoma (COAD) and renal clear cell carcinoma (rCCC) tumor cells. SPINT2 down-regulation in liver has been reported to contribute to the development of HCC: SPINT2 binds and inhibits the serine protease HGFA, which converts hepatocyte growth factor (HGF) into its active form, which in turn promotes metastasis, cell growth and angiogenesis 14,61; the same mechanism has been suggested for rCCC 9. A marked down-regulation of SPINT2 can be observed in pancreatic islet alpha cells from diabetes patients. It has been reported that islet cells can be infected by SARS-CoV-2, which could contribute to the onset of acute diabetes 62. Hence, these results suggest that kidney, colon and liver tumor types, as well as pancreatic islet cells from diabetic patients, could be more permissive and susceptible to SARS-CoV-2 viral infection due to an imbalance of SPINT2 gene expression, which could lead to the disruption of the protease-inhibitor/protease balance 63.
In conclusion, we showed for the first time that SPINT2 is a permissivity factor that modulates SARS-CoV-2 infection.
This modulation could be explained by the balance of TMPRSS2/SPINT2 (serine protease/inhibitor) that we observed at the gene expression level across several tissues. We also found lower SPINT2 gene expression in samples from COVID-19 patients with severe symptoms; hence, this gene might represent a biomarker for predicting disease severity. We also found SPINT2 down-regulation in tumor types, which could have implications for the observed comorbidities in COVID-19 patients with cancer.
The human lung adenocarcinoma cell line Calu-3 (ATCC HTB-55) was cultured in Dulbecco's modified Eagle's medium (DMEM) supplemented with Glutamax (Gibco), 10% fetal bovine serum and 1% penicillin/streptomycin, while Vero E6 cells (ATCC CRL 1586) were cultured in DMEM supplemented with 10% fetal bovine serum and 1% penicillin/streptomycin (Gibco). Calu-3 cell lines stably expressing the SPINT2 knockdown were generated by lentiviral transduction. Oligonucleotides encoding the sequence for SPINT2 knockdown were designed from the TRC library based on the Genetic Perturbation Platform (GPP) Web Portal, cloneID: TRCN0000073581 (Table 1) 64. Annealed oligonucleotides were ligated with the AgeI-HF and EcoRI-HF digested pLKO.1 puro vector (Addgene #8453) using T4 DNA Ligase (New England Biolabs), and the resulting plasmids were transformed into E. coli DH5α-competent cells. Amplified plasmid DNA was purified using the NucleoBond® PC 100 kit (Macherey-Nagel) following the manufacturer's instructions. HEK293T cells (ATCC CRL-3216) were seeded on 10 cm² dishes and allowed to adhere for 36 hours. The cells were transfected with 4 μg of pMD2.G (Addgene #12259), 4 μg of psPAX2 (Addgene
The fold change in SARS-CoV-2 genome copy number was calculated using input as a reference. Input samples were harvested directly post-infection and accounted for the basal viral genome copy number detected due to viruses attaching to the cell membrane.
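The fold-change calculation relative to input can be illustrated with a minimal sketch. It assumes that genome copy numbers have already been quantified for each sample; the function name `fold_change_vs_input`, the use of the mean of the input replicates as the baseline, and all numeric values are hypothetical and are not taken from the study.

```python
from statistics import mean


def fold_change_vs_input(sample_copies, input_copies):
    """Fold change of each sample's genome copy number over the mean input.

    Assumption: the baseline is the arithmetic mean of the input replicates,
    i.e. the copies detected directly post-infection (virus attached to cells).
    """
    baseline = mean(input_copies)
    if baseline <= 0:
        raise ValueError("input copy number must be positive")
    return [copies / baseline for copies in sample_copies]


# Hypothetical copy numbers, for illustration only (not data from the study).
input_copies = [100.0, 120.0, 80.0]   # harvested directly post-infection
later_copies = [1000.0, 2500.0]       # harvested at a later time point
print(fold_change_vs_input(later_copies, input_copies))  # [10.0, 25.0]
```

Values close to 1 would indicate that the signal is mostly attached input virus rather than newly replicated genomes.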
Cells were seeded on iBIDI glass-bottom 8-well chamber slides that had previously been coated with 2.5% human collagen in water for at least 1 hour. To quantify infected cells from indirect immunofluorescence-stained samples, ilastik 1.2.0 was used on DAPI images to generate a mask representing each nucleus as an individual object. These masks were used in CellProfiler 3.1.9 to measure the intensity of the conjugated secondary antibodies in each nucleus. A threshold was set based on the basal fluorescence of non-infected samples, and all nuclei with a higher fluorescence were considered infected cells.
For Calu-3 cells, we filtered out cells with an extremely high number of detected genes (>50,000), which probably correspond to doublets. In H1299, since few cells were detected as infected because this line is non-permissive, we defined infected cells as those with a cumulative sum of viral gene expression >0 in order to obtain DEGs. As we wanted to differentiate between permissivity and infection signatures, we first looked for differentially expressed genes in SARS-CoV-2 permissive vs non-permissive cell lines and then removed all genes that were up- or down-regulated during infection (Figure 2A). We performed differential expression analysis using Seurat 65. A Random Forest (RF) regression analysis was performed using the normalized gene expression of the permissivity signature to predict the cumulative sum of the expression of viral genes in Calu-3 cells at 12 hpi. We trained the RF using a random subsample of 75% and tested the results with the remaining set. Next, we estimated the feature importance for each of the permissivity signature genes and performed enrichment analysis using enrichR 66 on the top 25% ranked genes. For the scoring of cells based on the permissivity signature among cell types in the HCL dataset, we used the top 25% RF-ranked genes and applied the AddModuleScore function of Seurat, setting nbin = 100.
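The gene-selection logic described above (removing infection-responsive genes from the permissive vs non-permissive DEGs, then keeping the top 25% of genes by feature importance) can be sketched as follows. This is an illustration only: the analysis itself was carried out with Seurat and a Random Forest, neither of which is reproduced here, and the function names, gene labels and importance values below are hypothetical placeholders.

```python
import math


def permissivity_signature(cell_line_degs, infection_degs):
    """Genes differing between permissive and non-permissive lines,
    minus genes that are up- or down-regulated during infection."""
    return sorted(set(cell_line_degs) - set(infection_degs))


def top_fraction(importances, fraction=0.25):
    """Genes in the top `fraction` by importance score (at least one gene)."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    n_keep = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:n_keep]


# Placeholder gene labels and scores, for illustration only.
cell_line_degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
infection_degs = ["GENE_B"]  # responds to infection, so it is removed
signature = permissivity_signature(cell_line_degs, infection_degs)
print(signature)  # ['GENE_A', 'GENE_C', 'GENE_D', 'GENE_E']

# Hypothetical feature importances, e.g. as returned by a fitted regressor.
importances = {"GENE_A": 0.1, "GENE_C": 0.4, "GENE_D": 0.3, "GENE_E": 0.2}
print(top_fraction(importances))  # ['GENE_C']
```

Rounding up with `math.ceil` is a design choice of this sketch; the source does not state how the 25% cut-off was rounded.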
For the translatome correlation analysis, the summed intensity normalized values were used as provided in the study 13. To compute the correlations of SPRGs (Supplementary Table 2) with the viral reads in the scRNA-seq data from Chua et al. (2020), the raw count matrices were extracted from the Seurat object provided by the authors, split by sample and then imputed using scImpute 67 with the following parameters: drop_thr=0.5 and Kcluster equal to the number of annotated cell types in each matrix. The imputed matrices were then merged and log2 normalized. Finally, correlations were computed using infected cells only (viral read counts >0). For the bulk RNA-seq data from deceased COVID-19 patients, log2 RPM of normalized counts were used. In both cases, correlations with viral genes were computed using Spearman coefficients.
To have a standardized workflow for the processing of scRNA-seq data, we used SCT normalization in the Seurat workflow for every dataset except the Human Cell Landscape data, where log2 normalization and scaling were performed because this dataset is large and using SCT was impractical. HCC data were downloaded from GEO (GSE149614) and reprocessed. We used the Louvain method implemented in Seurat for community detection, and clusters were identified using tissue markers. We used the markers used to characterize cell types in an independent scRNA-seq human liver atlas 68. A footprinting analysis was carried out using the TOBIAS pipeline 69.
For this work we used the following datasets available in public repositories: scRNA-seq profiles of Calu-3 and H1299 cell lines 22; scRNA-seq from ileum-derived organoids 18.
All code for the data processing and analysis is provided in the following GitHub repository: https://github.com/hdsu-bioquant/covid19-comorbidity .
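The correlation step described in the methods above (Spearman coefficients computed only on infected cells, i.e. cells with viral read counts >0) can be illustrated with a small self-contained sketch. It is not the code from the repository; the function names and the toy values are hypothetical, and imputation and normalization are assumed to have been done beforehand. Because Spearman correlation depends only on ranks, a monotone transformation such as log2 does not change the coefficient, so it is omitted here.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values receiving the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)


def spearman_in_infected(gene_expr, viral_reads):
    """Spearman correlation of a gene with viral reads, infected cells only."""
    pairs = [(g, v) for g, v in zip(gene_expr, viral_reads) if v > 0]
    if len(pairs) < 3:
        raise ValueError("need at least three infected cells")
    genes = [g for g, _ in pairs]
    virus = [v for _, v in pairs]
    return pearson(average_ranks(genes), average_ranks(virus))


# Toy values for five cells (illustrative only); two cells are uninfected.
gene_expr = [5.0, 3.0, 4.0, 1.0, 9.0]
viral_reads = [0, 10, 5, 50, 0]
rho = spearman_in_infected(gene_expr, viral_reads)
print(round(rho, 6))
```

In this toy example the three infected cells give a perfectly inverse ranking, so the coefficient is approximately -1.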
Inhibition of TMPRSS2 by HAI-2 reduces prostate cancer cell invasion and metastasis
Proteomics of SARS-CoV-2-infected host cells reveals therapy targets
Serine peptidase inhibitor Kunitz type 2 (SPINT2) in cancer development and progression
Intestinal regulation of suppression of tumorigenicity 14 (ST14) and serine peptidase inhibitor, Kunitz type-1 (SPINT1) by transcription factor CDX2
The protease inhibitor HAI-2, but not HAI-1, regulates matriptase activation and shedding through prostasin
Functional genomics analysis of human colon organoids identifies key transcription factors
Single-cell analyses reveal SARS-CoV-2 interference with intrinsic immune response in the human gut
A scalable SCENIC workflow for single-cell gene regulatory network analysis
type 2 diabetes
USP28 and SPINT2 mediate cell cycle arrest after whole genome doubling
Role of Serine Proteases and Inhibitors in Cancer
The role played by serine proteases in the development and worsening of vascular complications in type 1 diabetes mellitus
Protein targets of inflammatory serine proteases and cardiovascular disease
Cell surface-anchored serine proteases in cancer progression and metastasis
Role of proteases in tumor invasion and metastasis
Hepatocyte growth factor activation inhibitors - therapeutic potential in cancer
Binding of SARS coronavirus to its receptor damages islets and causes acute diabetes
Proteolytic networks in cancer
A public genome-scale lentiviral expression library of human ORFs
Integrated analysis of multimodal single-cell data. Cold Spring Harbor Laboratory
Enrichr: a comprehensive gene set enrichment analysis web server 2016 update
An accurate and robust imputation method scImpute for single-cell RNA-seq data
A human liver cell atlas reveals heterogeneity and epithelial progenitors
ATAC-seq footprinting unravels kinetics of transcription factor binding during zygotic genome activation