key: cord-1036080-mt72f4dq authors: Mayneris-Perxachs, Jordi; Moreno-Navarrete, José-Maria; Ballanti, Marta; Monteleone, Giovanni; Alessandro Paoluzi, Omero; Mingrone, Geltrude; Lefebvre, Philippe; Staels, Bart; Federici, Massimo; Puig, Josep; Garre, Josep; Ramos, Rafael; Manuel Fernández-Real, José title: Lipidomics and metabolomics signatures of SARS-CoV-2 mediators/receptors in peripheral leukocytes, jejunum and colon date: 2021-11-08 journal: Comput Struct Biotechnol J DOI: 10.1016/j.csbj.2021.11.007 sha: bd8246830786083f569a8f800ddb558db1626e12 doc_id: 1036080 cord_uid: mt72f4dq Cell surface receptor-mediated viral entry plays a critical role in this infection. Well-established SARS-CoV-2 receptors such as ACE2 and TMPRSS2 are highly expressed in the gastrointestinal tract. In fact, there are evidences that SARS-CoV-2 infects epithelial cells from the digestive system. However, emerging research has identified novel mediators such as DPP9, TYK2, and CCR2, all playing a critical role in inflammation. We evaluated the expression of SARS-CoV-2 receptors in peripheral leukocytes (n=469), jejunum (n=30), and colon (n=37) of three independent cohorts by real-time PCR, RNA-sequencing, and microarray transcriptomics. We also performed HPCL-MS/MS lipidomics and metabolomics analyses to identify signatures linked to SARS-CoV-2 receptors. We found markedly higher peripheral leukocytes ACE2 expression levels in women compared to men, whereas the intestinal expression of TMPRSS2 was positively associated with BMI. Consistent lipidomics signatures associated with the expression of these mediators were found in both tissues and peripheral leukocytes involving n-3 long-chain PUFAs and arachidonic acid-derived eicosanoids, which play a key role in the regulation of inflammation and may interfere with viral entry and replication. Medium- and long-chain hydroxy acids, which have shown to interfere in viral replication, were also liked to SARS-CoV2 receptors. Gonadal steroids were also associated with the expression of some of these receptors, even after controlling for sex. The expression of SARS-CoV2 receptors was associated with several metabolic and nutritional traits in different cell types. This information may be useful in the design of potential therapies targeted at coronavirus entry. The knowledge of the factors that lead to a poor prognosis after SARS-CoV-2 exposure is of utmost importance in recent days. Sexual dimorphism and the prevalence of chronic metabolic diseases are two of the most increasingly recognized factors to impact on prognosis in patients with COVID-19. Men with SARS-CoV-2 virus had a worse mortality outcome [1, 2] . Of the 1591 patients admitted in the intensive care unit in one study, 82% were male [1] . Seventy percent of patients that died of COVID-19 in Italy were men [2] . Among 5,700 patients with SARS-CoV-2, the most common comorbidities were hypertension (56.6%), obesity (41.7%), and diabetes (33.8%) [3] . Consistent with these findings, a recent review of genetic correlations between COVID-19 and several diseases, confirmed that type 2 diabetes, obesity, and hypertension are significantly associated with severe or hospitalized COVID- 19 [4] . BMI was also modestly associated with severe COVID-19. A first critical step of viral infection involves cell surface receptor-mediated viral entry, where the spike (S) glycoprotein play a key role in facilitating SARS-CoV-2 to enter hosts cells. The S protein consists in two subunits: S1 and S2 [5] . The S1 subunit comprises receptor-binding domains and N-terminal domains. SARS-CoV-2 entry into human cells from the intestinal and respiratory tracts is mediated by angiotensin-converting enzyme 2 (ACE2), acting as a receptor binding the SARS-CoV-2 spike protein S1 [5] [6] . High levels of ACE2 expression have been associated with higher risk of COVID-19 infection and disease severity [7] . In a recent paper, plasma concentrations of ACE2 were found to be higher in men than in women and the authors proposed that these results might explain the higher incidence and fatality rate of COVID-19 in men [8] . However, while ACE2 receptors are ubiquitous and widely expressed in the heart, vessels, gut and lung, ACE2 is mostly bound to cell membranes and only scarcely present in the circulation in a soluble form [9] . Differences in tissue tropism suggest that additional mediators may facilitate the host cell infection. In contrast to S1 subunit, the S2 subunit contains a fusion peptide, heptad repeat regions, a transmembrane domain and a cytoplasmic domain which facilitate viral membrane fusion processes with hosts cells [5] . Thus, the transmembrane protein serine 2 (TMPRSS2) has also shown to play a key role in SARS-CoV-2 entry by priming the S2 subunit of the virus [6] . Moreover, a furan cleavage site for TMPRSS2 at the boundary between S1 and S2 subunits, resulting in S protein cleavage during the progression of viral infection, has been reported [5] . Notably, evidences in mice suggest that ACE2 and TMPRSS2 expression is sex-dependent and modulated by the obesity status [10] . In addition, a recent GWAS and TWAS study identified host genetic variants in genes encoding tyrosine kinase 2 (TYK2), monocyte/macrophage chemotactic receptor (CCR2) and dipeptidyl peptidase 9 (DPP9) associated with severe COVID-19 suggesting a novel host-driven inflammatory mechanism in critical illness in COVID-19, as all proteins play a key role as mediators of inflammatory lung damage [11] . Apart from respiratory manifestations, gastrointestinal symptoms are common in SARS-CoV-2 infected patients [12] . In fact, there are evidences that SARS-CoV-2 infects epithelial cells of the gastrointestinal tract [13] , thereby indicating a potential route of infection through the digestive system. In addition, both ACE2 and TMPRSS2 are highly expressed in the digestive tract organs. Considering this, we aimed to evaluate the expression of SARS-CoV-2 mediators/receptors in colon, jejunum and peripheral leukocytes of three independent cohorts and the association with lipidomics (plasma and jejunum) and metabolomics (plasma) profiles as a measure of the real-end products of the physiological processes affecting directly the phenotype. We evaluated the expression of five well-established and novel SARS-CoV-2 mediators/receptors (ACE2, TMPRSS2, DPP9, TYK2, CCR2) in three independent cohorts ( Table 1) : a) in peripheral leukocytes from the IMAGEOMICS cohort, b) in jejunum biopsies from the JEJUNUM cohort, and c) in colon biopsies from the FLOROMIDIA cohort. ACE2 expression was studied in peripheral leukocytes from consecutive n=469 subjects included in a cohort of healthy volunteers aged ≥50 years (Aging Imageomics study) [14] . To analyze gene expression in leukocytes, whole blood was collected in PAXgene Blood RNA tubes (Qiagen, Hilden, Germany). RNA purification was performed using RNeasy Lipid Tissue Mini Kit (QIAgen, Izasa SA, Barcelona, Spain) and the integrity was checked by Agilent Bioanalyzer (Agilent Technologies, Palo Alto, CA). Gene expression was assessed by real time PCR using an LightCycler® 480 Real-Time PCR System (Roche Diagnostics SL, Barcelona, Spain), using TaqMan® technology suitable for relative genetic expression quantification. The commercially available and pre-validated TaqMan® primer/probe sets used were as follows: Peptidylprolyl isomerase A (cyclophilin A) (4333763, PPIA as endogenous control) and angiotensin I converting enzyme 2 (Hs01085333_m1, ACE2 gene). For non-targeted metabolomics analysis, metabolites were extracted from plasma and faecal samples with methanol (containing phenylalanine-C13 as an internal standard) according to previously described methods [15] . Briefly, for plasma samples 30µl of cold methanol were added to 10 µl of each sample, vortexed for 1 minute and incubated for one hour at −20 °C. For faecal samples, the content of a 1.2 ml tube of Lysing Matrix E (MP biomedicals) and 600 μl of cold methanol were added to 10mg of sample. Samples were homogenized using FastPrep-24™ (MP biomedicals) and were incubated overnight in a rocker at 4°C. Then, all samples were centrifuged for three minutes at 12.000g, the supernatant was recovered and filtered with a 0.2 μm Eppendorf filter. Two µL of the extracted sample were applied onto a reversed-phase column (Zorbax SB-Aq 1.8 µm 2.1 x 50 mm; Agilent Technologies) equipped with a precolumn (Zorbax-SB-C8 Rapid Resolution Cartridge 2.1 x 30 mm 3.5 µm; Agilent Technologies) with a column temperature of 60°C. The flow rate was 0.6 mL/min. Solvent A was composed of water containing 0.2% acetic acid and solvent B was composed of methanol 0.2% acetic acid. The gradient started at 2% B and increased to 98% B in 13 min and held at 98% B for 6 min. Post-time was established in 5 min. Data were collected in positive and negative electrospray modes time of flight operated in fullscan mode at 50-3000 m/z in an extended dynamic range (2 GHz), using N2 as the nebulizer gas (5 L/min, 350°C). The capillary voltage was 3500 V with a scan rate of 1 scan/s. The ESI source used a separate nebulizer for the continuous, low-level (10 L/min) introduction of reference mass compounds 121.050873 and 922.009798, which were used for continuous, online mass calibration. MassHunter Data Analysis Software (Agilent Technologies, Barcelona, Spain) was used to collect the results, and MassHunter Qualitative Analysis Software (Agilent Technologies, Barcelona, Spain) to obtain the molecular features of the samples, representing different, co-migrating ionic species of a given molecular entity using the Molecular Feature Extractor algorithm (Agilent Technologies, Barcelona, Spain). We selected samples with a minimum of 2 ions. Multiple charge states were forbidden. Compounds from different samples were aligned using a retention time window of 0.1% ± 0.25 minutes and a mass window of 20.0 ppm ±2.0 mDa. We selected only those present in at least 50% of the samples of one group and corrected for individual bias. Consecutively, a group of n=30 morbidly obese (BMI>35 kg/m 2 ) subjects were recruited at the Endocrinology Service of the Hospital. All subjects were of Caucasian origin and reported a body weight stable for at least three months before the study. Subjects studied in the postabsorptive state. The following exclusion criteria were considered: i) no systemic disease other than obesity; ii) free of any infections in the previous month before the study; iii) no liver diseases (specifically tumoral disease and HCV infection) and thyroid dysfunction, which will be specifically excluded by biochemical work-up. This protocol was revised, validated and approved by the Ethics committee of the Hospital. The purpose of the study was explained to participants and they signed written informed consent before being enrolled in the study. Intestinal epithelium (~2 cm, specifically from jejunal region) was obtained during gastric bypass surgery and collected in RNA preservative solution (RNAlater TM Stabilization Solution, Thermo Fisher Scientific Inc., MA, USA). Samples were immediately transported to the laboratory (5-10min). The handling of tissue was carried out under strictly aseptic conditions, and stored at -80ºC. Intestinal epithelium from jejunum was collected during gastric by-pass surgery in RNAlater (Thermo Fisher Scientific), to preserve RNA integrity. Then, samples were immediately transported to the laboratory. The handling of tissue was carried out under strictly aseptic conditions and stored at -80ºC. RNA purification was performed using RNeasy-Tissue Mini-Kit (QIAgen). Total RNA was quantified by Qubit® RNA BR Assay kit (Thermo Fisher Scientific) and the integrity was checked by using the RNA Kit (15NT) on 5300 Fragment Analyzer System (Agilent). The RNASeq libraries were prepared with Illumina® TruSeq® Stranded Total RNA Sample Preparation kit following the manufacturer´s recommendations with some modifications. Briefly, in function of availability 100-500ng of total RNA was rRNA depleted using the RiboZero Magnetic Gold Kit and fragmented by divalent cations. The strand specificity was achieved during the second strand synthesis performed in the presence of dUTP. The cDNA was adenylated and ligated to Illumina platform compatible IDT adaptors with unique dual indexes with unique molecular identifiers (Integrated DNA Technologies), for paired end sequencing. The ligation products were enriched with 15 PCR cycles and the final library was validated on an Agilent 2100 Bioanalyzer with the DNA 7500 assay (Agilent). The libraries were sequenced on NovaSeq 6000 (Illumina) in a fraction of sequencing flow cell with a read length of 2x100bp following the manufacturer's protocol for dual indexing. Image analysis, base calling and quality scoring of the run were processed using the manufacturer's software Real Time Analysis (RTA v3.4.4) and followed by generation of FASTQ sequence files. RNA-seq reads were mapped against human reference genome (GRCh38) using STAR software version 2.5.3a [16] with ENCODE parameters. Genes were quantified using RSEM version 1.3.0 [17] with default parameters and using the annotation file from GENCODE version 29. A regularized log2 transformation (rlog) as implemented in DESeq2 R package [18] was applied to the count data to remove the dependence of the variance on the mean and to normalize for library size. Then, the rlog values for the selected SARS-CoV-2 receptors were used for further analyses with the lipidomics data. Lipidomics analyses were performed combining two different methods. Method 1: For the extraction of less hydrophobic lipids, a protein precipitation extraction was performed by adding 500 μL of methanol containing internal standard mixture (myristic acid-d27, arachidonic acid-d8, Cholic acid-d4, Taurocholic acid-d5 LysoPC 18:1-d7), to 10 mg of tissue sample previously weighted for the customer. Then, the samples were mixed and homogenized on a bullet blender using a stainless-steel ball, incubated at -20ºC for 30 min., centrifuged at 15,000 rpm and supernatant was analyzed by UHPLC-qTOF (model 6550 of Agilent, USA) in negative electrospray ionization mode. The chromatographic method was a gradient elution with water and acetonitrile with 0.05% formic acid as mobile phase and C18 column (ACQUITY UPLC BEH C18 Column, 1.7 μm, 2.1 mm X 100 mm) that allowed the sequential elution of the less hydrophobic lipids such as non-esterified fatty acids, bile acids, steroids and lysophospholipids among others. The identification of lipid species was performed by matching their accurate mass and tandem mass spectrum, when available, to Metlin-PCDL from Agilent containing more than 40,000 metabolites and lipids. In addition, chromatographic behaviour of pure standards for each family and bibliographic information was used to ensure their putative identification. After putative identification of lipids, these were semi quantified depending on their family similarity by providing an internal standard ratio. For the extraction of more hydrophobic lipids, a liquid-liquid extraction with chloroform: methanol (2:1) based on Folch procedure was performed by adding chloroform: methanol (2:1) containing internal standard mixture (Lipidomic SPLASH®) to 10 mg of tissue sample previously weighted for the customer. Then, the samples were mixed and homogenized on a bullet blender using a stainless-steel ball. Afterwards, water with NaCl (0.8 %) was added and mixture was centrifuged at 15,000 rpm. Lower phase was recovered, evaporated to dryness and reconstituted with methanol: methyl-tert-butyl ether (9:1) and analyzed by UHPLC-qTOF (model 6550 of Agilent, USA) in positive electrospray ionization mode. As for method 1, the chromatographic method was an elution gradient consisting on an elution with a ternary mobile phase containing water, methanol and 2-propanol with 10mM ammonium formate and 0.1% formic acid. The stationary phase was a C18 column (Kinetex EVO C18 Column, 2.6 μm, 2.1 mm X 100 mm) that allows the sequential elution of the more hydrophobic lipids such as lysophospholipids, sphingomyelins, phospholipids, diglycerides, triglycerides and cholesteryl esters, among others. The identification of lipid species was performed by matching their accurate mass and tandem mass spectrum, when available, to Metlin-PCDL from Agilent containing more than 40,000 metabolites and lipids. In addition, chromatographic behaviour of pure standards for each family and bibliographic information was used to ensure their putative identification. After putative identification of lipids, these were semi quantified depending on their family similarity by providing an internal standard ratio. The FLOROMIDIA cohort (n=37) is a pilot study to investigate OMICS signatures in subjects with a BMI ranging from 20 to 60 kg/m 2 [19] . Inclusion criteria included Caucasian origin, reported stable bodyweight 3 months before study, and free of any infections one month before visit 1. Exclusion criteria included presence of liver disease, specifically HBV/HCV infection and tumor disease, thyroid dysfunction (based on biochemical work-up) and alcohol consumption (>20 g/day). All subjects gave written informed consent, validated and approved by the ethical committee of Policlinico Tor Vergata University of Rome (Comitato Etico Indipendente Policlinico Tor Vergata, approval number 113/15). Colon biopsies were obtained during routine colonoscopy screening for cancer from patients negative from neoplasms. Samples were maintained in RNAlater (Thermo Fisher Scientific), to preserve RNA integrity. Then, samples were immediately transported to the laboratory. The handling of tissue was carried out under strictly aseptic conditions and stored at -80ºC. RNA purification was performed using RNeasy-Tissue Mini-Kit (QIAgen). Total RNA was quantified by Qubit® RNA BR Assay kit (Thermo Fisher Scientific) and the integrity was checked by using the RNA Kit (15NT) on 5300 Fragment Analyzer System (Agilent). Colon transcriptomics was performed with Affymetrix HUGENE 2.0 ST ARRAY FORMAT 100. After normalization to the median and applying log2 transformation, the transformed values of SARS-CoV-2 receptors were selected for further analyses with the metabolomics and lipidomics data. Data generation: After blood draw serum was immediately shock-frozen and stored at -80°C until metabolomics analysis. Prior to extraction process, samples were thawed and 100 µL of each serum sample was pipetted into randomly assigned wells of 2 mL 96-well plates. Human reference plasma samples were pipetted into a well of each 96-well plate. In addition, human pooled serum (Seralab) or pool of aliquots of mouse serum samples were pipetted into 6 wells of each 96-well plate for human or mouse serum sample set, respectively. Those samples were used to assess process variability and to serve as technical replicates. 100 μL of water were pipetted in 6 wells of each 96-well plate to serve as process blanks. Protein was precipitated and metabolites were extracted with methanol (475 µL for human serum samples or 500 µL for the mouse serum samples) containing four recovery standard compounds to monitor extraction efficiency. After centrifugation, the supernatant was split into 4 aliquots (100 µL each for human serum sample set) or 5 aliquots (80 µL each for mouse serum sample set) onto 96-well microplates. To minimize human error, liquid handling was performed on a Hamilton Microlab STAR robot (Hamilton). Sample extracts were dried on a TurboVap 96 (Zymark). Two aliquots were used for reverse phase (RP)/Ultra Performance Liquid Chromatographytandem Mass spectrometry (UPLC-MS/MS) analysis in positive and negative electrospray ionization (ESI) mode, while two aliquots were kept as a reserve. Prior to UPLC-MS/MS in positive ESI mode, the samples were reconstituted with 50 µL of 0.1% formic acid (FA) and those analyzed in negative ESI mode with 50 µL of 6.5 mM ammonium bicarbonate, pH 8.0. Reconstitution solvents for both ionization modes further contained a cocktail of QC internal standards to monitor instrument performance and that also to serve as retention reference markers. Analyses were performed on a linear ion trap LTQ XL mass spectrometer (Thermo Fisher Scientific GmbH, Dreieich, Germany) coupled with a Waters Acquity UPLC system (Waters GmbH, Eschborn, Germany). Two separate columns (2.1 x 100 mm Waters BEH C18 1.7 µm particle) were used for acidic (solvent A: 0.1% FA in water, solvent B: 0.1% FA in methanol) and for basic (A: 6.5 mM ammonium bicarbonate pH 8.0, B: 6.5 mM ammonium bicarbonate in 95% methanol) mobile phase conditions, optimized for positive and negative ESI mode, respectively. After injection of the sample extracts, the columns were developed in a gradient of 99.5% A to 98% B in 11 min run time at 350 µL/min flow rate. The eluent flow was directly connected to the ESI source of the LTQ XL mass spectrometer. Full scan mass spectra (80 -1000 m/z) and data dependent MS/MS scans with dynamic exclusion were recorded in turns. Raw data was extracted, peak-identified and QC processed using hardware and software of Metabolon (Durham, NC, USA). Compounds were identified using their retention index (RI), accurate mass (+/-10 ppm) and MS/MS by comparison to library entries maintained by Metabolon (Durham, NC, USA). Lastly compounds were manually checked and corrected, if necessary, by data analysts at Metabolon using proprietary visualization and interpretation software to confirm the consistency of peak identification among the various samples. The data was delivered with the area under the curve of the compounds peak as the value at its original scale. For further analysis, compounds were filtered for deficient groups. A compound was removed if it was missing in more than half of the samples within a group and if it was present above this threshold in just one treatment group. Missing values were filled with half the minimum of each compound. Sex differences in the ACE2 expression were determined using a Mann-Whitney U test. The associations between dietary information from the food frequency questionnaires and the ACE2 expression were evaluated using a Spearman correlation. Metabolomics and lipidomics data were normalized using a probabilistic quotient normalization. Then, for the identification of relevant lipids/metabolites associated with the expression of ACE2, TMPRSS2, DPP9, TYK2, and CCR2, we adopted an all-relevant machine learning variable selection strategy applying a multiple random forest-based algorithm as implemented in the Boruta R package [20] after adjusting for age, BMI, and sex. It has been recently proposed as one of the two best-performing variable selection methods making use of RF for high-dimensional omics datasets [21] . It performs variables selection in three steps: a) Randomization, which is based on creating a duplicate copy of the original features randomly permutate across the observations (the so-called shadow features); b) Model building, based on RF with the extended data set to compute the normalized permutation variable importance (VIM) Z-scores; c) Statistical testing, to find those relevant features with a VIM higher than the best shadow feature using a Bonferroni corrected two-tailed binomial test (P Bonferroni ) (predictor features with significantly higher, significantly lower, or non-significantly different Z-scores than expected at random compared to the best shadow feature are deemed important (selected), unimportant (rejected), or tentative, respectively); and d) Iteration, until the status of all features is decided. We run the algorithm with 500 iterations, a confidence level cut-off of 0.005 for the Bonferroni adjusted p-values (P Bonferroni ) and 5000 trees. In addition, we also used a second decision tree-based machine learning approach based on Gradient Boosting Machines (GBMs) using the R package "lightGBM" [22] . Unlike random forests, which build and ensemble of deep independent decision trees, GBMs build an ensemble of shallow and weak successive trees, where each tree learns and improves based on the previous one. We used a learning rate of 0.01, 100 iterations. We used a minimum number of 5 samples in one leaf, a feature and samples fraction of 70% to randomly select a subset of features and samples on each tree node to deal with over-fitting. To facilitate model interpretation, the contribution and effect of each selected feature (i.e., lipids) in predicting the target variable (i.e., expression of ACE2, etc) was calculated using SHapley Additive exPlanations (SHAP) scores based on game-theory [23] . SHAP values determine the importance of a specific value in a specific feature by comparing the model prediction with and without the feature for each individual. Therefore, the same feature with a specific value may have different SHAP valued for different individuals depending on the interactions with other features of that individual. The SHAP scores were calculated using the R package "SHAPforXGBoost". A popular alternative to the decision tree-based approach for variable selection are the penalized regression methods, where a penalty is added to the loss function to shrink some regression coefficients towards zero. Therefore, we further analysed the data applying a lasso regression model as implemented in the "glmnet" package using 10 cross-validation folds and 100 lambda values for the optimization of the lambda grid. The expression levels of ACE2 in peripheral leukocytes were quite low, but higher ACE2 expression levels were found among women (Figure 1a) . Only 32 subjects had a qPCR-Ct value lower than 36.5 and all were women. Age or BMI were not significantly associated with ACE2 levels. However, in subjects aged ≤70 years, age was negatively correlated with ACE2 expression (r=-0.148, P = 0.009, n=309). These findings are in line with recent clinical reports suggesting that patients infected with SARS-CoV-2 with more severe disease (i.e., older age, cardiovascular disease) share a variable degree of ACE2 deficiency [9] . We here add that male sex also shares this particularity. We adopted three different machine learning variable selection strategies to identify lipids and metabolites associated with SARS-CoV-2 receptors/mediators. From random forest-based machine learning (Boruta algorithm), the expression of ACE2 in peripheral leukocytes was associated with the plasma levels of the DHA-derived metabolite 19,20dihydroxydocosapentanoic acid (19,20-DiHDPA), the arachidonic acid (AA, C20:4n-6) derived eicosanoid 5,12-dihydroxyeicosatetraenoic acid (5,12-DiHETE), 17-estradiol-3sulfate, and deoxycholic acid (Figure 1b) . These results were confirmed using gradient boosting machines (LightGBM) and lasso regression (glmnet) (Figure S1 ). In line with these results, lipidomics analyses of the jejunum revealed that the omega-3 (n-3) long-chain polyunsaturated fatty acids eicosapentaenoic (EPA, C20:5 n-3) and docosapentaenoic (DPA, C22:5 n-3) acids were the lipid species most positively associated with the ACE2 expression in the jejunum (Figure 1c, Figure S2a,b) . In agreement with these findings, the reported intakes of n-3 fatty acids from a food frequency questionnaire also had the strongest positive association with the jejunal ACE2 expression (Figure 1d) . Remarkably, the expression of ACE2 in the colon was again associated with plasma EPA, as well as medium chain hydroxy fatty acids (Figure 1e, Figure S3a ). Lasso regression also identified a strong negative association between medium chain hydroxy fatty acids and the expression of ACE2 in peripheral leukocytes ( Figure S1b) . Notably, we also found significant associations between gonadal steroids (pregnenolone sulfate) and the jejunal expression of ACE2 ( Figure S3a) . Metabolomics analyses also identified several metabolites predictive of the ACE2 expression such as pyridoxate, a form of vitamin B6, and pseudouridine and 1-methyl adenosine, markers of RNA damage (Figure 1f) . Recent observations characterized the expression of potential SARS-CoV2 receptors in different tissues, but the potential influence of obesity on the expression of these molecules was not evaluated [24] . Although we did not find significant associations between the BMI and the expression of ACE2 in the jejunum in the jejunum cohort, it had a positive association with the expression in the expression (Figure 2a) . This was especially remarkable given the relatively narrow range of BMI in the jejunum cohort. Gastrointestinal symptoms are increasingly recognized to be remarkable in patients with COVID-19 [25] and increased TMPRSS2 expression with obesity might play a role. Random-forest based machine learning analyses revealed that the expression of TMPRSS2 in the jejunum was associated with jejunal eicosanoids derived from the n-6 fatty acids linoleic acid (9-HODE/13-HODE) and AA (12-HETE, 15-HETE, 14,15-diHETE) (Figure 2b) . It also had negative associations with several lysophosphatidylcholines (LPC) and lysophosphatidylethanolamines (LPE). These results were replicated using gradient boosted machines and lasso regression (Figure S2b,d) . Similarly, the expression of TMPRSS2 in the colon was associated with the plasma levels of the AA-derived eicosanoid Leukotriene B4 (LTB4) (Figure 2c) . In line with the ACE2 results, the expression of TMPRRS2 was also associated with the medium-chain hydroxy fatty acid 2-hydroxydecanoic acid (Figure 2c) . Once more, these results were replicated using lasso regression (Figure S3b) . The circulating levels of acisoga, a catabolic product of spermidine, also had a strong association with the colonic expression of TMPRSS2. Lipidomics analyses also revealed significant associations with DPP9, TYK2, and CCR2 both in the jejunum (Figure 3a,c,e, Figure S2e -j) and colon (Figure 3b,d,f, Figure S3c-e) . Of note, the expression of the three gene transcripts was again associated with the levels of EPA in the jejunum. EPA levels were also associated with the expression of TYK2 in the colon ( Figure S3d) . Again, we found associations with several jejunal LPE and LPC in all cases. In the colon samples, we found once more associations of these gene transcripts with the plasma levels of 2-and 3-hydroxydecanoic acid and LPC. Another remarkable finding was the positive association of the DPP9, TYK2 and CCR2 expression with metabolic products of male gonadal steroids such as andro steroid monosulfate, dehydroisoandrosterone sulfate (DHEA-S), and epiandrosterone sulfate (Figure S3c-e) , but negative with metabolites resulting from the catabolism of female gonadal steroids, including pregnenolone sulfate, pregnanediol derivatives, and pregn steroid monosulfate (Figure 3 d,f, Figure S3d -e). In the present study, we found that the expression of ACE2 in peripheral leukocytes was increased in female compared to men. In addition, we identified lipidomics signatures associated with the expression of well-established (ACE2, TMPRSS2) and novel (DPP9, TYK2, CCR2) mediators of SARS-CoV2-infection involving n-3 fatty acids (particularly EPA), AA eicosanoids, and gonadal steroids. The apparently discrepant results regarding ACE2 (increased circulating ACE2 among men in one study [8] and decreased ACE2 expression in leukocytes in the current study) could be due to several confounding factors. As recently noted, ACE2 gene maps on the nonpseudoautosomal, that is, the X-specific, region of the chromosome X (Xp22.2) [26] . In order to balance the dosage effect of X-linked genes, one of the two X chromosomes is randomly inactivated in females during development. In fact, ACE2 presents remarkable differences in male-female expression levels [27] . Tissue differences in the escape from this X inactivation process have been suggested to directly translate into tissue-specific sex biases in gene expression [27] . Escape of ACE2 from X-inactivation resulted in low levels of expression in the liver, lung, and visceral adipose tissue of women [28] . Conversely, ACE2 expression levels in colon transverse and subcutaneous adipose tissue were significantly higher in females than males [27] . We now add here peripheral leukocytes. It is intriguing that lymphocytes lacked ACE2 expression in a previous study, suggesting an alternative mechanism by which SARS-CoV-2 compromises T lymphocytes [28] . However, macrophages frequently communicate with coronavirus targets through chemokine and phagocytosis signalling, highlighting the importance of tissue macrophages in immune defence and immune pathogenesis [29] . The most remarkable finding from the current study was the association of n-3 fatty acids and AA-derived eicosanoids with the expression of SARS-CoV-2 mediators/receptors in peripheral leukocytes, jejunum and colon. Specifically, the jejunal levels of EPA strongly associated with the expression of ACE2, DPP9, TYK2, and CCR2 in the jejunum. In addition, circulating EPA levels were negatively associated with the expression of ACE2 in the colon, whereas plasma and jejunal levels of eicosanoids derived from n-6 fatty acids such as linoleic acid (9-HODE, 13-HODE) and AA (LTB4, 12-HETE, 15-HETE, 14,15-DiHETE) were positively associated with the expression of ACE2 and TMPRSS2. Most significant in the context of SARS-CoV-2 infection, which is characterized by excessive and uncontrolled inflammation, there is extensive evidence for an anti-inflammatory and pro-resolution effect of long-chain n-3 PUFA after they incorporate into cell membrane phospholipids [30, 31] . Among the responsible mechanisms, increases in EPA and DHA content of cell membrane results in decreased production of the n-6 PUFA AA and derived eicosanoids [32] , most of them with well-established proinflammatory role [33] . Of note, the proteins encoded by DPP9, TYK2 and CCR2 play an important in inflammatory processes and have been involved in the biological mechanisms related to critical illness in COVID-19 by driving inflammatory lung injury [11] . On the other hand, lipids, including PUFA, exhibited multiple roles in antiviral innate and adaptive response [34] . n-3 PUFA may interfere with virus entry by modulating lipids rafts [35] , where both ACE2 and TMPRSS2 are known to be mainly expressed. Similarly, an in silico study has recently shown that n3-PUFA could bind to the close form of the S glycoprotein and reduce viral entry to host cells [36] . Consistently, a recent performing target-based ligand screening using the receptor-binding domain SARS-CoV-2 sequence, identified PUFA as the most efficient agents interfering with the binding to ACE2 [37] . Remarkably, in line with our findings, EPA significantly blocked the entry of SARS-CoV-2 and reduced the activity of TMPRSS2. In addition, n-3 PUFA can also interfere with viral replication. For instance, MERS-CoV manipulates host cellular lipid metabolism and reprograms de novo SREBP-dependent lipogenesis pathway to ensure its replication [38] and both EPA and DHA are well-known inhibitors of SREBP transcription and maturation. As increased intakes of long-chain n-3 PUFAs result in increased levels of these fatty acids in blood lipids, blood cells and many tissues, including the intestinal epithelium [39] , a diet enriched in EPA, DPA and DHA may be useful in preventing viral exposure through the gastrointestinal route. In fact, a recent pilot study in 100 patients found a strong trend towards reduction in mortality from COVID-19 in those patients with higher red blood cell EPA + DHA content. Thus, patients in the highest quartile were 75% less likely to die than those in the other quartiles (OR = 0.25, P = 0.07) [40] . Consistently, n-3 PUFA supplementation has recently shown to improve the levels of several respiratory (pH, bicarbonate, baes excess) and renal function (blood urea nitrogen, creatinine) parameters and 1-month survival rates compared to the control group [41] . We also found a consistent association between the plasma levels of the medium chain hydroxy fatty acids 2-hydroxydecanoic acid and 3-hydroxydecanoic acid the expression of ACE2, TMPRSS2, DPP9, and CCR2 in the colon. Notably, 2-hydroxydecanoic acid has shown to inhibit the replication of varicella-zoster virus, poliovirus and HIV-1 [42] . Recently, hydroxy polyunsaturated fatty acids have also shown antiviral activity against measles influenza virus [43] . In line with this, we also found a strong association between the expression of ACE2 in peripheral leukocytes and the circulating levels of 19,20-dihydroxydocosapentaenoic acid. Of note, a similar compound, 10,17-dihydroxydocosahexaenoic, has been identified as a potent inhibitor of virus replication by preventing viral RNA export from the nucleus to the cytoplasm by interfering with binding of host nuclear receptor factors to viral RNA [44] . It is also worth noting that we found several markers of RNA damage (pseudouridine, 1-methyladenosine) associated with the expression of ACE2 in the colon. On the other hand, we also found that 17-estradiol-3-sulfate, an endogenous oestrogen metabolite, was positively associated with the expression of ACE2 in the peripheral leukocytes, even after controlling for sex. In line these results, 17-estradiol, a primarily female sex steroid, has shown to regulate the gene expression of ACE2, but not that of TMPRSS2, in differentiated airway epithelial cells [45] and human endothelial cells [46] . Similarly, we found that metabolites derived from male gonadal steroids were positively associated with the expression of DPP9, TYK2 and CCR2, while female gonadal steroids had a negative association. In agreement with these findings, progesterone has shown to promote the ACE2 expression in the endometrial stroma in both humans and mice [47] . These results would agree with female sex as a protective factor against SARS-CoV-2 infection [48] . In conclusion, we have identified consistent lipidomics signatures involving n-3 long-chain PUFA, hydroxy fatty acids, and female gonadal steroids linked to well-established and novel mediators/receptors for SARS-CoV-2 infection. Our results highlight the importance of targeting a diet rich in long-chain omega-3 to modulate the host inflammatory response in SARS-CoV-2 infection. in peripheral leukocytes. c) Boxplots of the VIM for the jejunal lipids associated with the expression of ACE2 in the jejunum. d) Spearman's rank correlations ACE2 expression in the jejunum with macronutrients, vitamins, minerals, amino acids and fatty acids derived from food frequency questionnaires. e) Boxplots of the VIM for the plasma lipids and f) metabolites associated with the expression of ACE2 in the colon. In the boxplots, the red dot represents the mean and the color bar above each plot indicates the sign of the association among the metabolites or lipids with the expression of the corresponding receptor, with red indicating negative correlation and green positive correlation. Significant metabolites and lipids were identified using a multiple random forest-based machine learning variable selection strategy as implemented in the Boruta algorithm with 5000 trees and 500 iterations and P Bonferroni <0.005. in the colon. In the boxplots, the red dot represents the mean and the color bar above each plot indicates the sign of the association among the metabolites or lipids with the expression of the corresponding receptor, with red indicating negative correlation and green positive correlation. Significant metabolites and lipids were identified using a multiple random forest-based machine learning variable selection strategy as implemented in the Boruta algorithm with 5000 trees and 500 iterations and P Bonferroni <0.005. associated with the expression of DPP9 in the jejunum and colon, respectively. c) Boxplots of the variable importance measure (VIM) for the jejunal and d) plasma lipids associated with the expression of TYK2 in the jejunum and colon, respectively. e) Boxplots of the variable importance measure (VIM) for the jejunal and f) plasma lipids associated with the expression of CCR2 in the jejunum and colon, respectively. In the boxplots, the red dot represents the mean and the color bar above each plot indicates the sign of the association among the metabolites or lipids with the expression of the corresponding receptor, with red indicating negative correlation and green positive correlation. Significant metabolites and lipids were identified using a multiple random forest-based machine learning variable selection strategy as implemented in the Boruta algorithm with 5000 trees and 500 iterations and P Bonferroni <0.005. Figure S1 . Lipidomics signatures associated with the expression of ACE2 in peripheral leukocytes using gradient boosted machines and lasso regression. a) SHAP summary plot for the lightGBM model. Each dot represents and individual sample in the model. The X-axis represents the SHAP value: the impact of a specific lipid on the prediction of the expression in a specific individual, with points to the right and left indicating that for that individual this lipid contributed to increasing or decreasing the expression of ACE2, respectively. Lipids are sorted in decreasing order based on their overall importance for final prediction (sum of SHAP values). Colors represent the values of the lipids normalized concentrations, ranging from yellow (low concentrations of the specific lipid) to purple (high concentrations of the specific lipid). b) Coefficients plot for the lasso regression model. In the interest of transparency, we ask you to disclose all relationships/activities/interests listed below that are related to the content of your manuscript. "Related" means any relation with for-profit or not-for-profit third parties whose interests may be affected by the content of the manuscript. Disclosure represents a commitment to transparency and does not necessarily indicate a bias. If you are in doubt about whether to list a relationship/activity/interest, it is preferable that you do so. The following questions apply to the author's relationships/activities/interests as they relate to the current manuscript only. The author's relationships/activities/interests should be defined broadly. For example, if your manuscript pertains to the epidemiology of hypertension, you should declare all relationships with manufacturers of antihypertensive medication, even if that medication is not mentioned in the manuscript. In item #1 below, report all support for the work reported in this manuscript without time limit. For all other items, the time frame for disclosure is the past 36 months. form. Baseline Characteristics and Outcomes of 1591 Patients Infected with SARS-CoV-2 Admitted to ICUs of the Lombardy Region Case-Fatality Rate and Characteristics of Patients Dying in Relation to COVID-19 in Italy Presenting Characteristics, Comorbidities, and Outcomes among 5700 Patients Hospitalized with COVID-19 in the Genetic correlations between COVID-19 and a variety of traits and diseases Genomic variation, origin tracing, and vaccine development of SARS-CoV-2: A systematic review SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor Overexpression of the SARS-CoV-2 receptor ACE2 is induced by cigarette smoke in bronchial and alveolar epithelia Circulating plasma concentrations of angiotensin-converting enzyme 2 inmen and women with heart failure and effects of renin-angiotensin-aldosterone inhibitors The pivotal link between ACE2 deficiency and SARS-CoV-2 infection Obesity alters Ace2 and Tmprss2 expression in lung, trachea, and esophagus in a sex-dependent manner: Implications for COVID-19 Genetic mechanisms of critical illness in Covid-19 Gastrointestinal Manifestations of SARS-CoV-2 Infection and Virus Load in Fecal Samples From a Hong Kong Cohort: Systematic Review and Meta-analysis Evidence for Gastrointestinal Infection of SARS-CoV-2 The aging imageomics study: rationale, design and baseline characteristics of the study population Metabolomic analysis of the cerebrospinal fluid reveals changes in phospholipase expression in the CNS of SIVinfected macaques STAR: Ultrafast universal RNA-seq aligner RSEM: Accurate transcript quantification from RNA-Seq data with or without a reference genome Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Cross-omics analysis revealed gut microbiome-related metabolic pathways underlying atherosclerosis development after antibiotics treatment Feature selection with the boruta package Evaluation of variable selection methods for random forests and omics data sets LightGBM: A highly efficient gradient boosting decision tree A unified approach to interpreting model predictions SARS-CoV-2 entry factors are highly expressed in nasal epithelial cells together with innate immune genes Prevalence and Characteristics of Gastrointestinal Symptoms in Patients With Severe Acute Respiratory Syndrome Coronavirus 2 Infection in the United States: A Multicenter Cohort Study COVID-19 and ACE2 in the Liver and Gastrointestinal Tract: Putative Biological Explanations of Sexual Dimorphism Landscape of X chromosome inactivation across human tissues Tissue distribution of ACE2 protein, the functional receptor for SARS coronavirus. A first step in understanding SARS pathogenesis Single cell RNA sequencing of 13 human tissues identify cell types and receptors of human coronaviruses Very long-chain n-3 fatty acids and human health: Fact, fiction and the future Nutritional Lipids and Mucosal Inflammation Dose-related effects of eicosapentaenoic acid on innate immune function in healthy humans: A comparison of young and older men Eicosanoid storm in infection and inflammation Lipids in innate antiviral defense Docosahexaenoic acid regulates the formation of lipid rafts: A unified view from experiment and simulation In silico study of polyunsaturated fatty acids as potential sars-cov-2 spike protein closed conformation stabilizers: Epidemiological and computational approaches Polyunsaturated ω-3 fatty acids inhibit ACE2-controlled SARS-CoV-2 binding and cellular entry SREBP-dependent lipidomic reprogramming as a broad-spectrum antiviral target Omega-3 Polyunsaturated Fatty Acids and the Intestinal Epithelium-A Review Blood omega-3 fatty acids and death from COVID-19: A pilot study The effect of omega-3 fatty acid supplementation on clinical and biochemical parameters of critically ill patients with COVID-19: a randomized clinical trial Antiviral Activity of 2-Hydroxy Fatty Acids Polyunsaturated fatty acids from Phyllocaulis boraceiensis mucus block the replication of influenza virus The lipid mediator protectin D1 inhibits influenza virus replication and improves severe influenza Estrogen regulates the expression of SARS-CoV-2 receptor ACE2 in differentiated airway epithelial cells Estradiol, acting through ERα, induces endothelial non-classic renin-angiotensin system increasing angiotensin 1-7 production The SARS-CoV-2 receptor, angiotensin-converting enzyme 2, is required for human endometrial stromal cell decidualization † Implications of Sex Differences in Immunity for SARS-CoV-2 Pathogenesis and Design of Therapeutic Interventions Investing in your future") and the project PI20/01090 (co-funded by European Regional Development Fund "A way to make Europe Please place an "X" next to the following statement to indicate your agreement: X I certify that I have answered every question and have not altered the wording of any of the questions on this