key: cord-0425854-l9uwb2oq authors: Alvarez-Mulett, S.; Buyukozkan, M.; Racanelli, A. C.; Schmidt, F.; Batra, R.; Hoffman, K. L.; Sarwath, H.; Engelke, R.; Gomez-Escobar, L.; Simmons, W.; Benedetti, E.; Chetnik, K.; Schenck, E.; Suhre, K.; Choi, J. J.; Zhao, Z.; Racine-Brzostek, S.; Yang, H. S.; Choi, M. E.; Choi, A. M. K.; Cho, S. J.; Krumsiek, J. title: Integrative Metabolomic and Proteomic Signatures Define Clinical Outcomes in Severe COVID-19 date: 2021-07-22 journal: nan DOI: 10.1101/2021.07.19.21260776 sha: a7f94923c24e7aa42f7a300d5501d392ad59f9d0 doc_id: 425854 cord_uid: l9uwb2oq The novel coronavirus disease-19 (COVID-19) pandemic caused by SARS-CoV-2 has ravaged global healthcare with previously unseen levels of morbidity and mortality. To date, methods to predict the clinical course, which ranges from the asymptomatic carrier to the critically ill patient in devastating multi-system organ failure, have yet to be identified. In this study, we performed large-scale integrative multi-omics analyses of serum obtained from COVID-19 patients with the goal of uncovering novel pathogenic complexities of this disease and identifying molecular signatures that predict clinical outcomes. We assembled a novel network of protein-metabolite interactions in COVID-19 patients through targeted metabolomic and proteomic profiling of serum samples in 330 COVID-19 patients compared to 97 non-COVID, hospitalized controls. Our network identified distinct protein-metabolite cross talk related to immune modulation, energy and nucleotide metabolism, vascular homeostasis, and collagen catabolism. Additionally, our data linked multiple proteins and metabolites to clinical indices associated with long-term mortality and morbidity, such as acute kidney injury. Finally, we developed a novel composite outcome measure for COVID-19 disease severity and created a clinical prediction model using a set of 33 metabolites. 
The model significantly improved the identification of key events of critical illness such as prolonged hospitalization, supplemental oxygen requirement, acute kidney injury (AKI), and acute respiratory distress syndrome (ARDS), beyond those achieved using the more traditional risk factors of age, gender, and BMI.
medRxiv preprint doi: https://doi.org/10.1101/2021.07.19.21260776; this version posted July 22, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
The novel coronavirus disease 2019 (COVID-19) has a broad spectrum of clinical features that range from asymptomatic disease to acute respiratory distress syndrome (ARDS)(1, 2). COVID-19 ARDS can lead to refractory hypoxia, mechanical ventilation, prolonged intensive care unit (ICU) stay and increased mortality(3). Previous studies have shown a high incidence of concomitant organ failure in COVID-19, including acute kidney injury (AKI)(4), acute liver injury(5), thromboembolic events(6, 7) and secondary infections contributing to a fatal outcome(8).
Massive investigative efforts by multiple scientific groups have used proteomic and metabolomic approaches to begin to unravel disease mechanisms relevant to SARS-CoV-2 infection such as inflammation, coagulation, and metabolism(9). However, how COVID-19 specific protein-metabolite interactions relate to the severity of disease and [...]
[...] of the cohort can be found in Table 1 and Supplementary Table 1. Of note, we excluded samples collected after intubation because we found that the clinical act of intubation significantly alters a patient's metabolic profile (Supplementary Table 2).
Differential metabolomic analysis identified significant changes in abundances of 70 out of the 125 analyzed metabolites between COVID-19 and controls at a false discovery rate (FDR) of 0.05 (Figure 2A). The top three differentially expressed metabolites were involved in amino acid metabolism: N-acetyl-L-aspartic acid (p-value = 9.19E-18), N-acetyl-aspartyl-glutamic acid (p-value = 5.30E-15), and argininosuccinic acid (p-value = 6.35E-12) (Figure 2B). KEGG pathway mapping of the differentially expressed metabolites revealed an involvement of various metabolic pathways (Figure 2C). These pathways included arginine and proline metabolism, glycine and serine metabolism, alanine metabolism, methionine metabolism, sphingolipid metabolism, gluconeogenesis, and the TCA cycle pathway, demonstrating involvement of the broader categories of amino acid, lipid and energy metabolism in COVID-19 pathogenesis.
Comparative proteomic analysis identified significant changes in the expression of 48 out of the 266 analyzed proteins between COVID-19 and controls at an FDR of 0.05 (Figure 2D). The top three differentially expressed proteins were C-X-C motif chemokine ligand 10 (CXCL10) (p-value = 5.09E-13), galectin 9 (Gal-9) (p-value = 5.49E-10), and monocyte chemoattractant protein 3 (MCP-3) (p-value = 2.85E-09) (Figure 2E). KEGG pathway mapping revealed that these proteins participated in various protein pathways (Figure 2F) including the interleukin 17 (IL-17), tumor necrosis factor (TNF) and JAK-STAT signaling pathways. Detailed results of the differential analysis and pathway mappings can be found in Supplementary Table 3.
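The FDR thresholds quoted for the differential analyses can be illustrated with a short sketch. It assumes the Benjamini-Hochberg step-up procedure, which is a common way to control the FDR but is not named in this section; the function name and the p-values below are illustrative only and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    Assumption: BH is used here for illustration; the section above only
    states that an FDR of 0.05 was applied.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for eight analytes (not study data).
pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
qvalues = benjamini_hochberg(pvalues)
n_significant = sum(q <= 0.05 for q in qvalues)
print(n_significant, "of", len(pvalues), "analytes pass FDR 0.05")
```

In this toy input, several raw p-values fall below 0.05 but only the two smallest survive the adjustment, which is the kind of filtering an FDR cutoff performs when many analytes are tested at once.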
To obtain further mechanistic insight into the biology of COVID-19, we developed a comprehensive, data-driven network for the integrative analysis of our multi-omics dataset. We first generated a Gaussian graphical model (GGM) of correlated metabolites and proteins from our COVID-19 cohort (Figure 3A, Supplementary Data 1). GGMs are partial correlation-based network models that we have previously demonstrated to accurately reconstruct biological pathways from blood-based omics data(10-12). A minimum spanning tree-based algorithm was then used to identify a focused subnetwork that connects the most significantly correlated metabolites and proteins from the original network (Figure 3B). The subnetwork included 13 proteins from the Olink inflammatory panel, 32 proteins from the Olink cardiovascular panels II and III, and 70 metabolites. From this subnetwork, we selected 4 network modules to query the interplay between metabolism, inflammation, and vascular dysfunction in COVID-19 patients.
[...] Figure 3B). This finding is consistent with the hyperinflammatory state of COVID-19 infection in which cytokine storm is often observed(15). Cytosine, a pyrimidine nucleobase that is an essential metabolite for cell proliferation and survival, is commonly upregulated in the host [...]
[...] phenotype. Of note, elevated plasma activity of cathepsin D has been found in patients
with type 2 diabetes, suggesting a link between abnormal vasculature and the dysregulated glucose metabolism seen in our higher-risk diabetic COVID-19 patients(24). Taken together, these data suggest that the disruption of vascular and glucometabolic homeostasis in COVID-19 is mediated by cathepsin D.
[...] Figure 4C. Remarkably, hexanoylcarnitine and cytosine, which we earlier showed to be upregulated in COVID-19, were among the 20 metabolites associated with all three of these clinical indices. This finding supports the potential utility of [...]
[...] nucleotides, 6 lipids and 2 metabolites related to energy metabolism (Figure 5C). Interestingly, our metabolite-based model showed improvement over the baseline model not only for predicting the composite outcome, but also for predicting some of its individual components (i.e., intubation, AKI, and length of hospital stay; Figure 5D).
[...] that have dichotomized outcomes into severe and non-severe cases, which is known to be statistically less powerful and can lead to bias(27, 28). Taken together, we believe that our analysis provides a higher-powered and more holistic assessment of the added value of metabolites to predict disease severity as well as the effect on overall quality of life, rather than just predicting survival.
The novel coronavirus has ravaged the global healthcare system due to its high [...]
There is increasing evidence that evaluating symptoms and multiple clinical outcomes during acute disease is crucial in determining the risk of long COVID-19(29). To the best of our knowledge, we are the first group to develop a composite outcome measure in COVID-19 using multiple clinical indices in a prediction model that assesses not only COVID-19 disease severity but also the sequelae of COVID-19 that characterize post-acute COVID-19 syndrome (PACS). Compared to dichotomous outcome measures such as death and survival, our composite outcome score reflects a broader, more holistic assessment of COVID-19 morbidity in the hospital setting.
Several key strengths underlie our study cohort. As opposed to the use of healthy controls reported in other COVID-19 studies(9, 26, 30-32), our use of non-COVID patient samples, in the same hospital during the same time period between March and April 2020, allowed us to investigate the interactions highly relevant to COVID-19 pathogenesis and clinical course. We furthermore had relatively large sample sizes compared to other studies, with hundreds of samples available for both metabolomic and proteomic analysis.
Our study has several limitations. As alterations in proteome and metabolome were analyzed in sera but not in lung tissues or bronchoalveolar lavage fluid, our results may not reflect what occurs at tissue-specific cellular levels.
Furthermore, based on the current study design and methodology, the relationships we report between metabolomic and proteomic alterations and SARS-CoV-2 outcomes should be interpreted as correlative rather than causal in nature. Additional studies are required to define the mechanistic roles of individual molecules highlighted in this paper. Finally, as our study was only a single-center investigation, our results will need to be validated in other [...]
In conclusion, our investigation has sought to not only define the metabolomic and proteomic signatures of COVID-19, but also to explore interactions between metabolites and proteins that can serve as a roadmap for future mechanistic studies. We have furthermore proposed a novel clinical composite outcome score that can be used in a clinical prediction model for COVID-19. Ultimately, a better understanding of the pathophysiology of COVID-19 at the molecular level may lead to short-term and long-term targeted therapies.
This is a single-center prospective analysis of one cohort comparing hospitalized COVID- [...]
[...] within the COVID-IDR. WC-CEDAR(33) is a critical care database originally designed to automatically extract, transform, and store EHR data on Intensive Care Unit (ICU) [...]
Metabolite profiling was performed according to methods described in a previous publication(34). In brief, metabolites were extracted using pre-cooled 80% methanol and measured on a Q Exactive Orbitrap mass spectrometer (Thermo Scientific), which was coupled to a Vanquish UPLC system (Thermo Scientific) via an Ion Max source and HESI II probe (Thermo Scientific).
The Q Exactive was operated in full scan, polarity-switching mode. A Sequant ZIC-HILIC column (2.1 mm i.d. × 150 mm, Merck) was used for separation of metabolites. Flow rate was set at 150 μL/min. Buffers consisted of 100% acetonitrile for mobile phase A, and 0.1% NH4OH/20 mM CH3COONH4 in water for mobile phase B. Gradient ran from 85% to 30% A in 20 min followed by a wash with 30% A and re-equilibration at 85% A. Metabolites were identified based on exact mass within 5 ppm and standard retention times.
Children, pregnant women, and samples after intubation were excluded from all analyses. The metabolomics data was measured in three different batches. For each batch, data was preprocessed by filtering out samples with more than 50% missing values, followed by filtering out metabolites with more than 25% missing values, probabilistic quotient [...]
The dataset was first reduced to the samples that overlap between metabolomics and proteomics, and corrected for age, sex, BMI, and COVID-19 status (yes/no). A Gaussian Graphical Model (GGM) based network was then constructed using the GeneNet algorithm(39), drawing an edge for all partial correlations with an FDR smaller than 0.2. In a second step, this network was condensed to highlight the connections between molecules that were significantly different between COVID-19 and controls. To this end, a shortest-path distance matrix between all molecules was constructed and subset to the significant molecules. A minimum spanning tree(40) of this matrix was then constructed to visualize a simplified network.
[...] proposed by Ripatti and Palmgren(42). An optimal LASSO penalty parameter was [...]
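The network condensation step described earlier in this section (shortest-path distance matrix over all molecules, subset to the significant molecules, then a minimum spanning tree) can be sketched in plain Python. This is a minimal illustration and not the authors' implementation: the node names, the edge list, the use of unweighted hop counts as distances, and the choice of Prim's algorithm are all assumptions made for the example.

```python
from collections import deque


def shortest_path_lengths(edges, nodes):
    """Hop-count shortest paths between all nodes (breadth-first search)."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    dist = {}
    for source in nodes:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            cur = queue.popleft()
            for nxt in adjacency[cur]:
                if nxt not in seen:
                    seen[nxt] = seen[cur] + 1
                    queue.append(nxt)
        dist[source] = seen
    return dist


def minimum_spanning_tree(nodes, dist):
    """Prim's algorithm on the complete graph weighted by dist[u][v]."""
    nodes = list(nodes)
    in_tree = {nodes[0]}
    tree = []
    while len(in_tree) < len(nodes):
        best = None
        for u in in_tree:
            for v in nodes:
                if v in in_tree or v not in dist[u]:
                    continue
                candidate = (dist[u][v], u, v)
                if best is None or candidate < best:
                    best = candidate
        if best is None:
            raise ValueError("significant nodes are not all connected")
        tree.append(best)
        in_tree.add(best[2])
    return tree


# Hypothetical network: P* are proteins, M* are metabolites (not study data).
all_nodes = ["P1", "M1", "M2", "P2", "M3", "P3"]
network_edges = [("P1", "M1"), ("M1", "M2"), ("M2", "P2"),
                 ("P2", "M3"), ("M3", "P3")]
significant = ["P1", "P2", "P3"]  # assumed to differ between groups

dist = shortest_path_lengths(network_edges, all_nodes)
tree = minimum_spanning_tree(significant, dist)
for weight, u, v in tree:
    print(f"{u} -- {v} (path length {weight})")
```

Each edge in the resulting tree stands for a shortest path in the full network, so molecules that did not differ between groups can still appear as intermediates when the simplified network is drawn.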
There are legal and ethical restrictions on data sharing because the Institutional Review Board of Weill Cornell Medicine did not approve public data deposition. The data set used for this study constitutes sensitive patient information extracted from the electronic health record. Accordingly, it is subject to federal legislation that limits our ability to disclose it to the public, even after it has been subjected to deidentification techniques. To request access to the de-identified minimal dataset underlying these findings, interested and [...]
[...] Coronavirus Disease 2019 in China
[...] CoV-2 Infections and Transmission in a Skilled Nursing Facility. N Engl [...]
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. The Lancet
Acute kidney injury in patients hospitalized with COVID-19
Liver injury in COVID-19: management and challenges. The [...]
The 488 Thrombosis in Hospitalized Patients With COVID-19 in a New York City Health System Abnormal coagulation parameters are associated with 493 poor prognosis in patients with novel coronavirus pneumonia Causes of 496 death and comorbidities in hospitalized patients with COVID-19 Longitudinal 499 analyses reveal immunological misfiring in severe COVID-19 Gaussian graphical modeling 501 reconstructs pathway reactions from high-throughput metabolomics data Mining the 504 unknown: a systems approach to metabolite identification combining genetic and metabolic 505 information Network 507 inference from glycoproteomics data reveals new reactions in the IgG glycosylation pathway CXCL10, CXCL11/CXCR3 axis for immune activation -A target for novel cancer therapy Translocase in the Inner Mitochondrial Membrane Immune cartography of macrophage 516 activation syndrome in the COVID-19 era Presence and role of cytosine methylation in . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted July 22, 2021. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted July 22, 2021. ; https://doi.org/10.1101/2021.07.19.21260776 doi: medRxiv preprint 604 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted July 22, 2021. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted July 22, 2021. 