key: cord-0838227-chnstzp7
authors: Hunter, E.; Koutsothanasi, C.; Wilson, A.; Santos, F. C.; Salter, M.; Westra, J.; Powell, R.; Dring, A.; Brajer, P.; Egan, B.; Matthew, P.; Catriona, W.; Aemilia, K.; Thomas, L.; Ramadass, A.; Messner, W.; Brunton, A.; Lyski, Z.; Robbins, P.; Mellor, J.; Vancheeswaran, R.; Barlow, A.; Pchejetski, D.; Akoulitchev, A.
title: Development and validation of blood-based prognostic biomarkers for severity of COVID disease outcome using EpiSwitch 3D genomic regulatory immuno-genetic profiling.
date: 2021-06-28
DOI: 10.1101/2021.06.21.21259145
sha: e7de3b57f1fc275def868304817e275d2497a8a6
doc_id: 838227
cord_uid: chnstzp7

The COVID-19 pandemic has raised several global public health challenges to which the international medical community has responded. Diagnostic testing and the development of vaccines against the SARS-CoV-2 virus have made remarkable progress to date. As the population is now faced with the complex lifestyle and medical decisions that come with living in a pandemic, a forward-looking understanding of how a COVID-19 diagnosis may affect the health of an individual represents a pressing need. Previously, we used a whole-genome microarray to identify 200 3D genomic marker leads that could predict mild or severe COVID-19 disease outcomes from blood samples in a multinational cohort of COVID-19 patients. Here, we focus on the development and validation of a qPCR assay to accurately predict severe COVID-19 disease requiring intensive care unit (ICU) support and/or mechanical ventilation. From the 200 original biomarker leads we established a classification model containing six markers. The markers were qualified and validated on 38 COVID-19 patients from an independent cohort. Overall, the six-marker model obtained a positive predictive value of 93% and a balanced accuracy of 88% across 116 patients for the prognosis of COVID-19 severity requiring ICU care/ventilation support.
The six-marker signature identifies individuals at the highest risk of developing severe complications in COVID-19 with high predictive accuracy and can assist in patient prognosis and clinical management decisions.

Background

The COVID-19 outbreak, which the World Health Organization (WHO) declared a pandemic in March 2020, represents one of the greatest global health crises the world has faced in recent history [1]. In addition to the estimated 130+ million people who have been infected with the SARS-CoV-2 virus to date and the more than 3 million deaths attributed to COVID-19-related causes, the pandemic has placed tremendous strain on healthcare systems, caused devastating mental health crises, and tested global economic resiliency [2, 3].

EpiSwitch® 3C libraries, with chromosome conformation analytes converted to sequence-based tags, were prepared from frozen whole blood samples using [...]

This version posted June 28, 2021. The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

The 21 markers screened on 78 individual patient samples were subjected to permutated logistic modelling with bootstrapping for 500 data splits and non-parametric Rank Product analysis (EpiSwitch® RankProd R library). Two machine learning procedures (eXtreme Gradient Boosting, XGBoost, and CatBoost) were used to further reduce the feature pool and identify the most predictive/prognostic 3D genomic markers. The resulting markers were then used to build the final classifying models using CatBoost and XGBoost. All analyses were performed in the R statistical language with the Caret, XGBoost, SHAPforxgboost and CatBoost libraries.
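The resampling-and-ranking idea described in the statistical methods can be sketched in a few lines. The authors report working in R (Caret, XGBoost, CatBoost and the EpiSwitch RankProd library); the Python below is not their pipeline. It is a minimal sketch that assumes a simple mean-difference score in place of the logistic model, and the marker names and values (`marker_A`, `toy_data`, `toy_labels`) are hypothetical toy data introduced only for illustration. Markers are ranked within each bootstrap resample, and the rank product (the geometric mean of a marker's ranks across resamples) summarises how consistently it ranks near the top.

```python
import math
import random


def separation_score(values, labels):
    """Absolute difference in mean marker value between severe (1) and mild (0)."""
    severe = [v for v, y in zip(values, labels) if y == 1]
    mild = [v for v, y in zip(values, labels) if y == 0]
    if not severe or not mild:  # the resample drew only one class
        return 0.0
    return abs(sum(severe) / len(severe) - sum(mild) / len(mild))


def rank_markers(data, labels):
    """Rank markers by score; rank 1 is the best-separating marker."""
    scores = {m: separation_score(v, labels) for m, v in data.items()}
    ordered = sorted(scores, key=lambda m: (-scores[m], m))  # ties broken by name
    return {m: rank for rank, m in enumerate(ordered, start=1)}


def bootstrap_rank_product(data, labels, n_splits=500, seed=0):
    """Geometric mean of each marker's rank over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(labels)
    log_rank_sum = {m: 0.0 for m in data}
    for _ in range(n_splits):
        idx = [rng.randrange(n) for _ in range(n)]  # sample patients with replacement
        boot_labels = [labels[i] for i in idx]
        boot_data = {m: [v[i] for i in idx] for m, v in data.items()}
        for m, rank in rank_markers(boot_data, boot_labels).items():
            log_rank_sum[m] += math.log(rank)
    return {m: math.exp(s / n_splits) for m, s in log_rank_sum.items()}


# Hypothetical toy data: 6 mild (0) and 6 severe (1) patients, 3 made-up markers.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
toy_data = {
    "marker_A": [0.10, 0.20, 0.15, 0.10, 0.20, 0.15, 0.90, 1.00, 0.95, 0.90, 1.00, 0.95],
    "marker_B": [0.40, 0.50, 0.45, 0.40, 0.50, 0.45, 0.50, 0.60, 0.55, 0.50, 0.60, 0.55],
    "marker_C": [0.50, 0.30, 0.70, 0.40, 0.60, 0.50, 0.50, 0.70, 0.30, 0.60, 0.40, 0.50],
}

rank_products = bootstrap_rank_product(toy_data, toy_labels)
for marker, rp in sorted(rank_products.items(), key=lambda kv: kv[1]):
    print(f"{marker}: rank product = {rp:.2f}")
```

A lower rank product means the marker separated the two groups more consistently across resamples. In a real analysis the score would come from the fitted model, and significance would be assessed by permutation rather than read off the raw rank product.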
(This preprint was not certified by peer review.) medRxiv preprint doi: https://doi.org/10.1101/2021.06.21.21259145

Identification of the top prognostic 3D genomic markers for severe [...]

In this study we employed a sequential stepwise strategy to identify a minimal set of biomarkers that were predictive of COVID-19 disease severity [...] were procured for a Training cohort, used to build and refine the classifier model, and a Test cohort, used to assess the predictive performance of the model. Clinical characteristics of the patients are shown in Table 1 [...] with predictive power to differentiate between COVID-19 patients who required a high degree of medical disease management (e.g. admission to the intensive care unit (ICU), mechanical ventilation) and those who were hospitalized but required less interventional care and support (Supplemental Table 1 [...])

Pathway enrichment for genes contained within the 21 3D genomic markers revealed the top two pathways to be related to downstream signalling mediated by B-cell receptor activation (Table 3). Importantly, genomic loci encoding proteins involved in hemostasis/clotting were also enriched (Figure 3, Table 3). The 21 3D genomic markers were further refined to a set of 6 markers (Table 4) with predictive ability for COVID severity and applied to an independent Test cohort (Supplemental Table 2).

Testing of the prognostic 3D genomic biomarker panel for severe COVID-[...]

To assess the predictive power of the model, the 6-marker 3D genomic panel was validated on an independent Test cohort, consisting of samples that were not used to build and refine the model (Figure 4, Supplemental Table 2).
Samples were collected upon admission to COVID hospital wards in Peru, the USA, and the Dominican Republic and shipped to OBD's processing facility in Oxford, UK. The EpiSwitch platform readouts for the six-marker classifier model were uploaded to the EpiSwitch Analytical Portal for analysis. Classifier calls for high-risk COVID-19 disease outcomes are shown in Table 5. Clinical outcomes for the Test cohort included 10 mild cases and 28 severe cases requiring ventilation and/or ICU support. EpiSwitch prognostic calls based on the 6-marker model demonstrated a positive predictive value of 90.9% for high-risk disease outcomes in the Test cohort (Figure 4A). Interestingly, two of the mild-case patients (COVID 0696 and 0213; Table 5) who were identified as high risk by the EpiSwitch test subsequently died in the hospital within 28 days of admission. This suggests an early, pre-symptomatic detection of a hyperinflammatory state leading to fatal outcomes and is being investigated further. Across all 116 patients used in this study, the test demonstrated a positive predictive value for high-risk disease outcomes of 92.9%, with 88% sensitivity, 87% specificity, and a balanced accuracy of 87.9% (Table 4B and Supplementary Table 3).

[...] we used a sequential, stepwise approach employing a 78-patient Training cohort to refine the marker set and build a predictive classifier model containing six 3D genomic markers. The 6-marker model/assay was tested on an independent Test cohort of 38 COVID-19 patient blood samples.
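The performance figures reported for the classifier all derive from a 2x2 table of classifier calls against clinical outcome, and the arithmetic can be sketched as follows. The counts are an assumption: they are hypothetical values chosen to agree with the Test cohort size (10 mild, 28 severe), the two mild patients called high risk, and the reported 90.9% positive predictive value. The remaining cells are not stated in the text, so the other metrics printed here are illustrative only and are not results from the study. The function name `classifier_metrics` is introduced for this example.

```python
def classifier_metrics(tp, fp, tn, fn):
    """Compute standard binary-classifier metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # severe cases correctly called high risk
    specificity = tn / (tn + fp)  # mild cases correctly called low risk
    return {
        "ppv": tp / (tp + fp),  # high-risk calls that were truly severe
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical counts (assumption, see the lead-in): 28 severe and 10 mild patients.
metrics = classifier_metrics(tp=20, fp=2, tn=8, fn=8)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

Balanced accuracy averages sensitivity and specificity, which keeps the metric meaningful when the severe and mild groups differ in size, as they do in this cohort.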
The COVID-19 pandemic will represent a major public health crisis for months to come. As a corollary, there remains a pressing need for prognostic testing [...] Janssen, and AstraZeneca for use in the US and EU, there are still many individuals that will not be vaccinated due to 1) lack of access 2) ineligibility or [...]

Extension of the current study to a wider distribution and larger number of individuals could help define the regional, racial, and epigenetic prevalence of high-risk biomarkers in these populations.
A longitudinal observational study with collections before and after resolution of the acute and chronic phases of COVID disease will provide further invaluable insights into the mechanisms and the long-term stability of the identified systemic biomarker signature. Early evidence indicates that blood samples collected from patients before the onset of the COVID pandemic reveal high-risk profiles in some individuals. This would suggest that the biomarker profiles identified in this study are not emerging in response to COVID infection, but rather represent a pre-existing default state on the spectrum of outcome susceptibility.

There are several immediate implications of the results reported here. The availability of a simple blood-based assay that provides a readout of the likely disease course if infected with SARS-CoV-2 is especially helpful for the triage of individuals who either 1) do not have access to COVID-19 vaccines (due to underlying medical conditions, location, or age, for example) or 2) choose to forgo vaccination for other reasons. It has been well appreciated that the heterogeneity seen in COVID-19 disease outcomes is largely defined by the host response, rather than by the virus or its variants [15].

Here we report on the development and validation of a predictive blood-based assay that can identify, with high accuracy, individuals who are at the highest risk of developing severe complications in COVID-19 disease. The 3D genomic [...]
[...] assistance in data analysis. In addition, we acknowledge Boca Biolistics LLC and Reprocell USA Inc. for the timely provision of high-quality clinical blood [...]

The authors declare that they have no competing interests.

Written informed consent for publication was obtained from all authors.

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Ethics approval and consent to participate

All patients signed informed consent forms prior to providing blood samples. All ethical guidelines were followed.

WHO declares COVID-19 a pandemic
Combating COVID-19: health equity matters
Mental Health and the Covid-19 Pandemic.
N Engl [...]
FDA EUAs for Molecular COVID-19 Diagnostic Tests
FDA EUAs for Antigen COVID-19 Diagnostic Tests
Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans
Baseline Characteristics and Outcomes of 1591 Patients Infected With SARS-CoV-2
Characteristics, treatment, outcomes and cause of death of invasively ventilated patients with COVID-19 ARDS in [...]
Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan
10. WHO. COVID-19 Clinical Management
BNT162b2 mRNA Covid-19 Vaccine
Efficacy and Safety of the mRNA-1273 SARS-CoV-2 Vaccine
ChAdOx1 nCoV-19 vaccine (AZD1222) against SARS-CoV-2: an interim analysis of four randomised controlled trials in Brazil, South Africa, and the UK
University of Chile-COVID-19
3D genomic capture of regulatory immuno-genetic profiles in COVID-19 patients for prognosis of severe COVID disease outcome
Epigenetic chromatin conformation changes in peripheral blood can [...]
Chromatin conformation changes in peripheral blood can detect prostate cancer and stratify disease risk groups
Chromosome conformation signatures define predictive markers of inadequate response to methotrexate in early rheumatoid arthritis
Initial Identification of a Blood-Based Chromosome Conformation Signature for Aiding in the Diagnosis of Amyotrophic Lateral Sclerosis
A pilot study of chromosomal aberrations and epigenetic changes in peripheral blood samples to identify patients with melanoma
Development and validation of baseline predictive biomarkers for response to avelumab in second-line (2L) non-small cell lung cancer (NSCLC) using EpiSwitchTM epigenetic profiling
Spectroscopic features of dual fluorescence/luminescence resonance energy-transfer molecular beacons
The STRING database in 2017: Quality-controlled protein-protein association networks, made broadly accessible
COVID-19
Vaccination in the EU
European Centre for Disease Prevention and Control: EU Vaccination [...]
Update on the Department of Defense's Evolving Roles and Mission in Response to the COVID-19 Pandemic
[...] Vaccination in a Large University Healthcare System
United States Centers for Disease Control and Prevention: COVID Data [...]
Reduction in COVID-19 Patients Requiring Mechanical Ventilation Following Implementation of a National COVID-19 Vaccination Program - Israel
Genomic characteristics and clinical effect of the emergent SARS-CoV-2 B.1.1.7 lineage in London, UK: a whole-genome sequencing and hospital-based cohort study
Genetic Variants of SARS-CoV-2-What Do They Mean?
Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England. Science (80-)
Torjesen I. Covid-19: Delta variant is now UK's most dominant strain and spreading through schools
Origins and evolution of viruses of eukaryotes: The ultimate modularity
Qualitatively distinct modes of Sputnik V vaccine-neutralization escape by SARS-CoV-2 [...]
SARS-CoV-2 Entry Related Viral and Host Genetic Variations: Implications on COVID-19 Severity, Immune Escape, and Infectivity
The D614G mutations in the SARS-CoV-2 spike protein: Implications for viral infectivity, disease severity and vaccine design
Association of SWAP-70 with the B cell antigen receptor complex
A B-cell-specific DNA recombination complex
Longitudinal antibody repertoire in "mild" versus "severe" COVID-19 patients reveals immune markers associated with disease severity and resolution
High risk of thrombosis in patients with severe SARS-CoV-2 infection: a multicenter prospective cohort study
Complement associated microvascular injury and thrombosis in the pathogenesis of severe COVID-19 infection: A report of five cases
Pulmonary Vascular Endothelialitis, Thrombosis, and Angiogenesis in [...]
Myocardial injury and COVID-19: Possible mechanisms
Magadum A,
Kishore R. Cardiovascular Manifestations of COVID-19