key: cord-1038495-7nkmdmkd authors: Sanchez-Montalva, A.; Alvarez-Sierra, D.; Martinez-Gallo, M.; Perurena-Prieto, J.; Arrese-Munoz, I.; Ruiz-Rodriguez, J. C.; Espinosa-Pereiro, J.; Bosch-Nicolau, P.; Martinez-Gomez, X.; Anton, A.; Martinez-Valle, F.; Riveiro-Barciela, M.; Rodriguez-Frias, F.; Castellano-Escuder, P.; Poyato-Canton, E.; Bas-Minguet, J.; Martinez-Caceres, E. M.; Sanchez-Pla, A.; Zurera-Egea, C.; Teniente-Serra, A.; Hernandez-Gonzalez, M.; Pujol Borrell, R. title: Exposing and Overcoming Limitations of Clinical Laboratory Tests in COVID-19 by Adding Immunological Parameters date: 2022-02-02 journal: nan DOI: 10.1101/2022.01.29.22270016 sha: 8d907cbd30aa06b23f061fff124f6953d714fb99 doc_id: 1038495 cord_uid: 7nkmdmkd Background: Almost two years since the onset of the COVID-19 pandemic no predictive algorithm has been generally adopted, nor new tests identified to improve the prediction and management of SARS-CoV-2 infection. Methods: Retrospective observational analysis of the predictive performance of clinical parameters and laboratory tests in hospitalised patients with COVID-19. Outcomes were 28-day survival and maximal severity in a cohort of 1,579 patients and two validation cohorts of 598 and 434 patients. A pilot study conducted in a patient subgroup measured 17 cytokines and 27 lymphocyte phenotypes to explore additional predictive laboratory tests. Findings: 1) Despite a strong association of 22 clinical and laboratory variables with the outcomes, their joint prediction power was limited due to redundancy. 2) Eight variables: age, comorbidity index, oxygen saturation to fraction of inspired oxygen ratio, neutrophil-lymphocyte ratio, C-reactive protein, aspartate aminotransferase/alanine aminotransferase ratio, fibrinogen, and glomerular filtration rate captured most of the statistical predictive power. 3) The interpretation of clinical and laboratory variables was improved by grouping them in categories. 
4) Age and organ damage-related tests were the best predictors of survival, and inflammatory-related tests were the best predictors of severity. 5) The pilot study identified several immunological tests (including chemokine ligand 10, chemokine ligand 2, and interleukin 1 receptor antagonist), that performed better than currently used tests. Conclusions: Currently used tests for clinical management of COVID 19 patients are of limited value due to redundancy, as all measure aspects of two major processes: inflammation, and organ damage. There are no independent predictors based on the quality of the nascent adaptive immune response. Understanding the limitations of current tests would improve their interpretation and simplify clinical management protocols. A systematic search for better biomarkers is urgent and feasible. Two years after the onset of the coronavirus disease (COVID-19) pandemic, the clinical, laboratory, and imaging features of patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection have been widely described [1] [2] [3] [4] [5] . The wide clinical spectrum of COVID-19 became obvious during the first wave, and although the effect of inoculum size should be considered [6, 7] , variation has been mainly attributed to host factors, as variants of concern only appeared later [8] . The analysis of the first wave has obvious advantages for the identification of host factors and their biomarkers. Among host factors that affect the severity of illness, age, sex, genetic background, immunological status and prior immunity to coronaviruses [9] have been evaluated. 
Gene mutations of the interferon (IFN) pathway play a clear role in a small proportion of cases [10] , and polymorphisms of several genes associated with immune response have been identified in genome-wide association studies [11, 12] ; however, to date, the genotypes that convey a risk of severe COVID-19 have not been defined in a way that is practically applicable for prediction in clinical practice. Reports originating from the analysis of electronic health records have confirmed the predictive value of clinical laboratory tests usually associated with poor outcomes in other infections (i.e., blood cell counts, acute-phase reactants [APRs] , and coagulation factors)[13-21] but none of the proposed predictive algorithms combining demographic, clinical, and laboratory data have been widely adopted. In small case series, the state of the immune system in COVID-19 patients has been analysed using the latest tools [22] [23] [24] [25] [26] [27] [28] leading to the detection of deep perturbations in the immune system. However, inferences of the effect of these perturbations in the efficiency of the immune response and their clinical consequences are not simple and, to date, no new predictive tests have been validated or added to the clinical laboratory toolbox for COVID-19 management. We report a retrospective analysis of data from a cohort of 1,579 consecutive patients treated at the Vall d'Hebron University Hospital (HUVH) during the first wave of COVID-19 in Barcelona. We validated the main conclusions by comparison with cohorts from two other academic hospitals that belong to the same healthcare provider (the Catalan Institute of Health [ICS] ) in Catalonia, Spain. We conducted this retrospective observational study assuming that the predictive power of clinical laboratory tests had not been fully exploited, with the main objective of improving their interpretation. 
A secondary objective was to explore the predictive value of a selection of robust immunological tests that might identify an early dysregulated immune response associated with severe COVID-19. Table 1S. The data correspond to patients who were admitted to the emergency division of HUVH between 10 March and 29 April 2020, to HUGTP between 17 March and 12 May 2020, and to HUB between 16 March and 23 September 2020. The number of deceased patients corresponds to the 28-day follow-up period. The HUB and HUGTP cohorts include only hospitalised patients. Comorbidities were classified as 1) cardiovascular disease and/or hypertension, 2) chronic lung disease, 3) diabetes, 4) neurological disease, 5) chronic kidney disease, 6) active non-terminal malignancy, 7) obesity, and 8) chronic liver disease. Each comorbidity was assigned a value of 1, and a global comorbidity index (1 to 8) was generated. The clinical severity category was assigned as the maximal score attained during hospitalisation, using a simplified version of the World Health Organization (WHO) 10-point COVID-19 disease clinical progression score [29] as follows: 1) Mild, no activity limitations or not requiring hospitalisation; 2) Moderate, hospitalised, not requiring high-flow oxygen therapy or ventilation support; 3) Severe, hospitalised, requiring high-flow oxygen therapy or ventilation support; and 4) Deceased, those who died before day 28 of hospitalisation. These categories correspond to the WHO scores 1-3, 4-5, 6-9, and 10, respectively. For some analyses, the mild and moderate categories were combined into a non-severe category, and the severe and deceased categories were combined into a severe category. non-parametric Spearman test. For analysis of follow-up data of the HUVH cohort, locally weighted smoothing (LOESS) was applied to clinical laboratory variables to visualise the relationship between the mean and CI of each variable, time and 28-day outcome, as described by Abers et al. [1].
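The comorbidity index and the simplified severity categories described above amount to a small lookup. The following Python sketch is illustrative only: the function names and comorbidity labels are hypothetical and are not taken from the study's own code, and giving an index of 0 to a patient with none of the listed comorbidities is an assumption.

```python
from typing import Iterable

# The eight comorbidity classes listed in the text (labels are illustrative).
COMORBIDITIES = frozenset({
    "cardiovascular_or_hypertension",
    "chronic_lung_disease",
    "diabetes",
    "neurological_disease",
    "chronic_kidney_disease",
    "active_non_terminal_malignancy",
    "obesity",
    "chronic_liver_disease",
})


def comorbidity_index(present: Iterable[str]) -> int:
    """Count distinct comorbidities, one point each (assumption: none -> 0)."""
    found = set(present)
    unknown = found - COMORBIDITIES
    if unknown:
        raise ValueError(f"unknown comorbidity labels: {sorted(unknown)}")
    return len(found)


def severity_category(max_who_score: int) -> str:
    """Map the maximal WHO progression score (1-10) to the four categories."""
    if 1 <= max_who_score <= 3:
        return "mild"
    if 4 <= max_who_score <= 5:
        return "moderate"
    if 6 <= max_who_score <= 9:
        return "severe"
    if max_who_score == 10:
        return "deceased"
    raise ValueError(f"WHO score out of range: {max_who_score}")


def is_severe(category: str) -> bool:
    """Dichotomous grouping: severe or deceased versus mild or moderate."""
    return category in {"severe", "deceased"}
```

For example, `severity_category(7)` falls in the severe category, and `is_severe` then places it in the combined severe group used for the dichotomous analyses.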
To assess the performance of each clinical laboratory test, the receiver-operating characteristic (ROC) curve and the corresponding area under the curve (AUC) values were calculated, using age as a variable for comparison. In addition, random forest simulation and principal component analysis (PCA) were performed to further compare the influence of the laboratory and clinical variables on the outcomes in each hospital dataset. The presenting symptoms are shown in Table 1. Of note, digestive symptoms were more frequent in survivors (31.1% vs. 20.0%, p <0.001). Cardiovascular disease and/or hypertension, chronic lung disease, diabetes, neurological disease, chronic kidney disease, and active non-terminal malignancy were associated with decreased 28-day survival, whereas chronic liver disease and obesity were not. The comorbidity index was significantly higher in deceased patients and patients with severe disease than in survivors and patients with non-severe disease. Each comorbidity added 10% mortality risk up to an index of 4 (Table 3S). The distribution of disease severity was as follows: 71 (4.5%), 969 (61.4%), 284 (17.9%), and 255 (16.1%) patients in the mild, moderate, severe, and deceased categories, respectively. The age of patients increased with increasing disease severity category, except between the moderate and severe disease groups (Fig 2B and Table 4S). The length of stay (LOS) increased with disease severity for the three initial disease severity categories but was shorter among the deceased because 24.9% of the deceased patients died during the initial 4 days of hospitalisation (Fig 2C). The median disease duration was 18 days (IQR: 10-18 days) and was progressively longer with increasing disease severity (Table 5S). Age had a strong effect on mortality: patients in the age groups 50-59, 60-69, 70-79, 80-89 and >90 years had 28-day case fatality rates of 1.82%, 10.9%, 26.4%, 49.7% and 60.6%, respectively.
In the dichotomous disease severity grouping, there were 1,040 and 539 patients in the non-severe and severe categories, respectively. Deceased patients accounted for 47.3% (255 of 539) of the severe category. The disease severity was significantly associated with age, days from symptom onset (DFSO), LOS, disease duration, and comorbidities other than chronic liver disease. Disease severity was greater in males than in females, but after adjusting for multiple comparisons the statistical significance was moderate compared with the other statistically significant associations (exact p=0.001, after Bonferroni's correction p=0.03) (Table 1). The exploratory statistical analysis of the HUVH cohort revealed that, despite the strong association of 22 of the 30 variables with 28-day outcomes (Fig 3 and Tables 2 and 5S), the predictive power of the combined variables was limited and appeared to rely disproportionately on age, a non-laboratory variable (Table 3). Further analyses described in the supplementary section "Sequence of statistical biomarker analyses" were undertaken to determine the reason for this limitation, as summarised below.
The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This version posted February 2, 2022. All rights reserved. No reuse allowed without permission.
Univariate comparisons of a selection of clinical laboratory-derived variables at admission for the 28-day survival/deceased and the non-severe/severe outcomes in the Vall d'Hebron University Hospital cohort. N, number of cases plotted; NLR, neutrophil-to-lymphocyte ratio; CRP, C-reactive protein; AST, aspartate aminotransferase; ALT, alanine aminotransferase; GFR, glomerular filtration rate. * p <0.05; ** p <0.01; *** p <0.001; **** p <0.0001. When non-significant, the numerical p-values are given. The exact p-values are given in Table 2.
The distributions of age and GFR are markedly different in the severity and survival analyses. Classification tables using different sets of variables show that, despite good ROC curves and an overall high proportion of correctly classified cases, their power in predicting poor outcomes, either death or severe disease, is under 60%. Prediction is very dependent on age and, among biomarkers, on SpO2/FiO2, as seen by comparing the different tables using 17 and 16 non-clinical variables. It should be noted that laboratory variables, even without SpO2/FiO2, are better for predicting severity than death. When SpO2/FiO2 is excluded (16 variables), the percentage of correctly classified cases drops even though the number of observations doubles. Analyses with the reduced set of eight variables, i.e., age, comorbidities, SpO2/FiO2, NLR, CRP, AST/ALT, fibrinogen and GFR, gave similar results, confirming the redundancy of the variables (data not shown). For more details see tables in xlsx format, "multiple logistic regression by decease" and "multiple logistic regression". The white blood cell differential counts showed marked imbalance due to an approximately 250% reduction in the lymphocyte count and a 20-30% increase in the neutrophil count. At the individual level, the reduction of lymphocytes was disproportionate to the increase in neutrophils.
The APRs had a broad range of variation, e.g., >10,000-fold and 50-fold for IL-6 and CRP, respectively, and in most patients the values were out of the normal range, while the aspartate aminotransferase/alanine aminotransferase (AST/ALT) ratio and kidney function test results were only moderately altered and often remained within the normal range. Multiple correlation (Fig 4), univariate age-adjusted logistic regression (Table 4) and multivariable logistic regression analyses (Table 3), together with examination of the shifts of the variables from the normal range (Table 6S), suggested that these variables could be classified into three broad categories: clinicodemographic (CD), including age, sex and the comorbidity index; inflammation-related biomarkers (IFRB), including blood cell counts, levels of APRs, and coagulation factors; and organ damage-related biomarkers (ODRB), including liver and kidney function tests and SpO2/FiO2. These analyses revealed that the neutrophil-lymphocyte ratio (NLR) and the AST/ALT ratio captured most of the predictive value of lymphocyte and neutrophil variations and of liver function test variations, respectively, and that SpO2/FiO2 conveyed much of the predictive power of the ODRBs (see supplementary text "Sequence of statistical biomarker analysis").
[1] The blue rectangle highlights the negative correlation between neutrophils and the cluster of lymphocytes, monocytes, and eosinophils. [2] The green rectangle highlights the blood cell variables that correlate positively with the acute-phase reactants (APRs) and coagulation factors. [3] The orange rectangle highlights the negative correlation between lymphocytes, monocytes, and eosinophils with APRs and coagulation factors.
[4] The magenta rectangle highlights the correlations between kidney function and the disease severity, comorbidities, SpO2/FiO2, and disease duration. The cells along the diagonal highlight the seven families of variables: demographics/clinical, myeloid cells, lymphocytes/mononuclear cells, APRs, coagulation, liver function tests, and kidney function tests, which show the expected strong correlations among themselves. The thick lines between rows separate the main categories. APR, acute-phase reactants; SpO2/FiO2, oxygen saturation/fraction of inspired oxygen; DFSO, days from symptom onset; LOS, length of stay; NLR, neutrophil-to-lymphocyte ratio; CRP, C-reactive protein; AST, aspartate aminotransferase; ALT, alanine aminotransferase; eGFR, estimated glomerular filtration rate.
Applying this classification to assess variable performance using ROC curve analysis (Table 5) showed that the CD and ODRB variables performed moderately better for predicting survival, while the IFRB variables were better for predicting disease severity and for distinguishing between moderate and severe disease (Fig 5A and Table 3). The strong influence of age was more evident in the analysis of survival curves (using the Youden index as cut-off, Table 5), in which the hazard ratio (HR) for age under or above 60 years was 32, while the next highest was that of GFR, with an HR of 9.3. The predictors of disease severity in descending order were age, GFR, urea, IL-6, D-dimer, and comorbidities (Fig 5B). The predictive power of both the ODRB and IFRB variables was maintained in the logistic regression analysis after adjusting for age (Table 4), although it was reduced in the ROC analysis of values stratified by age interval (Table 7S). However, the random forest simulation further confirmed that age was the single best predictor of outcome, and that the combination of all variables was only partially additive (Table 8S).
The analysis of the 7,586 follow-up observations showed that the association of biomarkers with survival varied during the 28 days of follow-up. The Kaplan-Meier survival curves of most IFRB variables for predicting survival remained separated during the first few days of hospitalisation, with maximum separation around day 5 (Fig 6). Interpretation of the values in patients with longer hospital stays was difficult due to the decreasing sample size and complications arising from medical interventions. The survival curves for the ODRBs GFR and AST/ALT ratio maintained their separation for most of the follow-up period.
[1] Data correspond to 7,586 samples, 6,589 from survivors and 997 from deceased out of 1,079 patients of the HUVH cohort. NLR, neutrophil-to-lymphocyte ratio; CRP, C-reactive protein; AST, aspartate aminotransferase; ALT, alanine aminotransferase; GFR, glomerular filtration rate; IFRB, inflammation-related biomarkers; ODRB, organ damage-related biomarkers; APRs, acute-phase reactants.
At present in HUVH, as in many hospitals, approximately 30 clinical laboratory variables and SpO2/FiO2 are routinely measured in COVID-19 patients as part of the work-up on admission.
Correlation analysis and multivariable logistic regression showed that these variables had a high level of multicollinearity (Table 3), which was confirmed by random forest simulation and PCA (Fig 4 and 4S, Table 8S). Using repeated analysis and progressively excluding variables, a reduced set of eight variables (age, comorbidity index, SpO2/FiO2, NLR, CRP, AST/ALT ratio, fibrinogen, and GFR) was found to capture the prediction power of all variables (see supplementary material, "Sequence of statistical biomarkers analyses: complexity reduction" and tables "Repeated multivariable logistic regression deceased" and "Repeated multivariable logistic binary severity" among the supplementary Excel tables). As age and comorbidities are non-time-varying, only six of the eight variables are required for clinical management. These results do not imply, however, that IL-6, ferritin, lactate dehydrogenase, triglycerides, procalcitonin, D-dimer, and coagulation tests do not provide valuable information in clinical practice, depending on the context. The comparison among the three cohorts confirmed the prognostic power of the main IFRB and ODRB variables, even though the statistical ranking of their positions varied between cohorts (Table 6, and Figs 7 and 8). In addition, biomarker performance as predictors of outcome was maintained in the three cohorts in the random forest simulations (Table 9S).
Hb (g/dL) 13.5 (12.3-14.5) 12.8 (11.5-13.9) 13.6 (12.5-14.7) <0.0001 0.14 <0.0001
WBC (10^9/L) 6.6 (5.1-8.8) 7.2 (5.3-11.1) 6.9 (5.
Monocytes (%) 6.7 (4.8-8.8) 5.8 (3.6-8.9) 6.8 (4.9-9.1) <0.0001 0.29 <0.0001
Monocytes (
APRs rank above the estimated glomerular filtration rate (eGFR) in the HUVH cohort and have a similar ranking in the three cohorts.
The horizontal whiskers represent the 95% confidence intervals; values in red indicate a positive predictive effect and values in blue a negative predictive effect, with 28-day survival/deceased as outcome. These graphs are for comparing the OR rankings among the different hospital cohorts, and not for comparing the weight of the variables within a cohort, as the ORs are derived from variables that use different units and ranges of variation. APR, acute-phase reactants; DFSO, days from symptom onset; HUVH, Vall d'Hebron University Hospital; LOS, length of stay; NLR, neutrophil-to-lymphocyte ratio; CRP, C-reactive protein; AST, aspartate aminotransferase; ALT, alanine aminotransferase; eGFR, estimated glomerular filtration rate; IL, interleukin; LDH, lactate dehydrogenase; Hb, haemoglobin. At the bottom are the AUC values for some variables available only from the HUVH cohort and for the three cytokines that performed best in the group of 74 patients who were analysed in the HUVH cohort. The numbers within the cells are the AUC values. APR, acute-phase reactants; Sa/Fi, oxygen saturation/fraction of inspired oxygen; DFSO, days from symptom onset; HUVH, Vall d'Hebron University Hospital; LOS, length of stay; NLR, neutrophil-to-lymphocyte ratio; CRP, C-reactive protein; AST, aspartate aminotransferase; ALT, alanine aminotransferase; eGFR, estimated glomerular filtration rate; IL, interleukin; LDH, lactate dehydrogenase; Hb, haemoglobin; ROC, receiver-operating characteristic; AUC, area under the curve. Despite the limited size of the group analysed in the pilot study (n=74, Table 10S), CXCL10 had the highest AUC (0.83) of all variables, including age, IFRB and ODRB. IL1RA and CCL2 also showed promise as biomarkers (Table 7 and Fig 9). Table 7.
Performance of expanded immunological parameters in the special immunological studies group as assessed by ROC curve analysis and compared with other variables in the same group.
The immune phenotype was analysed in 41 patients (Table 10S). There was a steep reduction in the size of all T-cell subsets, which was more marked for CD8 effector and memory cells, and an increase in activation markers that was similar to the pattern observed in other time-series analyses [23, 36], revealing a deep disturbance of the immune response in severely ill patients (see Expanded phenotype analysis in the supplementary material). Naïve T cells were found to be associated with higher mortality (Figs 10 and 11S). The analyses revealed the limitations of the clinical laboratory tests currently used to assess the prognosis of patients with COVID-19 and sought to improve their interpretation by grouping them into categories that reflect the two main biological processes that are measured, i.e., inflammation and organ damage. As their limitations are due to redundancy, clinical management protocols could be simplified, but additional biomarkers with independent predictive power are urgently needed. This study highlights the lack of tests for early prediction of the specific immune response to SARS-CoV-2. Such tests could provide critical non-redundant information required for prediction and clinical management.
The results of the pilot study, using a selection of robust immunological techniques derived from other areas of clinical immunology, suggest that better tests can be identified through systematic investigation. In addition to this central message, other notable findings are: 1) The three cohorts confirmed the strong association of SpO2/FiO2, neutrophilia, lymphopenia, APRs, coagulation factors, kidney function and the AST/ALT ratio with survival and disease severity. 2) There was a high level of collinearity (redundancy) among the different variables, which explains the limited predictive ability of current tests. 3) After reducing redundancy, the best combination of variables was age, comorbidity index, SpO2/FiO2, NLR, CRP, AST/ALT ratio, fibrinogen, and GFR. 4) The classification of biomarkers into IFRBs and ODRBs helped with their interpretation and revealed that ODRBs are better predictors of survival than of severity, and that IFRBs are better predictors of severity than of survival. 5) For the clinician at the bedside, changes in laboratory ODRBs may be less conspicuous than changes in IFRBs, but they may deserve more attention. There are several limitations to this study, including its retrospective nature. Another limitation is the absence of information regarding two key factors: the SARS-CoV-2 viral load and markers of the adaptive immune response. The SARS-CoV-2 detection techniques used during this period were not quantitative, and the variability of sampling efficiency reduces their value, even with current improved measurement methods.
Serological markers need 7-21 days to become detectable and are not helpful as a tool to predict the prognosis of patients during the initial medical assessment [37]. Finally, the effect of treatment on mortality, which changed continuously during the first wave, was not analysed in this study. We did not strictly follow the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) recommendations, as generating a prediction model was not an objective, but most requirements were fulfilled [38]. The analyses presented here are intended to improve the interpretation of available biomarkers, and no algorithm is proposed. Most algorithms with good predictive power include parameters, such as oxygen requirements and imaging data, that reflect organ damage in patients who are already on the path to severe disease [13-21,38]. The ideal algorithm/biomarker should be able to identify patients at risk before organ damage occurs. Our results suggest that this is difficult with current tests because inflammation and organ damage biomarkers are strongly correlated at the time the patients reach the emergency department. If, as postulated, the main determinant of severity is a pre-existing latent pro-inflammatory state that leads to a late and inefficient adaptive immune response, biomarkers of this basal inflammation and inefficient response should be identifiable; if the generation of specific cytotoxic T lymphocytes is the main defence mechanism against an acute respiratory infection by a novel virus such as SARS-CoV-2, the early monitoring of these cells would help to predict the patient outcome [1,23,27,28,39-41]. These are the two obvious approaches to generate better biomarkers and the corresponding tests. Reliable early biomarkers would reduce the rate of hospitalization and, as the new treatments that are becoming available require early administration, the generation of such biomarkers is urgent and should be feasible.
The supplementary material contains detailed information on the statistical analysis, and de-identified data tables will be shared on request after approval of a proposal, with a signed data access agreement.
Sequence of statistical biomarker analyses, text.
Table 1S. Patients excluded from the HUVH cohort.
Table 2S. Monoclonal antibodies used in the flow cytometric phenotypic analysis.
Table 3S. Distribution of comorbidities among the severity categories and demographics and hospitalization data in the HUVH cohort.
Table 4S. Demographics and hospitalization data by severity categories of the HUVH cohort.
Table 5S. Pairwise comparison of variables for maximal severity in four categories.
Table 6S. Median values of laboratory variables and the proportion out of the normal range and relation with mortality.
The statistical analysis sequence and main conclusions are summarised in figure 1S. The main stages were exploratory, in-depth and complementary analyses. Exploratory analysis. Twenty-two of the 29 available variables were found to be strongly associated with the 28-day outcome by pairwise comparison (table 2 and figure 3). Despite these strong associations, multiple logistic regression, random forest model and principal component analysis (PCA) showed limited prediction power, as only around half of the deceased/severe cases were correctly classified in the multiple logistic regression and random forest classification tables, and separation was poor in PCA (tables 3 and 8S and figure 3S). To explain this limitation, multiple mutual correlation analysis was carried out (figure 4) and the conclusions were confirmed by repeated logistic regression; see the sections below, "Complexity reduction and the weight of age" and "Repeated multiple Logistic regression…".
In the global correlogram, lymphocytes, monocytes and eosinophils constitute a cluster of variables that correlated positively with each other (r=0·42 to r=0·30, see figure 4), while neutrophils, basophils and platelets form another cluster (r=0·4 to r=0·38); each cluster kept a negative correlation with the other (lymphocytes % with neutrophils %, r=-0·95). These reciprocal changes in the neutrophil and lymphocyte clusters are also seen across the severity categories. Acute phase reactants (APRs) correlated among themselves (r=0·71 and r=0·46 for IL-6 with CRP and with ferritin, respectively), with coagulation factors (r=0·67, r=0·42 and r=0·35 for IL-6 with fibrinogen, D-dimer and prothrombin time (INR), respectively) and with neutrophils % (IL-6 r=0·6, CRP r=0·6). This may reflect the central position of CRP in the network of interactions typically occurring in infectious diseases and systemic inflammation, in which IL-1, IL-6 and TNF-alpha all act synergistically on the liver, increasing the production of acute phase proteins and coagulation factors. 1 Age correlated strongly with kidney function tests (r=0·6, r=0·32 and r=-0·6 for urea, creatinine and GFR, respectively), moderately with the AST/ALT ratio (r=0·40) and weakly with SpO2/FiO2 (r=0·32). All the above correlations were significant (see the supplementary tables in xlsx format with the r and p values of the correlogram). The relative weight of age in different age intervals was investigated by comparing the correlograms of patients over and under 65 years, which showed that the correlations are maintained (figure 4S, <65 years vs >65 years correlation heatmaps). These networks of correlations explain the compound effect of age on the clinical laboratory variables and indicate that the limited prediction power of biomarker combinations is due to redundancy.
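To make the link between correlation and redundancy concrete: for a model with only two predictors, the variance inflation factor (VIF) of each is 1/(1 - r²), where r is their mutual correlation, so a correlation of -0·95 such as that reported for lymphocytes % and neutrophils % corresponds to a VIF of about 10. The Python sketch below illustrates this two-variable case only. It uses Pearson's r for simplicity, which is an assumption and not necessarily the coefficient used in the study, and the function names are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / sqrt(sxx * syy)


def vif_from_r(r: float) -> float:
    """VIF of either predictor in a two-predictor model: 1 / (1 - r^2)."""
    if abs(r) >= 1.0:
        return float("inf")  # perfectly collinear predictors
    return 1.0 / (1.0 - r * r)


def pairwise_vif(x: Sequence[float], y: Sequence[float]) -> float:
    """Convenience wrapper: VIF implied by the correlation of x and y."""
    return vif_from_r(pearson_r(x, y))
```

With more than two predictors, the VIF of a variable is instead derived from the R² of regressing it on all the others, which requires a regression routine and is not shown here.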
This conclusion was supported by the VIF indexes of many variables in the multiple logistic regression analysis; see the section below, "Repeated multiple logistical regression to confirm redundancy and identification minimal set of variables". The above analysis and the understanding of the biological interrelations of the variables led us to reduce the complexity of the analysis by combining statistically correlated variables into pathophysiologically related families: 1) Blood, including haemoglobin (Hb), white blood cell count (WBC) and differential counts in % and number, neutrophil-to-lymphocyte ratio (NLR) and platelets; 2) Acute phase reactants, including C-reactive protein (CRP), IL-6 and ferritin; 3) Coagulation, including D-dimer, fibrinogen and prothrombin time (INR); 4) Liver tests, including direct and total bilirubin, AST, ALT and AST/ALT ratio; and 5) Kidney function tests, including urea, creatinine and glomerular filtration rate (GFR). As these groups of variables were found to behave similarly as predictors of survival and of severity, and they are known to participate in common pathophysiological networks, we further combined them into clinical demographic (CD) variables, inflammation-related biomarkers (IFRB), which include the blood, APR and coagulation families, and organ damage-related biomarkers (ODRB), which include liver and kidney tests plus SpO2/FiO2 as a biomarker of lung damage. In the pairwise comparisons for survival/decease, the lowest p values among CDs were for age and comorbidities, i.e., exact p=7·26 × 10^-81 and 2·3 × 10^-38, respectively (table 2).
Among ODRBs the lowest p values were for GFR (2·37×10⁻¹⁰¹) and AST/ALT ratio (6·06×10⁻³¹), while among IFRBs the lowest p values were for IL-6 (10⁻⁵⁵), CRP (5·99×10⁻⁴³) and NLR (0·34×10⁻⁴¹). The analyses of variables in patients split into four maximal severity categories (Kruskal-Wallis test) showed that in severe vs deceased, ODRBs (GFR and AST/ALT ratio) kept a high differential association (p values from 10⁻¹¹² to 10⁻¹¹), while the association with IFRBs, APRs and WBC differential counts was weaker, i.e., only Hb, platelets and coagulation factors were significantly associated with outcome. Age and comorbidities were differentially associated with severity categories as outcome, with p values of 2·8×10⁻⁵³ and 10⁻¹⁰ respectively, but in the comparison of moderate vs severe the association with age was not significant, and the significance of the associations was low for ODRBs while it was high for IFRBs (table 4S and figure 2S). The univariate logistic regression analysis of the 19 main biomarkers adjusted by age confirmed that the association of 16 of 19 variables, including ODRBs and IFRBs, with survival and severity respectively is only partially linked to age. The corresponding age-corrected Z values followed the ranking IL-6 > CRP > SpO2/FiO2 > neutrophils % > NLR > monocytes % > neutrophils n > GFR > lymphocytes % for survival/decease, and IL-6 > CRP > neutrophils % SpO2/FiO2 >> NLR > lymphocytes % > monocytes % for severity, where GFR is displaced to the 11th position in the ranking (table 4). Kaplan-Meier survival curve analysis, using as cutoffs the Youden indexes from the laboratory test performance ROC curves (see next section, Performance of variables by ROC curves…), was applied to assess the relation of each biomarker with survival within the 28d period (figures 5B and 5S).
Age had the highest hazard ratio (32·8), followed by GFR (9·3), urea (6·3), IL-6 (5·9), D-dimer (4·7), comorbidities (4·7), AST/ALT (4·3), CRP (4·3), SpO2/FiO2 (2·8) and differential WBC % (2·8-2·6), while platelets, ferritin and sex gave low or non-significant hazard ratios. ROC curves were generated to assess biomarker predictive power in the clinical context. For survival/decease as outcome, GFR, IL-6, AST/ALT, and SpO2/FiO2 showed the best curves, with AUCs (CI) of 0·80 (0·77-0·83), 0·77 (0·73-0·81), 0·73 (0·69-0·77) and 0·73 (0·70-0·78) respectively, followed by the other APRs and blood variables. Age, treated as a variable for comparison, gave an AUC of 0·87 (0·85-0·89), better than any of the other variables; comorbidities gave an AUC of 0·75 (0·72-0·78) (table 5 and figure 5A). For non-severe vs severe as outcome, the larger AUCs corresponded to IL-6, 0·78 (0·75-0·80), followed by SpO2/FiO2, 0·77 (0·74-0·81), CRP, 0·75 (0·71-0·77), NLR, 0·71 (0·68-0·73), GFR, 0·69 (0·65-0·71) and age, 0·67 (0·64-0·70). The comparison of the ROC curves to predict severity with those to predict survival shows that ODRBs are better predictors of survival/decease and IFRBs of severity/non-severity, with the exception of SpO2/FiO2, which is a very good predictor of both outcomes (table 5). To reduce the effect of age and better assess the effect of the other variables, patients were stratified by age intervals (40-55, 55-65, 66-75, 76-85 and >85 years old). The variable giving the larger AUC across all age groups was SpO2/FiO2 (0·72 to 0·79), except in patients over 85 years old, in whom IL-6 had the larger AUC (0·79). In this stratified analysis, tests measuring IFRB variables were found to be better tests than ODRBs, probably because ODRBs have higher collinearity with age, as suggested by the reduction of their predictive power when stratified by age (table 7S).
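The two quantities used above, the AUC of a ROC curve and the Youden-index cutoff, can be sketched as follows. This is a minimal illustration rather than the software used in the study: the function names are hypothetical, the values are toy data, and the code assumes that higher values indicate higher risk (for a variable such as GFR, where lower values are worse, the values would be negated first).

```python
def auc(scores, labels):
    """AUC as the probability that a random positive case has a higher
    score than a random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def roc_points(scores, labels):
    """(threshold, sensitivity, specificity) for each distinct score,
    calling a case positive when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points

def youden_cutoff(scores, labels):
    """Threshold maximising J = sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)

# Hypothetical toy values (not study data); label 1 = deceased.
values = [5, 12, 20, 35, 40, 60, 80, 150]
outcome = [0, 0, 0, 0, 1, 0, 1, 1]

area = auc(values, outcome)
cutoff, sens, spec = youden_cutoff(values, outcome)
print(f"AUC = {area:.3f}; Youden cutoff = {cutoff} "
      f"(sensitivity {sens:.2f}, specificity {spec:.2f})")
```

The pairwise-comparison form of the AUC is used here because it makes the interpretation explicit: an AUC of 0·80 means that in 80% of deceased/survivor pairs the deceased patient has the more abnormal value.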
Repeated multiple logistic regression analyses of variables for the outcomes decease and severity were carried out to confirm their redundancy, calculating VIF scores of collinearity to identify the combinations giving the best prediction scores. The exploratory analysis had already indicated some redundancies, and the number of variables was reduced to 19, which included age, sex, comorbidity index, Hb, neutrophil, lymphocyte, monocyte, and eosinophil % and number (n), NLR, platelets, CRP, IL-6, D-dimer, ferritin, fibrinogen, prothrombin time INR, SpO2/FiO2, AST/ALT ratio and GFR. Note that when SpO2/FiO2 was included, the number of observations was reduced to 411, as SpO2/FiO2 was available in only 52% of these patients. The scores generated for each variable were odds ratios (OR), Z scores, p values, VIF, area under the ROC curve (AUC, CI) at 50% cut-off, Positive Predictive Value (PPV, predicting either decease or severity), Negative Predictive Value (NPV, predicting survival or non-severity) and the % of correctly classified patients for the outcome decease or severity (logistic regression tables in xlsx format). For survival/decease prediction with 19 variables, scores were AUC 0·95, PPV 75·7, NPV 94·4 and 54·4% of deceased patients correctly classified (n=411); if age, comorbidities, and sex were excluded, values were AUC 0·84, PPV 76·6, NPV 80·6 and 58·2% correctly classified (n=411). When SpO2/FiO2 was excluded from the biomarkers, the prediction scores were AUC 0·86, PPV 68·9, NPV 92·9 and correct classification of decease cases was reduced to 22·7%; however, these variables correctly classified 99% of the patients who survived, as deduced by the PPV and NPV (n=994). Age, comorbidity, and sex by themselves would give an AUC of 0·89, PPV 56·7, NPV 88·5 and 36·5% of deceased correctly classified (n=1,579), with sex not modifying the scores.
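To make the role of the VIF concrete, the sketch below computes variance inflation factors for a small set of predictors. It relies on the standard identity that the VIF of predictor j, 1/(1 - R²_j), equals the j-th diagonal element of the inverse of the predictors' correlation matrix; the VIF depends only on the predictors, not on the outcome. The helper names and the toy values are hypothetical, and the code assumes that no predictor is an exact linear combination of the others.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) *
                           sum((b - my) ** 2 for b in y))

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (non-singular input assumed)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson_r(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}

# Hypothetical toy values (not study data).
toy = {
    "neutrophils_pct": [55.0, 62.0, 70.0, 78.0, 85.0, 90.0],
    "lymphocytes_pct": [35.0, 28.0, 21.0, 14.0, 9.0, 5.0],
    "age": [70.0, 45.0, 82.0, 51.0, 66.0, 59.0],
}
scores = vif(toy)
for name, value in scores.items():
    print(f"{name}: VIF = {value:.1f}")
```

Predictors that carry nearly the same information, such as the neutrophil and lymphocyte percentages in the toy data, receive very large VIFs; progressively dropping such variables is the kind of reduction described in this section.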
Therefore, although the predictions of demographics and biomarkers, if additive, would give 90·7% of correctly classified decease cases, only 58·2% of cases were actually correctly classified. A reduced set of variables to predict survival/decease was generated by repeated analyses, progressively excluding redundant variables and thus reducing the VIF scores of collinearity; the reduced set of eight variables (age, comorbidity index, SpO2/FiO2, NLR, CRP, fibrinogen, AST/ALT ratio and GFR) gave AUC 0·94, PPV 83·3, NPV 94·1, and 65·8% of deceased patients were correctly classified (n=502). The same analysis with 19 variables applied to the non-severity/severity outcome gave AUC 0·85, PPV 74·7%, NPV 79·2 and 54·6% of severe cases were correctly classified (n=411). If age and comorbidities were excluded, scores were AUC 0·84, PPV 76·6%, NPV 80·6, and 58·2% of severe cases correctly classified. When SpO2/FiO2 was excluded, the prediction scores were AUC 0·81, PPV 65·3, NPV 81·0 and correct classification of severe cases was reduced to 36·2% (n=994). The reduced set of eight variables gave an AUC of 0·86, PPV 81·5, NPV 80·3 and correctly classified 64·1% of the severe cases (n=502). The predictive value of laboratory variables was expected to evolve during the disease course, but because of the retrospective nature of this study, data at regular intervals for every patient were not available. However, as we had data from 7,586 additional follow-up samples corresponding to 1,079 of the 1,579 patients, we plotted them to generate an approximation of the evolution of the variables along the 28-day follow-up period. The curves representing mean ± CI values of each variable for survivors and deceased maintained a clear separation during the initial 10 days and only overlapped at the end of the 28d period.
Of note, NLR curves were more clearly separated between days 5 and 20, while IL-6 values overlapped after day 4-5; CRP, lymphocyte, and neutrophil % curves remained separated over most of the period; the eosinophil curve showed a remarkable increase in survivors, but only after day 10. GFR curves were clearly separated from the beginning, but the mean values differed little from the normal range. From this analysis it cannot be deduced whether IFRBs precede ODRBs or vice versa (figure 6, LOWESS).

The patient populations in the three hospitals were not perfectly matched for demographics or mortality (table 6). The mortality rate was higher in HUB (25·7% vs. 16·1% and 12·3% in HUVH and HUGTP, respectively). This may be explained by the higher age of the HUB patients, 65 (53-74), as compared with 62 (50-75) in HUVH and 62 (52-71) in HUGTP (table 6). There are also differences in the median laboratory variables that indicate that this cohort includes more critically ill patients. These differences, however, made these almost contemporary cohorts adequate for comparing the relative predictive power of clinical laboratory tests with that of demographics and clinical variables in a real-life situation. The pairwise univariate analysis of the association of laboratory variables with 28d outcome in the HUB and HUGTP data sets was conducted as for the HUVH data set. The significant association was confirmed for seven blood variables, three APRs (CRP, IL-6, and ferritin), and D-dimer (table 6 and figure 6S). The only index of renal function that was available in the three hospitals, urea, was clearly validated, and this supports the findings from the HUVH data of creatinine and GFR as having strong prognostic value after adjustment for age. Multiple logistic regression analysis adjusted for age showed that the ranking of the odds ratios of clinical laboratory variables associated with mortality was similar in the three centres.
The APRs, IL-6, CRP, ferritin, and D-dimer occupy the top positions of the ranking, followed by neutrophils and, at the other end, lymphocytes, monocytes, and eosinophils (figure 7). The PCA of 17 clinical laboratory variables showed an almost complete overlap, which supports an identical basic physiopathology of the disease regardless of the differences in the patient populations (figure 7S). The Random Forest model was applied to the common variables of the three cohorts, and the reduction of the mean Gini index was calculated in the exploratory HUVH cohort and in the combination of the three cohorts. Table 9S and figure 8S show the similar rankings of the variables in the different analyses. The clinical laboratory test performance, as assessed by the AUC from ROC curves from the three hospitals, when subjected to unbiased clustering, showed the central role of APRs and kidney function; the AST/ALT ratio and GFR, available only from the HUVH cohort, showed the larger AUCs, together with the cytokines from the immunological studies (see below, section on expanded immunological biomarkers) (figure 8 and table 7).

Cytokine profile

Cytokines were measured in 74 patients, who were representative of the HUVH cohort's moderate and severe disease categories (age 53 [44-64] years) (see table 10S, immunological studies patients). The ELLA platform to measure cytokines has been in use in the HUVH immunology laboratory for six years to monitor cytokines in transplantation and sepsis projects and has proved very robust. Samples were collected on days 0 and +2. The levels of the cytokines IL-6, TNF-α, IL-10, IL-2, CXCL10, and CCL2, as well as the receptor antagonist of IL-1 (IL-1RA), reached significantly higher values in the severe patients and tended to increase over time. The remaining cytokines and sCD163 did not show significant differences in relation to time and severity categories.
On day 0, the IFN-α decreased, but increased in two mild cases; this may be interpreted as indicating the end of an initial peak in most patients. The IL-12p70 level seemed to follow a similar pattern, but the data did not show significant differences (figure 9A & B). The comparison of correlograms of cytokines at days 0 and 2 did not show significant changes in the mutual correlations, probably because the interval was too short. However, greater differences were seen between the moderate and severe categories of the four patient categories, with a loss of the correlation with IFN-α, probably attributable to its quick drop in severely ill patients. We also observed a moderate increase in the correlations around IL-17 (figure 9S). The predictive power of cytokines and sCD163 was compared for both survival and severity as outcomes. For 7 of the 20 cytokines, the survival predictive power as measured by the AUC of the ROC curves at day 0 was significant and followed the order CXCL10 > IL-1RA > IL-6 > CCL2 > IL-10 > IL-15 > IL-7 > TNF-α > IFN-α. On day +2, the predictive power followed the order IL-6, CCL2, IL-15, TNF-α, and IL-7, and was significant for eight of them. The prediction power for non-severity vs severity categories followed the order CXCL10 > IL-6 > IL-1RA > IL-15 at day 0 and IL-6, CCL2, IL-2 at day 2 of the 10 largest AUCs were statistically significant. The ROC curves for the main clinical laboratory biomarkers were also calculated for this group and, in the comparison, showed smaller AUCs than these cytokines (table 5 and figure 10S).

Of the 41 patients, 5, 27, 6, and 3 were in the mild, moderate, severe, and deceased severity categories, respectively. The median age was lower than that of the HUVH cohort, although the M/F proportion, DFSO, and LOS were similar (table 10S, immunology cohort). As adapted for this project, the HIPC protocol generated 163 variables corresponding to 42 lymphocyte, monocyte, and neutrophil subsets.
In pairwise comparisons for decease and severity outcomes, 12 lymphocyte subsets were significantly associated with 28-day decease, ranked as follows: naïve T lymphocytes (n) > Th2 (n) > total T lymphocytes > CD8 T lymphocytes (n) > CD8 TCM (n) > CD8 TEM (n) > CD4 T lymphocytes (n) > CD4EM (n) > non-classical monocytes. Remarkably, for this subgroup, the significance was similar to that of CRP, IL-6, and D-dimer. Interestingly, the CD4 TEM and CD8 TEM populations showed a continuous decrease with disease severity, whereas the naïve populations showed a surge in the most severe cases, probably due to the mobilization of naïve cells that did not contribute to an improved outcome (figure 10). In the correlogram between laboratory biomarkers and cell phenotypes there is a negative correlation between APRs and T-cell populations, except for Th1- and Th17-activated T cells. There is, therefore, an inverse correlation between inflammation and the T-cell subsets that drive the adaptive immune response; there is also a strong negative correlation of age with CD8-naïve T cells, which are critical for the immune response to the virus. These changes reveal the deep dysregulation of lymphocyte biology in COVID-19 patients (figure 11S). The positive correlations between APRs and B cells could be interpreted as resulting from an accelerated recruitment of naïve B cells to the switching stage; furthermore, there is a steeper reduction in non-classical monocytes and a reduction in their DR expression. This fits well with the high mortality rate seen among patients with <2% monocytes and with the lack of changes in sCD163 (see the section titled Cytokines), which is a marker of M2 macrophages. The ROC curve of CD3+CD62L (n) naïve T cells had an AUC of 0·74, with a sensitivity and specificity of 66·7% and 72%, respectively, for severity (table 5).
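The test-performance figures quoted in these sections (sensitivity, specificity, PPV and NPV) all derive from the same two-by-two table of predicted versus observed outcomes. The sketch below shows the standard definitions; the function name and the toy labels are hypothetical, and the code assumes that each predicted and each observed class occurs at least once.

```python
def diagnostic_metrics(predicted, actual):
    """Sensitivity, specificity, PPV and NPV from binary predictions.

    Both arguments are sequences of 0/1 values, where 1 marks the
    outcome of interest (for example, severe disease).
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly ruled out
        "ppv": tp / (tp + fp),          # share of positive calls that are right
        "npv": tn / (tn + fn),          # share of negative calls that are right
    }

# Hypothetical toy labels (not study data).
predicted = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
actual = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

metrics = diagnostic_metrics(predicted, actual)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity corresponds to the "% of cases correctly classified" reported for decease or severity, which is why a model can combine a high NPV with a low share of correctly classified deceased patients when the outcome is infrequent.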
Patients who were excluded from the HUVH cohort and their pathology. ECMO, ExtraCorporeal Membrane Oxygenation; NA, Not Applicable.

Clinical laboratory variables and the 28-day decease outcome.

Mortality among patients who had abnormal and extreme deviations in the laboratory values. The WBC count was above the upper limit of normal (ULN) in 22.2% of the patients, and the mortality rate of these patients was significantly higher (25.0%) than that of the entire study cohort (16.1%). In most patients, the platelet count was within the normal range; however, when values fell below the lower limit of normal (LLN), the mortality rate was 26.7%, which was significantly higher than that of the HUVH cohort. The CRP values were outside the normal range in all but one patient; the mortality rate was 30.3% among patients with values in the top 10% (>20 mg/dL), which was significantly higher than that of the HUVH cohort. The IL-6 values were above the ULN in 96.8% of cases; the mortality rate was 37.7% in the participants with the top 10% values (>140 ng/dL). LDH and triglycerides were excluded from other analyses because of the excessive and unbalanced number of missing data. Comparisons by Fisher exact test. APR, acute-phase reactants; AST, aspartate aminotransferase; ALT, alanine aminotransferase; CRP, C-reactive protein; DFSO, days from symptom onset; GFR, glomerular filtration rate; IL-6, Interleukin-6; Hb, hemoglobin; LDH, lactate dehydrogenase; LOS, length of stay; NLR, neutrophil-to-lymphocyte ratio; OR, Odds Ratio; ONR, out of the normal range.

Figure 5S. Representative Kaplan-Meier survival curves of clinicodemographic biomarkers (ODRB) and inflammation related biomarkers (IFRB). Cut-off points were determined by the Youden index in ROC curves of clinical laboratory test performance. The message from this figure is that ODRBs are more closely associated with mortality, even if their variations were not wide and they may be easily missed during clinical management.

Figure 6S.
Heatmap summarizing the pairwise comparison of survivors vs deceased in the three hospital cohorts; the p value scale refers to non-parametric comparisons (Mann-Whitney and Kruskal-Wallis). In general, p values are not very informative of the weight of each independent variable in the outcome (dependent variable), but when differences are this large they indicate which variables are more closely associated with outcome, to be corroborated by the subsequent analyses. The heatmap makes those differences visible. Log of exact p values.

Figure 7S. PCA representation of the three hospital cohorts, indicating similar main vectors, with local differences. The plot should be compared with figure 3S. The variables to the right of the plot contribute to decease and the variables to the left contribute to survival; the length and the angle of each vector are an indicator of its relative contribution to the outcome. Notice how lymphocyte % and n and neutrophil % and n pull towards opposite outcomes.

Figure 8S. Heatmap summarizing the reduction of the mean Gini index in the Random Forest model, which reflects the importance of each variable in the predictive model. The left column represents the values when the laboratory model and the age plus comorbidities model were run separately, and the right column the values when they were run combined; notice that in both, age is the dominant variable.

Cytokine correlogram.

Figure 10S. Performance of cytokines as clinical laboratory tests in ROC curve analysis. For values see Table 6; direct examination of the curves indicates that these variables may be useful for further studies.
Standardizing immunophenotyping for the Human Immunology Project
Extended immunophenotyping reference values in a healthy pediatric population
Barcelona demography
Covid-19 clinical profile in latin american migrants living in spain: Does the geographical origin matter
Planning for the assistance of critically ill patients in a Pandemic Situation: The experience of Vall d'Hebron University Hospital
T cell responses in patients with COVID-19
Defining the features and duration of antibody responses to SARS-CoV-2 infection associated with disease severity and outcome

Blood cell Count; n, number. For the three cohorts' simulation, the training dataset had 2600 samples and the test dataset had 519 samples and 19 features. Variables with more than 20% missing values were removed; missing values were imputed by the median.

HUVH Cohort: clinical and demographic features of the sub-cohorts from the cytokine and blood cell-phenotypic analyses. Univariate comparison using the Mann-Whitney U test with adjusted p-values. DFSO, days from symptom onset; LOS, length of stay; NLR, neutrophil-to-lymphocyte ratio; CRP, C-reactive protein; AST, aspartate aminotransferase; ALT, alanine aminotransferase; eGFR, estimated glomerular filtration rate; IL-6, Interleukin-6; LDH, lactate dehydrogenase; Hb, hemoglobin.