key: cord-0984937-58k7s2mu
authors: Ishikawa, G.; Argenti, G.; Fadel, C. B.
title: Non-specific blood tests as proxies for COVID-19 hospitalisation: are there plausible associations after excluding noisy predictors?
date: 2021-01-11
journal: Epidemiol Infect
DOI: 10.1017/s0950268821000078
sha: c22f3ad34fab3c015e996bb00df63a2156260ae2
doc_id: 984937
cord_uid: 58k7s2mu

This study applied causal criteria in directed acyclic graphs for handling covariates in associations for prognosis of severe coronavirus disease 2019 (COVID-19) cases. To identify non-specific blood tests and risk factors as predictors of hospitalisation due to COVID-19, one has to exclude noisy predictors by comparing the concordance statistics (area under the curve, AUC) for positive and negative cases of severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2). Predictors with significant AUC at the negative stratum should be either controlled for their confounders or eliminated (when confounders are unavailable). Models were classified according to the difference of AUC between strata. The framework was applied to an open database with 5644 patients from Hospital Israelita Albert Einstein in Brazil with SARS-CoV-2 reverse transcription-polymerase chain reaction (RT-PCR) exam. C-reactive protein (CRP) was a noisy predictor: hospitalisation could have happened due to causes other than COVID-19 even when SARS-CoV-2 RT-PCR is positive and CRP is reactive, as most cases are asymptomatic to mild. Candidates of characteristic response from moderate-to-severe inflammation of COVID-19 were: combinations of eosinophils, monocytes and neutrophils, with age as risk factor; and creatinine, as risk factor, which sharpens the odds ratio of the model with monocytes, neutrophils and age.

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2), stands out for its high rate of hospitalisation and long stays in hospital and in intensive care units (ICUs).
COVID-19 disease severity can be mild, moderate, severe and critical [1] . While 81% of those infected with COVID-19 have mild or moderate symptoms, World Health Organization estimates that 14% of those infected with COVID-19 are severe and require hospitalisation and oxygen support, and 5% are critical and admitted to ICUs [1] . Reported median hospital length of stay (LoS) was from 4 to 21 days (outside China) and ICU LoS was from 4 to 19 days [2] . The severity of COVID-19 states is associated with many risk factors. Early reports suggest advanced age, morbidities, multi-morbidities and immunosuppression [3, 4] . The enlarging list includes cardiac, chronic lung, cerebrovascular, chronic kidney and liver diseases, cancer, diabetes, obesity, hypertension, dyspnoea, fatigue and anorexia [1, 5, 6] . Early identification of severe cases allows for optimising emergency care support [1] and improving patient outcomes [7] . However, patients who do not yet meet supportive care criteria may fail to receive the necessary care, when there is rapid deterioration or inability to promptly go to a hospital. In the transition from moderate-to-severe cases there can be avoidable delays in life support interventions with non-optimised treatments. Together with high hospitalisation rates [1] and lengthy stay [2] , the superposition of COVID-19 waves and sustained transmission [8] are causing prolonged depletions of health care resources in many countries. Prognosis tools may play a role in planning and in improving the access to supportive treatments by allowing timely allocation of scarce resources to better cope with COVID-19. Indeed, there is widespread interest in predictive models of COVID-19 outcomes [7, 9] , but a review of 50 prognostic models concluded that they are at high risk of bias [9] . 
As they focus on statistical findings, our concern is with lack of minimum causal criteria to identify associations that are effectively related to In this context, a path to optimised supportive treatments is more reliable assessments of the transition from moderate-to-severe cases of COVID-19 inflammation. We choose nonspecific blood tests as they are widely available, and hospitalisation decision as a proxy to characterise the transition from moderate-to-severe cases (when not constrained by inpatients availability). After formalising an analytical framework with causal reasoning, the goal is to identify candidate sets of blood tests associated with hospitalisation (with risk factors), excluding noisy predictors that are not related to COVID-19 inflammation. Whereas causal effects are clearly predictive, prediction studies usually refer to non-causal analysis that uses observational data to make predictions beyond the observed ones and confounding bias is generally considered a non-issue [10] . However when one needs more reliable predictions, confounding bias and causality should be accounted for in associations. This study applies analytical tools from the causal effect estimation of directed acyclic graph (DAG) theory [11] to investigate associations considering covariates. The strength of the association depends on the specificity and sensitivity of the inflammation pattern, as a kind of distinctive signature of the disease. A low association can also occur and means that the pattern with that set of variables allows weak inferences. If a substantial association due to COVID-19 is identified and it is also stable and representative of the target population, then these blood tests may be useful as proxies in surveillance protocols and screening interventions. The theory of DAG provides graphical notation and a nonparametric probabilistic terminology to describe and evaluate causal relationships [11] . 
The use of DAGs in epidemiology is emergent [12] and it is especially helpful with multiple potential confounders [12, 13] that may introduce systematic bias [10, 14] . In DAGs, confounding associations between two variables may come from unblocked backdoor paths [13] that can be graphically identified because they share parent nodes. With a formal definition of backdoor path, for instance, DAG provides a general explanation of the Simpson's paradox [15] , where a phenomenon appears to reverse the sign of the estimated association in disaggregated subsets in comparison to the whole population. As a framework, DAG supplies analytical tools to evaluate which adjustment is mandatory (to predict a non-causal sign reverse) and which covariate should be omitted (to estimate the causal effect), thereby enforcing the elicitation of qualitative causal assumptions [11, 12, 14] . A hypothetical DAG model with latent variable was conceived to evaluate the influence of various types of covariates on the focal association. Initially, we drew the main causal path from exposure to outcome. The DAG in Figure 1 starts from the infection by SARS-CoV-2 (exposure E) that, in some cases, leads to 'Moderate-to-severe inflammation due to COVID-19' (MSIC, hypothetical latent variable (E→MSIC)), and that inflammation causes two outcomes (mutual dependent relationship (H←MSIC→B)): (H) hospitalisation decision; and (B = {B 1 ,…, B k }) blood tests measured at hospital admission. The blood tests are selected according to their strength with hospitalisation. The focal outcomes under investigation are hospitalisation (H) and blood tests (B). Considering the initial DAG plausible, we hypothesised candidate covariates that are parents of the variables and may open back-door paths, Figure 1 shows one risk factor (RF3) and one confounder (BOC1). Figure 2 is an enhancement of the initial DAG with potential risk factors, confounders of the focal association and other covariates. 
Risk factors contribute directly to the development of COVID-19 inflammation (RF = {RF1, …, RFL}, mutual causation relationships (RFi → MSIC ← RFj)) and they can also affect other variables. Figure 2 also distinguishes the covariates in terms of their confounding potential on the association between H and B. Covariates that affect both focal outcomes are identified as Both-Outcomes-Confounders (BOC = {BOC1, …, BOCm}), as they are correlated to the focal outcomes but not to COVID-19; covariates that affect only one outcome are identified as Single-Outcome-Covariates (SOC = {SOC1, …, SOCn}). The covariates shown are not exhaustive; they serve to generate causal graph criteria for handling confounding factors. Causal relationships in DAGs are defined with the do(.) operator, which performs a theoretical intervention by holding constant the value of a chosen variable [11, 16]. The association caused by COVID-19 inflammation can be understood as a comparison of

P(H|B = b, do(SARS-CoV-2 = 1)) (1)

and

P(H|B = b', do(SARS-CoV-2 = 0)) (2)

where (1) represents the population distribution of H (hospitalisation) given a set of blood tests equal to b, if everyone in the population had been infected with SARS-CoV-2, and (2) represents the corresponding distribution, given blood tests equal to b', if no one in the population had been infected. Of interest is the comparison of these distributional probabilities for each intervention. The interventions with do(.) generate two modified DAGs:
• The do(SARS-CoV-2 = 0) eliminates all arrows directed towards SARS-CoV-2 and to MSIC (Fig. 3). Ignoring the floating covariates, there are single arrow covariates pointing to hospitalisation (RF3, RF4A, SOC1, SOC3) and to blood tests (RF4B, SOC2, SOC4), and fork covariates pointing to both outcomes (BOC1, BOC2, RF5).
• Similarly, the modified graph of do(SARS-CoV-2 = 1) is equal to the former with the addition of single arrows from RF1 and RF2 to MSIC, and with RF3, RF4A, RF4B and RF5 converted to fork types with arrows directed to MSIC.
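The two interventions can be mimicked on a small edge list. The sketch below is illustrative only: the edges are transcribed from the textual description above (the published figures may contain further arrows), the helper names are hypothetical, and listing shared parents of H and B is a simplification of the full back-door criterion.

```python
# Edge list (cause, effect) transcribed from the text above. This is an
# assumption for illustration, not a copy of the published Figure 2.
EDGES = {
    ("SARS-CoV-2", "MSIC"), ("MSIC", "H"), ("MSIC", "B"),
    # risk factors of the inflammation
    ("RF1", "MSIC"), ("RF2", "MSIC"), ("RF3", "MSIC"),
    ("RF4A", "MSIC"), ("RF4B", "MSIC"), ("RF5", "MSIC"),
    # covariates pointing to hospitalisation only
    ("RF3", "H"), ("RF4A", "H"), ("SOC1", "H"), ("SOC3", "H"),
    # covariates pointing to blood tests only
    ("RF4B", "B"), ("SOC2", "B"), ("SOC4", "B"),
    # fork covariates pointing to both outcomes
    ("BOC1", "H"), ("BOC1", "B"), ("BOC2", "H"), ("BOC2", "B"),
    ("RF5", "H"), ("RF5", "B"),
}


def parents(edges, node):
    """Direct causes of `node`."""
    return {cause for cause, effect in edges if effect == node}


def do_infection(edges, infected):
    """Modified graph after do(SARS-CoV-2 = infected).

    Arrows into the intervened node are always removed. With
    infected=False, arrows into MSIC are removed as well, as in the
    description of the negative stratum.
    """
    blocked = {"SARS-CoV-2"} if infected else {"SARS-CoV-2", "MSIC"}
    return {(c, e) for c, e in edges if e not in blocked}


def shared_parents(edges, a="H", b="B"):
    """Covariates with arrows into both outcomes, excluding the latent MSIC."""
    return (parents(edges, a) & parents(edges, b)) - {"MSIC"}


negative = do_infection(EDGES, infected=False)
positive = do_infection(EDGES, infected=True)

# Fork covariates that can confound H and B without infection
print(sorted(shared_parents(negative)))
# Causes of the inflammation when everyone is infected
print(sorted(parents(positive, "MSIC")))
```

In this sketch the covariates shared by both outcomes at the negative stratum are the ones the framework asks to control for or, failing that, to remove together with the blood tests they affect.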
As most covariates are either unmeasured or unknown, the effect of their absence can be evaluated following the d-separation concept [11]. This concept attempts to separate (make independent) two focal sets of variables by blocking the causal ancestors (or back-door paths) and by avoiding statistical control for mutual causal descendants [11]. In contrast, to preserve the association between descendants of MSIC (Fig. 2), the focal outcomes (H and B) must remain d-connected (dependent on each other only through MSIC) and their relations with other covariates (that may introduce systematic bias) should be d-separated (conditionally independent). Figure 3, at the negative stratum, shows the confounders that may introduce systematic bias into both outcomes: BOC1, BOC2 and RF5. The influence of these confounders on the focal association can be estimated with the modified model at the negative stratum. A strong association of the outcomes without infection can be due to these confounders and suggests that efforts should be made to measure and control for them (as they have to be d-separated). Another pragmatic possibility is to exclude the noisy exams affected by these confounders. The other covariates are single arrows (they affect only one outcome, H or B); their absence should not be critical because they are likely to be discarded due to poor discriminative performance. A naïve estimation of equations (1) and (2) is to assume that they are equal to their conditional probabilities available in a given dataset at each stratum. The cost of this simplification is that the analysis is no longer causal (in a counterfactual sense, because we are not contrasting the whole population infected and the whole population not infected [10, 11, 16]) and the estimation becomes an association between two disjoint sets, each representing a separate part of the target population.
As hospitalisation is a dichotomous variable, the conditional probability

P(H|B = b, SARS-CoV-2 = 1) (3)

can be computed through a logistic regression of hospitalisation (dependent variable) given a set of blood tests at SARS-CoV-2 = 1. From the modified graph with intervention,

P(H|B = b', SARS-CoV-2 = 0) (4)

is calculated with the same model parameters but applied to cases at the negative stratum. It is implicit that each model is conditioned on a proper set of covariates. The concordance statistic of a logistic regression model is a measure of its predictive accuracy and is calculated as the area under the curve (AUC) of the receiver operating characteristic (ROC) [10, 17]. A way to compare the discriminative ability of (3) and (4) is to subtract the AUC values at each stratum. A difference of 0.0 means no specific association with COVID-19 (i.e. equivalent responses for both strata) and 0.5 means perfect focal association of the outcomes and perfect differentiation among strata (i.e. perfect response at the positive stratum and random response at the negative). The comparison of the models with AUC values at the negative stratum of SARS-CoV-2 is a necessary improvement in the assessment of prognostic models. This is similar to the null values concept in measures of associations of two groups with two outcomes [10], but generalised for continuous multivariable prognostic models. The above framework guided our approach to identify sets of blood tests associated with hospitalisation due to COVID-19, together with the following criteria:
• Acceptable overall statistical properties of each model at the positive stratum of SARS-CoV-2, without and with bootstrap procedure.
• Consistency of the blood test coefficients across models with one variable and with multiple variables: considering causal effects, coefficients should not change sign when properly conditioned across models [15].
• Elimination of models with high AUC at the negative stratum of SARS-CoV-2 and classification of the sets of blood tests by the difference of AUC between strata.

We identified one public observational database in which, at least partially, we could apply the framework and generate candidate prognostic models. Hospital Israelita Albert Einstein (HIAE), Sao Paulo/Brazil, made public on the Kaggle platform a database (HIAE_dataset) [18] of 5644 patients screened with the SARS-CoV-2 RT-PCR (reverse transcription-polymerase chain reaction) exam, with a few additional laboratory tests collected during a visit to this hospital from February to March 2020. All blood tests were standardised to have a mean of zero and unit standard deviation. As this research is based on a public and anonymised dataset, it was not reviewed by any institutional board. The logistic regression models were evaluated with IBM SPSS version 22.0 and the causal map with DAGitty.net version 3.0. Of the 5644 patients, 558 presented positive results for SARS-CoV-2 RT-PCR. Of the 170 patients hospitalised (in regular ward, semi-intensive unit or ICU), 52 were positive (9.3% rate of hospitalisation due to COVID-19). Patient age quantile, from 0 to 19, with sample mean of 9.32, was the only demographic variable available. Age was not conditionally independent of the SARS-CoV-2 RT-PCR exam. Only 0.9% were positive in the age quantiles 0, 1 and 2 (8 positive cases in 883 exams), while the incidence (not weighted) in the age quantiles from 3 to 19 was 11.7% ± 2.6%. In the first round, 15 blood tests were discarded because of poor performance of the univariate model when SARS-CoV-2 = 1 (Table 1). The remaining blood tests were creatinine, C-reactive protein (CRP), eosinophils, lymphocytes, monocytes and neutrophils (Table 1). Only creatinine was not directly related to the immune system and it was evaluated as a risk factor.
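As a small worked illustration of the data description above, the sketch below standardises a column that has missing values and recomputes the two proportions quoted in the text. The function name is hypothetical, and the use of the sample standard deviation is an assumption, since the dataset documentation quoted here does not say which estimator was used.

```python
import statistics


def standardise(values):
    """Rescale to mean 0 and unit standard deviation, keeping None as missing.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    present = [v for v in values if v is not None]
    mean = statistics.fmean(present)
    sd = statistics.stdev(present)
    return [None if v is None else (v - mean) / sd for v in values]


# Toy column with one missing exam (not taken from the dataset)
print(standardise([1.0, 2.0, 3.0, None]))

# Proportions quoted in the text, recomputed from the reported counts
hospitalised_positive_rate = 100 * 52 / 558   # positive patients who were hospitalised
young_positive_rate = 100 * 8 / 883           # positives in age quantiles 0 to 2
print(round(hospitalised_positive_rate, 1), round(young_positive_rate, 1))
```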
Of the 5644 patients, 602 presented values of eosinophils, 602 of lymphocytes, 601 of monocytes, 513 of neutrophils, 506 of CRP and 424 of creatinine. Regarding missing cases, all observations with the required data were included (available-case analysis). CRP is a biomarker of various types of inflammation [19, 20]. At SARS-CoV-2 = 1, the model with CRP and age had good discriminative ability, with an AUC of 0.872. But at SARS-CoV-2 = 0, the AUC of 0.680 was also substantial and the difference of discriminative ability, Δ = 0.192, was moderate (candidate models should present higher differences); the corresponding ROC curves in Figure 4 overlap up to a sensitivity of 0.5 to 0.6. Models with CRP demonstrated sensitivity to resampling within the dataset [17]: the coefficient significance moved from 0.005 to 0.144. Similar effects were found in models that include CRP with other blood tests, and sensitivity to bootstrapping was reduced by dichotomising CRP (reactive/not-reactive). Models with CRP_reactive, neutrophils and age generated AUC of 0.901 and 0.730 in the positive and negative strata (Δ = 0.171), and models with CRP_reactive, monocytes, neutrophils and age generated AUC of 0.921 and 0.706, respectively (Δ = 0.215). CRP is a predictor of hospitalisation in general, but high levels of AUC at the negative stratum mean that CRP is a response with significant bias due to causes other than COVID-19. Differently from other prognostic studies [21] [22] [23] [24] [25] [26], CRP was excluded as a candidate. The neutrophils to lymphocytes ratio (NLR) is considered a possible indicator of severity of COVID-19 [21, 24, 27, 28], but NLR could not be evaluated as all variables were standardised (division by zero). Lymphocytes presented inconsistent behaviour across models. Single exam models indicated lymphopenia at SARS-CoV-2 = 1, as expected [29, 30].
But lymphocytes reversed the sign in the model with neutrophils and age (SARS-CoV-2 = 1), possibly, due to collinearity between them (Pearson's correlation of −0.925 and −0.937 at positive and negative strata, both significant at 0.01 (two-tail)). As there are indications of collinearity issues at both strata, lymphocyte and neutrophils should not be in the same model as independent variables, and this is an indication that NLR may be a noisy association with hospitalisation. As models with combinations of neutrophils were slightly better than with lymphocyte, lymphocyte was dropped from analysis. In the second round, combinations of eosinophils, monocytes and neutrophils with age were tested systematically. Table 2 presents parameters of models combining eosinophils, monocytes and neutrophils (with age) and the best model with creatinine (as risk factor). Table 3 presents AUCs for each model with the difference of discriminative ability between strata. Considered individually, eosinophils, monocytes and neutrophils generated models with good performance to estimate the probability of hospitalisation (models 1, 2, 3 with AUC>0.810 at positive stratum). The combinations of these blood tests generated models (4, 5, 6, 7) with better discriminative ability (AUC>0.856 at SARS-CoV-2 = 1). The AUC at SARS-CoV-2 = 0 is a simplified measure of the systematic bias in both outcomes: models 1, 2 and 4 presented low values (with AUC<0.564) and the others presented relevant noisy associations (AUC from 0.600 up to 0.665), but with better difference in discriminative ability Δ>0.252 in models with two or more exams. Two patterns of associations were more salient: (1) age as a risk factor with combinations of eosinophils, monocytes and neutrophils as predictors; (2) age and creatinine as risk factors with monocytes and neutrophils as predictors. 
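The stratum comparison used to rank the models can be sketched as follows. This is a minimal illustration rather than the SPSS procedure used in the study: the function names are hypothetical, the AUC is computed directly as the proportion of concordant pairs, and the toy scores are invented. Only the three pairs of AUC values at the end are taken from the CRP models reported above.

```python
def auc(scores, labels):
    """Concordance statistic: share of (hospitalised, not hospitalised)
    pairs in which the hospitalised case has the higher predicted score;
    ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def delta_auc(scores_pos, hosp_pos, scores_neg, hosp_neg):
    """AUC at SARS-CoV-2 = 1 minus AUC at SARS-CoV-2 = 0, with the scores
    in both strata produced by the same fitted model."""
    return auc(scores_pos, hosp_pos) - auc(scores_neg, hosp_neg)


# Invented toy scores: perfect ranking when positive, random when negative
toy_delta = delta_auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0],
                      [0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0])
print(toy_delta)

# AUC pairs (positive stratum, negative stratum) reported for the CRP models
reported = {
    "CRP + age": (0.872, 0.680),
    "CRP_reactive + neutrophils + age": (0.901, 0.730),
    "CRP_reactive + monocytes + neutrophils + age": (0.921, 0.706),
}
for name, (auc_pos, auc_neg) in reported.items():
    print(name, round(auc_pos - auc_neg, 3))
```

The toy example reaches the upper bound of the scale, while the reported CRP models fall below the differences observed for the candidate sets with two or more exams.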
The interpretation of the conditional probabilities will focus on models 7 and 8, but models with at least two blood tests (4−8) are potential candidate associations. Considering creatinine as a marker of the renal function, model 8 is the overall best model with significant coefficients at P < 0.05 and has the highest difference of discriminative ability between strata (Δ = 0.313). Comparative ROC curves for models 7 and 8 are shown in Figures 5 and 6 , where there is a substantial discriminative difference between both strata of SARS-CoV-2; confidence intervals at 95% of AUC values are in Table 3 . When the coefficients of model 7 (Table 2 ) are converted to conditional probabilities we find that at average age quantile (9.32) and average monocyte and neutrophil levels, there is a hospitalisation probability of 51.1% with eosinophils at −1 standard deviation (S.D.); and 90.2% when age quantile is 15. Model 8 with creatinine has different responses: age quantile coefficient is more pronounced and the odds ratio of creatinine is steep (8.338), so average levels of creatinine result in a probability of hospitalisation >50% for age quantile >9 (with monocytes and neutrophils at average). When creatinine is + 1 S.D. at age quantile 9, hospitalisation probability is 85.9% (monocytes and neutrophils at average). Only below average levels of creatinine lower hospitalisation probabilities. Monocytes and neutrophils are also steeper than model 7. At age quantile 9, + 1/2 S.D. of creatinine, −1/2 S.D. of monocytes and + 1/2 S.D. of neutrophils result in a hospitalisation probability of 92.5%. Model biases may be due to missing cases selection. Most likely, missing data are not at random (MNAR). We performed the bootstrapping procedure to identify potential sensitivity to resampling and, indirectly, to selection bias. The selected models maintained the magnitude and statistical significance of the coefficients. 
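The conversion from logistic coefficients to hospitalisation probabilities can be sketched as below. The function names are hypothetical and the baseline used in the example (a linear predictor of zero, i.e. a 50% probability) is an assumption for illustration; the fitted intercepts and the other coefficients are in Table 2 and are not reproduced here. Only the odds ratio of creatinine (8.338) comes from the text.

```python
import math


def hospitalisation_probability(intercept, coefficients, values):
    """Logistic model: P(H = 1) = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))).

    `coefficients` and `values` are dicts keyed by predictor name; blood
    tests are expected in standard deviation units.
    """
    linear = intercept + sum(b * values[name] for name, b in coefficients.items())
    return 1.0 / (1.0 + math.exp(-linear))


def coefficient_from_odds_ratio(odds_ratio):
    """B = ln(OR), the inverse of OR = exp(B)."""
    return math.log(odds_ratio)


# Coefficient implied by the reported odds ratio of creatinine in model 8
b_creatinine = coefficient_from_odds_ratio(8.338)

# Assumed baseline: intercept 0, so creatinine at its mean gives 50%
baseline = hospitalisation_probability(0.0, {"creatinine": b_creatinine},
                                       {"creatinine": 0.0})
# Same hypothetical patient with creatinine one S.D. above the mean
raised = hospitalisation_probability(0.0, {"creatinine": b_creatinine},
                                     {"creatinine": 1.0})
print(baseline, raised)
```

Starting from even odds, one standard deviation of creatinine multiplies the odds by 8.338, which illustrates why the text describes this odds ratio as steep.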
As the bootstrap procedure detected no significant deviation, the missing cases bias may not be an issue. ROC and AUC calculations used the same data as model fitting. Because of the limited sample size, it was not feasible to split the database for training and then prediction. After dividing the sample into two groups, most coefficients were not significant (P > 0.10) in at least one group (Table 4). Notwithstanding, classification tables were coherent between subsets and we found no clear indication of model misspecification. We focused on models with discriminative ability to identify peculiar responses in the transition from moderate-to-severe inflammation only due to COVID-19. The AUC evaluation at the negative SARS-CoV-2 stratum, to estimate the influence of unwanted confounders on the focal association, together with equivalent criteria of severity state at both strata is, to the best of our knowledge, a needed improvement in prognosis studies of COVID-19. In comparison to other prediction studies, we identified a few focused on the transition from moderate-to-severe cases of COVID-19 [21] [22] [23] [24] [25] [26] [27] [28]. None of them considered data from the negative stratum of SARS-CoV-2; therefore, these models are biased by not excluding noisy predictors. We eliminated variables with 'high' AUC at SARS-CoV-2 = 0, so that variables with more peculiar responses to COVID-19 were included. Reactive levels of CRP together with a positive SARS-CoV-2 RT-PCR exam may be a predictor of hospitalisation, but this can happen due to causes other than COVID-19 (most cases of COVID-19 are asymptomatic to mild). To include it in a model, one should control for all other causes of reactive CRP. We evaluated age and creatinine as risk factors. Controlling for age improved the AUC of all models at the positive stratum of SARS-CoV-2. The difference between risk factor and outcome among blood tests is subtle.
The emergent literature is cautious about whether eosinopaenia may be a risk factor [31] and whether creatinine (and other renal markers) may be associated with COVID-19 renal inflammatory response [32]. If creatinine reflects an acute inflammatory kidney response to COVID-19, the interpretation changes and further refinement of the framework is necessary. If eosinopaenia is a risk factor, the prevalence of this condition should be considered and must be properly diagnosed at admission, and the models should be reviewed with new data.

Note: The cut-off at 5030 cases was selected to generate valid parameters with similar quantities of available cases at SARS-CoV-2 = 1, because lower/higher thresholds generated invalid parameters for model 8 due to perfect discrimination. SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; RT-PCR, reverse transcription-polymerase chain reaction; B, coefficient of the variable; P, value of the statistical significance of the coefficient; OR, odds ratio of B (it is equal to exp(B)); CI, confidence interval. Results of the classification table use a cut-off value of 0.5, with percentage of correct non-hospitalisation (H = 0) and correct hospitalisation (H = 1).

As we drop noisy predictors, we are effectively dealing with hypotheses about the physiopathology of COVID-19 inflammation. Although not as frequent as the mentions of neutrophils, there are studies on the complex role of eosinophils [31, 33] and monocytes [34, 35] in COVID-19 inflammation, indicating eosinopaenia in severe cases and monocytopaenia in some phase of the cytokine storm and other COVID-19 pathologies [36]. We selected two patterns of blood tests that are associated with hospitalisation due to COVID-19 inflammation: age with combinations of eosinophils, monocytes and neutrophils; and age and creatinine with monocytes and neutrophils.
The model findings are aligned with the known physiopathology of COVID-19 but in a more integrative framework of analysis (not as individual predictors, but as a set that is related to risk factors). The selected blood tests are broadly available even in regions with scarce health care resources. It is unlikely that we will have just one or two overall best models; given different sets of risk factors, we should expect a few representative patterns of the COVID-19 inflammation from moderate to severe. The models are candidates only and the results cannot be representative beyond the patient health profiles of this reference hospital in Sao Paulo/Brazil, which attends a high socio-economic segment [37]. The sample refers to the initial phase of the pandemic in Brazil and the patterns may change with medicine prescriptions and other adaptations of SARS-CoV-2. The reduced quantity of available cases did not allow the dataset to be split for training and prediction. Further efforts are needed to increase internal and external validity across populations, as the prognostic ability is also a function of the variability of the development of COVID-19 inflammation. As there is no unambiguous way to characterise 'moderate-to-severe COVID-19 inflammation', the inclusion of an unmeasured variable reduces the predicted conditional independences from the DAG. Still, this framework can help in the identification and estimation of risk factors. These cross-sectional data (a single point in time) cannot inform whether creatinine (or eosinophils) is a risk factor or an effect of COVID-19 inflammation. In future data collection efforts, participants should be followed over time, from diagnosis to hospitalisation; ideally from exposure throughout the lifecycle and also with the follow-up of negative cases. Causal studies are intrinsically predictive [10]; therefore, we need to advance prognosis research within causal frameworks.
As most studies will be observational, data collection with ample selection of variables for matching estimators (e.g. stratification) [16] will be required to reduce systematic bias. All candidate models can be reproduced from the dataset [18]. We believe most hospitals can apply this framework to generate similar models appropriate to the target population in which they are inserted, by making efforts to collect blood tests and potential risk factors at admission, and other clinical data. By making these databases public (anonymised and with standardised data), they will allow future external validation in larger target populations. Finally, in the wider context of COVID-19 epidemiology, the collapse of health systems due to opportunistic pathogens is a symptom of threats that require system-level measures during and after the pandemic [38]. This research is concerned with hospital care. As hospital care is a bottleneck, even small gains may have multiplicative effects on health systems. In countries with porous containment efforts, hospital occupancy is a critical metric [39] to alternate between 'soft lockdown' and economic activity with 'constrained mobility'. As some regions with sustained transmission are hesitant and being pushed towards these states, they are poorly capturing the benefits of the switching strategy (Parrondo's paradox applied to epidemics [40]), because they are struggling in trial and error mode to establish thresholds of when to restrain (and open) and at what pace. Due to the fast saturation of hospital infrastructures with overshooting in these regions, the tendency of excessive losses in each transition is hard to manage. In this context, we believe that the application of prognosis tools can improve the timely access to supportive care in countries with sustained COVID-19 transmission.

[1] World Health Organization (2020) Clinical management of COVID-19: Interim guidance. WHO publications
[2] COVID-19 length of hospital stay: a systematic review and data synthesis
[3] Clinical characteristics of coronavirus disease 2019 in China
[4] Characteristics of COVID-19 patients dying in Italy. Epidemiology for public health: Istituto Superiore di Sanità
[5] Human infection with 2019 novel coronavirus person under investigation (PUI) and case report form
[6] Risk factors associated with disease severity and length of hospital stay in COVID-19 patients
[7] Predictors of COVID-19 severity: a literature review. Reviews in Medical Virology n/a, e2146
[8] Superposition of COVID-19 waves, anticipating a sustained wave, and lessons for the future
[9] Prediction models for diagnosis and prognosis of Covid-19: systematic review and critical appraisal
[10] Epidemiology by Design: A Causal Approach to the Health Sciences, 1st Edn
[11] Causality: Models, Reasoning, and Inference, 2nd Edn
[12] Directed acyclic graph
[13] Causal diagrams for epidemiologic research
[14] Causal diagrams
[15] Comment: understanding Simpson's paradox
[16] Counterfactuals and Causal Inference: Methods and Principles for Social Research
[17] Applied Logistic Regression
[18] Diagnosis of COVID-19 and its clinical spectrum: AI and Data Science supporting clinical decisions (from 28th Mar to 3rd Apr). Kaggle [Internet
[19] Interpretation of C-reactive protein concentrations in critically ill patients
[20] C-reactive protein
[21] Predictors of progression from moderate to severe coronavirus disease 2019: a retrospective cohort
[22] Validation of predictors of disease severity and outcomes in COVID-19 patients: A descriptive and retrospective study [published online ahead of print
[23] Clinical value of immune-inflammatory parameters to assess the severity of coronavirus disease 2019
[24] The value of clinical parameters in predicting the severity of COVID-19
[25] Predictive factors of severe coronavirus disease 2019 in previously healthy young adults: a single-center, retrospective study
[26] Predictors for severe COVID-19 infection
[27] The diagnostic and predictive role of NLR, d-NLR and PLR in COVID-19 patients
[28] Preliminary study to identify severe from moderate cases of COVID-19 using combined hematology parameters
[29] Lymphopenia is associated with severe coronavirus disease 2019 (COVID-19) infections: a systemic review and meta-analysis
[30] Lymphopenia in severe coronavirus disease-2019 (COVID-19): systematic review and meta-analysis
[31] Eosinophil responses during COVID-19 infections and coronavirus vaccination
[32] Acute kidney injury in the 2019 novel coronavirus disease
[33] The role of peripheral blood eosinophil counts in COVID-19 patients
[34] Monocytopenia, monocyte morphological anomalies and hyperinflammation characterise severe COVID-19 in type 2 diabetes
[35] Monocyte activation in systemic Covid-19 infection: assay and rationale
[36] Severe COVID-19 and aging: are monocytes the key? GeroScience
[37] Epidemiologic and clinical features of patients with COVID-19 in Brazil. einstein (Sao Paulo) 18
[38] Introducing the 21st century's new four horsemen of the coronapocalypse
[39] Predictive model for COVID-19 incidence in a medium-sized municipality in Brazil
[40] Relieving cost of epidemic by Parrondo's paradox: a COVID-19 case study. Advanced Science

Acknowledgements.
We are grateful to Antonio Magno Lima Espeschit and Sonia Mara de Andrade, who contributed with suggestions to this research. We are also indebted to Hospital Israelita Albert Einstein for making the dataset available, and to the referees for their detailed comments. This paper has not been published previously in whole or part. The data that support the results of this study are openly available in reference number [18]. Although this research received no specific grant from any funding agency, commercial or not-for-profit sectors, as institutionally required we inform that 'this study was financed in part by the Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior - Brasil (CAPES) - Finance Code 001'.