key: cord-0328432-o9gypwbq
authors: Lai, K.-L.; Hu, F.-C.; Wen, F.-Y.; Chen, J.-J.
title: Lymphocyte count is a universal predictor to the health status and outcomes of patients with coronavirus disease 2019 (COVID-19): A systematic review and meta-regression analysis
date: 2021-08-04
journal: nan
DOI: 10.1101/2021.08.02.21261505
sha: 57b1a015125d4e938d574ba2b552958244485c25
doc_id: 328432
cord_uid: o9gypwbq

Background This study aimed to evaluate how well clinical laboratory biomarkers predict the prognosis of COVID-19 patients.

Methods Observational studies reporting at least 30 cases of COVID-19 and describing disease severity or mortality were included. Meta-data on demographics, clinical symptoms, vital signs, comorbidities, and 14 clinical laboratory biomarkers on initial hospital presentation were extracted. Taking the outcome group as the analysis unit, meta-regression analysis with the generalized estimating equations (GEE) method for clustered data was performed sequentially. Because of missing data, the unadjusted effect of each potential predictor on the three binary outcome variables (i.e., severe vs. non-severe, critically severe vs. non-critically severe, and dead vs. alive) was examined one at a time by fitting three series of simple GEE logistic regression models, and the worst predictor was dropped at each step. Then, a final multiple GEE logistic regression model for each of the three outcome variables was obtained.

Findings Meta-data were extracted from 76 articles, reporting a total of 26,627 cases of COVID-19. Patients were recruited across 16 countries. The number of studies (patients) included in the final models of the analysis for severity, critical severity, and mortality was 38 studies (9,764 patients), 21 studies (4,792 patients), and 24 studies (14,825 patients), respectively.
After adjusting for the effect of age, lymphocyte count mean or median ≤ 1.03 (estimated hazard ratio [HR] = 46.2594, p < 0.0001), smaller lymphocyte count mean or median (HR < 0.0001, p = 0.0028), and lymphocyte count mean or median ≤ 0.8714 (HR = 17.3756, p = 0.0079) were the strongest predictors of severity, critical severity, and mortality, respectively.

Interpretation Lymphocyte count should be closely watched for COVID-19 patients in clinical practice.

Keywords Laboratory data, lymphocyte, logistic regression analysis, clustered data, GEE.

Although numerous treatment options and vaccines have been authorized for COVID-19, 1,2 the global pandemic is still continuing. More than four hundred thousand new cases were still being identified each day as of July 2021. 3 COVID-19, the illness caused by infection with SARS-CoV-2, 4 has been spreading since December 2019 from Wuhan, China, and had accumulated more than 192 million cases and more than 4 million deaths in over 219 countries, areas, or territories up to 23 July 2021. 5 During the pandemic, health-care systems in many communities became overwhelmed, in economically developed and underdeveloped regions alike. [6-13] How to use simple tools to differentiate and triage patients is crucial. Several laboratory data have been identified as predictors of disease severity or mortality for COVID-19 patients, e.g., lymphocyte count, neutrophil/lymphocyte ratio (NLR), lactate dehydrogenase (LDH), D-dimer, C-reactive protein (CRP), and procalcitonin (PCT). [14-26] However, since many studies were conducted in the same region during a short period of time, [14-22, 26, 27] potential bias from subject duplication cannot be ruled out in the meta-analyses that followed.
[28-34] Additionally, the relative strength of a broader spectrum of lab data in terms of prediction capability has not been explored on a head-to-head basis. This study aimed to investigate whether laboratory data at hospital presentation play a role in distinguishing severity or predicting mortality for COVID-19 patients and to explore the relative significance of these predictors across regions.

Evidence before This Study Many tools, e.g., demographics, symptoms, vital signs, comorbidity, imaging, and lab data, have been explored for their capability to predict outcomes for COVID-19 patients. Several lab data such as lymphocyte count, NLR, LDH, CRP, and PCT have been identified as able to distinguish the severity or to predict the survival of COVID-19 patients. However, most of the research pooled in meta-analyses was conducted in a specific region at the early stage of the pandemic, and concerns about subject duplication cannot be ignored given the large number of papers published within a short period. In addition, the relative strength of these lab items for prediction has not been tested across a broader spectrum of outcomes, including severity, critical severity, and mortality, or across different regions.

hypersensitive troponin I (hs-cTnI); (4) fewer than thirty subjects; (5) research subjects possibly duplicated in other studies, as judged from the sites and the recruitment period. Under condition (5), the study carrying the most information, calculated as [(the number of study subjects) × (the number of lab data items)], was selected.

At the initial stage, after duplicates were removed, 1,126 records were identified from the MEDLINE or EMBASE databases. Of these records, 660 documents were excluded after the title and abstract review. The remaining 466 articles were evaluated in detail. Of these, 390 articles were excluded because the studies did not meet the criteria we had set.
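The duplicate-handling rule under condition (5) and the screening counts above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study labels, site name, and subject and lab-item counts are hypothetical and do not come from the reviewed articles; only the scoring rule (subjects × lab data items) and the screening numbers are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateStudy:
    label: str        # hypothetical identifier, not a real article
    site: str         # hospital or site suspected of overlapping recruitment
    n_subjects: int   # number of study subjects
    n_lab_items: int  # number of desired lab data items reported

    @property
    def information_score(self) -> int:
        # Rule stated in the text: subjects x lab data items.
        return self.n_subjects * self.n_lab_items


def pick_most_informative(overlapping):
    """Among studies that may share subjects, keep the highest-scoring one."""
    return max(overlapping, key=lambda s: s.information_score)


# Hypothetical studies from one site with overlapping recruitment periods.
candidates = [
    CandidateStudy("study_a", "hospital_x", n_subjects=120, n_lab_items=6),
    CandidateStudy("study_b", "hospital_x", n_subjects=90, n_lab_items=10),
    CandidateStudy("study_c", "hospital_x", n_subjects=200, n_lab_items=3),
]
selected = pick_most_informative(candidates)
print(selected.label, selected.information_score)

# Screening flow, using the counts reported in the text.
records_after_dedup = 1126
excluded_on_title_abstract = 660
full_text_reviewed = records_after_dedup - excluded_on_title_abstract
excluded_on_full_text = 390
included = full_text_reviewed - excluded_on_full_text
print(full_text_reviewed, included)
```

Note that the largest study is not necessarily the one retained: in this toy input the study with fewer subjects but more lab items has the highest score.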
Finally, a total of 76 studies with 26,627 patients were included in the qualitative synthesis (Figure 1). Among the 76 studies, based on the features of the data, 38 studies, 21 studies, and 24 studies were incorporated in the analysis for severity, critical severity, and mortality, respectively. Ultimately, 35 studies, 15 studies, and 19 studies were presented in the meta-regression analyses, respectively.

The following data were extracted from the qualified studies: first author, year/month of publication, location (city-country), hospital name, definition of disease severity, subject number, number of COVID-19 patients in each health status, age, male-to-female ratio, vital signs, clinical features (12 symptoms), comorbidities (any; 8 main diseases), and the 14 desired lab data items (Appendix I, Appendix II). Lab data on the initial hospital presentation were classified as blood routine, blood biochemistry, coagulation functions, inflammatory markers, and myocardial injury markers (see Appendix II). The primary outcome measures were the levels of laboratory data and their impact on different health outcomes (non-severe vs. severe, non-critically severe vs. critically severe, and alive vs. dead) after adjusting for the effects of other covariates.

Statistical analysis was performed using the R 4.1.0 software (R Foundation for Statistical Computing, Vienna, Austria). A two-sided p value ≤ 0.05 was considered statistically significant. We chose the outcome groups in the collected studies as the analysis unit, instead of the collected studies themselves, in this meta-analytical study. The distributional properties of continuous variables were expressed as mean ± standard deviation (SD), median, and interquartile range (IQR), and categorical variables were presented as frequency and percentage (%). In univariate analysis, the unadjusted effect of each potential risk factor, prognostic factor, or predictor of the three binary outcome variables (i.e., severe vs.
non-severe, critically severe vs. non-critically severe, and dead vs. alive) was examined using the Wilcoxon rank-sum test, Chi-square test, or Fisher's exact test, as appropriate for the data type. Next, multivariate analysis was conducted by fitting logistic regression models with the generalized estimating equations (GEE) method to estimate the adjusted effects of potential risk factors, prognostic factors, or predictors on each of the three binary outcome variables (i.e., severe vs. non-severe, critically severe vs. non-critically severe, and dead vs. alive). The GEE method was used to account for the correlation between the two outcome groups within a collected study. 35 Computationally, we used the geeglm function (with the specified "exchangeable" correlation structure and the default robust estimator of standard error) of the geepack package 36,37 in R to fit GEE logistic regression models for each of the three sets of correlated binary responses.

To ensure a good quality of analysis, the model-fitting techniques for (1) variable selection, (2) goodness-of-fit (GOF) assessment, and (3) regression diagnostics and remedies were used in our GEE logistic regression analyses. All the relevant covariates (listed in Appendix II), whether or not they were significant in univariate analysis, were put on the list of variables to be selected. However, each of the collected studies reported only a selection of the potential risk factors, prognostic factors, or predictors of the three binary outcome variables. If we had wanted to assess the effects of all the relevant covariates (listed in Appendix II) simultaneously, very few studies would have been free of missing values. Thus, to make maximal use of the available information, our meta-regression analysis was performed by fitting a series of simple GEE logistic regression models and then dropping the worst covariate one at a time.
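The reasoning behind this stepwise strategy, namely that the number of studies with complete data shrinks quickly as more covariates are required at once, can be illustrated with a small Python sketch. Everything here is hypothetical: the study names, the covariates each one reports, and the order in which covariates are dropped are made-up inputs. In the actual analysis the "worst" covariate was chosen from the fitted GEE models, which this sketch does not attempt to reproduce.

```python
def complete_case_studies(reported, covariates):
    """Return the studies that report every covariate in `covariates`."""
    needed = set(covariates)
    return sorted(study for study, items in reported.items() if needed <= items)


def usable_counts(reported, covariates, drop_order):
    """Count complete-case studies before and after each covariate is dropped."""
    remaining = list(covariates)
    counts = [(tuple(remaining), len(complete_case_studies(reported, remaining)))]
    for covariate in drop_order:
        remaining.remove(covariate)
        counts.append(
            (tuple(remaining), len(complete_case_studies(reported, remaining)))
        )
    return counts


# Hypothetical reporting pattern: each study reports only some covariates.
reported = {
    "study_1": {"age", "lymphocyte", "crp", "d_dimer"},
    "study_2": {"age", "lymphocyte", "crp"},
    "study_3": {"age", "lymphocyte"},
    "study_4": {"age", "lymphocyte", "ldh"},
    "study_5": {"age", "crp"},
}
all_covariates = ["age", "lymphocyte", "crp", "d_dimer", "ldh"]

# Assumed drop order, standing in for a model-based ranking of covariates.
steps = usable_counts(reported, all_covariates, drop_order=["ldh", "d_dimer", "crp"])
for covariates, n_studies in steps:
    print(n_studies, covariates)
```

With this toy input no study supports the full five-covariate model, while four of the five studies support a model with age and lymphocyte count only.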
Then, a final multiple GEE logistic regression model for each of the three outcome variables was obtained. Any discrepancy between the results of univariate analysis and multivariate analysis was likely due to the variation in the number of studies without missing values or to the confounding effects of uncontrolled covariates in univariate analysis.

The GOF measures, including the estimated area under the receiver operating characteristic (ROC) curve (also called the c statistic) and the adjusted generalized R², and the Hosmer-Lemeshow GOF test were examined to assess the GOF of the fitted GEE logistic regression model. A c statistic (0 ≤ c ≤ 1) of at least 0.7 suggests an acceptable level of discrimination power. Larger p values of the Hosmer-Lemeshow GOF test imply better fits of the logistic regression model.

medRxiv preprint doi: https://doi.org/10.1101/2021.08.02.21261505; this version posted August 4, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Simple and multiple generalized additive models (GAMs) were fitted to draw the GAM plots for detecting nonlinear effects of continuous covariates and then for identifying the appropriate cut-off point(s) to discretize continuous covariates, if necessary, during the above variable selection procedure. Computationally, we used the vgam function of the VGAM package with the default values of the smoothing parameters (e.g., s(age, df=4, spar=0) for the cubic smoothing splines) to fit the GAMs for our binary responses, and then used the plotvgam function of the same package to draw the GAM plots for visualizing the linear or nonlinear effects of continuous covariates in R. 36,38,39 If a separation or high discrimination problem occurred in logistic regression analysis, we fitted Firth's bias-reduced logistic regression model using the logistf function of the logistf package in R.
40 Finally, the statistical tools of regression diagnostics for residual analysis, detection of influential cases, and check of multicollinearity were applied to discover any model or data problems. A variance inflation factor (VIF) ≥ 10 for continuous covariates or VIF ≥ 2.5 for categorical covariates indicates a multicollinearity problem among some of the covariates in the fitted logistic regression model.

Based on the search strategy, 76 articles were included in the qualitative synthesis 10,14,16- of the studies were based on the WHO interim guidance 105 or national guidance modified from the WHO principles 91, 106, 107 , followed by the American Thoracic Society Guideline (5, 9.1%) and the International Guideline for Community-Acquired Pneumonia (1, 1.8%). Since the contents of the above guidelines were similar, they were all included in the analysis.

It is made available under a CC-BY-NC-ND 4.0 International license.

In general, disease severity is classified into four types: mild, moderate, severe, and critically severe. Severe: meet any of the following: (1) shortness of breath, with a respiratory rate (RR) > 30 breaths per minute; (2) SpO2 lower than 93% on room air; (3) ratio of the partial pressure of arterial oxygen (PaO2) to the fraction of inspired oxygen (FiO2) ≤ 300 mmHg; (4) chest CT imaging shows that lung damage develops significantly within 24 to 48 hours. Critically severe: meet any of the following: (1) respiratory failure requiring mechanical ventilation; (2) signs of septic shock; (3) multiple organ failure requiring ICU admission.
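As a reading aid, the severity criteria listed above can be written as a small rule-based classifier. This is a minimal sketch, not a clinical tool: the field names are assumptions introduced for illustration, missing measurements are simply treated as "criterion not met", and mild and moderate cases are not separated because their criteria are not given in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Presentation:
    """Hypothetical record of findings at hospital presentation."""
    respiratory_rate: Optional[float] = None  # breaths per minute
    spo2_room_air: Optional[float] = None     # percent, on room air
    pao2_fio2: Optional[float] = None         # PaO2/FiO2 ratio, mmHg
    rapid_ct_progression: bool = False        # marked progression in 24-48 h
    mechanical_ventilation: bool = False      # respiratory failure needing it
    septic_shock: bool = False
    organ_failure_icu: bool = False           # multiple organ failure, ICU


def is_critically_severe(p: Presentation) -> bool:
    # Any one of the three critical criteria is sufficient.
    return p.mechanical_ventilation or p.septic_shock or p.organ_failure_icu


def is_severe(p: Presentation) -> bool:
    # Any one of the four severe criteria is sufficient.
    return any([
        p.respiratory_rate is not None and p.respiratory_rate > 30,
        p.spo2_room_air is not None and p.spo2_room_air < 93,
        p.pao2_fio2 is not None and p.pao2_fio2 <= 300,
        p.rapid_ct_progression,
    ])


def classify(p: Presentation) -> str:
    # Check the most serious category first.
    if is_critically_severe(p):
        return "critically severe"
    if is_severe(p):
        return "severe"
    return "non-severe"  # mild or moderate


print(classify(Presentation(respiratory_rate=32)))
print(classify(Presentation(spo2_room_air=93)))
print(classify(Presentation(pao2_fio2=250, septic_shock=True)))
```

The boundary cases follow the wording of the criteria: an SpO2 of exactly 93% does not count as "lower than 93%", whereas a PaO2/FiO2 of exactly 300 mmHg does satisfy "≤ 300 mmHg".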
For comparison purposes in this study, the subjects in the mild and moderate conditions were assembled into the non-severe group; subjects in the mild, moderate, and severe conditions were assembled into the non-critically severe group. There were therefore six health outcomes classified into three pairs in this study: severe vs. non-severe, critically severe vs. non-critically severe, and dead vs. alive.

Summary statistics of the demographics, clinical characteristics, comorbidities, and laboratory data of the COVID-19 patients on initial hospital presentation for the assessment of severity, critical severity, and mortality were shown in Appendix II-1, Appendix II-2, and Appendix II-3, respectively. Not every desired lab parameter was collected in each study; for example, NLR and hs-cTnI were rarely reported. Most of the lab data differed significantly (all p < 0.05) between the two groups, except for the less frequently collected parameters for disease severity (Appendix II-1, Appendix II-2) and ALT and total bilirubin for mortality (Appendix II-3).

Results of univariate analyses of the predictors for severity (severe vs. non-severe) were shown in Table 1

Results of univariate analyses of the predictors for critical severity (critically severe vs. non-critically severe) were shown in Table 2-A. At the last run (seventh, m = 30), only lymphocyte count (AUC = 0.933) and age (AUC = 0.829) or age > 59.82 (AUC = 0.767) remained in the final univariate analyses. Table 2-B lists the results of multivariate analysis of the predictors for critical severity (critically severe vs. non-critically severe). Age and lymphocyte count were in the final meta-regression model.
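The AUC values quoted above are c statistics: the probability that a randomly chosen group with the outcome is ranked as higher risk than a randomly chosen group without it, with ties counted as one half. A minimal Python sketch of that pairwise definition follows. The group-level lymphocyte values are hypothetical and chosen only to make the arithmetic easy to check; they are not data from the included studies, and the R-based computation used in the analysis is not reproduced here.

```python
def c_statistic(scores_with_outcome, scores_without_outcome):
    """Pairwise concordance: P(score_with > score_without), ties count 1/2."""
    concordant = 0.0
    for s_with in scores_with_outcome:
        for s_without in scores_without_outcome:
            if s_with > s_without:
                concordant += 1.0
            elif s_with == s_without:
                concordant += 0.5
    return concordant / (len(scores_with_outcome) * len(scores_without_outcome))


# Hypothetical group-level mean lymphocyte counts (not from the paper).
critically_severe_groups = [0.6, 0.7, 0.9, 1.1]
non_critically_severe_groups = [1.0, 1.2, 1.3, 1.5]

# A lower lymphocyte count is treated as higher risk, so the risk score
# is the negated count.
auc = c_statistic(
    [-x for x in critically_severe_groups],
    [-x for x in non_critically_severe_groups],
)
print(auc)
print(auc >= 0.7)  # threshold for acceptable discrimination used in the text
```

In this toy example 15 of the 16 group pairs are ordered correctly, which gives a c statistic of 0.9375.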
We found that a higher lymphocyte count mean or median was associated with an extremely low risk of critical severity (HR < 0.0001, p = 0.0284), while an age mean or median > 59.82 was associated with a higher risk of critical severity (HR = 307.6130, p = 0.0009).

Results of univariate analyses of the predictors for mortality (dead vs. alive) were shown in

The results of this study provide several important insights. After comparison, lymphocyte count was the most powerful predictor among the fourteen lab items explored. A single lab item, lymphocyte count at initial hospital presentation, together with age can be a remarkable indicator to discriminate the health status (severe vs. non-severe, critically severe vs. non-critically severe) or the final consequence (dead vs. alive) for COVID-19 patients. Compared with vital signs, symptoms, and comorbidities, several lab data (CRP, D-dimer, lymphocyte, neutrophil, platelet, LDH) hold value for differentiating disease severity and predicting the mortality of COVID-19 patients. To the best of our knowledge, this was the first meta-analysis in which potential bias from subject duplication of COVID-19 patients across studies was eliminated before the analyses.

After SARS-CoV-2 infection, multiple mechanisms of the human body are triggered, e.g., immune responses (e.g., WBC, lymphocyte, neutrophil), inflammatory cascades (e.g., CRP, PCT), and the activation of coagulation cascades (e.g., platelet count, D-dimer). 108-111 After the virus invades the tissues, which starts early, inflammation intensifies 110,112 and the inflammatory indicators increase dramatically.
15, 28, 29, 113, 114 The wide distribution of the COVID-19 receptors, e.g., angiotensin-converting enzyme-2 (ACE2) receptors, which are abundantly expressed in a variety of cells residing in many human organs, could exacerbate systemic failure due to direct organ injury. 115,116 Indicators of organ (e.g., lung, liver, kidney, heart, brain) damage, such as ALT, AST, total bilirubin, LDH, and hypersensitive troponin I, rise to reflect the impairment. 30, 117, 118 Consistent with previous studies, our study confirmed that the levels of several, although not all, laboratory data are strong predictors of disease severity or mortality for COVID-19 patients. 15, 28, 29, 30, 114

It is no surprise that lymphocyte count played such an important role for COVID-19 patients in defending against SARS-CoV-2. 28,109,112 Adaptive immune cells such as lymphocytes are essential for virus clearance as well as for recovery from the disease. 109, 112, 119 Interaction between SARS-CoV-2 and the immune system of an individual results in diverse clinical manifestations. 16, 75, 88, 94, 96, 112 Our study reveals that lymphocyte count offers a protective effect to COVID-19 patients within a certain range (Table 1-B, Table 3-B). A lower level (e.g., lymphocyte count ≤ 1.03) or an extremely low level (e.g., lymphocyte count ≤ 0.87) implies immune weakness and worsens outcomes in terms of disease severity or mortality. However, too strong an immune response becomes another issue, which may induce unintended results such as a cytokine storm.
75,88 In our study, a higher level of lymphocyte count (e.g., lymphocyte count > 2.06) was associated with a more severe status (Table 1-

Many tools, e.g., demographics, symptoms, vital signs, comorbidities, and imaging, have been explored for their capability to predict outcomes for COVID-19 patients. 28,78,114 However, such data have their limitations. Routine lab data have several advantages: they can indicate the whole-body condition of a COVID-19 patient, whose functions can change dramatically within a few days. 122 Additionally, lab testing is easy to access, repeatable, self-explanatory, and relatively cheap, and therefore can be a cost-effective tool under pandemic circumstances. Current criteria to judge the severity of, and to triage or refer, COVID-19 patients are based on imaging, demographics, comorbidities, vital signs, or symptoms. 6,123,124 Based on our study results, a single lab item, lymphocyte count at admission, plus age can be useful for these purposes. Early and continuous monitoring of lab data for a COVID-19 patient can help to understand the health state, triage the patient, predict the severity of disease, predict the health consequences, and work out appropriate treatment decisions.

Our study has several limitations. Due to the lack of non-English articles, pediatric studies, and specific disease groups, the results must be interpreted with caution. Ideally, all desired lab data should be collected and analyzed in all studies. However, this is not realistic in the real world because of the wide-ranging deficiencies in medical resources across countries.
We suggest collecting essential data through a standardized list so that clinical presentation, medical history, imaging information, comprehensive lab data, and other valuable factors can be assembled and analyzed, which would accelerate knowledge accumulation, particularly during global pandemics. Because the included retrospective observational studies were conducted at the hospital or community level, the characteristics of individual patients could not be retrieved. In addition, the dynamic relationships among the various lab data, the status of disease progression, and the functions and feelings of the patient have not been explored due to inadequate data. More extensive and larger-scale studies are required to confirm the findings of this study.

Our study, which involved 26,627 confirmed COVID-19 patients across sixteen countries, provides evidence for defending against the disease during pandemics. The results show that lymphocyte count is a universal biomarker of disease severity and mortality across regions. Several routine lab data at the initial hospital presentation showed good prediction capability. Routine lab testing could be a useful tool, particularly under pandemic conditions in which medical resources are constrained. How to maintain or improve immunity levels in the general population in daily life can be a crucial strategy for stakeholders facing life-threatening infectious diseases such as COVID-19.

KLL and FCH designed the study and take responsibility for the integrity of the data and the accuracy of the data analysis. JJC and KLL were in charge of the systematic review and data collection. FCH and FYW conducted the statistical analysis. KLL and FGH contributed to the writing of the manuscript. All authors contributed to data interpretation and reviewed and approved the final version.

All authors declared no competing interests.

The data of this study are available from the corresponding author on request.
Characteristic (ROC). 2 The 5 groups with "Age mean or median ≤ 56.06 years" were all non-severe so that the separation or high discrimination problem occurred in fitting the simple logistic regression model with the generalized estimating equations (GEE) method (assuming an 'exchangeable' working correlation structure). Then, the logistf() function of the logistf package was used to fit the Firth's bias-reduced logistic regression model with the profile penalized log-likelihood method in R. Since the "Residual Deviance" of the Firth's bias-reduced logistic regression model was not compatible with the others, it was not computed and listed.

COVID-19 treatment options: a difficult journey between failed attempts and experimental drugs
CDC. Different COVID-19 Vaccines. Accessed
Weekly epidemiological update on COVID-19-20
Characteristics of SARS-CoV-2 and COVID-19
Fangcang shelter hospitals: a novel concept for responding to public health emergencies
COVID-19 and Italy: what next?
Characteristics, outcomes and indicators of severity for covid-19 among sample of ESNA quarantine hospital's patients, Egypt: A retrospective study
Makeshift hospitals for COVID-19 patients: where health-care workers and patients need sufficient ventilation for more protection
Managing a specialty service during the COVID-19 crisis: lessons from a New York City health system. ncbi.nlm.nih.gov
COVID-19 threatens health systems in sub-Saharan Africa: The eye of the crocodile
The characteristics and predictive role of lymphocyte subsets in COVID-19 patients
Associations of procalcitonin, C-reaction protein and neutrophil-to-lymphocyte ratio with mortality in hospitalized COVID-19 patients in China
Neutrophil-to-Lymphocyte Ratios Are Closely Associated With the Severity and Course of Non-mild COVID-19
Predictors of coronavirus disease 2019 severity: A retrospective study of 64 cases
Risk factors for death in 1859 subjects with COVID-19
Risk Factors for Mortality in 244 Older Adults With COVID-19 in Wuhan, China: A Retrospective Study
The mortality of COVID-19 patients in Wuhan
Early antiviral treatment contributes to alleviate the severity and improve the prognosis of patients with novel coronavirus disease (COVID-19)
Clinical Characteristics and Prognosis of 218 Patients With COVID-19: A Retrospective Study Based on Clinical Classification
Epidemiological and clinical features of 200 hospitalized patients with corona virus disease 2019 outside Wuhan, China: A descriptive study
A retrospective study of the C-Reactive protein to lymphocyte ratio and disease severity in 108 patients with early COVID-19 Pneumonia from
A retrospective study of risk factors for severe acute respiratory syndrome coronavirus 2 infections in hospitalized adult patients
Dynamic changes of D-dimer and neutrophil-lymphocyte count ratio as prognostic biomarkers in COVID-19
Epidemiological and clinical characteristics of 1663 hospitalized patients infected with COVID-19 in Wuhan, China: a single-center experience
Older Patients With SARS-Cov-2 Infection Admitted to Acute Care Geriatric Wards
A Novel Scoring System for Prediction of Disease

None.

The study was a systematic review and meta-regression analysis, so ethical approval was not needed.

Supplementary data to this article can be found online at https://.