key: cord-0073105-oljmt89i
authors: Ma, Rui Na; He, Yi Xuan; Bai, Fu Ping; Song, Zhi Peng; Chen, Ming Sheng; Li, Min
title: Machine Learning Model for Predicting Acute Respiratory Failure in Individuals With Moderate-to-Severe Traumatic Brain Injury
date: 2021-12-24
journal: Front Med (Lausanne)
DOI: 10.3389/fmed.2021.793230
sha: 4be4ff66eeb3e5001c95604b33f53b198aa370cb
doc_id: 73105
cord_uid: oljmt89i

Background: There is a high incidence of acute respiratory failure (ARF) in moderate or severe traumatic brain injury (M-STBI), which worsens outcomes. This study aimed to design a predictive model for ARF.
Methods: Adult patients with M-STBI [3 ≤ Glasgow Coma Scale (GCS) ≤ 12] with a definite history of brain trauma and abnormal findings on head CT images, obtained from September 2015 to May 2017, were included. Patients aged >80 years or <18 years, those with multiple injuries accompanying TBI upon admission, and pregnant women were excluded. Two models, based on machine learning extreme gradient boosting (XGBoost) and logistic regression, respectively, were developed for predicting ARF within 48 h of admission. These models were evaluated by out-of-sample validation. The samples were assigned to the training and test sets at a ratio of 3:1.
Results: In total, 312 patients were analyzed, including 132 (42.3%) patients who had ARF. The GCS, the Marshall CT score, procalcitonin (PCT), and C-reactive protein (CRP) on admission significantly predicted ARF. The novel machine learning XGBoost model was superior to the logistic regression model in predicting ARF [area under the receiver operating characteristic (AUROC) = 0.903, 95% CI, 0.834–0.966 vs. AUROC = 0.798, 95% CI, 0.697–0.899; p < 0.05].
Conclusion: The XGBoost model predicted ARF better than the logistic regression-based model. Therefore, machine learning methods could help to develop and validate novel predictive models.
Acute respiratory failure (ARF) is a common pathophysiological consequence of pulmonary complications [pneumonia, neurogenic pulmonary edema, and acute respiratory distress syndrome (ARDS)] in moderate or severe traumatic brain injury (M-STBI). It not only worsens outcomes but also prolongs intensive care unit (ICU) and hospital stays and increases the cost of hospital care (1-7). Consequently, accurately predicting ARF risk may help to identify cases requiring intensive airway management. This would help to allocate resources efficiently and reduce morbidity by appropriately monitoring patients at risk.

With the rapid development of software, machine learning algorithms are increasingly used. In particular, machine learning methods have been applied in medicine with excellent results, deriving predictive algorithms for multiple conditions (8-15). While traditional predictive models employ a limited set of selected parameters, machine learning methods can readily include multiple clinical parameters (16). Although previous studies have developed predictive scoring systems or risk calculators for pulmonary complications (3, 5, 9, 13, 17), studies assessing RF feature selection and machine learning algorithms remain rare in the M-STBI population. We hypothesized that supervised machine learning could yield models that predict ARF occurrence after M-STBI better than routine statistical models. Therefore, this study aimed to develop and validate an ARF predictive model based on the machine learning algorithm extreme gradient boosting (XGBoost) and to compare its effectiveness with that of a conventional logistic regression model.
Model development and internal validation were based on a large TBI database, which consists of data from patients admitted to the Department of Neurosurgery of the Second Affiliated Hospital of Fourth Military Medical University, China, from September 2015 to May 2017. The study was approved by the Institutional Ethics Board of the Second Affiliated Hospital of Fourth Military Medical University (TDLL-KY-202110-09), and data reporting followed the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement (18).

Adult patients with M-STBI [3 ≤ Glasgow Coma Scale (GCS) ≤ 12] with a definite history of brain trauma and abnormal findings on head CT images, acquired from September 2015 to May 2017, were included in this study. Patients aged >80 years or <18 years, those with multiple injuries accompanying TBI upon admission, and pregnant women were excluded.

The medical records of the patients were carefully collected by three authors on separate occasions. Demographic parameters, clinical and laboratory variables, comorbidities, imaging features, and outcome variables were recorded. All patients with M-STBI underwent arterial blood gas (ABG) analysis on the day of admission; ABG analysis was repeated if oxygen saturation (SpO2) remained <93% under nasal catheter or mask oxygen inhalation for at least 5 min after suctioning of oropharyngeal secretions.

The primary endpoint of this study was ARF within 48 h of admission, which was defined as respiratory failure with partial pressure of oxygen (pO2) <60 mm Hg and respiratory rate >30 breaths/min or respiratory distress for at least 5 min (19). Clinical and laboratory parameters recorded in the initial 48 h after ICU admission were examined for their ability to predict ARF. For parameters measured repeatedly, both the maximum and minimum values were examined.
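The handling of repeatedly measured parameters described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `summarize_repeated`, the variable names, and the patient values are all hypothetical, and the sketch only assumes that each parameter arrives as a list of readings in which unrecorded readings are `None`.

```python
from typing import Dict, List, Optional


def summarize_repeated(
    measurements: Dict[str, List[Optional[float]]]
) -> Dict[str, Optional[float]]:
    """Collapse repeated readings into <name>_min and <name>_max features.

    Readings recorded as None are ignored; a parameter with no observed
    reading yields None for both features.
    """
    features: Dict[str, Optional[float]] = {}
    for name, values in measurements.items():
        observed = [v for v in values if v is not None]
        features[f"{name}_min"] = min(observed) if observed else None
        features[f"{name}_max"] = max(observed) if observed else None
    return features


# Hypothetical readings for one patient during the observation window.
patient = {
    "gcs": [9, 7, 8],
    "crp_mg_l": [12.0, None, 48.5],
    "pct": [],  # never measured
}
features = summarize_repeated(patient)
print(features)
```

Keeping both extremes lets a model use whichever end of the range is clinically informative, for example the lowest GCS and the highest CRP.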
Age, gender, GCS, comorbidity, and imaging features including the Marshall CT score and severity scores of lung exudations (see Tables S1, S2 for the details of the scores) were analyzed. In addition, laboratory data such as white blood cell (WBC) and neutrophil counts, neutrophil-lymphocyte ratio, C-reactive protein (CRP), and procalcitonin (PCT) were included. In terms of therapy, long-term sedation (sedation duration >48 h) was examined. For predictor selection, the Akaike Information Criterion (AIC) was used to minimize possible collinearity among parameters from a given patient as well as overfitting (20).

This was a hypothesis-generating retrospective study. No sample size estimation was performed; instead, all eligible patients in the database were included to maximize statistical power.

We aimed to reflect daily clinical routine, in which not all data are always obtainable. To keep our algorithms and study realistic, we decided not to correct for missing data, e.g., by imputation techniques, and to perform the analysis using the available data only. While imputing missing variables has many merits in conventional statistics, it is less preferred in machine learning because it does not reflect the observed reality (it is at best a close approximation) and adds artificially introduced noise to the data. Moreover, there could be significant reasons why some data are missing, and these reasons could be linked to the outcome variable of interest. In such cases (and in a number of other scenarios), imputation obscures important relationships in the observed data or introduces artificial relationships altogether, which decreases the value of the complex pattern recognition used in machine learning. For variables with missing values, we coded the missing value as zero and created a corresponding missing-indicator dummy variable (12).

Continuous and categorical data were presented as median [interquartile range (IQR)] and number (percentage), respectively.
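The missing-value coding described above (zero in place of the missing value, plus a missing dummy) can be sketched as follows. This is an illustrative assumption about how such coding might look, not the authors' implementation; the function name `code_missing`, the `_missing` suffix, and the example values are hypothetical.

```python
from typing import Dict, List, Optional


def code_missing(
    rows: List[Dict[str, Optional[float]]], variables: List[str]
) -> List[Dict[str, float]]:
    """Replace missing values with 0 and add a <var>_missing indicator.

    A value counts as missing when the key is absent or its value is None.
    """
    coded = []
    for row in rows:
        out: Dict[str, float] = {}
        for var in variables:
            value = row.get(var)
            is_missing = value is None
            out[var] = 0.0 if is_missing else float(value)
            out[f"{var}_missing"] = 1.0 if is_missing else 0.0
        coded.append(out)
    return coded


# Hypothetical laboratory values; PCT was not measured for the second patient.
rows = [
    {"pct": 2.5, "crp": 57.0},
    {"pct": None, "crp": 23.5},
]
coded = code_missing(rows, ["pct", "crp"])
for record in coded:
    print(record)
```

The indicator column is what allows a model to tell a genuinely measured value of zero apart from a value that was never recorded.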
Demographic characteristics of participants with and without ARF were compared using the Mann-Whitney U test or the chi-squared test.

The primary model of this study was the XGBoost gradient boosted tree model. XGBoost is a tree ensemble technique that builds progressively on the loss from weak decision tree base learners. It can learn rapidly and effectively from large amounts of data, and its flexibility allows learning even from missing data (21). After tuning, the final parameters of the XGBoost model were max_depth = 7, subsample = 0.94, colsample_bytree = 0.83, nrounds = 100, learning rate (eta value) = 0.3, and gamma = 5. For comparison, a second model for predicting ARF occurrence was developed using multivariate logistic regression. Model discrimination was assessed using the area under the receiver operating characteristic (AUROC) curve, and the optimal cutoff value was calculated using the Youden index. The confusion matrices of the two models were created based on the optimal cutoff values to evaluate accuracy, sensitivity, and specificity. EmpowerStats (X&Y Solutions, Inc., Boston, MA, USA) and R version 3.4.2 (http://www.R-project.org) were used for data analysis. p < 0.05 was considered statistically significant.

In total, 312 patients were analyzed, including 132 (42.3%) patients who had ARF (Figure 1). Characteristics of the patients are given in detail in Table 1. The ARF group included more individuals with a smoking history (37.12 vs. 26.67%; p = 0.049) and a history of chronic obstructive pulmonary disease (COPD) (5.30 vs. 1.11%; p = 0.029) prior to ICU admission than the non-ARF group. Upon admission, the minimum GCS values (6.57 ± 2.68 vs. 8.63 ± 3.27; p < 0.001) were lower, while the Marshall CT scores (5.50 ± 0.95 vs. 4.70 ± 1.39; p < 0.001) and severity scores of bilateral lung exudations (83.33 vs. 66.67%; p = 0.004) were higher in ARF cases. ARF cases also showed elevated white blood cell count (14.87 ± 7.14 vs. 10.96 ± 5.16; p < 0.001), elevated neutrophil count (85.06 ± 9.47 vs. 78.27 ± 12.37; p < 0.001), lower neutrophil-lymphocyte ratio (5.66 ± 5.83 vs. 9.78 ± 10.11; p < 0.001), and higher CRP (57.10 ± 59.85 vs. 23.51 ± 31.19 mg/l; p < 0.001) and PCT (2.54 ± 6.09 vs. 0.42 ± 1.13; p = 0.002) compared with the non-ARF group (Table 1).

In the training set, XGBoost had an AUROC of 0.84, with sensitivity and specificity of 0.71 and 0.84, respectively. Its precision was 0.78 (95% CI: 0.72-0.83). The error rate was 0.22, indicating correct prediction in roughly 78% of patients (Table 2). In the test population, XGBoost had an AUROC of 0.90, with specificity and sensitivity of 0.85 and 0.78, respectively, indicating correct prediction of 29 of the 37 ARF cases in the test set. Meanwhile, 8 cases were incorrectly predicted [reflecting a precision of 0.82 (0.72, 0.90)]. The model had an error rate of 0.18, indicating correct outcome prediction in >81% of cases (Table 2).

Variables showing high predictive value were the GCS, the Marshall CT score, PCT, and CRP on admission. The GCS was the central factor of the XGBoost model because its gain was the highest among all the variables (Figure 2). Other variables, e.g., long-term sedation and smoking history, had low predictive power (Figure 2).

Baseline parameters for the ARF and non-ARF groups are shown in Table 1. Smoking and COPD history, the GCS and the Marshall CT score on admission, severity scores of lung exudations, long-term sedation, neutrophil count, WBC, neutrophil-lymphocyte ratio, PCT, and CRP showed associations with ARF occurrence in the univariate analysis (p < 0.05, Table 1).
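The test-set figures reported above for XGBoost follow from a confusion matrix, and the arithmetic can be sketched as below. The 29 correctly predicted and 8 missed ARF cases come from the text; the counts for non-ARF patients (6 false positives, 35 true negatives) are not reported and are assumed here only because they are consistent with the stated specificity, precision, and error rate. The function name `confusion_metrics` is hypothetical.

```python
from typing import Dict


def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Derive standard classification metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),   # share of ARF cases detected
        "specificity": tn / (tn + fp),   # share of non-ARF cases cleared
        "precision": tp / (tp + fp),     # share of ARF predictions that are right
        "error_rate": (fp + fn) / total,
        "accuracy": (tp + tn) / total,
    }


# tp and fn are taken from the text; fp and tn are ASSUMED for illustration.
metrics = confusion_metrics(tp=29, fn=8, fp=6, tn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, sensitivity is 29/37 (about 0.78), specificity is 35/41 (about 0.85), precision is 29/35 (about 0.83), and the error rate is 14/78 (about 0.18), which agrees closely with the reported values.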
In the stepwise multivariate logistic regression analysis, bilateral lung exudations [odds ratio (OR), 3.435; 95% CI, 1.248-9.456], the Marshall CT score (OR for each 1-point increase, 1.078; 95% CI, 1.012-1.148), long-term sedation, increased WBC (OR for each 1 × 10⁹/L increase, 1.076; 95% CI, 1.181-2.463), and CRP (OR for each 1 mg/l increase, 1.014; 95% CI, 1.004-1.025) were associated with an increased probability of ARF. In contrast, the GCS (OR for each 1-point increase, 0.788; 95% CI, 0.681-0.913) was associated with a decreased probability of ARF (Table 2). The multivariate regression model was built on the AIC-selected variables. It showed an AUROC of 0.943 in the training cohort, with a specificity of 0.946 and a sensitivity of 0.837 (Table 3). Its error rate was 11.6%. In the test population, the AUROC was 0.792, and specificity and sensitivity were 0.913 and 0.667, respectively; the error rate was approximately 15.6% (Table 3).

AUROCs were determined to assess the discriminative abilities of both models. XGBoost showed a higher AUROC than the logistic regression model (AUROC, 0.902; 95% CI, 0.834-0.966 vs. 0.789; 95% CI, 0.688-0.891; p < 0.05; Figure 3). Tables 3, 4 describe the classification and confusion matrices for both models in predicting ARF.

Prediction and timely detection of ARF in patients with M-STBI are critical and strongly affect M-STBI outcome (22, 23). This study developed a machine learning-based model to predict ARF occurrence in M-STBI, with several notable features. First, the model included readily available and reproducible parameters from the initial 48 h after admission. Second, after analysis of multiple interaction patterns among variables, admission-related parameters (the GCS, the Marshall CT score, CRP, PCT, and long-term sedation; Figure 2) were the predominant determinants of ARF occurrence.
Third, the novel model outperformed the conventional logistic regression model. This study is the first to investigate ARF prediction in patients with M-STBI using machine learning methods. The new model had accuracy and AUROC of 0.83 and 0.90, respectively. Most importantly, sensitivity and specificity of 0.73 and 0.91, respectively, were obtained in the test cohort.

First, accurate detection of ARF in critically ill individuals with M-STBI is essential for performing intensive airway management and for making decisions about invasive treatments such as tracheal intubation, invasive mechanical ventilation, and even tracheostomy. To date, reliable tools for timely prediction of ARF in M-STBI are lacking. In this study, we demonstrated that advanced machine learning methods, including XGBoost, could draw on the wealth of information obtainable from databases and promote the development and validation of better predictive models than conventional logistic regression techniques. The new model could help to stratify M-STBI cases right after ICU admission. Therefore, intensive airway management or invasive treatment could be provided more accurately to individuals with high odds of developing ARF, avoiding long-term hypoxia, which is associated with increased morbidity and mortality in patients with M-STBI (24, 25). On the other hand, intensive airway management requires substantial human and material resources, while invasive treatment is associated with complications and high medical costs. Thus, identifying individuals who could benefit from intensive airway management or invasive treatment is critical. However, this analysis does not provide high-level evidence for the effectiveness of XGBoost. Further randomized controlled trials comparing therapies guided by the predictive model with therapies independent of it are needed to examine its effectiveness comprehensively.
Second, we aimed to design a model that could be implemented easily by neurosurgery residents and staff alike. Therefore, parameters that are easily available and reproducible upon admission were required, and quantitative (blood test results, the GCS score, the Marshall CT score, etc.) and dichotomous (long-term sedation or not, smoking status, etc.) variables were selected.

Third, the XGBoost model showed that the GCS score, PCT, the Marshall CT score, CRP, and long-term sedation potentially predicted ARF in patients with M-STBI. Consistent with previous reports, the GCS score, PCT, and CRP were related to ARF in patients with M-STBI, reflecting the extent of TBI and the severity of systemic inflammation (26-29). The GCS was the central factor in the XGBoost model (Figure 2), suggesting that the severity of brain injury was significantly associated with ARF in patients with M-STBI. These results agree well with clinical experience. However, to the best of our knowledge, the association between the Marshall CT score and ARF has not previously been confirmed. This study suggested that the Marshall CT score potentially predicted ARF. A possible explanation is that the Marshall CT score reflects the extent of brain injury on neuroimaging, so a high score is associated with injury to the brainstem respiratory centers or with intracranial hypertension, either of which readily causes ARF. Moreover, both the logistic and XGBoost models showed that sedation for more than 48 h was related to ARF. This could be explained by the fact that sedation is an important tool for reducing intracranial pressure and cannot be stopped until intracranial pressure returns to normal. Intracranial hypertension and respiratory depression caused by sedative drugs contribute to ARF (30, 31).

This study had many strengths. XGBoost modeling represents a new method not yet applied in respiratory failure studies of neurocritical care patients.
XGBoost modeling can learn swiftly and efficiently from large amounts of data, and its high flexibility enables learning even from missing data (21). The XGBoost model had markedly higher predictive accuracy than the generalized linear model, being capable of capturing complex associations in the data without requiring explicit high-order interactions and non-linear functions (12). Using such features, predictive models based on clinical and laboratory variables that are easily available and reproducible upon admission could be built.

However, there were also limitations. First, as this was a hypothesis-generating study, external validation of the XGBoost model is important for confirming its usefulness. The XGBoost model developed in this study will be applied to the Medical Information Mart for Intensive Care (MIMIC)-IV for external validation in the next study. Second, because this was a retrospective study, missing data were inevitable. Variables with >70% missing values were excluded from model construction. Thus, the sample sizes of the training (n = 86) and test (n = 32) sets were low, especially for the logistic regression model. To some extent, missing data decreased the performance of the model. Third, this study only explored ARF within 48 h of admission; other time intervals (e.g., >48 h following admission) were not studied.

In total, six major parameters related to ARF were screened to develop the XGBoost model, which showed enhanced predictive value for ARF compared with the logistic regression model in patients with M-STBI.

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

The studies involving human participants were reviewed and approved by the Institutional Review Board of the Second Affiliated Hospital, Fourth Military Medical University.
Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

ML and RM contributed to the study conception, design, and manuscript drafting. YH, MC, FB, and ZS contributed to the acquisition of data and analysis and interpretation of data. All authors approved the final version of the manuscript.

1. Association of postoperative complications with hospital costs and length of stay in a tertiary care center
2. Development and validation of a risk calculator predicting postoperative respiratory failure
3. Development and validation of a score to predict postoperative respiratory failure in a multicentre European cohort: a prospective, observational study
4. Predicting primary postoperative pulmonary complications in patients undergoing minimally invasive surgery for colorectal cancer
5. Risk factors predicting prognosis and outcome of elderly patients with isolated traumatic brain injury
6. Impact of grouping complications on mortality in traumatic brain injury: a nationwide population-based study
7. Demographic and clinical risk factors associated with hospital mortality after isolated severe traumatic brain injury: a cohort study
8. Acute graft-versus-host disease following orthotopic liver transplantation: predicting this rare complication using machine learning
9. Development and validation of a machine learning model to predict near-term risk of iatrogenic hypoglycemia in hospitalized patients
10. Early detection of heart failure with reduced ejection fraction using perioperative data among noncardiac surgical patients: a machine-learning approach
11. Machine learning applied to registry data: development of a patient-specific prediction model for blood transfusion requirements during craniofacial surgery using the pediatric craniofacial perioperative registry dataset
12. Machine learning for the prediction of volume responsiveness in patients with oliguric acute kidney injury in critical care
13. Machine learning model to predict ventilator associated pneumonia in patients with traumatic brain injury: the C.5 decision tree approach
14. AME Big-Data Clinical Trial Collaborative Group. Predictive analytics with gradient boosting in clinical medicine
15. Machine learning algorithm identifies patients at high risk for early complications after intracranial tumor surgery: registry-based cohort study
16. Deep learning
17. A ventilator-associated pneumonia prediction model in patients with acute respiratory distress syndrome
18. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement
19. Development and validation of a prediction model for severe respiratory failure in hospitalized patients with SARS-CoV-2 infection: a multicentre cohort study (PREDI-CO study)
20. Variable selection with stepwise and best subset approaches
21. Supervised machine learning for the early prediction of acute respiratory distress syndrome
22. Outcomes and mortality prediction model of critically ill adults with acute respiratory failure and interstitial lung disease
23. Relevance of lung ultrasound in the diagnosis of acute respiratory failure: the BLUE protocol
24. Post-traumatic hypoxia exacerbates neuronal cell death in the hippocampus
25. Brain Oxygen optimization in severe traumatic brain injury phase-II: a phase II randomized trial
26. Prediction of long-term ventilatory support in trauma patients
27. The need for ICU admission in intoxicated patients: a prediction model
28. Non-invasive ventilation for acute hypercapnic respiratory failure in older patients
29. Chitinase-3-like protein 1, serum amyloid A1, C-reactive protein, and procalcitonin are promising biomarkers for intracranial severity assessment of traumatic brain injury: relationship with Glasgow Coma Scale and computed tomography volumetry
30. A one-day prospective national observational study on sedation-analgesia of patients with brain injury in French Intensive Care Units: the SEDA-BIP-ICU (Sedation-Analgesia in Brain Injury Patient in ICU) Study. Neurocrit Care
31. Clinical features and outcome of patients with primary central nervous system lymphoma admitted to the intensive care unit: a French national expert center experience