key: cord-1019631-tk3861u0
authors: Zhao, Chun-Hong; Wu, Hui-Tao; Che, He-Bin; Song, Ya-Nan; Zhao, Yu-Zhuo; Li, Kai-Yuan; Xiao, Hong-Ju; Zhai, Yong-Zhi; Liu, Xin; Lu, Hong-Xi; Li, Tan-Shi
title: Prediction of fatal adverse prognosis in patients with fever-related diseases based on machine learning: A retrospective study
date: 2020-03-05
journal: Chin Med J (Engl)
DOI: 10.1097/cm9.0000000000000675
sha: 40f8166324b40954ba214d2108ef51b0fb7d756f
doc_id: 1019631
cord_uid: tk3861u0

BACKGROUND: Fever is the most common chief complaint of emergency patients. Early identification of patients at increased risk of death may avert adverse outcomes. The aim of this study was to establish an early prediction model of fatal adverse prognosis of fever patients by extracting key indicators using big data technology.
METHODS: A retrospective study of patients' data was conducted using the Emergency Rescue Database of Chinese People's Liberation Army General Hospital. Patients were divided into the fatal adverse prognosis group and the good prognosis group. The commonly used clinical indicators were compared. The recursive feature elimination (RFE) method was used to determine the optimal number of included variables. Logistic regression, random forest, AdaBoost and bagging were selected as training models. For validation, we also collected emergency room data from December 2018 to December 2019 with the same inclusion and exclusion criteria. The performance of the models was evaluated by accuracy, F1-score, precision, sensitivity and the area under the receiver operator characteristic curve (ROC-AUC).
RESULTS: The accuracy of logistic regression, decision tree, AdaBoost and bagging was 0.951, 0.928, 0.924, and 0.924, F1-scores were 0.938, 0.933, 0.930, and 0.930, the precision was 0.943, 0.938, 0.937, and 0.937, and ROC-AUC were 0.808, 0.738, 0.736, and 0.885, respectively. The ROC-AUC values under ten-fold cross-validation for the logistic regression and bagging models were 0.80 and 0.87, respectively. 
The six variables with the largest coefficients in the logistic regression, with their odds ratios (OR), were cardiac troponin T (CTnT) (coefficient = 0.346, OR = 1.413), temperature (T) (coefficient = 0.235, OR = 1.265), respiratory rate (RR) (coefficient = -0.206, OR = 0.814), serum potassium (K) (coefficient = 0.137, OR = 1.146), pulse oxygen saturation (SPO2) (coefficient = -0.101, OR = 0.904), and albumin (ALB) (coefficient = -0.043, OR = 0.958). The six variables with the largest weights in the bagging model were CTnT, RR, lactate dehydrogenase, serum amylase, heart rate, and systolic blood pressure.
CONCLUSIONS: The main clinical indicators of concern included CTnT, RR, SPO2, T, ALB and K. The bagging model and the logistic regression model had the best overall diagnostic performance. These findings may help physicians identify critical patients with fever early.

Fever is the most common chief complaint of emergency patients and is an important pathophysiological process and common symptom of many febrile diseases. [1, 2] In recent years, major public health events that manifest mainly as fever, such as severe acute respiratory syndrome, have attracted worldwide attention. [3] [4] [5] Fever may occur in sepsis and other infectious diseases, and may also be seen in many non-infectious diseases, such as malignancy, tissue ischemia, cerebrovascular accident, and autoimmune disease. [2, 6] Sometimes the cause of fever is difficult to diagnose, as in fever of unknown origin, and the patient's condition may deteriorate sharply. Early identification of patients at increased risk of death may avert adverse outcomes. Because of the complexity of fever-related illnesses, no biomarker can definitively diagnose sepsis or predict its clinical outcome. [7] General-purpose illness severity scoring systems such as the Acute Physiology and Chronic Health Evaluation II often contain too many complex items or are not specific to people with fever. 
[8, 9] With the continuous development of machine learning technology, [10, 11] a machine learning approach has outperformed existing clinical decision rules as well as traditional analytic techniques for predicting in-hospital mortality of emergency department (ED) patients with sepsis. [12] This study, using big data analysis technology, aimed to explore the key factors associated with adverse prognosis of patients with febrile illness, establish an effective model to predict fatal adverse prognosis in patients with febrile disease, and provide technical support for auxiliary clinical diagnosis and treatment decision-making.

This was a retrospective study of ED visits. Because the study was retrospective and the data were analyzed anonymously, it was exempt from ethical approval and from the requirement for informed consent from patients.

This study retrospectively analyzed the clinical data of 28,400 patients admitted to the emergency room from November 2014 to March 2018. Diagnostic criteria: fever was defined as axillary body temperature equal to or greater than 37.3°C. Inclusion criteria: fever (body temperature ≥37.3°C) and age ≥12 years. Exclusion criteria: patients who died shortly after admission (less than 4 h after admission) and failed to complete laboratory examination. The included patients were divided into the fatal adverse prognosis group and the good prognosis group. Definition: the adverse prognosis group included the patients who underwent cardiopulmonary resuscitation or died during emergency treatment, while the good prognosis group included those who did not die during treatment in the emergency room and did not undergo cardiopulmonary resuscitation, electric defibrillation, endotracheal intubation, tracheotomy or ventilator-assisted respiration.

All data elements for each ED visit were obtained from the Emergency Rescue Database of Chinese People's Liberation Army General Hospital. 
After the first measurement of body temperature greater than 37.2°C during the ED visit, vital sign values and laboratory test values were collected, but only the first set of data obtained or generated within 24 h of the ED visit was used as prediction variables. Structured query language queries were written to identify and abstract all demographic information (eg, age and sex) and ED health status (eg, vital signs and laboratory result values). The workflow comprised extraction of candidate variables (vital signs, routine blood tests, blood biochemistry, coagulation function, and arterial blood gas analysis), screening of key indicators, and construction of the prediction model.

The extracted data were cleaned according to the inclusion and exclusion criteria. Variables with excessive missing values were removed. Individuals whose data showed obvious errors or missing values were also removed. A baseline descriptive analysis of the remaining variables was performed, in which normally distributed variables are expressed as mean ± standard deviation and non-normally distributed variables are expressed as median (interquartile range). A hypothesis test for between-group differences was conducted on the variables. The t-test was used for numerical variables that did not violate the normality test and the test for homogeneity of variance, while the remaining numerical variables were tested by the Mann-Whitney U test. Differences in the variables of interest were assessed at a test level of α = 0.05.

For feature selection, the recursive feature elimination (RFE) method was used to determine the optimal number of included variables. The basic idea was that, given a specified external learning algorithm, the prediction accuracy of all subsets of variable combinations was calculated through RFE, and the size of the subset with the highest prediction accuracy was chosen as the optimal number. 
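The two-stage selection described here (first find the number of variables with the best accuracy, then rerun RFE with that number) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses a synthetic data set in place of the clinical table, and swaps in cross-validated RFE (`RFECV`), which scores the nested subsets produced by elimination rather than every possible combination. The sample size, fold count, and random seeds are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, RFECV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the clinical table (assumption: 300 patients, 12 candidate variables).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=4, n_redundant=2, random_state=0
)

# Stage 1: cross-validated RFE with a decision tree picks the subset size
# that gives the highest mean accuracy.
size_search = RFECV(
    estimator=DecisionTreeClassifier(random_state=0),
    step=1,
    cv=5,
    scoring="accuracy",
)
size_search.fit(X, y)
optimal_n = size_search.n_features_

# Stage 2: plain RFE is rerun with that number to fix the final predictor set.
selector = RFE(
    estimator=DecisionTreeClassifier(random_state=0),
    n_features_to_select=optimal_n,
)
selector.fit(X, y)
selected_columns = [i for i, keep in enumerate(selector.support_) if keep]

print("optimal number of variables:", optimal_n)
print("selected column indices:", selected_columns)
```

Passing a smaller `n_features_to_select` in the second stage gives a reduced variable set that can be compared with the first one.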
Then, the optimal number was used as the parameter of RFE to determine the predictors entering the final model. In this research, a decision tree was used as the learning algorithm for selecting the predictors. The Pearson correlation coefficient was also used to analyze the correlation among the factors in the model, and the results were shown as a heat map. To explore the most important factors, we reduced the optimal number to a smaller one and repeated the above process. The factors present in both selections were chosen for further discussion.

We compared the performance of logistic regression, random forest, AdaBoost and bagging with the same predictors selected above. Accuracy, F1-score, precision, sensitivity and the area under the receiver operator characteristic curve (ROC-AUC) were used as criteria for judging model performance. The cohort was split into training and testing sets at a ratio of 7:3. The logistic regression was trained with C = 0.01 and an L2 penalty. The base estimator of bagging was a decision tree classifier with the split criterion set to entropy. A total of 1000 trainings were performed. The better-performing models were then assessed by ten-fold cross-validation. For validation, we further collected emergency room data from December 2018 to December 2019 with the same inclusion and exclusion criteria. The performance was again evaluated by accuracy, F1-score, precision, sensitivity, and ROC-AUC.

The descriptive baseline analysis and hypothesis testing were done in IBM SPSS Statistics for Windows (Version 19.0. Armonk, NY: IBM Corp., USA), and the rest of the process was done in Python 3.6.1 (https://www.python.org).

The data of 5549 patients with fever were obtained, of whom 307 had a fatal adverse outcome event and 5242 had no adverse outcome event. The data contained a total of 70 variables. Any variable missing in more than 30% of the patients' records was removed, leaving 39 variables. 
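The 30% missing-value rule, followed by removal of patients with incomplete records, can be sketched in pandas as below. This is an illustrative sketch only: the function name `filter_missing`, the toy table, and its values are hypothetical and are not taken from the study database.

```python
import numpy as np
import pandas as pd


def filter_missing(df: pd.DataFrame, max_missing_fraction: float = 0.30) -> pd.DataFrame:
    """Drop variables missing in more than the given share of patients,
    then drop patients that still have any missing value."""
    missing_share = df.isna().mean()  # per-column fraction of missing values
    kept = df.loc[:, missing_share <= max_missing_fraction]
    return kept.dropna(axis=0, how="any")


# Toy table with made-up values (assumption): five patients, three variables.
toy = pd.DataFrame(
    {
        "T": [38.1, 37.9, 39.0, 38.4, 37.5],
        "RR": [20, np.nan, 24, 22, 18],                # 20% missing: column kept
        "LIP": [np.nan, np.nan, 55.0, np.nan, 40.0],   # 60% missing: column dropped
    }
)
cleaned = filter_missing(toy)
print(cleaned)
```

In this toy case the `LIP` column is removed by the column rule and the one patient without an `RR` value is removed by the row rule, so four patients and two variables remain.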
After deleting individuals with missing data, a total of 3474 patients without adverse outcome events (the control group) and 208 patients with fatal adverse outcome events were analyzed. The baseline statistics of the remaining 39 variables, and the variables that differed significantly between the groups in hypothesis testing, are shown in Table 1. Results in Table 1 are expressed as the median (interquartile range) or mean ± standard deviation.

When the optimal number was selected by the RFE method with a decision tree model, the highest accuracy was obtained with 15 factors in the model, so 15 was taken as the parameter of RFE to select the specific factors. The 15 factors obtained through these processes were: heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), diastolic blood pressure (DBP), pulse oxygen saturation (SPO2), temperature (T), creatine kinase myocardial isoenzyme (CK-MB), total bilirubin (TBIL), lactate dehydrogenase (LDH), serum amylase (AMY), serum lipase (LIP), cardiac troponin T (CTnT), serum potassium (K), total protein (TP), and albumin (ALB). After that, the Pearson correlation test was done, with the details shown in Figure 1.

Model performance is shown in Table 2 and Figure 2. Ten-fold cross-validation was performed on the logistic regression and bagging models, which had the better overall performance, and their ROC-AUC values were 0.80 and 0.87, respectively. In the validation part, the decision tree model had the highest accuracy and F1-score, while the bagging model had the highest sensitivity and ROC-AUC. The coefficients and OR values of the variables in the logistic regression are shown in Table 3.

To explore the most important factors, we reduced the optimal number of RFE to 11 and repeated the above process. Similarly, the bagging model showed the best performance on sensitivity and ROC-AUC, while the logistic regression showed the best accuracy. The details are shown in Table 2.
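The model comparison reported above can be sketched as follows for the two better-performing models. This is a minimal sketch under stated assumptions, not the authors' pipeline: it assumes scikit-learn, uses a synthetic imbalanced data set in place of the cohort, and the stratified split, `max_iter`, `n_estimators=50`, and the random seeds are illustrative choices that the paper does not specify. The settings taken from the methods are the 7:3 split, C = 0.01 with an L2 penalty, the entropy-based decision tree as bagging base estimator, and ten-fold cross-validation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the cohort (assumption, not the study data).
X, y = make_classification(n_samples=600, n_features=15, n_informative=6,
                           weights=[0.9, 0.1], random_state=0)

# 7:3 split as in the methods; stratifying on the outcome is an added assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "logistic": LogisticRegression(C=0.01, penalty="l2", max_iter=1000),
    # Base estimator passed positionally: a decision tree with entropy criterion.
    "bagging": BaggingClassifier(DecisionTreeClassifier(criterion="entropy"),
                                 n_estimators=50, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    prob = model.predict_proba(X_test)[:, 1]  # probability of the adverse class
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred, zero_division=0),
        "precision": precision_score(y_test, pred, zero_division=0),
        "sensitivity": recall_score(y_test, pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, prob),
        # Ten-fold cross-validated ROC-AUC on the full synthetic set.
        "cv_roc_auc": cross_val_score(model, X, y, cv=10, scoring="roc_auc").mean(),
    }

# Odds ratios are the exponentiated logistic regression coefficients.
odds_ratios = np.exp(models["logistic"].coef_[0])

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
print("odds ratios:", np.round(odds_ratios, 3))
```

The last step reflects the relation OR = exp(coefficient): for example, the reported CTnT coefficient of 0.346 gives exp(0.346) ≈ 1.413, which matches the reported OR.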
Modern medicine often involves collecting large amounts of physiological data, laboratory results and imaging data into electronic records. The data, however, are complex and multidimensional. It is difficult to find subtle relationships between these data and clinical outcomes using traditional statistical techniques. In this study, the advantage of machine learning is that it may provide innovative solutions for clinical uncertainties, find important indicators that clinicians may overlook during treatment, and provide guidance for clinical decision-making to prevent adverse prognosis.

We selected 70 clinical variables associated with "fever" as candidate indicators. In the model with 15 impact factors (HR, RR, SBP, DBP, SPO2, T, CK-MB, TBIL, LDH, AMY, LIP, CTnT, K, TP, and ALB) and in the model with 11 impact factors (RR, SPO2, T, Na, CL, CTnT, K, ALB, Ca, P, and RBC), the factors present in both were RR, SPO2, T, CTnT, K, and ALB. These factors could be the main clinical indicators of concern in this experiment and should be given clinical attention.

Our results suggest that CTnT has the largest weight in the prediction model of adverse prognosis in patients with fever. The serum levels of CK, CK-MB, CTnT, and brain natriuretic peptide precursor were higher in the adverse prognosis group than in the good prognosis group. Further logistic regression analysis showed that serum CK-MB and CTnT levels were independent risk factors for poor prognosis in patients with fever, which should be given clinical attention. Troponin T is one of the clinically recognized markers of myocardial injury. An increase in its level reflects increased severity of myocardial injury, with high specificity. [13] [14] [15] Previous studies have suggested that myocardial injury, as reflected by elevated cardiac troponin levels in plasma, is common in patients with community-acquired pneumonia. 
[16] The main reason for poor prognosis in patients with severe sepsis is that patients often have cardiac dysfunction. Once the cardiovascular system is damaged, the death rate of patients increases greatly. [17]

In this study, ALB was found to be a key factor in predicting death or poor prognosis in patients with fever. Hoeboer et al [18] concluded that ALB rather than C-reactive protein may be valuable in predicting and monitoring the severity and course of acute respiratory distress syndrome in critically ill patients with or at risk for the syndrome after new onset fever. Therefore, active correction of hypoproteinemia in patients with fever may improve prognosis.

This study suggested that the vital signs of body temperature, HR, RR, SBP, and DBP had important clinical value in predicting fatal adverse prognosis in patients with fever, and the sum of their weights was 0.3186. It also showed that SPO2 was one of the independent protective factors for poor prognosis in patients with fever. When HR, RR, and SBP increase in patients with fever, SPO2 should be measured promptly and improved by early non-invasive or invasive mechanical ventilation, so as to reasonably shorten the mechanical ventilation time and reduce mortality. Single-factor analysis showed that serum sodium in the adverse prognosis group was higher than in the good prognosis group, suggesting that patients with severe fever should receive timely fluid supplementation to correct hypernatremia caused by dehydration and other reasons. In addition, this study suggested that serum K, LDH, AMY, TP, and TBIL were independent factors affecting poor prognosis in patients with fever. In patients with fever, attention should be paid to changes in these indicators.

In the methodological part, when 15 variables were selected, the best results were obtained from the logistic regression model and the bagging model. 
Logistic regression had the highest accuracy (0.951), while the bagging model had the highest AUC score (0.885). For clinical application, the more clearly a model focuses on a small set of indicators, the better it can help doctors make decisions. In addition, if more indicators are involved, it is not always possible to ensure that every factor can be obtained in a short time. Therefore, when the performance of the model does not change much, we believe that the model with fewer factors is more useful in the clinic. The model may be used to predict, verify and improve future clinical medical practices.

We need to admit that our study has two limitations. First, when we selected the cohort, we did not separate patients with different causes of fever into different groups, so we cannot report accuracy separately for different causes of fever. This can serve as a direction for our future research. Second, as this was a retrospective study, there were many deficiencies in the data and the data volume was not ideal. We hope to have more data in the future.

In conclusion, a big data analysis method was adopted in this study to establish a scientific and objective prediction and evaluation model for adverse prognosis of patients with fever. We found the main clinical indicators of concern, and the prediction model established had high diagnostic accuracy and reliability, which may help physicians identify critical patients with fever early and thus improve the prognosis of patients with fever. The application of big data analysis combined with medical research is helpful to improve the diagnosis and treatment of febrile critical diseases and the prevention and control of infectious diseases. Research on adverse event prediction models for critical patients with fever quantifies the recognition of critical diseases related to fever and provides a reference model for other similar clinical decision support studies. 
This study was supported by grants from the Big Data R&D Project of the People's Liberation Army General Hospital (No. 2017MBD-30), and the Science and Technology Innovation Nursery Fund Project of the People's Liberation Army General Hospital (No. 17KMM50).

None.

The pathophysiological basis and consequences of fever
Another nightmare after SARS: knowledge perceptions of and overcoming strategies for H1N1 influenza among chronic renal disease patients in Hong Kong
From SARS to avian influenza: the role of international factors in China's approach to infectious disease control
Fifteen years post-SARS: key milestones in Canada's public health emergency response
Brief report: incidence, etiology, risk factors, and outcome of hospital-acquired fever: a systematic, evidence-based review
Sepsis and septic shock
The ability of the National Early Warning Score (NEWS) to discriminate patients at risk of early cardiac arrest, unanticipated intensive care unit admission, and death
A targeted real-time early warning score (TREWScore) for septic shock
Accessing the public MIMIC-II intensive care relational database for clinical research
Prolonged elevated heart rate and 90-day survival in acutely ill patients: data from the MIMIC-III database
Prediction of in-hospital mortality in emergency department patients with sepsis: a local big data-driven, machine learning approach
Mortality in sepsis: comparison of outcomes between patients with demand ischemia, acute myocardial infarction, and neither demand ischemia nor acute myocardial infarction
Sequential N-terminal pro-B-type natriuretic peptide and high-sensitivity cardiac troponin measurements during albumin replacement in patients with severe sepsis or septic shock
Earliest bedside assessment of hemodynamic parameters and cardiac biomarkers: their role as predictors of adverse outcome in patients with septic shock
Myocardial injury in critically ill patients with community-acquired pneumonia: a cohort study
Acute and chronic remote ischemic conditioning attenuate septic cardiomyopathy, improve cardiac output, protect systemic organs, and improve mortality in a lipopolysaccharide-induced sepsis model
Albumin rather than C-reactive protein may be valuable in predicting and monitoring the severity and course of acute respiratory distress syndrome in critically ill patients with or at risk for the syndrome after new onset fever