key: cord-0024393-te39f578 authors: Xu, Jianqiao; Chen, Yongqiang; Wang, Jiang; Yang, Guixiu; Yan, Peng; Duan, Zhimei; Xie, Lixin; Mo, Guoxin title: There Is a Third Condition Aside from Survival and Death That Affects Outcome Statistics: Terminal Discharge date: 2021-08-03 journal: Iran J Public Health DOI: 10.18502/ijph.v50i8.6809 sha: 3a744d66de9628b5f2e817fd4c0f313daa792d71 doc_id: 24393 cord_uid: te39f578 BACKGROUND: Some patients discharged automatically are classified as terminal discharge, while their clinical outcome is survival, disrupting the results of clinical research. METHODS: The data of this study were taken from inpatients admitted to the ICU of the First Medical Center of the People’s Liberation Army General Hospital, Beijing, China from 2008–2017. We collected the data regarding medications used over the three days before discharge from the group of patients who survived and the group of patients who died, and the outcomes of all patients were recalculated by three classification algorithms (AdaBoosting, Pearson correlation coefficient, observed to expected ratio-weighted cosine similarity). Our basic assumption is that if the classification result is death but the actual in-hospital outcome is survival, the associated patient was likely terminally discharged. RESULTS: The coincidence rate of the outcomes calculated by the AdaBoosting algorithm was 98.1%, the coincidence rate calculated by the Pearson correlation coefficient was 61.1%, and the coincidence rate calculated by the observed to expected ratio-weighted cosine similarity was 93.4%. When the three classification methods were combined, the accuracy reached 98.56%. CONCLUSION: The combination of clinical rules and classification methods has a synergistic effect on judgments of patients’ discharge outcomes, greatly saving time on manual retrieval and reducing the negative influence of statistics or rules. 
Automatic discharge refers to a patient's ability to self-discharge from hospitalization, regardless of whether the physician knows or agrees. Studies have defined this situation as a type of discharge against medical advice (DAMA). Both automatic discharge and DAMA have been investigated in many studies of the types, causes, and trends of discharge and their impacts on patient outcomes and hospital strategies (1) (2) (3). In China, many patients want to be at home for their last moments of life. Therefore, some patients who are automatically discharged are terminally discharged, which means that the patients are in critical condition and are likely to pass away within a few hours or days of discharge (1). The rate of terminal discharge in the ICU (intensive care unit) is approximately 2.87-8.9%. Data analysis from a retrospective study indicated that the clinical outcomes of terminally discharged patients were recorded as survival rather than death, despite their being critically ill before discharge (4). If the diagnosis and treatment data of these patients were included in the survival group for analysis, the results would be altered. In particular, most ICU patients are critically ill, and the proportion of terminally discharged patients is higher in the ICU than in general departments (1). This causes greater interference in the survival group data, which could affect the results of the data analysis. Therefore, such patients should be identified in advance, and their medical data should be treated accordingly.

We encountered a problem while implementing pattern recognition for terminal discharge in a 10-year ICU data review study. It is not difficult to identify patients who are discharged from the hospital in critical condition, because the patient's condition is generally recorded in the informed consent section of the discharge note.
The difficulty arises because the early hospital information management system (Hospital Information System, HIS) stored medical histories as pictures, which cannot support text searching. It would take too much time to check the pictures of the informed consent forms and identify the target patients. To perform the data analysis efficiently, we considered combining machine learning (5) and clinical rules to identify patients who were discharged while critically ill. We collected and calculated the characteristics of the data regarding medications used over the three days before discharge from the group of patients who survived and the group of patients who died, and the outcomes of all patients were recalculated by the classification algorithm. Our basic assumption is that if the classification result is death but the actual in-hospital outcome is survival, the patient was likely terminally discharged from the hospital. Restricting manual inspection of the medical history data to this group of patients can reduce the time needed for manual verification. We attempted to combine a machine learning-based classification algorithm and clinical rules to efficiently reclassify the clinical outcomes of patients in the historical medical records, which can reduce physician workloads and benefit retrospective research on clinical data.

The data of this study were taken from 8,884 inpatients admitted to the ICU of the First Medical Center of the People's Liberation Army General Hospital from January 2008 to June 2017. Of these patients, 1,128 died in the hospital. A total of 23,902 pieces of hospitalization medical information were collected from all hospitalization data. The drug codes were adapted to the hospital codes for 2,570 items. The protocol was approved by the Hospital Committee on Ethics of the Chinese PLA hospital (S2020-141-01).
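The screening assumption stated above (classified as death, recorded as survival) can be sketched as a simple filter. This is only an illustrative sketch: the `Patient` record, its field names, and the toy data are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str         # hypothetical identifier
    recorded_outcome: str   # "survival" or "death", as stored in the hospital record
    predicted_outcome: str  # "survival" or "death", as returned by a classifier


def flag_terminal_discharge_candidates(patients):
    """Return patients recorded as survivors but classified as deaths.

    Under the basic assumption, these are the cases whose informed consent
    forms are worth checking manually.
    """
    return [
        p for p in patients
        if p.recorded_outcome == "survival" and p.predicted_outcome == "death"
    ]


# Toy data, for illustration only.
patients = [
    Patient("p1", "survival", "survival"),
    Patient("p2", "survival", "death"),
    Patient("p3", "death", "death"),
    Patient("p4", "death", "survival"),
]
candidates = flag_terminal_discharge_candidates(patients)
print([p.patient_id for p in candidates])
```

Only the flagged records then need manual review, which is where the time saving would come from.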
We identified the patients who died in the hospital by screening the patient outcome data, and the remaining patients were classified as survivors. Then, we sorted all patients' discharge times and the start and stop times of their medical orders. We set the three days before each patient's discharge time as the study time interval, and all medical orders pertaining to drugs administered during that time were obtained, including the names, doses, dose units, and frequency of administration of the drugs. The total doses of each drug used during the three days before discharge were summed for each patient. The drugs administered to each patient were expressed as a vector: each drug is an attribute, and its total dose, which can be expressed as a continuous variable, is the attribute value. Three classification algorithms were applied to these vectors. The first was AdaBoost (6).

Pearson correlation coefficient (7)
The following formula was used to calculate the correlation coefficient:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where $x_i$ and $y_i$ are the attribute values of drug $i$ in the two vectors being compared, $\bar{x}$ and $\bar{y}$ are their means, and $n$ is the number of drug attributes. The drug attributes of each group were combined, and we summed the attribute values within the same group to obtain the total vectors of the survival group and the death group. The Pearson correlation coefficient was calculated between each individual case vector and the survival group vector, and between the case vector and the death group vector. If the coefficient with the survival group vector was greater than that with the death group vector, we judged the patient to be a survivor; otherwise, the patient was estimated to have died.

Observed to expected ratio-weighted cosine similarity
The observed to expected ratio was calculated from a four-cell table (8), shown in the supplementary file:

$$\text{Observed to expected ratio} = \frac{a \times (a + b + c + d)}{(a + b) \times (a + c)}$$

where $a$, $b$, $c$, and $d$ denote the four cell counts of the table. In the survival and death groups, the drug attributes were combined, and the attribute values were added. The attribute values were then multiplied by the observed to expected ratio to give the vector values of the two groups.
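The steps above (summing doses into case vectors, adding case vectors into group vectors, comparing Pearson coefficients, and computing an observed to expected ratio) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the drug names and doses are toy values, the cell labels `a`, `b`, `c`, `d` follow the usual four-cell convention because the actual table is in the supplementary file, and returning 0 for an undefined correlation is a choice made here for illustration.

```python
import math
from collections import defaultdict


def total_doses(orders):
    """Sum the doses of each drug over one patient's orders in the study window."""
    totals = defaultdict(float)
    for drug, dose in orders:
        totals[drug] += dose
    return dict(totals)


def group_vector(case_vectors):
    """Add up the attribute values of all cases in one outcome group."""
    totals = defaultdict(float)
    for vector in case_vectors:
        for drug, dose in vector.items():
            totals[drug] += dose
    return dict(totals)


def pearson(x, y, drugs):
    """Pearson correlation of two dose vectors over a shared list of drug attributes."""
    xs = [x.get(d, 0.0) for d in drugs]
    ys = [y.get(d, 0.0) for d in drugs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: treat an undefined correlation as 0
    return cov / (norm_x * norm_y)


def classify_by_pearson(case, survival_vec, death_vec, drugs):
    """Survivor only if the correlation with the survival group is strictly greater."""
    r_survival = pearson(case, survival_vec, drugs)
    r_death = pearson(case, death_vec, drugs)
    return "survival" if r_survival > r_death else "death"


def observed_to_expected(a, b, c, d):
    """Observed count a divided by its expected value (a+b)(a+c)/(a+b+c+d)."""
    return a * (a + b + c + d) / ((a + b) * (a + c))


# Toy data with hypothetical drug names and doses.
drugs = ["drug_A", "drug_B", "drug_C"]
survival_vec = group_vector([
    total_doses([("drug_A", 4.0), ("drug_C", 2.0), ("drug_A", 2.0)]),
    total_doses([("drug_A", 4.0), ("drug_C", 3.0)]),
])
death_vec = group_vector([
    total_doses([("drug_A", 2.0), ("drug_B", 5.0)]),
    total_doses([("drug_B", 4.0), ("drug_C", 1.0)]),
])

case_1 = {"drug_A": 1.0, "drug_B": 3.0}
case_2 = {"drug_A": 4.0, "drug_C": 2.0}
print(classify_by_pearson(case_1, survival_vec, death_vec, drugs))
print(classify_by_pearson(case_2, survival_vec, death_vec, drugs))
print(observed_to_expected(20, 80, 30, 870))
```

In the toy data, `case_1` is dominated by the drug that is common in the death group, so its correlation with the death group vector is the larger one.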
The cosine similarity was calculated between each individual case vector and the survival group vector, and between the case vector and the death group vector. The formula is as follows:

$$\cos(\theta) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}$$

where $x_i$ and $y_i$ are the values of drug attribute $i$ in the case vector and the group vector, respectively, and $n$ is the number of drug attributes.

The outcomes calculated by three classification algorithms are shown in Tables 2 and 3. The observed to expected ratio-weighted cosine similarity classification results are shown in Table 4. Analyzing the consistency between the estimated outcomes and the original data in the above classification results, the coincidence rate of the outcomes calculated by the AdaBoost algorithm was 98.1%, the coincidence rate calculated by the Pearson correlation coefficient was 61.1%, and the coincidence rate calculated by the observed to expected ratio-weighted cosine similarity was 93.4%. To clarify the reasons for these inconsistencies, we manually verified the inconsistent results. We assumed that patients who were estimated to be dead but had an actual outcome of survival were most likely to have been terminally discharged from the hospital. In general, patients who have died in the hospital or have been discharged in critical condition will have used salvage medications (such as noradrenaline, metaraminol, isoprenaline, epinephrine, dobutamine, dopamine, and atropine), so patients who used salvage medications but were discharged with an outcome of survival may actually have been discharged in critical condition. We categorized the patients into groups according to the estimated results and salvage medication use. The informed consent forms and discharge notes of the patients were manually checked, and the statistical results are shown in Table 5.

Notes: "Salvage medications|Ada|Correl|cos" indicates a combination of the attributes. "1" = used salvage medications, "0" = did not use salvage medications, "D" = estimated as death, "L" = estimated as survival. "1|D|D|D" means that the patients had used salvage medications and were estimated to be dead by the three classification algorithms. "0|L|L|L" means that the patients had not used salvage medications and were estimated to have survived by the three classification algorithms.

The actual death/actual survival ratios of the 0|L|D|L, 0|L|L|L, and 0|L|L|D groups were so low that these groups were not checked. Our main purpose was to improve the efficiency of screening for patients discharged in critical condition and to reduce manual retrieval time. Because some groups were not verified, some critically ill discharge cases were missed; these will be studied later. In the past, we used the rule of having been administered salvage medications within three days before discharge to classify patients with an outcome of survival as terminally discharged. The accuracy of this simple rule was 66.67% (Fig. 1). When the three classification methods (Ada, Correl and cos) were combined, the accuracy reached 98.56%. Among the cases in which all three classification methods classified the patient as having survived, the accuracy of the medication rule was only 34.15%. The combination of clinical rules and classification methods has a synergistic effect on judging the patients' discharge outcomes (9). Among the manually verified data, the AdaBoosting classification algorithm worked best, performing even better than the salvage medication rule when only the accuracy rate is considered. After manual verification, most of the patients who used rescue drugs but were classified as survivors were found to have suffered severe craniocerebral trauma. The vital signs of these patients before discharge were stable, and the medications administered were substantially different from those administered to the patients in the death group, so the calculated results were incorrect.
Furthermore, there were few terminally discharged patients among those who did not use rescue drugs, whom AdaBoosting classified into the survival group, and who were considered to have died according to both the Pearson correlation coefficient and the cosine similarity. In this study, we focused on patients who were discharged from the hospital with outcomes of survival but were classified as having died, and determined the probability of these cases being terminal discharges. At the same time, we also conducted a preliminary analysis of cases that were recorded as deaths but were classified as survivors (10). By reviewing the hospitalization information of these cases, we found that the main causes of this misprediction were as follows: 1) patients who died suddenly resembled the survival cases because they used less medication in the 3 days before discharge; 2) patients who gave up treatment before dying used only a small amount of rescue and support drugs for safety; and 3) small amounts of certain special drugs, or specific routes of administration of certain drugs, affected the result. Terminally ill and automatically discharged patients who have not died at the time of discharge are already in the terminal stage of their diseases, and their survival time may be extremely short. It is not appropriate to classify such patients as survivors solely on the basis of their discharge status when analyzing the data. Therefore, when performing data analysis involving clinical outcomes and patient prognosis, if the status of patients who are discharged dying from the hospital is not indicated in advance, relying only on parameter adjustments of the algorithm to obtain a high agreement rate with the survival and death results of the training samples will result in widespread misclassification.
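The group labels used in Table 5 and in the discussion above, such as "1|D|D|D", combine the salvage medication rule with the three classifier outputs. A minimal sketch of how such keys might be built and tallied is given below. The function names, the record layout, and the toy records (including the drug name "saline") are hypothetical; only the list of salvage medications and the key format come from the text.

```python
from collections import defaultdict

# Salvage medications named in the text.
SALVAGE_MEDICATIONS = {
    "noradrenaline", "metaraminol", "isoprenaline", "epinephrine",
    "dobutamine", "dopamine", "atropine",
}


def combination_key(drugs_used, ada, correl, cos):
    """Build a key in the 'Salvage medications|Ada|Correl|cos' format.

    ada, correl and cos are "D" (estimated as death) or "L" (estimated as survival).
    """
    used = "1" if SALVAGE_MEDICATIONS.intersection(drugs_used) else "0"
    return "|".join([used, ada, correl, cos])


def summarize_groups(records):
    """Count (manually confirmed terminal discharges, total cases) per key."""
    counts = defaultdict(lambda: [0, 0])
    for drugs_used, ada, correl, cos, confirmed in records:
        key = combination_key(drugs_used, ada, correl, cos)
        counts[key][0] += int(confirmed)
        counts[key][1] += 1
    return {key: tuple(pair) for key, pair in counts.items()}


# Toy records: (drugs used, Ada, Correl, cos, confirmed by manual check).
records = [
    (["noradrenaline", "saline"], "D", "D", "D", True),
    (["dopamine"], "D", "D", "D", True),
    (["saline"], "L", "L", "L", False),
    (["atropine"], "L", "L", "L", False),
    (["atropine"], "L", "L", "L", True),
]
summary = summarize_groups(records)
for key, (confirmed, total) in sorted(summary.items()):
    print(key, confirmed, total)
```

Tallying per key in this way makes it possible to decide which groups are worth verifying manually and which, like the low-yield groups mentioned above, can be skipped.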
Classifications of survival for patients who are discharged from the hospital in critical condition, especially those in articulo mortis, are considered errors by the classification algorithm; if, on the other hand, the patients are classified as having died, this classification is counted as an error when the result is evaluated. Therefore, it is reasonable to set up a separate group of critically ill discharged patients.

This study used a machine learning-based classification algorithm, combined with clinical rules, to obtain more accurate clinical outcomes through a comprehensive analysis. The approach can be used to categorize clinical outcomes and to reduce both the time taken to manually retrieve discharge records and the negative influence of statistics or rules. In the next study, we will use medical order data to further improve the combination of rules and statistics, making the classification and evaluation of clinical outcomes in the hospital simpler and more effective. In addition, we will consider further refining the classification algorithm and try to apply it to the prediction of clinical outcomes.

Ethical issues (including plagiarism, informed consent, misconduct, data fabrication and/or falsification, double publication and/or submission, redundancy, etc.) have been completely observed by the authors.

The practice of terminal discharge: Is it euthanasia by stealth?
Reasons for discharges against medical advice: a qualitative study
Why patients sign out against medical advice (AMA): factors motivating patients to sign out AMA
Development and validation of a machine learning-based prediction model for near-term in-hospital mortality among patients with COVID-19
Big data and machine learning algorithms for health-care delivery
Breast Cancer Diagnosis Using an Efficient CAD System Based on Multiple Classifiers
Corruption of the Pearson correlation coefficient by measurement error and its estimation, bias, and correction under different error models
A data mining approach for signal detection and analysis
Machine Learning for Health Services Researchers
Potential Biases in Machine Learning Algorithms Using Electronic Health Record Data

The authors declare that there is no conflict of interest.