authors: Wang, Taiyao; Hansen, Kyle R.; Loving, Joshua; Paschalidis, Ioannis Ch.; Aggelen, Helen van; Simhon, Eran
title: Predicting Antimicrobial Resistance in the Intensive Care Unit
date: 2021-11-05

Antimicrobial resistance (AMR) is a risk for patients and a burden for the healthcare system. However, AMR assays typically take several days. This study develops predictive models for AMR based on easily available clinical and microbiological predictors, including patient demographics, hospital stay data, diagnoses, clinical features, and microbiological/antimicrobial characteristics, and compares those models to a naive antibiogram-based model that uses only microbiological/antimicrobial characteristics. The ability to predict resistance accurately prior to culturing could inform clinical decision-making and shorten time to action. The machine learning algorithms employed here show improved classification performance (area under the receiver operating characteristic curve 0.88-0.89) versus the naive model (area under the receiver operating characteristic curve 0.86) for 6 organisms and 10 antibiotics using the Philips eICU Research Institute (eRI) database. This method can help guide antimicrobial treatment, with the objective of improving patient outcomes and reducing the usage of unnecessary or ineffective antibiotics.

Healthcare-associated infections (HAI) affect patients in a hospital or other healthcare facility, and are not present or incubating at the time of admission. They also include infections acquired by patients in the hospital or facility that appear after discharge, and occupational infections among staff. The estimated incidence rate in the U.S. was 4.5% in 2002, corresponding to 9.3 infections per 1000 patient-days and 1.7 million affected patients [1].
It was estimated that there were 648,000 patients with 721,800 HAIs in U.S. acute care hospitals in 2011 [2]. According to the Centers for Disease Control and Prevention (CDC), there were an estimated 687,000 HAIs in U.S. acute care hospitals in 2015. About 72,000 hospital patients with HAIs died during their hospitalizations [3].

Antimicrobial resistance (AMR) is a growing threat to global health. In 2013, the CDC reported that, each year, at least 2 million people become infected with antibiotic-resistant bacteria in the U.S., and at least 23,000 people die each year as a direct result of these infections [4]. Although not all HAIs are caused by resistant bacteria, antibiotic resistance is a great concern for hospitals, since resistant pathogens can spread between patients and healthcare staff when hygiene measures are insufficient, and suboptimal antibiotic treatment can promote resistance, thereby worsening patient outcomes.

One of the main challenges facing clinicians is the absence of fast and accurate antimicrobial resistance typing. Antimicrobial resistance assays typically take several days to complete, but patients typically require antimicrobial treatment the day they are admitted. Being able to predict antimicrobial resistance accurately prior to culture-based resistance typing could thus help clinicians make an informed treatment decision in a timely manner.

Machine learning has been proposed as a feasible solution for bacterial AMR prediction. The current body of work on machine learning models for AMR prediction is largely focused on genomic data models. A brief overview of current studies using machine learning to predict antimicrobial susceptibility phenotypes from genotypic data is provided in [5]. To the best of our knowledge, all published predictions of AMR using machine learning algorithms use either a k-mer representation of the bacterial genome [6, 7, 8, 9] or other gene-related information [10, 11, 12]. The model by Rishishwar et al.
[11] that discriminates between vancomycin-intermediate and vancomycin-susceptible Staphylococcus aureus using 25 whole-genome sequences reached an accuracy of 84%. In a study by Pesesky et al. [10], rules-based and logistic regression predictions achieved agreement with standard-of-care phenotypic diagnostics of 89.0% and 90.3%, respectively, for whole-genome sequence data from 78 clinical Enterobacteriaceae isolates. In a study by Drouin et al. [6], average test set error rates of set-covering machine models that predict the antibiotic resistance of Clostridium difficile, Mycobacterium tuberculosis, Pseudomonas aeruginosa, and Streptococcus pneumoniae with hundreds of genomes and a k-mer representation ranged from 1.1% to 31.8%. Davis et al. [8] built adaptive boosting classifiers with at least 100 genomes and k-mer representations to identify carbapenem resistance in Acinetobacter baumannii, methicillin resistance in Staphylococcus aureus, and beta-lactam and co-trimoxazole resistance in Streptococcus pneumoniae, with accuracies ranging from 88% to 99%. Her et al. [12] used Support Vector Machine (SVM) algorithms with radial basis function kernels to predict Escherichia coli AMR activities and reported Area Under the ROC Curve (AUC) values from 93% to 100% for 12 of the most-annotated antibiotics, based on a pan-genome and gene clusters selected by a genetic algorithm.

The primary objective of this work is to develop AMR predictive models and to identify important variables for predicting AMR in the absence of genomic information. Instead of genomic data, we leverage patient information and microbiology test characteristics. Such models can be developed based on data that are more readily available and easier to acquire and, as a result, have the potential to offer an attractive alternative to models based on gene-related information. Current practice relies on clinician interpretation of hospital antibiograms to guide prescribing of drugs based on population resistance.
Hospital antibiograms summarize the percentage of isolates of each pathogen that are resistant to different antimicrobial agents and are derived from resistance typing results alone. A previous study estimated that standard care procedures result in appropriate prescription of antibiotics for only about 70% of cases [13]. Here we compare the performance of naive models based only on microbiology data (analogous to an antibiogram) with predictive models that combine resistance typing with information about patient demographics, hospital stay, previous resistance test results, and diagnoses.

An important enabler of our work is the increasing availability of patients' Electronic Health Records. The digitization of patients' medical records over the last two decades has enabled the development of more personalized and accurate models for diagnostics and treatment. Over the past few years, there has been increased interest in algorithmic and data-driven approaches to improve healthcare quality. Machine learning is increasingly being used with Electronic Health Records to predict chronic disease hospitalizations [14, 15], level-of-care requirements [16], mortality [17, 18] and readmission [19, 20, 21, 22].

The remainder of this paper is organized as follows. In Section 2, we review the database specifics, data selection, feature generation, pre-processing and classification methods used in the study. In Section 3, we present the prediction results and highlight the important variables. In Section 4, we discuss the results, including both limitations and strengths. Conclusions are in Section 5.

In this study we utilized microbiology tests and patient information from the eICU Research Institute (eRI) database for patients who had a complete hospitalization between January 1, 2007 and March 31, 2013. Detailed descriptions of the eRI database are provided in [23, 24].
We filtered tests such that patients were at least 16 years old and cultures were taken while the patient was admitted to an ICU. We selected microbiology tests limited to 6 organisms and 10 antibiotics. The organisms included were Staphylococcus aureus, Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa, Staphylococcus epidermidis, and Enterobacter cloacae; the antibiotics were vancomycin, imipenem/cilastatin, cefepime, oxacillin, ciprofloxacin, nitrofurantoin, trimethoprim/sulfamethoxazole, cefazolin, ampicillin/sulbactam, and ampicillin. After selection, there were 12,575 unique patients, 13,087 unique patient unit stays, and 80,125 tests from 191 different units. The test results which we aimed to predict were distributed as follows: 55,095 were Sensitive, 22,852 were Resistant, and 2,178 were Intermediate. Thus, after 'Sensitive' and 'Intermediate' records were combined as 'non-Resistant', the average AMR rate was 28.5%.

We included in the study two types of variables as described below.
• Patient data: patient unit stay id, gender, age, ethnicity, height, admission weight, unit location id, unit type, unit stay type, unit admit source, unit admit time, hospital admit source, number of minutes from unit admit time to culture time, the ICU visit number during the patient's hospital stay, and admission diagnosis for the patient unit stay.
• Microbiology data: patient culture taken year, culture taken time, number of minutes from unit admit time until the culture was taken, culture site, organism, antibiotic, and sensitivity level of antibiotic.

The feature generation and pre-processing consisted of the following steps:
• Generate feature interactions 'anti-organism' between 'organism' and 'antibiotic'.
• Generate a feature from existing resistance test results for the same patient and the same 'anti-organism' type (taken more than 48 hours earlier).
• Transform hospital admit time and culture taken time from minutes to days, and compute their log transformations.
• Convert categorical variables into dummy/indicator variables with one-hot encoding (e.g., 'anti-organism', 'locationid', 'apacheadmissiondx').
• Clip lower-tail (0.5th percentile) and upper-tail (99.5th percentile) values for height, weight, hospital admit time and culture taken time.
• Replace missing values of admit weight by discharge weight and replace other variables' missing values by the median across all patients.
• Perform feature selection by a two-sided t-test which compares the means of a variable in the 'resistant' and 'non-resistant' cohorts. We take as the null hypothesis that the means are equal. If the corresponding p-value is below a threshold of 0.1, the variable is retained. We also perform an additional step of feature elimination, excluding one variable from each pair of highly correlated (> 0.75 or < −0.75) variables.
• Transform features by scaling each feature to be between zero and one (i.e., by dividing by the total range of the feature).

Microbiology tests were randomly split by patient unit stay id in a 60%:20%:20% fashion into training, test and validation cohorts. We tested a variety of supervised classification methods [25], including ℓ1-regularized logistic regression (L1LR), random forests (RF), neural networks (NN) and Gradient Boosting Machine (GBM). All methods were implemented in Python, using scikit-learn [26] and LightGBM [27]. We compared these methods with a naive model (AB) based on an antibiogram, which calculates the percentage of resistant tests in the training and validation datasets and naively sets that equal to the probability of resistance in the test dataset.

ℓ1-regularization can mitigate overfitting and improve the interpretability of models in clinical settings [14, 15, 28, 29, 30]. Logistic regression, widely used in statistics and machine learning studies, was implemented with an ℓ1-regularization term to induce sparsity and interpretability. We tuned the strength of the regularizer using cross-validation.
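The naive antibiogram baseline (AB) described above can be illustrated with a minimal sketch. It assumes that the resistance percentage is computed separately for each organism/antibiotic pair, as a hospital antibiogram would be; the text does not spell out this granularity. The function names, the fallback to the overall rate for unseen pairs, and the toy records are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def fit_antibiogram(records):
    """Estimate resistance rates from labelled tests.

    `records` is an iterable of (organism, antibiotic, resistant) tuples,
    where `resistant` is 1 for 'Resistant' and 0 for 'non-Resistant'.
    Returns (per-pair resistance rates, overall resistance rate).
    """
    counts = defaultdict(lambda: [0, 0])  # pair -> [resistant count, total count]
    n_resistant = 0
    n_total = 0
    for organism, antibiotic, resistant in records:
        pair_counts = counts[(organism, antibiotic)]
        pair_counts[0] += resistant
        pair_counts[1] += 1
        n_resistant += resistant
        n_total += 1
    rates = {pair: r / n for pair, (r, n) in counts.items()}
    return rates, n_resistant / n_total


def predict_antibiogram(rates, overall_rate, organism, antibiotic):
    """Predicted probability of resistance for one test.

    Assumption (not from the paper): pairs never seen during fitting
    fall back to the overall resistance rate.
    """
    return rates.get((organism, antibiotic), overall_rate)


# Toy records, invented purely for illustration.
train = [
    ("Escherichia coli", "ampicillin", 1),
    ("Escherichia coli", "ampicillin", 1),
    ("Escherichia coli", "ampicillin", 0),
    ("Escherichia coli", "ciprofloxacin", 0),
    ("Staphylococcus aureus", "oxacillin", 1),
    ("Staphylococcus aureus", "oxacillin", 0),
]
rates, overall = fit_antibiogram(train)
p_seen = predict_antibiogram(rates, overall, "Escherichia coli", "ampicillin")
p_unseen = predict_antibiogram(rates, overall, "Klebsiella pneumoniae", "cefazolin")
```

Because such a baseline ignores every patient-level variable, any gain of the learned models over it can be attributed to the additional clinical predictors.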
The random forest (RF) method builds a large collection of de-correlated trees and then averages them, which generally leads to a substantial performance improvement over single-tree classifiers. We tuned a number of hyperparameters, such as the number of trees, the depth of each tree, and the number of features used (a random subset of the total) at each tree node split.

Gradient boosting machine (GBM), also referred to as gradient-boosted decision trees, is a popular machine learning algorithm used for regression and classification tasks. We used LightGBM, a fast and high-performance GBM framework that grows trees leaf-wise rather than level-wise and implements a host of techniques, such as gradient-based one-side sampling and exclusive feature bundling, to deal with a large number of data instances and features, respectively. We tuned hyperparameters such as the number of leaves in each tree and the minimum number of sample points corresponding to each leaf.

Finally, we used a class of feed-forward artificial neural networks (NN), called a multilayer perceptron, which consists of at least three layers of nodes. Except for the input nodes, each node uses a nonlinear activation function. Owing to its multiple layers and nonlinear activations, it can distinguish data that are not linearly separable. We tuned a number of hyperparameters such as the number of hidden neurons, layers, and iterations, using a rectified linear unit ('relu') activation function for each hidden layer and the 'Adam' adaptive stepsize rule for stochastic gradient descent.

A Receiver Operating Characteristic (ROC) curve is created by plotting the true positive rate (or recall, or sensitivity) against the false positive rate (equal to one minus the specificity) at various thresholds. The c-statistic, or the area under the ROC curve (AUC), is used to evaluate the prediction performance.
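As a small illustration of the metric, the AUC equals the probability that a randomly chosen positive (resistant) sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that pairwise definition. The function name `roc_auc` and the toy labels and scores are hypothetical; the quadratic loop is only meant for illustration, and in practice a library routine such as scikit-learn's `roc_auc_score` would be the natural choice.

```python
def roc_auc(labels, scores):
    """AUC from its pairwise definition.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; tied scores contribute 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels (1 = resistant) and model scores, invented for illustration.
labels = [0, 0, 1, 1]
auc_mixed = roc_auc(labels, [0.1, 0.4, 0.35, 0.8])     # one of four pairs misordered
auc_perfect = roc_auc(labels, [0.1, 0.2, 0.8, 0.9])    # every positive outranks every negative
auc_constant = roc_auc(labels, [0.5, 0.5, 0.5, 0.5])   # uninformative constant score
```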
A perfect predictor gives an AUC score of 1 and a predictor which makes random guesses has an AUC score of 0.5.

In machine learning, ensemble methods [25] that use multiple learning algorithms can obtain better predictive performance than any of the constituent learning algorithms alone. We propose a new ensemble model that averages the predictions of the neural network, GBM and RF models, and which achieved the highest AUC on the eRI dataset.

We trained each model by using 80% of the samples for training model parameters (60% training, 20% validation) and evaluated the model's performance on the unseen test 20%. Table 1 summarizes the performance of the various predictive models on the test dataset under two learning modes. The first column contains the names of the machine learning methods. The second column is the AUC under the assumption that no temporal effects exist, in which case training and test datasets are formed by randomly selecting patients. The third column is the AUC under the assumption that temporal effects exist; in this mode, samples are split by unit admit year and unit stay id so that the training samples belong to patients hospitalized before a certain date and the test samples correspond to patients hospitalized after that date. The latter mode aims at capturing how a predictive model would be trained in practice. Comparing the AUC column with the AUC time column shows that the prediction accuracy in practice may decrease due to temporal trends. In particular, changes in data collection and in the prevalence of specific microbes render earlier data less predictive of the future. Figure 1 shows the ROC curves for the AMR prediction models and the naive model (AB) in the test cohort. Figure 2 shows the ROC curves for the AMR prediction models and the naive model (AB) in the test cohort under the assumption that temporal effects exist.
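The two learning modes can be contrasted with a short sketch. It assumes each test is a record carrying a unit stay id and the admit year of that stay; the field names `stay_id` and `admit_year`, the function names, the cutoff year and the toy data are hypothetical. The random mode assigns whole unit stays to one side so that tests from the same stay never appear in both cohorts, while the temporal mode sends earlier admissions to training and later ones to testing.

```python
import random


def split_by_stay(tests, test_fraction=0.2, seed=0):
    """Random split that keeps all tests of one unit stay on the same side."""
    stay_ids = sorted({t["stay_id"] for t in tests})
    random.Random(seed).shuffle(stay_ids)
    n_test = max(1, round(test_fraction * len(stay_ids)))
    test_ids = set(stay_ids[:n_test])
    train = [t for t in tests if t["stay_id"] not in test_ids]
    test = [t for t in tests if t["stay_id"] in test_ids]
    return train, test


def split_by_time(tests, cutoff_year):
    """Temporal split: stays admitted before `cutoff_year` train, the rest test."""
    train = [t for t in tests if t["admit_year"] < cutoff_year]
    test = [t for t in tests if t["admit_year"] >= cutoff_year]
    return train, test


# Toy data: five unit stays admitted in 2008..2012, two tests per stay.
tests = [
    {"stay_id": stay, "admit_year": 2008 + stay, "resistant": (stay + k) % 2}
    for stay in range(5)
    for k in range(2)
]

rand_train, rand_test = split_by_stay(tests, test_fraction=0.2, seed=0)
time_train, time_test = split_by_time(tests, cutoff_year=2012)
```

Under the temporal mode the model never sees data from the evaluation period, which is closer to how a deployed model would be used and explains why its AUC can be lower than under the random mode.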
To study how prediction accuracy differs across organisms, we trained the L1LR model for every organism separately, using an 80%-20% split between training and testing data subsets. Table 2 summarizes the performance of the L1LR predictive models versus the naive model (AB) on the test dataset for every organism. Table 3 summarizes the 20 most predictive variables and their coefficients in the L1LR model. Most of them are binary variables with prefix 'ao' from one-hot encoding of the 'anti-organism' feature interactions between 'organism' and 'antibiotic'. Others include y pre, i.e., the mean of existing resistance test results for the same patient and the same 'anti-organism' type from more than 48 hours earlier, admission diagnosis with prefix 'apacheadmissiondx', unit location id with prefix 'locationid', and the time from ICU admission to culture. We note that since the variables have been standardized, the coefficients of different variables reveal the relative strength of each variable in the model. Figure 3 summarizes average AMR rates for the 6 organisms and 10 antibiotics. Figure 4 summarizes total frequency counts. Figure 5 summarizes AMR frequency counts.

In this study, we apply a set of machine learning methods and propose a new ensemble method to predict AMR utilizing electronic medical records (EMR) data in conjunction with microbial information. We show that it is an improvement over a naive model (AB) designed to represent an antibiogram, which is the common practice for antibiotic resistance prediction in hospitals. The results show strong classification performance using the eRI database. In addition, important variables have been identified, offering interpretability of the results. This study provides a new framework for predicting AMR from EMR data. Instead of focusing on one organism as in previous studies, we explore six organisms and ten antibiotics.
To our knowledge, this is the first study with high accuracy in predicting AMR and identifying important variables without gene-related information.

The results have important consequences for ICU patients. In an ICU setting, it is critically important to obtain quick and accurate predictions without DNA sequencing, since about half of the ICU patients with AMR tests in the study were admitted from the emergency room. The time frame for detection is also drastically shorter than that of testing multiple antibiotics, a common practice. Early detection and treatment of HAIs are vital for reducing the length of stay and mortality. Other researchers and hospitals can apply our framework to other datasets for ICU patients who need acute treatment. Researchers can further combine the variables used here with genomic information, which may improve the prediction performance.

The most important variables are the 'anti-organism' interaction features between 'organism' and 'antibiotic'. If only 'organism' and 'antibiotic' are used without the second-order interaction terms (e.g., ao vancomycin Staphylococcus epidermidis), tree-based models can still learn the interaction, but the interpretable logistic regression cannot, so it performs more poorly unless those interaction terms are computed explicitly. Other important variables are existing AMR tests for the same patient and the same 'anti-organism' type, which are highly correlated with the outcome if the culture taken times are close. However, in clinical settings, it takes 24-48 hours to obtain resistance test results. That is why only resistance tests from more than 48 hours earlier should be considered as variables in the models. If the features based on resistance tests from more than 48 hours earlier are removed, values in the AUC column of Table 1 decrease by about 1%. The average AMR rates and AMR frequency counts for the 6 organisms and 10 antibiotics in Figure 3 and Figure 5 can be used to help select antibiotics and reduce drug misuse for patients.
For instance, if the average AMR rate is higher than 95% for one 'anti-organism' combination, e.g., Klebsiella pneumoniae and ampicillin, the antibiotic should not be recommended for the treatment of infections with that organism. However, if the total frequency count in Figure 4 is small, the estimated AMR rate should not be trusted. Other features like weight, age, height and ethnicity are important in tree-based models, e.g., GBM.

In order to study AMR temporal effects, we carried out an experiment in which samples are split by unit admit year and unit stay id so that the training samples belong to patients hospitalized earlier and the test samples belong to patients hospitalized after a certain cutoff date. Such a prediction seeks to reproduce how a predictive model would be used in practice. Comparing the AUC column with the AUC time column in Table 1 shows that the prediction accuracy in practice may decrease due to temporal trends, potentially because the type of AMR instances evolves over time. This can also be observed in the comparable shift in AUC values between Figure 1 and Figure 2. As Figure 6 suggests, average AMR rates change over time for some organisms and antibiotics.

There are some limitations of our study. First, spatio-temporal effects [31] are not explicitly modeled. One reason is that our data span more than 6 years and 191 different units; many more data samples would be needed to reliably model spatio-temporal effects.

The machine learning algorithms employed in this work to predict AMR based on the eRI database exhibit improved classification performance compared to the AB model. Our framework is faster and less expensive than previous approaches that predict AMR from genomic data. The techniques we utilized can be extended to other settings besides the ICU.
Our findings and the development of accurate predictive models of AMR can guide the choice of appropriate antibiotics, thereby reducing the occurrence of antibiotic resistance, decreasing healthcare costs and improving patient outcomes. The models developed in this work are based on patients who were admitted to the ICU at least once. In the future, the models can be further generalized to other hospital units, evaluating their performance in a more general setting. Another future direction is to combine our methods with previous predictions that leveraged gene information. As we argued, our framework is faster and less expensive but perhaps less accurate than genomics-based models. For some organisms or 'anti-organism' combinations, our predictions achieve high accuracy. For other organisms or 'anti-organism' combinations with low prediction accuracy, e.g., 'Escherichia coli', one could potentially refine the prediction with DNA sequencing data [12]. The models introduced here can help hospitals speed up appropriate treatment of infection to improve patient outcomes, and reduce the costs of AMR prediction.

[1] Health care-associated infections fact sheet
[2] Multistate point-prevalence survey of health care-associated infections
[3] Prevention: Data Portal - HAI - CDC
[4] Prevention: Antibiotic resistance threats in the United States
[5] Machine learning: novel bioinformatics approaches for combating antimicrobial resistance. Current Opinion in Infectious Diseases
[6] Predictive computational phenotyping and biomarker discovery using reference-free genome comparisons
[7] Machine learning for antimicrobial resistance
[8] Antimicrobial resistance prediction in PATRIC and RAST
[9] Interpretable model for antibiotic resistance prediction in bacteria using deep learning
[10] Evaluation of machine learning and rules-based approaches for predicting antimicrobial resistance profiles in gram-negative bacilli from whole genome sequence data
[11] A genome sequence based discriminator for vancomycin-intermediate Staphylococcus aureus
[12] A pan-genome-based machine learning approach for predicting antimicrobial resistance activities of the Escherichia coli strains
[13] Prevalence of inappropriate antibiotic prescriptions among US ambulatory care visits
[14] Predicting chronic disease hospitalizations from electronic health records: an interpretable classification approach
[15] Predicting diabetes-related hospitalizations based on electronic health records
[16] Early prediction of level-of-care requirements in patients with COVID-19
[17] Predictive models of mortality for hospitalized patients with COVID-19: Retrospective cohort study
[18] Risk prediction for 30-day mortality among patients with Clostridium difficile infections: a retrospective cohort study
[19] Readmissions and death after ICU discharge: development and validation of two predictive models
[20] Prescriptive cluster-dependent support vector machines with an application to reducing hospital readmissions
[21] Prescriptive analytics for reducing 30-day hospital readmissions after general surgery
[22] Data analytics and optimization methods in biomedical systems: from microbes to humans
[23] Benchmark data from more than 240,000 adults that reflect the current practice of critical care in the United States
[24] The eICU research institute - a collaboration between industry, health-care providers, and academia
[25] The Elements of Statistical Learning
[26] Scikit-learn: Machine Learning in Python
[27] LightGBM: A highly efficient gradient boosting decision tree
[28] A robust learning approach for regression models based on distributionally robust optimization
[29] A joint sparse clustering and classification approach with applications to hospitalization prediction
[30] Convergence of parameter estimates for regularized mixed linear regression models
[31] Geographic diversity and temporal trends of antimicrobial resistance in Streptococcus pneumoniae in the United States

The authors would like to thank Chieh Lo, Ting Feng, Bryan Conroy, David Noren, and Andrew Hoss for helpful discussions.

Authors' contributions: All authors contributed to the design of the study and to the definition of the problem. TW and KRH queried and analyzed the data. TW took the lead in writing the manuscript. All authors provided critical feedback and helped shape the research, analysis and manuscript.