key: cord-0866792-f4uc8p44 authors: Rodriguez, Victor Alfonso; Bhave, Shreyas; Chen, Ruijun; Pang, Chao; Hripcsak, George; Sengupta, Soumitra; Elhadad, Noemie; Green, Robert; Adelman, Jason; Metitiri, Katherine Schlosser; Elias, Pierre; Groves, Holden; Mohan, Sumit; Natarajan, Karthik; Perotte, Adler title: Development and validation of prediction models for mechanical ventilation, renal replacement therapy, and readmission in COVID-19 patients date: 2021-03-11 journal: J Am Med Inform Assoc DOI: 10.1093/jamia/ocab029 sha: 5e930290efc272b88e531b11f1382b2e5d8cbad9 doc_id: 866792 cord_uid: f4uc8p44 OBJECTIVE: Coronavirus disease 2019 (COVID-19) patients are at risk for resource-intensive outcomes including mechanical ventilation (MV), renal replacement therapy (RRT), and readmission. Accurate outcome prognostication could facilitate hospital resource allocation. We develop and validate predictive models for each outcome using retrospective electronic health record data for COVID-19 patients treated between March 2 and May 6, 2020. MATERIALS AND METHODS: For each outcome, we trained 3 classes of prediction models using clinical data for a cohort of SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2)–positive patients (n = 2256). Cross-validation was used to select the best-performing models per the areas under the receiver-operating characteristic and precision-recall curves. Models were validated using a held-out cohort (n = 855). We measured each model’s calibration and evaluated feature importances to interpret model output. RESULTS: The predictive performance for our selected models on the held-out cohort was as follows: area under the receiver-operating characteristic curve—MV 0.743 (95% CI, 0.682-0.812), RRT 0.847 (95% CI, 0.772-0.936), readmission 0.871 (95% CI, 0.830-0.917); area under the precision-recall curve—MV 0.137 (95% CI, 0.047-0.175), RRT 0.325 (95% CI, 0.117-0.497), readmission 0.504 (95% CI, 0.388-0.604). Predictions were well calibrated, and the most important features within each model were consistent with clinical intuition. DISCUSSION: Our models produce performant, well-calibrated, and interpretable predictions for COVID-19 patients at risk for the target outcomes. They demonstrate the potential to accurately estimate outcome prognosis in resource-constrained care sites managing COVID-19 patients. CONCLUSIONS: We develop and validate prognostic models targeting MV, RRT, and readmission for hospitalized COVID-19 patients which produce accurate, interpretable predictions. Additional external validation studies are needed to further verify the generalizability of our results. The United States continues to be a major epicenter for coronavirus disease 2019 (COVID- 19) , the disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). 1, 2 In the early phase of the pandemic, hospitals in hard-hit regions, such as the New York Metropolitan Area, suffered large caseloads which heavily strained medical resources. [3] [4] [5] The surge in cases throughout the country continues to drive medical resource expenditure, exhausting limited supplies. In this setting, delivering optimal care to COVID-19 patients will require matching scarce resources to patients in need across hospital systems, cities, and even across the country. Efficient distribution of resources will depend critically on prognostic assessments for newly presenting patients. 
With accurate prognostication, patient needs may be anticipated and met with the necessary equipment and provider expertise to limit disease progression or guard against avoidable adverse outcomes. In this study, we develop, validate, and analyze predictive models for the prognostication of 3 prevalent and actionable adverse outcomes in the setting of COVID-19.

Acute respiratory failure (ARF) requiring mechanical ventilation, severe acute kidney injury (AKI) requiring renal replacement therapy (RRT), and readmission are 3 common and critical adverse outcomes for patients with COVID-19. Roughly 12% to 33% of patients suffer ARF and require mechanical ventilation.6-9 Overall, 34% of all patients with COVID-19 and 78% of COVID-19 intensive care unit (ICU) patients develop AKI, with up to 14% of all patients and 35% of ICU patients requiring RRT.10 In addition, while hospitals struggle to manage heavy COVID-19 caseloads, patients who would normally be admitted may be discharged home, leading to higher than expected readmission rates.11 Each of these outcomes carries significant implications for patient outcomes, long-term sequelae, and utilization of scarce resources, including hospital beds and the equipment and materials needed for mechanical ventilation and RRT.

Clinical prediction models could be used effectively to assess patient prognosis, informing resource planning and triage decisions.12,13 Nevertheless, most published COVID-19 prediction models have focused on disease diagnosis, while the few prognostic models have targeted COVID-19 disease severity or mortality.14 In this work, we aim to build interpretable prognostic models for COVID-19 patients that estimate the risk of ARF requiring mechanical ventilation, AKI requiring RRT, and hospital readmission. We develop our models using electronic health record (EHR) data from a major tertiary care center in New York City during the peak of the COVID-19 crisis, and externally validate them using data from a community hospital.

MATERIALS AND METHODS

We focus on patients whose hospital courses included emergency room visits, inpatient admissions, or both at Columbia University Irving Medical Center/NewYork-Presbyterian (CUIMC/NYP) between March 2 and May 6, 2020. As we are interested in studying patients with active SARS-CoV-2 infection, we further limit this cohort to patients with a positive, polymerase chain reaction-based SARS-CoV-2 test at any point during their hospital course. All clinical observations were extracted from CUIMC/NYP's Clinical Data Warehouse, formatted according to the Observational Medical Outcomes Partnership (OMOP) common data model.15 Data from CUIMC, including Milstein Hospital and the Morgan Stanley Children's Hospital, were used for model development. Observations for patients treated at NYP Allen Hospital, a community hospital member of NYP, were held out as a validation set. We use chi-square permutation tests to compare the distributions of outcomes and demographics between our 2 cohorts (see Table 1). We note that the development and validation cohort data are derived from care sites with distinct inpatient and critical care capacities. To characterize these differences, we provide each site's regular inpatient and ICU bed counts as well as their average annual admissions (see Supplementary Table 1).

Our datasets comprise demographics, smoking status, laboratory test results, vital signs, and conditions.
Clinical laboratory tests and vital signs are standardized, while demographics and conditions are transformed into a binary encoding indicating presence (see Supplementary Methods). We include in our feature set only those conditions that appeared in the clinical records of at least 5 patients. A full list of the variables included in our feature set is provided in Supplementary Table 2.

For the RRT and mechanical ventilation models, we use data gathered during the first 12 hours of the current hospital course. The 12-hour constraint is meant to exclude early events, which are likely to be anticipated on presentation and are therefore less likely to be intervened upon based on the output from a predictive model. This constraint also removes episodes occurring prior to a patient's arrival at the hospital; such events must be excluded to permit construction of prognostic models. We also include data from patients' prior visits. For numerical data types like laboratory tests and vital signs, we use only the most recent values. Our binary encoding accounts for the presence of conditions in a patient's current visit and all their prior visits. The dataset for our readmission models is constructed in the same way as for the mechanical ventilation and RRT models, with one important difference: we extend the data-gathering period to cover the entirety of the index hospital admission, not just the first 12 hours.

Handling missing values

For conditions, the binary encoding does not require imputation, as it encodes presence directly. For numerical variables, we impute missing values using Scikit-learn's16 implementation of the MICE algorithm17 with its default parameterization. Categorical variables (excluding conditions) were imputed using the most common class in the training set. Furthermore, for imputed variables, we expand our features to include binary missingness indicators specifying whether a value was observed or imputed. See Supplementary Table 3 for a detailed account of each variable's missingness proportion.
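To make the feature construction and imputation steps above concrete, the following sketch assembles a feature matrix in the spirit described: prevalence-filtered binary condition indicators, standardized numeric labs and vitals, MICE-style imputation via Scikit-learn's IterativeImputer with its default parameterization, and appended missingness indicators. This is a minimal illustration under assumed inputs (the `conditions` and `numerics` data frames and the `build_feature_matrix` helper are hypothetical), not the authors' exact pipeline.

```python
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer, MissingIndicator
from sklearn.preprocessing import StandardScaler

def build_feature_matrix(conditions: pd.DataFrame,
                         numerics: pd.DataFrame,
                         min_patients: int = 5) -> pd.DataFrame:
    """Assemble features: binary conditions + standardized, imputed numerics.

    conditions: one row per patient; binary columns, 1 if the concept
                appears in the current or any prior visit.
    numerics:   one row per patient; most recent lab/vital values, with
                NaN where a value was never observed in the window.
    """
    # Keep only conditions recorded for at least `min_patients` patients.
    prevalent = conditions.columns[conditions.sum(axis=0) >= min_patients]
    cond = conditions[prevalent]

    # Binary missingness indicators: 1 where a value will be imputed.
    flags = MissingIndicator(features="all").fit_transform(numerics)
    flags = pd.DataFrame(flags.astype(int), index=numerics.index,
                         columns=[f"{c}_missing" for c in numerics.columns])

    # Standardize (NaNs pass through), then impute with a MICE-style
    # chained-equations model using its default parameterization.
    scaled = StandardScaler().fit_transform(numerics)
    imputed = pd.DataFrame(IterativeImputer().fit_transform(scaled),
                           index=numerics.index, columns=numerics.columns)

    return pd.concat([cond, imputed, flags], axis=1)
```

In practice the scaler and imputer would be fit on training folds only and then applied to held-out data, so that no information leaks from the validation set into preprocessing.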
Outcome definitions

We construct definitions for mechanical ventilation, RRT, and readmission, and constrain our analysis to the earliest such event within a patient's available timeline. Note that we do not exclude patients who died during their index hospital course. As such, our outcome-positive cohorts contain patients who experienced the target outcome and died afterward. Conversely, outcome-negative cohorts contain patients who died without ever experiencing the target outcome. This choice is in line with our aim of constructing clinically useful prognostic models. Doing so requires that we construct our models using data for all available patients, including those who deteriorate so quickly that they expire before our target outcomes can take place, as well as those who deteriorate after all potential clinical interventions have been exhausted. See Table 1 for summary mortality statistics. We validate all our outcome definitions by iteratively sampling 50 to 100 patients, reviewing their clinical records to determine if our outcome definitions correctly classified their outcome status, and refining the outcome definitions to reduce misclassifications. Furthermore, we train preliminary models and review the clinical records for false positive and false negative patients to further revise our outcome definitions where appropriate.

Mechanical ventilation

In our Clinical Data Warehouse, structured data in electronic nursing flowsheets contain the most accurate observations and timestamps regarding a patient's mechanical ventilation status. From these flowsheets, we extract the mechanical ventilation onset times for each patient in our cohort. If a patient undergoes multiple mechanical ventilation episodes within a single hospital course, we use the earliest onset time to identify the first such episode.

RRT

We use nursing flowsheets to extract the onset time of RRT for each patient and restrict to the earliest such episode. In addition, we exclude patients with a likely history of RRT by eliminating patients whose records contained OMOP concepts related to end-stage renal disease or stage 5 chronic kidney disease (see Supplementary Table 4).

Readmission

Readmissions were defined as any emergency visit or inpatient admission occurring 1 to 7 days after a previous emergency room or inpatient discharge. To calculate the interval between an individual patient's visits, we take the time elapsed between the end of the earlier visit and the start of the subsequent one. Readmissions occurring within 1 day postdischarge were excluded, as these events are difficult to distinguish from transfers within an ongoing hospital stay. If multiple readmissions are observed, we focus on the one with the earliest start date.

Prediction models

We employ 3 types of models: L1-penalized logistic regression (logistic L1), elastic-net logistic regression (logistic EN), and gradient boosted trees (GBT). The former 2 are based on logistic regression, an effective model for clinical prediction tasks.18,19 GBTs are nonparametric models that have also shown strong clinical prediction performance.20-23 These models are relatively simple, interpretable, and straightforward to apply for prognostic modeling. These characteristics align well with the aims of the present study. Furthermore, each of these models has a built-in regularization mechanism.24-27 This is crucial in our setting, in which the number of features is on the order of the number of patients. We use Scikit-learn to implement each model type.16 Both logistic L1 and logistic EN have a hyperparameter, alpha, which controls the strength of regularization. In addition, logistic EN has a second, mixing hyperparameter, which controls the relative weight of the L1 vs L2 penalties. We use the default hyperparameter settings for GBT.

Model selection

Our model selection approach relies upon 2 performance metrics: the area under the receiver-operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). For each model and outcome, we conducted 5-fold cross-validation on the development cohort data, searching across different hyperparameters (alpha: a range of 10 equally spaced values spanning 1 × 10^-4 to 1 × 10^4; mixing: [0.3, 0.5, 0.7]). We select the model with the best average AUROC across all folds. For the selected model, we compute the mean AUROC and AUPRC across all folds.28 We obtain 95% confidence intervals (CIs) for all statistics by pooling the predicted probabilities and true labels across all folds within a reverse percentile bootstrap. For the validation cohort, we use the selected model to obtain outcome predictions and subsequently compute the reverse percentile bootstrap. For the development cohort, we use the pooled predicted probabilities and the true labels to generate the calibration curves. For the validation cohort, we use the predicted probabilities and true labels for the full cohort.
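As a hedged illustration of the model classes and the cross-validated, AUROC-driven selection described above, the sketch below uses Scikit-learn's SGDClassifier for the penalized logistic regressions, since it directly exposes a regularization strength (`alpha`) and a mixing parameter (`l1_ratio`). The mapping of the paper's "alpha" and "mixing" onto these arguments, and the log-scale spacing of the 10-value grid, are assumptions; the authors' exact estimator classes are not specified beyond "Scikit-learn."

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

# Candidate model classes and hyperparameter grids (assumed details).
CANDIDATES = {
    "logistic_l1": (
        SGDClassifier(loss="log_loss", penalty="l1", max_iter=5000),  # loss="log" in older scikit-learn
        {"alpha": np.logspace(-4, 4, 10)},
    ),
    "logistic_en": (
        SGDClassifier(loss="log_loss", penalty="elasticnet", max_iter=5000),
        {"alpha": np.logspace(-4, 4, 10), "l1_ratio": [0.3, 0.5, 0.7]},
    ),
    # Gradient boosted trees with Scikit-learn's default hyperparameters.
    "gbt": (GradientBoostingClassifier(), {}),
}

def select_best_model(X_train, y_train):
    """5-fold cross-validation over all candidates; select by mean AUROC."""
    best = None
    for name, (model, grid) in CANDIDATES.items():
        search = GridSearchCV(model, grid, scoring="roc_auc", cv=5)
        search.fit(X_train, y_train)
        if best is None or search.best_score_ > best[1]:
            best = (name, search.best_score_, search.best_estimator_)
    return best  # (name, mean cross-validated AUROC, fitted estimator)
```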
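The reverse (or "basic") percentile bootstrap used for the CIs pivots the bootstrap quantiles around the point estimate: the interval is [2t - q_hi, 2t - q_lo], where t is the point estimate and q_lo, q_hi are the 2.5th and 97.5th percentiles of the bootstrap replicates. A minimal sketch for the AUROC, assuming pooled predicted probabilities and true labels as described above:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def reverse_percentile_auroc_ci(y_true, y_prob, n_boot=2000, alpha=0.05, seed=0):
    """Reverse percentile bootstrap CI for the AUROC.

    y_true and y_prob are the pooled true labels and predicted
    probabilities (eg, gathered across all cross-validation folds).
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    point = roc_auc_score(y_true, y_prob)

    replicates = []
    n = len(y_true)
    while len(replicates) < n_boot:
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        if y_true[idx].min() == y_true[idx].max():
            continue  # AUROC is undefined when a draw has only one class; redraw
        replicates.append(roc_auc_score(y_true[idx], y_prob[idx]))

    q_lo, q_hi = np.quantile(replicates, [alpha / 2, 1 - alpha / 2])
    # Reverse percentile: pivot the bootstrap quantiles around the estimate.
    return point, (2 * point - q_hi, 2 * point - q_lo)
```

The same resampling loop works for the AUPRC by substituting average_precision_score for roc_auc_score.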
Feature importance

Feature importances for all models were evaluated using SHAP,29 a method for estimating instance-wise Shapley values, which represent fair estimates of the effect each feature has upon an outcome prediction. SHAP allows for instance-wise visualization, which for a given feature can demonstrate the distribution of the effect size and direction across the cohort.

This study was approved by CUIMC's institutional review board (IRB number AAAS9678).

RESULTS

Our final development and validation cohorts contained 2256 and 855 patients, respectively. The distributions of outcome and demographic variables for each cohort are presented in Table 1. The distributions of sex and race and the number of readmissions were not significantly different between cohorts (P values > .05). Significant differences were found in the distributions of age and ethnicity, the numbers of mechanical ventilation and RRT cases, and the numbers of patients with do-not-intubate or do-not-resuscitate status or who died during their hospitalization (P < .001).

Performance metrics for all models and outcomes on the development cohort are presented in Figure 2. Figure 3 shows the calibration curves for each outcome's selected model. In the development cohort, predicted probabilities for mechanical ventilation closely approximate the observed fraction of positive cases. Meanwhile, for both RRT and readmission, the predicted probabilities overestimate the fraction of positive cases. However, these estimates improve as the value of the predicted probability increases. Similar trends are observed for calibration in the validation cohort.

SHAP values for each outcome's selected model are visualized in Figure 4. Respiratory illnesses including acute hypoxemic respiratory failure, acute respiratory distress syndrome (ARDS), and acute lower respiratory tract infection served as positive predictors (positive SHAP values) for mechanical ventilation. High respiratory rate, high neutrophil count, hypoxemia, shock, and documented disease due to coronaviridae (ie, the presence of the concept code "Disease due to Coronaviridae" in a patient's clinical record) were also strong positive predictors. Greater age was negatively predictive (negative SHAP values).

Respiratory and renal illnesses including acute renal failure, acute hypoxemic respiratory failure, ARDS, and acute lower respiratory tract infection functioned as positive predictors for RRT. Several features drove the predicted likelihood either positively or negatively depending on their value. Serum creatinine, neutrophil count, C-reactive protein, and hyaline casts transition from negatively to positively predictive as values increase from low to high. Meanwhile, serum bicarbonate and calcium make the same transition as values decrease. Furthermore, the presence of procalcitonin, urea nitrogen-to-creatinine ratio, and glomerular filtration rate measurements was positively predictive for RRT.

Readmission prediction was driven positively by high values for temperature, hemoglobin, and oxygen saturation (SpO2). Conversely, it was driven negatively by low values for these variables. The opposite trend was observed for leukocyte count, respiratory rate, erythrocyte sedimentation rate (ESR), calcium, and erythrocyte distribution width. Fever and abdominal pain were positively predictive, whereas respiratory disorder and documented coronaviridae infection were negatively predictive. Missing values for laboratory tests including fibrin d-dimer, ferritin, procalcitonin, lactate dehydrogenase, ESR, and activated partial thromboplastin time were positively predictive.
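SHAP values like those reported above can be computed with the shap library; the sketch below assumes a fitted gradient boosted tree model and a feature matrix X (for the penalized logistic models, shap.LinearExplainer plays the analogous role). It illustrates the general recipe rather than the authors' exact analysis code.

```python
import numpy as np
import shap  # Shapley-value-based feature attribution library

def shap_summary(fitted_gbt, X, feature_names):
    """Per-patient SHAP values and a beeswarm summary plot for a tree model."""
    explainer = shap.TreeExplainer(fitted_gbt)  # exact algorithm for tree ensembles
    shap_values = explainer.shap_values(X)      # shape: (patients, features)

    # Each point is one patient: horizontal position is the effect on the
    # predicted risk; color marks whether the feature value was high or low.
    shap.summary_plot(shap_values, X, feature_names=feature_names)

    # Mean absolute SHAP value per feature gives an overall importance ranking.
    return np.abs(shap_values).mean(axis=0)
```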
DISCUSSION

Our results demonstrate that interpretable, performant prognostic models targeting resource-intensive outcomes important to the management of COVID-19 may be trained using routinely recorded clinical variables. For mechanical ventilation and RRT, our models use only the data available within the first 12 hours of a patient's hospital course. Thus, their predictions may be made available to clinicians actively managing COVID-19 patients. Meanwhile, for readmission, our model utilizes data gathered throughout the current stay, making predictions available by the end of a hospital course, when they would have the largest impact.

Our work extends and improves on the current state of the art in outcome prediction for COVID-19 patients. Our mechanical ventilation prediction model is competitive with the deep learning model introduced by Shashikumar et al.30 Though our objectives are distinct (their model targets hourly predictions), their validation AUROC (0.882) and AUPRC (0.209) lie near or within our 95% CIs. Our RRT prediction model demonstrates superior performance relative to previously described work that also utilized data from patients in New York City31; they obtained a validation AUROC of 0.79, which lies within our 95% CI (0.759-0.931). Though the current literature contains retrospective analyses studying the subpopulation of readmitted COVID-19 patients,32,33 to our knowledge, we are the first to describe a predictive model for COVID-19 patient readmission.

Each of our models demonstrates reasonably good calibration in the development and validation cohorts. Nevertheless, caution should be taken when interpreting our models' predicted probabilities as estimates of the true risk of the target outcome for a given patient; if such estimates are required, a method for posttraining calibration, such as isotonic regression,34 should first be employed (a brief sketch of such recalibration appears below).

The use of SHAP values illuminates which features are driving our models' predictions and in which directions. This information is vital for evaluating what our models have learned. Consistent with expectations, predicted likelihoods of mechanical ventilation and RRT correlated positively with markers of respiratory and renal distress, as well as markers of active infectious or inflammatory processes. Notably, patient age was negatively predictive of mechanical ventilation, which potentially reflects advance directives and clinical decision making rather than a lower incidence of severe respiratory failure. In addition, as described in recently published work,31 we find that respiratory distress is strongly associated with RRT.

Predicted probabilities for readmission were mostly driven by the absence of labs that would be ordered if clinical suspicion for an infectious process were high (eg, lactate dehydrogenase, C-reactive protein, ESR). This finding suggests that readmitted patients may not have been considered ill enough to warrant admission and thus were given only a limited clinical workup for COVID-19. In addition, high respiratory rates were negatively predictive for readmission, suggesting that signs of respiratory distress may be associated with presentation later in the disease course, prolonged evaluation, and hence decreased probability of near-term return after discharge.
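For readers who need calibrated risk estimates, the sketch below pairs the reliability-curve diagnostic used in Figure 3 with the isotonic posttraining recalibration suggested above, via Scikit-learn. The data splits and variable names are placeholders, not the authors' code.

```python
from sklearn.calibration import CalibratedClassifierCV, calibration_curve

def recalibrate_isotonic(base_model, X_train, y_train, X_val, y_val):
    """Isotonic posttraining calibration plus a reliability diagnostic."""
    # Each of 5 folds trains the base model on 4/5 of the data and fits an
    # isotonic regressor mapping its raw scores to observed outcome rates.
    calibrated = CalibratedClassifierCV(base_model, method="isotonic", cv=5)
    calibrated.fit(X_train, y_train)

    # Reliability curve on held-out data: for well-calibrated output, the
    # observed fraction of positives tracks the mean predicted probability.
    p_val = calibrated.predict_proba(X_val)[:, 1]
    frac_positive, mean_predicted = calibration_curve(y_val, p_val, n_bins=10)
    return calibrated, frac_positive, mean_predicted
```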
Figure 4. SHAP feature importances for ventilation, renal replacement therapy (RRT), and readmission. Each SHAP value plot displays a patient-level SHAP value as a point along the horizontal axis and uses color to indicate whether the feature value for a patient was higher (red) or lower (blue) than average. SHAP values >0 indicate increased risk for a patient; SHAP values <0 indicate decreased risk. This plot allows for visualization of the distribution of effect sizes, indicated by the spread of the points around 0, and shows the direction of the effect. As an example, a higher respiratory rate (red points all >0) indicates higher risk for ventilation. The average of the absolute SHAP values (shown in parentheses for each feature) across all points shows the overall importance of the feature. aPTT: activated partial thromboplastin time; MCHC: mean corpuscular hemoglobin concentration; SpO2: oxygen saturation.

Of note, the condition "Disease due to Coronaviridae" is a strong positive predictor for mechanical ventilation and a negative predictor for readmission. This suggests that the subset of patients whose documentation contains this concept code may be suffering from more severe disease on admission, as such patients are more likely to require invasive intervention (ie, mechanical ventilation) and are unlikely to be discharged early and subsequently readmitted.

With further development and prospective validation, our outcome prediction models could potentially be utilized in practice to inform triage and resource allocation. Patients with high estimated risk of mechanical ventilation could be monitored more closely with continuous pulse oximetry and given early, noninvasive interventions such as self-proning.35 Care could be taken to place these patients in beds with easy access to advanced oxygen therapies like high-flow nasal cannula and noninvasive positive pressure ventilation, resources that are typically not evenly distributed throughout a hospital. Similarly, providers could ensure that patients at high risk for RRT are placed in locations with the personnel and equipment needed to deliver this service. Such patients may also benefit from renal-protective therapeutic strategies such as setting a higher threshold for use of nephrotoxic agents, managing ARDS with less aggressive volume restriction, and early nephrology consultation for AKI, an intervention that has been associated with improved renal prognosis.36,37 Additionally, given the relative paucity of dialysis equipment and appropriately trained staff during a pandemic surge, awareness of the risk of AKI requiring RRT could allow for improved resource planning and appropriate timing of surge protocols such as shared continuous RRT and acute peritoneal dialysis. Finally, patients at high risk of readmission could be re-evaluated for discharge, provided more intense monitoring, or given additional support, such as a visiting nurse, that could help avoid a readmission while also lowering the risk of these patients decompensating at home.

Though our models demonstrate strong performance on the development cohort, we must also acknowledge that this performance deteriorates significantly when they are applied to the validation cohort. This observation speaks to the care practitioners must take when developing models on one patient population and applying them to another. In our case, the development cohort was drawn from a major medical center, while the validation cohort came from a small community hospital.
It is likely that these 2 populations contain very different patients who experience very different care practices. The result is development and validation datasets that differ in both the spectrum of observed variable values and the frequency and pattern of variable missingness. These differences limit our models' ability to generalize what they learned on the development cohort to the validation cohort and are likely a major driver of the performance degradation on the latter. We also considered whether differences in resource constraints, specifically regarding equipment and materials needed for mechanical ventilation and RRT, could have contributed to our models' performance degradation on these outcomes in the validation cohort. However, owing to changes in care practices (eg, having 2 patients share a single continuous RRT machine within a 24-hour on/off cycle) and acquisition of additional materials and equipment, neither site ever came close to exhausting its supplies. This suggests that resource constraints played at most a marginal and indirect role in limiting our models' performance on the validation cohort.

We acknowledge several important limitations to this work. Our data were derived from a single hospital network. This limits the generalizability of our results to other institutions, as we cannot capture the out-of-network variability in COVID-19 population characteristics and care practices. This limitation extends to our validation experiments, which used data from an in-network community hospital. It also complicates our modeling of readmission: our positive readmission cases are limited to those patients whose discharge and readmission both occurred in our hospital network, so discharged patients who were subsequently admitted elsewhere would appear as negative cases in our models.

We adopted a feature-agnostic approach when choosing which variables to include in our models. This allowed us to model many of the observations in the clinical record, but it also complicates the models' utility. To extract risk estimates from our models, a user will need to replicate our feature engineering and apply it to their local data stores. Thus, they will likely need a pipeline inputting clinical observations directly into the model from the EHR. Modeling many variables also introduced a significant amount of missing values (see Supplementary Table 3). To handle these, we used imputation strategies like MICE, which assume that the data are missing at random, even though our data are likely missing not at random. As such, it is likely that our fitted model parameters are biased.38 However, as we are primarily concerned with optimizing prediction, we are willing to trade off model parameter bias for predictive performance by modeling the imputed data along with the observed missingness pattern.39

Our use of the OMOP common data model also introduced challenges and limitations. The first of these concerns our outcome definitions, which relied on structured fields in nursing flowsheets. As these data are not part of the OMOP common data model, replicating our definitions at other sites may be difficult. Second, during the extraction, transformation, and loading of data into the OMOP common data model, we may have lost some observations. This is a likely source of the unusually large amount of missingness in routinely collected clinical measurements such as vital signs and plasma and serum electrolyte labs.
In conclusion, we have trained and validated prognostic models targeting 3 significant, resource-intensive outcomes in the context of COVID-19: mechanical ventilation, RRT, and hospital readmission. Our models run on routinely collected clinical variables and produce accurate, interpretable predicted likelihoods for each outcome. Additional external validation studies are needed to further verify the generalizability of our results.

FUNDING

VAR is supported by grant F31LM012894 from the National Institutes of Health (NIH), National Library of Medicine (NLM). SB is supported by grant 5T15LM007079 from the NIH, NLM. AP is supported by grant R01HL148248 from the NIH, National Heart, Lung, and Blood Institute. The funders played no direct role in the present work with regard to study concept and design; acquisition, analysis, and interpretation of data; statistical analysis; manuscript drafting and revision; and supervision.

AUTHOR CONTRIBUTIONS

AP, KN, RC, SM, and HG contributed to concept and design. SB, CP, AP, VAR, KN, and SS contributed to acquisition, analysis, and interpretation of data. SB contributed to statistical analysis. VAR, SB, and RC contributed to manuscript drafting. VAR, RC, SM, NE, GH, KSM, JA, PE, RG, and HG contributed to manuscript revision. AP was involved in supervision.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.

REFERENCES

Impending shortages of kidney replacement therapy for COVID-19 patients.
Critical supply shortages: the need for ventilators and personal protective equipment during the Covid-19 pandemic.
These places could run out of hospital beds as coronavirus spreads. The New York Times.
Characterization and clinical course of 1000 patients with coronavirus disease 2019 in New York: retrospective case series.
Clinical characteristics of COVID-19 in New York City.
Northwell COVID-19 Research Consortium. Presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with COVID-19 in the New York City area.
Factors associated with hospital admission and critical illness among 5279 people with coronavirus disease 2019 in New York City: prospective cohort study.
Outcomes for patients with COVID-19 and acute kidney injury: a systematic review and meta-analysis.
Hospital readmissions of discharged patients with COVID-19.
Predicting the future: big data, machine learning, and clinical medicine.
Axes of a revolution: challenges and promises of big data in healthcare.
Epidemiology, clinical course, and outcomes of critically ill adults with COVID-19 in New York City: a prospective cohort study.
Validation of a common data model for active safety surveillance research.
Scikit-learn: machine learning in Python.
Multivariate imputation by chained equations in R.
Van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models.
Review and evaluation of penalised regression methods for risk prediction in low-dimensional data with few events.
Assessing fracture risk using gradient boosting machine (GBM) models.
Prediction of outcome in acute lower gastrointestinal bleeding using gradient boosting.
Gradient boosting for high-dimensional prediction of rare events.
Machine learning classifier models can identify acute respiratory distress syndrome phenotypes using readily available clinical data.
Regression shrinkage and selection via the Lasso.
Regularization and variable selection via the elastic net.
Statistical Learning with Sparsity: The Lasso and Generalizations.
Greedy function approximation: a gradient boosting machine.
Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates.
A unified approach to interpreting model predictions.
Development and prospective validation of a transparent deep learning algorithm for predicting need for mechanical ventilation.
AKI in hospitalized patients with COVID-19.
Characterization of patients who return to hospital following discharge from hospitalization for COVID-19.
The clinical features and outcomes of discharged coronavirus disease 2019 patients: a prospective cohort study.
Isotonic median regression: a linear programming approach.
Early self-proning in awake, nonintubated patients in the emergency department: a single ED's experience during the COVID-19 pandemic.
Early nephrology consultation can have an impact on outcome of acute kidney injury patients.
Delayed nephrology consultation and high mortality on acute kidney injury: a meta-analysis.
Inference and missing data.
Missing data should be handled differently for prediction than for description or causal explanation.

The following authors had full access to the electronic health record data used in this work: AP, KN, CP, SB, VAR, and RC.

The authors have no conflicts of interest to report as pertains to their financial interests, activities, relationships, and affiliations.