key: cord-0721395-1wy5yzhh authors: Schlauch, D.; Fisher, A. M.; Correia, J.; Fu, X.; Martin, C.; Junglen, A.; Burris, H. A.; Sears, L. E.; Fromell, G.; Correll, M.; LeMaistre, C. F.; Arnold Egloff, S. A. title: Development of a Real-Time Risk Model (RTRM) for Predicting In-Hospital COVID-19 Mortality date: 2021-04-28 journal: nan DOI: 10.1101/2021.04.26.21256138 sha: 4971279085568cfc8e9632806c4325b75939d28a doc_id: 721395 cord_uid: 1wy5yzhh Background: With over 83 million cases and 1.8 million deaths reported worldwide by the end of 2020 for SARS-CoV-2 (COVID-19), there is an urgent need to enhance identification of high-risk populations to properly evaluate therapy effectiveness with real-world evidence and improve outcomes. Methods: Baseline and daily indicators were evaluated using electronic health records for 46,971 patients hospitalized with COVID-19 from 176 HCA Healthcare-affiliated hospitals, presenting from March to September 2020, to develop a real-time risk model (RTRM) of all-cause, hospitalized mortality. Patient facility, dates-of-care, clinico-demographics, comorbidities, vitals, laboratory markers, and respiratory support findings were aggregated in a logistic regression model. Findings: The RTRM predicted overall mortality as well as mortality 1, 3, and 7 days in advance with an area under the receiver operating characteristic curve (AUCROC) of 0.905, 0.911, 0.905, and 0.901 respectively, significantly outperforming a combined model of age and daily modified WHO progression scale (all p<0.0001; AUCROC, 0.846, 0.848, 0.850, and 0.852). The RTRM delineated risk at presentation from ongoing risk associated with medical care and showed that mortality rates decreased over time due to both decreased severity and changes in care. Interpretation: To our knowledge, this study is the largest of its kind to comprehensively evaluate predictors and incorporate daily risk measures of COVID-19 mortality. The RTRM validates current literature trends in mortality across time and allows direct translation for research and clinical applications. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint Due to the rapidly evolving nature of the COVID-19 pandemic, the body of evidence and published literature was considered prior to study initiation and throughout the course of the study. Although at study initiation there was a growing consensus that age and disease severity at presentation were the greatest contributors to predicting in-hospital mortality, there was less of a consensus on the key demographics, comorbidities, vitals and laboratory values. In addition, early on, most potential predictors of in-hospital mortality had been assessed by univariable analysis. In April of 2020, a systematic review of prediction studies for COVID-19 revealed that there were only 8 publications for prognosis of hospital mortality. All were deemed to have high potential for bias due to low sample size, model overfitting, vague reporting and/or insufficient follow-up. Over the duration of the study, in-hospital prediction models were published ranging from simplified scores to machine learning. There were at least 8 prediction studies that were published during the course of our own that had comparable sample size or extensive multivariable analysis with the greatest accuracy of prediction reported as 74%. Moreover, a report in December of 2020 independently validated 4 simple prediction models, with none achieving greater than an AUCROC of 0.72%. Lastly, an eight-variable score developed by a UK consortium on a comparable sample size demonstrated an AUCROC of 0.77. To our knowledge, however, none to-date have modeled daily risk beyond baseline. We frequently assessed World Health Organization (WHO) resources as well as queried both MedRXIV and PubMed with the search terms "COVID", "prediction", "hospital" and "mortality" to ensure we were assessing all potential predictors of hospitalized mortality. The last search was performed on January 5, 2021 with the addition of "multi", "daily", "real time" or "longitudinal" terms to confirm the novelty of our study. No date restrictions or language filters were applied. To our knowledge, this study is the largest and most geographically diverse of its kind to comprehensively evaluate predictors of in-hospital COVID-19 mortality that are available retrospectively in electronic health records and to incorporate longitudinal, daily risk measures to create risk trajectories over the entire hospital stay. Not only does our Real-Time Risk Model (RTRM) validate current literature, demonstrating reduced mortality over the course of the COVID-19 pandemic and identifying age and WHO severity as major drivers of mortality in regards to baseline characteristics, but it also outperforms a model of age and daily WHO score combined, achieving an AUCROC of 0.91 on the test set. Furthermore, the fact that the RTRM delineates risk at baseline from risk over the course of care allows more granular interpretation of the impact of various parameters on mortality risk, as demonstrated in the current study using both sex disparity and calendar epochs that were based on evolving treatment recommendations as proofs-of-principle. The goal of the RTRM was to create a flexible tool that could be used to assess intervention and treatment efficacy in real-world, evidence-based studies as well as provide real-time risk assessment to aid clinical decisions and resourcing with further development. Implications of this work are broad. The depth of the multi-facility, harmonized electronic health record (EHR) dataset coupled with the transparency we provide in the RTRM results provides a resource for others to interpret impact of markers of interest and utilize data that is relevant to their own studies. The RTRM will allow optimal matching in retrospective cohort studies and provide a more granular endpoint for evaluation of interventions beyond general effectiveness, such as optimal delivery, including dosing and timing, and identification of the population/s benefiting from an intervention or combination of interventions. In addition, beyond the scope of the current study, the RTRM and its resultant daily risk scores allow for flexibility in developing prediction models for other clinical outcomes, such as progression of pulmonary disease, need for invasive mechanical ventilation, and development of sepsis and/or multiorgan failure, all of which could provide a framework for real-time personalized care. A cluster of pneumonia cases in Wuhan, China in December of 2019 has become a global pandemic, with over 83 million cases of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and 1·8 million deaths reported from COVID-19 worldwide by the end of 2020. 1-3 Despite unprecedented vaccine efforts, the pandemic is expected to remain a persistent threat given the challenges of global deployment. Therefore, there is an urgent need to identify higher risk populations as well as more accurately measure ongoing disease severity and progression to identify effective therapies. Moreover, due to practical challenges associated with operating and recruiting patients to randomized clinical trials, risk-adjusted retrospective cohort studies of real-world evidence have been essential to understanding treatment effects for COVID-19. These non-randomized studies, however, are inherently biased due to possible incomplete inclusion of prognostic indicators. Given the global impact of the pandemic on healthcare system resources, there is critical need to identify patients at higher risk for developing lung and multiorgan dysfunction in real-time in order to intervene early and properly allocate resources. There is also a need to rapidly and systematically evaluate clinical interventions while accounting for robust risk adjustment, concurrent medications, and an evolving interventional landscape. Thus, considering the patient timeline and trajectory of risk throughout hospitalization is critical. In this first proof-of-principle study, we leverage hospital data from HCA Healthcare, the largest private healthcare system in the U.S., to train and predict all cause, hospitalized mortality in a cohort of 46,971 patients. Our resultant real-time risk model (RTRM) is able to assess patient prognosis at presentation and throughout hospitalization. Moreover, the RTRM validates predictors for inclusion in retrospective, matched cohort studies for evaluating treatment effectiveness and produces a granular daily risk score that serves as a surrogate of disease progression and/or improvement. Importantly, we demonstrate the superiority of the RTRM at predicting hospitalized mortality in comparison to a combined model of age and the daily modified World Health Organization (WHO) clinical progression scale as the current benchmark measure of disease severity. [4] [5] [6] [7] [8] Finally, we delineate baseline effects from responses across hospital care for calendar date epochs and for sex by comparing daily risk trajectories. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint This study was supported by HCA Healthcare and deemed exempt, non-human subjects research by the governing institutional review board. The design, analysis, and data interpretations were conducted independently by investigators. All authors testify to the accuracy and completeness of the data with the understanding that there may be issues in real-world records outside of our awareness. Electronic health records (EHRs) were compiled for patients hospitalized with laboratory-confirmed SARS-CoV-2 infection, representing COVID-19 disease, at 176 HCA Healthcare-affiliated facilities across the Unites States between March 2 and September 23 and were followed through October 7, 2020. A COVID-19 confirmed case was defined as a positive and/or presumptive positive SARS-CoV-2 result regardless of assay platform. A COVID-19associated hospitalization was defined by the date of hospital presentation to the date of discharge or death; whereby, all consecutive inpatient encounters such as emergency room or holding area stays, inter-or intra-HCA-affiliated hospital transfers, and/or readmissions were considered as a single hospitalization. For patients with multiple COVID-19-associated inpatient encounters separated by greater than 36 hours, only data from the first hospitalization was used. Only those patients that presented to the hospital prior to September 23, 2020 and experienced an outcome of either "discharged alive" or "death" by October 7, 2020 were included, which was 93.8% (46,971/50,059) of the overall source population ( Figure 1A ). Patients that were discharged or expired on the same date of hospitalization were excluded from the analysis. Data was collected from EHRs (Epic, Cerner, Meditech) and compiled in an enterprise data warehouse. Further data processing and standardization was done using Genospace, a cloud-based biomedical data ingestion and transformation platform. This study adhered to the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) reporting guidelines. Missing value handling depended on the variable and is detailed in the supplemental methods ( Figure S1 ). Medications were purposely excluded from the RTRM to facilitate downstream assessments of treatment effectiveness. Data collection included patient demographics, admitting and primary treating facility, smoking status, and blood type, as well as certain comorbidities documented through past and current ICD-9-CM and ICD-10-CM diagnostic codes (Table S1 ). Comorbidities of interest included diabetes with chronic complications, diabetes without chronic complications, hypertension, chronic ischemic heart disease, congestive heart failure, renal disease (mild or moderate), renal disease (severe), asthma or reactive airway disease, chronic pulmonary disease (excluding asthma), HIV infection, and cancer (including solid tumors and blood cancers but excluding non-melanoma skin cancer). Time-course data were indexed by each day of hospitalization with baseline (t=0) defined as the date of hospital presentation (emergency room or direct admission). Longitudinal data on vitals, laboratory values, and level of respiratory support were collected throughout the course of . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint hospitalization. Additional data processing included standard unit transformation and validation of expected values; erroneous data-points outside of plausible range were removed and considered missing. Missing data were imputed for all patient-days with no available data for a particular variable. When multiple values were present for a particular patient-day, the mean of the observed values was used. Patients were categorized based on a modified 6-point scale, adapted from the WHO R&D Blueprint group to assess disease severity and measure clinical improvement. 4 The modified 6point scale, subsequently, the WHO Progression Scale (WHO PS), is as follows: 1, discharged alive; 2, hospitalized, no supplemental oxygen; 3, hospitalized, low-flow supplemental oxygen; 4, hospitalized, non-invasive or high-flow oxygen including continuous positive airway pressure (CPAP) and bi-level positive airway pressure (BIPAP); 5, hospitalized, invasive mechanical ventilation or extracorporeal membrane oxygenation (ECMO); 6, expired. Modification excluded points that could not be reliably defined by hospital EHR data. Patients were assigned a score at presentation and each day following based on their most severe status that day, except day-ofdischarge, where they were assigned 1. Clinical outcome measures were also collected, including WHO PS at discharge or death (primary outcome measure); maximum prior WHO PS; time on ventilator and time to first ventilation; intensive care unit (ICU) status, defined as receiving care in the ICU at any point during hospitalization; length of stay, defined as the time between date of presentation and death or discharge; and length of ICU stay, defined as the time between ICU admission and death or discharge from the ICU. Complications included development of acute respiratory distress syndrome (ARDS), defined as PaO2/FiO2 ratio ≤300 mmHg and a daily WHO PS of 3 or higher, as well as pneumonia, sepsis, and bacteremia based on ICD-10-CM codes (Table S2) . We evaluated the COVID-19 cohort as six independent date-of-hospital-presentation epochs to account for changes in treatment recommendations by the HCA Healthcare Clinical Operations Group over time in 2020. These epochs are detailed in Table 1 . The relationship between labs, vitals, or other predictors with clinical outcomes is complex and nonlinear ( Figures S2 and S3 ). Three derived features of each time-course variable were engineered to consider binary, continuous, and outlier-driven relationships, which included (1) the raw data itself, (2) a categorical representation of the data bucketed into 1, 10, 25, 50, 75, 90, and 99 th percentile cut-points, and (3) a normalized version fitting to Gaussian distribution. Two additional features were generated for longitudinal data to account for momentum or the change in values compared to the prior day (1-day delta) and two days prior (2-day delta), where values were set to 0 for patients in their first day after hospital presentation or first two days, respectively. Higher values indicate current increasing trend. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint Statistical interactions across a number of features were evaluated. Details are provided in supplemental methods. Given that ~400 features were investigated in the RTRM, candidate interaction testing was limited to those features that were clinically motivated or most significant as main effects to keep the model computationally tractable ( Figure S4 ). We employed multiple imputation methods to impute missing data, which are detailed in supplemental methods and Figure S5 . Patients were randomly split into training and testing sets with a 50:50 ratio ( Figure 1B ). Imputation and feature engineering were performed separately on each set following assignment. The RTRM can predict mortality for any n-days in advance; three n-days were reported (next-day, next-3-days, next-7-days) along with overall mortality by assigning a binary indicator to the corresponding days preceding death for each patient. Overall mortality at presentation was also modeled, which only assessed baseline risk. Each predictor consisted of the vector of all engineered features mapped to each patient-day. Model coefficients were estimated using logistic regression with group-wise elastic net regularization ( Figure S6 ). Additional model fitting and bootstrap validation details are described in the supplemental methods. Model performance was measured against age and daily WHO PS, alone and in combination, using the area under the receiver operating characteristic curve (AUCROC). Age and daily WHO PS were used to predict mortality n-days in advance in an unpenalized logistic regression model. The predictive scores for each of these reference models were evaluated in the testing set, along with the RTRM, and ROC curves were graphed and AUCROC computed for each of the n-day mortalities and for each day following hospital presentation. All authors are employed by an affiliate of the sponsor, HCA Healthcare, who provided resources for this research and was involved in the decision to submit for publication. Data collection, analysis, and the writing of the report were performed independently by investigators. A total of 46,971 hospitalized patients were included in the risk model ( Figure 1A ). The cohort median age was 63 years with an interquartile range (IQR) of 49-76 and was 51·6% male, 20·3% Black or African American, and 51·6% White or Caucasian (Table 2) . Comorbidities most prevalent were hypertension (62·2%), diabetes (39·5%), and chronic ischemic heart disease (21·0%). Among the cohort, 15·6% expired, 33·2% received ICU care, and 15·1% received invasive mechanical ventilation and/or ECMO during hospitalization ( Table 3 ). The median hospital length-of-stay was 6 days for discharged patients and 11 days for expired patients (Table 3) . . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 28, 2021. The RTRM demonstrated strong performance in predicting mortality in real-time. For each day a test set patient remained hospitalized, the RTRM produced predictions for outcomes of next-day, next-3-days, next-7-days, and overall mortality, which resulted in an AUCROC of 0·911, 0·905, 0·901, and 0·905, respectively (Figure 2A-D) . We compared the RTRM result to the performance of two benchmarks of COVID-19 mortality, age and the daily WHO PS. [4] [5] [6] The AUCROCs were significantly lower for all models using either age alone (all p<0·0001; AUCROCs, 0·653, 0·656, 0·658, and 0·653) or the daily WHO PS alone (all p<0·0001; AUCROCs, 0·804, 0·803, 0·801, and 0·792; Table S3 and Figure 2A-D) . Importantly, the RTRM significantly outperformed the model combining both age and the daily WHO PS (all p<0·0001; AUCROCs, 0·848, 0·850, 0·852, and 0·901; Table S3 and Figure 2A -D). This consistent outperformance across each outcome highlights the strength of the RTRM to predict mortality in the days preceding death. Using only baseline risk for prediction of overall mortality at hospital presentation rather than daily risk assessments, the RTRM achieved an AUCROC of 0·874, compared to 0·721, 0·629, and 0·776 for age and daily WHO PS alone and in combination, respectively (all p<0·0001; Figure 2E ). Furthermore, the fact that the RTRM AUCROC of overall mortality using daily risk was 0·031 higher than the overall mortality at hospital presentation highlights the important contribution of longitudinal risk to model accuracy. While the baseline accuracy is high, the added value of the RTRM persists with a greater than 0·80 AUCROC over the entire course of hospitalization, out to 30 days ( Figure 2F ). A primary purpose of the RTRM is a more granular elucidation of deteriorating health and increased mortality risk for patients over their hospitalization. The performance of the RTRM is demonstrated quantitatively with AUCROCs for next-day, next-3-days, next-7-days, and overall mortality (Figure 2A -F) and by swimmer plots provided as visualization of individual patient risk across length-of-stay. Patients were grouped by their eventual outcome, death or discharge, to highlight the divergence in risk scores that was observable with the RTRM well in advance of their eventual outcome ( Figure 3 ). The evaluation of the most impactful predictors obtained from fitting the RTRM confirm previously known risk factors in COVID-19 and suggest interesting avenues for further investigation. 9 Figure 4A shows the top 100 predictors by odds ratio (OR) associated with mortality by univariable analysis. Figure 4B shows the influence of all features in the RTRM. Although medications were excluded from the RTRM, frequencies of the predominant COVID-19 medications is provided in Table S4 . Cut-points for each laboratory measure percentile included in the RTRM are provided (Table S5) . Severe sepsis (OR=5·32, p<0·0001) and WHO PS of 5 (OR=3·02, p<0·0001) contribute the most to identifying future mortality ( Figure 4 ). Age provides the greatest baseline demographic impact, with a 16·8 year increase in age associated with a 16·6% increase in mortality (p<0·0001); an effect likely minimized due to inclusion of competing risk factors such as comorbidities. The unadjusted association of age with mortality was found to be 1·87, 95% CI [1·84, 1·89] ( Figure 4A ). Figure S7 displays all variables used in the RTRM with the adjusted odds ratios. Overall, most of the blood markers showed appropriate dose responses with ferritin, interleukin-6, bilirubin, . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint and white blood cell count having the largest distribution of response or ability to discriminate risk. Additionally, most comorbidities evaluated were significant independent predictors of mortality; albeit, the interpretation of the direction and effect size is limited due to the complexity of the RTRM ( Figure S7 ). The mortality rate among patients hospitalized with COVID-19 has dropped substantially since the start of the pandemic. 10, 11 Improvements in clinical support care, changes in therapeutic regimens, evolution of the hospitalized patient population in regards to clinico-demographics, and alterations in hospital admission patterns or patient behavior have been speculated as reasons for improved mortality. The RTRM was utilized to investigate patterns across six epochs of time delineating major shifts in medication recommendations (Table 1) . We evaluated the daily risk scores for all patients in our test set during hospitalization ( Figure 5A , C) as well as restricted the evaluation to those patients who had not reached an endpoint for each day following hospital presentation (active; Figure 5B , D). Figure 5A shows the RTRM risk over days from presentation and stratified by the six epochs. The patient cohort presenting to hospitals during Epochs 1 and 2 had substantially increased mortality compared to all others. Risk increased rapidly over the first 10 days of Epochs 1 (slope=0·0044, p<0·0001) and 2 (slope=0·0043, p<0·0001); whereas, Epochs 3 to 5 experienced no change in risk ( Figure 5A ). Strikingly, Epoch 6 actually experienced reductions in risk over the first 10 days (slope= -0·0019, p<0·0001). These reductions in slopes across epochs suggest improvements in care led to improved outcomes over time, marked by reductions in severe sepsis for example (Table 3) . These data suggest improvement in clinical care may have occurred between Epochs 2 and 3, which corresponds with the shift in treatment recommendation towards the use of low-dose steroids in the late pulmonary phase, and between Epochs 5 and 6, which corresponds with a shift from restricting convalescent plasma to severe patients to less severe patients based on regulatory approvals (Table 1) . Furthermore, there was a meaningful separation of risk at hospital presentation (Day 0), with Epochs 1 and 2 at greatest risk, Epochs 3 to 5 having moderate risk, and Epoch 6 having much lower risk ( Figure 5 ). This suggests that disease severity and/or risk factors evolved over time. Interestingly, neither age nor any of the comorbidities consistently changed over epochs. There were, however, relatively fewer Asian and African Americans and an increase in Caucasians by Epoch (Table 2 ). There was a negative trend in the frequency of baseline mechanical ventilation/ECMO with a concomitant increase in the use of non-invasive ventilation or high-flow oxygen at hospital presentation ( Table 2 ). This suggests decreases in mortality rates over time may also be attributed to patients either presenting to the hospital or being admitted with less severe disease. Lastly, showing the trajectory of RTRM mortality risk from days of hospital presentation for active patients only, indicate that risk of death increased for each day of hospitalization until days 10 to 14 ( Figure 5B, D) . Risk of mortality then remained stable or decreased, as visualized most strikingly in Epoch 1 but demonstrated by all other Epochs ( Figure 5B ) and both males and females ( Figure 5D ). . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint We evaluated the trajectories of risk scores according to sex. The smoothed risk averages displayed in Figure 5C and D suggest that disparities in sex mortality can be attributed to differences in risk at hospital presentation. Following the separation at day 0, both men and women follow similar risk trajectories for the first 20 days for both overall ( Figure 5C ) and active patients only ( Figure 5D ). Quantifying the risk of mortality at a specified time for patients hospitalized with COVID-19 is critical to understanding the impact of clinical interventions. The widespread impact of the COVID-19 pandemic yielded a broad spectrum of patients with diversity across geography, timeframe, medical history, and demographics. Patient prognoses differ substantially based on these factors. Recent studies highlighted the importance of age and oxygen saturation as key prognostic factors by multivariable analyses, but none evaluated risk beyond admission. [5] [6] [7] [8] 12 Utilizing EHRs for 46,971 patients hospitalized with COVID-19, we evaluated hundreds of elements to develop a real-time risk model (RTRM) of all-cause, hospitalized mortality. Both the daily WHO PS and the age models perform well at predicting mortality as previously reported. [4] [5] [6] Therefore, it is noteworthy that the RTRM is superior to the combined age and the daily WHO PS model for next-day, next-3-days, next-7-days, and overall mortality. Additionally, due to longitudinal daily assessment, the RTRM was able to separately evaluate differences in mortality based on risk at presentation versus risk during the course of care. The RTRM revealed that reduction in mortality rates over time during the pandemic was associated with not only reduction in baseline risk due to disease severity and shifts in demographics, but also suggested improvement in care over time. Notably, we did not find differences in age across epochs. Although other studies have shown that males are at increased risk of death due to COVID-19, the RTRM suggests this increase is primarily due to severity at presentation rather than decreased response to clinical care. Finally, evaluation of effect size and significance of all predictors in the RTRM multivariable regression model not only corroborates most of the current key predictors of mortality in COVID-19 such as age, WHO PS, male sex, comorbidities, and markers of inflammation and organ dysfunction, but also shows clear dose responses for critical laboratory measures. 13 Our study has several limitations, some of which is inherent to real-world EHR data. In particular, because ICU care status was based on EHR location data, ICU misclassification may have occurred due to creation of COVID-19 isolation wards within ICUs, regardless of the necessity for ICUlevel care. Additionally, the complexity of the RTRM reduces its current applicability to clinical trials and other healthcare systems; however, we have provided a full list of variables and their contributions to prediction of hospitalized mortality, for further scrutiny and to allow incorporation of top contributors in other research and clinical applications. Future directions could focus on both research and clinical applications. First, the depth of the cross-facility, harmonized EHR data allows for more robust risk adjustment in retrospective, . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint matched cohort studies and provides a granular endpoint to evaluate interventions beyond general effectiveness towards optimal delivery, including timing and dosing. Indeed, medications were purposely excluded from the RTRM in order to facilitate the use of the model in downstream interventional studies. In an environment where randomized control trials are challenging to operate and recruitment is limited, the RTRM allows real-world evidence to identify which populations benefit from an intervention or combination of interventions. Second, based on the top contributors to the RTRM that can be reliably measured, a simplified risk score can be developed to provide ease of calculation and broader applicability. In addition, RTRMs that predict other clinical outcomes, such as progression of pulmonary disease and development of sepsis and/or multiorgan failure should be investigated. Future work should consider readmissions related to COVID-19 complications and chronic COVID-19 etiologies, which is particularly relevant given studies show that 9-15% of patients hospitalized due to COVID-19 and discharged are readmitted within 60 days. 14, 15 In summary, to our knowledge, the RTRM is the first of its kind to evaluate daily risk of hospitalized mortality for patients with COVID-19. Not only does the RTRM outperform benchmark predictors of mortality, but it lays down a framework for optimizing future research and personalized care utilizing real-world evidence. DS contributed to study design, all analyses, modeling, results interpretation and manuscript preparation. AMF contributed to data curation, sourcing and mapping as well as manuscript preparation and review. JC contributed to data mapping, data verification, manuscript preparation and review. XF contributed to data mapping, data verification, and analysis review. CM contributed to EHR data curation and sourcing, data verification, and manuscript review and editing. AJ contributed to study design, data verification and manuscript review and editing. HAB provided general oversight and contributed to results interpretation and manuscript review. LES provided oversight and contributed to overall study design, results interpretation, and manuscript review. GF provided general oversight and contributed to results interpretation and manuscript review. MC contributed to overall study design, data acquisition, and results interpretation. CFL contributed to study design, results interpretation and manuscript review. SAAE oversaw all aspects of the study including conceptualizing the study design and directing its implementation, as well as contributed to the development of the statistical analysis plan, interpretation of results, and manuscript writing. The views expressed in this publication represent those of the authors and do not necessarily represent the official views of HCA Healthcare or any of its affiliated entities. None of the authors declared any conflict of interest related to the current study beyond employment with an affiliate of HCA Healthcare. HAB reported grant funding from MedImmune, Boehringer Ingelheim, Merck, Moderna, Verastem, Harpoon, Jounce, Janssen, BIND Therapeutics, Pfizer, Vertex, Gilead, Bayer, Incyte, AstraZeneca, Novartis, Seattle Genetics, GlaxoSmithKline, BioAtla, Agios, BioMed Valley, TG Therapeutics, eFFECTOR, CicloMed, Array, Roche/Genentech, Arvinas, Bristol-Myers Squibb, Macrogenics, CytomX, Arch, Revolution Medicine, Lilly, Tesaro, Takeda/Millennium, miRNA, Kyocera and Foundation Medicine (all paid to his institution); consulting fees from Incyte, AstraZeneca, Celgene, and Forma Therapeutics (all paid to his . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint institution); non-compensated consulting services from Novartis, Bayer, Pfizer, GRAIL and Daiichi Sankyo; expert testimony from Novartis (paid to his institution) and stock ownership in HCA Healthcare. DS, AMF, JC, MC, and SAAE reported stock ownership in HCA Healthcare. The data that support the findings of this study are available upon request from the corresponding author. The data are not publicly available due to privacy restrictions. Extensive detail on all variables utilized by the RTRM has been provided in the supplemental materials, including significance and effect size. The project was sponsored by HCA Healthcare where all authors are employed by an affiliate. HCA Healthcare prioritized and provided resources for this research and was involved in the decision to submit for publication. Data collection, analysis, and the writing of the report were performed independently by employees. manuscript support. We would also like to acknowledge Mandelin Cooper, Mitch Davis and Laura McLean in the Clinical Operations Group for their parallel COVID-19 work that provided insight into HCA Healthcare treatment recommendations for the date epochs as well as variable sourcing and mapping suggestions. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. Data are n (%) unless otherwise indicated. Epochs were all in 2020 and were defined as follows based on changes in treatment recommendations: Epoch 1, March 1 to April 2; Epoch 2, April 3 to April 29; Epoch 3, April 30 to May 19; Epoch 4, May 20 to July 5; Epoch 5, July 6 to August 23; Epoch 6, August 24 to September 22. † BMI and laboratory and vital values at presentation are presented as median (IQR). IQR=interquartile range. BMI=body mass index. ECMO = extracorporeal membrane oxygenation. * Level of respiratory support at presentation followed the WHO Progression Scale. Data are n (%) unless otherwise indicated. Epochs were all in 2020 and were defined as follows based on changes in treatment recommendations: Epoch 1, March 1 to April 2; Epoch 2, April 3 to April 29; Epoch 3, April 30 to May 19; Epoch 4, May 20 to July 5; Epoch 5, July 6 to August 23; Epoch 6, August 24 to September 22. IQR=interquartile range. ECMO = extracorporeal membrane oxygenation. ARDS=acute respiratory distress syndrome. § Time to first episode of mechanical ventilation is presented as median (IQR). † Length of stay and time to outcomes are presented as median (IQR). * Level of respiratory support at most severe followed the WHO Progression Scale. Performance analysis of the RTRM measured by ROC, with models of the daily WHO Progression Scale (WHO PS) and Age as references. Performance is broken down by next-day, next-3-days, and next-7-days mortality, overall mortality, and mortality predicted at the date of presentation (A, B, C, D, and E respectively). Results are graphed by day since presentation (F), highlighting strong predictive power of the RTRM throughout COVID-19 hospitalization. RTRM scores (colored) for individual patients over time, selected based on eventual outcome. Patients were filtered at random from the test set for visualization purposes. Red cells indicate a higher risk of overall mortality and all patients experienced death (A, B) or discharge (C, D) at the end of their segments. Patients in A and C include only those that died or were discharged, respectively, on exactly the 15th day after hospital presentation, while B and D depict randomly selected patients that died or were discharged, respectively, at any time. Patterns suggest that the RTRM is able to reliably identify patients several days prior to their eventual death. Estimated parameters (odds ratio, OR) and 95% confidence intervals from the 100 most significant predictors of all-cause, hospitalized mortality based on either (A) univariable logistic regression or (B) multivariable logistic regression with elastic net (i.e. RTRM). Max Prior WHO and WHO Index are all referring to the WHO PS. ORs for categorical data are reported as normal for each categorical level independently. ORs for continuous variables are reported as the OR for a one standard deviation increase in the variable value. Smoothed and combined RTRM curves showing risk trajectories across six epochs and sex. A generalized additive model with integrated smoothness estimation was applied to the risk predictions over hospitalization time. Each point, n, on the y-axis represents the smoothed mean risk score for patients that were still hospitalized on the n th day following COVID-19 hospital presentation. Patients that have experienced an outcome continue to contribute to the risk of mortality after their death or discharge in A and C; whereas, in B and D only active patients (i.e. still-hospitalized) contribute to the model for any given day after hospital presentation. RTRM curves are broken down by six calendar date epochs (A, B) and sex (C, D). Table S1: ICD-9-CM and ICD-10-CM codes for Identification of Comorbidities in Electronic Health Record Data Table S2 : ICD-10-CM codes for Identification of Pneumonia, Sepsis, and Bacteremia in Electronic Health Record Data Table S3 : Pairwise Testing of Next-n-Day and Overall Mortality AUCROCs Pairwise testing of significant differences between the AUCROCs of the RTRM and the benchmark models using Age alone, daily WHO PS alone, and Age + daily WHO PS for Next-Day, Next-3-Day, Next-7-Day, and overall mortality outcomes. 1 Data was obtained from Genospace, an internal platform at HCA Healthcare that allows live querying of data obtained from HCA Healthcare's Electronic Data Warehouse. All preprocessing was performed with the intention of generating a patient-day matrix where each row represented the available information for an individual patient on an individual day. The columns of this matrix contain all available data for that patient-day, including demographics, labs, vitals, oxygenation, and hospitalization status. Many variables are not necessarily available on the single-day level. For variables not associated with a specific date in the dataset, such as race, age, or weight, the values were repeated for each patient-day of hospitalization. Other values, such as vitals, were commonly available with much greater than single-day frequency. For any such variable containing multiple instances in a single day, the mean for each patient-day was taken. The minimum and maximum were also considered, but were not found to improve model performance on the whole due to added model complexity. This process was applied for all continuous variables. For WHO PS, which is based on the level of respiratory support received, the maximum value, representing the most severe level of respiratory support, for a given day was used rather than the mean, except if the patient was discharged alive, in which case a score of 1 was assigned. For categorical values with dates, such as comorbidities and complications, including pneumonia, sepsis, and bacteremia, the indication of the event was added to the given patient-day and all subsequent days. Missing value handling depended on the variable ( Figure S1) . For multi-level factors, e.g. race, patients with missing values were specifically given the value "Missing". For binary indicator variables, e.g. comorbidities, the absence of a data point for a particular patient was treated as a negative. For continuous values, e.g. lab results, data was completed with multiple imputation, as described in the main text and below in the Imputation Evaluation section. The Real-Time Risk Model requires selection of shrinkage coefficients to be estimated for L1 (absolute) and L2 (squared) regularization. This process was performed with 10-fold cross validation and a grid search where L1 and L2 were chosen independently. For each point in the grid, the risk model was fit for each fold and evaluated on the cross validation test set. The deviance of the model was measured at each stop. Hyperparameters were chosen based on the deviance minimization within the grid search. The Real-Time Risk Model considered interactions between pairwise combinations of variables, i.e. the possible altering of the linear effect of a variable on mortality conditional on another variable. We allowed this complexity into our model with the introduction of simple interacting coefficients for selected variables. Given the large number of coefficients to be considered, we recognized that including all n-choose-2 pairwise combinations would dramatically increase the dimensionality of our model. To mitigate this, we limited the number of coefficients to those with higher clinical or statistical support. Interactions were modeled based on all pairwise combinations of the following ten variables -(1) Age, (2) WHO PS, (3) Admission Date, (4) ICU Status, (5) Hospitalization day number, (6) C-Reactive Protein (mg/dL), (7) D-Dimer (ng/mL DDU), (8) Systolic Blood Pressure (mmHg), (9) Temperature (C), and (10) SPO2 (%). These terms formed 45 pairwise interaction terms ( Figure S4 ). The conditional distributions of each of the interacting variables were evaluated ( Figure S6 ). While missing data at any point after a baseline value can be directly inferred from the last available value, missing baseline data as well as values for variables not associated with a specific date in the dataset needed to be imputed . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint from available data. To prevent forward-looking data leakage, no timepoint data imputation was permitted to utilize data that was not available at the time of the missing datapoint. To perform this imputation, we used Multiple Imputation by Chained Equations (MICE). 1 With this process, missing data for each variable was randomly sampled with replacement from the observed data. Then, for each variable, missing values were removed and replaced with fitted values from regressing the observed values for that variable against the rest of the dataset. This step was performed for each variable containing any missing data and repeated until convergence. The stochastic component of this procedure was designed to capture some of the variability present in the original data. Consequently, we produced 10 separate imputed datasets in this manner and executed our analysis separately on each. The results of each analysis were merged by combining, via simple average, the resulting coefficients into a single prediction model. 2 While all iterations of the multiple imputation method differed slightly from one another, the predicted probabilities were highly correlated (r = 0·977) and the merged model outperformed each individual model on its own ( Figure S5 ). Variance for coefficients was measured with bootstrapping. Models were fit, as described above, repeatedly on prediction matrices obtained by resampling the patient population with replacement. This process was repeated 100 times, with the standard deviation of the bootstrapped coefficient estimates taken as the bootstrap estimate of the standard errors, which are visualized and reported in Forest plots ( Figure 4 , Figure S6 ). Model validation was performed by applying the prediction coefficients generated from the training set to the heldback testing set. This entire process was performed ten times; once for each imputation step. Prediction scores for each imputation, corresponding to probability of n-day mortality, were averaged into a single n-day risk of mortality. Each patient-day was evaluated for the probability of n-day and overall survival. The model performance was measured with the area under the curve of the receiver operating characteristic (AUCROC). This analysis was performed using the R packages ROCit, oem (Orthogonalizing EM: Penalized Regression for Big Tall Data) and glmnet. [3] [4] [5] . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. Tables Table S1: ICD-9-CM and ICD- * The ICD-9 and ICD-10 codes used to identify these comorbidities were largely based on a modified Charlson Comorbidity Index (CCI) coding scheme (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6684052/) ✝ The diagnosis codes for asthma were separated from other chronic pulmonary diseases to facilitate the separate assessment of asthma, and because the same included diagnosis code can be used for both asthma and reactive airway disease, the label for this comorbidity is "Asthma or reactive airway disease"; the World Health Organization (WHO) ICD-10 code J46 was not included, as this code is not included in ICD-10-CM ‡ The diagnosis codes for unspecified renal failure (i.e., the ICD-9 code 586 and the ICD-10 code N19) were not included, as these codes are not sufficient to identify previous acute renal failure, which may have been reversible, vs. chronic kidney disease (e.g., end-stage renal disease) vs. acute renal failure/acute kidney injury as a complication of COVID-19 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. * Dates associated with these ICD-10 codes were filtered to align with COVID-19 diagnosis, including only dates in the range of two weeks before the collection date of the first positive SARS-CoV-2 test to 90 days after the first encounter with SARS-CoV-2 infection . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 28, 2021. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. A m e r i c a n I n d i a n A s i a n A s i a n I n d i a n B l a c k O r A f r i c a n A m e r i c a n H a w a i i a n is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. Coronavirus Resource Center An interactive web-based dashboard to track COVID-19 in real time Working Group on the Clinical Characterisation and Management of Covid-infection. A minimal common outcome measure set for COVID-19 clinical research Systematic evaluation and external validation of 22 prognostic models among hospitalised adults with COVID-19: An observational cohort study Clinical features of COVID-19 mortality: development and validation of a clinical prediction model Machine learning prediction for mortality of patients diagnosed with COVID-19: a nationwide Korean cohort study Prediction model and risk scores of ICU admission and mortality in COVID-19 Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal Trends in COVID-19 Risk-Adjusted Mortality Rates A weekly Surveillance Summary of U.S. COVID-19 Actvity Risk stratification of patients admitted to hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: development and validation of the 4C Mortality Score Automated EHR score to predict COVID-19 outcomes at US Department of Veterans Affairs Characteristics of Hospitalized COVID-19 Patients Discharged and Experiencing Same-Hospital Readmission -United States Sixty-Day Outcomes Among Patients Hospitalized With COVID-19 Multivariate Imputation by Chained Equations in R Multiple Imputation for Nonresponse in Surveys Orthogonalizing EM: A design-based least squares algorithm Fast Penalized Regression and Cross Validation for Tall Data with the oem Package Regularization Paths for Generalized Linear Models via Coordinate Descent The authors wish to acknowledge Drs. Susan Garwood, Marjorie Wongskhaluang, Joe Restivo, and Faisal Cheema for their clinical insight as well as Troy Gifford, Ryan Patrick, Cecile Roman, Daniel Luckett, and Shaita Picard for their contributions to electronic data extraction, management and curation. Thank you to Molly Altman for providing project management and general . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 28, 2021. ; https://doi.org/10.1101/2021.04.26.21256138 doi: medRxiv preprint Schlauch et al. Predictions from Imputation X