key: cord-0844337-unm5nryj authors: Wiegand, M.; Cowan, S. L.; Waddington, C. S.; Halsall, D. J.; Keevil, V. L.; Tom, B. D. M.; Taylor, V.; Gkrania-Klotsas, E.; Preller, J.; Goudie, R. J. title: Development and validation of a dynamic 48-hour in-hospital prognostic risk stratification for COVID-19 in a UK teaching hospital: a retrospective cohort study date: 2021-02-18 journal: nan DOI: 10.1101/2021.02.15.21251150 sha: 224d692392daac19dfa626e545c9f59fa8e95143 doc_id: 844337 cord_uid: unm5nryj

We propose a prognostic dynamic risk stratification for 48-hour in-hospital mortality in patients with COVID-19, using demographics and routinely-collected observations and laboratory tests: age, Clinical Frailty Scale score, heart rate, respiratory rate, SpO2/FiO2 ratio, white cell count, acidosis (pH < 7.35) and Interleukin-6. We train and validate the model using data from a UK teaching hospital, adopting a landmarking approach that accounts for competing risks and informative missingness. Internal validation of the model on the first wave of patients presenting between March 1 and September 12, 2020 achieves an AUROC of 0.90 (95% CI 0.87-0.93). Temporal validation on patients presenting between September 13, 2020 and January 1, 2021 gives an AUROC of 0.91 (95% CI 0.88-0.95). The resulting mortality stratification tool has the potential to provide physicians with an assessment of a patient's evolving prognosis throughout the course of active hospital treatment.

SARS-CoV-2 virus infection, the cause of COVID-19, results in a spectrum of disease ranging from asymptomatic infection through to life-threatening disease requiring critical care, and even death. For patients admitted to hospital, it is essential to identify who is at risk of deterioration and death to enable timely targeted interventions (such as immune modulation and mechanical ventilation), to facilitate appropriate resource allocation and patient flow, and to inform discussions with patients and families.
Most existing disease severity prediction models for COVID-19 use only data that are available at the time of admission to hospital. Numerous such models have been proposed for both mortality and composite escalation/mortality outcomes, including new and re-purposed severity and early warning scores [1-7] and time-to-event models [8-13]. Most, however, perform only moderately well and are at high risk of bias [14]. While some markers of severity, such as sex and age, can be assumed constant for the duration of the hospital visit, others, such as clinical observations and blood test results, can change markedly over the course of admission. COVID-19 is a dynamic disease in which patients can deteriorate over a short time period or suffer acute complications, e.g. thromboembolism [15, 16]. Such events may have a significant effect on a patient's prognosis that cannot be foreseen by a point-of-admission model. Dynamic models that assimilate clinical data as they accrue may provide more accurate and clinically useful prediction of a patient's clinical course and prognosis over the subsequent days than point-of-admission models can. Predictive models that incorporate post-admission information are limited in number and scope. Some models have been proposed that use information from the first four or five days after admission to predict mortality or deterioration, but they do not continue beyond the first few days of admission [17, 18]. Other more recent models have made use of additional post-admission data, using a time-varying Cox model for mortality and escalation [19] or a machine learning model for mortality [20]. While indicating promising discrimination, these models use clinically unjustifiable or unclear methods for handling missing data and censoring, and do not account for informative missingness or consider the effect of treatments.
Informative missingness describes the fact that in routinely-collected data the availability (or absence) of a result or observation may be related to the probability of the outcome. For example, a more extensive panel of investigations may be sent in patients thought more likely to benefit from Intensive Care Unit (ICU) admission. While often ignored, such effects can be strong in Electronic Health Record (EHR) data [21, 22]. We propose a prognostic risk stratification score for hospital patients with COVID-19, based on prediction of mortality in the subsequent 48 hours, using routinely-collected clinical data. The model is based upon a principled statistical approach called landmarking [23] that allows inclusion of any time-varying clinical parameters recorded prior to the time of prediction, whilst appropriately accounting for censoring and changes in the set of patients at risk. Our model further accounts for informative missingness and competing risks, which arise when there are two or more mutually exclusive outcomes: for example, once a patient is discharged, the risk of in-hospital mortality (during that admission) is removed, and therefore discharge is a "competing risk".

This is a retrospective cohort study of all patients presenting to Cambridge University Hospitals, a regional, tertiary care, university hospital in the East of England, between March 1, 2020 and January 1, 2021. This hospital is the sole admission hospital for patients with COVID-19 in its immediate catchment population, and is a regional referral centre for a wide range of specialist services (not including ECMO). The model was trained using data from patients presenting between March 1, 2020 and September 12, 2020 (hereafter "Wave 1"). A temporal validation of the model was then performed using patients presenting between September 13, 2020 and January 1, 2021 (hereafter "Wave 2").
We report our findings according to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) reporting guidelines [24]. All adults (>= 18 years of age) presenting to hospital during the study period and diagnosed with COVID-19 were included. Diagnosis was based on either a positive diagnostic SARS-CoV-2 test during or up to 14 days prior to the hospital visit, or a clinical diagnosis of COVID-19. Diagnostic testing used either a real-time reverse transcription polymerase chain reaction (RT-PCR) of the RdRp gene from a nasopharyngeal swab, or the SAMBA II point-of-care test used at the hospital [25]. Patients with clinically diagnosed COVID-19 (based on symptoms, and in the opinion of the treating clinician) were included because diagnostic testing was limited during the early stages of the pandemic [26]. Clinical diagnosis of COVID-19 was identified using International Classification of Diseases 10th Edition (ICD-10) codes in the EHR. We included only the first hospital visit for each patient after the positive test; any re-admissions were excluded. Nosocomial infection was defined as a first positive SARS-CoV-2 test or diagnosis more than 10 days after hospital admission. Since the first prediction by our model is made at 6 hours (to allow time for laboratory investigations), patients who died, were discharged or were classified as end of life within 6 hours of presentation to hospital were excluded. Throughout each patient's hospital visit, we aim to predict all-cause in-hospital mortality during the next 48 hours, a time period that we refer to as the "prediction horizon". We also consider two other outcomes, which are competing risks for the primary outcome: transfer to a tertiary Intensive Care Unit (ICU) for ECMO; and discharge from the hospital due to clinical improvement. Patients were followed up until January 3, 2021.
The study was approved by a UK Health Research Authority ethics committee (20/WM/0125). Patient consent was waived because the de-identified data presented here were collected during routine clinical practice; there was no requirement for informed consent. The study used routinely collected data, extracted in anonymised form from the hospital EHR system, Epic (Epic Systems Corporation, Verona, Wisconsin). We selected a list of 59 candidate clinical parameters (Table S1) that have been included in existing point-of-admission prediction models or were clinically judged to be likely predictors. These are divided into five categories: demographics; comorbidities; observations; laboratory tests; and treatments, interventions and level of care. Basic patient demographics were extracted from the hospital EHR: age, sex, ethnicity, and Body Mass Index (BMI). Twelve comorbidities that have previously been associated with COVID-19 [27] were identified by the presence of the corresponding ICD-10 codes entered in the EHR prior to the time at which the prediction is made (either before or during the hospital visit). Table S2 provides the ICD-10 codes used to define each comorbidity. In addition to specific comorbidities, frailty amongst patients over 65 years old was assessed by the Clinical Frailty Scale (CFS) score [28]. For patients for whom a CFS score had not been recorded by the treating team, a consultant or specialist registrar in Geriatric Medicine reviewed the clinical records and assigned a CFS score using only information recorded at the time of admission [29]. This approach has been shown to have good agreement with CFS scores assigned after face-to-face assessment (inter-rater reliability kappa 0.84) [30]. We included the following observations that are regularly recorded in the EHR: heart rate (HR), mean arterial pressure, temperature and respiratory rate (RR).
We summarised the measurements recorded over the previous 24-hour period as follows: mean, minimum and maximum value. We also calculated the trend as the difference between the median value for the last 24 hours and the median value for the 24 hours prior to that. The Glasgow Coma Score (GCS) was extracted from the EHR; patients without a recorded GCS were assumed to have a GCS >= 12. PaO2/FiO2 (P/F) and SpO2/FiO2 (S/F) ratios were calculated to indicate the severity of hypoxia [31, 32]. SpO2 itself was not included as a potential predictor as our exploratory work suggested that, without accounting for FiO2, this largely reflected a patient's assigned oxygen saturation targets, and therefore acted as a proxy for underlying respiratory disease. Where only oxygen flow rate was available, FiO2 was estimated according to the EPIC II conversion tables [33]. For each of the 31 laboratory tests we considered, we included results up to 48 hours prior to the time at which the prediction was made. Where more than one result was available, we used the most recent result. In addition, for 7 of the most frequently measured blood tests (C-reactive protein (CRP), white cell count (WCC), platelets, haemoglobin, creatinine, sodium, potassium), we included the trend, defined as the difference between the median value for the last 24 hours and the median value for the 24 hours prior to that. The neutrophil/lymphocyte and Interleukin-6/Interleukin-10 (serum IL-6/IL-10) ratios have previously been identified as prognostic; therefore we also considered these as potential predictors [9, 18, 34]. For blood markers where both abnormally low and abnormally high results could potentially be associated with poor prognosis (sodium and blood gas pH), we included the maximum deviation below and above the standard range in the previous 24 hours. We adjusted venous pH results by adding 0.03 to approximate arterial pH results [35].
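As an illustration of the summarisation just described, the 24-hour summary statistics and the trend can be computed as in the sketch below. This is a minimal sketch, not the authors' code: we assume measurements arrive as (hours-before-prediction, value) pairs, and the function name and handling of empty windows are our own choices.

```python
from statistics import median

def summarise_window(measurements):
    """Summarise one vital sign over the 48 h before the prediction time.

    `measurements` is a list of (hours_before_prediction, value) pairs.
    Returns the mean/min/max over the last 24 h, plus the trend: the median
    of the last 24 h minus the median of the 24 h before that (as in the
    text). Returns None when nothing was recorded in the last 24 h, in
    which case only a missingness indicator would enter the model.
    """
    recent = [v for t, v in measurements if t <= 24]
    prior = [v for t, v in measurements if 24 < t <= 48]
    if not recent:
        return None
    return {
        "mean": sum(recent) / len(recent),
        "min": min(recent),
        "max": max(recent),
        "trend": median(recent) - median(prior) if prior else None,
    }

# Hypothetical heart-rate series rising over two days:
hr = [(30, 80), (26, 82), (20, 90), (10, 95), (2, 100)]
print(summarise_window(hr))  # mean 95.0, min 90, max 100, trend 14.0
```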
We included 5 indicators of treatments, interventions and levels of care. The level of care of the patient was summarised by whether the patient had been in an ICU bed in the previous 24 hours. Mechanical ventilation was defined as patients receiving invasive ventilation during the previous 24 hours, either via endotracheal tube or tracheostomy. The use of renal replacement therapy during the last 24 hours was identified from the EHR. Cardiovascular support was defined as the administration of any vasopressors or inotropes in the last 24 hours. We also included in-hospital steroid administration prior to the prediction time, given that steroids can decrease the risk of death in patients with COVID-19 and that clinical trials operated in our hospital for part of the study period [36, 37].

(This preprint was posted on medRxiv on February 18, 2021 and was not certified by peer review. It is made available under a CC-BY-NC-ND 4.0 International license; the copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.)

We use the landmarking approach to dynamic prediction [23, 38, 39]. Landmarking fits a series of models at a sequence of fixed time origins called landmark times. These are the time points at which we extract the data used to develop the proposed model. At each landmark time, a time-to-event model is used to describe the likelihood of the event occurring within the next 48 hours (the "prediction horizon"). Only data recorded before (or at) the landmark time are used in each model; no data from the future can be used. If the event of interest happens after the prediction horizon then the event is treated as censored at this landmark. Patients who have had any event prior to the landmark time are excluded, since these patients are no longer at risk. We use the supermodel approach, in which the time-to-event model is assumed constant across landmark times [40].
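To make the landmarking scheme concrete, the sketch below expands a single hospital visit into landmark rows with a 48-hour horizon. It is an illustration only, under assumed defaults matching the text (first landmark at 6 hours, landmarks every 24 hours); the field names are hypothetical, not the authors' implementation.

```python
def build_landmark_rows(event_hour, event_type, first=6, step=24, horizon=48):
    """Expand one visit into landmark rows for a supermodel dataset.

    A row is created at each landmark while the patient is still at risk
    (i.e. before the event). If the event falls within the 48 h horizon,
    the row carries the event type; otherwise the event is censored at
    that landmark.
    """
    rows = []
    t = first
    while t < event_hour:  # patient still at risk at this landmark
        within_horizon = event_hour <= t + horizon
        rows.append({"landmark": t,
                     "status": event_type if within_horizon else "censored"})
        t += step
    return rows

# A patient who dies 60 h after presentation contributes landmarks at
# 6 h (censored: the death lies beyond that landmark's 54 h horizon end),
# 30 h and 54 h (the death falls inside both adjacent horizons).
print(build_landmark_rows(event_hour=60, event_type="death"))
```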
Our landmark times are every 24 hours during the hospital visit, starting 6 hours after presentation to the hospital, to allow sufficient clinical information to accrue for prediction at the initial landmark to be feasible. We only use data at each landmark time from patients still in hospital and at risk at that time. We use a Fine-Gray competing risk model [41, 42] to account for the competing risks of in-hospital death, which are transferral for ECMO and hospital discharge. We handle missing data using the missingness indicator approach because the recording in the EHR of a clinical parameter, regardless of the value, is often indicative of the treating health professional's contemporaneous view of the patient's prognosis [43, 44]. To do this we augment the set of potential predictors with binary variables that indicate whether, during the window of time we consider, any measurement of the corresponding parameter is available for that patient. The missing indicator approach avoids the need to make the missing at random (MAR) assumption, which is unlikely to hold in these data [45]. To select the most predictive parameters into the model we used standard penalised variable selection, specifically Smoothly Clipped Absolute Deviation (SCAD), with the tuning parameter chosen to minimise the Bayesian Information Criterion [46]. We paired parameters together with their corresponding missingness indicator to prevent inclusion of an incompletely-observed parameter without its missingness indicator, using the group SCAD [47]. We also allowed missingness indicators to be included by themselves, in case the presence of a parameter is informative irrespective of its value.
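The missingness-indicator augmentation can be sketched as follows. This is a minimal illustration assuming each landmark row is a mapping of predictor names to values, with None for unmeasured parameters; the `_measured` suffix and the zero placeholder are our assumptions, and the pairing of each value with its indicator for group-SCAD selection is not shown.

```python
def add_missingness_indicators(rows, predictors):
    """Augment landmark rows with binary availability flags.

    For each predictor p, add p + "_measured" (1 if a value was recorded in
    the relevant window, else 0) and replace a missing value with a 0.0
    placeholder so the design matrix is complete; the indicator, not the
    placeholder, carries the signal for unmeasured parameters.
    """
    augmented = []
    for row in rows:
        new_row = dict(row)
        for p in predictors:
            observed = row.get(p) is not None
            new_row[p + "_measured"] = 1 if observed else 0
            if not observed:
                new_row[p] = 0.0
        augmented.append(new_row)
    return augmented

landmarks = [
    {"il6": 45.2, "wcc": 11.3},
    {"il6": None, "wcc": 7.8},  # IL-6 not sent at this landmark
]
print(add_missingness_indicators(landmarks, ["il6", "wcc"]))
```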
All analyses were conducted in R version 3.6 [48]. The discriminative ability of our model was assessed visually using the Receiver Operating Characteristic (ROC) curve, showing the false positive rate (FPR, equal to 1 - specificity) against the true positive rate (TPR, equal to sensitivity); and the Precision-Recall (PR) curve, showing precision (positive predictive value, PPV) against recall (sensitivity). PPV is the key metric for a dynamic predictive model such as the one we propose, because the low incidence of the primary outcome leads to a strong imbalance of events and non-events that is not accounted for by the ROC curve [49]. We also assess the Number Needed to Evaluate (NNE), defined as the number of patients needed to evaluate in order to identify one in-hospital death within the prediction horizon, calculated as 1/PPV. Quantitative assessment of discrimination was performed using the Area Under the ROC (AUROC) curve, for which 0.5 indicates no discrimination and 1.0 indicates perfect discrimination. For validation of the performance of the model on the training data, in addition to the unadjusted AUROC, we also performed repeated 5-fold cross-validation to account for uncertainty and over-optimism arising from the complete model-building process (including variable selection) [50]. We also calculate the Area Under the PR Curve (AUPRC), since it provides a clearer performance summary than AUROC when the primary outcome has low incidence, as here [51]. We assessed calibration visually using a calibration plot of predicted risk against observed mortality rate. We also quantitatively assessed the calibration slope, testing whether beta = 1 in the logistic regression model logit(probability of death) = alpha + beta * logit(p), where p is the risk score; and calibration-in-the-large, testing whether the intercept alpha = 0 when fixing beta = 1 [52].
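The two threshold-level quantities used above are simple to compute; the sketch below shows NNE from confusion-matrix counts at a chosen threshold, and the logit transform that appears in the calibration model. The counts in the example are invented for illustration.

```python
import math

def logit(p):
    """Log-odds transform used in the calibration model
    logit(P(death)) = alpha + beta * logit(p)."""
    return math.log(p / (1 - p))

def number_needed_to_evaluate(tp, fp):
    """NNE = 1/PPV: flagged patients evaluated per true event identified."""
    ppv = tp / (tp + fp)
    return 1.0 / ppv

# At some threshold the model flags 50 patients, 10 of whom die within
# 48 h: PPV = 0.2, so 5 patients are evaluated per death identified.
print(number_needed_to_evaluate(tp=10, fp=40))  # -> 5.0
```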
519 patients presented to hospital with COVID-19 during Wave 1 (March 1, 2020 to September 12, 2020), of whom 46 were excluded due to discharge (34), death (2) or transition to end of life care (10) prior to the first landmark time (i.e. within 6 hours of presentation). 473 patients were therefore included in the development of the model. In total we include 6846 landmark times for training the model, with a median of 9 (IQR 3-17) landmark times per patient. In the 48-hour prediction horizon following these landmark times, there were 119 in-hospital death events (1.7% of landmarks), 658 hospital discharge events (9.6%) and 10 transfers for ECMO (0.1%). Note that, since landmarks occur every 24 hours and the prediction horizon is 48 hours, patient events will usually occur within the prediction horizon of two adjacent landmark times. Table S1 reports summary statistics, missingness and the number of measurements available per landmark time for each predictor. No patients were excluded due to missing data. Our proposed model for 48-hour in-hospital mortality includes age, Clinical Frailty Scale (CFS) score, heart rate (HR), respiratory rate (RR), oxygen saturation/fraction of inspired oxygen (SpO2/FiO2) ratio, white cell count (WCC), acidosis (pH < 7.35) and Interleukin-6 (IL-6). The estimated coefficients are shown in Table 2, and the estimated hourly baseline cumulative subdistribution hazards are in Table S3.
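Given the fitted coefficients (Table 2) and the baseline cumulative subdistribution hazard (Table S3), a patient's predicted 48-hour risk under a Fine-Gray model takes the form F(48 | x) = 1 - exp(-H0(48) * exp(x'beta)). The sketch below illustrates this with invented numbers, not the fitted values.

```python
import math

def risk_48h(linear_predictor, baseline_cum_hazard_48h):
    """Predicted cumulative incidence of in-hospital death within 48 h
    under a Fine-Gray model:
        F(48 | x) = 1 - exp(-H0(48) * exp(x'beta)),
    where H0 is the baseline cumulative subdistribution hazard and
    x'beta is the patient's linear predictor.
    """
    return 1.0 - math.exp(-baseline_cum_hazard_48h
                          * math.exp(linear_predictor))

# Hypothetical patient with linear predictor 1.2 and H0(48) = 0.004:
print(round(risk_48h(1.2, 0.004), 4))  # about a 1.3% 48-hour risk
```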
Internal performance assessment

Figure 1 shows the internal performance metrics. The unadjusted internal area under the receiver-operating characteristic curve (AUROC) was 0.90 (95% CI 0.87-0.93) and the median cross-validation AUROC was 0.87, both indicating good discrimination (Figure 1A). The precision-recall (PR) curve (Figure 1B) also shows good discrimination, with an area under the precision-recall curve (AUPRC) of 0.31, in a population with 48-hour in-hospital mortality of 0.017 (1.7%), and a number needed to evaluate (NNE) < 10 for sensitivity less than 0.75 (Figure 1C). Figure 1D shows the calibration plot. The calibration intercept alpha = -0.02 (95% CI -0.22 to 0.17) indicates that the mean predicted probabilities match the mean observed mortality, while the calibration slope beta = 1.16 (95% CI 1.02-1.31) suggests that the observed mortality in high predicted-risk patients slightly exceeds the predicted mortality risk.

Temporal validation

We assessed the performance of the model by applying it to held-out data corresponding to visits during Wave 2 (September 13, 2020 to January 1, 2021). 405 patients presented to the study hospital during this period. In total we include 3086 landmark times for validating the model, with a median of 4 (IQR 1-10) landmark times per patient. In the 48-hour prediction horizon following these landmark times, there were 55 in-hospital death events (1.8% of landmarks), 441 hospital discharge events (14.3%) and 0 transfers for ECMO (0%). Characteristics are summarised in Table 1. Of note, compared to Wave 1, patients presenting in Wave 2 were younger (median age 61 vs 69) and more likely to be female (50.1% vs 41.4%). Note that because follow-up continued until January 3, 2021, the outcome during the 48-hour prediction horizon was known for all landmark times up to January 1, 2021, and so patients remaining in hospital at the study end-date could be included in the validation.
In the temporal validation, the model again showed good discrimination, with an AUROC of 0.91 (95% CI 0.88-0.95) and good precision across a range of sensitivities, with an AUPRC of 0.21 and NNE < 10 for sensitivities between 0.01 and 0.87 (Figure 2C). Figure 2D shows the calibration plot, which shows a tendency of the model to underpredict risk in higher-risk patients: calibration-in-the-large was 0.33 (95% CI 0.04-0.60), suggesting the mean of the predicted probabilities is lower than the mean observed mortality, while the calibration slope of 1.15 (95% CI 0.95-1.37) indicates that the spread of predicted risks corresponds reasonably well with the spread of observed mortality.

SARS-CoV-2 causes a wide spectrum of disease, from mild and short-lived illness through to more severe disease that can evolve over weeks, may necessitate critical care management and can even result in death. There is a pressing clinical need to be able to anticipate both disease severity and the trajectory of illness in order to facilitate patient management and resource allocation, and to inform discussions with both patients and families. Although risk factors for severe disease, such as age and male sex, have been widely recognised, these static risk factors provide little nuance or discrimination at the individual patient level. Disease severity models have been proposed that provide a more accurate assessment, but these have focused on a single time point such as admission to hospital, and do not respond to changes in the clinical picture as the disease evolves. The model described herein uniquely incorporates both static risk factors (age and CFS) and evolving clinical and laboratory data, providing a dynamic 48-hour risk prediction model that can adapt to both sudden and gradual changes in an individual patient's clinical condition.
Our model is further strengthened by the competing-risk landmarking approach we adopt, allowing the model to account for events other than death that remove the patient from the population at risk. The data used in the model were routinely collected demographic and clinical data from the patient's hospitalisation, automatically extracted from patient EHRs. As such, this model could be readily incorporated into routine clinical care, providing invaluable information to the clinicians managing patients with COVID-19. More broadly, our model highlights the potential utility of EHR data to inform our understanding of disease by making it feasible to collect and analyse the detailed data accrued during routine care, in contrast to the limited information that can be gathered via manual data collection.

Several methodological aspects of our approach further strengthened our model. Firstly, we account for competing risks in the model. Competing risks refers to the situation whereby the outcome of interest (in this case in-hospital mortality) can only happen whilst the patient is in hospital, and therefore the outcome of interest is 'competing' against the risk of transfer to another hospital and/or discharge from hospital. Allowance for this has been shown to be important in predictive modelling [53]. Secondly, allowance was made for the possibility that the availability of observations and investigations may in itself reflect disease severity. For example, the fact that an arterial blood gas has been taken may reflect the clinical impression that a patient is deteriorating.
While multiple imputation is often used in clinical prediction models because it gives unbiased estimates under the missing at random (MAR) assumption, it is unlikely that the MAR assumption holds in the routinely-collected EHR data that we use [45]. The missing indicator method that we adopt does not rely on the MAR assumption and has been found to lead to improved predictive performance in EHR data [43-45]. Furthermore, we do not seek to make prognostic predictions for patients after clinicians have identified them as being at the end of life.

Several predictors of disease severity selected by our model have also been identified by models that aimed to assess severity of disease at the point of admission to hospital, and in epidemiological studies of risk factors for severe disease. Increasing age is widely recognised as being the strongest predictor of poor outcome from COVID-19 [3, 11, 27]. Frailty has similarly been shown to be a strong independent predictor of mortality in hospitalised older adults [54], including those with COVID-19 [29, 55], and it is therefore unsurprising that the frailty score was selected in the model. The deleterious effect of SARS-CoV-2 infection on respiratory function is one of the commonest and often most severe effects of illness, and frequently precipitates hospital admission and the need for critical care. Markers of respiratory function, including respiratory rate [3, 4, 12] and the SpO2/FiO2 (S/F) ratio [56], have been included in previous point-of-admission models; similarly, the recent ISARIC 4C deterioration model includes SpO2 and the need for supplemental oxygen [3].
The S/F ratio, as selected by our model, may be a more informative measure of respiratory compromise, as it provides a fully quantitative rather than dichotomous measure of the need for additional oxygen, as well as allowing for the confounding effect of variation in the target oxygen saturations of different patient groups. Clinicians often set a lower target SpO2 in patients with pre-existing respiratory disease, and thus oxygen therapy is not initiated until a much lower SpO2 is reached compared to otherwise healthy patients. Our model selected two markers of infection and inflammation: WCC and IL-6. This is consistent with other findings [11, 57, 58]. IL-6 was included in the routine COVID-19 panel of blood tests at the study hospital, but we recognise that this may be less commonly requested in other hospitals. To assess whether C-reactive protein (CRP) could serve as a proxy for IL-6 in our model when IL-6 is not available, we refitted the model with CRP in place of IL-6 (Tables S4-S5). The AUROC is slightly lower on both training (0.89, 95% CI 0.85-0.93) and test (0.83, 95% CI 0.78-0.89) data, yielding a slightly weaker but potentially more broadly applicable model. The preference of the model for IL-6 over CRP may reflect the fact that IL-6 is responsible for the production of CRP and, as such, is an earlier and more dynamic marker of the inflammatory response [59]. The inclusion of acidosis in our model is more novel, although it has previously been noted as a marker of disease severity [60]. Acidosis frequently complicates respiratory, renal and advanced circulatory failure, and is therefore frequently observed in patients with severe disease. The separate inclusion of the severity of acidosis and alkalosis in our set of candidate predictors removed the need to assume a linear effect of pH, allowing pH changes in either direction to be accounted for.
This avoids, for example, a minor negative effect of alkalosis masking a larger effect of acidosis. There are several limitations to our study. We have not incorporated imaging data, which have been used as a proxy for disease severity in some clinical trials, although we have included an extensive set of relevant clinical information. We also chose to include only laboratory results up to 48 hours and vital signs up to 24 hours before the landmark time. Exploiting older data in addition might improve the predictive ability of our model, but would also be likely to increase the model's complexity considerably, and therefore decrease its real-world utility. Our data were gathered from a single centre, and therefore the generalisability of our findings to other centres and populations is uncertain. Further, our model was generated from a relatively modest sample size due to the relatively low prevalence of COVID-19 in the catchment population of the hospital, particularly during the early months of the pandemic. One advantage of using this single dataset from a large, tertiary hospital was that the hospital never became overwhelmed with patients, and therefore patients received care according to what was clinically appropriate rather than according to what resources were available. It is also important to note that as the pandemic has evolved in the UK, there have been changes both in the clinical care of patients (notably with the routine inclusion of steroid therapy for patients requiring oxygen) and in the strains of the virus circulating.
It is encouraging that the model continued to perform well in the Wave 2 validation data, but over time, changes such as these may influence the clinical picture of the disease, its severity and the risk factors for severe disease. It is likely, therefore, that the model will need to be updated as the pandemic evolves; the use of routinely available data in this model makes this relatively simple to do. The ongoing COVID-19 pandemic continues to affect the lives of many and to put extreme pressure on clinical services. We have developed and validated a dynamic prediction model with high discrimination of 48-hour in-hospital mortality in COVID-19, using routine clinical data, that updates over the course of the illness and patient admission. This represents a significant advance on existing point-of-admission COVID-19 prediction models: it has the potential to inform patient management and resource planning and allocation in real-world settings. In doing so, this model could considerably improve both individual patient care and the ability to manage the pandemic response on a hospital-wide scale.

The de-identified data that support the findings of this study are available from Cambridge University Hospitals, but restrictions apply to the availability of these data, which were used under license for the current study, and so they are not publicly available. Data are however available from the authors upon reasonable request with permission of Cambridge University Hospitals.
References
1. Evaluation and Improvement of the National Early Warning Score (NEWS2) for COVID-19: A Multi-Hospital Study
2. Comparison of Severity Scores for COVID-19 Patients with Pneumonia: A Retrospective Study
3. Development and Validation of the ISARIC 4C Deterioration Model for Adults Hospitalised with COVID-19: A Prospective Cohort Study
4. Development and Validation of the Quick COVID-19 Severity Index: A Prognostic Tool for Early Clinical Decompensation
5. Development and External Validation of a COVID-19 Mortality Risk Prediction Algorithm: A Multicentre Retrospective Cohort Study
6. Clinically Applicable AI System for Accurate Diagnosis, Quantitative Measurements, and Prognosis of COVID-19 Pneumonia Using Computed Tomography
7. Scoring Systems for Predicting Mortality for Severe Patients with COVID-19
8. Association of Red Blood Cell Distribution Width with Mortality Risk in Hospitalized Adults with SARS-CoV-2 Infection
9. Dynamic Changes of D-Dimer and Neutrophil-Lymphocyte Count Ratio as Prognostic Biomarkers in COVID-19
10. Risk Factors for Mortality in 244 Older Adults with COVID-19 in Wuhan, China: A Retrospective Study
11. Clinical Course, and Outcomes of Critically Ill Adults with COVID-19 in New York City: A Prospective Cohort Study
12. Early Predictors of Clinical Deterioration in a Cohort of 239 Patients Hospitalized for Covid-19 Infection in Lombardy, Italy
13. Early Triage of Critically Ill COVID-19 Patients Using Deep Learning
14. Prediction Models for Diagnosis and Prognosis of Covid-19: Systematic Review and Critical Appraisal
15. Incidence of Thrombotic Complications in Critically Ill ICU Patients with COVID-19
16. Pulmonary Embolism in Patients with COVID-19
17. Value of Dynamic Clinical and Biomarker Data for Mortality Risk Prediction in COVID-19: A Multicentre Retrospective Cohort Study
18. A Linear Prognostic Score Based on the Ratio of Interleukin-6 to Interleukin-10 Predicts Outcomes in COVID-19
19. Predicting the Need for Escalation of Care or Death from Repeated Daily Clinical Observations and Laboratory Results in Patients with SARS-CoV-2 During 2020: A Retrospective Population-Based Cohort Study from the United Kingdom
20. Relational Learning Improves Prediction of Mortality in COVID-19 in the Intensive Care Unit
21. Biases in Electronic Health Record Data Due to Processes Within the Healthcare System: Retrospective Observational Study
22. Dynamic Prediction in Clinical Survival Analysis
23. Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD): The TRIPOD Statement
24. Performance Evaluation of the SAMBA II SARS-CoV-2 Test for Point-of-Care Detection of SARS-CoV-2
25. Covid-19: What is the UK's Testing Strategy?
26. Factors Associated with COVID-19-Related Death Using OpenSAFELY
27. A Global Clinical Measure of Fitness and Frailty in Elderly People
28. Clinical Features, Inpatient Trajectories and Frailty in Older Inpatients with COVID-19: A Retrospective Observational Study
29. Can Patient Frailty Be Estimated from Inpatient Records? A Prospective Cohort Study
30. Comparison of the SpO2/FiO2 Ratio and the PaO2/FiO2 Ratio in Patients with Acute Lung Injury or ARDS
31. Derivation and Validation of SpO2/FiO2 Ratio to Impute for PaO2/FiO2 Ratio in the Respiratory Component of the Sequential Organ Failure Assessment Score
32. International Study of the Prevalence and Outcomes of Infection in Intensive Care Units
33. Clinical Impact of Monocyte Distribution Width and Neutrophil-to-Lymphocyte Ratio for Distinguishing COVID-19 and Influenza from Other Upper Respiratory Tract Infections: A Pilot Study
34. Peripheral Venous and Arterial Blood Gas Analysis in Adults: Are They Comparable? A Systematic Review and Meta-Analysis
35. Dexamethasone in Hospitalized Patients with Covid-19 - Preliminary Report
36. Association Between Administration of Systemic Corticosteroids and Mortality Among Critically Ill Patients With COVID-19: A Meta-Analysis
37. Using the Landmark Method for Creating Prediction Models in Large Datasets Derived from Electronic Health Records
38. Landmark Models for Optimizing the Use of Repeated Measurements of Risk Factors in Electronic Health Records to Predict Future Disease Risk
39. Dynamic Prediction by Landmarking in Event History Analysis
40. A Proportional Hazards Model for the Subdistribution of a Competing Risk
41. Dynamic Prediction of Competing Risk Events Using Landmark Sub-Distribution Hazard Models with Multiple Longitudinal Biomarkers
42. Informative Presence and Observation in Routine Health Data: A Review of Methodology for Clinical Risk Prediction
43. New Insight Into Missing Data in Intensive Care Unit Patient Profiles: Observational Study
44. Missing Data Should be Handled Differently for Prediction Than for Description or Causal Explanation
45. Penalized Variable Selection in Competing Risks Regression
46. Group SCAD Regression Analysis for Microarray Time Course Gene Expression Data
47. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing
48. Why the C-Statistic is Not Informative to Evaluate Early Warning Scores and What Metrics to Use
49. An Empirical Assessment of Validation Practices for Molecular Classifiers
50. The Precision-Recall Curve Overcame the Optimism of the Receiver Operating Characteristic Curve in Rare Diseases
51. Clinical Prediction Models: A Practical Approach to Development, Validation and Updating
52. Prognostic Models with Competing Risks: Methods and Application to Coronary Risk Prediction
53. Clinical Frailty Adds to Acute Illness Severity in Predicting Mortality in Hospitalized Older Adults: An Observational Study
54. The Effect of Frailty on Survival in Patients with COVID-19 (COPE): A Multicentre, European, Observational Cohort Study
55. Patient Trajectories Among Persons Hospitalized for COVID-19: A Cohort Study
56. IL-6-Based Mortality Risk Model for Hospitalized Patients with COVID-19
57. Risk Factors Associated With Acute Respiratory Distress Syndrome and Death in Patients With Coronavirus Disease
58. Role of Interleukin-6 to Differentiate Sepsis from Non-Infectious Systemic Inflammatory Response Syndrome
59. Clinical Characteristics of 113 Deceased Patients with Coronavirus Disease 2019: Retrospective Study
The funders had no role in the design and conduct of the study; the collection, management, analysis, and interpretation of the data; or the preparation, review, or approval of the manuscript. All authors approved the final version of the manuscript for publication and are accountable for the work presented.