key: cord-0777213-ibjgpzrs authors: Ray, S.; Swift, A.; Fanstone, J. W.; Banerjee, A.; Mamalakis, M.; Vorselaars, B.; Mackenzie, L. S.; Weeks, S. title: LUCAS: A highly accurate yet simple risk calculator that predicts survival of COVID-19 patients using rapid routine tests date: 2021-04-30 journal: nan DOI: 10.1101/2021.04.27.21256196 sha: 35e0173562cc64a66522acb708e5b9c59628fd24 doc_id: 777213 cord_uid: ibjgpzrs There is an urgent need to develop a simplified risk tool that enables rapid triaging of SARS CoV-2 positive patients during hospital admission, which complements current practice. Many predictive tools developed to date are complex, rely on multiple blood results and past medical history, do not include chest X ray results and rely on Artificial Intelligence rather than simplified algorithms. Our aim was to develop a simplified risk-tool based on five parameters and CXR image data that predicts the 60-day survival of adult SARS CoV-2 positive patients at hospital admission. Methods We analysed the NCCID database of patient blood variables and CXR images from 19 hospitals across the UK contributed clinical data on SARS CoV-2 positive patients using multivariable logistic regression. The initial dataset was non-randomly split between development and internal validation dataset with 1434 and 310 SARS CoV-2 positive patients, respectively. External validation of final model conducted on 741 Accident and Emergency admissions with suspected SARS CoV-2 infection from a separate NHS Trust which was not part of the initial NCCID data set. Findings The LUCAS mortality score included five strongest predictors (lymphocyte count, urea, CRP, age, sex), which are available at any point of care with rapid turnaround of results. Our simple multivariable logistic model showed high discrimination for fatal outcome with the AUC-ROC in development cohort 0.765 (95% confidence interval (CI): 0.738 - 0.790), in internal validation cohort 0.744 (CI: 0.673 - 0.808), and in external validation cohort 0.752 (CI: 0.713 - 0.787). The discriminatory power of LUCAS mortality score was increased slightly when including the CXR image data (for normal versus abnormal): internal validation AUC-ROC 0.770 (CI: 0.695 - 0.836) and external validation AUC-ROC 0.791 (CI: 0.746 - 0.833). The discriminatory power of LUCAS and LUCAS + CXR performed in the upper quartile of pre-existing risk stratification scores with the added advantage of using only 5 predictors. Interpretation This simplified prognostic tool derived from objective parameters can be used to obtain valid predictions of mortality in patients within 60 days SARS CoV-2 RT-PCR results. This free-to-use simplified tool can be used to assist the triage of patients into low, moderate, high or very high risk of fatality and is available at https://mdscore.net/. The UK National Health Service (NHS) has experienced unprecedented pressure due to the recurrent surges of coronavirus disease 2019 cases caused by the SARS CoV-2 virus. To date more than two million confirmed cases in the UK have forced healthcare professionals to face complex decisions on how to effectively triage patients on admission who may need acute care. Under these strained conditions there is an urgent need to develop a simplified risk tool that enables rapid triaging of SARS CoV-2 positive patients, which complements predictive models commonly used in practice. Any new model must support the healthcare professional's experiential knowledge and clinical reasoning [1] in both patient care management and allocation of limited healthcare resources. Predictive models such as the National Early Warning Score 2 (NEWS2) is widely used in practice to observe and identify deteriorating patients [2] . Although, favourable amongst healthcare professionals [3] , this predictive tool has caused an increased false trigger rates for SARS CoV-2 positive patients [4] . Another familiar tool in practice in the UK, is the Acute Physiology And Chronic Health Evaluation II (APACHE II) which is a validated intensive care unit (ICU) scoring tool used to estimate ICU mortality, which has underestimated the mortality risk in SARS CoV-2 positive patients [5] . Several published models aimed to meet the clinician's need to stratify high-risk SARS CoV-2 positive patients but are yet to be widely clinically implemented and had poor clinician feedback [6] . Here we aim to avoid 'clinical resistance' by implementing a new prognostic model by developing and validating a new tool that extends on the widely used NEWS2 model [7] . This model combines clinical data, routine laboratory tests and chest X-ray (CXR) to improve risk stratification and clinical intervention for SARS CoV-2 infection. Studies have shown CXR results add value in the triage of patients who are SARS CoV-2 positive [8] ; however, the incremental prognostic value of CXR [9] in addition to clinical data and routine laboratory tests remains to be consolidated in practice. Using statistical methods used previously [10] , we used well-defined predictors and outcomes to limit model overfitting. The model's development and reporting adhered to the TRIPOD (transparent reporting of a multivariable prediction model for individual prediction or diagnosis) guidelines [11] to publish a simple, user-friendly scoring system for adult patients admitted to hospital with SARS CoV-2 infection that will lead to an improved integration of the best evidence into clinical care pathways that will assist clinicians in risk, patient and resource management. The aim of this study was to develop a simple objective tool for risk stratification in SARS CoV-2 infection that would easily support established triage practice in the hospital setting that integrates results from rapid and routine clinical, laboratory and CXR image data. A prospective cohort study was conducted with The National COVID-19 Chest Imaging Database (NCCID) [12] that collated clinical data from secondary and tertiary NHS Trusts across the UK with the aim to support SARS CoV-2 care pathways (Supplementary Table 1 ). NHS staff submitted data for patients suspected of SARS CoV-2 with a RT-PCR test that was positive or negative. The centralized data warehouse stored de-identified hospital admission's clinical data including between 23 rd January 2020 and 7 th December 2020, which was used for both the development and validation of the mortality risk tool. To evaluate the tool's generalized performance, a second dataset was used for external validation. The data extracted from the Laboratory Information Management System (LIMS) came from a large NHS Foundation Trust hospital in north-east England UK, which had not participated in the NCCID initiative. Patients who were admitted with suspected SARS CoV-2 infection at the Accident and Emergency (A&E) Department between 1 st March 2020 and 21 st August 2020 were included. This population was representative of the adult general population and therefore patients aged 16 years and younger were excluded, as the laboratory data thresholds vary compared to adults. Adult patients defined by positive RT-PCR for SARS CoV-2 virus were included from all data sets (development, internal and external validation) for analysis. In order to mitigate the incomplete penetrance of testing during initial stages of the pandemic and limited sensitivity of the RT-PCR swab test, NCCID provided an overall SARS CoV-2 status that was derived from cumulative RT-PCR tests. This approach was also applied to identify SARS CoV-2 positive patients in the external validation data set. To evaluate the tool's predictive performance, the NCCID dataset was non-randomly split with the admissions before 30 th April 2020 that made up the development cohort (n = 1434) and admission on or after 1 st May 2020 that made up the internal validation cohort (n = 310) (Figure 1) . The external validation cohort (n = 741) was filtered by interaction with A&E dates and mortality dates on all SARS CoV-2 positive patients. This approach was rationalised since patients were more comparable on latent variables such as co-morbid illnesses that were not explicitly captured during the LIMS data extraction of laboratory results. In addition, the filtering on admitted patients increased the density of laboratory results across all the candidate predictors, which made the model less reliant on imputation. The primary outcome of interest was in-hospital mortality within 60-days of a positive SARS CoV-2 RT-PCR test. To identify this outcome, the date of death was followed up and entered by hospital staff as part of the NCCID initiative. The NCCID dataset collected 38 clinical data points for each patient that served as candidate predictors. These were categorised into demographics, risk factors, past medical history (PMH), medications, clinical observations, chest imaging data, and laboratory parameters measured on admission. Chest X-ray (CXR) reports from the NCCID model development dataset and the external dataset were dichotomised as normal versus abnormal. All laboratory tests were measured before SARS CoV-2 RT-PCR test and mortality date, which ensured these potential predictors were blinded to the outcome. The SARS CoV-2 swab and RT-PCR results established the final COVID-19 status. In the event of deterioration, data points such as the NEWS2 score, Acute Physiology and Chronic Health Evaluation (APACHE) score, Intensive Therapy Unit (ITU) admission, intubation and mortality date were also recorded. Since the focus of the study was to identify predictors that were available at the point of admission, the ITU admission and intubation data were excluded in the final analysis. From this extensive list, only the demographics, clinical observations, and laboratory parameters recorded at the time of admission for SARS CoV-2 positive patients were included in the analysis as key predictors. Previous medical history, smoking status and current pharmaceutical interventions were not included in the analysis. This rationale was based on consultations with biomedical scientists and clinicians working in A&E departments to determine which rapid and routine laboratory tests would be clinically relevant candidate predictors, as well as previously published literature on prognostic factors associated with SARS CoV-2 infection for the model development [13] . To account for the risk of spurious selection or exclusion of important predictors in the model development, a minimum sample size of 380 was determined based on 10 outcome events per variable. Based on previous studies, a minimum of 100 mortality events and 100 non-mortality events was required for external validation sample size [14] . Reliability of data was ensured by excluding patients from the development and internal validation datasets if information was missing on key characteristics, such as and including RT-PCR SARS CoV-2 positive results, admission and mortality dates outside the 60-day prediction interval (Figure 2) . Predictors with more than 40% missing values were also excluded from the modelling process (Supplementary Figure 1) . Missing data from the remaining predictors were handled using Multiple Imputation by Chained Equations (MICE) [15] and ten different imputed data sets were combined using the Rubin rule [16] . Continuous predictors such as the laboratory tests and age were not converted into categorical predictors. An initial multivariable logistic regression (MLR) model started with 13 predictors with less than 40% missing values (Lymphocyte, Urea, CRP, Age, Sex, WCC, Creatinine, Platelets, Diastolic BP, Systolic BP, Temperature, Heart rate, PaO 2 ) were used to fit a standard logistic model. Elastic Net (GLMNET) [17] , weighted combination of Least Absolute Shrinkage and Selection Operator (LASSO) and ridge regression were used to select predictors of importance. The optimal value of the shrinkage estimator for LASSO and the weight of GLMNET were chosen using a 10-fold cross validation using the R-package caret [18] . Additionally, the performance of the MLR model was compared with the performance of random forest machine learning approach [19] . Another MLR model took a more pragmatic approach in predictor selection that was based on the predictor's clinical relevance in monitoring COVID-19 infection. This model started with 8 predictors (Lymphocyte, Urea, CRP, Age, Sex, WCC, Creatinine, Platelets) that were retained after GLMNET, LASSO and ridge regression became the final model. In addition, the model's performance was evaluated if the inclusion of either NEWS2 or CXR or both improved this model's performance. Each model's performance was assessed using Area Under the Receiving Operator Characteristics Curve (AUC-ROC) with a 95% confidence interval. The internal and external validation of each model were assessed with bootstrap resampling. The predictor selection and entire modelling process were repeated in 1000 bootstrap samples to estimate the realistic predictive measures for future . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 30, 2021. ; https://doi.org/10.1101/2021.04.27.21256196 doi: medRxiv preprint patient cohorts. An AUC-ROC of 0.5 indicated a prediction based on random chance, 0.7 to 0.8 was considered acceptable, above 0.8 demonstrated excellent and a value of 1.0 showed a perfect prediction [20] . We also evaluated the goodness of fit using the Brier score [21] which is a measure to quantify the closeness of the probabilistic predictions to the binary ground-truth class labels. The score varies between 0 and 1, with the lower score indicating superior performance. Finally, the accuracy of the models was evaluated using the standard cut-off value of 0.5. The internal validation included the NEWS2 score; however, NEWS2 score was not available for the external validation dataset. The statistical package R (3.5.3) was used to perform all statistical analysis and the methods employed in the model development, validation workflow and primary model fitting followed by both internal and external validation. De-identified and pseudo-anonymised patient data were obtained from data sets which were approved by the ethics committee as part of the existing Cardiac MRI Database NHS REC IRAS Ref: 222349 and University of Brighton REC (8011). The data that support the findings of this study are available from the NCCID [12] and another NHS site; but restrictions apply to the availability of these data sets, which were used under license for the current study, and are not publicly available. The Simple Logistic Regression based calculator is freely available at https://mdscore.net/ . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 30, 2021. ; https://doi.org/10.1101/2021.04.27.21256196 doi: medRxiv preprint Overall, the NCCID database comprised of 3818 patients (23/01/2020 -31/12/2020). The number of patients that met the inclusion criteria and missingness pattern were 1434 and 310 patients for the development and internal validation data sets, respectively (Figure 1 ). All available data on positive patients were used in the external validation data set and divided between 242 fatal (mortality) and 499 non-fatal (no mortality) cases. The development dataset included 38 data points (continuous and categorical) that followed the removal of date data and ≥ 40% missingness. Summary statistics of median (IQR) for numeric variables and proportions for categorical variable are shown and compared to the internal validation and external validation data sets ( Table 1) . From the NCCID 38 possible predictors we focused on 13 predictor variables based on clinical reasoning that related to demographics, clinical observations, laboratory parameters recorded at the time of admission and did not have more than 40% missing data (Figure 2) . The variable importance plot indicates the relative importance of these predictors (Figure 3 ). The GLMNET model selection criteria optimally chose predictors based on 10-fold cross validation that created the MLR model and had an AUC-ROC of 0.759 ( Table 2 ). In contrast, the random forest (RF) model used all 13 predictors which increased the complexity of the prediction and achieved a slightly higher AUC-ROC value of 0.806. Thus, the simplicity and ease of fitting our MLR model makes it a robust score which can be easily adaptable to new training data. The rapid blood parameter measurements, sex and age recorded at the time of admission and they parameterized our model. The relative difference of the laboratory parameters along with age, between patients with fatal and non-fatal outcome are presented in Figure 4 . Starting the MLR model with these eight predictors, namely, lymphocyte, urea, CRP, age, sex, WCC, creatinine, platelets, and using GLMNET for variable selection, a final model of 5 predictors were retained with AUC-ROC 0.765 ( Figure 5 ). These predictors: Lymphocyte count, Urea, CRP, Age and Sex (LUCAS) became the final model with optimal accuracy results compared to the MLR and random forest models with different number and set of predictors included ( Table 2) . From these results, the model can accurately predict 91% of the true fatal outcomes and 36% of the true nonfatal outcomes. The first step taken to stratify the outcomes from each predictor involved effect plots along with the estimated linear predictor (linear combination of the predictors with the estimated coefficients of MLR) (Figure 5) . Each sub-plot in Figure 5 indicates how the individual predictor is related to the probability of fatal outcome. The binary predictor of sex shows an increase in probability of fatal outcome for males. For the continuous predictor of age, urea and CRP, we can see a clear increase in the probability of fatal outcome corresponding to the increase in value of these predictors. We can also observe that in the presence of other predictors, lymphocyte count is a weak predictor as there is a slight decrease in the probability of fatal outcome with the increase in lymphocyte count. We can also notice that the error band around the mean effects is quite wide on the higher values of CRP and lymphocyte count since there are only a few observations in the higher range of these two predictors. Finally, we can clearly see that the linear combination given by LUCAS has a steep slope with a very narrow error bar giving us a very sensitive and accurate mortality score. . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) Along with the exact values of the probability of fatal outcome based on the five predictors, we propose a risk stratification ( Table 3 ) based on these probabilities which allows a simplified interpretation of mortality to low, medium, high and very high risk of a fatal outcome. Though the percentage of missing predictors in our reduced models were negligible, sensitivity analysis was performed using complete case data and as expected the model performance did not change. We also note that, the simple logistic regression was preferred over machine learning approaches such as Random Forest starting from same set of predictors, based on simplicity and their performance in the validation set. We have performed validation on an internal validation (NCCID Data between 1 st May 2020 and 7 th December 2020) cohort and external validation cohort consisting of data from a separate A+E site (n = 1012; between 1 st March 2020 and 21 st August 2020) with a small overlap with the available predictors from the derivation and internal validation cohort. However, all five predictors from the model were present (LUCAS: Lymphocyte count, Urea, CRP, Age and Sex). The median age of patients from the internal validation dataset was 74 years (interquartile range; IQR 62-84), which was not significantly different from the external validation dataset with a median of 76 years (IQR 61-84). The proportion of male patients was similar in both groups; 58% male in the internal validation set, and 55% male in external validation set ( Table 1) . The performance of the LUCAS calculator on the internal validation cohort was very good; AUC-ROC 0.744 (CI: 0.673 -0.808) with an accuracy 0.796 at the standard cut-off of 0.50 for the probability of a SARS CoV-2 positive patient dying within 60 days of a positive test ( Table 4) . Use of the available NEWS2 data in combination with LUCAS had little effect on accuracy (AUC-ROC 0.747, CI: 0.668 -0.821). In contrast, inclusion of CXR data in combination with LUCAS led to a slight increase in AUC-ROC to 0.770 (CI: 0.695 -0.836). The choice to use the CXR outcome (normal vs abnormal) was therefore included in the simple LUCAS calculator as an optional extra predictor. LUCAS predictor outperformed in external validation by 0.752 (CI: 0.713-0.790) AUC-ROC value contrary to the internal dataset (0.744 CI: 0.673 -0.808). This can justify the robustness and generalization of the calculator since the variation between the two datasets is not high (0.046, robust) and the model outperforms in the external validation (generalization). Including CXR information improved the prediction performance of AUC-ROC to 0.791 (CI: 0.746 -0.833). Note that, the NEWS2 score was not available for the external validation cohort, so this result could not be externally validated. . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 30, 2021. ; https://doi.org/10.1101/2021.04.27.21256196 doi: medRxiv preprint We have developed and validated a simplified, fast track mortality calculator based on three rapid and routine simple predictors plus age and sex, which can be optionally used in combination with CXR results. The LUCAS calculator is freely available, relies on objective measurements only, and has been both internally and externally validated. The use of LUCAS calculator is for use to aid triage on patient admission to A&E following a positive SARS CoV-2 RT PCR test. The robustness and generalized results in the validation process classify the tool as an excellent candidate for risk management of the mortality level in a 60-day survival interval of adult SAR CoV-2 positive patients. The generalisation of LUCAS was justified since the calculator delivered higher accuracy of the external validation compared with the internal validation set. In addition, the incorporation of CXR results as normal versus abnormal improved the prediction performance. The NCCID dataset collected 38 clinical data points for each patient that served as candidate predictors. These were categorised into demographics, risk factors, past medical history, medications, clinical observations, chest-imaging data, and laboratory parameters measured on admission. All laboratory tests were measured before SARS CoV-2 RT-PCR test and mortality date, which ensured these potential predictors were blinded to the outcome. The SARS CoV-2 swab and RT-PCR results established the final COVID-19 status. Moreover, thorough study of other included measures in the event of deterioration, such as the NEWS2 score, was also evaluated. As a result, a comprehensive study and evaluation of all the possible blood markers and clinical patient information was taken into consideration. This study shows that a simple and objective tool can risk stratify SARS CoV-2 positive patients within one hour after hospital admission. The primary objective showed that rapid and routine laboratory blood tests and chest imaging data added predictive value beyond the RT-PCR test and clinical observations with high AUC-ROC. There have been a large number of prognostic tools published, most notably the 4C Mortality Score [22] and QCOVID [23] which were based on very large cohort datasets and includes a large number of predictors in their algorithms. Our study is the first to combine a minimum number of blood results along with CXR data, so as to generate a simplified calculator based on as few objective predictors as possible. The 4C Mortality Score includes a large number of parameters including PMH, demographics and blood measurables, which results a high AUC ROC of 0.79. However, gaining an accurate past medical history during triage is not always practical, and the 4C calculator was not externally validated, and its use has been limited due to its complexity. Our aim was to use the minimum number of predictors without losing accuracy, which was achieved, resulting in LUCAS which exhibits a similar level of prediction as the more complex and detailed 4C algorithm. The primary QCOVID score was developed as a risk prediction algorithm to estimate hospital admission and mortality outcomes, which also included a large number of predictors, including PMH. More importantly the authors clearly state that their score is "not intended for use supporting or informing clinical decision-making." Numerous prediction models have been developed to aid triage and research into COVID-19 disease severity. While there has been a great deal of useful insight into the disease gained from these studies and prognostic tools, there is a range of outcomes mostly due to some having a high risk of bias, lack of transparency and lack of internal [24] and or external validation [22, 25] . Our study improves on these issues by conforming to PROBAST and being both internally validated from the same large NCCID dataset, and externally validated in a smaller, separate hospital database. In addition, many studies require past medical history, or base the prediction solely mainly on the underlying health . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 30, 2021. ; https://doi.org/10.1101/2021.04.27.21256196 doi: medRxiv preprint conditions of the patient [25, 26] . These data may be difficult to assess accurately on admission to hospital, and may mislead should the patient have undiagnosed conditions. For this reason, we focused our model on measurables taken routinely on admission. Inclusion of CXR data is optional on the online LUCAS calculator, and based on simple outcome of normal/abnormal image results. The ability to include CXR results is not widely used in other prediction-modelling calculators and has been included in one other study [25] along with ten other parameters (symptoms, past medical history and measurables). Our focus however was to form a simplified model on blood results and with the option of CXR images, which we have achieved. All of the predictors used in LUCAS have individually been indicated as important in COVID-19 severity and mortality, although our study is the first to use only these predictors in a prognostic score. Leukocytes, Urea and CRP are recognised as key measurable predictors of severity of SARS CoV-2 infection, and age and sex are well-known predictors of mortality [22, 23, 27] . Throughout the study, we have carefully considered the risk of bias that is inherent in retrospective studies. By conducting both internal and external validation the study here indicates a robust model, with reduced bias since only patients testing positive for SARS CoV-2 were included in the development of the LUCAS algorithm. The size of the external validation set was smaller than the development set allowing us to check for discrimination of population size, and the results indicate that the LUCAS calculator can predict form small cohorts as well as it can from larger size populations. The patient data was collected at an early stage of the pandemic, when treatments differed to later in the year, which would affect the death rate in hospital, and our results do not account for non-hospital death that may occur outside of the hospital or in the 60-day window following diagnosis. Over time, any algorithm of mortality will change due to improvements in therapies as well as the use of vaccination which will change the profile of those at risk of COVID-19 related death. In conclusion, there are major strengths of this study including the analytical approach taken, the comparison to NEWS2 score, the inclusion of CXR data and the resulting algorithm is both internally and externally validated. The CXR data contributes an improvement in the prediction of the mortality of the patient, however the results used in the LUCAS calculator is basic. Future development of this work will be to extract pixel to pixel information based on CXR medical image analysis using deep learning or machine learning methods. Thus, as feature work we will combine LUCAS with ML and DL models to deliver the option of CXR imaging analysis when this is available. Contributors: SW, LM are joint last authors. LM is corresponding author for this article. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 30, 2021. . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 30, 2021. ; https://doi.org/10.1101/2021.04.27.21256196 doi: medRxiv preprint Figure 5 . Predictor plots; Multivariate association between predictors (Age, Sex, Urea, CRP and Lymphocyte count) and probability of fatal outcome. . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 30, 2021. ; https://doi.org/10.1101/2021.04.27.21256196 doi: medRxiv preprint Building on experience--the development of clinical reasoning Early warning scores: a sign of deterioration in patients and systems National Early Warning Score 2 (NEWS2) to identify inpatient COVID-19 deterioration: a retrospective analysis Analysis of Critical Care Severity of Illness Scoring Systems in Patients With Coronavirus Disease 2019: A Retrospective Analysis of Three U.K. ICUs Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal What every intensivist should know about prognostic scoring systems and risk-adjusted mortality. Rev Bras Ter Intensiva Chest X-ray for predicting mortality and the need for ventilatory support in COVID-19 patients presenting to the emergency department DenResCov-19: A transfer deep-learning network for robust automatic classification of COVID-19, pneumonia and tuberculosis from X-rays Use of Machine Learning and Artificial Intelligence to predict SARS-CoV-2 infection from Full Blood Counts in a population PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies Using imaging to combat a pandemic: rationale for developing the UK National COVID-19 Chest Imaging Database Prognostic factors in patients admitted to an urban teaching hospital with COVID-19 infection Substantial effective sample sizes were required for external validation studies of predictive logistic regression models Flexible Imputation of Missing Data Multiple Imputation for Nonresponse in Surveys Regularization and variable selection via the elastic net Building Predictive Models in R using the caret Package Random Forests Receiver operating characteristic curve in diagnostic test assessment Assessing the performance of prediction models: a framework for traditional and novel measures Risk stratification of patients admitted to hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: development and validation of the 4C Mortality Score Living risk prediction algorithm (QCOVID) for risk of hospital admission and mortality from coronavirus 19 in adults: national derivation and validation cohort study Calculated Decisions: Brescia-COVID Respiratory Severity Scale (BCRSS)/Algorithm Development and Validation of a Clinical Risk Score to Predict the Occurrence of Critical Illness in Hospitalized Patients With COVID-19 Predicting in-hospital mortality from Coronavirus Disease 2019: A simple validated app for clinical use Clinical and laboratory features of COVID-19: Predictors of severe prognosis The researchers would like to thank the biomedical scientists at NHS Greater Glasgow and Clyde hospitals and Portsmouth Hospital NHS Trust for their advice. All researchers declare no competing interests.