key: cord-0889263-ynw2i48b authors: Cheng, Jianhong; Sollee, John; Hsieh, Celina; Yue, Hailin; Vandal, Nicholas; Shanahan, Justin; Choi, Ji Whae; Tran, Thi My Linh; Halsey, Kasey; Iheanacho, Franklin; Warren, James; Ahmed, Abdullah; Eickhoff, Carsten; Feldman, Michael; Mortani Barbosa, Eduardo; Kamel, Ihab; Lin, Cheng Ting; Yi, Thomas; Healey, Terrance; Zhang, Paul; Wu, Jing; Atalay, Michael; Bai, Harrison X.; Jiao, Zhicheng; Wang, Jianxin title: COVID-19 mortality prediction in the intensive care unit with deep learning based on longitudinal chest X-rays and clinical data date: 2022-02-19 journal: Eur Radiol DOI: 10.1007/s00330-022-08588-8 sha: dae6ef152c5cd6f564183891ee0919138163680b doc_id: 889263 cord_uid: ynw2i48b OBJECTIVES: We aimed to develop deep learning models using longitudinal chest X-rays (CXRs) and clinical data to predict in-hospital mortality of COVID-19 patients in the intensive care unit (ICU). METHODS: Six hundred fifty-four patients (212 deceased, 442 alive, 5645 total CXRs) were identified across two institutions. Imaging and clinical data from one institution were used to train five longitudinal transformer-based networks applying five-fold cross-validation. The models were tested on data from the other institution, and pairwise comparisons were used to determine the best-performing models. RESULTS: A higher proportion of deceased patients had elevated white blood cell count, decreased absolute lymphocyte count, elevated creatinine concentration, and incidence of cardiovascular and chronic kidney disease. A model based on pre-ICU CXRs achieved an AUC of 0.632 and an accuracy of 0.593, and a model based on ICU CXRs achieved an AUC of 0.697 and an accuracy of 0.657. A model based on all longitudinal CXRs (both pre-ICU and ICU) achieved an AUC of 0.702 and an accuracy of 0.694. A model based on clinical data alone achieved an AUC of 0.653 and an accuracy of 0.657. 
The addition of longitudinal imaging to clinical data in a combined model significantly improved performance, reaching an AUC of 0.727 (p = 0.039) and an accuracy of 0.732. CONCLUSIONS: The addition of longitudinal CXRs to clinical data significantly improves mortality prediction with deep learning for COVID-19 patients in the ICU. KEY POINTS: • Deep learning was used to predict mortality in COVID-19 ICU patients. • Serial radiographs and clinical data were used. • The models could inform clinical decision-making and resource allocation. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00330-022-08588-8. The novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) disease (COVID-19) was first detected in Wuhan, China, in late December 2019 and quickly became a global health crisis [1]. As of August 2021, almost 200 million confirmed cases have been reported globally with over four million deaths. In the USA alone, over 600,000 deaths are attributable to the virus [2]. Typical symptoms include fever, dyspnea, cough, and muscle aches; however, the disease can cause severe cardiorespiratory complications, particularly in vulnerable populations (e.g., the elderly and those with comorbidities) [3]. Despite rapid vaccine development and extensive public health mitigation efforts, COVID-19 remains a global health emergency. Additionally, novel variants threaten to exacerbate the severity and duration of the pandemic [4, 5]. Thoracic imaging, such as computerized tomography (CT) and chest radiograph (CXR), plays a key role not only in initial COVID-19 detection and diagnosis, but also in the continuous monitoring of disease progression and treatment efficacy during extended hospital stays [6-8]. 
While CXR is less sensitive for the detection of pneumonia associated with COVID-19 [9, 10], particularly in less advanced stages, it is a helpful and versatile tool for monitoring the rapid pulmonary progression that is often seen in patients in the intensive care unit (ICU) [8, 11]. Moreover, longitudinal CXRs may provide vital information for risk stratification, clinical decision-making, and resource allocation [8]. Chest X-ray can be performed at the bedside in many cases, making it readily accessible and further increasing clinical utility, particularly in resource-limited settings [12]. Despite the potential of regular monitoring with CXRs to improve clinical care, longitudinal imaging is burdensome for radiologists. Given the current prevalence of COVID-19, manual, timely, and accurate interpretation of images is often logistically impossible, particularly for rapidly deteriorating ICU patients. Additionally, human readers are prone to variability, fatigue, and unconscious bias. To address these challenges, researchers have proposed artificial intelligence (AI) based tools to automate chest imaging interpretation and improve accuracy [11, 13-16]. For instance, AI with deep learning can predict the severity and progression of COVID-19 patients based on initial CXRs and clinical variables at presentation to the emergency department (ED) [16]. A model based on longitudinal CXRs may improve outcome prediction and inform clinical decision-making and resource allocation for critically ill patients. The purpose of this study was to develop deep learning models using longitudinal CXRs and clinical variables to predict in-hospital mortality of COVID-19 patients in the ICU. 
Clinical data acquisition and preprocessing

A retrospective chart review was performed between March 2020 and December 2020 to identify consecutive patients who presented to the EDs of two independent hospital systems, the University of Pennsylvania Health System in Philadelphia, PA, USA, and Brown University-affiliated hospitals in Providence, RI, USA. The institutional review boards of both institutions approved the study, and the requirement for written informed consent was waived. Patients were only included in the study if there was a positive reverse transcriptase-polymerase chain reaction (RT-PCR) test for COVID-19 (COVID-19 RT-PCR test; LabCorp). Furthermore, to focus outcome prediction on critically ill patients, only those who were admitted to the ICU were included. To allow for longitudinal assessment, only patients with at least two CXRs in anteroposterior view obtained in the ICU were included. A subset of the data has previously been published [16-19]. In the study by Jiao et al [16], all patients from the University of Pennsylvania (N = 1834) and Brown University-affiliated hospitals (N = 475) who presented to the ED with a PCR-confirmed COVID-19 diagnosis were included. Deep learning was then used to predict disease severity and progression based on single-timepoint baseline chest X-rays and clinical variables. In the study by Wang et al [17], a further subset of the patients from the University of Pennsylvania (N = 144) and Brown University-affiliated hospitals (N = 31) who presented to the ED with a PCR-confirmed COVID-19 diagnosis and available baseline CT scans were included. Deep learning was then used to predict deterioration to critical illness based on imaging and clinical data. Two earlier studies used subsets of the patients in Wang et al [17] to assess the performance of radiologists in diagnosing COVID-19 [18] and the utility of AI to augment diagnosis by radiologists with baseline CT scans [19]. 
The current study expands upon the previous work by predicting mortality in a subset of critically ill ICU patients from Jiao et al [16] based on longitudinal CXRs and clinical data. For each patient, demographic, clinical, and laboratory variables taken on admission to the ICU were collected, including age, sex, temperature, oxygen saturation on room air (SpO2), absolute white blood cell count (WBC), absolute lymphocyte count, serum creatinine concentration, serum C-reactive protein (CRP) concentration, and comorbidities such as cardiovascular disease (CVD), hypertension (HTN), chronic obstructive pulmonary disease (COPD), chronic liver disease, chronic kidney disease, cancer, and human immunodeficiency virus (HIV). All continuous variables were binarized prior to analysis: fever was defined as a temperature of > 37°C, low SpO2 as < 94%, high absolute WBC as > 11 × 10^9 cells/L, low absolute lymphocyte count as < 1 × 10^9 cells/L, high serum creatinine concentration as > 1.27 mg/dL, and high serum CRP concentration as > 1 mg/dL. The binary outcome of in-hospital mortality was also recorded. For patients meeting inclusion criteria, all CXRs obtained during ICU stay were identified (ICU CXRs). Furthermore, all CXRs obtained prior to ICU admission but during the hospital stay for the same disease course were identified (pre-ICU CXRs). CXRs with overall poor quality were excluded. Images were downloaded from the hospital picture archiving and communications system. Images were inverted as necessary so that air cavities appeared dark, then padded and resized to 512 × 512 resolution. Then, pixel values were normalized and scaled to the range [0, 1]. Finally, CXRs were segmented to generate lung masks for input to the deep learning model [16]. Imaging and clinical variables from the University of Pennsylvania were used to train longitudinal transformer-based network (LTBN) models to predict the binary outcome of in-hospital mortality of COVID-19 patients in the ICU (Figure 1). 
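The binarization thresholds and pixel scaling described above can be illustrated with a short Python sketch. This is not the authors' code: the function names `binarize_clinical` and `scale_to_unit` and the dictionary keys are hypothetical, and min-max scaling is an assumption, since the text states only that pixel values were normalized and scaled to [0, 1].

```python
# Cutoffs are taken from the text; the key names are hypothetical.
THRESHOLDS = {
    "temperature_c": (">", 37.0),          # fever
    "spo2_pct": ("<", 94.0),               # low SpO2
    "wbc_1e9_per_l": (">", 11.0),          # high absolute WBC
    "lymphocytes_1e9_per_l": ("<", 1.0),   # low absolute lymphocyte count
    "creatinine_mg_dl": (">", 1.27),       # high serum creatinine
    "crp_mg_dl": (">", 1.0),               # high serum CRP
}


def binarize_clinical(record):
    """Return a 0/1 flag for each thresholded variable present in `record`."""
    flags = {}
    for name, (op, cutoff) in THRESHOLDS.items():
        if name not in record:
            # Handling of missing values is not described in the text,
            # so absent variables are simply skipped here.
            continue
        value = record[name]
        abnormal = value > cutoff if op == ">" else value < cutoff
        flags[name] = int(abnormal)
    return flags


def scale_to_unit(pixels):
    """Min-max scale a flat list of pixel values to [0, 1] (assumed scheme)."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0.0 for _ in pixels]  # constant image: avoid division by zero
    return [(p - lo) / (hi - lo) for p in pixels]
```

Note that the comparisons are strict, matching the inequalities in the text, so a creatinine value of exactly 1.27 mg/dL is not flagged as elevated.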
The data from the University of Pennsylvania were randomly divided into two parts, 80% of which were used for training and 20% for internal validation. Finally, the models were tested on an external dataset derived from Brown University-affiliated hospitals. Five models were evaluated: (1) longitudinal CXRs before admission to the ICU ("pre-ICU model"), (2) longitudinal CXRs during the ICU stay ("ICU model"), (3) all longitudinal CXRs (pre-ICU and ICU) ("longitudinal model"), (4) demographic, clinical, and laboratory variables at the time of ICU admission only ("clinical model"), and (5) all longitudinal CXRs (pre-ICU and ICU) and clinical variables ("combined model"). For mortality prediction based on clinical variables, a model with three fully connected layers with 128, 32, and two neurons was established. To prevent overfitting, a dropout layer, which randomly set the input neuron to zero with a probability of 0.2, was embedded between the first two fully connected layers. To determine the relative importance of different clinical variables in predicting mortality, random forest (RF) models were utilized [20] . For mortality prediction based on CXRs, a LTBN consisting of Resnet-50 [21] and Vision Transformer (ViT) [22] , termed "R50-ViT," was designed to extract both local and longitudinal global representation features. The proposed framework takes a series of longitudinal CXRs and the corresponding lung mask as input and generates features from the lung parenchyma region. The extracted features from all longitudinal CXRs are then combined using global average pooling and global max pooling operations. Finally, the combined features are fed into two fully connected layers with 1536 and two neurons and a softmax activation function to generate a probability score for mortality risk. 
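The pooling step that merges per-timepoint features into one patient-level vector can be sketched as follows. This is a minimal illustration under stated assumptions, not the published architecture: the text does not say how the average-pooled and max-pooled features are merged, so concatenation is assumed here, and the names `pool_longitudinal` and `softmax` are hypothetical helpers.

```python
import math


def pool_longitudinal(features):
    """Pool a list of per-timepoint feature vectors into one vector.

    Global average pooling and global max pooling are applied across
    timepoints, and the two results are concatenated (an assumption).
    """
    if not features:
        raise ValueError("need at least one timepoint")
    n = len(features)
    dims = list(zip(*features))        # one tuple per feature dimension
    avg = [sum(d) / n for d in dims]   # global average pooling
    mx = [max(d) for d in dims]        # global max pooling
    return avg + mx


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Because both pooling operations reduce over the time axis, the output length is independent of the number of CXRs, which lets patients with different numbers of images share the same fully connected head.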
The combined mortality prediction model was derived from the weighted sum of the longitudinal model and the clinical model, and the weights were obtained by training a fully connected layer. Additional details of the model architecture are provided in the supplementary materials (Figure R1). The proposed models were implemented using Python (Version 3.6) and were run on two NVIDIA V100 GPUs for data parallel training. The network was trained with the Adam optimizer with an initial learning rate of 0.0005 and a poly learning rate strategy, in which the initial rate decays by each iteration with a power of 0.9. The batch size was set as one for each GPU, and the model was trained for 500 epochs. The codebase used in this study is available online (https://github.com/chengjianhong/Covid-19-CXR.git). The full dataset used to train and evaluate the models is not available for public access because of patient privacy concerns but is available from the corresponding authors if there is a reasonable request and approval from the institutional review boards of the affiliated institutions. Differences in demographic, clinical, and laboratory variables between the training and testing sets and between patients who had died and who had survived were assessed using Student's t-test for continuous variables and the chi-square test for categorical variables. Results are presented as median (interquartile range [IQR]) for continuous variables and as number (percentage) for categorical data. A two-sided p < 0.05 was considered statistically significant. Model performance was evaluated with area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and F1-score. The funding source had no role in the study design, data collection, data analysis, interpretation, or writing of the report. All the authors have full access to the data and take full responsibility for the contents of this report and the decision to submit it for publication. 
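The poly learning rate strategy mentioned in the training details is commonly written as lr = base_lr × (1 - iteration / max_iterations)^power. The sketch below uses that common form with the stated initial rate of 0.0005 and power of 0.9. The function name `poly_lr` and the `max_iterations` argument are assumptions, because the text does not give the total number of iterations or the exact formula used.

```python
def poly_lr(iteration, max_iterations, base_lr=0.0005, power=0.9):
    """Polynomial ("poly") learning rate decay.

    base_lr and power follow the values stated in the text;
    max_iterations is a hypothetical parameter supplied by the caller.
    """
    if max_iterations <= 0:
        raise ValueError("max_iterations must be positive")
    if not 0 <= iteration <= max_iterations:
        raise ValueError("iteration must lie in [0, max_iterations]")
    return base_lr * (1.0 - iteration / max_iterations) ** power
```

With a power below 1, the rate falls slightly slower than linearly for most of training and drops to zero at the final iteration.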
Fig. 1 Data analysis workflow and machine learning architecture. The clinical model consisted of three fully connected layers with 128, 32, and two neurons. A dropout layer with a probability of 0.2 was embedded between the first two layers. The CXR model used R50-ViT with two dense layers of 1536 and two neurons. CXR: chest X-rays

Retrospective chart review identified 546 patients at University of Pennsylvania-affiliated hospitals and 108 at Brown University-affiliated hospitals meeting inclusion criteria. Patients from the University of Pennsylvania were designated as the training set, and patients from Brown University were designated as the testing set. A total of 5645 CXRs were available for analysis (Figure 2). The median number of CXRs per patient was 4 (IQR 2-10) in the training set and 6 (IQR 2-14.5) in the testing set. The median number of days from the last ICU CXR to death was 2 (IQR 1-5) for the training set, 2 (IQR 1-6) for the testing set, and 2 (IQR 1-5) for all patients. There were no statistically significant demographic or clinical differences between the training and testing sets, except for a higher incidence of chronic kidney disease in patients in the training set (p = 0.016). Across the training and testing sets combined, 212 patients (32%) had died and 442 (68%) had survived. A higher proportion of patients died in the testing set as compared to the training set (p = 0.010). In terms of laboratory variables at the time of ICU admission, a larger proportion of deceased patients had elevated absolute WBC count (p = 0.0092), decreased absolute lymphocyte count (p = 0.0054), and elevated creatinine concentration (p < 0.001). In terms of comorbidities, deceased patients had a higher incidence of CVD (p = 0.015), COPD (p = 0.0019), and chronic kidney disease (p = 0.036). A detailed summary of demographic, laboratory, and clinical variables is provided in Tables 1 and 2. Model performance is reported in Table 3, Table 4, and Figure 3. 
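For reference, the evaluation metrics named in the Methods (AUC, accuracy, sensitivity, specificity, and F1-score) can be computed from labels and predictions as in the following sketch. It is an illustration only and not the authors' evaluation code: the function names are hypothetical, the deceased class is assumed to be encoded as 1, and the AUC is computed with the pairwise (Mann-Whitney) formulation rather than any particular library routine.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 for binary labels (1 = deceased)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else 0.0  # undefined ratios reported as 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of patients, which is acceptable for a test set of about one hundred patients but would be replaced by a rank-based computation for larger cohorts.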
The relative importance of clinical features was investigated in the training (Figure 4) and testing (Figure 5) datasets using RF models. In both datasets, age was found to be highly prognostic of mortality risk, followed by the presence or absence of comorbid CVD and elevated creatinine concentration. Of the other comorbidities considered, the least important were HIV and chronic liver disease. Comorbid COPD was found to be only moderately important and was more important in the training than the testing dataset.

Fig. 2 Chest X-rays in the training and testing sets. Chest X-rays were collected from the time of initial presentation to the emergency department up until either death in the ICU or discharge from the ICU. The total number of chest X-rays for each dataset that was collected before admission to the ICU (pre-ICU) and during the ICU stay are shown along with the median number per patient. N: number; IQR: interquartile range; ICU: intensive care unit

This study demonstrates that a deep learning model based on longitudinal CXRs and clinical information performs well in predicting in-hospital mortality of COVID-19 patients in the ICU. Five separate LTBN models were trained, and their performances were compared. The longitudinal imaging model, which included all CXRs from the time of ED presentation to the time of death or discharge, performed slightly better than the models based on pre-ICU CXRs, ICU CXRs, and clinical data only. A combined model based on longitudinal imaging and clinical data significantly outperformed one based on clinical data alone. The proposed deep learning model has the potential to improve the triage of critically ill COVID-19 patients and improve resource allocation in the ICU. By stratifying patients by high and low risk, the model could help identify which patients should be prioritized for CT or escalation of care, particularly in resource-limited settings. 
Chest X-rays have advantages over CT scans, particularly for ICU patients. First, CXRs are often portable and can be performed at the patient bedside, negating the need for transportation, which could prove particularly difficult for patients requiring mechanical ventilation. Second, there is less contamination risk with CXRs. The American College of Radiology recommends a thorough cleaning of CT machines by someone wearing full protective equipment following each scan [23] . Moreover, CT rooms may need to be unavailable for approximately 1 h following imaging of infected patients to allow for proper air circulation [23] . Given the prevalence of COVID-19 and the ongoing burden placed on hospital systems, such a delay could lead to substantial problems with patient care. Unlike CT machines, the surfaces of portable CXRs can be easily cleaned and even transported to ambulatory care facilities when deemed medically appropriate. This study is novel in that it considers longitudinal CXRs rather than single-timepoint imaging. The results indicate that the addition of more time-series information slightly improves model performance, as the full longitudinal model was more accurate and sensitive and had a higher AUC and F1-score than both the pre-ICU and ICU models. Several previous studies have used single-timepoint imaging acquired at the time of hospital admission to predict in-hospital mortality or disease progression with machine learning and statistical modeling approaches. For instance, a previous study by our group found that deep learning based on the initial CXR and clinical variables at presentation to the ED can predict disease severity and progression with an AUC of 0.846 and 0.792, respectively, on external datasets [16] . In another study by our group, deep learning models predicted progression to critical illness with a concordance index of 0.80 in ED patients with baseline CT and clinical data [17] . 
Likewise, Fang et al [24] used chest CT features to develop a severity score at baseline, which was used to train three machine learning models to predict the risk of in-hospital mortality and ICU admission. The best AUCs achieved were 0.813 for predicting ICU admission and 0.741 for predicting mortality. Maroldi and colleagues [25] used a semi-quantitative approach to manually score baseline CXRs at hospital presentation. Multivariate logistic regression found that these scores correlated well with subsequent in-hospital mortality. Beyond mortality risk prediction, researchers have also used baseline imaging to predict the length of hospital stay. Wang et al [26] used a deep learning model to stratify patients into high- or low-risk groups based on hospital stay duration using baseline CT features. However, it is difficult to compare results from these studies to the present one, as the present study focused specifically on ICU patients, while the previous studies included all patients admitted to the hospital. Beyond the use of longitudinal imaging, another strength of this study is that it combines imaging and clinical variables into a single model. The results indicate that the addition of longitudinal imaging significantly improves the clinical-only model, increasing the AUC from 0.653 to 0.727 (p = 0.039) and the accuracy from 0.657 to 0.732. In the study by Jiao and colleagues [16], the addition of single-timepoint CXR data to the clinical-only model improved both progression and severity predictions. Still, most previous studies that have used machine learning or statistical modeling to predict in-hospital mortality have relied solely on clinical or laboratory variables rather than exploring the combination of imaging and clinical data. Our clinical-only model achieved a moderate performance, with an AUC of 0.653. Other models using clinical or laboratory variables to predict prognosis in COVID-19 patients have achieved better performance. 
For instance, Zhu et al [27] considered 78 clinical variables collected at the time of hospital presentation to predict mortality, with the top five most important variables allowing the model to achieve a best AUC of 0.968. Likewise, Hu et al [28] found that four clinical variables could predict in-hospital mortality with good accuracy, and Ko et al [29] found that an ensemble model based on deep neural networks and RFs could predict in-hospital mortality based on 28 blood biomarkers with 100% sensitivity. Another ensemble model with four machine learning methods based on 14 clinical variables was able to stratify patients by mortality risk with a best AUC of 0.976 [30]. In a large cohort, Vaid and colleagues [31] used clinical variables at admission to predict in-hospital mortality and clinical events at three, five, seven, and 10 days from admission, achieving a best AUC of 0.88 at 3 days. Multiple other studies have used similar methods [32-38]. While our clinical-only model performed worse than many in the literature, several factors should be considered. First, we only used variables that are routinely collected in the ICU to maximize the potential for integration of the model into the existing clinical workflow. In contrast, Zhu et al [27] considered 78 total variables, and Ko et al [29] used 28 blood biomarkers. Both studies additionally used D-dimer concentrations, which we did not. A further consideration is that we chose to focus only on critically ill ICU patients. It may be inherently more difficult to predict outcomes in these patients, given that treatment recommendations evolved over the course of the outbreak. As such, patients diagnosed early may have been treated very differently from those diagnosed later, and consequently, their outcomes may be different despite similarities in baseline clinical and laboratory findings. 
To identify the clinical and demographic variables that were most predictive of mortality, we performed an RF analysis. In both the training and testing datasets, age was found to be highly prognostic of mortality risk, followed by the presence or absence of comorbid CVD and elevated creatinine concentration. In other studies that performed similar analyses, age was also the most valuable factor in mortality prediction [22, 31, 33-39]. Similarly, a high concentration of CRP and the presence of one or more comorbidities were also found to be highly predictive [21, 22, 30, 31, 33, 34, 36-39]. Finally, high respiratory rate and SpO2 at admission were important for predictive accuracy [30, 33, 35, 38]. Our results indicate that SpO2 is only moderately important for mortality prediction compared to the other variables. Because our cohort was limited to critically ill ICU patients, a large percentage of both the survivor (39%) and non-survivor (48%) cohorts had low SpO2 (SpO2 < 94%; p = 0.055). In comparison, for general COVID-19 positive patients, the average SpO2 of non-survivors is typically significantly lower (e.g., SpO2 87%) than that of survivors (e.g., SpO2 97%) [27]. Since the SpO2 was universally decreased and less variable in our cohort of ICU patients than in other studies, SpO2 was not as prognostically important in our model. This study has several limitations. Like most machine learning models, there is a concern for generalizability, especially given that the model was trained using data obtained from a single institution, and the clinical landscape of the pandemic is quickly evolving. According to recent studies, while ICU admissions may be increasing due to increased virulence of the delta variant, death rates of ICU patients are relatively low, and there is an increasing incidence in children [4, 5, 39]. 
It also remains unclear whether the novel omicron variant causes more severe disease compared to infections with other variants. Moreover, the cohort in this study was entirely unvaccinated, as data collection was terminated in December 2020, and vaccines were not widely available until January 2021 [40] . It is thus unknown whether the current model will work in a highly vaccinated population with different variants of the disease. In fact, a recent study demonstrated that there was a deficit in all-cause mortality in a highly vaccinated population during the initial delta variant period from June 2021 to August 28, 2021 [41] . Given the lack of data on chest imaging characteristics of vaccinated individuals with novel variants, we are unable to predict the utility of this model on the current ICU population. Another limitation is that we did not include treatment as a clinical variable, which may be an important consideration, given that critically ill patients are often treated aggressively and with a wide variation of approaches. Also, given the retrospective nature of data collection, there were variable numbers of CXRs collected at different times for different patients, prohibiting the development of a truly longitudinal model. Another limitation is that the median time from the last ICU CXR to death was 2 days, which provides a short window for aggressive intervention for patients identified as high risk. As shown in supplementary Figure R2 , the visual appearance of CXRs from patients with different clinical outcomes was highly variable. In some cases, a radiologist would likely be able to make an accurate prediction of mortality risk with purely visual observation. Nevertheless, timely interpretation of images by a radiologist is often logistically impossible for rapidly deteriorating ICU patients during COVID-19 surges. 
Reading delays could potentially lead to the postponement of life-saving interventions for critically ill patients with imminent mortality risk. The automated AI model could triage incoming CXRs rapidly, allowing radiologists to prioritize workflow. Still, it would be of interest to test model performance when only early CXRs are considered (e.g., those obtained more than 7 days before death). While the best-performing model achieved good performance, whether the model is useful for triage in a real clinical setting is unclear. This will require prospective testing, which is a future aim. In the current model, the 95% confidence intervals for sensitivity and specificity ranged from 0.609 to 0.822 and 0.648 to 0.833, respectively. As such, at best, 17.8% of patients who progress to mortality may be missed (false negatives), and 16.7% of patients who survive may be identified as high risk (false positives). At worst, 39.1% of patients who progress to mortality may be missed (false negatives), and 35.2% of patients who survive may be identified as high risk (false positives). Despite the potential for false negatives and positives, we hypothesize that the model would be clinically useful for rapidly triaging patients, particularly in overburdened ICUs. Finally, the models were designed to predict mortality as a binary outcome rather than predicting overall survival time. A model which predicts not only the occurrence of mortality but also time to mortality would be of greater clinical utility. In summary, we demonstrate that a deep learning model based on longitudinal CXRs and routinely collected clinical variables performs well in predicting in-hospital mortality of COVID-19 patients in the ICU. The addition of longitudinal CXRs improves the performance of models based on clinical data alone. Although prospective validation is required, the model has the potential to improve clinical decision-making and resource allocation for critically ill COVID-19 patients. 
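The best-case and worst-case error rates quoted in the limitations follow from taking the complement of each bound of the reported 95% confidence intervals: the false-negative rate is 1 - sensitivity, and the false-positive rate is 1 - specificity. A small worked example, in which the helper name `error_rate_range` is hypothetical:

```python
def error_rate_range(ci_low, ci_high):
    """Return (best_case, worst_case) error rate in percent.

    For a sensitivity interval this gives the false-negative range;
    for a specificity interval it gives the false-positive range.
    """
    best = round((1.0 - ci_high) * 100.0, 1)   # upper bound -> fewest errors
    worst = round((1.0 - ci_low) * 100.0, 1)   # lower bound -> most errors
    return best, worst


# Intervals reported in the text for the best-performing model.
false_negative_range = error_rate_range(0.609, 0.822)   # from sensitivity
false_positive_range = error_rate_range(0.648, 0.833)   # from specificity
```

These ranges reproduce the figures in the text: 17.8% to 39.1% of deaths missed, and 16.7% to 35.2% of survivors flagged as high risk.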
A novel coronavirus from patients with pneumonia in China
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Effectiveness of COVID-19 vaccines against the B.1.617.2 (Delta) variant
COVID-19: Delta variant is now UK's most dominant strain and spreading through schools
Coronavirus disease 2019 (COVID-19): role of chest CT in diagnosis and management
Chest CT findings in coronavirus disease 2019 (COVID-19): relationship to duration of infection
COVID-19 outbreak in Italy: experimental chest X-ray scoring system for quantifying and monitoring disease progression
COVID-19 pneumonia manifestations at the admission on chest ultrasound, radiographs, and CT: single-center study and comprehensive radiologic literature review
Frequency and distribution of chest radiographic findings in patients positive for COVID-19
Predicting COVID-19 pneumonia severity on chest X-ray with deep learning
The role of imaging in 2019 novel coronavirus pneumonia (COVID-19)
Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT
AI-based analysis of CT images for rapid triage of COVID-19 patients
Using artificial intelligence for COVID-19 chest X-ray diagnosis
Prognostication of patients with COVID-19 using artificial intelligence based on chest x-rays and clinical data: a retrospective study
Artificial intelligence for prediction of COVID-19 progression using CT imaging and clinical data
Performance of radiologists in differentiating COVID-19 from non-COVID-19 viral pneumonia at chest CT
Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT
Random forest
Deep residual learning for image recognition
An image is worth 16x16 words: transformers for image recognition at scale
ACR recommendations for the use of chest radiography and computed tomography (CT) for suspected COVID-19 infection
Association of AI quantified COVID-19 chest CT and patient outcome
Which role for chest x-ray score in predicting the outcome in COVID-19 pneumonia?
A fully automatic deep learning system for COVID-19 diagnostic and prognostic analysis
Deep-learning artificial intelligence analysis of clinical variables predicts mortality in COVID-19 patients
Early prediction of mortality risk among patients with severe COVID-19, using machine learning
An artificial intelligence model to predict the mortality of COVID-19 patients at hospital admission time using routine blood samples: development and validation of an ensemble model
Machine learning based early warning system enables accurate mortality risk prediction for COVID-19
Machine learning to predict mortality and critical events in a cohort of patients with COVID-19 in New York City: model development and validation
Machine learning for mortality analysis in patients with COVID-19
Prognostic modeling of COVID-19 using artificial intelligence in the United Kingdom: model development and validation
Clinical and inflammatory features based machine learning model for fatal risk prediction of hospitalized COVID-19 patients: results from a retrospective cohort study
Using automated machine learning to predict the mortality of patients with COVID-19: prediction model development study
Development and validation of prognosis model of mortality risk in patients with COVID-19
Predicting mortality risk in patients with COVID-19 using machine learning to help medical decision-making
Development of a prognostic model for mortality in COVID-19 infection using machine learning
Initial chest radiographs and artificial intelligence (AI) predict clinical outcomes in COVID-19 patients: analysis of 697 Italian patients
A timeline of COVID-19 vaccine developments in 2021
Absence of excess mortality in a highly vaccinated population during the initial COVID-19 Delta period
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published
maps and institutional affiliations.
Funding: Research reported in this publication was partially supported by a training grant from the National Institute of Health (NIH), National Heart, Lung, and Blood Institute (NHLBI) (5T35HL094308-12, John Sollee). This research did not receive any other specific grant from funding agencies in the public, commercial, or not-for-profit sectors. All authors confirm that they have full access to all the data in the study and accept responsibility to submit the report for publication.
Data availability: The data are not available for public access because of patient privacy concerns but are available from the corresponding authors if there is a reasonable request and approval from the institutional review boards of the affiliated institutions. The codebase used in this study is available online (https://github.com/chengjianhong/Covid-19-CXR.git). All implementation details are described thoroughly in the Methods and Appendix sections.
Guarantor: The scientific guarantor of this publication is Harrison X.
Conflict of Interest: The authors of this manuscript declare no relationships with any companies, whose products or services may be related to the subject matter of the article. No complex statistical methods were necessary for this paper.
Informed Consent: Written informed consent was waived by the Institutional Review Board.
Ethical Approval: Institutional Review Board approval was obtained.
• Retrospective • Diagnostic or prognostic study • Multicenter study