key: cord-0993723-iljkthzz authors: Weikert, Thomas; Rapaka, Saikiran; Grbic, Sasa; Re, Thomas; Chaganti, Shikha; Winkel, David J.; Anastasopoulos, Constantin; Niemann, Tilo; Wiggli, Benedikt J.; Bremerich, Jens; Twerenbold, Raphael; Sommer, Gregor; Comaniciu, Dorin; Sauter, Alexander W. title: Prediction of Patient Management in COVID-19 Using Deep Learning-Based Fully Automated Extraction of Cardiothoracic CT Metrics and Laboratory Findings date: 2021-02-24 journal: Korean J Radiol DOI: 10.3348/kjr.2020.0994 sha: 13138f2b666637d542541dd3ce9c2c035cf69197 doc_id: 993723 cord_uid: iljkthzz OBJECTIVE: To extract pulmonary and cardiovascular metrics from chest CTs of patients with coronavirus disease 2019 (COVID-19) using a fully automated deep learning-based approach and assess their potential to predict patient management. MATERIALS AND METHODS: All initial chest CTs of patients who tested positive for severe acute respiratory syndrome coronavirus 2 at our emergency department between March 25 and April 25, 2020, were identified (n = 120). Three patient management groups were defined: group 1 (outpatient), group 2 (general ward), and group 3 (intensive care unit [ICU]). Multiple pulmonary and cardiovascular metrics were extracted from the chest CT images using deep learning. Additionally, six laboratory findings indicating inflammation and cellular damage were considered. Differences in CT metrics, laboratory findings, and demographics between the patient management groups were assessed. The potential of these parameters to predict patients' needs for intensive care (yes/no) was analyzed using logistic regression and receiver operating characteristic curves. Internal and external validity were assessed using 109 independent chest CT scans. 
RESULTS: While demographic parameters alone (sex and age) were not sufficient to predict ICU management status, both CT metrics alone (including both pulmonary and cardiovascular metrics; area under the curve [AUC] = 0.88; 95% confidence interval [CI] = 0.79–0.97) and laboratory findings alone (C-reactive protein, lactate dehydrogenase, white blood cell count, and albumin; AUC = 0.86; 95% CI = 0.77–0.94) were good classifiers. Excellent performance was achieved by a combination of demographic parameters, CT metrics, and laboratory findings (AUC = 0.91; 95% CI = 0.85–0.98). Application of a model that combined both pulmonary CT metrics and demographic parameters on a dataset from another hospital indicated its external validity (AUC = 0.77; 95% CI = 0.66–0.88). CONCLUSION: Chest CT of patients with COVID-19 contains valuable information that can be accessed using automated image analysis. These metrics are useful for the prediction of patient management.

require hospitalization or even intensive care unit (ICU) treatment. There are regional differences in the utilization of these scarce resources during a pandemic, with temporary shortages. Therefore, criteria for early prediction of patient management, especially whether ICU care is needed or not, are important. While viral testing remains the only specific method of diagnosis [5], CT plays a role in the workup of suspected pulmonary manifestations of COVID-19 and associated complications. There is growing evidence, based on (semi-)manual assessment and visual scoring of pulmonary parameters, that radiographic [6] and chest CT [7-15] features are associated with disease severity in COVID-19.
This study intends to build on these approaches and expand them in three aspects. First, it introduces a fully automated and user-independent evaluation method, which is especially relevant during a pandemic, when workloads on healthcare providers are heavy. Second, it explicitly focuses on the ultimate patient management status, defined by a patient's clinical pathway established with sufficient temporal distance. Third, it includes five cardiovascular metrics that are covered in all chest CTs, which has rarely been reported systematically. Notably, preexisting cardiovascular disease is a major risk factor for adverse outcomes in COVID-19 [16]. Laboratory findings were included to assess the value added by the CT metrics. We hypothesized that CT metrics representing pulmonary and cardiovascular diseases were associated with ultimate patient management in patients with COVID-19 and could help in predicting patient management. The goal of this study was to extract these CT metrics using a fully automated deep learning-based approach and assess their potential, alone and in combination with laboratory findings and demographic data, for the prediction of patient management.

This study was approved by the local ethics committee (Ethikkommission Nordwest- und Zentralschweiz; IRB approval number: 2020-00566). It is part of a research project registered on ClinicalTrials.gov on April 29, 2020 (Identifier: NCT04366765). All reverse-transcription polymerase chain reaction (RT-PCR) results for SARS-CoV-2 performed at the emergency department (ED) of our institution between March 25 and April 25, 2020, were downloaded from our laboratory data system (n = 6080 RT-PCR results in 5120 patients). RT-PCR for SARS-CoV-2 was performed using specimens from nasopharyngeal and oropharyngeal swabs. All patients with positive RT-PCR results for SARS-CoV-2 were identified (n = 438).
In cases with multiple RT-PCRs, the patient was rated as positive if at least one of the specimens was positive. For the 438 patients, we searched our RIS/PACS system for the chest CTs performed during the study period, which resulted in 169 chest CTs. At our institution, chest CT is the imaging standard for verifying suspected pulmonary involvement in patients with SARS-CoV-2. To ensure the independence of observations, all follow-up CTs from a given patient were excluded from the analysis (n = 49). This resulted in 120 CT scans in 120 patients. The time interval between the presentation at the ED and CT acquisition was determined. Figure 1 illustrates the search strategy.

Information on the ultimate clinical pathway of a patient was retrieved from our hospital information system 12 weeks after the completion of CT data collection (date of determination of ultimate patient management: July 20, 2020). Based on this information, the following three groups were defined: group 1 (outpatient treatment), group 2 (inpatient treatment, general ward), and group 3 (admission to ICU). Each patient was assigned to the highest category individually reached (for instance, a patient who had initially been treated in the general ward and eventually needed ICU care was assigned to group 3).

Chest CT scans were acquired in the supine position using two 128-slice scanners: SOMATOM Definition AS+ (n = 119) and SOMATOM Force (n = 1) (both Siemens Healthineers). The mean tube voltage was 105.0 kVp (standard deviation [SD]: 10.1), the mean tube current-time product was 81.1 mAs (SD: 19.2), and the pitch factor was 1.05 in all cases. Most of the scans were performed without a contrast agent (n = 99), whereas 21 CTs were performed with a mean of 71.8 mL (SD: 17.2) of contrast agent (Iopromide, Bayer AG) at an injection rate of 4 mL/s for excluding pulmonary embolism. Images reconstructed in 1-mm slice thickness using a soft-tissue reconstruction kernel served as input to the algorithms.
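The group assignment rule described above (each patient is placed in the highest care level reached) amounts to taking a maximum over an ordered scale. The following Python sketch illustrates this under stated assumptions: the names `Group`, `assign_group`, and `needs_icu` are hypothetical and are not part of the software used in this study, and the binary ICU flag anticipates the yes/no outcome used later in the analysis.

```python
from enum import IntEnum


class Group(IntEnum):
    """Ordered patient management groups as defined in the text."""
    OUTPATIENT = 1    # group 1
    GENERAL_WARD = 2  # group 2
    ICU = 3           # group 3


def assign_group(care_levels):
    """Return the highest care level a patient reached (hypothetical helper)."""
    if not care_levels:
        raise ValueError("at least one care level is required")
    return max(care_levels)


def needs_icu(group):
    """Binary outcome: 1 = ICU, 0 = no ICU."""
    return int(group == Group.ICU)


# A patient first treated in the general ward who later needed ICU care
# ends up in group 3, matching the example given in the text.
pathway = [Group.GENERAL_WARD, Group.ICU]
final_group = assign_group(pathway)
print(final_group.name, needs_icu(final_group))
```

Using an ordered enumeration keeps the "highest category reached" rule explicit and avoids ad hoc comparisons of group labels.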
For all patients, the results of six standard laboratory parameters of inflammation and tissue damage were retrieved from our laboratory system (blood sample type in parentheses): C-reactive protein (CRP; heparin plasma), lactate dehydrogenase (heparin plasma), white blood cell count (EDTA), procalcitonin (heparin plasma), albumin (heparin plasma), and D-dimers (citrate plasma). Laboratory results were obtained on the day of chest CT acquisition.

Multiple deep convolutional neural networks (DCNNs) were locally deployed on an imaging post-processing platform (Siemens Healthineers, Corporate Technology). The 1-mm series in soft kernel reconstruction served as input to an algorithm prototype based on a deep image-to-image network for lung and lung lobe segmentation and a subsequent DenseUNet for immediate segmentation of opacities. They were trained on chest CTs of n = 9549 (Deep-Image-to-Image Network) and 901 (DenseUNet) patients, completely independent of the testing dataset used in this study. DenseUNet defined all voxels with ground-glass opacity (GGO) or consolidation as positive/foreground and all other areas of the lung as negative/background. Subsequently, a Hounsfield unit (HU) threshold of -200 was applied to the prediction mask for differentiating GGO from consolidations. Table 1 provides details of all the metrics. Further technical details and the high diagnostic performance of the algorithms have been reported previously [17].

The non-electrocardiogram-gated 1-mm series served as the only input to a DCNN based on the U-Net architecture for segmentation of the thoracic aorta and the total pericardial volume (TPV). TPV segmentation, which includes the heart and pericardial structures, such as fat and (if present) pericardial effusion, was used to identify candidate coronary calcification voxels by applying a threshold of > 130 HU.
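The two HU thresholding steps described above can be sketched in a few lines of Python. This is an illustration only, not the deployed algorithm: the function names and the toy voxel values are hypothetical, images are represented as flat lists for simplicity, and assigning voxels at exactly -200 HU to the consolidation class is an assumption, since the text states the threshold but not how the boundary value is handled.

```python
GGO_CONSOLIDATION_HU = -200  # threshold applied to the opacity mask
CALCIUM_CANDIDATE_HU = 130   # candidate calcification voxels exceed this value


def split_opacities(hu_values, opacity_mask):
    """Count GGO and consolidation voxels inside a predicted opacity mask.

    Consolidation is denser than GGO, so voxels at or above the threshold
    are counted as consolidation (boundary handling is an assumption).
    """
    ggo = consolidation = 0
    for hu, is_opacity in zip(hu_values, opacity_mask):
        if not is_opacity:
            continue
        if hu >= GGO_CONSOLIDATION_HU:
            consolidation += 1
        else:
            ggo += 1
    return ggo, consolidation


def calcium_candidates(hu_values, tpv_mask):
    """Return indices of voxels inside the TPV mask that exceed 130 HU."""
    return [
        i
        for i, (hu, inside) in enumerate(zip(hu_values, tpv_mask))
        if inside and hu > CALCIUM_CANDIDATE_HU
    ]


# Toy example with seven voxels (hypothetical values).
hu = [-850, -600, -350, -150, 40, 200, 450]
opacity_mask = [0, 1, 1, 1, 1, 0, 0]
tpv_mask = [0, 0, 0, 0, 1, 1, 1]

print(split_opacities(hu, opacity_mask))  # (GGO voxels, consolidation voxels)
print(calcium_candidates(hu, tpv_mask))   # candidate voxel indices
```

In the pipeline described in the text, such candidates are only a first pass; a separate model then decides which of them are true coronary calcifications.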
The calcium detection model based on ResNet subsequently predicts the true coronary calcifications. The diameters of the aorta were computed at key anatomical landmarks. The cardiovascular algorithms were trained using 3550 CT scans. Detailed information has been provided elsewhere [18, 19]. The quantification of coronary calcifications (QCC) could only be calculated for series without contrast (n = 99/120). Table 1 lists all the cardiovascular metrics analyzed in this study.

Categorical variables were expressed as counts and percentages. For continuous variables, means with corresponding SDs are provided as measures of variability. For comparing the differences between groups, one-way analyses of variance for normally distributed continuous variables, Kruskal-Wallis H tests for non-normally distributed continuous variables, and chi-square tests for categorical variables were performed. The statistical analysis comprised the following three steps:

Step 1: A series of univariable analyses with appropriate post hoc tests to assess the association of CT metrics, laboratory findings, and patient characteristics with patient management. Studies with contrast were excluded during the analysis of QCC, as this measure was only calculated on non-contrast series.

Step 2: Binary logistic regression analyses with receiver operating characteristic curves to assess the potential of CT metrics, laboratory findings, and patient demographics, as well as their combinations, to distinguish patients who needed ICU care from those who did not. The ICU status of a patient (0 = no ICU; 1 = ICU) served as the dependent variable. CT metrics, laboratory findings, and patient demographics (age and sex) served as independent variables. The inclusion criterion for CT metrics and laboratory findings was a p value ≤ 0.05 in the subgroup comparison between groups 1 and 3 (outpatient vs. ICU) or groups 2 and 3 (general ward vs. ICU) in Step 1 of the analysis. Furthermore, the parameters had to be available for all patients.
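The classification setup described above (binary ICU status as the dependent variable, selected predictors as independent variables, and the AUC of the predicted probabilities as the performance measure) can be illustrated with a small, dependency-free Python sketch. It is not the statistical software used in this study: the gradient-descent fit, the function names, and the toy data are all assumptions made for illustration, and confidence intervals and the DeLong comparison are omitted.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit an unregularized binary logistic regression by batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical standardized predictors (e.g., one CT metric and one laboratory
# value per patient) and ICU status (0 = no ICU; 1 = ICU).
X = [[-1.2, -0.8], [-0.5, -1.0], [-0.3, 0.2], [0.1, -0.4], [0.4, 0.9],
     [0.9, 0.3], [1.3, 1.1], [0.2, 0.6], [1.0, 0.8]]
y = [0, 0, 0, 0, 1, 1, 1, 0, 0]

w, b = fit_logistic(X, y)
probs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]
model_auc = auc(y, probs)
print(f"AUC on toy data: {model_auc:.2f}")
```

Because the AUC depends only on the ranking of the predicted probabilities, the same `auc` function can be applied to any of the model variants once their probabilities are available.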
This resulted in five models (Table 4). For all approaches, the area under the curve (AUC) with 95% confidence interval (CI) was calculated using prediction probabilities obtained from the binary logistic regression analyses. Furthermore, we analyzed the differences in the AUCs between the models according to the method proposed by DeLong et al. [20].

The main analysis dataset of this study included 120 patients with a mean age of 60.8 years (SD: 17.5; range: 18-92 years; 47 [39.2%] females). Table 2 summarizes the patient characteristics of the three patient management groups. All the datasets were successfully processed using the algorithm. The mean time interval between presentation at ED and CT acquisition was 0.98 days (SD: 2.32 days). Table 3 summarizes the results of the univariable analyses of CT metrics and laboratory findings.

Pulmonary CT Metrics

PO, PHO, LSS, and LHOS increased continuously from group 1 to group 3, while lung volume and %LowHU decreased from group 1 to group 3. All these differences were statistically significant. Post hoc testing revealed that differences in PO, PHO, LSS, and LHOS were statistically significant at p values < 0.01 between all three subgroups. Regarding lung volume and %LowHU, the differences were statistically significant for the comparisons of groups 1 vs. 3 and 2 vs. 3. Figure 2 displays an image example for each group.

Cardiovascular CT Metrics

TPV and D_AAsc differed significantly among the three groups. Post hoc analysis revealed that differences in both TPV and D_AAsc were statistically significant only for the comparison of groups 1 and 3 (TPV: p = 0.041; D_AAsc: p = 0.033). QCC, D_Arch, and D_ADsc did not differ significantly. Figure 3 illustrates the outputs of the cardiovascular algorithms.

Laboratory analysis for D-dimers and procalcitonin was not performed in some cases (11/120 and 36/120, respectively). All laboratory parameters differed significantly among the three groups.
CRP levels increased steadily from groups 1 to 3, while albumin decreased. Post hoc differences were statistically significant for all group comparisons (CRP and albumin), for groups 1 vs. 3 and 2 vs. 3 (lactate dehydrogenase and procalcitonin), for groups 2 vs. 3 only (white blood cell count), and for groups 1 vs. 3 only (D-dimer).

Table 4 specifies metrics and parameters included in the five multivariable models for classification of ICU status (yes/no) according to the criteria mentioned in the methods section. The best performing model was the PCLD model combining CT-derived, laboratory, and demographic parameters (AUC = 0.91). Demographic parameters alone could not distinguish ICU patients from non-ICU patients (AUC = 0.55). CT-derived metrics (including both pulmonary and cardiovascular metrics) alone, laboratory metrics alone, and pulmonary CT metrics combined with demographic parameters were all good classifiers with AUCs ≥ 0.84. Table 5 provides detailed information. The AUC of the D model differed significantly from that of all other models (p < 0.001). The difference in the AUCs of models with CT-derived parameters alone vs. laboratory parameters alone was not statistically significant (p = 0.462). Figure 4 displays the receiver operating characteristic curves of the PCLD, PC, and L models. As D-dimers and procalcitonin were not available in all cases, these two parameters were excluded from the analysis.

The internal validation comprising 16 new cases of patients with positive RT-PCR results for SARS-CoV-2 resulted in a sensitivity of 80.0% (4 of 5 patients admitted to ICU correctly classified) and a specificity of 81.8% (9 of 11 patients not admitted to ICU correctly classified) for the PCLD model. Table 5 presents the results. In general, the performance measures on the internal validation dataset were slightly worse than those on the main analysis dataset but still acceptable.
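The sensitivity and specificity reported above for the internal validation follow directly from the stated counts (4 of 5 ICU patients and 9 of 11 non-ICU patients correctly classified). As a worked check in Python, with a hypothetical helper name:

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Counts taken from the internal validation described in the text:
# 4 of 5 ICU patients and 9 of 11 non-ICU patients were classified correctly.
sens, spec = sensitivity_specificity(tp=4, fn=1, tn=9, fp=2)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# sensitivity = 80.0%, specificity = 81.8%
```

With only 16 cases, a single reclassified patient shifts the sensitivity by 20 percentage points, which is why these estimates carry wide uncertainty.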
The mean age of the internal validation dataset was 63.0 years (SD: 16.0) and not statistically significantly different from the dataset used for the main analysis (p = 0.635). We also found evidence for external validity using the PD model (see Supplementary Materials for detailed information). Table 6 demonstrates that demographic information did not differ significantly among the three datasets.

This study demonstrated that it is feasible to automatically extract pulmonary and cardiovascular metrics from chest CT scans of patients with RT-PCR-confirmed COVID-19 that are useful for the prediction of patient management. Multiple CT metrics continuously and significantly increased or decreased with intensified patient management. The same was true for laboratory parameters reflecting inflammation and cell damage. The best prediction regarding ICU status was achieved by combining CT metrics, laboratory findings, and demographic information, while the latter alone could not differentiate the two classes. The CT metrics and laboratory findings were good classifiers on their own. Although internal and external validation demonstrated marginally inferior performance, it was still good.

Our results regarding the relevance of pulmonary CT metrics in COVID-19 and their association with patient management are in line with previous studies and expected, as they reflect pathologic changes, specifically inflammatory GGO and consolidations. Li et al. [8] reported an increasing extent of inflammatory pulmonary lesions from light to common to severe/critical clinical manifestations. Sun et al. [7] and Tan et al. [21] confirmed that quantitative CT parameters strongly correlate with laboratory inflammation markers. Lyu et al. [22] showed that the number of lung segments and lobes affected by consolidations increased with case severity, which is in line with the increase in LSS and LHOS with higher admission status. Similarly, Liu et al.
[10] reported an association between a higher lung severity score and extended hospitalization time. A significant number of additional studies have successfully applied lung volume assessment with or without a combination of clinical and laboratory tests for predicting disease severity, treatment intensity, outcome, and mortality [7, 10, 15, 23-26]. Notably, the analyses in these studies required substantial manual interaction and visual assessment. However, in a pandemic with limited human resources, fully automated approaches are preferred. In this respect, Huang et al. [27] applied CT-derived opacification measures using deep learning to stratify four clinical subtypes according to their baseline clinical, laboratory, and CT findings. They provided further evidence of CT as an important tool for risk stratification in patients with COVID-19 and reported percentages of lung areas with opacities ranging from 0% (mild disease) to 49.6% (critical disease), which is in line with the results of the analysis at hand. However, the radiological findings used to predict the outcomes were at the same time part of the outcome definition criteria of that study [28].

As previously shown, preexisting cardiovascular disease is a risk factor for adverse outcomes in COVID-19 [24], and COVID-19 simultaneously affects the cardiovascular system [29]. However, the aforementioned approaches did not include quantitative measurements of cardiovascular CT metrics. This study included cardiovascular metrics, such as TPV, as an estimate of heart size. Indeed, a higher TPV was associated with a higher risk of intensified patient management. As age and sex did not differ significantly between groups, the differences were probably caused by increased heart size or an increased amount of pericardial fat. The AUCs of the models considering CT-derived metrics only vs. laboratory parameters only were both high and did not differ statistically significantly.
This is probably due to the fact that both reflect inflammation of the lungs and are highly correlated. Internal validation indicated good internal generalizability, as did the external data for the PD model. This study had several limitations. First, the internal validation dataset was small, resulting in wide CIs; therefore, the results should be interpreted cautiously. However, high standardization of chest CT and the fact that other studies on the topic reported similar effect sizes provide confidence that the results are generalizable. Second, while the main investigator site had access to the algorithms with all pulmonary and cardiovascular metrics, the remote site had access to pulmonary metrics only. Third, this study included only patients with COVID-19 who underwent a CT scan, the diagnostic standard for patients with suspected pulmonary manifestations of COVID-19 at our center. Therefore, the presented approach might be less relevant in medical centers that rarely perform chest CT. Fourth, other features, such as the initial severity of symptoms, might be useful to classify patient management. Besides the focus on automatically retrieved CT metrics, this study also considered demographic parameters and laboratory findings. To conclude, this study provides evidence that chest CT of patients with COVID-19 contains valuable information for the prediction of ultimate patient management. Furthermore, this information is accessible using a deep learning-based fully automated image analysis workflow, which is especially helpful during the COVID-19 pandemic. The Data Supplement is available with this article at https://doi.org/10.3348/kjr.2020.0994. Saikiran Rapaka, Sasa Grbic, Thomas Re, Shikha Chaganti and Dorin Comaniciu are employees of Siemens Healthineers and provided the pulmonary and cardiovascular algorithms, but had no influence on data analysis and final results. All other authors declare no conflict of interest. 
1. COVID-19 Map - Johns Hopkins Coronavirus
2. Epidemiology Working Group for NCIP Epidemic Response, Chinese Center for Disease Control and Prevention. The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China
3. Estimates of the severity of coronavirus disease 2019: a model-based analysis
4. HLH Across Speciality Collaboration, UK. COVID-19: consider cytokine storm syndromes and immunosuppression
5. ACR recommendations for the use of chest radiography and computed tomography (CT) for suspected COVID-19 infection. Acr.org Web site
6. Implementation of a deep learning-based computer-aided detection system for the interpretation of chest radiographs in patients suspected for COVID-19
7. CT quantitative analysis and its relationship with clinical features for assessing the severity of patients with COVID-19
8. CT image visual quantitative evaluation and clinical classification of coronavirus disease (COVID-19)
9. CT features of SARS-CoV-2 pneumonia according to clinical presentation: a retrospective analysis of 120 consecutive patients from Wuhan city
10. Association between initial chest CT or clinical features and clinical course in patients with coronavirus disease 2019 pneumonia
11. CT manifestations and clinical characteristics of 1115 patients with coronavirus disease 2019 (COVID-19): a systematic review and meta-analysis
12. Coronavirus disease (COVID-19): spectrum of CT findings and temporal progression of the disease
13. Development and validation of a prognostic nomogram based on clinical and CT features for adverse outcome prediction in patients with COVID-19
14. Assessment of the severity of coronavirus disease
15. Prognostic implication of volumetric quantitative CT analysis in patients with COVID-19: a multicenter study in Daegu
16. Cardiovascular considerations for patients, health care workers, and health systems during the COVID-19 pandemic
17. Automated quantification of CT patterns associated with COVID-19 from chest CT
18. Evaluation of a deep learning based aortic diameter quantification system against multi-reader consensus measurement
19. Evaluation of a deep learning-based automated CT coronary artery calcium scoring algorithm
20. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach
21. C-reactive protein correlates with computed tomographic findings and predicts severe COVID-19 early
22. The performance of chest CT in evaluating the clinical severity of COVID-19 pneumonia: identifying critical cases based on CT characteristics
23. Chest CT score in COVID-19 patients: correlation with disease severity and short-term prognosis
24. Prevalence and impact of cardiovascular metabolic diseases on COVID-19 in China
25. CT lung lesions as predictors of early death or ICU admission in COVID-19 patients
26. Automatic localization of anatomical landmarks in cardiac MR perfusion using random forests
27. Serial quantitative chest CT assessment of COVID-19: a deep learning approach
28. National Health Commission & National Administration of Traditional Chinese Medicine. Diagnosis and treatment protocol for novel coronavirus pneumonia
29. COVID-19 and the cardiovascular system

We want to thank Ullaskrishnan Poikavilla from Siemens Healthineers, USA, for installing the algorithm prototype at our medical center. Additionally, we appreciate the great support of our research team, namely Rita Achermann, Ivan Nesic, Joshy Cyriac, and Bram Stieltjes.