key: cord-0993210-zu1fxf14 authors: Huang, Lu; Han, Rui; Ai, Tao; Yu, Pengxin; Kang, Han; Tao, Qian; Xia, Liming title: Serial Quantitative Chest CT Assessment of COVID-19: Deep-Learning Approach date: 2020-03-30 journal: Radiol Cardiothorac Imaging DOI: 10.1148/ryct.2020200075 sha: 18355614b3e03a911575a3b58bf0fbf99f32fcaf doc_id: 993210 cord_uid: zu1fxf14
PURPOSE: To quantitatively evaluate changes in lung burden in patients with COVID-19 on serial CT scans using an automated deep learning method.
MATERIALS AND METHODS: Patients with COVID-19 who underwent chest CT between 1 January 2020 and 3 February 2020 were retrospectively evaluated. Patients were divided into mild, moderate, severe, and critical types according to their baseline clinical, laboratory, and CT findings. The CT lung opacification percentage of the whole lung and of the five lobes was automatically quantified by commercial deep learning software and compared across follow-up CT scans. Longitudinal changes in this quantitative CT parameter were also compared among the four clinical types.
RESULTS: A total of 126 patients with COVID-19 (age 52 years ± 15 years, 53.2% males) were evaluated, including 6 mild, 94 moderate, 20 severe, and 6 critical cases. The CT-derived opacification percentage differed significantly among clinical groups at baseline, increasing progressively from mild to critical type (all P < 0.01). Overall, the whole-lung opacification percentage increased significantly between the baseline CT and the 1st follow-up CT (median [interquartile range]: 3.6% [0.5%, 12.1%] vs 8.7% [2.7%, 21.2%], P < 0.01). No significant progression of the opacification percentage was noted between the 1st and 2nd follow-up CT (8.7% [2.7%, 21.2%] vs 6.0% [1.9%, 24.3%], P = 0.655).
CONCLUSION: The quantification of lung opacification in COVID-19 measured on chest CT by a commercially available deep-learning-based tool was significantly different among clinical severity groups.
This approach could potentially eliminate the subjectivity in the initial assessment and follow-up of pulmonary findings in COVID-19.
Introduction
SARS-CoV-2 is a novel coronavirus initially identified in Wuhan, China, which causes a respiratory pandemic disease named Coronavirus Disease 2019 (COVID-19) (1, 2). Chest CT has played a pivotal diagnostic role in the assessment of patients with COVID-19 in China (3). Recent studies reported that the possible pathological mechanism in COVID-19 is diffuse alveolar damage and inflammatory exudation, which is similar to the histologic findings seen in SARS-CoV pneumonia (1, 4). The pathological evolution during the course of infection in COVID-19 has not been clarified, and how such changes differ among patients with different clinical severities is largely unknown. Chest CT, especially high-resolution CT (HRCT), can detect small areas of ground glass opacity (GGO) (5) and is therefore a promising imaging tool for monitoring the disease, provided that the radiation dose is balanced to comply with ALARA principles.
It is common practice for radiologists to evaluate pneumonia severity qualitatively or semi-quantitatively by visual scoring (6). Visual evaluation of changes between two CT scans is subjective, and its validity may depend on the radiologist's experience. Quantitative analysis of CT scans using artificial intelligence (AI) tools, in particular deep learning, could provide an automatic and objective estimation of the disease burden, facilitating and expediting imaging interpretation during the COVID-19 pandemic (7).
The purpose of the present study was to assess a quantitative CT image parameter, defined as the percentage of lung opacification (QCT-PLO), calculated automatically using a deep learning tool. We evaluated QCT-PLO in COVID-19 patients at baseline and on follow-up scans, focusing on cross-sectional and longitudinal differences among patients with different degrees of clinical severity.
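To make the QCT-PLO parameter concrete, the following is a minimal sketch of how a percentage of lung opacification could be derived from segmentation output. It is not the commercial software's method, which the text does not describe. It assumes that the tool yields a per-voxel lobe label and a per-voxel opacity flag, that voxels are of uniform size (so voxel counts stand in for volumes), and that the percentage is opacified lung volume divided by lung volume. The function name, lobe numbering, and input layout are hypothetical.

```python
from typing import Dict, Sequence

# Hypothetical lobe numbering; 0 is assumed to mean "outside the lung".
LOBES = {
    1: "right_upper",
    2: "right_middle",
    3: "right_lower",
    4: "left_upper",
    5: "left_lower",
}


def qct_plo(lung_labels: Sequence[int], opacity_mask: Sequence[int]) -> Dict[str, float]:
    """Return the opacification percentage for each lobe and the whole lung.

    lung_labels:  flattened per-voxel lobe label (0 = outside lung, 1..5 = lobe).
    opacity_mask: flattened per-voxel flag (1 = opacified, 0 = not opacified).
    """
    if len(lung_labels) != len(opacity_mask):
        raise ValueError("label and opacity volumes must have the same number of voxels")

    lung_voxels = {lobe: 0 for lobe in LOBES}
    opaque_voxels = {lobe: 0 for lobe in LOBES}
    for label, opaque in zip(lung_labels, opacity_mask):
        if label in lung_voxels:  # voxels outside the lung are ignored
            lung_voxels[label] += 1
            if opaque:
                opaque_voxels[label] += 1

    result = {}
    for lobe, name in LOBES.items():
        total = lung_voxels[lobe]
        result[name] = 100.0 * opaque_voxels[lobe] / total if total else 0.0

    whole = sum(lung_voxels.values())
    result["whole_lung"] = 100.0 * sum(opaque_voxels.values()) / whole if whole else 0.0
    return result


if __name__ == "__main__":
    # Toy volume of 10 voxels, purely illustrative (not study data).
    labels = [0, 1, 1, 2, 2, 3, 3, 4, 5, 5]
    opacity = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
    for region, pct in qct_plo(labels, opacity).items():
        print(f"{region}: {pct:.1f}%")
```

Under these assumptions the whole-lung value is pooled over all lung voxels rather than averaged over lobes, so larger lobes contribute proportionally more.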
The local ethical review board approved this retrospective study and waived the requirement to obtain individual informed consent. Patients with COVID-19 who underwent chest CT in our department from 1 January to 3 February 2020 were retrospectively evaluated.
Non-contrast-enhanced chest CT examinations were performed with three CT scanners (United Imaging uCT, United Imaging Healthcare, Shanghai, China; GE Optima 660, GE Healthcare, USA; Siemens SOMATOM Definition AS+, Siemens Healthineers, Germany). The patients were scanned in the supine position during inspiratory breath-hold. The scanning range was from the apex to the base of the lungs. Scanning parameters were as follows: tube voltage 80-120 kV, tube current 50-350 mAs, pitch 0.99-1.22 mm, matrix 512×512, slice thickness 10 mm, field of view 350 mm × 350 mm. Reconstruction was performed with a slice thickness of 0.625-1.250 mm, a lung window with a width of 1200 HU and a level of -600 HU, and a mediastinal window with a width of 350 HU and a level of 40 HU.
Quantitative analysis of lung opacification was performed by a deep-learning algorithm. All segmentation results derived from this algorithm were visually evaluated by two radiologists (one with 7 years of experience in cardiopulmonary imaging and another with 8 years of experience in pulmonary imaging), who viewed the segmentations independently. Both radiologists were blinded to the patient's clinical status. The scoring procedure was as follows: both radiologists reviewed the segmentation results, displayed as regions of interest overlaid on the CT images, slice by slice. The readers did not adjust the automatic segmentation. The readers used scoring criteria based on how well the segmentation matched the actual lung opacification. Specifically, the degree of matching was quantified using a Likert score from 0 to 5. The scoring criteria are described in detail in Appendix E3.
To reduce the subjectivity of the radiologists' evaluation, the final score for each scan was the average of the two readers' scores. A final score ≥ 3 was considered sufficient to meet the requirement for quantitative analysis.
Statistical analysis was performed using SPSS software (version 23.0, IBM, Armonk, NY, USA). Categorical variables were expressed as counts (percentages), and continuous variables as mean ± SD or median (interquartile range). Normality of distribution was tested using the Kolmogorov-Smirnov test. Differences between two paired groups were assessed by the paired t-test or the Wilcoxon test. Comparisons among the different clinical types were performed by analysis of variance (ANOVA) or the Kruskal-Wallis test. Comparisons between any two clinical types were performed by the t-test or Mann-Whitney U test for continuous variables, or the χ2 test for categorical variables. Low-frequency variables were compared with the Fisher exact test. A two-sided P < 0.05 was considered statistically significant.
One hundred and forty-eight patients with COVID-19 were initially enrolled, with 9 (6.1%) patients excluded due to respiratory motion artifacts and 13 (8.7%) excluded due to insufficient segmentation quality as determined by the scoring from the two radiologists (i.e., mean score < 3). Finally, a total of 126 patients (mean age, 52 years ± 15 years; age range, 14-86 years; 53.2% males) with COVID-19 were included. Baseline characteristics of the COVID-19 patients are summarized in Table 1. All patients were classified into four clinical types, including 6 mild cases (4.8%), 94 moderate cases (74.6%), 20 severe cases (15.8%), and 6 critical cases (4.8%). The median interval between baseline and the 1st follow-up was 4 days (interquartile range 3-6 days), and the median interval between the 1st and 2nd follow-up was 5 days (interquartile range 3-7 days).
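The segmentation quality gate and the cohort accounting described above can be sketched as follows. This is an illustrative reading of the stated rule (final score is the mean of two readers' Likert scores on a 0 to 5 scale, and a final score of at least 3 is sufficient), not code used in the study; all function names are hypothetical.

```python
SUFFICIENT_SCORE = 3.0  # threshold stated in the text: final score >= 3


def final_score(reader_a: float, reader_b: float) -> float:
    """Average the two readers' Likert scores (each expected in 0..5)."""
    for score in (reader_a, reader_b):
        if not 0 <= score <= 5:
            raise ValueError("Likert scores must lie between 0 and 5")
    return (reader_a + reader_b) / 2.0


def passes_quality_gate(reader_a: float, reader_b: float) -> bool:
    """True when the averaged score meets the sufficiency threshold."""
    return final_score(reader_a, reader_b) >= SUFFICIENT_SCORE


def cohort_size(enrolled: int, motion_excluded: int, quality_excluded: int) -> int:
    """Patients remaining after both exclusion steps."""
    return enrolled - motion_excluded - quality_excluded


if __name__ == "__main__":
    print(final_score(3, 4))           # mean of the two reader scores
    print(passes_quality_gate(2, 3))   # mean 2.5 falls below the threshold
    print(cohort_size(148, 9, 13))     # enrolment figures reported in the text
```

With the reported figures (148 enrolled, 9 excluded for motion artifacts, 13 excluded for a mean score below 3), the accounting yields the 126 patients analysed.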
Age and gender did not differ significantly among the different clinical types of COVID-19 (P > 0.05). The duration between symptom onset and initial CT scanning was shorter in mild and moderate type patients than in severe and critical type patients (all P < 0.01). In 117 of 126 patients (92.9%), fever was the initial symptom, while dyspnea was observed only in the severe and critical types. Of the laboratory findings, WBC count, lymphocyte count, high-sensitivity C-reactive protein (hs-CRP), and pulse oxygen saturation (SpO2) showed significant differences among the four clinical types (all P < 0.05). Compared with critical type patients, WBC count and hs-CRP were significantly lower in moderate type cases (both P < 0.001), but lymphocyte count was higher in the moderate type (P = 0.004).
All 126 patients had two CT scans as per the inclusion criteria, and 48 of 126 (38.1%) patients had three CT scans. Of all 300 CT scans, 236 (78.6%) had a segmentation quality score in the range of 3 to 4, and 64 (21.4%) had a score in the range of 4 to 5. Table 2.
Differences in whole-lung QCT-PLO according to clinical severity subtype and days since onset of symptoms at the baseline CT are shown in Figure 2b. Significant differences in QCT-PLO were found among the four clinical types at baseline and at the 1st follow-up (all P < 0.05, Table 3). All 6 mild COVID-19 patients had negative CT findings at baseline and were found positive at the 1st follow-up CT scan (Figure 3). QCT-PLO of the right and left lower lobes was elevated at the 2nd follow-up CT scan (both P < 0.05, Table E1). Compared with the baseline CT scan, whole-lung and per-lobe QCT-PLO increased significantly in moderate type patients (all P < 0.05, supplement 3; Figure 4), while no remarkable difference was found between the 1st and 2nd follow-up scans (all P > 0.05, Table E2).
In severe and critical type patients, the whole-lung and per-lobe QCT-PLO showed no significant differences between baseline, 1st, or 2nd follow-up CTs (Figures 5 and 6, respectively).
In this study, we evaluated the longitudinal changes of pneumonia severity in different clinical types of COVID-19 at baseline and follow-up imaging using a quantitative image parameter (QCT-PLO), which was automatically generated by a deep-learning tool from chest CT scans. Our major findings were that QCT-PLO differed significantly among the clinical types at baseline, and that whole-lung QCT-PLO increased significantly from baseline to the 1st follow-up (median interval 4 days), while no remarkable progression was found at the 2nd follow-up (median interval 5 days).
Mild and moderate COVID-19 patients had a shorter duration between symptom onset and initial CT scan, which indicates that these patients could have presented at a relatively early stage of disease. This was supported by the lower whole-lung and per-lobe QCT-PLO at baseline CT. SpO2 was less than 90% in all severe and critical type patients, and more than half had dyspnea, which accords with the higher lung opacification percentage assessed by the deep-learning tool. According to prior studies (9, 10), severe and critical type patients had multiple GGO with consolidation, which can lead to ventilatory dysfunction and even respiratory failure. Moreover, hs-CRP was significantly elevated in severe and critical type patients, which indicates an inflammatory type of response.
We observed in our data that whole-lung and per-lobe QCT-PLO were higher at the 1st follow-up than at baseline, suggesting a sustained progression of imaging findings from presentation, plateauing on the 2nd follow-up CT. Such a pattern could be attributed to many factors, including characteristics of our cohort, clinical severity at admission, treatment effect, and the natural history of the disease. Depending on the initial clinical type and time of scan, patients could present at any of the stages described here.
A combined analysis of our quantitative results suggests that pulmonary involvement in COVID-19 ramps up after the beginning of symptoms, peaking at 13 days, which is in keeping with a prior observation (11).
There are several limitations of the present study. First, not all patients had a series of three CT scans; therefore, we could not systematically evaluate the changes for all patients at the 1st and 2nd follow-up. Second, there was no systematic confirmation that the pulmonary opacities were directly caused by the pathological effects of the coronavirus. Last, although the commercial software can quantitatively evaluate the lung opacification percentage, the current version still needs radiologists' supervision. Notably, 8.7% (13/148) of the cases had insufficient segmentation quality to ensure appropriate quantification.
In conclusion, the pulmonary involvement of COVID-19 could be objectively assessed by deep-learning-based quantitative CT. This automated tool may be used for quantifying the disease burden and monitoring disease progression or response to treatment.
Radiologists reviewed the segmentation results overlaid on each CT image. The scoring criteria were based on the agreement between the results of the automatic segmentation task and the actual lung opacities. The degree of matching was described by a Likert score from 0 to 5 (Figure E2). A score of 0 was assigned in two cases (Figure E2a). When the segmentation results of most slices in a scan meet this situation, the scan score is 5.
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Diagnosis and Treatment Protocol of Novel Coronavirus (trial version 5th). National Health Commission of the People's Republic of China website
Pathological findings of COVID-19 associated with acute respiratory distress syndrome
Guidelines for management of incidental pulmonary nodules detected on CT images: From the Fleischner Society
Science and technology anti-epidemic: inferVISION company first launched pneumonia AI system into clinical use. cn-healthcare website
U-Net: Convolutional Networks for Biomedical Image Segmentation. Medical Image Computing and Computer-Assisted Intervention - MICCAI 2015
A correlation study of CT and clinical features of different clinical types of 2019 novel coronavirus pneumonia
Emerging Coronavirus 2019-nCoV Pneumonia
Time Course of Lung Changes On Chest CT During Recovery From 2019
A Coefficient of Agreement for Nominal Scales