key: cord-1046285-og1bzxtg
authors: Li, Shu; Liu, Shaoyu; Wang, Ben; Li, Qiuyu; Zhang, Hua; Zeng, Lin; Ge, Hongxia; Ma, Qingbian; Shen, Ning
title: Predictive value of chest CT scoring in COVID-19 patients in Wuhan, China: A retrospective cohort study
date: 2020-11-28
journal: Respir Med
DOI: 10.1016/j.rmed.2020.106271
sha: 87bff8c3f98373c3e31095566859639335861992
doc_id: 1046285
cord_uid: og1bzxtg

BACKGROUND: Computed tomography (CT) findings in COVID-19 patients have been described in case series and descriptive studies, but quantitative analysis performed by clinical doctors and studies of its predictive value have rarely been reported. The aim of this study was to analyze CT scores in COVID-19 patients and explore their predictive value.

MATERIALS AND METHODS: We conducted a retrospective cohort study among confirmed COVID-19 patients with available CT images between February 8, 2020 and March 7, 2020. The lung was divided into six zones by the level of the tracheal carina and the level of the inferior pulmonary vein bilaterally on CT. Ground-glass opacity (GGO), consolidation, crazy-paving pattern and overall lung involvement were rated on a Likert scale of 0–4 or as binary (0 or 1). The global severity score for each targeted pattern was calculated as the total score of the six zones.

RESULTS: The study included 53 patients and 137 CT scans. Eighteen (34%) patients were classified as moderate cases and 35 (66%) as severe/critical cases. Severe/critical patients had higher CT scores than moderate patients for several types of abnormalities from the second week to the fourth week after symptom onset. The overall lung involvement score in the second week demonstrated predictive value for severity, with a sensitivity of 81.0% and a specificity of 69.2%.

CONCLUSIONS: Our modified semi-quantitative CT scoring system for COVID-19 patients demonstrated feasibility.
The overall lung involvement score in the second week had predictive value for clinical severity and could serve as an indicator for further treatment.

By late April 2020, more than 3,000,000 patients had been diagnosed with COVID-19 globally since the outbreak of the disease at the end of December 2019. It was reported that a single country could record more than 20,000 newly diagnosed cases or more than 900 deaths in a single day, and more than 10 countries worldwide had a mortality rate exceeding 10% [1]. One of the existing barriers in COVID-19 treatment is detecting patients who present with stable vital signs as mild/moderate cases on arrival but later suffer dramatic exacerbation or even a fatal outcome. It is of great importance to identify insidious onset and to implement intervention and resource allocation at an early stage to reduce the overall mortality rate. Chest computed tomography (CT) is considered a gold standard and an important diagnostic tool for lung diseases. CT has been widely used for diagnosis and severity evaluation in clinical practice during the pandemic. Quite a few case series and descriptive studies have reported CT manifestations and their evolution in COVID-19 patients. However, quantitative and subgroup analyses have not yet been reported. Furthermore, almost all the studies

The lung was divided into six zones (upper, middle, and lower on both sides) by the level of the tracheal carina and the level of the inferior pulmonary vein bilaterally on CT, using a modified scoring system [3, 4]. The observers identified ground-glass opacity (GGO), consolidation and crazy-paving pattern following Fleischner Society definitions [5, 6]. Bronchiectasis, cavity, pleural effusion, etc., were not included in CT reading and analysis because of their low incidence [7]. The reviewers evaluated the extent of the targeted patterns and of the overall affected lung parenchyma for each zone, using a Likert scale (0 = absent; 1 = 1-25%; 2 = 26-50%; 3 = 51-75%; 4 = 76-100%).
Thus, the GGO score, consolidation score, and overall lung involvement score were each the sum over the six zones, ranging from 0 to 24. The crazy-paving pattern was coded only as absent or present (0 or 1) for each zone, so its score ranged from 0 to 6 (See Figure 1). The existence of a certain abnormality, bilateral involvement or multi-zone involvement was recorded as present for a patient if it was ever observed in any of his/her CT scans. Multi-zone involvement was defined as ≥ 3 zones to avoid double counting from adjacent zones. A test set of 50 CT images was reviewed independently by two radiologists with over 5 years' experience (YQZ and GG) and two ED attending physicians with approximately 10 years' experience (SL and SYL). Then, after optimal inter-rater reliability was reached between radiologists and clinicians, all CT images were reviewed by the two ED attending physicians. A second round of image reading was performed to resolve discrepancies. If consensus was not reached, the final score was the average of the two reviewers' scores.

The hypothesis of this study was that the CT score can effectively distinguish severe/critical patients from moderate cases, with an area under the receiver operating characteristic (ROC) curve greater than 0.5, α = 0.05 (one-sided) and β = 0.1. The ratio between groups was set as 1:1. At least 16 patients were to be included in each group. Continuous variables were summarized as median with interquartile range (IQR) and categorical variables as counts and frequencies. Continuous data were compared with the Mann-Whitney U test, and categorical data with the chi-square test or Fisher's exact test. Inter-rater reliability among physicians was measured by the intraclass correlation coefficient (ICC). Values less than 0.5, between 0.5 and 0.75, between 0.75 and 0.9, and greater than 0.90 indicate poor, moderate, good, and excellent reliability, respectively [8].
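The zone-based scoring described above can be illustrated with a minimal Python sketch. The function names, the zone ordering, and the example readings are hypothetical and are not taken from the study; the treatment of any non-zero extent up to 25% as grade 1 is likewise an assumption, since the scale is stated only in whole percentages.

```python
from typing import List, Sequence

# Six zones: upper, middle and lower on each side.
# The ordering (right side first) is an assumption of this sketch.
ZONES = ("R-upper", "R-middle", "R-lower", "L-upper", "L-middle", "L-lower")


def likert_grade(percent_involved: float) -> int:
    """Map the involved percentage of one zone to the 0-4 Likert grade."""
    if not 0 <= percent_involved <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_involved == 0:
        return 0
    # Assumption: any non-zero extent up to 25% counts as grade 1.
    if percent_involved <= 25:
        return 1
    if percent_involved <= 50:
        return 2
    if percent_involved <= 75:
        return 3
    return 4


def global_score(zone_grades: Sequence[int], max_per_zone: int = 4) -> int:
    """Sum the six zone grades (0-24 for Likert patterns, 0-6 for binary ones)."""
    if len(zone_grades) != len(ZONES):
        raise ValueError("expected one grade per zone")
    if any(not 0 <= g <= max_per_zone for g in zone_grades):
        raise ValueError("zone grade out of range")
    return sum(zone_grades)


def is_bilateral(zone_grades: Sequence[int]) -> bool:
    """True if at least one right zone and one left zone are involved."""
    return any(zone_grades[:3]) and any(zone_grades[3:])


def is_multi_zone(zone_grades: Sequence[int], min_zones: int = 3) -> bool:
    """True if at least `min_zones` zones are involved (>= 3 in the text)."""
    return sum(1 for g in zone_grades if g > 0) >= min_zones


# Hypothetical readings for one scan (illustrative values, not study data).
overall_percent = [30, 10, 0, 20, 5, 0]
overall_grades: List[int] = [likert_grade(p) for p in overall_percent]
crazy_paving = [1, 0, 0, 1, 0, 0]  # binary coding per zone

print(overall_grades)                               # [2, 1, 0, 1, 1, 0]
print(global_score(overall_grades))                 # 5
print(global_score(crazy_paving, max_per_zone=1))   # 2
print(is_bilateral(overall_grades), is_multi_zone(overall_grades))  # True True
```

Keeping the per-zone grading separate from the summation lets the same `global_score` helper serve both the 0-4 patterns and the binary crazy-paving coding.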
To evaluate the predictive power of the CT score for the severity of COVID-19, receiver operating characteristic curves were

Table 1. The median time from symptom onset to the first CT scan was 9 days, and the median interval between CT scans in our study was 7 days (IQR 5.5, 10). Throughout the clinical course, 51 (96.2%) patients had GGO, 38 (71.7%) had consolidation, and 42 (79.2%) had crazy-paving pattern. Bilateral lung involvement was observed in 49 (92.5%) patients (See Table 2).

The intraclass correlation coefficients (ICCs) between the two radiologists ranged from 0.593 to 0.814. The ICCs between the two clinicians ranged from 0.442 to 0.861. The ICCs between radiologists and clinicians ranged from 0.756 to 0.933, which were classified as good or excellent (See Table 3). The median time spent on each CT by clinicians was 118.5 seconds.

To analyze changes in lung involvement over time and differences between moderate and severe/critical patients, the clinical course was divided into five stages by week (weeks 1-5) according to the time from initial symptom onset. Temporal changes in CT scores roughly followed bell-shaped curves. In severe/critical patients, all the metrics reached their peak value in the third week and declined later. In moderate patients, the crazy-paving pattern score reached its peak value in the second week. The consolidation score remained at a relatively constant low level. The overall lung involvement score and GGO score stayed at a moderate level with minor variation (See Figure 3 and Table 4). However, further quantitative trend analysis by repeated-measures ANOVA was not feasible because of considerable missing data. Furthermore, the overall lung involvement score, GGO score, consolidation score, and crazy-paving pattern score were compared by week between groups of different clinical severity.
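The weekly staging used in this analysis can be sketched as follows. The text states only that the course was divided into weeks 1-5 from symptom onset, so the exact binning here (days 1-7 as week 1, days 8-14 as week 2, and so on, with later scans left out) is an assumption, and the function names and example scans are hypothetical.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Optional, Tuple


def week_of_course(days_since_onset: int, max_week: int = 5) -> Optional[int]:
    """Return the week of the clinical course (1..max_week), or None if later.

    Assumption: days 1-7 are week 1, days 8-14 week 2, etc.; day 0 is week 1.
    """
    if days_since_onset < 0:
        raise ValueError("days since onset cannot be negative")
    week = max(days_since_onset - 1, 0) // 7 + 1
    return week if week <= max_week else None


def median_score_by_week(scans: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Group (days_since_onset, score) pairs by week and take the median."""
    by_week = defaultdict(list)
    for days, score in scans:
        week = week_of_course(days)
        if week is not None:  # scans after week 5 are ignored in this sketch
            by_week[week].append(score)
    return {week: median(scores) for week, scores in sorted(by_week.items())}


# Hypothetical scans as (days since onset, overall score); not study data.
scans = [(5, 3), (9, 6), (12, 8), (16, 10), (40, 2)]
print(median_score_by_week(scans))  # {1: 3, 2: 7.0, 3: 10}
```

Under this binning, a first CT at the reported median of 9 days after onset would fall in week 2.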
Severe/critical patients had higher overall lung involvement scores than moderate patients from the second week to the fourth week, as was the case for the GGO score (See Figure 3A and 3B and Table 4). Additionally, the consolidation and crazy-paving pattern scores were lower in moderate patients than in severe/critical patients during the third and fourth weeks (See Figure 3C and 3D and Table 4). Taken together, the second week was the earliest time point at which the two groups could be distinguished.

To evaluate the predictive value of the CT score for clinical severity, 34 sets of CT scores from the second week were used to generate ROC curves (Figure 4). The best AUC, 0.747 (95% CI 0.566, 0.928), p=0.017, was obtained for the overall lung involvement score. The optimal cut-off was a score higher than 5.25, with a sensitivity of 81.0% and a specificity of 69.2%. Combined models were developed to improve predictive capacity. The qSOFA and CURB-65 scores were selected as significant variables. The combined models, which included either the qSOFA or the CURB-65 score, increased the AUC to 0.810 (95% CI 0.654, 0.956) and 0.808 (95% CI 0.658, 0.958), with a specificity of 61.5% for both and sensitivities of 95.2% and 90.5%, respectively (See Table 5, Figure 4). However, ROC comparison analysis failed to demonstrate significant differences in AUC between the original model and the combined models.

This is the first study by clinical doctors to compare the longitudinal changes in CT manifestations between moderate and severe/critical COVID-19 patients using a semi-quantitative visual scoring system. Severe/critical patients had higher overall lung involvement and GGO scores than moderate patients from the second week onward, while the consolidation and crazy-paving pattern scores separated later, in the third week. The overall lung involvement score in the second week appeared to have predictive value for whole-course clinical severity, with an optimal cut-off of 5.25 points.
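The ROC analysis reported above can be illustrated with a small, dependency-free Python sketch. It computes the AUC as the probability that a severe/critical score exceeds a moderate score and picks a cut-off by maximizing Youden's index (sensitivity + specificity - 1). The text does not say how the optimal cut-off was chosen, so the Youden criterion is an assumption, and the function names and scores below are hypothetical rather than study data.

```python
from typing import Sequence, Tuple


def auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as P(positive score > negative score), with ties counted as half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sens_spec(pos: Sequence[float], neg: Sequence[float],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when a score above the cut-off is positive."""
    sensitivity = sum(s > cutoff for s in pos) / len(pos)
    specificity = sum(s <= cutoff for s in neg) / len(neg)
    return sensitivity, specificity


def youden_cutoff(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Cut-off (midpoint between adjacent observed scores) maximizing Youden's J."""
    values = sorted(set(pos) | set(neg))
    if len(values) < 2:
        raise ValueError("need at least two distinct scores")
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return max(candidates, key=lambda c: sum(sens_spec(pos, neg, c)) - 1)


# Hypothetical second-week overall scores (illustrative only, not study data).
# Half-point values arise when two reviewers' scores are averaged.
severe = [4, 6, 7.5, 9, 12]
moderate = [2, 3, 5, 6.5]

cutoff = youden_cutoff(severe, moderate)
print(auc(severe, moderate))                # 0.85
print(cutoff)                               # 7.0
print(sens_spec(severe, moderate, cutoff))  # (0.6, 1.0)
```

Because candidate cut-offs are midpoints between adjacent observed scores, a value such as 5.25 can arise naturally when scores are multiples of 0.5. As a rough consistency check, the reported 81.0% and 69.2% match 17/21 and 9/13, which would account for the 34 second-week score sets; this split is an inference and is not stated in the text.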
Our patients were all confirmed cases with pneumonia who were admitted in early February to a university-affiliated tertiary hospital in Wuhan. Clustered onset was frequently seen. There were 22 (41.5%) severe and 13 (24.5%) critical cases, which together made up two thirds of our patient population. Half of them were male, similar to a previous study [9]. The prevalence of hypertension and diabetes in our patient cohort was as high as 45% and 15%, respectively, which might indicate the vulnerability of this group of patients to COVID-19. However, this may also be related to their relatively advanced age. The association of cardiovascular comorbidities with clinical severity and prognosis in hospitalized patients remains to be further investigated and would have a profound impact on patient management.

High-resolution CT severity scoring systems have been widely used in interstitial lung disease and pneumonia for medical decision-making and prognosis [3, 10-12]. Our study applied the system in several innovative ways. First, this is the first pilot study of CT scoring by clinical doctors rather than radiologists. The two reviewers were both attending physicians trained in emergency medicine, with over ten years' clinical experience in a university-affiliated tertiary hospital. In the test set consisting of 50 patients, the inter-rater reliability between the two ED doctors was ranked as moderate or good, while that between radiologists and clinicians was good or excellent as measured by ICCs. The overall lung involvement score had the highest ICCs. Also, the values and evolving trend of the CT scores were similar to those reported by radiologists or obtained with a deep-learning approach in previous studies [4, 13-15]. Additionally, the median time for visual assessment was only about 2 minutes, without a complex protocol. Second, the six zones of the lung were much easier to recognize than lobes or segments, especially when high-resolution CT is not available.
Third, the three major types of lesions (GGO, consolidation and crazy-paving pattern) were evaluated separately to show evolving patterns and different presentations between patients of different severity, with predictive value. Rare manifestations were excluded from the analysis to avoid inaccuracy and save time. Thus, the CT score demonstrated remarkable feasibility and efficiency when used directly by experienced clinicians as a quick diagnostic tool.

CT scanning at an early stage showed favorable diagnostic value, with a sensitivity as high as 80-90% compared with around 70% for rRT-PCR [16, 17]. The peak of lung opacification occurred about 10-13 days after symptom onset, with bimodal phases from GGO-predominant to crazy-paving pattern and consolidation-predominant before final remission [4, 13-15, 18, 19]. But those conclusions were based on a majority of mild/moderate patients, which might introduce patient selection bias. Our study is the first to differentiate the clinical severity of patients by CT score and to analyze the dynamic changes over time at once, based on a considerable number of severe cases with subgroup analysis, which has rarely been explored.

This idea originated from difficult problems the authors encountered during patient care. In our study, 21 (39.6%) patients underwent escalation of care during their hospital stay, higher than the reported 20% [20]. COVID-19 patients who compensated well with no or low oxygen demand in the early stage sometimes suffered sudden, unexpected deterioration, or even ended up with intubation or in-hospital death, triggered by mild physical activity or mood swings. The discrepancy between normal saturation and considerably affected lung on CT initially drew our attention to the predictive value of CT scans.
It is noteworthy that the disadvantages of CT scanning include its financial burden and radiation exposure for patients; CT screening for the detection of COVID-19 is not recommended by radiologists either [21]. However, the prognostic value of CT in the second week was highly encouraging, in that an overall lung involvement score exceeding 5 points at this time could be taken as an alarm signal before oxygenation reserve crashed or clinical decompensation occurred. Physicians should provide sufficient oxygen support in advance to prevent sudden deterioration. Furthermore, it might even be reasonable to include the CT score as one of the criteria for clinical severity classification, together with clinical indicators. Therefore, the authors suggest that the importance of CT scanning still outweighs its adverse impact on patients during the early stage of the disease. The interval between follow-up CT scans should nevertheless be decided based on patient status and contextual factors.

There are several limitations in our study. First, it was a pilot study with only 53 patients enrolled. The relatively small sample size was inadequate to reveal further potential mechanisms or to include other predictive factors for prognosis. Further study with a larger sample size and external validation is needed. Second, artificial intelligence (AI)-assisted diagnostic technology is developing rapidly in terms of the reproducibility, sensitivity, and accuracy of quantitative evaluation [22, 23]. However, it has numerous mandatory procedural requirements, including end-inspiration scanning, scanner calibration, unified section thickness and reconstruction protocol, manual segmentation adjustment, lung volume correction and so on. Systematic error is also a concern. Furthermore, software analysis has mainly focused on small airway diseases such as COPD and is less often applied to the evaluation of GGO or other respiratory diseases.
In addition, the authors were unfortunately unable to access software analysis because of limited resources during the initial stage of the epidemic. Software analysis was not feasible in such urgent circumstances or in remote areas, whereas visual assessment proved to be a simple, rapid, and relatively reliable method.

Our modified semi-quantitative CT scoring system for COVID-19 patients demonstrated efficiency and feasibility for clinical use. Severe/critical patients had higher scores for GGO, consolidation, crazy-paving pattern, and overall lung involvement than moderate cases during weeks 2-4 of the clinical course. The overall lung involvement score in the second week appeared to have predictive value for whole-course clinical severity.

National Health Commission & National Administration of Traditional Chinese Medicine. Diagnosis and Treatment Protocol for Novel Coronavirus Pneumonia
Acute exacerbation of idiopathic pulmonary fibrosis: high-resolution CT scores predict mortality
Chest CT findings of COVID-19: Relationship with duration
Fleischner Society glossary of terms for thoracic imaging
Imaging of pulmonary viral pneumonia
Clinical and thin-section CT features of patients with 2019-nCoV-pneumonia Radiologic Practice
A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research
Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
Clinical Usefulness of HRCT in Assessing the Severity of Pneumocystis jirovecii Pneumonia: A Cross-sectional Study
Correlation of delta high-resolution computed tomography (HRCT) score with delta clinical variables in early systemic sclerosis (SSc) patients
Temporal Changes of CT Findings in 90 Patients with COVID-19 Pneumonia: A Longitudinal Study
Deep-Learning Approach. Radiology: Cardiothoracic Imaging
Sensitivity of Chest CT for COVID-19: Comparison to RT-PCR
Use of Chest CT in Combination with Negative RT-PCR Assay for the 2019 Novel Coronavirus but High Clinical Suspicion
Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: a descriptive study. The Lancet Infectious Diseases
Coronavirus Disease 2019 (COVID-19): A Systematic Review of Imaging Findings in 919 Patients
Value of CT findings in predicting transformation of clinical types of COVID-19
Radiological Society of North America Expert Consensus Statement on Reporting: Chest CT Findings Related to COVID-19. Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA. Radiology: Cardiothoracic Imaging
Radiologic Features of Patients with 2019-nCoV Infection
Preparing Medical Imaging Data for Machine Learning. Radiology: Cardiothoracic Imaging

We wish to thank our radiologists Yuqing Zhao and Ge Guo for analyzing the images and for their constructive criticism. We greatly appreciate the friendship of our colleagues from Branch of Sino-French, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China, and thank them for their great support during patient management. We also offer our deepest and warmest thanks to all the colleagues from the B11 west isolation unit for their determination to risk their own lives fighting together during the epidemic and for their passionate, high-quality and efficient teamwork.