key: cord-0929370-c0rtga8x authors: Dilek, Okan; Demirel, Emin; Akkaya, Hüseyin; Belibagli, Mehmet Cenk; Soker, Gokhan; Gulek, Bozkurt title: Different chest CT scoring systems in patients with COVID-19: could baseline CT be a helpful tool in predicting survival in patients with matched ages and co-morbid conditions? date: 2021-04-12 journal: Acta Radiol DOI: 10.1177/02841851211006316 sha: fe46a232182465d89e66867b0c18fab00a178ae8 doc_id: 929370 cord_uid: c0rtga8x BACKGROUND: Computed tomography (CT) gives an idea about the prognosis in patients with COVID-19 lung infiltration. PURPOSE: To evaluate the success rates of various scoring methods utilized in order to predict survival periods, on the basis of the imaging findings of COVID-19. Another purpose, on the other hand, was to evaluate the agreements among the evaluating radiologists. MATERIAL AND METHODS: A total of 100 cases of known COVID-19 pneumonia, of which 50 were deceased and 50 were living, were included in the study. Pre-existing scoring systems, which were the Total Severity Score (TSS), Chest Computed Tomography Severity Score (CT-SS), and Total CT Score, were utilized, together with the Early Decision Severity Score (ED-SS), which was developed by our team, to evaluate the initial lung CT scans of the patients obtained at their initial admission to the hospital. The scans were evaluated retrospectively by two radiologists. Area under the curve (AUC) values were acquired for each scoring system, according to their performances in predicting survival times. RESULTS: The mean age of the patients was 61 ± 14.85 years (age range = 18–87 years). There was no difference in co-morbidities between the living and deceased patients. The survival predicted AUC values of ED-SS, CT-SS, TSS, and Total CT Score systems were 0.876, 0.823, 0.753, and 0.744, respectively. CONCLUSION: Algorithms based on lung infiltration patterns of COVID-19 may be utilized for both survival prediction and therapy planning. 
The World Health Organization (WHO) declared the epidemic, which originated in Wuhan, China, in December 2019 and spread to the entire world, a pandemic (1). The gold standard for the diagnosis of SARS-CoV-2 infection is the reverse-transcription polymerase chain reaction (RT-PCR) test (2). Radiologic imaging also plays a very important role in the diagnosis and follow-up of the disease. Lung computed tomography (CT) findings are especially useful in the diagnosis of the disease when RT-PCR tests are negative (3). The majority of patients with COVID-19 experience mild symptoms, whereas some patients develop severe clinical conditions such as pneumonia, pulmonary edema, acute respiratory distress syndrome, and even multiple organ failure. The severity of these conditions may even lead to death (4, 5). Lung CT is a very effective tool in the diagnosis, follow-up, and prognostic prediction of patients with COVID-19 pneumonia (6, 7). Various CT severity scoring systems have been developed and tried since the very first days of the pandemic in order to make a comparative evaluation of the disease and predict survival outcomes. The scoring systems currently in use are the Total Severity Score (TSS), the Chest Computed Tomography Severity Score (CT-SS), and the Total CT Score (8-10). These systems are used in general practice in various centers, and each has its own advantages and disadvantages. They usually aim to distinguish mild from severe pneumonia. The aim of the present study was to compare these semiquantitative scoring systems with our own Early Decision Severity Score (ED-SS) system in terms of the prediction of fatality in COVID-19. We aimed to investigate which method would prove best and most practical for this predictive function in the present era of rapid spread of the contagion. The present study was planned as a single-center, retrospective study.
Approval for the study was obtained from both the ethical committee and the Ministry of Health. A total of 100 cases of COVID-19 were included in the study. These were cases whose diagnoses had been confirmed with at least one RT-PCR test between April 2020 and August 2020. During this period, the PCR tests of 572 of 3070 patients with suspected COVID-19 were positive. Among these patients, those who had a thoracic CT scan, had a positive PCR history, and whose detailed medical history could be accessed from our hospital's information system were evaluated. In total, 187 patients who met the inclusion criteria were identified. In order to reduce selection bias as much as possible, we used two different referees (a clinician and a radiologist) in the patient selection and CT interpretation processes. While selecting the patients, we aimed to keep the clinician referee unaware of the CT findings. The clinician referee excluded the following: cases whose initial CT scans were not performed on initial admission to hospital but later; cases with primary lung malignancies or lung metastases; cases with a history of lobectomy or other surgical interventions; and cases with a history of a radiotherapeutic procedure. A total of 53 deceased and 54 surviving patients with similar co-morbid conditions and ages were assigned to the radiologist referee by the clinician referee. After the exclusion of seven patients whose images were not suitable for optimal evaluation by the radiologist referee, 50 survivors and 50 deceased patients remained for the raters to score. All patients received the standard therapy protocol designated by the Turkish Ministry of Health (https://covid19.saglik.gov.tr). The 100 cases included in the study were divided into two groups of 50 each. One group consisted of the living cases and the other consisted of the deceased cases. In the surviving patient group, wellbeing was confirmed for at least 30 days after discharge.
DICOM images were anonymized and prepared for the interpreters. Thorax CT scans were performed on a 128-detector scanner (Philips Ingenuity 128; Philips, Eindhoven, The Netherlands). All scans were completed in a single breath-hold in the supine position. The standard scanning area was designated as the space between the apex of the lungs and the costophrenic angles. The CT parameters were as follows: 80-120 kVp; 100-200 mAs; gantry rotation time = 0.4 s; pitch = 0.8 or 1; slice thickness = 1 mm; and slice reconstruction = 3 mm. Axial, sagittal, and coronal reformatted images were acquired from the raw slices. The radiation dose received by the patients was calculated as 3-5.5 mSv.

All four scoring systems, three of which were present in the literature and one of which was our original product, were applied to all cases. Infiltrations and consolidations such as the ground-glass and crazy paving patterns were evaluated using all four systems.

1. Total Severity Score: This system was developed by Kunwei et al. (8), who investigated the relationship between the visual CT scoring system and disease category (11). In this scoring system, all five lobes of the two lungs are evaluated independently and scored based on the percentage of involvement. The involvement scores are categorized as follows: 0 = 0% involvement; 1 = 1%-25% involvement; 2 = 26%-50% involvement; 3 = 51%-75% involvement; and 4 = 76%-100% involvement. The patient score (TSS) is the sum of the scores obtained from the five lobes, and its maximum value is 20.

2. Chest Computed Tomography Severity Score: This system was developed by Yang et al. (9). These researchers investigated the relationship between the disease category and the visual CT scoring system (11). In this scoring system, the 18 segments of the two lungs were redesignated as 20 segments. To achieve this change in segment numbering, the posterior apical segment of the left upper lobe was further divided into the apical and posterior segments, and the anteromedial basal segment of the left lower lobe was further divided into the anterior and basal segments. The scorings made on the basis of the percentage of the involved areas were designated as follows:

3. Total CT Score: This system was developed by Wang et al. (12) from a preliminary model (13). It divides each lung into three regions: the upper region (above the carina), and the middle and lower regions (below the lower pulmonary vein). Every region is evaluated on a scale of 0-4. Scoring on the basis of zonal involvement is as follows: The patient score (Total CT Score) is the sum of the scores acquired from the six zones, and its maximum value is 24.

4. Early Decision Severity Score: This scoring system was developed by our research team. The system makes a combined evaluation of the patient based on the visual CT scorings, intubation necessities, and mortality rates (ED). In this system, both lungs are divided into two regions according to their relationships with the major fissures. The scoring scale is based on the areal size of involvement, and the percentages and scores are designated as follows: 0 = 0% involvement; 1 = 1%-25% involvement; 2 = 26%-50% involvement; 3 = 51%-75% involvement; and 4 = 76%-100% involvement. The patient score (ED-SS) is the sum of the scores acquired from the four regions, and its maximum value is 16.

The evaluation of the images was performed independently by two radiologists: a radiology specialist with four years of experience and a final-year radiology resident. All interpretations were performed under observation by a referee. All patients were evaluated, with a two-week interval, in order to prevent any interpreter bias. The interpreters did not know the medical conditions of the patients.
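The percentage bands shared by the TSS and ED-SS systems described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the tool used in the study: the function names (`involvement_score`, `total_score`, `tss`, `ed_ss`) are hypothetical, and the treatment of fractional percentages (any non-zero involvement up to 25% scores 1, and so on) is an assumption, since the text gives the bands only in whole percentages.

```python
from typing import Sequence


def involvement_score(percent: float) -> int:
    """Map the percentage of involved area (0-100) to the 0-4 scale.

    Assumption: fractional values fall into the next band up,
    e.g. 25.5% is scored as 2.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent == 0:
        return 0
    if percent <= 25:
        return 1
    if percent <= 50:
        return 2
    if percent <= 75:
        return 3
    return 4


def total_score(percents: Sequence[float], n_regions: int) -> int:
    """Sum the per-region scores after checking the region count."""
    if len(percents) != n_regions:
        raise ValueError(f"expected {n_regions} regions, got {len(percents)}")
    return sum(involvement_score(p) for p in percents)


def tss(lobe_percents: Sequence[float]) -> int:
    """Total Severity Score: five lobes, maximum 20."""
    return total_score(lobe_percents, 5)


def ed_ss(region_percents: Sequence[float]) -> int:
    """Early Decision Severity Score: four fissure-based regions, maximum 16."""
    return total_score(region_percents, 4)


# Made-up involvement percentages, for illustration only.
example_tss = tss([0, 10, 30, 60, 90])   # 0 + 1 + 2 + 3 + 4
example_ed_ss = ed_ss([100, 100, 100, 100])
```

With these made-up inputs, `example_tss` is 10 and `example_ed_ss` reaches the stated maximum of 16. The Total CT Score would follow the same pattern with six zones, but its band definitions are not given in the text above and are therefore not sketched here.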
The duration of the evaluation process and the results of the evaluations were recorded by the referee for each system. If there were any conflicting results between the two interpreters, a consensus was reached under the supervision of a senior radiologist with 18 years of experience.

SPSS version 23.0 was used for the statistical analyses of the data. The categorical variables were expressed as numbers and percentages, while continuous variables were expressed as the mean ± SD (range). The distribution of the variables was checked for normality using histograms and probability plots as well as analytical methods such as the Kolmogorov-Smirnov and the Shapiro-Wilk tests. The chi-square and Fisher exact tests were used to compare the categorical variables. The Student's t-test was used for parameters that were normally distributed, whereas the Mann-Whitney U test was used for parameters that were not. Sensitivity and specificity were calculated based on the results of the scorings and the mortality status of the cases. The area under the curve (AUC) was calculated from the receiver operating characteristic (ROC) curve and a cut-off value was designated. The reliability among the interpreters was assessed using intraclass correlation coefficients (ICCs) for continuous variables. The ICCs were classified as follows: 0-0.2 = no agreement; 0.21-0.40 = weak agreement; 0.41-0.60 = mild agreement; 0.61-0.80 = good agreement; and 0.81-1.0 = perfect agreement. The statistical significance level was set at 0.05 for all tests.

There were 61 men (61%) in the present study. The mean age of the cases was 61 ± 14.8 years (age range = 18-87 years). No statistically significant differences were found between the living and the deceased cases in terms of co-morbidities (P > 0.05).
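The ROC analysis described in the statistical methods can be illustrated with a short, dependency-free Python sketch. It is not the SPSS procedure used in the study. The AUC is computed through its equivalent rank interpretation (the probability that a randomly chosen deceased case has a higher score than a randomly chosen surviving case, with ties counted as one half). The cut-off is chosen by maximizing Youden's J, which is an assumption, because the text does not state how the cut-off was designated. All function names are hypothetical and the scores are made-up toy data.

```python
from typing import Sequence, Tuple


def roc_auc(deceased: Sequence[float], survivors: Sequence[float]) -> float:
    """AUC as the fraction of (deceased, survivor) pairs ranked correctly.

    Ties contribute 0.5, which matches the trapezoidal area under the ROC curve.
    """
    wins = 0.0
    for d in deceased:
        for s in survivors:
            if d > s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(deceased) * len(survivors))


def sens_spec(deceased: Sequence[float], survivors: Sequence[float],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff predicts death."""
    true_pos = sum(d >= cutoff for d in deceased)
    true_neg = sum(s < cutoff for s in survivors)
    return true_pos / len(deceased), true_neg / len(survivors)


def youden_cutoff(deceased: Sequence[float],
                  survivors: Sequence[float]) -> float:
    """Observed score that maximizes sensitivity + specificity - 1.

    Assumption: Youden's J is used here purely for illustration.
    """
    candidates = sorted(set(deceased) | set(survivors))
    return max(candidates,
               key=lambda c: sum(sens_spec(deceased, survivors, c)) - 1)


# Toy scores, invented for this example (not study data).
toy_deceased = [9, 12, 5, 14]
toy_survivors = [3, 6, 7, 2]

toy_auc = roc_auc(toy_deceased, toy_survivors)
toy_cutoff = youden_cutoff(toy_deceased, toy_survivors)
toy_sens, toy_spec = sens_spec(toy_deceased, toy_survivors, toy_cutoff)
```

For this toy data, 14 of the 16 deceased-survivor pairs are ranked correctly, so `toy_auc` is 0.875. The selected cut-off is 9, giving a sensitivity of 0.75 and a specificity of 1.0.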
Dyspnea, on the other hand, was more frequent in the deceased cases, and this difference was statistically significant. The mean hospitalization time for the living and deceased cases was 6 days (range = 3-26 days) and 10 days (range = 0-62 days), respectively. Other basic data are given in Table 1. The scorings of all patients carried out by the two radiologists on the basis of the TSS, CT-SS, Total CT Score, and ED-SS systems are given in Table 2. The evaluations by both interpreters showed that the mean scores of the deceased cases were higher than those of the living cases (P < 0.05). The most successful system in predicting survival outcomes was the ED-SS system, for both interpreters. Its AUC values were 0.876 and 0.872 for interpreters 1 and 2, respectively. The AUC, sensitivity, specificity, and the positive and negative predictive values for all scoring systems are shown in detail in Table 3. The ROC curves for both interpreters are shown in Fig. 1. The intraclass correlation values between the interpreters were rather high for all scorings. The highest agreement was obtained with the CT-SS system. The detailed data are shown in Table 4. The fastest evaluation was made with the Total CT Score system. In all systems, the more experienced interpreter performed the faster evaluations. The evaluation times for both interpreters are shown in Table 5.

Various scoring systems were developed from the very beginning of the pandemic in order to correlate the clinical and radiological findings of the disease, based on the findings of thorax CT scans (8, 9, 14). These systems had proved to be functional; however, a solid consensus was still missing among radiologists on which system was the most effective, practical, applicable, and fast.
The aim of the present study was to evaluate previous systems and ours together and make a comparison among these systems. The highest level of success in predicting survival among all systems was achieved with the ED-SS, the system we developed. This system had an AUC value of 0.876, a sensitivity of 0.74, and a specificity of 0.88. Among the previously developed systems, the AUC values, from high to low, were as follows: CT-SS: AUC = 0.823, sensitivity = 76%, specificity = 84%; TSS: AUC = 0.753, sensitivity = 76%, specificity = 66%; Total CT Score: AUC = 0.744. [...] (15); it had an AUC value of 0.839, sensitivity of 84%, and specificity of 66%, in the prediction of survival rates. The AUC values reported for each previous system are high. In our evaluations, the AUC values are relatively low compared with other studies because, in those studies, the number of patients who died was very low compared with the total number of participants, and age and co-morbid conditions were not matched. There was no significant difference between the groups evaluated in our study in terms of co-morbid conditions and mean ages. In this respect, we think that the comparison of the two groups allows a more objective evaluation of the success of semiquantitative scoring systems applied to the first thorax CT at admission. The contemporary literature reveals that various radiologic imaging clues have been used at initial presentation to predict prognosis. Quantitative software is available to predict the prognosis of patients with COVID-19 pneumonia and to review possible treatment changes accordingly (16-18). In addition, artificial intelligence (AI) models based on CT have been developed, and their success rates are quite high (19, 20).
It is a fact that more studies are needed on the repeatability and widespread use of AI methods and their transition to routine practice. On the other hand, technical equipment is required for both quantitative evaluation and AI models, user training is required, and accessibility and cost are important problems. The rapid and ongoing spread of the contagion of SARS-CoV-2 has mandated the development of a better and more practical system for prognosis. There has been a race against time since the outbreak of the pandemic, which is why it has always been very important to predict the outcome of the disease process from lung CT images in patients with COVID-19. In this regard, it is very important that the CT scoring systems are practical and useful and have high predictive power. The ED-SS system was developed by our research team, and it has proved to have a superior ability to determine the prognosis in patients with COVID-19. This success of the ED-SS system may be attributed to the fact that the system does not deal with anatomical detail, but instead validates the involved lung areas directly and with great precision. Short evaluation times and easy teachability are other advantages of the system. Among the systems developed before ours, the CT-SS system has the highest precision in predicting survival outcome. However, this system still has its own disadvantages: it necessitates a detailed anatomical knowledge, and it also takes a long time for evaluation. The TSS system was the third in line in survival prediction. The biggest advantage of this system is that it makes a lobar evaluation. Its other advantage is the rather short period of time spent on the evaluation process. The lowest success rate in survival prediction belongs to the Total CT Score system. Nevertheless, this system still has the advantage of a short evaluation time. These scoring systems are very compatible and practical for radiologists with sufficient expertise (Figs. 2 and 3) . 
The findings of the present study demonstrated that the intraclass agreement coefficients were rather high in both our system and in other previously developed systems. Unfortunately, in today's pandemic conditions, it is usually not the radiologist who performs the initial evaluation. Therefore, it is of utmost importance that these systems are easily teachable for the radiology residents as well as other specialists. The learning curve for the segmental system of the lungs is steep and hard. It is easier to use the lobar, zonal, and fissural evaluation systems, in terms of an easier learning process. The fissural evaluation system is the most successful one among these systems. Its validation in different centers may be beneficial for a better outcome. The present study has some limitations. First, it was a single-center study. In addition, the number of cases was not very satisfying, and our system and other previous systems were not compared with quantitative systems. However, this disadvantage may be neglected, because many clinics lack quantitative systems; besides, it is not easy to utilize these systems. Systems not necessitating the utilization of extra equipment or software are much more practical and useful. In our study, we aimed to develop a CT-based validation system. Since our general study group is an elderly group with comorbid diseases, we think that there is a limitation in its adaptation to the general society. We believe that this is the reason for our prediction of poor survival with a lower CT score than previous studies. A process completely unaware of patients' CT scans was established to evaluate the compatibility of CT scoring systems between practitioners and clinical outcomes. We think that this reduces cross-system bias in patient selection as much as possible. 
In conclusion, the present study has demonstrated that CT-SS, TSS, and Total CT Score systems can be used to obtain information about the prognosis of patients with pulmonary involvement in the first thoracic CT evaluation in the race with the increasing workload in radiology departments. The ED-SS system has been developed as an alternative to these.

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Okan Dilek https://orcid.org/0000-0002-2144-2460

CT-SS, Chest Computed Tomography Severity Score; ED-SS, Early Decision Severity Score; TSS, Total Severity Score.

1. A Novel Coronavirus from Patients with Pneumonia in China
2. Frequency and distribution of chest radiographic findings in patients positive for COVID-19
3. Chest CT findings in patients with coronavirus disease 2019 and its relationship with clinical features
4. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
5. CT imaging features of 2019 novel coronavirus (2019-nCoV)
6. Chest CT findings in coronavirus disease-19 (COVID-19): relationship to duration of infection
7. Time course of lung changes at chest CT during recovery from coronavirus disease 2019 (COVID-19)
8. CT image visual quantitative evaluation and clinical classification of coronavirus disease (COVID-19)
9. Chest CT Severity Score: an imaging tool for assessing severe COVID-19
10. The clinical and chest CT features associated with severe and critical COVID-19 pneumonia
11. Office of state administration of traditional Chinese medicine. Notice on the issuance of a programme for the diagnosis and treatment of novel coronavirus (2019-nCoV) infected pneumonia
12. Temporal changes of CT findings in 90 patients with COVID-19 pneumonia: a longitudinal study
13. Severe acute respiratory syndrome: temporal lung changes at thin-section CT in 30 patients
14. Chest CT score in COVID-19 patients: correlation with disease severity and short-term prognosis
15. Evaluation of the relationship between inpatient COVID-19 mortality and chest CT severity score
16. Quantitative computed tomography analysis for stratifying the severity of Coronavirus Disease
17. The performance of chest CT in evaluating the clinical severity of COVID-19 pneumonia: identifying critical cases based on CT characteristics
18. Role of computed tomography in predicting critical disease in patients with covid-19 pneumonia: A retrospective study using a semiautomatic quantitative method
19. Development and validation of a deep learning-based model using computed tomography imaging for predicting disease severity of coronavirus disease
20. Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal