key: cord-0965333-1m3bqw9s authors: Debray, M.-P.; Tarabay, H.; Males, L.; Chalhoub, N.; Mahdjoub, E.; Pavlovsky, T.; Visseaux, B.; Bouzid, D.; Borie, R.; Wackenheim, C.; Crestani, B.; Rioux, C.; Saker, L.; Choquet, C.; Mullaert, J.; Khalil, A. title: Observer agreement and clinical significance of chest CT reporting in patients suspected of COVID-19 date: 2020-05-11 journal: nan DOI: 10.1101/2020.05.07.20094102 sha: 89c8e878cb23dfefe7c66c728f6392505fa5e146 doc_id: 965333 cord_uid: 1m3bqw9s Objectives: To assess inter-observer agreement and clinical significance of chest CT reporting in patients suspected of COVID-19. Methods: From 16th to 24th March 2020, 241 consecutive patients addressed to hospital for COVID-19 suspicion had both chest CT and SARS-CoV-2 RT-PCR. Eight observers (2 thoracic and 2 general senior radiologists, 2 junior radiologists and 2 emergency physicians) retrospectively categorized each CT into one out of 3 categories (evocative, compatible for COVID-19 pneumonia, and not evocative or normal). Observer agreement for categorization between all readers and pairs of readers with similar experience was evaluated with the Kappa coefficient. The results of a consensus categorization were correlated to RT-PCR. Results: Observer agreement across the 3 categories was good between all readers (kappa value 0.68 95%CI 0.67-0.70) and good to very good between pairs of readers (0.64-0.85). It was very good (0.81 95%CI 0.79-0.83), fair (0.32 95%CI 0.29-0.34) and good (0.74 95%CI 0.71-0.76) for the categories evocative, compatible and not evocative or normal, respectively. RT-PCR was positive in 97%, 50% and 27% of cases classified in the respective categories. Observer agreement was lower (p=0.045) and RT-PCR positive cases were less frequently categorized evocative in presence of an underlying pulmonary disease (p<0.001). Conclusion: Inter-observer agreement for chest CT reporting using categorization of findings is good in patients suspected of COVID-19. 
Among patients considered for hospitalization in an epidemic context, CT categorized evocative is highly predictive of COVID-19, whereas the predictive value of CT decreases between the categories compatible and not evocative.

Since December 2019, a new respiratory disease caused by a novel coronavirus, SARS-CoV-2, has emerged in China and rapidly spread to other countries, reaching a pandemic stage in March 2020 [1, 2]. Even if the disease follows a benign course in many cases, some patients develop respiratory difficulties requiring hospitalization, leading to a large number of patients with clinical suspicion of coronavirus disease 2019 (COVID-19) presenting to emergency departments [3]. Accurate identification of COVID-19 patients is crucial to isolate them from uninfected patients and to limit the spread of the outbreak. The reference standard is the positivity of the real-time reverse transcription-polymerase chain reaction (RT-PCR); nevertheless, the sensitivity of this test remains unclear, having been reported between 42 and 71% in some early series [4-6], because of suboptimal sampling technique, limitations in assay performance or low viral load in the nasopharyngeal area. Chest CT shows abnormalities in a large majority of cases, with some signs described as typical or very evocative of the disease in the current outbreak context [7-11]. Sensitivity of chest CT has been reported to be as high as 97% as compared to RT-PCR [4], and CT abnormalities could precede RT-PCR positivity [12]. Because of its ready availability, chest CT may assist first-line triage of patients presenting to hospital [3]. Several Radiology Societies [6, 13-15] have proposed structured reporting of CT into categories, defined according to the typical or less typical appearance of lung involvement, to facilitate communication with physicians.
In routine practice, categorization is based on each reader's individual impression, supported by the numerous papers that have described imaging signs of COVID-19 pneumonia [7-11]. Such categorization may directly impact clinical decision-making. However, the reproducibility of the categorization is unknown and the clinical significance of the different categories is unclear. Thus, the objectives of our study were to assess inter-observer agreement in categorizing CT findings, as well as the performance of chest CT across the different categories, in patients suspected of COVID-19 presenting to hospital.

medRxiv preprint, this version posted May 11, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

This is a monocentric retrospective study conducted in a University Hospital (Bichat Claude-Bernard Hospital, Paris, France) between March 16 and March 24, 2020. Institutional review board approval was obtained and written informed consent was waived. During this period of COVID-19 outbreak, patients presenting at our hospital for COVID-19 suspicion and for whom hospitalization was considered had both chest CT and SARS-CoV-2 RT-PCR. Diagnosis of COVID-19 relied on the positivity of the RT-PCR, and CT could assist early triage in critically ill patients or those with clinically overt pneumonia. Patients with a negative RT-PCR result could have a subsequent RT-PCR test and/or another chest CT during the next few days, depending on the physician's judgment. Consecutive adult patients attending the emergency room or the infectious diseases department of our hospital with clinical suspicion of COVID-19 and having both chest CT and SARS-CoV-2 RT-PCR were included. Demographic, clinical and laboratory data at presentation, as well as follow-up data when available, were extracted from electronic medical records.
Clinical data included symptoms, any need for oxygen supply, time from symptom onset to CT, comorbidities and pre-existing pulmonary diseases. RT-PCR was performed on nasopharyngeal swabs or aspirates, using the RealStar® SARS-CoV-2 RT-PCR Kit (Altona Diagnostics) or the Cobas® SARS-CoV-2 Test (Roche).

Chest CT scans were acquired on a multidetector-row CT (Canon Aquilion PRIME or GENESIS) without contrast medium injection. They were performed in the supine position at full inspiration. The scanning parameters were as follows: 120 kVp, automatic exposure control for tube current (SD: 15), exposure time 0.27-0.35 sec per rotation depending on the CT unit, collimation 40 mm. Images were reconstructed with 1 mm slice thickness and 0.8 mm inter-slice gap, using a high-frequency reconstruction algorithm.

All CT scans were analyzed by 8 readers, including 2 senior emergency physicians (TP, DB, [...]

This category also included non-specific abnormalities such as sub-segmental atelectasis or opacities considered to be sequelae. We chose to merge a normal appearance of the lung parenchyma with these non-specific signs because distinguishing it from minor non-specific abnormalities or abnormalities typical of sequelae has no clinical relevance in the present study.
Consequently, 3 categories were retained: evocative, compatible and not evocative or normal. Any disagreement between the 4 senior radiologists was resolved by consensus of these 4 readers, giving a final consensus categorization for all cases. Finally, all chest CTs were described by one thoracic radiologist (MPD) for the presence and distribution of various elementary signs, as well as signs of any underlying pulmonary disease (significant pulmonary emphysema, interstitial lung disease, bronchiectasis, parenchymal sequelae, bronchial carcinoma).

Categorical variables were described by numbers and percentages for each category. Agreement between two readers was evaluated with Cohen's kappa coefficient and agreement between more than two readers with Fleiss' kappa, each with its 95% confidence interval; kappa measures the excess proportion of agreement after taking chance into account. Comparisons between dependent kappas (e.g. for different pairs of readers on the same images) were performed with bootstrapping (N=10000 samples); the p-value corresponds to the proportion of bootstrap samples that yield a pair of kappa values in a different order than the observed one. Comparisons between independent kappas (e.g. for different levels of a categorical variable) were performed according to the method proposed in [16]. Comparison of the frequency of radiologic signs between categories was performed with Fisher's exact test. All analyses were performed using R v3.6.1.

In total, 241 patients were included. Their demographics, signs at presentation and comorbidities are shown in Table 1. COVID-19 was confirmed in 158 patients by RT-PCR positivity.
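The agreement statistics named in the statistical analysis (Cohen's kappa for two readers, Fleiss' kappa for more than two readers, and the bootstrap comparison of dependent kappas) can be illustrated with a minimal Python sketch. This is not the authors' code, which was written in R; the function names are hypothetical, and a degenerate resample in which the expected agreement equals 1 is handled by returning NaN, which is an assumption of this sketch rather than a rule stated in the paper.

```python
import random
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two readers rating the same cases."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in labels) / n ** 2
    if expected == 1:
        # Both readers used a single identical label: kappa is undefined.
        return float("nan")
    return (observed - expected) / (1 - expected)


def fleiss_kappa(ratings_per_case):
    """Fleiss' kappa; each case is a list with one label per reader."""
    n_cases = len(ratings_per_case)
    n_readers = len(ratings_per_case[0])
    totals = Counter()
    agreement_sum = 0.0
    for case in ratings_per_case:
        counts = Counter(case)
        totals.update(counts)
        squares = sum(c * c for c in counts.values())
        # Proportion of agreeing reader pairs for this case.
        agreement_sum += (squares - n_readers) / (n_readers * (n_readers - 1))
    p_bar = agreement_sum / n_cases
    p_e = sum((t / (n_cases * n_readers)) ** 2 for t in totals.values())
    return (p_bar - p_e) / (1 - p_e)


def bootstrap_p_value(pair_1, pair_2, n_samples=10000, seed=0):
    """Compare two dependent kappas (two reader pairs, same cases).

    Returns the proportion of bootstrap samples in which the two kappas
    fall in a different order than in the observed data.
    """
    rng = random.Random(seed)
    n = len(pair_1[0])
    observed_diff = cohen_kappa(*pair_1) - cohen_kappa(*pair_2)
    reversed_order = 0
    for _ in range(n_samples):
        # Resample cases with replacement, same indices for both pairs.
        idx = [rng.randrange(n) for _ in range(n)]
        k1 = cohen_kappa([pair_1[0][i] for i in idx], [pair_1[1][i] for i in idx])
        k2 = cohen_kappa([pair_2[0][i] for i in idx], [pair_2[1][i] for i in idx])
        if (k1 - k2) * observed_diff < 0:
            reversed_order += 1
    return reversed_order / n_samples
```

With ratings coded as the three category labels, `cohen_kappa` applies to any pair of readers and `fleiss_kappa` to the full set of 8 readers; confidence intervals, which the study also reports, are not covered by this sketch.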
[...] follow-up strongly supportive of this diagnosis. Fifteen patients were considered non-COVID-19 because of at least 2 consecutive negative RT-PCR tests and the absence of clinical and radiological signs favoring COVID-19 during follow-up. Sixty-six patients were considered non-COVID-19 on the basis of a single negative RT-PCR test, including 38 with CT and/or clinical follow-up (Fig. 1).

The kappa coefficient between all readers across the 3 CT categories was good (0.68, 95%CI 0.67-0.70). It was good to very good between each pair of readers, and significantly better between resident radiologists than between thoracic senior radiologists (p<0.001) (Table 2). The kappa value between all readers was lower when abnormalities were unilateral as compared to bilateral (p=0.018) and in the presence of underlying pulmonary lesions (p=0.045). It was lower in patients older than 70 years than in younger patients (p=0.017), and when time from symptom onset to CT was shorter than 5 days (p=0.012). Observer agreement between all readers was very good for the category "evocative" (0.81, 95%CI 0.79-0.83), good for the category "not evocative or normal" (0.74, 95%CI 0.71-0.76) and fair for the category "compatible" (0.32, 95%CI 0.29-0.34) (Table 3).

The RT-PCR positivity rate was highly significantly different among the 3 categories (97%, 50% and 27% for evocative, compatible and not evocative or normal, respectively). With RT-PCR as reference, chest CT classified evocative had 75% sensitivity (95%CI 68-81%) and 95% specificity (95%CI 87-98%), whereas chest CT classified evocative or compatible had 85% sensitivity and 77% specificity.

Among 90 patients with a first negative RT-PCR, 22 were re-tested, including 5 out of 6 patients (83%) with CT considered evocative, 5 out of 17 patients (29%) with CT considered compatible and 12 out of 67 patients (18%) with CT considered not evocative or normal.
Of these subsequent RT-PCR tests, 2 out of 5 were positive in each of the categories "evocative" and "compatible", and 3 out of 12 in the third category.

CT features of the whole population, of RT-PCR positive cases and of the different CT categories are shown in Table 4. The most frequent pattern of CT considered evocative was mixed with predominant ground-glass opacities (GGO). A typical bilateral and peripheral distribution with posterior involvement was almost constant. Some centrally distributed lesions were associated with peripheral lesions in 72% of cases (Fig. 2). The 30 chest CTs considered compatible more frequently showed pure GGO as compared to evocative cases (p=0.0012). Among these 30 cases, 12 showed features of an underlying pulmonary disease and 6 showed features of an associated pulmonary edema (Figs. 3 and 4). As compared with cases classified "not evocative" (Figs. 5 and 6), cases classified compatible more frequently showed a typical distribution, and atypical signs were absent among those with positive RT-PCR. Time from symptom onset to CT was longer, patients were older and need for oxygen supply was more frequent in patients whose CT was categorized evocative as compared to other patients.

The current study describes observer agreement for chest CT reporting in a large series of consecutive patients suspected of COVID-19. We found that categorization of CT reports was reproducible and meaningful in patients considered for hospitalization. With SARS-CoV-2 RT-PCR as reference, chest CT reported "evocative of COVID-19 pneumonia" was highly predictive of the disease in this population in the current outbreak, and agreement for this category between observers of various experience levels and sub-specialties was very good.
The positivity rate of RT-PCR was highly significantly different among the categories, supporting a role for disease likelihood stratification by CT. It should be emphasized that a quarter of patients with chest CT classified "not evocative or normal" had a positive RT-PCR, highlighting that no CT pattern can rule out COVID-19 in the present epidemic context. As previously reported [4, 17], RT-PCR may be positive in patients without lung abnormalities on CT. We observed only fair observer agreement for the category "compatible". This may be [...]

The recent Radiological Society of North America proposal for CT findings related to COVID-19 includes 4 categories: typical, indeterminate, atypical appearance, or CT negative for [...] [13-15]. We herein chose to retain 3 categories, merging a normal appearance of the lung parenchyma with non-specific lung abnormalities or features suggesting an alternative diagnosis, because we thought the probability of COVID-19 would be lower in these latter situations, although non-zero, and in accordance with the guidelines of the European Society of Radiology. By showing significant differences in the RT-PCR positivity rate among CT categories, our study supports the view that chest CT can contribute to estimating the likelihood of COVID-19, in association with contact history, clinical presentation and prevalence of the disease in the population [18].

The role of chest CT for patients suspected of COVID-19 is not completely established. Despite limitations in sensitivity and delays in obtaining results, RT-PCR remains the diagnostic reference and chest CT is not recommended for screening by most Radiology Societies [6, 15, 19, 20].
According to a recent consensus statement from the Fleischner Society [20], imaging may be indicated for diagnosis when RT-PCR is negative or unavailable in patients having risk factors for worsening or moderate-to-severe respiratory signs. In our study, most patients who had chest CT at the emergency room indeed had either moderate or severe clinical features or comorbidities. Chest CT helped direct or transfer patients to the appropriate hospitalization area (COVID-19 or non-COVID-19), especially those needing an urgent decision, before the RT-PCR result was available. Chest CT could also prompt re-testing in cases with negative RT-PCR [12]. Patients with a first negative RT-PCR and a chest CT considered "compatible" were more frequently re-tested and more frequently had a positive subsequent RT-PCR, as compared to patients with a first negative RT-PCR and a chest CT considered "not evocative". Of note, two patients had a final retained diagnosis of COVID-19 based on typical clinical and CT presentation and evolution, despite two negative nasopharyngeal RT-PCR tests.

[...] [4, 5, 21] and specificity between 25 and 56%, with pooled sensitivity and specificity of 94% and 37%, respectively, according to a recent meta-analysis [22]. Whether positive CTs in these studies showed typical imaging features is unclear. Our results differ, with chest CT classified evocative having 75% sensitivity and 95% specificity, and chest CT classified evocative or compatible having 85% sensitivity and 77% specificity. These differences may be attributable to differences in CT features between an "evocative" CT and a "positive" CT, as well as to differences in the characteristics of the population having chest CT.
The prevalence of the disease in the population, severity and type of clinical presentation, time from symptom onset to CT, age of patients and any underlying pulmonary pathology may modify the performance of chest CT for diagnosing COVID-19 pneumonia [20, 23]. Indeed, we observed that the presence of an underlying pulmonary disease lowered the sensitivity of an evocative CT. This concerned a quarter of the whole population in our study and almost a quarter of the patients with positive RT-PCR. CT reporting in several categories seems better suited to routine practice than a binary conclusion when CT abnormalities mix different types of lesions or are very limited in extent. It allows the identification of a category with typical features, whose high specificity can be useful in an epidemic context, making it possible to rely on chest CT for diagnosis in some cases.

Our study has some limitations. Firstly, because it is monocentric and because the presentation and prevalence of the disease vary around the world, caution should be taken in extrapolating CT sensitivity and specificity between populations and periods [24, 25]. We may assume that the high specificity we report for the category "evocative" would be lower if the prevalence of COVID-19 decreased and that of other viral pneumonias or of some interstitial lung diseases, such as those related to drugs or connective tissue diseases, increased. Secondly, the clinical significance of CT reporting should integrate the risk level of each patient, which we did not precisely take into account, even though the study took place during the outbreak.
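The dependence of predictive values on prevalence can be made concrete with a short worked example. The sketch below applies Bayes' rule to the sensitivity (75%) and specificity (95%) reported here for the category "evocative", using the study prevalence of RT-PCR positivity (158 of 241 patients) and a few lower prevalences that are purely hypothetical. It assumes that sensitivity and specificity stay fixed as prevalence changes, which, as noted above, may not hold in practice; the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values reported in this study for CT classified "evocative".
SENSITIVITY = 0.75
SPECIFICITY = 0.95
STUDY_PREVALENCE = 158 / 241  # RT-PCR positive patients / included patients

# The lower prevalences are hypothetical, chosen only for illustration.
for prevalence in (STUDY_PREVALENCE, 0.30, 0.10, 0.02):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

At the study prevalence this gives a positive predictive value of about 0.97, in line with the 97% RT-PCR positivity observed in the category "evocative"; under the same assumptions the positive predictive value falls as prevalence decreases.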
In conclusion, inter-observer agreement for reporting chest CT findings into categories in patients with clinical suspicion of COVID-19 is good among readers of various experience levels and subspecialties. Chest CT can contribute to estimating the likelihood of COVID-19 in patients presenting to hospital during the outbreak. CTs categorized as evocative of COVID-19 pneumonia were highly predictive of the disease, whereas the predictive value of CT decreased between the categories "compatible" and "not evocative or normal", from 50 to 27%. Category reports need to be integrated with the clinical presentation and risk level for COVID-19.

Onset-to-CT delay (days): 4 [2-7]
Data are numbers with percentages in brackets or medians with lower and upper quartiles in square brackets.

Cohen's kappa value for agreement between 2 readers, and Fleiss' kappa value for agreement between all readers.
*Agreement between general senior radiologists and between senior emergency physicians was not significantly different from agreement between thoracic senior radiologists.
**Observer agreement was significantly better between resident radiologists than between thoracic senior radiologists (p<0.001).
Data are numbers with percentages, or 95% confidence interval for the κ value, in brackets.

Underlying pulmonary disease includes significant emphysema, interstitial lung disease, bronchiectasis, sequelae.

[...] consistent with associated COVID-19 pneumonia. Nasopharyngeal SARS-CoV-2 RT-PCR positive.

Clinical features of patients infected with 2019 novel coronavirus in Wuhan
World Health Organization (2020) Director General's speeches
How imaging should properly be used in COVID-19 outbreak: an Italian experience
Correlation of Chest CT and RT-PCR Testing in Coronavirus Disease 2019 (COVID-19) in China: A Report of 1014 Cases
Sensitivity of Chest CT for COVID-19: Comparison to RT-PCR
Radiological Society of North America Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19. Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA. Radiol Cardiothorac Imaging
Chest CT manifestations of new coronavirus disease 2019 (COVID-19): a pictorial review
[...] COVID-19): A Perspective from China
Coronavirus Disease 2019 (COVID-19): A Systematic Review of Imaging Findings in 919 Patients
Relation Between Chest CT Findings and Clinical Conditions of Coronavirus Disease (COVID-19) Pneumonia: A Multicenter Study
Coronavirus disease 2019: initial chest CT findings
Chest CT for Typical 2019-nCoV Pneumonia: Relationship to Negative RT-PCR Testing
Reporting templates COVID-19
COVID-19 patients and the radiology department - advice from the European Society of Radiology (ESR) and the European Society of Thoracic Imaging (ESTI)
Statistical methods for rates and proportions
Chest CT Findings in Coronavirus Disease-19 (COVID-19): Relationship to Duration of Infection
Primary stratification and identification of suspected Corona virus disease 2019 (COVID-19) from clinical perspective by a simple scoring proposal
A British Society of Thoracic Imaging statement: considerations in designing local imaging diagnostic algorithms for the COVID-19 pandemic
The Role of Chest Imaging in Patient Management during the COVID-19 Pandemic: A Multinational Consensus Statement from the Fleischner Society
Chest CT Features of COVID-19 in [...]
Diagnostic Performance of CT and Reverse Transcriptase-Polymerase Chain Reaction for Coronavirus Disease 2019: A Meta-Analysis
Differential Diagnosis for Coronavirus Disease (COVID-19): Beyond Radiologic Features
A call for caution in extrapolating chest CT sensitivity for COVID-19 derived from hospital data to patients among general population
Chest CT and Coronavirus Disease (COVID-19): A Critical Review of the Literature to Date