key: cord-0717256-ly8kp1er authors: Kwee, Robert M.; Adams, Hugo J. A.; Kwee, Thomas C. title: Diagnostic Performance of CO-RADS and the RSNA Classification System in Evaluating COVID-19 at Chest CT: A Meta-Analysis date: 2021-01-14 journal: Radiol Cardiothorac Imaging DOI: 10.1148/ryct.2021200510 sha: 939a62a77aaa40edc555b745b9f601d0199c3918 doc_id: 717256 cord_uid: ly8kp1er

PURPOSE: To determine the diagnostic performance of the COVID-19 Reporting and Data System (CO-RADS) and the Radiological Society of North America (RSNA) categorizations in patients with clinically suspected coronavirus disease 2019 (COVID-19) infection. MATERIALS AND METHODS: In this meta-analysis, studies published in 2020, up to August 24, 2020, were assessed for inclusion if they used CO-RADS or the RSNA categories for scoring chest CT in patients with suspected COVID-19. A total of 186 studies were identified. After review of abstracts and text, a total of nine studies were included in this study. Patient information (n, age, sex), CO-RADS and RSNA scoring categories, and other study characteristics were extracted. Study quality was assessed with the QUADAS-2 tool. Meta-analysis was performed with a random effects model. RESULTS: Nine studies (3283 patients) were included. Overall study quality was good, except for risk of non-performance of repeated reverse transcriptase polymerase chain reaction (RT-PCR) after negative initial RT-PCR and persistent clinical suspicion in four studies. Pooled COVID-19 frequencies in CO-RADS categories were: 1, 8.8%; 2, 11.1%; 3, 24.6%; 4, 61.9%; and 5, 89.6%. Pooled COVID-19 frequencies in RSNA classification categories were: negative, 14.4%; atypical, 5.7%; indeterminate, 44.9%; and typical, 92.5%. 
Pooled pairs of sensitivity and specificity using CO-RADS thresholds were the following: at least 3, 92.5% (95% CI: 87.1, 95.7) and 69.2% (95% CI: 60.8, 76.4); at least 4, 85.8% (95% CI: 78.7, 90.9) and 84.6% (95% CI: 79.5, 88.5); and 5, 70.4% (95% CI: 60.2, 78.9) and 93.1% (95% CI: 87.7, 96.2). Pooled pairs of sensitivity and specificity using RSNA classification thresholds for indeterminate were 90.2% (95% CI: 87.5, 92.3) and 75.1% (95% CI: 68.9, 80.4) and for typical were 65.2% (95% CI: 37.0, 85.7) and 94.9% (95% CI: 86.4, 98.2). CONCLUSION: COVID-19 infection frequency was higher in patients categorized with higher CO-RADS and RSNA classification categories.

The coronavirus disease 2019 (COVID-19) pandemic has caused a major global crisis. On December 2, 2020, there were 64 million confirmed cases and almost 1.5 million confirmed deaths due to COVID-19 worldwide (1). Although most countries have already experienced the first surge of rising COVID-19 cases, second surges started in late 2020. Chest imaging has an important role in the evaluation of patients with COVID-19 (2). The chest imaging findings of COVID-19 were first reported in January 2020 and included bilateral lung involvement and ground-glass opacities in the majority of hospitalized patients (3). Since this first report (3), several studies on the diagnostic value of chest CT in COVID-19 have been published. However, as most initial studies did not use uniform diagnostic criteria (4), their results cannot directly be translated to clinical practice. Two major chest CT classification scales for standardized CT reporting of COVID-19 have been developed, namely the COVID-19 Reporting and Data System (CO-RADS) (5) and the Radiological Society of North America (RSNA) classification system for reporting COVID-19 pneumonia (6, 7). 
CO-RADS consists of five categories (CO-RADS 1 to 5; Table E1 and Figures E1-5 [supplement]), whereas the RSNA classification system consists of four categories (negative, atypical, indeterminate, and typical; Table E2 and Figures E1-5 [supplement]). CO-RADS and the RSNA chest CT classification system are very similar. CO-RADS categories 1, 2, 3-4, and 5 are essentially equal to categories negative, atypical, indeterminate, and typical of the RSNA classification system, respectively (5, 8). The use of these standardized diagnostic classification systems may reduce observer variation, enhance clinical communication, and improve generalizability. However, the diagnostic yields of both the CO-RADS and RSNA categorizations are not completely clear yet. Original studies on this topic may suffer from small sample sizes and potential methodological quality concerns. Aggregated data are necessary to understand the clinical interpretability of these chest CT classification systems for the diagnosis of COVID-19. Although meta-analyses on the diagnostic performance of chest CT in detecting COVID-19 have already been published (4, 9), the initial studies included within these meta-analyses suffered from methodological quality issues and did not use uniform diagnostic criteria such as the CO-RADS and RSNA categorizations. These shortcomings limit translation of diagnostic performance values to clinical practice. Therefore, our objective was to determine, in a meta-analysis, the diagnostic performance of the CO-RADS and the RSNA classification system in patients with clinically suspected COVID-19 infection.

The study adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guideline (10). In addition, the journal Radiology: Cardiothoracic Imaging was manually searched for potentially relevant publications. 
Publications which cited the original CO-RADS (5) and RSNA classification system for reporting COVID-19 pneumonia (6, 7) were also searched using the cited reference function in Web of Science and MEDLINE. The search was updated until August 24, 2020.

Original studies which provided data on the diagnostic performance of the CO-RADS or RSNA classification system in evaluating patients with clinically suspected COVID-19 infection, and in which reverse transcription polymerase chain reaction (RT-PCR) was the reference standard, were eligible for inclusion. Reviews and abstracts were excluded, as were studies that (a) included fewer than 10 patients, (b) reported insufficient data to compose a 2×2 contingency table to calculate sensitivity and specificity on a per-patient level for any CO-RADS or RSNA classification system threshold, or (c) only provided data on the performance of artificial intelligence-based analyses. When overlapping data were presented in more than one study, the study with the largest number of patients was selected. Titles and abstracts of retrieved studies were reviewed using the aforementioned selection criteria. The full-text version of each potentially eligible study was then reviewed to definitively determine if the study fulfilled the selection criteria.

For each included study, the main characteristics (country of origin, patient inclusion period, number of patients, age and sex of patients, clinical characteristics of included patients, CT protocol, CT interpreters, reference standard, and COVID-19 frequency) were extracted by two independent reviewers (R.M.K., radiologist, and H.J.A., third-year resident in radiology). If data from multiple readers were reported, only data from the first reader were extracted and used for the analyses. The number of patients with and without COVID-19 according to the different CO-RADS and the RSNA classification categories was also extracted. 
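The per-patient 2×2 contingency table mentioned in the selection criteria can be derived from the extracted category counts by treating every category at or above a threshold as test-positive. The following is a minimal sketch of that step, not the authors' code; the function name `sens_spec_at_threshold` and the example counts are hypothetical and serve only as an illustration.

```python
from typing import Sequence, Tuple


def sens_spec_at_threshold(
    covid_pos: Sequence[int],
    covid_neg: Sequence[int],
    threshold: int,
) -> Tuple[float, float]:
    """Collapse ordered category counts into a 2x2 table at `threshold`.

    covid_pos[i] / covid_neg[i] hold the number of RT-PCR-positive /
    RT-PCR-negative patients in category i + 1 (e.g. CO-RADS 1..5).
    A scan counts as test-positive when its category is >= threshold.
    """
    cut = threshold - 1
    tp = sum(covid_pos[cut:])   # RT-PCR positive, category >= threshold
    fn = sum(covid_pos[:cut])   # RT-PCR positive, category < threshold
    fp = sum(covid_neg[cut:])   # RT-PCR negative, category >= threshold
    tn = sum(covid_neg[:cut])   # RT-PCR negative, category < threshold
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts for illustration only (not data from any included study).
example_pos = [5, 5, 10, 30, 50]
example_neg = [40, 30, 15, 10, 5]

for t in (3, 4, 5):
    sens, spec = sens_spec_at_threshold(example_pos, example_neg, t)
    print(f"CO-RADS >= {t}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

With any such counts, raising the threshold can only lower sensitivity and raise specificity, because fewer scans are called positive. The same applies to the RSNA categories when they are encoded as an ordered list of four.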
Data on interobserver or intraobserver agreement using the CO-RADS and the RSNA classification system were also extracted. Any discrepancies were solved by consensus with a third reviewer (T.C.K., radiologist).

The quality of included studies was assessed by two independent reviewers (R.M.K. and H.J.A.) using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool, which comprises four key items: patient selection, index test, reference standard, and flow and timing (11). Any discrepancies were solved by consensus with a third reviewer (T.C.K.).

The frequency of COVID-19 in each of the categories of the CO-RADS and the RSNA classification system was calculated for each individual study and pooled with a random-effects model. Sensitivity and specificity of the CO-RADS and RSNA classification systems at specific diagnostic thresholds in detecting COVID-19 (ie, CO-RADS thresholds of at least 3, at least 4, and 5, and RSNA classification thresholds indeterminate and typical) were pooled using a bivariate random-effects model (12). The numbers were pooled in each CO-RADS and in each RSNA classification category separately. The same random-effects model was used for each study across the different categories. Cochran's Q and chi-squared tests were performed to test for heterogeneity between studies, which was defined as P < .10. Statistical analyses were performed using the Open Meta-Analyst software package (13) and the Meta-analysis of Diagnostic Accuracy Studies package in R software (14, 15).

Figure 1 displays the study selection process. A total of 182 studies were eligible for inclusion after searching databases. After screening titles and abstracts, 168 studies were excluded, leaving 14 studies that were potentially eligible for inclusion. 
After reading the full text of the 14 studies, three studies (16-18) were excluded because the diagnostic performance of either CO-RADS or the RSNA classification system was not investigated, one study (5) was excluded because no data on a per-patient level were reported, and another study (19) was excluded because there were overlapping data with another study (8) which comprised a larger number of patients. Nine studies were eventually included (8, 20-27). The main study characteristics are shown in Table 1 and Table E3.

Figure 2 provides a summary of the QUADAS-2 quality assessments. In one study (20), it was unclear whether patients were enrolled consecutively or randomly. There was no risk of bias with regard to patient selection in the other studies or with regard to index test. Risk of bias with respect to reference test was rated high in three studies (23, 26, 27) because repeated RT-PCR testing was not used in all patients with a negative initial RT-PCR result and persistent clinical suspicion of COVID-19. Risk of bias with respect to reference test was rated unclear in one study (21), because it was not clear whether all patients with an initial negative RT-PCR result and a persistent clinical suspicion of COVID-19 underwent repeated RT-PCR testing. In one study (20), there was potential risk of bias with regard to flow and timing, because the time interval between CT and RT-PCR testing was not reported. There was no risk of bias with regard to flow and timing in the other studies, because the maximum time interval between chest CT and RT-PCR did not exceed seven days (22). There were no applicability concerns.

The frequency of COVID-19 in each of the categories of CO-RADS is displayed in Table 2. With higher CO-RADS classification, the frequency of COVID-19 increased. Pooled frequencies of COVID-19 in CO-RADS categories 1, 2, 3, 4, and 5 were 8.8%, 11.1%, 24.6%, 61.9%, and 89.6%, respectively. 
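The pooled frequencies above come from a random-effects model, with heterogeneity between studies tested by Cochran's Q at P < .10. The sketch below illustrates one common way to do this: DerSimonian-Laird pooling of logit-transformed proportions. The source does not state which transformation or between-study variance estimator was used, so both are assumptions here, and all function names and example counts are hypothetical.

```python
import math
from typing import Dict, Sequence


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-squared variable, integer df >= 1."""
    z = x / 2.0
    # Start from the closed forms for df = 2 (a = 1) or df = 1 (a = 0.5) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # ... and step up with Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def pool_proportions_dl(events: Sequence[int], totals: Sequence[int]) -> Dict[str, float]:
    """DerSimonian-Laird random-effects pooling of proportions on the logit scale.

    Assumption: logit transform with a 0.5 continuity correction for 0% / 100% cells.
    """
    y, v = [], []
    for e, n in zip(events, totals):
        e, n = float(e), float(n)
        if e == 0 or e == n:
            e, n = e + 0.5, n + 1.0
        y.append(math.log(e / (n - e)))        # logit of the study proportion
        v.append(1.0 / e + 1.0 / (n - e))      # approximate variance of the logit

    w = [1.0 / vi for vi in v]                 # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))   # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0           # between-study variance

    w_re = [1.0 / (vi + tau2) for vi in v]     # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def inv_logit(t: float) -> float:
        return 1.0 / (1.0 + math.exp(-t))

    return {
        "pooled": inv_logit(mu),
        "ci_low": inv_logit(mu - 1.959964 * se),
        "ci_high": inv_logit(mu + 1.959964 * se),
        "q": q,
        "tau2": tau2,
        "p_heterogeneity": chi2_sf(q, df) if df > 0 else 1.0,
    }


# Hypothetical per-study counts for one category (illustration only).
result = pool_proportions_dl(events=[5, 30, 60], totals=[100, 100, 100])
heterogeneous = result["p_heterogeneity"] < 0.10   # criterion stated in the methods
print(f"pooled {result['pooled']:.1%} "
      f"(95% CI: {result['ci_low']:.1%}, {result['ci_high']:.1%}); "
      f"heterogeneity: {heterogeneous}")
```

The bivariate model used for sensitivity and specificity pools the two measures jointly and accounts for their correlation, which this univariate sketch does not attempt.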
Pooled sensitivity and specificity of the CO-RADS and the RSNA classification system at specific thresholds are displayed in Table 3. Pooled pairs of sensitivity and specificity using CO-RADS thresholds were the following: at least 3, 92.5% (95% CI: 87.1, 95.7) and 69.2% (95% CI: 60.8, 76.4); at least 4, 85.8% (95% CI: 78.7, 90.9) and 84.6% (95% CI: 79.5, 88.5); and 5, 70.4% (95% CI: 60.2, 78.9) and 93.1% (95% CI: 87.7, 96.2).

The frequency of COVID-19 in each of the categories of the RSNA classification system is displayed in Table 4.

For the CO-RADS, substantial to almost perfect interobserver agreement has been reported, with κ values of 0.648 to 0.773 (8) and intraclass correlation coefficients of 0.800 to 0.874 (20). For the RSNA classification system, moderate to substantial interobserver agreement has been reported, with κ values of 0.500 (23) and of 0.570 to 0.663 (8). None of the included studies reported data on intraobserver agreement.

This meta-analysis provides pooled data with regard to the frequency of patients with COVID-19 for each category of CO-RADS and the RSNA classification system in patients clinically suspected of having COVID-19 infection. With higher CO-RADS and RSNA classification category, the frequency of patients with COVID-19 increased. In CO-RADS 5, the prevalence of COVID-19 was 89.6%. In the RSNA category typical, the frequency of COVID-19 was 92.5%. We also provided sensitivity and specificity values for specific diagnostic thresholds. Using the lowest clinically meaningful thresholds of CO-RADS of at least 3

The methodological quality of the studies included in the current meta-analysis generally appears to be higher than that of studies included in prior meta-analyses (4, 9). In two prior meta-analyses, high risk of bias was present in all six included studies (100%) (4) and in ten of thirteen included studies (77%) (9). In our current meta-analysis, the "reference standard" was the only QUADAS-2 item which was deemed to be of high risk of bias. 
This item applied to three of the nine included studies (33%) because repeated RT-PCR testing was not used in all patients with a negative initial RT-PCR result and persistent clinical suspicion of COVID-19 (23, 26, 27).

If a low threshold is used (eg, any lung abnormality on chest CT is considered positive for COVID-19), virtually all COVID-19 cases with lung abnormalities will be correctly classified, but all non-COVID-19 cases with any lung abnormality at chest CT will be incorrectly classified as having COVID-19 (29). By applying standardized diagnostic criteria such as CO-RADS or the RSNA classification system, a higher proportion of non-COVID-19 cases with lung abnormalities due to other lung diseases will be correctly classified as having an alternative lung disease rather than COVID-19. It should be noted that the studies in our meta-analysis included patients between January and June 2020, a period with a high COVID-19 frequency (mean of 48.7%; range, 41.7-59.8%). Specificity is likely to decrease with lower COVID-19 frequency and increasing frequency of other viral lung infections such as influenza (30).

Our study has some limitations. First, the included studies used RT-PCR, which is an imperfect reference standard with a reported sensitivity of 89% (95% CI: 81, 94) (31). Sensitivity of RT-PCR appears to be lower in elderly patients (31), which may be due to sampling error in these patients, who are more likely to have poorer performance status (26). Furthermore, vendor-specific effects and differences in the quality assurance process may affect the performance of RT-PCR (31). infection (32-34). Second, because of the relatively low number of included studies, we did not perform subgroup or meta-regression analyses to explain statistical heterogeneity between studies. 
Geographical differences, non-reported prevalence of other lung diseases, interobserver variability in chest CT assessment, RT-PCR performance, and some methodological quality issues may have been potential sources of heterogeneity. Note that interobserver agreement varies from substantial to almost perfect for the CO-RADS (8, 20) and from moderate to substantial for the RSNA classification system (8, 23).

In conclusion, COVID-19 infection frequency was higher in patients categorized with higher CO-RADS and RSNA classification categories.

Note: The 95% CI is shown within parentheses for the pooled frequency.
* Data from the first reader.
† CO-RADS categories 1 and 2, and CO-RADS categories 4 and 5 were merged.
‡ CO-RADS 1, 2, 4, and 5 data from the study of Korevaar et al. (25) were not included in the pooled analysis.
§ Statistical heterogeneity between studies was defined as P<0.10.

* RSNA classification categories negative and atypical, and RSNA classification categories indeterminate and typical were merged.
† Data from the first reader.
‡ Data from the study of Falaschi et al. (22) were not included in the pooled analysis.

Supplemental Tables
Table E1. CO-RADS (adopted from reference (5)).

The Role of Chest Pandemic: A Multinational Consensus Statement From the Fleischner Society
Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China
Systematic Review and Meta-Analysis on the Value of Chest CT in the Diagnosis of Coronavirus Disease (COVID-19): Sol Scientiae, Illustra Nos
CO-RADS: A Categorical CT Assessment Scheme for Patients Suspected of Having COVID-19-Definition and Evaluation
Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA. Radiology: Cardiothoracic Imaging
Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA - Secondary Publication
Radiological Society of North America Chest CT Classification System for Reporting COVID-19 Pneumonia: Interobserver Variability and Correlation with RT-PCR
Suboptimal Quality and High Risk of Bias in Diagnostic Test Accuracy Studies on Chest Radiography and Computed Tomography in the Acute Setting of the COVID-19 Pandemic: A Systematic Review
Transparent reporting of systematic reviews and meta-analyses
QUADAS-2: a revised tool for the of diagnostic accuracy studies
Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews
The R Project for Statistical Computing
RSNA Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19: Interobserver Agreement Between Chest Radiologists
Detection of Unsuspected Coronavirus Disease 2019 Cases by Computed Tomography and Retrospective Implementation of the Radiological Society of North Society of Thoracic Radiology/American College of Radiology Consensus Guidelines
Chest imaging in patients with suspected COVID-19
CT Scanning in Suspected Stroke or Head Trauma: Is it Worth Going the Extra Mile and Including the Chest to Screen for COVID-19 Infection?
Evaluation of the Usefulness of CO-RADS for Chest CT in Patients
Diagnostic Performance of Chest CT for SARS-CoV-2
Chest CT accuracy in diagnosing COVID-19 during the peak of the Italian epidemic: A retrospective correlation with RT-PCR testing and analysis of discordant cases
Diagnostic Accuracy of North America Expert Consensus Statement on Reporting CT Findings in Patients with Suspected COVID-19 Infection: An Italian Single Center Experience
Chest CT for triage during COVIDthe emergency department: myth or truth?
Added value of chest computed tomography in suspected COVID-19: an analysis of 239 patients
Initial Results of the Use of a Standardized Diagnostic Criteria for Chest Computed Tomography Findings in Coronavirus Disease
Chest CT Imaging Signature of Coronavirus Disease
Coronavirus Disease 2019 and Chest CT: Do Not Put the Sensitivity Value in the Isolation Room and Look Beyond the Numbers
Comparison of the computed tomography findings in COVID-19 and other viral pneumonia in immunocompetent adults: a systematic review and meta-analysis
Diagnostic Performance of CT and Reverse Transcriptase Polymerase Chain Reaction for Coronavirus Disease 2019: A Meta-Analysis
World Health Organization. Laboratory testing for 2019 novel coronavirus (2019-nCoV) in suspected human cases
Advice on the use of point-of-care immunodiagnostic tests for COVID-19
National Institute for Public Health and the Environment. Testing for COVID-19