key: cord-337507-cqbbrnku
authors: Cozzi, Andrea; Schiaffino, Simone; Arpaia, Francesco; Pepa, Gianmarco Della; Tritella, Stefania; Bertolotti, Pietro; Menicagli, Laura; Monaco, Cristian Giuseppe; Carbonaro, Luca Alessandro; Spairani, Riccardo; Paskeh, Bijan Babaei; Sardanelli, Francesco
title: Chest x-ray in the COVID-19 pandemic: Radiologists’ real-world reader performance
date: 2020-09-10
journal: Eur J Radiol
DOI: 10.1016/j.ejrad.2020.109272
sha:
doc_id: 337507
cord_uid: cqbbrnku

PURPOSE: To report real-world diagnostic performance of chest x-ray (CXR) readings during the COVID-19 pandemic.

METHODS: In this retrospective observational study we enrolled all patients presenting to the emergency department of a Milan-based university hospital from February 24th to April 8th 2020 who underwent nasopharyngeal swab for reverse transcriptase-polymerase chain reaction (RT-PCR) and anteroposterior bedside CXR within 12 h. A composite reference standard combining RT-PCR results with phone-call-based anamnesis was obtained. Radiologists were grouped by CXR reading experience (Group-1, >10 years; Group-2, <10 years), and diagnostic performance indexes were calculated for each radiologist and for the two groups.

RESULTS: Group-1 read 435 CXRs (77.0% disease prevalence): sensitivity was 89.0%, specificity 66.0%, accuracy 83.7%. Group-2 read 100 CXRs (73.0% prevalence): sensitivity was 89.0%, specificity 40.7%, accuracy 76.0%. During the first half of the outbreak (195 CXRs, 66.7% disease prevalence), overall sensitivity was 80.8%, specificity 67.7%, and accuracy 76.4%, with Group-1 showing sensitivity similar to Group-2 (80.6% versus 81.5%, respectively) but higher specificity (74.0% versus 46.7%) and accuracy (78.4% versus 69.0%).
During the second half (340 CXRs, 81.8% prevalence), overall sensitivity increased to 92.8%, specificity dropped to 53.2%, and accuracy increased to 85.6%. This pattern was mirrored in both groups, with decreased specificity (Group-1, 58.0%; Group-2, 33.3%) but increased sensitivity (92.7% and 93.5%) and accuracy (86.5% and 81.0%, respectively).

CONCLUSIONS: Real-world CXR diagnostic performance during the COVID-19 pandemic showed overall high sensitivity, with higher specificity for more experienced radiologists. The increase in accuracy over time strengthens the role of CXR as a first-line examination in suspected COVID-19 patients.

Since the start of the COVID-19 pandemic, international recommendations [1, 2] have stated that the diagnosis of SARS-CoV-2 infection should primarily rely on viral testing rather than on chest imaging. This endorsed reference standard, i.e. reverse transcriptase-polymerase chain reaction (RT-PCR) on nasal or throat swabs, has become essential in the triage and monitoring phases of patients with suspected SARS-CoV-2 infection [3], but is encumbered by a sensitivity ranging between 38% and 89% [4] [5] [6]. Moreover, during the pandemic peak, RT-PCR response times often became incompatible with appropriate triaging and management of the high number of suspected COVID-19 cases simultaneously presenting to emergency departments [7] [8] [9], forcing the incorporation of imaging in the diagnostic pathway to compensate for both of the aforementioned RT-PCR shortcomings [2, 10, 11]. While the use of chest CT, even as a triaging test, was almost ubiquitous [11] [12] [13], both initial reports from China and a recent meta-analysis highlighted its low specificity [14]. Therefore, an ever-growing number of institutions have come to prefer chest x-ray (CXR), also taking into account that it can be performed with portable equipment in isolation rooms [15] or even in external settings [16].
Such a choice also minimizes potential contact between patients and operators, as well as with other patients [15] [16] [17] [18]. This has been the case in our hospital, located less than 25 miles from the first pandemic hotspot in Lombardy, Italy.

Apart from small-scale case series [19, 20], three major retrospective studies have so far evaluated the diagnostic performance of CXR performed as a triaging test on emergency department admission [21] [22] [23]. The two largest by far are a retrospective review by a single radiologist of 518 CXRs acquired during the first phase of the pandemic peak (from March 1st to March 15th), with a resulting overall sensitivity of 57% [22], and a study coming from our group and performed on 535 patients [23]. In our analysis we instead considered the

This retrospective observational study was approved by the Ethics Committee of BLINDED and performed between February 24th and April 8th, 2020, at BLINDED, a university hospital mainly focusing on cardiovascular diseases but promptly converted to a primarily COVID-19-dedicated hospital during the pandemic peak. We included in this study all patients presenting to our emergency department for suspected SARS-CoV-2 infection who underwent both a nasopharyngeal swab for RT-PCR and an anteroposterior bedside CXR within 12 hours of admission.

At our hospital, CXRs are reported by the on-duty radiologist within about 60-90 minutes if performed during the day shift (07:00-20:00), and at the beginning of the following working day if performed during the night shift (20:00-07:00). Because the high number of patients incessantly presenting to the emergency department during the pandemic peak in our region delayed the availability of RT-PCR results, all CXRs in the study period were reported by radiologists who were necessarily blinded to RT-PCR results.
For this study's purposes, as previously described [23], we then built a composite reference standard to improve RT-PCR sensitivity, by combining RT-PCR results with phone-call-based complete anamnesis in RT-PCR-negative patients who had not repeated the swab during hospitalization.

Considering the rather unspecific nature of CXR findings in patients with COVID-19 pneumonia, a radiologist with 5 years of experience in CXR interpretation (BLINDED) reviewed all routine CXR reports, blinded to the original radiologists' signatures, in order to classify them dichotomously as positive or negative for COVID-19. The absence of pulmonary abnormalities on a CXR determined its classification as negative, while the presence of interstitial infiltrates (associated or not with alveolar infiltrates) with predominantly bilateral and basal distribution implied its classification as a positive examination [1, 2, 11]. Conversely, CXR findings unrelated to COVID-19, such as lobar alveolar infiltrates (typically associated with bacterial pneumonia), pleural effusion, or pneumothorax, were considered non-COVID-19-related findings for the purpose of this dichotomization.

We grouped the seven radiologists from our department by their CXR reading experience: Group 1 included 4 radiologists (R1, R2, R3, and R4) with 10 or more years of experience in CXR reading; Group 2 included 3 radiologists (R5, R6, and R7) with less than 10 years of experience in CXR reading. All radiologists were board-certified: if a resident was in charge of drafting a first version of the report, the report was always checked by a board-certified radiologist, who also signed the final version. Only one of the seven radiologists (in Group 1) is particularly dedicated to breast imaging, but he spends at least half of his time practising as a general radiologist.
Overall and patient-sex-specific diagnostic performance indexes were calculated for each radiologist and for the two groups over the 6-week timeframe and according to the first and second half of all CXRs read by each radiologist. Data were presented as sensitivity, specificity, positive predictive value, negative predictive value, accuracy, positive likelihood ratio, and negative likelihood ratio, each with its 95% confidence interval (CI). Statistical analyses were performed using Microsoft Excel 2019 (Microsoft Corporation, Redmond, WA, USA).

In the six-week study period, R1 read 180 CXRs, with a 79% disease prevalence, R2 read

Figure 1 shows an example of a true positive and of a false negative case for both Group 1 and Group 2, while Table 1 details the overall performance indexes of all readers and Table 2 shows the results of reader performance evaluation according to patient subgroups and different timeframes (i.e. the first and second three-week periods). Considering the first half and the second half of all CXRs read by each radiologist, we observed an increase in disease prevalence for 5 out of 7 readers: disease prevalence in the CXR subset increased from 77% to 81% for R1, from 64% to 77% for R2, from 86% to 90% for R4, from 70% to 86% for R5, and from 64% to 77% for R6, while it decreased from 85% to 75% for R3 and from 71% to 43% for R7. Group Table 3

The role of CXR in COVID-19 imaging could be paramount in settings with temporarily or permanently limited RT-PCR availability, as anticipated by Murphy et al. [24], who also warned against the potentially low diagnostic performance of CXR when reported by radiologists not dedicated to chest imaging.
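The performance indexes listed in the statistical analysis all derive from a 2x2 table that crosses the dichotomized CXR classification with the composite reference standard. The following is a minimal Python sketch of that calculation, not the authors' Excel workflow. The function names `performance_indexes` and `wilson_ci` are hypothetical, the Wilson score interval is an assumed CI method (the study does not state which one was used), and the Group 2 counts are back-calculated from the reported percentages (100 CXRs, 73.0% prevalence, 89.0% sensitivity, 40.7% specificity) rather than taken from the study tables.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion.

    Assumption: the study does not state its CI method; Wilson is used
    here only as a common choice for proportions.
    """
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def performance_indexes(tp, fn, tn, fp):
    """Diagnostic performance indexes from a 2x2 table.

    Assumes a non-degenerate table (no zero denominators).
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "lr_plus": sens / (1 - spec),
        "lr_minus": (1 - sens) / spec,
    }


# Hypothetical counts, back-calculated from the reported Group 2 percentages:
# 73 reference-positive and 27 reference-negative CXRs out of 100.
group2 = performance_indexes(tp=65, fn=8, tn=11, fp=16)
spec_ci = wilson_ci(11, 27)  # CI for specificity: 11 true negatives of 27

for name, value in group2.items():
    print(f"{name}: {value:.3f}")
print(f"specificity 95% CI: {spec_ci[0]:.3f} to {spec_ci[1]:.3f}")
```

Under these back-calculated counts, the sketch returns the 89.0% sensitivity, 40.7% specificity, and 76.0% accuracy reported for Group 2, and the Wilson interval for specificity rounds to 25%-59%.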
Real-world data from this study, albeit conducted in a high-prevalence region and during a SARS-CoV-2 pandemic peak, seem to depict a more favourable scenario, in which radiologists with less than 10 years of experience matched the 89.0% sensitivity attained by radiologists with more than 10 years of experience, with similar disease prevalence in the CXR subsets read by each group (73% versus 77%, respectively). A non-negligible cost for Group 2 to attain such a sensitivity was a consistently lower specificity (41%, 95% CI 25%-59%), a value similar to the pooled specificity reported for chest CT by a meta-analysis of 3 studies from non-high-epidemic areas and 2 studies from high-epidemic areas (37%, 95% CI 26%-50%) [14], while Group 1 showed a smaller difference between sensitivity and specificity, with a constantly higher accuracy (Table 2).

This pattern was also observed when comparing different timepoints or the total number of CXRs read by each radiologist: between the first and second half of the six-week study period, overall accuracy increased from 76% to 86%, with corresponding increases in both Group 1 and Group 2; between the first and second half of CXRs read by each reader, overall accuracy increased from 81% to 84%, again with corresponding increases in both groups, albeit more pronounced in the less experienced Group 2 (1% difference for Group 1, 11% difference for Group 2). This trend was most likely driven in both groups by adaptation to the escalation of examined cases (from 195 in the first three weeks to 340 in the following three), with an increase in sensitivity and accuracy mirrored by a decrease in specificity. Of note, both groups included a comparable number of readers exhibiting an inverse tendency towards a decrease in accuracy (Figure 1) and sensitivity (Figure 2), reinforced by a decrease in specificity in all but one less-experienced reader (Figure 3).
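As a worked check on how these figures fit together, accuracy can be written as a prevalence-weighted combination of sensitivity and specificity. This is a general identity for a 2x2 table, not a formula stated in the study. The derivation below applies it to the rounded group-level values reported in the abstract.

```latex
\begin{align*}
\text{accuracy} &= \text{sensitivity} \times \text{prevalence}
                 + \text{specificity} \times (1 - \text{prevalence}) \\
\text{Group 1:}\quad & 0.890 \times 0.770 + 0.660 \times 0.230
                 = 0.6853 + 0.1518 = 0.8371 \approx 83.7\% \\
\text{Group 2:}\quad & 0.890 \times 0.730 + 0.407 \times 0.270
                 = 0.6497 + 0.1099 = 0.7596 \approx 76.0\%
\end{align*}
```

Both results agree with the reported accuracies. Because disease prevalence exceeded 70% in both groups, sensitivity carries most of the weight in this sum, which is consistent with accuracy rising in the second half of the study period even though specificity fell.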
Limitations of this study include its retrospective and monocentric nature, the fact that each radiologist read a different subset of images, and the imbalance in the number of CXRs read by Group 1 and Group 2, with the less experienced Group 2 reading 18.6% of all CXRs. However, the closely comparable disease prevalence between the two groups supports the comparability of the findings and seems to suggest a more pronounced influence of overall radiological experience on the diagnostic performance of each group. Such a hypothesis should be verified with a conventional multi-reader study, to ascertain whether these differences in diagnostic performance are also influenced by the number of COVID-19-positive CXRs read by each radiologist, or indeed result from a combination of these factors. However, we should also consider that any multi-reader study performed after a pandemic outbreak would not reproduce the conditions of the first outbreak, when the new disease first spread in a country.

Beyond a conventional multi-reader study, further evaluations of real-world diagnostic performance should also target the potential impact of various types of subspecialty radiological training and of centre-specific contingencies, such as the presence and employment of residents, different radiologist workloads, and disparities between CXR reporting conducted during day and night shifts. In addition, the results reported herein should be considered in light of the pandemic peak, with its very high disease prevalence, and might not be reproducible in low-prevalence settings [25, 26]. As this is a real-world data study, our results rely on a practical dichotomization of CXR reports: their potential generalizability must therefore be considered very carefully, especially when a patient with suspected COVID-19 has a CXR that is not typical for SARS-CoV-2 pneumonia.
Clinical translation of our findings would still result in at least two different scenarios, also taking into account the unspecific nature of CXR findings in COVID-19 pneumonia and other viral pneumonias. First, when a patient displays symptoms suspicious for COVID-19 that can however be explained by alternative pathological CXR findings pointing to another disease (such as pleural effusion, pneumothorax, or bacterial pneumonia), the management of the patient would remain that normally followed for the detected condition. Otherwise, if in a general situation of increased patient influx to emergency departments a patient presents with symptoms suspicious for COVID-19 but no suggestive CXR findings or other findings that can justify a COVID-19 diagnosis, the use of chest CT could be considered [2, 14]. However, taking into account the suboptimal diagnostic performance of chest CT, in particular its potentially low specificity and positive predictive value [14], if the patient's clinical conditions are stable and it is therefore possible to wait for RT-PCR confirmation of SARS-CoV-2 infection, preventive isolation would remain the safest approach.

To summarize, the real-world diagnostic performance of CXR during the COVID-19 pandemic peak reached a relatively well-balanced overall accuracy (76%-86%), with an 89% sensitivity and a specificity that was higher for the more experienced radiologists (66%) and lower for the less experienced radiologists (41%). These data support the use of CXR as a first-line examination when chest imaging is required in suspected COVID-19 patients during a pandemic peak.

The Role of Chest Imaging in Patient Management during the COVID-19 Pandemic: A Multinational Consensus Statement from the Fleischner Society
Use of Chest Imaging in the Diagnosis and Management of COVID-19: A WHO Rapid Advice Guide
Should RT-PCR be considered a gold standard in the diagnosis of Covid-19?
Positive rate of RT-PCR detection of SARS-CoV-2 infection in 4880 cases from one hospital in
Diagnosis of the Coronavirus disease (COVID-19): rRT-PCR or CT?
Laboratory diagnosis of emerging human coronavirus infections - the state of the art
Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
Detection profile of SARS-CoV-2 using RT-PCR in different types of clinical specimens: A systematic review and meta-analysis
Real-time RT-PCR in COVID-19 detection: issues affecting the results
Radiology Department Preparedness for COVID-19: Radiology Scientific Expert Review Panel
Integrated Radiologic Algorithm for COVID-19 Pandemic
Chinese expert consensus statement
A systematic review of chest imaging findings in COVID-19, Quant
Diagnostic Performance of CT and Reverse Transcriptase-Polymerase Chain Reaction for Coronavirus Disease 2019: A Meta-Analysis
Radiology department strategies to protect radiologic technologists against COVID19: Experience from Wuhan
Coronavirus (COVID-19) Outbreak: What the Department of Radiology Should Know
Frequency and Distribution of Chest Radiographic Findings in Patients Positive for COVID-19
Imaging evaluation of COVID-19 in the emergency department
The role of initial chest X-ray in triaging patients with suspected COVID-19 during the pandemic
Diagnostic impact of bedside chest X-ray features of 2019 novel coronavirus in the routine admission at the emergency department: case series from Lombardy region
COVID-19 on the Chest Radiograph: A Multi-Reader Evaluation of an AI System
Variation of a test's sensitivity and specificity with disease prevalence

This study was partially supported by Ricerca Corrente funding from Italian Ministry of Health to IRCCS Policlinico San Donato