key: cord-0916014-448tamvr
authors: Hare, S. S.; Tavare, A. N.; Dattani, V.; Musaddaq, B.; Beal, I.; Cleverley, J.; Cash, C.; Lemoniati, E.; Barnett, J.
title: Validation of the British Society of Thoracic Imaging guidelines for COVID-19 chest radiograph reporting
date: 2020-06-17
journal: Clin Radiol
DOI: 10.1016/j.crad.2020.06.005
sha: 58a3e6cc15d8281f5aba1d7efa851747f57f2378
doc_id: 916014
cord_uid: 448tamvr

Abstract

AIM: To validate the British Society of Thoracic Imaging issued guidelines for the categorisation of chest radiographs for coronavirus disease 2019 (COVID-19) reporting regarding reproducibility amongst radiologists and diagnostic performance.

MATERIALS AND METHODS: Chest radiographs from 50 patients with COVID-19, and 50 control patients with symptoms consistent with COVID-19 from prior to the emergence of the novel coronavirus were assessed by seven consultant radiologists with regards to the British Society of Thoracic Imaging guidelines.

RESULTS: The findings show excellent specificity (100%) and moderate sensitivity (44%) for guideline-defined Classic/Probable COVID-19, and substantial interobserver agreement (Fleiss’ k=0.61). Fair agreement was observed for the “Indeterminate for COVID-19” (k=0.23), and “Non-COVID-19” (k=0.37) categories; furthermore, the sensitivity (0.26 and 0.14 respectively) and specificity (0.76, 0.80) of these categories for COVID-19 were not significantly different (McNemar’s test p=0.18 and p=0.67).

CONCLUSION: An amalgamation of the categories of “Indeterminate for COVID-19” and “Non-COVID-19” into a single “not classic of COVID-19” classification would improve interobserver agreement, encompass patients with a similar probability of COVID-19, and remove the possibility of labelling patients with COVID-19 as “Non-COVID-19”, which is the presenting radiographic appearance in a significant minority (14%) of patients.

In early 2020, the unprecedented surge in UK COVID-19 cases saw the chest radiograph (CXR) emerge as the frontline diagnostic imaging test, in conjunction with clinical history and key blood biomarkers: C-reactive protein (CRP) and lymphopenia.

The British Society of Thoracic Imaging (BSTI) developed a simple, internationally recognised CXR reporting template 1 to help facilitate consistency of reporting with embedded CXR reporting codes and allow retrospective radiology information system keyword searches for audit purposes. Frontline doctors have found this standardised reporting method a useful adjunct to clinical assessment, particularly when CXRs are "hot-reported". The BSTI reporting template has been incorporated into an NHS England (NHSE) endorsed radiology decision tool for suspected COVID-19 1,2. Moreover, CXR has also emerged as a pivotal triage tool in proposed

The aim of the present study was to validate the BSTI COVID-19 CXR classification criteria with regards firstly to their reproducibility amongst consultant radiologists involved in the front-line care of patients with COVID-19, and secondly, to their diagnostic utility against symptomatic control patients without COVID-19.

Consecutive adult patients with nose/throat swab RT-PCR-confirmed SARS-CoV-2 infection were identified from the microbiology database at Barnet General Hospital.

Fifty consecutive patients were selected following exclusion of patients <18 years (n=0); patients with multiple organisms identified on PCR (n=0); and patients without admission CXR available on the picture archiving and communication system (PACS; n=4). As a retrospective evaluation of routinely collected clinical data, ethical approval was not required.

Given the limited application of RT-PCR testing at this time in England, and the reported non-trivial false-negative rate of RT-PCR testing for SARS-CoV-2 6, it is difficult to identify patients who are definitively negative for SARS-CoV-2; therefore, a control cohort of patients was selected from November 2019, prior to the emergence of SARS-CoV-2. Fifty consecutive adult patients with symptoms consistent with COVID-19 (new cough and fever) and available admission chest radiograph were selected from the PACS records.

Images were anonymised regarding both patient identifiable data and date of image acquisition and stored in a random order on the Trust's PACS.

Seven consultant radiologists (median length of time on the specialist register 10 years, range 1-22 years) were recruited to participate in the study. They received training consisting of a review of the educational material available on the BSTI website 7 regarding COVID-19 classification of CXRs. Participants were informed of the presenting complaint (new cough and fever, query COVID-19), and asked to categorise each CXR according to the BSTI guidelines 7 (Fig. 1).

Patient demographics are displayed in Table 1. Patients with SARS-CoV-2 infection were significantly older than those without, but there was no significant difference in gender. Although neither lymphocyte count nor the frequency of lymphopenia (defined as lymphocyte count <1 × 10⁹ l⁻¹) differed significantly between COVID-19 patients and controls, patients with COVID-19 had significantly greater CRP levels at presentation.

Amongst all radiologists, overall agreement of CXR categorisation was moderate (Fleiss' k=0.50). Agreement for individual diagnostic categories was substantial for 'Classic/Probable COVID-19' (k=0.61) and 'Normal' (k=0.68) 9. Fair agreement was observed for the 'Indeterminate for COVID-19' (k=0.23) and 'Non-COVID-19' (k=0.37) categories. Post-hoc combination of the 'Indeterminate for COVID-19' and 'Non-COVID-19' codes into a single category was associated with improved interobserver agreement (k=0.58).
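As an illustration of how agreement statistics of this kind can be computed, the following is a minimal Python sketch of Fleiss' kappa. It is not the authors' code (the software used is not stated in the text), and the function names `fleiss_kappa`, `category_kappa` and `agreement_label` are hypothetical. The descriptive bands are the commonly used Landis and Koch labels, which is an assumption about the scale behind the terms "fair", "moderate" and "substantial" used here.

```python
from typing import List, Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned radiograph i to category j."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every radiograph must be rated by the same number of raters")
    n_categories = len(counts[0])
    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each radiograph, then averaged.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Agreement expected by chance alone.
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


def category_kappa(counts: Sequence[Sequence[int]], j: int) -> float:
    """Kappa for a single category, obtained by collapsing the table
    to 'category j' versus 'any other category'."""
    collapsed: List[List[int]] = [[row[j], sum(row) - row[j]] for row in counts]
    return fleiss_kappa(collapsed)


def agreement_label(kappa: float) -> str:
    """Descriptive band (Landis and Koch convention, assumed here)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy data only (not study data): 4 radiographs, 3 raters, 2 categories.
perfect = [[3, 0], [0, 3], [3, 0], [0, 3]]
split = [[2, 1], [1, 2], [2, 1], [1, 2]]
print(fleiss_kappa(perfect), fleiss_kappa(split))
```

Merging two categories post hoc, as described for 'Indeterminate for COVID-19' and 'Non-COVID-19', corresponds to summing the two relevant columns of `counts` before calling `fleiss_kappa`.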

For the purposes of final classification of CXRs, scores from two fellowship-trained thoracic radiologists (xx, xx) were used, with disagreements arbitrated by consensus.

Agreement amongst these radiologists was almost perfect for "Classic/Probable COVID-19" (k=0.83), substantial for "Normal" (k=0.70), moderate for "Non-COVID-19" (k=0.50) and slight for "Indeterminate" (k=0.25). The final classifications of patients are given in Table 2. The "Classic/Probable COVID-19" category was associated with 100% specificity for COVID-19, and detected 44% of patients with RT-PCR-confirmed SARS-CoV-2 infection. Normal CXRs were significantly more frequent in controls (p<0.001 after adjustment for multiple testing), but still occurred in 16% of patients with RT-PCR-confirmed SARS-CoV-2 infection. The frequency of "Indeterminate for COVID-19" and "Non-COVID-19" chest radiographs was not significantly different between COVID-19 patients and controls; indeed, the sensitivity (0.26 and 0.14, respectively) and specificity (0.76 and 0.80) of these categories for COVID-19 were not significantly different (McNemar's test p=0.18 and p=0.67).
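The comparison of the two categories by McNemar's test can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the counts are back-calculated from the sensitivities (0.26, 0.14) and specificities (0.76, 0.80) reported in the abstract, assuming 50 patients per group, and the uncorrected chi-squared form of the test is assumed because the variant used is not stated. Since a radiograph can be placed in only one category, every radiograph in either category is a discordant pair. The function name `mcnemar_chi2` is hypothetical.

```python
import math
from typing import Tuple


def mcnemar_chi2(b: int, c: int) -> Tuple[float, float]:
    """McNemar's test without continuity correction.

    b and c are the two discordant-pair counts. Returns the test
    statistic and its p-value on 1 degree of freedom."""
    if b + c == 0:
        raise ValueError("no discordant pairs")
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Assumed counts, back-calculated from the reported proportions (n=50 per group).
# RT-PCR-positive patients: 13 "Indeterminate" (0.26), 7 "Non-COVID-19" (0.14).
stat_sens, p_sens = mcnemar_chi2(13, 7)
# Controls: 12 "Indeterminate" (1 - 0.76), 10 "Non-COVID-19" (1 - 0.80).
stat_spec, p_spec = mcnemar_chi2(12, 10)

print(round(p_sens, 2), round(p_spec, 2))
```

Under these assumed counts the sketch gives p-values that round to 0.18 and 0.67, in line with the values reported in the abstract.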

The results of the present study demonstrate that the BSTI "Classic/Probable COVID-19" categorisation is very specific and moderately sensitive for patients with RT-PCR-confirmed SARS-CoV-2 pulmonary infection on admission CXR, as opposed to symptom-matched controls. Furthermore, this classification is substantially agreed upon by consultant radiologists.

A significant minority of patients in this study with SARS-CoV-2 infection presented with normal CXRs, findings that reinforce the BSTI "Normal" categorisation, which states that COVID-19 cannot be excluded and that RT-PCR may be required. These results, however, highlight that some refinement of the BSTI COVID-19 classification criteria may be needed, specifically the categories of "Indeterminate for COVID-19" and "Non-COVID-19". Only fair interobserver agreement was observed for these categories, which in the case of "Indeterminate for COVID-19" fell to slight agreement when only the categorisations of fellowship-trained thoracic radiologists were used. In addition, these categories have similar diagnostic performance regarding SARS-CoV-2 infection.

The potential need for an iteration of these two categories is highlighted by an inherent overlap of CXR appearances between "Indeterminate for COVID-19" and "Non-COVID-19" categories. Examples of this overlap exist in patients with limited or unilateral consolidation, which could be SARS-CoV-2 or bacterial in aetiology; and in patients with multiple radiological abnormalities, for example, fluid overload and alveolar opacity.

In equivocal cases, it is expected that radiologists may also reasonably be informed by the pre-test probability of SARS-CoV-2 infection in assigning cases to the "Indeterminate" or "Non-COVID" categories. Thus, the categorisation of the same radiograph may differ depending on whether the patient presented in the first peak of the COVID-19 pandemic in London, when up to 80% of emergency department admissions were COVID related, as opposed to during a relative trough.

Inclusion of non-diagnostic examinations in the "Indeterminate for COVID-19" category also adds variation to this group of patients, compounding interobserver variation with regards to diagnosis with that regarding acceptable film quality. A non-diagnostic examination should be reported as such, and no statement about COVID-19 classification is necessary or possible in this situation. There is also variability in the recommendation implied by this diagnostic category; patients with a non-diagnostic film are more likely to benefit from a repeated attempt at imaging, whereas those with a diagnostic-quality examination revealing a non-specific abnormality may not.

As the prevalence of COVID-19 increases and as health-seeking behaviours of the population respond to the presence of a pandemic, atypical radiographic presentations of SARS-CoV-2 infection will become more frequent relative to noncoronavirus disease. Indeed, 14% of patients in this study with SARS-CoV-2 infection had a CXR categorised as "Non-COVID". The amalgamation of the "Indeterminate for COVID-19" and "Non-COVID-19" categories into a single "not classic of COVID-19" category would have several advantages. Firstly, this category would increase consultant radiologist agreement. Secondly, the category would encompass a group of patients with similar probability of SARS-CoV-2 infection.

Thirdly, it would remove the possibility of potentially mislabelling patients with SARS-CoV-2 infection as "Non-COVID-19". A subtle distinction is noted here: COVID-19 is the pulmonary infection caused by SARS-CoV-2, whereas chest radiographic abnormalities secondary to SARS-CoV-2, but not to infection per se (for example, pulmonary oedema secondary to SARS-CoV-2 myocarditis), are, strictly speaking, not COVID-19. It is fair to assume this distinction may not be appreciated by all, and the "Non-COVID-19" terminology could be potentially misleading regardless of the aetiology of SARS-CoV-2-induced pulmonary abnormality.

The present study has a number of limitations. Firstly, performance of a given test varies with disease prevalence. In the present study, patients and controls were matched at a 1:1 ratio. Throughout the height of the pandemic, the authors' anecdotal experience is of patients with COVID-19 outnumbering those without. In order to minimise the effect of varying prevalence of SARS-CoV-2 on the results, only sensitivity and specificity have been presented, which are statistics independent of disease prevalence. Using PCR as the reference standard diagnosis in a study examining radiological diagnosis is necessary to avoid incorporation bias, but introduces the biases of PCR testing strategy, namely towards those patients with more severe disease requiring admission. This may have the effect of overstating the sensitivity of the "Classic/Probable COVID" category, but should not affect the specificity. It is currently uncertain whether patients with false-negative SARS-CoV-2 RT-PCR have a radiological phenotype distinct from that of those who are RT-PCR positive; this could also introduce bias into the data.
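The point about prevalence can be made concrete with a short Python sketch. It takes the sensitivity (0.44) and specificity (1.00) reported for the "Classic/Probable COVID-19" category as fixed and shows how the predictive values, which are not reported in this study, would shift with prevalence. The prevalence values are illustrative assumptions: 0.5 mirrors the 1:1 study design and 0.8 echoes the peak emergency department figure mentioned in the discussion. The function name `predictive_values` is hypothetical.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Positive and negative predictive values from Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    # Guard against division by zero when a test outcome never occurs.
    ppv = true_pos / (true_pos + false_pos) if true_pos + false_pos > 0 else float("nan")
    npv = true_neg / (true_neg + false_neg) if true_neg + false_neg > 0 else float("nan")
    return ppv, npv


# Reported performance of the "Classic/Probable COVID-19" category.
SENSITIVITY = 0.44
SPECIFICITY = 1.00

# Illustrative prevalence values (assumptions, see the text before this block).
ppv_50, npv_50 = predictive_values(SENSITIVITY, SPECIFICITY, 0.5)
ppv_80, npv_80 = predictive_values(SENSITIVITY, SPECIFICITY, 0.8)

print(f"prevalence 0.5: PPV={ppv_50:.2f}, NPV={npv_50:.2f}")
print(f"prevalence 0.8: PPV={ppv_80:.2f}, NPV={npv_80:.2f}")
```

With the same sensitivity and specificity, the negative predictive value falls from about 0.64 at a prevalence of 0.5 to about 0.31 at a prevalence of 0.8, which is why only the prevalence-independent statistics are presented.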

In conclusion, the results of the present study demonstrate variable performance of the BSTI COVID-19 classification criteria. The guideline-defined "Classic/Probable" appearance of COVID-19 has excellent specificity and moderate sensitivity for SARS-CoV-2 pulmonary infection, and furthermore, is associated with substantial interobserver agreement. The categories of "Indeterminate for COVID-19" and "Non-COVID-19", however, suffer from greater interobserver variability, and furthermore, have similar sensitivity and specificity for COVID-19. The authors suggest an amalgamation of these categories into a "not classic of COVID-19" category, which would increase interobserver agreement; encompass patients with similar probability of COVID-19; and remove the potential for mislabelling patients with SARS-CoV-2 infection as "Non-COVID".

2. NHS England. Specialty guides for patient management during the coronavirus pandemic: Clinical guide for the management of radiology patients during the coronavirus pandemic

Managing high clinical suspicion COVID-19 inpatients with negative RT-PCR: A pragmatic and limited role for thoracic CT

Chest radiographic and CT findings of the 2019 novel coronavirus disease (COVID-19): analysis of nine patients treated in Korea

Frequency and distribution of chest radiographic findings in COVID-19 positive patients

Thoracic imaging in COVID-19 infection. Guidance for the Reporting Radiologist British Society of Thoracic Imaging

The authors would like to thank Mr Jack Gaskell for his help in image anonymisation. 

Lymphopenia defined as lymphocyte count <1.0 × 10⁹ L⁻¹. CRP, C-reactive protein; SD, standard deviation; IQR, interquartile range.

• Classic COVID-19 on chest radiograph is very specific for SARS-CoV-2
• There is substantial interobserver agreement for Classic COVID-19
• There is only fair agreement for Indeterminate and Non-COVID appearances
• Indeterminate and Non-COVID categories have a similar probability of SARS-CoV-2
• These categories should be amalgamated into a 'Not Classic for COVID' category

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Dr Hare is on the committee of the British Society of Thoracic Imaging.