key: cord-252784-wfsq0u9o
authors: Favot, Mark; Malik, Adrienne; Rowland, Jonathan; Haber, Brian; Ehrman, Robert; Harrison, Nicholas
title: Point-of-Care Lung Ultrasound for Detecting Severe Presentations of Coronavirus Disease 2019 in the Emergency Department: A Retrospective Analysis
date: 2020-07-31
journal: Crit Care Explor
DOI: 10.1097/cce.0000000000000176
sha:
doc_id: 252784
cord_uid: wfsq0u9o

OBJECTIVES: Analyze the diagnostic test characteristics of point-of-care lung ultrasound for patients suspected to have novel coronavirus disease 2019.
DESIGN: Retrospective cohort.
SETTING: Two emergency departments in Detroit, Michigan, United States, during a local coronavirus disease 2019 outbreak (March 2020 to April 2020).
PATIENTS: Emergency department patients receiving lung ultrasound for clinical suspicion of coronavirus disease 2019 during the study period.
INTERVENTIONS: None, observational analysis only.
MEASUREMENTS AND MAIN RESULTS: By a reference standard of serial reverse transcriptase-polymerase chain reactions, 42 patients were coronavirus disease 2019 positive, 16 negative, and eight untested (test results lost, died prior to testing, and/or did not meet hospital guidelines for rationing of reverse transcriptase-polymerase chain reaction tests). Thirty-three percent, 44%, 38%, and 17% had mortality, ICU admission, intubation, and venous or arterial thromboembolism, respectively. Receiver operating characteristics, area under the curve, sensitivity, and specificity with 95% CIs were calculated for five lung ultrasound patterns coded by a blinded reviewer and chest radiograph. Chest radiograph had area under the curve = 0.66 (95% CI, 0.54–0.79), 74% sensitivity (95% CI, 48–93%), and 53% specificity (95% CI, 32–75%).
Two lung ultrasound patterns had a statistically significant area under the curve: symmetric bilateral pulmonary edema (area under the curve, 0.57; 95% CI, 0.50–0.64), and a nondependent bilateral pulmonary edema pattern (edema in superior lung ≥ inferior lung and no pleural effusion; area under the curve, 0.73; 95% CI, 0.68–0.90). Chest radiograph plus the nondependent bilateral pulmonary edema pattern showed a statistically improved area under the curve (0.80; 95% CI, 0.68–0.90) compared to either alone, but at the ideal cutoff had sensitivity and specificity equivalent to nondependent bilateral pulmonary edema only (69% and 77%, respectively). The strongest combination of clinical, chest radiograph, and lung ultrasound factors for diagnosis was nondependent bilateral pulmonary edema pattern with temperature and oxygen saturation (area under the curve, 0.86; 95% CI, 0.76–0.94; sensitivity = 77% [58–93%]; specificity = 76% [53–94%] at the ideal cutoff), which was superior to chest radiograph alone.
CONCLUSIONS: Lung ultrasound diagnosed severe presentations of coronavirus disease 2019 with similar sensitivity to chest radiograph, CT, and reverse transcriptase-polymerase chain reaction (on first testing) and improved specificity compared to chest radiograph. Diagnostically useful lung ultrasound patterns differed from those hypothesized by previous, nonanalytical, reports (case series and expert opinion), and should be evaluated in a rigorous prospective study.
Key Words: COVID-19; diagnosis; emergency department; point of care; sensitivity and specificity; ultrasound

The coronavirus disease 2019 (COVID-19) pandemic is impacting the lives of nearly everyone around the world in ways that are difficult to comprehend. Clinicians caring for patients with suspected COVID-19 must consider how best to use various imaging tests to provide the most appropriate, individualized care possible (1). Unfortunately, diagnostic modalities, including chest radiograph (CXR) (2), CT (3), and reverse-transcription polymerase chain reaction (RT-PCR) on first test (3), have been reported to suffer from poor sensitivity. As a result, serial testing has been recommended (3) when any one of these modalities is negative, which increases the exposure of staff and patients to COVID-19 in the hospital while also potentially delaying diagnosis in critically ill patients. Point-of-care lung ultrasound (LUS) has been suggested as a useful diagnostic modality in these patients (4) as it limits COVID-19 exposure of ancillary staff, minimizes travel within the hospital for patients, can be performed at the bedside within minutes, and has been shown to be diagnostically superior to CXR in critically ill patients with other respiratory complaints (5).
LUS patterns for detecting COVID-19 have been suggested (4, 6) based on ultrasound (US) theory, case reports, and extrapolation from CT findings; however, diagnostic performance data in an observational analytical study are lacking (6). The objective of this study was to describe LUS findings in patients being evaluated for COVID-19 and retrospectively assess the diagnostic test characteristics of different LUS patterns.

We performed a retrospective study of a convenience sample of patients in two large urban emergency departments (EDs) in Detroit, Michigan, from March 13, 2020, to April 20, 2020. IRB approval was obtained as part of a larger COVID-19 registry at our institution. Patients with suspected COVID-19 who underwent a diagnostic LUS examination with images archived in the ED US database were eligible for inclusion; only patients with complete examinations (10 images, described below) were included.

With the exception of LUS performed solely to assess for pneumothorax, our standard ED LUS protocol is based on a prior trial of LUS in heart failure, which uses a horizontal probe orientation to maximize the amount of visualized pleural line (7). All images were obtained using a curvilinear probe on a Zonare Z1 Pro ultrasound system (Mindray North America, Mahwah, NJ) with a LUS preset: 18 cm depth, clip length of 6 seconds, and multibeam former and tissue harmonics deactivated. Four zones are interrogated in each hemithorax: superior and inferior in both the anterior and lateral chest (Supplemental Fig. 1, Supplemental Digital Content 1, http://links.lww.com/CCX/A248; legend, Supplemental Digital Content 2, http://links.lww.com/CCX/A263). Our standard LUS protocol is to scan patients in the supine position with head-of-bed elevated 30-45°; however, actual position was not recorded in this convenience sample of patients.
Assessment for pleural effusion was done by placing the probe in a vertical position (indicator to head) at the costal margin in the mid-axillary line such that both the lung and liver or spleen were visible.

Based on prior reports of LUS findings suggestive of COVID-19 lung disease (4-6), LUS images were coded by a blinded US fellowship-trained observer for the presence of nonconfluent and confluent B-lines (based on the same methodology used in the B-lines lung ultrasound-guided emergency department management of acute heart failure [BLUSHED-AHF] study above [7]), subpleural consolidations, and pleural effusions. Two lung zone patterns were also examined: symmetric bilateral B-lines (vs asymmetric, unilateral, or no B-lines) and nondependent bilateral pulmonary edema (NDBPE; bilateral B-lines with superior count ≥ inferior count and no pleural effusions). The NDBPE pattern was chosen based on the hypothesis that COVID-19 LUS findings may be similar to those seen in acute respiratory distress syndrome (ARDS). While multiple LUS findings were evaluated, no findings or patterns were a priori considered diagnostic for COVID-19 (i.e., this is a retrospective analysis of extant findings, not a prospective assessment of any specific pattern).

Demographics, vital signs, test results, hospital course, and other clinical characteristics were recorded. Sonographers were not specifically blinded to the results of other diagnostic tests. Concurrent point-of-care echocardiography was performed on an insufficient number of patients to meaningfully inform the analysis, and thus, results of these examinations were not included.

Test characteristics and receiver operating characteristic (ROC) area under the curve (AUC) for individual LUS patterns and CXR (pulmonary edema and/or infiltrate, as bilateral vs unilateral vs none) were compared to a reference standard of serial RT-PCR (3) in RStudio v1.2.5001 (RStudio, Boston, MA), using the pROC package (8).
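The NDBPE definition given above (bilateral B-lines, superior count ≥ inferior count, no pleural effusions) can be illustrated with a short sketch. This is not the coding procedure used by the blinded reviewer; the `HemithoraxScan` layout, the field names, and the choice to compare superior and inferior counts summed across both hemithoraces are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class HemithoraxScan:
    """B-line findings for one hemithorax (hypothetical data layout)."""
    superior_b_lines: int   # B-lines counted in the superior zones
    inferior_b_lines: int   # B-lines counted in the inferior zones
    pleural_effusion: bool  # effusion seen at the costal margin view


def is_ndbpe(left: HemithoraxScan, right: HemithoraxScan) -> bool:
    """Return True when the scan matches the NDBPE pattern.

    Assumption: "superior count >= inferior count" is evaluated on totals
    summed over both sides; the text does not say whether it is per side.
    """
    sides = (left, right)
    # Bilateral: each hemithorax shows at least one B-line.
    bilateral = all(s.superior_b_lines + s.inferior_b_lines > 0 for s in sides)
    # No pleural effusion on either side.
    no_effusion = not any(s.pleural_effusion for s in sides)
    superior_total = sum(s.superior_b_lines for s in sides)
    inferior_total = sum(s.inferior_b_lines for s in sides)
    return bilateral and no_effusion and superior_total >= inferior_total


if __name__ == "__main__":
    example = is_ndbpe(HemithoraxScan(4, 2, False), HemithoraxScan(3, 3, False))
    print("NDBPE pattern present:", example)
```

Under these assumptions, a scan with an effusion on either side, or with B-lines in only one hemithorax, is never coded as NDBPE regardless of the zone counts.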
Logistic regression was used to model the joint utility of CXR and the LUS pattern with highest AUC. AUCs were compared by DeLong test and ideal cutoffs calculated by Youden J statistic. In a post hoc exploratory analysis, we sought to derive the highest performing combination of potential diagnostic predictors (vital signs, laboratory tests, CXR, LUS) in a logistic model selected by examination of the Akaike information criterion, clinical plausibility, model parsimony, AUC, and the Hosmer-Lemeshow (HL) statistic. Physical examination findings were not considered in this step as they were not defined a priori, thereby precluding unbiased interpretation.

Complete case analysis (CCA) can overoptimistically bias prediction models when data are suspected missing at random because missingness is not only a research reality but also a clinical one (9). Thus, multiple imputation (MI) by fully conditional specification (m = 10) was performed in SAS v9.4 (SAS Institute, Cary, NC) for eight patients without RT-PCR and three without CXR. MI modeling for the response variable was isolated from predictors in downstream analyses, performed in two bootstrapped stages (9). To help protect against model overfitting and bias from MI, logistic models were fit to stage 1, and model performance measures (HL, ROC, diagnostic characteristics) were calculated on the bootstrapped stage 2 using a "pool-last" approach (10). All analyses were compared to CCA in sensitivity analysis.

Sixty-four patients underwent LUS as part of an evaluation for COVID-19. See Tables 1 and 2, respectively, for characteristics and outcomes. Fifty-six patients had RT-PCR testing for COVID-19, with positivity of 71% (95% CI, 60-83%). Median count of RT-PCR tests per patient was one in positive cases and two in negative cases. Nineteen of 20 patients with in-hospital mortality tested positive for COVID-19, while one died before testing completion.
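The ideal-cutoff selection by Youden J statistic named in the statistical methods can be sketched as follows. The study itself used the pROC package in R; the plain-Python function below, its name `youden_cutoff`, and the toy scores and labels are illustrative assumptions and are not study data.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    A case is called test-positive when its score is >= cutoff.
    labels: 1 = disease present, 0 = disease absent.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")

    best = None  # (J, cutoff, sensitivity, specificity)
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens = true_pos / positives
        spec = true_neg / negatives
        j = sens + spec - 1
        # Keep the first (lowest) cutoff on ties.
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Hypothetical predicted probabilities from a logistic model.
    toy_scores = [0.1, 0.2, 0.3, 0.5, 0.6, 0.8, 0.9]
    toy_labels = [0, 0, 0, 1, 0, 1, 1]
    print(youden_cutoff(toy_scores, toy_labels))
```

Youden J weights sensitivity and specificity equally, so the cutoff it selects may not be the preferred one when missed diagnoses and false alarms carry different clinical costs.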
Diagnostic test performance for COVID-19 diagnosis is described for CXR, LUS patterns, and the two multipredictor models by ROC plots (Fig. 1). Bilateral infiltrate/edema on CXR was 74% sensitive (95% CI, 48-93%), 53% specific (95% CI, 32-75%), with AUC 0.66 (95% CI, 0.54-0.79). The strongest performing LUS finding was the NDBPE pattern (AUC, 0.73; 95% CI, 0.61-0.84; sensitivity = 69% [95% CI, 37-82%], specificity = 77% [95% CI, 50-92%]). Symmetric bilateral B-lines showed modest univariate diagnostic discrimination for COVID-19 (Fig. 1A), while subpleural consolidation, confluent, and nonconfluent B-line patterns (Fig. 1B-D) failed to reach statistical significance (95% CI of AUC crossed 0.50). Combined CXR and NDBPE LUS pattern (Fig. 1G) showed significantly stronger diagnostic prediction (AUC, 0.80; 95% CI, 0.68-0.90) than either CXR or NDBPE alone (p = 0.035 and 0.020, respectively).

In the exploratory analysis, the optimal diagnostic combination of clinical factors, CXR, and LUS patterns was NDBPE, fever (temperature ≥ 38°C), and hypoxia (room air pulse oximetry ≤ 94%). No other tested combination of CXR, LUS, or clinical factors added performance or parsimony beyond this model (AUC, 0.86; 95% CI, 0.76-0.94; sensitivity = 77% [58-93%]; specificity = 76% [53-94%] at the ideal cutoff). The performance of the NDBPE/fever/hypoxia model did not differ significantly from that of the CXR/NDBPE model (p = 0.17) and was superior to CXR alone (p = 0.003) and NDBPE alone (p < 0.001). By contrast, a model of CXR/fever/hypoxia (AUC, 0.78; 95% CI, 0.67-0.89; sensitivity = 64% [43-80%]; specificity = 77% [48-95%]) had superior overall discrimination compared to CXR alone (p = 0.042) but not to NDBPE alone (p = 0.579), or to the combination of NDBPE and CXR (p = 0.827). At the ideal cutoffs, the CXR/fever/hypoxia model had similar specificity but inferior sensitivity (p = 0.003) to the NDBPE/fever/hypoxia model and even to NDBPE alone (p = 0.015).
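As a consistency check on the NDBPE figures reported above, note that a test with only two outcomes (pattern present or absent) has an ROC curve with a single interior point, so its trapezoidal AUC reduces to the mean of sensitivity and specificity. The sketch below works through that arithmetic; the function name `binary_test_auc` is an assumption for illustration, and the published estimates were obtained with multiple imputation and bootstrapping, so exact agreement is not guaranteed in general.

```python
def binary_test_auc(sensitivity: float, specificity: float) -> float:
    """Trapezoidal AUC of an ROC curve with one operating point.

    The curve runs (0, 0) -> (FPR, sensitivity) -> (1, 1), where
    FPR = 1 - specificity. The two trapezoid areas sum to
    (sensitivity + specificity) / 2.
    """
    fpr = 1.0 - specificity
    left_area = 0.5 * fpr * sensitivity                   # triangle from the origin
    right_area = 0.5 * (1.0 - fpr) * (sensitivity + 1.0)  # trapezoid up to (1, 1)
    return left_area + right_area


if __name__ == "__main__":
    # NDBPE pattern: sensitivity 69%, specificity 77% (values from the text).
    print(round(binary_test_auc(0.69, 0.77), 2))
```

With the reported sensitivity of 69% and specificity of 77%, this gives 0.73, matching the reported NDBPE AUC. The same shortcut does not apply to CXR, which was coded with three levels (bilateral, unilateral, none) and therefore has more than one operating point.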
Our results suggest that an NDBPE pattern on LUS offers additive diagnostic value to portable CXR in ED patients with suspected COVID-19. The NDBPE pattern was similarly sensitive to CXR, CT, and first-test RT-PCR (3), had improved specificity compared to CXR, can be rapidly performed at point of care, and minimizes ancillary staff exposure and patient transport. RT-PCR requires serial testing at this time (e.g., up to five tests to detect one positive patient in our sample), so a LUS-based strategy with the test characteristics we observed could be highly valuable for risk stratification, cohorting of infected patients within the hospital, early guidance in management decisions, and resource flexibility under pandemic conditions of resource scarcity. The earlier diagnostic certainty that can be achieved using a LUS-based imaging protocol could also offer front-line physicians some relief from the cognitive and psychological stresses associated with providing medical care during a pandemic.

Multiple prior studies, as well as a meta-analysis, have reported that LUS is superior to CXR for diagnosing many lung pathologies in critical illness, including alveolar interstitial syndrome (AIS) in ARDS (5). For example, Lichtenstein et al (9) reported a sensitivity and specificity of 98% and 88% for LUS versus 60% and 100% for CXR in AIS. In our study, neither modality performed as well as this. There are several possible explanations for this. First, there is varied severity of pulmonary involvement with COVID-19, and some patients may have minimal (or no) lung findings early in the disease process, thereby reducing sensitivity. Additionally, RT-PCR is an imperfect gold standard, and as such, a patient's result on this test may not accurately reflect disease status, thereby negatively impacting LUS test characteristics.
This study took place during the local peak of the pandemic, and thus, any patient who presented during this time with respiratory symptoms was likely to have been considered a potential COVID-19 patient, which could negatively impact specificity.

In contrast to 11 recent publications on LUS in COVID-19 consisting of case reports, letters to the editor, and expert opinion in mostly noncritically ill patients (6), our study offers analytical (albeit retrospective) observational evidence of LUS diagnostic performance in a cohort of patients with a range of severities and relatively high mortality rate. Subpleural consolidations, confluent versus nonconfluent B-lines, and basilar-predominant changes had been suggested to be useful for COVID-19 diagnosis in such reports (4, 6) but failed to reach statistically significant diagnostic discrimination here. Discrimination for the NDBPE pattern was strong, and when part of a simple clinical score including hypoxia and fever (Fig. 1), outperformed CXR. While we present data on a larger cohort, our results should nonetheless be considered hypothesis-generating. Prospective, external validation of our approach is needed before it is incorporated into routine practice. Future studies should also consider the effect of diverse clinical populations (i.e., mixed COVID-19/non-COVID-19 groups and broad clinical severity) on the accuracy of our approach, as well as the potential additive value of point-of-care echocardiography.

Our study has several limitations. First, retrospective design and convenience sampling mean that these results should be interpreted as hypothesis-generating. Our use of MI in eight cases without reference standard could have biased our findings, although inconsistent access to rapid COVID-19 RT-PCR is the current reality in many countries, including the United States. Difficulty in obtaining accurate and timely COVID-19 testing is a prime reason why a LUS-based strategy would be useful in the first place.
CCA results were equal or more favorable for LUS compared to results from MI. This is consistent with our overall MI modeling strategy, treating the uncertainties of MI modeling as additive to the uncertainty of missingness as a clinical reality (9), with an assumption that CCA was over-optimistic. As a gold standard, RT-PCR is imperfect, even when performed serially, and the diagnostic accuracy calculated may have been affected. We could not mandate or control for patient positioning because of the retrospective nature of this study. We chose to evaluate whether distribution of extravascular lung fluid in a nondependent pattern (superior lung zones having equal to or more fluid than inferior, gravity-dependent lung zones) would be predictive of COVID-19 lung disease because the majority of the patients seen in our ED early in the pandemic were ambulatory at arrival (even the critically ill). However, based on what is known about distribution of pulmonary edema in acute heart failure (AHF), lung water is able to change locations fairly rapidly as patient positioning changes (11). A follow-up study examining this phenomenon in COVID-19 patients would be prudent. Another potential limitation is that the extent to which LUS findings were purely acute versus acute-on-chronic is unknown. Lack of concurrently obtained echocardiography data likewise precludes understanding if left heart dysfunction contributed to findings on LUS. Additionally, the high in-hospital mortality rate of patients in our cohort may represent spectrum bias, with providers having been more likely to perform LUS in sicker patients. The decision to perform a LUS may also have been the result of knowledge of the results of other diagnostic tests by the sonographer, such as CXR. Finally, the LUS patterns observed in our study may not be representative of those with milder disease.
The findings described in the present study demonstrate that LUS has the potential to add value to the care of patients with suspected COVID-19, but useful patterns were different from what has been suggested in nonanalytical publications. Since this is a hypothesis-generating study, no firm conclusions can be made as to why one imaging pattern may have outperformed another; however, analysis of the case-mix of patients may provide some clues. As an example, subpleural consolidation on LUS had been hypothesized (4) as a potentially useful finding in COVID-19 due to utility of this pattern to identify viral and bacterial pneumonia other than COVID-19; however, we did not find this to be the case. It is therefore perhaps notable that eight of the 12 patients who were confirmed COVID-19 negative had a discharge diagnosis of pneumonia due to pathogens other than COVID-19. While we cannot rule out that these were false negatives for COVID-19 pneumonia (i.e., after serial testing), four of those eight had concomitant bacteremia with a pulmonary pathogen, one had pneumocystis pneumonia, one had confirmed invasive pulmonary aspergillosis, and one had confirmed influenza A. These seven of eight with a highly plausible alternative pulmonary infection could also demonstrate pleural LUS findings, and therefore evaluation for subpleural consolidation may simply have failed to distinguish these alternative infectious lung diseases from COVID-19. Furthermore, COVID-19 presentations were particularly severe in our report, with marked mortality rates and high rates of comorbidities. Consequently, an ARDS-like pulmonary edema picture of COVID-19 may have predominated the case-mix more so than the uncomplicated pneumonia expected in mild presentations of COVID-19, with the latter expected to be more consistent with LUS findings of consolidation compared to the former. 
While we consider the severity of COVID-19 presentations a strength of our report given a paucity of LUS data for critically ill COVID-19 patients (6), it thus also must be considered a limitation. Just as the predominance of a "less-sick" COVID-19 population in previous LUS reports (6) causes a spectrum bias toward results more specific to mild COVID-19, our comparatively ill cohort (and any selection bias that may have led to it) likely introduces a spectrum bias toward LUS findings more characteristic of severe presentations (e.g., ICU-admitted patients, with high mortality) (12).

By contrast, the NDBPE pattern that we describe performed well. There is precedent for the lack of a base-apex gradient on LUS as one factor differentiating pneumogenic pulmonary edema (specifically ARDS) from cardiogenic pulmonary edema (13). This is consistent with the NDBPE pattern we observed here, possibly by highlighting again that more severe presentations of COVID-19 involve the manifestation of bilateral increased extravascular lung fluid that is not hydrostatic in nature (i.e., an ARDS-type picture). Notably, four of the five COVID-19 negative patients who were diagnosed with AHF or volume overload from decompensated renal disease lacked the NDBPE pattern. It is possible then that the NDBPE pattern helped to differentiate critically ill COVID-19 patients from those with cardiogenic pulmonary edema or volume overload from renal failure (14), although this too is simply a hypothesis and needs testing in a prospective study. The differentiation of pneumogenic pulmonary edema (e.g., ARDS, viral pneumonitis) from cardiogenic edema has long been a challenge on LUS (15), and echocardiographic evaluation of filling pressures has proved useful to this end in the past.
Thus, concurrent ventricular filling pressures will also be needed to confirm the supposition that NDBPE on LUS can help rule in COVID-19 lung disease in part by screening out cardiogenic and renal pulmonary edema (15). Future research should test the hypotheses generated here with an explicit prospective design, inclusion of a broad spectrum of COVID-19 severity, multiple observations across the ED to ICU, and rigorous methods for diagnostic adjudication beyond the RT-PCR reference standard alone, which would likely include CT of the thorax if feasible.

COVID-19): A perspective from China
Imaging profile of the COVID-19 infection: Radiologic findings and literature review
Diagnostic performance between CT and initial real-time RT-PCR for clinically suspected 2019 coronavirus disease (COVID-19) patients outside Wuhan, China
Is there a role for lung ultrasound during the COVID-19 pandemic?
Diagnostic accuracy of chest radiograph, and when concomitantly studied lung ultrasound, in critically ill patients with respiratory symptoms: A systematic review and meta-analysis
Point-of-care lung ultrasound in patients with COVID-19 - a narrative review
Design and rationale of the B-lines lung ultrasound-guided emergency department management of acute heart failure (BLUSHED-AHF) pilot trial
The estimation and use of predictions for the assessment of model performance using large samples with multiply imputed data
Comparative diagnostic performances of auscultation, chest radiography, and lung ultrasonography in acute respiratory distress syndrome
pROC: An open-source package for R and S+ to analyze and compare ROC curves
Impact of patient positioning on lung ultrasound findings in acute heart failure
Spectrum bias-why clinicians need to be cautious when applying diagnostic test studies
Thoracic ultrasound and SARS-COVID-19: A pictorial essay
Diagnosing acute heart failure in the emergency department: A systematic review and meta-analysis
Critical care ultrasonography in acute respiratory failure

Drs. Favot and Harrison contributed equally as co-first authors. Dr. Harrison was responsible for analysis and interpretation of data. Drs. Favot, Malik, and Harrison were responsible for drafting of article. Drs. Favot, Ehrman, and Harrison were responsible for critical revision of article. Drs. Favot, Malik, Ehrman, and Harrison were responsible for study conception and design. All authors were involved in acquisition of data.

Supplemental digital content is available for this article. Direct URL citations appear in the HTML and PDF versions of this article on the journal's website (http://journals.lww.com/ccejournal).

The authors have disclosed that they do not have any potential conflicts of interest.