key: cord-1022936-9e3i7mfe authors: Born, J.; Beymer, D.; Rajan, D.; Coy, A.; Mukherjee, V. V.; Manica, M.; Prasanna, P.; Ballah, D.; Shah, P. L.; Karteris, E.; Robertus, J. L.; Gabrani, M.; Rosen-Zvi, M. title: On the Role of Artificial Intelligence in Medical Imaging of COVID-19 date: 2020-09-09 journal: nan DOI: 10.1101/2020.09.02.20187096 sha: 58198c9fe9e97ff639a744e2e60782288d6e334f doc_id: 1022936 cord_uid: 9e3i7mfe

During the COVID-19 pandemic, lung imaging plays a key role in addressing the magnified need for speed, cost-efficiency, ubiquity and precision in medical care. The rise of artificial intelligence has induced a quantum leap in medical imaging: AI has proven equipollent to healthcare professionals for several diseases and has the potential to save time, reduce cost and increase coverage. But AI-accelerated medical imaging must still fully demonstrate its ability to remediate diseases such as COVID-19. We identify key use cases of lung imaging for COVID-19, comparing CT, X-Ray and ultrasound imaging from clinical and AI perspectives. We perform a systematic, manual survey of 197 related publications that reveals a disparity in the focus of the AI and clinical communities, caused by data availability and the lack of collaboration, and in modality trends, driven by ubiquity. Last, challenges in AI acceleration and ways to remediate them are discussed, and future research goals are identified.

The COVID-19 pandemic has created a desperate need for ubiquitous, accurate, low-cost and fast tests, and lung imaging is a key complementary tool in the diagnosis and management of COVID-19 [1], [2]. To automate the processing of the deluge of clinical data, recent advances in artificial intelligence (AI) have been exploited by building deep learning (DL) models that match or exceed the performance of healthcare professionals across a diverse set of detection and monitoring tasks [3].
The field of AI in medical imaging (MI) is flourishing in the context of COVID-19, as is evident from our study revealing a doubling in the number of pre-print and published papers in 2020 compared to 2019 (see Fig. 1). Consequently, hopes are high that AI can advance rapid stratification of suspected COVID-19 patients by automatically classifying lung screenings, thus empowering medical staff, especially in resource-limited areas without testing facilities. In this paper, we review the current progress in the development of AI technologies for MI to assist in addressing the COVID-19 pandemic, discuss how AI meets the identified gaps and share observations regarding the speed, geographical diversity and relevancy of these developments.

Radiologists play a crucial role in interpreting medical images for the diagnosis and prognosis of disease. Although AI technologies have recently demonstrated performance matching radiologists' accuracy in a number of specific tasks, it remains unclear whether radiologists who adopt AI assistance will replace those who do not. Rapid progress of AI in healthcare has been accelerated by the availability of large, diverse and qualified datasets/annotations. For example, breast cancer screening has seen tremendous success in using AI for retrospective detection thanks to 10 years of effort in large-scale data collection, which resulted in OPTIMAM*, a database with a total cohort of >150,000 clients. Using this and other databases, several notable studies have been published comparing the performance of DL-based mammography screening solutions to healthcare professionals [4]-[7]. Similarly, significant progress has been made in diagnosing lung conditions using chest X-rays (CXR) and computed tomography (CT), driven by access to open-source datasets and labels.

* https://medphys.royalsurrey.nhs.uk/omidb/
For example, DL-based approaches outperform radiologists in detecting several pulmonary conditions from CXR [8] and malignancy of lung nodules in low-dose CT [9]. However, several key challenges limit the feasibility of adopting these solutions in practice, namely: (i) poor model generalization due to systemic biases, (ii) lack of model interpretability and (iii) non-scalable image annotation processes. The current pandemic has sparked a lively controversy about AI-based diagnosis. While "CT imaging with AI could serve as a surrogate for doctors when fast judgement is needed" [10], by May 2020 there was a lack of supporting evidence to justify such an approach [11]. This sheds a despondent light on the role of novel technologies in combating pandemics, where rapid solutions against new diseases are needed. We believe that AI can provide assistance in interpreting lung images in the context of COVID-19. The best systems, however, will have to incorporate demographics, hospitalization data and patient history as features along with imaging [12]. Notably, all systems until May 2020 were rated at high risk of bias and overfitting due to non-representative cohort selection along with limited access to imaging data and vague reporting [12]. However, several recent studies provide evidence of the value of AI-assisted disease management for diagnosis [13] and disease severity assessment [14]. The recent acceleration of publications intersecting AI and imaging for COVID-19 brings a need for rigorous comparative evaluation of papers to summarize and highlight trends to a broad clinical audience.
Previous review papers on COVID-19 either focused on a technical assessment of AI in imaging [15], elaborated on the role of imaging [1], or appraised predictive models agnostic of imaging [12]. In contrast, this paper attempts to bridge clinical and technical perspectives by providing a comprehensive overview to guide researchers towards the most pressing problems in automating lung image analysis for COVID-19. Our work uncovers insights by conducting a thorough meta-analysis of 197 COVID-19-related publications. We find that, despite strong evidence for the benefits of lung imaging in the containment of COVID-19, there is a lack of coordination among scientific efforts towards addressing valuable problems. For example, X-Ray data has received disproportionately high attention from AI researchers compared to clinicians, while exploration of AI on ultrasound is neglected. Moreover, about 80% of papers focused on designing AI models for detecting COVID-19, even though clinicians have called out its restricted utility for screening. Overall, we observe a two-to-fourfold increase in the number of publications related to AI for lung imaging in Q2 of 2020, causally evoked by COVID-19. The paper is organized as follows: sections 1 and 2 discuss the clinical role of lung imaging for COVID-19, an overview of lung imaging modalities and the current state of the art in AI. In section 3, we uncover insights and trends from the meta-analysis of 197 COVID-19-related publications. Finally, sections 4 and 5 describe key challenges for clinical adoption of AI, summarize our analysis and present recommendations for the future. Throughout this paper we focus on the lung as the primary organ of SARS-CoV-2 infection but note the significance of extrapulmonary manifestations, which include thrombosis, gastrointestinal symptoms, kidney injury, and dermatologic and neurological complications [16].
The two most important use cases of lung imaging for the COVID-19 pandemic are diagnosis/detection and severity assessment. For a more detailed elaboration on the role of imaging for COVID-19, see [1]. COVID-19 may present with diverse symptoms, most commonly flu-like symptoms including dry cough, rhinorrhoea, fever, fatigue and a sore throat [2]. To address the heterogeneity of the clinical presentation, a variety of tests is necessary. The most prominent test relies on the identification of viral RNA by reverse transcriptase polymerase chain reaction (RT-PCR). The viral genetic material is obtained from nasopharyngeal swabs, and the test is considered accurate though imperfect [2]. Serological tests identifying anti-SARS-CoV-2 IgM and IgG antibodies are also used [17], but as antibody development can take up to 5 days in humans, they are inappropriate for detection of early infections [18]. Over the disease course, the sensitivity of RT-PCR peaks at 0.8, reached 8 days after infection [19]. Compared to diagnostic tests, CT imaging may be more sensitive at a single time point for the diagnosis of COVID-19 [20]. However, while chest CT may have improved sensitivity, the American College of Radiology (ACR) conservatively discourages the use of imaging for diagnosis †. The two main diagnostic challenges are the differentiation from non-COVID-19 viral pneumonia [21] and the detection of asymptomatic patients with unaffected lungs. Early in the pandemic, chest CT was used to identify radiological manifestations [22]. Currently, monitoring disease progression is less important in patient management, apart from diagnosing complications such as pulmonary embolisms and superimposed bacterial pneumonias, or sequelae of intubation like pneumothoraces [23]. Imaging findings of COVID-19 patients have been shown to correlate with disease severity [24].

† https://www.acr.org/Advocacy-and-Economics/ACR-Position-Statements/Recommendations-for-Chest-Radiography-and-CT-for-Suspected-COVID19-Infection
Retrospective analysis of chest CT imaging findings, compared against disease severity, revealed increased occurrence of consolidation, linear opacities, crazy-paving pattern and bronchial wall thickening in severe patients at a higher frequency than in non-severe COVID-19 patients. The clinical criteria of severity included a respiratory rate greater than 30 breaths per minute, oxygen saturation of 93% or less in a resting state, a PaO2/FiO2 ratio of 300 mmHg or less, respiratory failure or a requirement for mechanical ventilation, and shock or organ failure requiring intensive care unit monitoring and treatment [22].

In this section, we provide an overview of CT, X-ray and ultrasound imaging. Per modality, we highlight recent art in AI solutions and summarize performance and evaluation schemes. Finally, we compare and contrast the role of the modalities in handling COVID-19. Details on MRI and Digital Pathology can be found in the Supplementary Material. High-resolution CT remains the gold-standard imaging technique for thoracic evaluation. The main abnormalities observed in common and severe COVID-19 cases, respectively, were ground glass opacities (GGOs) and patchy consolidation surrounded by GGOs. CT scanning accurately assessed the severity of COVID-19 and helped monitor disease transformation among different clinical conditions [25]. COVID-19 pneumonia manifests with chest CT imaging abnormalities, even in asymptomatic patients, with rapid evolution from focal unilateral to diffuse bilateral GGOs that progressed to or co-existed with consolidations within 1-3 weeks [26]. Crucially, patients with a positive RT-PCR test can have a normal CT: some reports find up to 50% false negatives when patients are scanned two days after the onset of flu-like symptoms [27]. The visual features of GGOs and consolidation lend themselves to visual analysis by DL networks, and several CT analysis systems have emerged from both academia and industry [13], [28], [29].
These networks typically ingest a raw CT scan and output a differential patient diagnosis. Performance-wise, CT networks can be impressive and report patient-level detection with an AUC of 0.92 [13] or even 0.96 [29]. To improve robustness, some systems include a lung segmentation stage that focuses attention on the lung for later analysis stages [14], [28]. Some segmentation-based networks output pixelwise-labelled tissue maps of GGO or consolidation regions, which can be informative in grading disease severity or tracking progression over time. This can be achieved by segmenting anatomical landmarks with deep reinforcement learning and computing the percentage of opacity and a lung severity score as complementary severity measures [14]. CT datasets for training diagnosis networks are usually labelled using the RT-PCR test result for COVID-19 and consist of several hundred to a few thousand samples. Training for tissue segmentation is more labour-intensive, typically employing manual region labelling by radiologists, leading to sample sizes in the hundreds. COVID-19 dataset sizes are expected to grow with new repositories and consortiums such as RSNA ‡, the Imaging COVID-19 AI Initiative § and the UK NHS NCCID **. Some studies demonstrated that radiologists' performance improves upon consultation of AI: junior radiologists along with AI can perform as well as mid-senior radiologists [30], and radiologists' sensitivity and specificity can improve by nearly 10% through AI [31].
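The percentage-of-opacity measure mentioned above reduces, conceptually, to dividing the segmented abnormal area by the segmented lung area. A minimal sketch with hypothetical toy masks (not the implementation of [14], which operates on full 3D CT volumes):

```python
# Minimal sketch of a percentage-of-opacity severity measure, assuming
# binary pixelwise masks: 1 = lung tissue / abnormal region, 0 = background.
# Hypothetical 2D masks; real systems derive these from segmentation networks.

def percentage_of_opacity(lung_mask, opacity_mask):
    """Fraction of segmented lung area covered by GGO/consolidation labels."""
    lung_area = sum(sum(row) for row in lung_mask)
    opacity_area = sum(
        sum(l and o for l, o in zip(lrow, orow))
        for lrow, orow in zip(lung_mask, opacity_mask)
    )
    return 100.0 * opacity_area / lung_area if lung_area else 0.0

# Toy 4x4 example: 8 lung pixels, 2 of them labelled as opacity -> 25%.
lung = [[0, 1, 1, 0],
        [1, 1, 1, 1],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
opacity = [[0, 0, 0, 0],
           [0, 1, 1, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
print(percentage_of_opacity(lung, opacity))  # -> 25.0
```

Restricting the opacity count to pixels inside the lung mask mirrors the role of the lung segmentation stage: abnormal-looking regions outside the lung do not influence the severity score.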
Similarly, in a blind review with six radiologists, decisions with AI support were superior to radiologists alone (sensitivity 88% vs. 79%, specificity 91% vs. 88%), although the AI model alone performed best (95% and 96%) [32]. Others used a sequential process composed of an abnormal-slice-selection model and a disease-diagnosis model [13]. They achieved equal sensitivity, but lower specificity, compared to a senior radiologist (279 patients). Notably, using a combination of clinicopathological features and CT scans, the model correctly classified 68% of positive patients who presented normal CT scans according to the radiologists. Using a large CT database of 3,777 patients, [30] developed a differential-diagnosis AI system comparable to experienced radiologists. Their system is globally available to combat COVID-19 and highlights the benefits of AI not only in diagnostic assistance but also in prognosing critical illness.

Chest X-ray is an integral step in the clinical workflow and often the first-line examination when a patient presents with chest pain, shortness of breath or other cardiopulmonary symptoms. CXR has low sensitivity and specificity for identifying COVID-19; [33] reported that 89% of 636 COVID-19 patients exhibited normal or only mildly abnormal CXR. Notably, CXR should not be routinely performed in patients presenting with mild to moderate symptoms but may be useful during diagnostic ambiguity † †. Differing from mild cases, a bedside CXR is still considered the standard of care both for diagnostic applications in the ICU and for monitoring changes in clinical status. The temporal course of findings in COVID-19, from the development of consolidation on CT to opacities on X-ray, is well documented [34]. A multitude of DL network-based solutions has been built to distinguish between COVID-19 and other pulmonary conditions in CXR, and their performance varies heavily.
An early study reports 87% accuracy (sensitivity 0.85, specificity 0.92) for 3-class classification and uses, like many others, heatmaps to locate affected regions [35]. Luz et al. (2020) focus on computationally light-weight classification (sensitivity 0.97) that is claimed to scale to high-resolution images and enable mobile analysis [36]. To circumvent the low availability of data, weakly-labelled data augmentation with paediatric non-COVID-19 viral pneumonia CXR samples can improve model performance [37]. Despite the relatively larger size of today's CXR datasets, severe class imbalances, simultaneous prediction of multiple diseases and uncertainty in automatic labelling [8] pose significant challenges to building AI models. Successful solutions adopt model-ensemble approaches and classify pulmonary and cardiac biomarkers and conditions (e.g. pleural effusion or cardiomegaly) instead of diseases to obtain high AUC scores of 0.93 [38]. However, simpler models combining self-training, knowledge distillation and semi-supervised learning techniques can achieve comparable performance using only ~10% of the labelled data [39].

Lung ultrasound (LUS) is widely considered a surrogate lung imaging modality that can reduce the usage of ionizing radiation techniques in emergency rooms and represents a sensitive bedside monitoring tool, especially for the critically ill [40]. Over the past decades, increasing evidence for the diagnostic value of US in pulmonary conditions such as ARDS or pneumothorax has led to increased recommendation of LUS [41], [42]. Especially in resource-limited settings, such as during triage or in low-income countries, LUS is superior to CXR in diagnosing pneumonia [43] or COVID-19 [44] and may represent a viable first-line examination method [21].

‡ https://www.rsna.org/covid-19
§ https://imagingcovid19ai.eu
** https://www.nhsx.nhs.uk/covid-19-response/data-and-information-governance/national-covid-19-chest-imaging-database-nccid/
SARS-CoV-2 induces patterns characteristic of viral pneumonia, most prevalently B-lines (vertical, comet-tail artifacts arising from the pleural line [45]) and pleural line abnormalities such as pleural thickening [46]. Findings suggest high diagnostic sensitivity that is comparable to CT scans and can help with diagnosis and subsequent monitoring of the infection [47]. Point-by-point correspondence between LUS and CT patterns can be made and even structured into a disease timeline [24]. This growing body of positive evidence has led many to advocate for an amplified role of LUS during the current pandemic [48], [49] as a safe and portable alternative to CT. However, US has received scant attention from the AI community despite the positive clinical evidence. With the rise of DL in computer vision, learning-based approaches to US became more popular [50], and a plethora of advanced processing techniques cope with the wealth of US transducers and probes [51]. More than 50 papers were published about DL on US in 2017 [50], but sparse work has been done on LUS. B-line detection is the most commonly solved task, and most works used non-learning-based approaches [52]. But [53] showed that weakly-supervised DL can detect B-lines and used neural attention maps to highlight the most significant regions of the image that drove the prediction. This paved the road toward more interpretable models that could support clinical decision processes for a variety of pulmonary conditions with B-line involvement. Others confirmed that deep learning can quantify B-lines [54] or subpleural pulmonary lesions [55]. In the context of COVID-19, [56] performed a semantic segmentation of the lung to recognize pathologies like B-lines with high accuracy. They also predicted COVID-19 severity on a scale from 0-3 on 277 videos from 35 patients and achieved 0.7 positive predictive value and 0.6 recall.
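For reference, the diagnostic metrics reported throughout this review (sensitivity/recall, specificity, positive predictive value) all derive from the four entries of a binary confusion matrix. A minimal sketch with hypothetical counts:

```python
# Minimal sketch of the diagnostic metrics quoted in this review, computed
# from binary confusion-matrix counts (hypothetical numbers, not study data).

def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity (recall), specificity and PPV as a dict."""
    return {
        "sensitivity": tp / (tp + fn),  # fraction of infected patients detected
        "specificity": tn / (tn + fp),  # fraction of healthy patients cleared
        "ppv": tp / (tp + fp),          # fraction of positive calls that are correct
    }

# Toy cohort: 90 true positives, 10 false negatives, 80 true negatives, 20 false positives.
m = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
print(m["sensitivity"], m["specificity"], round(m["ppv"], 3))  # -> 0.9 0.8 0.818
```

Note that PPV, unlike sensitivity and specificity, depends on disease prevalence in the evaluated cohort, which is one reason reported numbers are hard to compare across studies with different case mixes.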
LUS thus not only has potential to enable severity assessment, but can also facilitate automatic differential diagnosis of COVID-19 from pneumonia [57]. This work continues in [58], which provides an unsupervised mechanism to extract pulmonary biomarkers that are deemed useful by physicians. The reported sensitivity of 0.98 and specificity of 0.91 for COVID-19 provide first evidence that automatic differentiation of COVID-19 from other pulmonary conditions is feasible. In triage situations, LUS may thus be used to rapidly assist the patient stratification process and guide downstream molecular or radiologic testing. As the absence of skilled personnel is the main barrier to an increased spread of US in the developing world [59], a medical decision-support tool building upon existing and available technologies is of utmost importance.

In order for imaging to be effective at scale, it must ideally be ubiquitous, accurate, low-cost and fast, and preferably provide high-quality data, rely on portable devices and be non-invasive. The usage of CT is restricted to modern medical facilities (~30,000 CT scanners are available globally [63]). The practical advantages of LUS are numerous, and it is recommended during triage [49]. Its acceptance as the preferred modality in Italy [64] may promote a more prominent role in the future. For further details see Table 1. CXR is notoriously less sensitive than CT [65]. Weinstock et al. [33] found that 89% of COVID-19 patients exhibit normal or mildly abnormal CXR. A multimodal imaging study of CT, CXR and LUS reported B-lines in LUS as the most consistent pathological pattern [66]. A direct comparison of LUS and CXR revealed that LUS has a significantly higher sensitivity for COVID-19 diagnosis than CXR, whereas no significant differences were found in specificity [44]. There is conflicting evidence regarding the comparison of LUS and CT, and dissent about the implications of LUS as a first-line examination method. Indeed, Yang et al. [67] found LUS to have a higher sensitivity than CT in detecting all examined patterns (regional AIP, AIS, pleural effusion and consolidations), but others reported relatively low sensitivity (0.69 and 0.78) for mild and moderate cases of COVID-19 [68]. Similar accuracy of LUS and CT was reported in [47], with LUS identifying signs of COVID-19 more often than CT (78% vs. 72%). This has caused a vivid debate on the role of LUS for the COVID-19 pandemic [48], [49], [69]. Major AI studies that benchmark modalities by their usability for automatic segmentation, diagnosis or mere pattern detection are still underway. Several works have compared the predictive power of CT and CXR data from heterogeneous data sources, but a clinical study with multi-imaging data from the same patients is needed to confirm the radiologic findings. In one work that combined COVID-19 CT and CXR data, no clear benefit was found from the multimodal integration [70].

To discover trends from the overwhelming research activity at the interface of COVID-19, AI and MI, we conducted a detailed meta-analysis of publications selected through a keyword search on PubMed and preprint servers such as arXiv, bioRxiv and medRxiv. For an in-depth description of the publication selection procedure, see Appendix A. We quantify the publication efforts, summarize trends and highlight key findings from a thorough manual analysis of 197 papers.

† † For more guidelines see www.acep.org/corona/covid-19-field-guide/
There has been stable growth of papers on AI for both lung and breast imaging over the years 2017-2019 (see Fig. 2A). In 2020, the rise of lung-related papers has been accelerated by COVID-19, causing an additional ~200 papers (111% growth rate). To compare the impact of individual modalities, Figure 2B shows that 2019 witnessed a stable trend of ~100 publications per quarter on AI for lung imaging, with shares of 72% for CT, 25% for CXR and 2% for LUS. After the COVID-19 outbreak, numbers soared to 138 (340) papers in Q1 (Q2) 2020. This rise was causally evoked by COVID-19, as excluding papers mentioning COVID-19 would have resulted in a continuation of the stable trend (see shaded bars) at a hypothetical 106 (Q1) and 112 (Q2) publications. Indicating a shift in the focus of research, the first half of 2020 produced more papers for each modality than all of 2019. The simpler and more accessible modalities X-Ray and US gained shares and make up 37% and 3% of all papers in 2020, compared to 60% for CT. For Q2, X-Ray had the highest fraction of COVID-19 papers (79%), and the growth can be primarily ascribed to preprints (370% increase) rather than peer-reviewed publications (86%).

To complement the quantitative analysis, we reviewed 197 publications for qualitative analysis (see Fig. A1 for the inclusion workflow). In sum, the modality-specific COVID-19 papers of Fig. 2B were combined with papers containing the keywords 'AI, medical imaging and COVID-19'. Of the 1904 papers about MI and COVID-19 (see Fig 1B), 1072 are specific to modalities as shown in Fig.
3A, indicating a dominance of CT in clinical papers and a roughly even relevance of X-Ray and US imaging. Using publication counts as an indirect indicator of scientific practice, we observe a mismatch between the focus of the AI community and that of the clinical community, as illustrated by the distribution of papers per modality in Fig. 3B. In addition, the vast majority (77%) of papers focused on detection of COVID-19 over tasks like severity assessment and prognosis (Fig. 4A). Note that a key concern was the near-exclusive use of imaging data alone for training the algorithms (52% CXR, see Fig 3B). This trend is in contrast to the ACR recommendations that appraise imaging as an inconclusive test for COVID-19 detection; only a mere ~6% of papers exploited multimodal data towards building their AI models. While data availability is vital to AI and explains the community's focus on X-rays, success in creating impactful and practical healthcare AI solutions hinges upon close collaboration between radiologists and AI experts. Pivoting the AI focus to utilize multimodal data of patient cohorts would be increasingly effective and invaluable in the containment of COVID-19. To qualify the maturity of publications, we chose three parameters: (i) size of patient data used for training and evaluation, (ii) deployment and adoption of solutions in hospitals and (iii) complexity of the AI framework. We find only 7 highly mature studies, of which none utilized X-Rays. Further, papers that used CXR data were rated lowest in average quality and highest in preprint ratio. For details on the task and maturity as distributed by modality, see the appendix (Fig. A2). We observe that only ~27% of papers used proprietary or clinical data (Fig 4C), while over 70% analyzed publicly available databases, usually comprised of no more than a few hundred images from heterogeneous sources/devices without detailed patient information.
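The modality and task shares above are simple tallies over the annotated paper records. A trivial sketch of the counting step, with hypothetical records rather than our actual survey annotations:

```python
# Minimal sketch of tallying task shares over reviewed papers
# (hypothetical records; the real survey covered 197 annotated papers).
from collections import Counter

papers = [
    {"modality": "CXR", "task": "detection"},
    {"modality": "CT", "task": "detection"},
    {"modality": "CT", "task": "severity"},
    {"modality": "LUS", "task": "detection"},
]

task_counts = Counter(p["task"] for p in papers)
shares = {task: 100.0 * n / len(papers) for task, n in task_counts.items()}
print(shares)  # -> {'detection': 75.0, 'severity': 25.0}
```

The same pattern, applied per modality and per maturity label, yields the distributions shown in Figs. 3B and 4A.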
The leading countries for AI expertise were China and the USA, with India and Canada as runners-up. Data, instead, was shared by countries hit early by the pandemic, i.e. China and, to a lesser extent, Italy, the USA and Spain (see Fig 5). However, the geographical diversity of author contributions was remarkable: first authors from 34 countries and 6 continents contributed to the research, evidencing a global collaborative spirit (Fig 5B).

When applying AI in clinical settings, practical and socio-environmental factors like interrupted staff workflow, insufficient hardware infrastructure and trust are the main inhibitors of AI integration [71]. Here, we highlight some of the most delicate technical challenges connecting MI research and clinical deployment. Clinical AI models are plagued by generalization issues and often fail to sustain performance in new environments, owing to inherent biases in model or data, lack of real-world representation and the small size of training datasets. Generally, model robustness can be improved by leveraging continual learning and domain adaptation techniques, which attempt to learn domain-invariant representations by aligning the data distributions of source and target environments [72]. A domain shift is usually caused by changes in the data distribution arising from discrepancies in population demographics, missing measurements or label imbalance. However, studies show that for clinical data, domain shifts are often associated with task shifts due to complex interactions between predictor variables and comorbidities [73], thereby rendering sophisticated domain adaptation techniques insufficient.
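The distribution-alignment idea can be illustrated in its crudest form: match the first two moments of a target-site feature to those of the source site. This is a toy moment-matching sketch with hypothetical numbers, far simpler than the learned domain-invariant representations of [72], and exactly the kind of correction that a task shift would defeat:

```python
# Toy sketch of aligning a target-site feature distribution to a source site
# by matching mean and standard deviation (crude moment matching; hypothetical
# scanner intensity summaries, not a substitute for learned adaptation).
import statistics

def align(target, source):
    """Shift/scale target values so their mean/stdev match the source data."""
    t_mu, t_sd = statistics.mean(target), statistics.pstdev(target)
    s_mu, s_sd = statistics.mean(source), statistics.pstdev(source)
    return [(x - t_mu) / t_sd * s_sd + s_mu for x in target]

# Hypothetical intensity summaries from two scanners: same shape, shifted/scaled.
source = [10.0, 12.0, 14.0, 16.0]
target = [100.0, 120.0, 140.0, 160.0]
aligned = align(target, source)
print([round(x, 2) for x in aligned])  # -> [10.0, 12.0, 14.0, 16.0]
```

When the shift is purely affine, as here, moment matching recovers the source distribution exactly; when labels or predictor-outcome relationships shift as well, no such input-level correction suffices, which is the point made by [73].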
The combined lack of robustness and interpretability poses steep challenges for the adoption of AI models in clinical practice. The susceptibility of AI to adversarial attacks gave rise to the fields of AI safety and explainability [74]. Human-interpretable access to the model's decision process is crucial to foster trust in AI, especially in medical applications, where reasoning is inductive, sensitive decisions are made and patients expect plausible hypotheses from physicians. Most current interpretability tools focus on generating explanations which highlight patterns learned from the data but do not translate model decisions into human-understandable forms. Recently, counterfactual reasoning combined with uncertainty quantification principles has opened doors to building models that integrate reliability into the optimization process [75]. This enables model introspection and facilitates human-in-the-loop analysis.

COVID-19 has created a sharp rise in demand for respiratory care devices, with a compound annual growth rate (CAGR) of 261% between 2019 and 2020 ‡ ‡. The prevalence of the different imaging modalities varies across the globe for operational, financial and care-practice reasons (see Fig. A3). Key challenges are to meet the increased demands and to deliver the right AI technologies into the right regions. Demand for US systems appears to be increasing, and a switch in market lead from CT to US is predicted in North America. This development is driven by the adoption of handheld devices and home healthcare services that reduce hospitalization costs § §, a trend further enhanced by COVID-19. Portable X-Ray or ultrasound equipment is predestined for rapid on-device analysis and is already used for triage at patient homes [49]. Future research will need to explore the feasibility of AI for real-time analysis. The ubiquitous presence of mobile, Bluetooth-equipped devices paves a road to absorbing the absence of experienced doctors during triage.
Under appropriate guidance, a wireless US probe coupled to a smartphone app can offer an alternative to CXR and CT for defined tasks, particularly in remote areas. But user interpretability and expertise will remain a challenge toward analytic automatization. Standardization of data across devices and streamlining of the medical examination are challenging. Whenever AI integration is anticipated, execution protocols such as the BLUE protocol for LUS [45] should be followed with stringent care. While the use of US has increased exponentially in developing countries, lack of training is the highest barrier [59]. Software infrastructure to instruct physicians, such as massive open online courses, is needed to accelerate AI-assisted diagnosis but also to collect high-quality, annotated data at large scale. Further, building AI systems that can integrate data collected across the clinical disease journey, especially multimodal imaging data, will be required to leverage the full power of AI and address the ACR recommendations.

To summarize, we have provided an extensive comparison of lung imaging modalities for COVID-19 with an emphasis on AI. We highlight that imaging is not recommended for COVID-19 diagnosis by the ACR due to imaging findings overlapping with other viral infections. CT should be reserved for cases with inconclusive findings in initial assessments via CXR and LUS, while LUS can be performed as a first-line examination during triage [21]. Notably, clinical diagnosis should by no means be based solely on algorithmic predictions. DL models combining CT and clinical features are starting to match [13] or surpass [31] the performance of senior radiologists in detecting COVID-19, but the road toward clinical application is long and difficult. Our meta-analysis revealed that AI efforts predominantly focus on CXR and, to a lesser extent, on CT imaging. LUS, instead, is not yet as relevant clinically, but its economic and diagnostic impact is expected to grow.
US has received little attention in AI despite its wide global availability, which is a key factor in dealing with pandemics, especially in remote and underdeveloped areas where supply chains are long and repurposing existing technologies is the sharpest weapon for prompt pandemic alleviation. Given that LUS is also more sensitive than CXR [44], comparable to CT [47], and has evident practical advantages, it is presumably the modality with the highest improvement potential for medical image analysis in the near future.

MRZ, DB, JLR, EK and MG conceived the presented work. MRZ conceived the meta-analysis and supervised this project, and MRZ and MG set the high-level objectives of this work. MM and JB developed the software to perform paper keyword searches. JB, DM, DR, AC, VM, EK and MG manually reviewed papers. JB analyzed the results, and JB and DR created the figures. All authors contributed toward the interpretation and improvement of the analysis. DB and JB led and distributed the manuscript writing efforts. JB, DB, AC, DR, MG, EK and MRZ wrote full, individual sections of the manuscript. All authors reviewed initial versions, contributed significantly to the different sections of the manuscript and approved the submitted version. The authors declare no conflicts of interest.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint. This version was posted September 9, 2020.
medRxiv preprint doi: https://doi.org/10.1101/2020.09.02.20187096

[1] The role of imaging in the detection and management of COVID-19: a review
[2] COVID-19 diagnosis and management: a comprehensive review
[3] A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis
[4] Changes in cancer detection and false-positive recall in mammography using artificial intelligence: a retrospective, multireader study
[5] Predicting breast cancer by applying deep learning to linked health records and mammograms
[6] International evaluation of an AI system for breast cancer screening
[7] Artificial intelligence in breast imaging
[8] CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison
[9] End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography
[10] COVID-19 and artificial intelligence: protecting health-care workers and curbing the spread
[11] Cautions about radiologic diagnosis of COVID-19 infection driven by artificial intelligence
[12] Prediction models for diagnosis and prognosis of covid-19: Systematic review and critical appraisal
[13] Artificial intelligence-enabled rapid diagnosis of patients with COVID-19
[14] Automated Quantification of CT Patterns Associated with COVID-19 from Chest CT
[15] Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for covid-19
[16] Extrapulmonary manifestations of COVID-19
[17] Diagnostic accuracy of serological tests for covid-19: Systematic review and meta-analysis
[18] Laboratory diagnosis of COVID-19: Current issues and challenges
[19] Variation in False-Negative Rate of Reverse Transcriptase Polymerase Chain Reaction-Based SARS-CoV-2 Tests by Time Since Exposure
[20] Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: a report of 1014 cases
[21] Current Concepts: Imaging in COVID-19 and the Challenges for Low and Middle Income Countries. The Journal of Global Radiology
[22] Coronavirus Disease (COVID-19): Spectrum of CT Findings and Temporal Progression of the Disease
[23] Use of chest imaging in COVID-19: a rapid advice guide
[24] Ultrasound in COVID-19: a timeline of ultrasound findings in relation to CT
[25] Computed Tomographic Features and Short-term Prognosis of Coronavirus Disease
[26] Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: a descriptive study
[27] Patients with RT-PCR-confirmed COVID-19 and normal chest CT
[28] Rapid AI Development Cycle for the Pandemic: Initial Results for Automated Detection & Patient Monitoring Using Deep Learning CT Image Analysis
[29] Artificial intelligence distinguishes covid-19 from community acquired pneumonia on chest ct
[30] Clinically Applicable AI System for Accurate Diagnosis, Quantitative Measurements, and Prognosis of COVID-19 Pneumonia Using Computed Tomography
[31] End-to-end automatic differentiation of the coronavirus disease 2019 (COVID-19) from viral pneumonia based on chest CT
[32] AI Augmentation of Radiologist Performance in Distinguishing COVID-19 from Pneumonia of Other Etiology on Chest CT
[33] Chest x-ray findings in 636 ambulatory patients with COVID-19 presenting to an urgent care center: a normal chest x-ray is no guarantee
[34] Chest CT features of coronavirus disease 2019 (COVID-19) pneumonia: key points for radiologists
[35] Automated detection of COVID-19 cases using deep neural networks with X-ray images
[36] Towards an Efficient Deep Learning Model for COVID-19 Patterns Detection in X-ray Images
[37] Weakly Labeled Data Augmentation for Deep Learning: A Study on COVID-19 Detection in Chest X-Rays
[38] Interpreting chest X-rays via CNNs that exploit hierarchical disease dependencies and uncertainty labels
[39] Self-Training with Improved Regularization for Few-Shot Chest X-Ray Classification
[40] Lung ultrasound for critically ill patients
[41] Lung ultrasound for the diagnosis of pneumonia in adults: a systematic review and meta-analysis
[42] Lung ultrasound will soon replace chest radiography in the diagnosis of acute community-acquired pneumonia
[43] Lung ultrasound as a diagnostic tool for radiographically-confirmed pneumonia in low resource settings
[44] Point-of-care Lung Ultrasound Is More Sensitive than Chest Radiograph for Evaluation of COVID-19
[45] Lung ultrasound in the critically ill: the BLUE protocol
[46] Lung ultrasound findings in patients with COVID-19 pneumonia
[47] Correlation between Chest Computed Tomography and Lung Ultrasonography in Patients with Coronavirus Disease 2019 (COVID-19)
[48] COVID-19 outbreak: less stethoscope, more ultrasound
[49] Point-of-care lung ultrasound in patients with COVID-19: a narrative review
[50] Deep Learning in Medical Ultrasound Analysis: A Review
[51] Deep learning in ultrasound imaging
[52] Automatic Detection of B-Lines in In Vivo Lung Ultrasound
[53] Localizing B-lines in lung ultrasonography by weakly-supervised deep learning, in-vivo results
[54] Quantifying lung ultrasound comets with a convolutional neural network: Initial clinical results
[55] Boundary Restored Network for Subpleural Pulmonary Lesion Segmentation on Ultrasound Images at Local and Global Scales
[56] Deep learning for classification and localization of COVID-19 markers in point-of-care lung ultrasound
[57] POCOVID-Net: Automatic Detection of COVID-19 From a New Lung Ultrasound Imaging Dataset (POCUS)
[58] Accelerating COVID-19 Detection with Explainable Ultrasound Image Analysis
[59] Perceived barriers in the use of ultrasound in developing countries
[60] Cost-effectiveness of CT screening in the national lung screening trial
[61] Feasibility and safety of substituting lung ultrasonography for chest radiography when diagnosing pneumonia in children: a randomized controlled trial
[62] Chest CT for detecting COVID-19: a systematic review and meta-analysis of diagnostic accuracy
[63] The industry of CT scanning
[64] Our Italian experience using lung ultrasound for identification, grading and serial follow-up of severity of lung involvement for management of patients with COVID-19
[65] Frequency and Distribution of Chest Radiographic Findings in COVID-19 Positive Patients
[66] COVID-19 pneumonia manifestations at the
[67] Lung ultrasonography versus chest CT in COVID-19 pneumonia: a two-centered retrospective comparison study from China
[68] A Clinical Study of Noninvasive Assessment of Lung Lesions in Patients with Coronavirus Disease-19 (COVID-19) by Bedside Ultrasound
[69] POCUS in COVID-19: pearls and pitfalls
[70] A Multi-Task Pipeline with Specialized Streams for Classification and Segmentation of Infection Manifestations in COVID-19 Scans
[71] A Human-Centered Evaluation of a Deep Learning System Deployed in Clinics for the Detection of Diabetic Retinopathy
[72] Towards continuous domain adaptation for medical imaging
[73] Understanding Behavior of Clinical Models under Domain Shifts
[74] On the Interpretability of Artificial Intelligence in Radiology: Challenges and Opportunities
[75] Improving Reliability of Clinical Models using Prediction Calibration