key: cord-0729066-b7l5ql33
authors: Sogancioglu, Ecem; Çallı, Erdi; Ginneken, Bram van; Leeuwen, Kicky G. van; Murphy, Keelin
title: Deep Learning for Chest X-ray Analysis: A Survey
date: 2021-03-15
journal: Medical image analysis
DOI: 10.1016/j.media.2021.102125
sha: f3d7412ad72cded760c4fe8b330e9f7b5417d4e3
doc_id: 729066
cord_uid: b7l5ql33

Recent advances in deep learning have led to promising performance in many medical image analysis tasks. As the most commonly performed radiological exam, chest radiographs are a particularly important modality for which a variety of applications have been researched. The release of multiple, large, publicly available chest X-ray datasets in recent years has encouraged research interest and boosted the number of publications. In this paper, we review all studies using deep learning on chest radiographs, categorizing works by task: image-level prediction (classification and regression), segmentation, localization, image generation and domain adaptation. Commercially available applications are detailed, and a comprehensive discussion of the current state of the art and potential future directions is provided.

A cornerstone of radiological imaging for many decades, chest radiography (chest X-ray, CXR) remains the most commonly performed radiological exam in the world, with industrialized countries reporting an average of 238 erect-view chest X-ray images acquired per 1000 of population annually (United Nations, 2008). In 2006, an estimated 129 million CXR images were acquired in the United States alone (Mettler et al., 2009). The demand for, and availability of, CXR images may be attributed to their cost-effectiveness and low radiation dose, combined with a reasonable sensitivity to a wide variety of pathologies. The CXR is often the first imaging study acquired and remains central to screening, diagnosis, and management of a broad range of conditions (Raoof et al., 2012).
Chest X-rays may be divided into three principal types, according to the position and orientation of the patient relative to the X-ray source and detector panel: posteroanterior, anteroposterior, and lateral. The posteroanterior (PA) and anteroposterior (AP) views are both considered frontal, with the X-ray source positioned to the rear or front of the patient respectively. The AP image is typically acquired from patients in the supine position, while the patient is usually standing erect for the PA image acquisition. The lateral image is usually acquired in combination with a PA image, and projects the X-ray from one side of the patient to the other, typically from right to left. Examples of these image types are depicted in Figure 1.

The interpretation of the chest radiograph can be challenging due to the superimposition of anatomical structures along the projection direction. This effect can make it very difficult to detect abnormalities in particular locations (for example, a nodule posterior to the heart in a frontal CXR), to detect small or subtle abnormalities, or to accurately distinguish between different pathological patterns. For these reasons, radiologists typically show high inter-observer variability in their analysis of CXR images (Quekel et al., 2001; Balabanova et al., 2005; Young, 1994). The volume of CXR images acquired, the complexity of their interpretation, and their value in clinical practice have long motivated researchers to build automated algorithms for CXR analysis. Indeed, this has been an area of research interest since the 1960s, when the first papers describing an automated abnormality detection system on CXR images were published (Lodwick et al., 1963; Becker et al., 1964; Meyers et al., 1964; Kruger et al., 1972; Toriwaki et al., 1973).

Ecem Sogancioglu and Erdi Çallı contributed equally. Corresponding author e-mail: erdi.calli@radboudumc.nl
The potential gains from automated CXR analysis include increased sensitivity for subtle findings, prioritization of time-sensitive cases, automation of tedious daily tasks, and provision of analysis in situations where radiologists are not available (e.g., the developing world). In recent years, deep learning has become the technique of choice for image analysis tasks and made a tremendous impact in the field of medical imaging (Litjens et al., 2017). Deep learning is notoriously data-hungry and the CXR research community has benefited from the publication of numerous large labeled databases in recent years, predominantly enabled by the generation of labels through automatic parsing of radiology reports. This trend began in 2017 with the release of 112,000 images from the NIH Clinical Center (Wang et al., 2017b). In 2019 alone, more than 755,000 images were released in 3 labelled databases (CheXpert (Irvin et al., 2019), MIMIC-CXR (Johnson et al., 2019), PadChest (Bustos et al., 2020)). In this work, we demonstrate the impact of these data releases on the number of deep learning publications in the field.

There have been previous reviews on the field of deep learning in medical image analysis (Litjens et al., 2017; van Ginneken, 2017; Sahiner et al., 2018; Feng et al., 2019) and on deep learning or computer-aided diagnosis for CXR (Qin et al., 2018; Kallianos et al., 2019; Anis et al., 2020). However, recent reviews of deep learning in chest radiography are far from exhaustive in terms of the literature and methodology surveyed, the description of the public datasets available, or the discussion of future potential and trends in the field. The literature review in this work includes 295 papers, published between 2015 and 2021, and categorized by application. A comprehensive list of public datasets is also provided, including numbers and types of images and labels as well as some discussion and caveats regarding various aspects of these datasets.
Trends and gaps in the field are described, important contributions discussed, and potential future research directions identified. We additionally discuss the commercial software available for chest radiograph analysis and consider how research efforts can best be translated to the clinic.

The initial selection of literature to be included in this review was obtained as follows. A selection of papers was created using a PubMed search with the following query:

    chest and ("x-ray" or xray or radiograph) and ("deep learning" or cnn or "convolutional" or "neural network")

A systematic search of the titles of conference proceedings from SPIE, MICCAI, ISBI, MIDL and EMBC was also performed, using the same search terms. In the case of multiple publications of the same paper, only the latest publication was included. Relevant peer-reviewed articles suggested by co-authors and colleagues were added. The last search was performed on March 3rd, 2021.

This search strategy resulted in 767 listed papers. Of these, 61 were removed as they were duplicates of others in the list. A further 261 were excluded because their subject matter did not relate to deep learning for CXR, they were commentary or evaluation papers, or they were not written in English. Publications that were not peer-reviewed were also excluded (8). Finally, during the review process 142 papers were excluded as the scientific content was considered unsound, as detailed further in Section 6, leaving 295 papers in the final literature review.

The remainder of this work is structured as follows: Section 2 provides a brief introduction to the concept of deep learning and the main network architectures encountered in the current literature. In Section 3, the public datasets available are described in detail, to provide context for the literature study. The review of the collected literature is provided in Section 4, categorized according to the major themes identified.
Commercial systems available for chest radiograph analysis are described in Section 5. The paper concludes in Section 6, with a comprehensive discussion of the current state of the art for deep learning in CXR as well as the potential for future directions in both research and commercial environments.

This section provides an introduction to deep learning for image analysis, and particularly the network architectures most frequently encountered in the literature reviewed in this work. Formal definitions and more in-depth mathematical explanations of fully connected and convolutional neural networks are provided in many other works, including a recent review of deep learning in medical image analysis (Litjens et al., 2017). In this work, we provide only a brief overview of these fundamental details and refer the interested reader to previous literature.

Deep learning is a branch of machine learning, which is a general term describing learning algorithms. The algorithm underpinning all deep learning methods is the neural network, in this case, constructed with many hidden layers ('deep'). These networks may be constructed in many ways with different types of layers included, and the overall construction of a network is referred to as its 'architecture'. Sections 2.3 to 2.6 describe commonly used architectures categorized by types of application in the CXR literature.

In the 1980s, networks using convolutional layers were first introduced for image analysis (Fukushima and Miyake, 1982), and the idea was formalized over the following years (LeCun and Bengio, 1998). These convolutional layers now form the basis for all deep learning image analysis tasks, almost without exception. Convolutional layers use neurons that connect only to a small 'receptive field' from the previous layer. These neurons are applied to different regions of the previous layer, operating as a sliding window over all regions, and effectively detecting the same local pattern in each location.
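The sliding-window behaviour described above can be sketched in a few lines of plain Python. This is an illustrative toy rather than code from any surveyed work: the function name `conv2d_valid`, the 3 × 4 input and the 2 × 2 vertical-edge kernel are all assumptions chosen for readability, and no padding, stride, bias or non-linearity is modelled.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over a 2D image (stride 1, no padding).

    As in most deep learning frameworks, this is technically a
    cross-correlation: the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Each output value depends only on a kh x kw receptive field,
            # and every position reuses the same kernel weights.
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[top + i][left + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical input: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
```

The same four weights produce every output value, and the position of the response mirrors the position of the edge in the input.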
In this way, spatial information is preserved and the learned weights are shared.

Transfer learning investigates how to transfer knowledge extracted from one domain (source domain) to another (target) domain. One of the most commonly used transfer learning approaches in CXR analysis is the use of pre-training. With the pre-training approach, the network architecture is first trained on a large dataset for a different task, and the trained weights are then used as an initialization for the subsequent task for fine-tuning (Yosinski et al., 2014). Depending on data availability from the target domain, all layers can be re-trained, or only the final (fully connected) layer can be re-trained. This approach allows neural networks to be trained for new tasks using relatively small datasets, since useful low-level features are learned from the source domain data. It has been shown that pre-training on the ImageNet dataset (for classification of natural images) is beneficial for chest radiography analysis (Baltruschat et al., 2019b), and this type of transfer learning is prominently used in the research surveyed in this work. ImageNet pre-trained versions of many architectures are publicly available as part of popular deep learning frameworks. The pre-trained architectures may also be used as feature extractors, in combination with more traditional methods, such as support vector machines or random forests. Domain adaptation is another subfield of transfer learning and is discussed thoroughly in Section 2.7.

In this work we use the term 'image-level prediction' to refer to tasks where prediction of a category label (classification) or continuous value (regression) is implemented by analysis of an entire CXR image. These methods are distinct from those which make predictions regarding small patches or segmented regions of an image.
Classification and regression tasks are grouped together in this work since they typically use the same types of architecture, differing only in the final output layer. One of the early successful deep convolutional architectures for image-level prediction was AlexNet (Krizhevsky et al., 2012), which consists of 5 convolutional layers followed by 3 fully connected layers. AlexNet became extremely influential in the literature when it beat all other competitors in the ILSVRC (ImageNet) challenge (Deng et al., 2009) by a large margin in 2012. Since then many deep convolutional neural network architectures have been proposed. The VGG family of models (Simonyan and Zisserman, 2014) use 8 to 16 convolutional layers followed by 3 fully connected layers. The Inception architecture was first introduced in 2015 (Szegedy et al., 2015), using multiple convolutional filter sizes within layered blocks known as Inception modules. In 2016, the ResNet family of models began to gain popularity and improve upon previous benchmarks. These models define residual blocks consisting of multiple convolution operations, with skip connections which typically improve model performance. After the success of ResNet, skip connections were widely adopted in many architectures. DenseNet models (Huang et al., 2017), introduced in 2017, also use skip connections between blocks, but connect all layers to each other within blocks. A later version of the Inception architecture also added skip connections (Inception-ResNet) (Szegedy et al., 2017). The Xception network architecture (Chollet, 2017) builds upon the Inception architecture but separates the convolutions performed in the 2D image space from those performed across channels. This was demonstrated to improve performance compared to Inception V3.

The majority of works surveyed in this review use one or more of the model architectures discussed here with varying numbers of hidden layers.
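The skip connection that ResNet-style residual blocks rely on can be illustrated with a deliberately simplified sketch. The assumptions are significant: small dense layers without bias stand in for the convolutions of a real residual block, normalization is omitted, and the names `linear`, `relu` and `residual_block` are hypothetical helpers introduced only for this example.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def linear(values, weights):
    """Dense layer without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(x + F(x)), where F is two small linear layers.

    Real residual blocks use convolutions for F; dense layers are an
    assumption made here to keep the sketch short.
    """
    hidden = relu(linear(x, w1))
    transformed = linear(hidden, w2)
    # Skip connection: the unchanged input is added to the transformed signal.
    return relu([a + b for a, b in zip(x, transformed)])


x = [1.0, 2.0]

# With all-zero weights F(x) = 0, so the block passes x through unchanged.
zero = [[0.0, 0.0], [0.0, 0.0]]
identity_out = residual_block(x, zero, zero)

# With non-zero weights the block learns a correction on top of x.
eye = [[1.0, 0.0], [0.0, 1.0]]
half = [[0.5, 0.0], [0.0, 0.5]]
scaled_out = residual_block(x, eye, half)

print(identity_out)  # [1.0, 2.0]
print(scaled_out)    # [1.5, 3.0]
```

Because the block only has to learn a correction to its input, a block whose weights are near zero behaves like an identity mapping, which is one common explanation for why very deep networks built from such blocks remain trainable.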
Segmentation is a task where pixels are assigned a category label, and can also be considered as pixel classification. In natural image analysis, this task is often referred to as 'semantic segmentation' and frequently requires every pixel in the image to have a specified category. In the medical imaging domain these labels typically correspond to anatomical features (e.g., heart, lungs, ribs), abnormalities (e.g., tumor, opacity) or foreign objects (e.g., tubes, catheters). It is typical in the medical imaging literature to segment just one object of interest, essentially assigning the category 'other' to all remaining pixels.

Early approaches to segmentation using deep learning used standard convolutional architectures designed for classification tasks (Chen et al., 2018b). These were employed to classify each pixel in a patch using a sliding window approach. The main drawback to this approach is that neighboring patches overlap heavily, resulting in inefficiency caused by repeating the same convolutions many times. It additionally treats each pixel separately, which makes the method computationally expensive and only applicable to small images or patches from an image. To address these drawbacks, fully convolutional networks (FCNs) were proposed, replacing fully connected layers with convolutional layers (Shelhamer et al., 2017). This results in a network which can take larger images as input and produces a likelihood map output instead of an output for a single pixel. In 2015, a fully convolutional architecture known as the U-Net was proposed (Ronneberger et al., 2015) and this work has become the most cited paper in the history of medical image analysis. The U-Net consists of several convolutional layers in a contracting (downsampling) path, followed by further convolutional layers in an expanding (upsampling) path which restores the result to the input resolution.
It additionally uses skip connections between the same levels on the contracting and expanding paths to recover fine details that were lost during the pooling operation. The majority of image segmentation works in this review employ a variant of the FCN or the U-Net.

This survey uses the term localization to refer to identification of a specific region within the image, typically indicated by a bounding box, or by a point location. As with the segmentation task, localization in the medical domain can be used to identify anatomical regions, abnormalities, or foreign object structures. There are relatively few papers in the CXR literature reviewed here that deal specifically with a localization method. However, since it is an important task in medical imaging, and may be easier to achieve than a precise segmentation, we categorize these works together.

In 2014, the RCNN (Region Convolutional Neural Network) was introduced (Girshick et al., 2014), identifying regions of interest in the image and using a CNN architecture to extract features of these regions. A support vector machine (SVM) was used to classify the regions based on the extracted features. This method involves several stages and is relatively slow. It was later superseded by fast-RCNN (Girshick, 2015) and subsequently by faster-RCNN (Ren et al., 2017), which streamlined the processing pipeline, removing the need for initial region identification or SVM classification, and improving both speed and performance. In 2017, a further extension was added to faster-RCNN to additionally enable a precise segmentation of the item identified within the bounding box. This method is referred to as Mask R-CNN. While this is technically a segmentation network, we mention it here as part of the RCNN family.
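The relationship between a segmentation and a bounding-box localization can be made concrete with a small sketch. It is an illustration under stated assumptions, not a method from the surveyed literature: the helper name `mask_to_bbox`, the binary 4 × 5 mask and the (row_min, col_min, row_max, col_max) box convention are all hypothetical choices.

```python
def mask_to_bbox(mask):
    """Return (row_min, col_min, row_max, col_max) enclosing all nonzero
    pixels of a binary mask, or None if the mask is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    n_cols = len(mask[0])
    cols = [c for c in range(n_cols) if any(row[c] for row in mask)]
    return (rows[0], cols[0], rows[-1], cols[-1])


# Hypothetical segmentation of a single small structure.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

bbox = mask_to_bbox(mask)
print(bbox)  # (1, 1, 2, 3)
```

A box can always be derived from a mask in this way, but the mask cannot be recovered from the box, which is one sense in which localization is the coarser of the two tasks.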
Another architecture which has been popular in object localization is YOLO (You Only Look Once), first introduced in 2016 (Redmon et al., 2016) as a single-stage object detection method, and improved in subsequent versions in 2017 and 2018 (Redmon and Farhadi, 2017, 2018). The original YOLO architecture, using a single CNN and an image-grid to specify outputs, was significantly faster than its contemporaries but not quite as accurate. The improved versions leveraged both classification and detection training data and introduced a number of training improvements to achieve state-of-the-art performance while remaining faster than their competitors. A final localization network that features in medical imaging literature is RetinaNet (Lin et al., 2017). Like YOLO, this is a single-stage detector, which introduces the concept of a focal loss function, forcing the network to concentrate on more difficult examples during training. Most of the localization works included in this review use one of the architectures described above.

One of the tasks deep learning has been commonly used for is the generation of new, realistic images, based on information learned from a training set. There are numerous reasons to generate images in the medical domain, including generation of more easily interpretable images (by increasing resolution, or removal of projected structures impeding analysis), generation of new images for training (data augmentation), or conversion of images to emulate appearances from a different domain (domain adaptation). Various generative schemes have also been used to improve the performance of tasks such as abnormality detection and segmentation.

Image generation was first popularized with the introduction of the generative adversarial network (GAN) in 2014 (Goodfellow et al., 2014). The GAN consists of two network architectures: an image generator, and a discriminator which attempts to differentiate generated images from real ones.
These two networks are trained in an adversarial scheme, where the generator attempts to fool the discriminator by learning to generate the most realistic images possible while the discriminator reacts by progressively learning an improved differentiation between real and generated images. The training process for GANs can be unstable with no guarantee of convergence, and numerous researchers have investigated stabilization and improvements of the basic method (Salimans et al., 2016; Heusel et al., 2017; Karras et al., 2018; Arjovsky et al., 2017). GANs have also been adapted to conditional data generation (Odena et al., 2017) by incorporating class labels, to image-to-image translation (conditioned on an image in this case), and to unpaired image-to-image translation (CycleGAN; Zhu et al., 2017). GANs have received a lot of attention in the medical imaging community and several papers were published for medical image analysis applications in recent years (Yi et al., 2019b). Many of the image generation works identified in this review employed GAN-based architectures.

In this work we use the term 'Domain Adaptation', which is a subfield of transfer learning, to cover methods attempting to solve the issue that architectures trained on data from a single 'domain' typically perform poorly when tested on data from other domains. The term 'domain' is weakly defined; in medical imaging it may suggest data from a specific hardware (scanner), set of acquisition parameters, reconstruction method or hospital. It could, less frequently, also refer to characteristics of the population included, for example the gender, ethnicity, age or even strain of some pathology included in the dataset. Domain adaptation methods consider a network trained for an image analysis task on data from one domain (the source domain), and how to perform this analysis accurately on a different domain (the target domain).
These methods can be categorized as supervised, unsupervised, and semi-supervised depending on the availability of labels from the target domain, and they have been investigated for a variety of CXR applications from organ segmentation to multi-label abnormality classification. There is no specific architecture that is typical for domain adaptation, but rather architectures are combined in various ways to achieve the goal of learning to analyze images from unseen domains. The approaches to this problem can be broadly divided into three classes, following the categorization of Wang and Deng (2018): discrepancy-based, reconstruction-based and adversarial-based. Discrepancy-based approaches aim to induce alignment between the source and target domain in some feature space by fine-tuning the image analysis network and optimizing a measurement of discrepancy between the two domains. Reconstruction-based approaches, on the other hand, use an auxiliary encoder-decoder reconstruction network that aims to learn a domain-invariant representation through a shared encoder. Adversarial-based approaches are based on the concept of adversarial training from GANs, and use a discriminator network which tries to distinguish between samples from the source and target domains, to encourage the use of domain-invariant features. This category of approaches is the most commonly used in CXR analysis for domain adaptation, and consists of generative and non-generative models. Generative models transform source images to resemble target images by operating directly on pixel space, whereas non-generative models use the labels on the source domain and leverage adversarial training to obtain domain-invariant representations.

Deep learning relies on large amounts of annotated data. The digitization of radiological workflows enables medical institutions to collate and categorize large sets of digital images.
In addition, advances in natural language processing (NLP) algorithms mean that radiological reports can now be automatically analyzed to extract labels of interest for each image. These factors have enabled the construction and release of multiple large labelled CXR datasets in recent years. Other labelling strategies have included the attachment of the entire radiology report and/or labels generated in other ways, such as radiological review of the image, radiological review of the report, or laboratory test results. Some datasets include segmentations of specified structures or localization information.

In this section we detail each public dataset that is encountered in the literature included in this review as well as any others available to the best of our knowledge. Details are provided in Table 1. Each dataset is given an acronym which is used in the literature review tables (Tables 2 to 7) to indicate that the dataset was used in the specified work.

1. ChestX-ray14 (C) is a dataset consisting of 112,120 CXRs from 30,805 patients (Wang et al., 2017b). [...]

[...] The dataset was automatically labeled from radiology reports using the same rule-based labeler system (described above) as CheXpert. A second version (V2) of MIMIC-CXR was later released including the anonymized radiology reports and DICOM files.

4. PadChest (P) is a dataset consisting of 160,868 CXRs from 109,931 studies and 67,000 patients (Bustos et al., 2020). The CXRs are collected at San Juan Hospital (Spain) from 2009 to 2017. The images are stored as 16-bit grayscale images with full resolution. 27,593 of the reports were manually labeled by physicians. Using these labels, an RNN was trained and used to label the rest of the dataset from the reports. The reports were used to extract 174 findings, 19 diagnoses, and 104 anatomic locations. The labels conform to a hierarchical taxonomy based on the standard Unified Medical Language System (UMLS) (Bodenreider, 2004).

5. PLCO (PL) is a screening trial for prostate, lung, colorectal and ovarian (PLCO) cancer (Zhu et al., 2013, 2005). The images are distributed as anonymized DICOMs. The radiological findings obtained by radiologist interpretation are available in MeSH format.

7. Ped-Pneumonia (PP) is a dataset consisting of 5,856 pediatric CXRs (Kermany, 2018). The CXRs are collected from Guangzhou Women and Children's Medical Center, Guangzhou, China. The images are distributed as 8-bit grayscale images scaled to various resolutions. The labels include bacterial and viral pneumonia as well as normal.

8. JSRT dataset (J) consists of 247 images with a resolution of 2048 × 2048, 0.175 mm pixel size and 12-bit depth (Shiraishi et al., 2000). It includes nodule locations (on 154 images) and diagnosis (malignant or benign). The reference standard for heart and lung segmentations of these images is provided by the SCR dataset (van Ginneken et al., 2006) and we group these datasets together in this work.

9. RSNA-Pneumonia (RP) is a dataset consisting of 30,000 CXRs with pneumonia annotations (RSNA, 2018). These images are acquired from ChestX-ray14 and are 8-bit grayscale with 1024 × 1024 resolution. Annotations are added by radiologists using bounding boxes around lung opacities and 3 classes indicating normal, lung opacity, not normal.

10. Shenzhen (S) is a dataset consisting of 662 CXRs (Jaeger et al., 2014). The CXRs are collected at Shenzhen No.3 [...]

14. COVIDGR (CG) is a dataset consisting of 852 PA CXR images, half of which are labeled as COVID-19 positive based on corresponding RT-PCR results obtained within at most 24 hours (Tabik et al., 2020). This dataset was collected from Hospital Universitario Clínico San Cecilio, Granada, Spain, and the level of severity of positive cases is provided.

15. SIIM-ACR (SI) This dataset was released for a Kaggle challenge on pneumothorax detection and segmentation (ACR, 2019).
Researchers have determined that at least some (possibly all) of the images are from the ChestX-ray14 dataset, although the challenge organizers have not confirmed the data sources. They are supplied in 1024 × 1024 resolution as DICOM files. Pixel segmentations of the pneumothorax in positive cases are provided.

16. CXR14-Rad-Labels (CR) supplies additional annotations for a subset of ChestX-ray14 data (Majkowska et al., 2019). It consists of 4 labels for 4,374 studies and 1,709 patients. These labels are collected by the adjudicated agreement of 3 radiologists. These radiologists were selected from a cohort of 11 radiologists for the validation split (2,412 studies from 835 patients), and 13 radiologists for the test split (1,962 studies from 860 patients). The individual labels from each radiologist as well as the agreement labels were provided.

17. COVID-CXR (CV) is a dataset consisting of 930 CXRs at the time of writing (the dataset remains in continuous development) (Cohen et al., 2020c). The CXRs are collected from a large variety of locations using different methods, including screenshots from papers researching COVID-19. Available labels vary accordingly, depending on what information is available from the source where the image was obtained. Images do not have a standard resolution and are published as 8-bit PNG or JPEG files.

18. NLST (N) is a dataset of publicly available CXRs collected during the NLST screening trial (National Lung Screening Trial Research Team et al., 2011). This trial aimed to compare the use of low-dose computed tomography (CT) with CXRs for lung cancer screening in smokers. The study had 26,732 participants in the CXR arm and a part of this data is available upon request.

19. Object-CXR (OB) is a dataset of 10,000 CXR images from hospitals in China with foreign objects annotated on the images. The download location (https://jfhealthcare.github.io/object-CXR/) is no longer available at the time of writing.
Further detail is not provided since it cannot be verified from the image source.

20. Belarus (BL) This dataset is included since it is used in a number of reviewed papers; however, the download location (http://tuberculosis.by) is no longer available at the time of writing. The dataset consisted of approximately 300 frontal chest X-rays with confirmed TB. Further detail is not provided since it can no longer be verified from the image source.

The rapid increase in the number of publicly available CXR images in recent years has positively impacted the number of deep learning studies published in the field.

Publication of medical image data is extremely important for the research community in terms of advancing the state of the art in deep learning applications. However, there are a number of caveats that should be considered and understood when using the public datasets described in this work. Firstly, many datasets make use of Natural Language Processing (NLP) to create labels for each image. Although this is a fast and inexpensive method of labeling, it is well known that there are inaccuracies in labels acquired this way (Irvin et al., 2019; Oakden-Rayner, 2020). There are a number of causes for such inaccuracies. One is that some visible abnormalities may not be mentioned in the radiology report, depending on the context in which it was acquired (Olatunji et al., 2019). Further, the NLP algorithm can be erroneous in itself, interpreting negative statements as positive, failing to identify acronyms, etc. Finally, many findings on CXR are subtle or doubtful, leading to disagreements even among expert observers (Olatunji et al., 2019). Acknowledging some of these issues, Irvin et al. (2019) include labels for uncertainty or no-mention in the CheXpert dataset. One particular cause for concern with NLP labels is the issue of systematic or structured mislabeling, where an abnormality is consistently labeled incorrectly in the same way.
An example of this occurs in the ChestX-ray14 dataset where subcutaneous emphysema is frequently identified as (pulmonary) 'emphysema' (Calli et al., 2019; Oakden-Rayner, 2020). It has been demonstrated that deep neural networks can tolerate reasonable levels of label inaccuracy in the training set without a significant effect on model performance (Calli et al., 2019; Rolnick et al., 2018). Although such labels can be used for training, for an accurate evaluation and comparison of models it is desirable that the test dataset is accurately labelled. In the literature reviewed in this work, many authors rely on labels from NLP algorithms in their test data, while others use radiologist annotations, laboratory tests and/or CT verification for improved test set labelling. We refer to data that uses these improved labelling techniques as gold standard data (Table 1).

The labels defined in the public datasets should also be considered carefully and understood by the researchers using them. Many labels have substantial dependencies between them. For example, some datasets supply labels for both 'consolidation' and 'pneumonia'. Consolidation (blocked airspace) is an indicator of a patient with pneumonia, suggesting there will be significant overlap between these labels. A further point for consideration is that, in practice, not all labels can be predicted by a CXR image alone. Pneumonia is rarely diagnosed by imaging alone, requiring other clinical signs or symptoms to suggest that this is the cause for a visible consolidation.

Many public datasets release images with a lower quality than is used for radiological reading in the clinic. This may be a cause for decreased performance in deep learning systems, particularly for more subtle abnormalities. The reduction in quality is usually related to a decrease in image size or bit-depth prior to release. This is typically carried out to decrease the overall download size of a dataset.
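The effect of a bit-depth reduction can be illustrated with a short sketch. It assumes the simplest possible conversion, truncating 12-bit values to 8 bits by discarding the four least significant bits; actual dataset preparation pipelines are often undocumented and may rescale or window the intensities differently. The function name `reduce_bit_depth` and the sample intensities are hypothetical.

```python
def reduce_bit_depth(pixels, source_bits=12, target_bits=8):
    """Truncate integer pixel values by dropping the least significant bits.

    This is an assumed, simplified conversion; released datasets may have
    been produced with other rescaling or windowing steps.
    """
    shift = source_bits - target_bits
    return [p >> shift for p in pixels]


# Hypothetical 12-bit intensities that differ only slightly.
original = [2048, 2050, 2063, 2064]
reduced = reduce_bit_depth(original)

print(2 ** 12, "->", 2 ** 8, "grey levels")  # 4096 -> 256 grey levels
print(reduced)                               # [128, 128, 128, 129]
```

Here three distinct 12-bit intensities collapse to a single 8-bit value, which is the kind of subtle contrast that is lost when images are released at reduced bit-depth.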
However, in some cases, CXR data has been collected by acquiring screenshots from online literature, which results in an unquantifiable degradation of the data. In the clinical workflow, DICOM files are the industry standard for storing CXRs, typically using 12 bits per pixel and with image dimensions of approximately 2 to 4 thousand pixels in each of the X and Y directions. If the data is post-processed before release, a precise description of all steps should be provided so that researchers can reproduce them when combining datasets.

In this section we survey the literature on deep learning for chest radiography, dividing it into sections according to the type of task that is addressed (Image-level Prediction, Segmentation, Image Generation, Domain Adaptation, Localization, Other). For each of these sections a table detailing the literature on that task is provided. Some works which have an equal main focus on two tasks may appear in both tables. For Segmentation and Localization, only studies that quantitatively evaluate their results are included in those categories. Figure 3 shows the number of studies for each of the tasks.

Image-level prediction refers to the task of predicting a label (classification) or a continuous value (regression) by analyzing an entire image. Classification labels may relate to pathology (e.g. pneumonia, emphysema), information such as the subject gender, or orientation of the image. Regression values might, for example, indicate a severity score for a particular pathology, or other information such as the age of the subject. We classified 187 studies, fully detailed in Table 2, as image-level prediction studies. Many of these use off-the-shelf deep learning models to predict a pathology, metadata information or a set of labels provided with a dataset. The number of studies for each label is provided in Figure 4. The studies that specifically work on a dataset and its labels are grouped together at the bottom.
187 papers are included, each may study more than one label.

Table 2 (partial). Study | Description | Codes (task, label and dataset abbreviations as extracted)
(2020) | Uses the features extracted from the training dataset to detect adversarial CXRs | AA C C
Anand et al. (2020) | Self-supervision and adversarial training improves on transfer learning | AA PM PP
Khatibi et al. (2021) | Claims 0.99 AUC for predicting TB, uses complex feature engineering and ensembling |
Schroeder et al. (2021) | ResNet model trained with frontal and lateral images to predict COPD with PFT results | PR
Zhang et al. (2021a) | One-class identification of viral pneumonia cases compared with binary classification | PR
Balachandar et al. (2020) | A distributed learning method that overcomes problems of multi-institutional settings | C C
Burwinkel et al. (2019) | Geometric deep learning including metadata with graph structure. Application to CXR | C C
Nugroho (2021) | Proposes a new weighting scheme to improve abnormality classification | C C
DSouza et al. (2019) | ResNet-34 used with various training settings for multi-label classification | C C
Sirazitdinov et al. (2019) | Investigates effect of data augmentations on classification with Inception-Resnet-v2 | C C
Mao et al. (2018) | Proposes a variational/generative architecture, demonstrates performance on CXRs | C C
Rajpurkar et al. (2018) | Evaluates the performance of an ensemble against many radiologists | C C
Kurmann et al. (2019) | Novel method for multi-label classification, application to CXR | C C
Paul et al. (2020) | Defines a few-shot learning method by extracting features from autoencoders | C C
Unnikrishnan et al. (2020) | Mean teacher inspired a probabilistic graphical model with a novel loss | C C
Michael and Yoon (2020) | Examines the effect of denoising on pathology classification using DenseNet-121 | C C
Wang et al. (2021a) | Proposes integrating three attention mechanisms that work at different levels | C C
Paul et al. (2021b) | Step-wise trained CNN and saliency-based autoencoder for few shot learning | C C, O
Paul et al. (2021a) | Uses CT and CXR reports with CXR images during training to diagnose unseen diseases | C C, PR
Bustos et al. (2020) | Proposes a new dataset PadChest with multi-label labels and radiology reports | C P
Li et al. (2021a) | Lesion detection network used to improve image-level classification | C PR
Ghesu et al. (2019) | Method to produce confidence measure alongside probability, uses DenseNet-121 | C,PL C, PL
Haghighi et al. (2020) | Uses self-supervised learning for pretraining, compares with ImageNet pretraining | C,PT C, SI
Zhou et al. (2020a) | Proposes a new CXR pre-training method, compares with pre-training on ImageNet | C,X C,RP,X
Chen et al. (2020a) | Proposes a graph convolutional network framework which models disease dependencies | C,X C,X
Zhou et al. (2019) | Compares several models for the detection of cardiomegaly | CM C
Bougias et al. (2020) | Tests four off-the-shelf networks for prediction of cardiomegaly | CM PR
Brestel et al. (2018) | Inception v3 |
(2020) | Model pre-trained with public data and fine-tuned for pneumothorax detection | PT PR
Moradi et al. (2019a) | DenseNet-121 used to detect CXRs with acquisition-based defects | Q C
Takaki et al. (2020) | GoogleNet combined with rule-based approach to determine the image quality | Q PR
Pan et al. (2019a) | Detects abnormal CXRs using several models. Evaluates on independent private data | T C
Moradi et al. | |
 | Evaluates assisting clinicians with an AI based system to improve diagnosis of TB | TB PR
Heo et al. (2019) | Various architectures, inclusion of patient demographics in model considered | TB PR
Kim et al. (2018) | Addresses preservation of learned data, application to TB detection using ResNet-21 | TB PR
Gozes and Greenspan (2019) | Pre-training using CXR pathology and metadata labels, application to TB detection | TB S
Rajaraman and Antani (2020) | Compares various models using various pretraining and ensembling strategies | TB S
Lakhani (2017) | Evaluates models on detecting the position of feeding tube in abdominal and CXRs | TU PR
Mitra et al. (2020) | Comparison of seven architectures and ensembling for detection of nine pathologies | X X
Pham et al. (2020) | A method to incorporate label dependencies and uncertainty data during classification | X X
Rajan et al. (2021) | Proposes self-training and student-teacher model for sample efficiency | X X
Calli et al. (2019) | Analyses the effect of label noise in training and test datasets | Z C
Deshpande et al. (2020) | Labels 6 different foreign object types and detects using various architectures | Z M
Lu et al. (2019) | Evaluates the use of CXRs to predict long term mortality using Inception-v4 | Z PL
Zhang et al. (2021b) | Low-res segmentation is used to crop high-res lung areas and predict pneumoconiosis | Z PR
Devnath et al. (2021) | Pneumoconiosis prediction with DenseNet-121 and SVMs applied to extracted features | Z PR
Liu et al. (2017) | Detection of coronary artery calcification using various CNN architectures | Z PR
Hirata et al. (2021) | ResNet-50 for detection of the presence of elevated pulmonary arterial wedge pressure | Z PR
Kusunose et al. (2020) | A network is designed to identify subjects with elevated pulmonary artery pressure | Z PR,RP

The most commonly studied image-level prediction task is predicting the labels of the ChestX-ray14 dataset (31 studies). For example, Baltruschat et al. (2019a) compare the performance of various approaches to classify the 14 disease labels provided by the ChestX-ray14 dataset. Rajpurkar et al. (2018) compare the performance of an ensemble of deep learning models to board-certified and resident radiologists, showing that their models achieve a performance comparable to expert observers in most of the 14 labels provided by ChestX-ray14. Following this, pneumonia is the second most studied subject (26 studies).
Of the 26 studies that worked with pneumonia, 12 studied pediatric chest X-rays and 11 of those used the Ped-Pneumonia dataset for training and evaluation (Rajaraman et al., 2018a; Yue et al., 2020; Liang and Zheng, 2020; Behzadikhormouji et al., 2020; Elshennawy and Ibrahim, 2020; Ureta et al., 2020; Mittal et al., 2020; Shah et al., 2020; Qu et al., 2020; Ferreira et al., 2020; Anand et al., 2020). Lakhani and Sundaram (2017). Rajpurkar et al. (2020) study the performance of a deep learning model and how the assistance of this model improves radiologist performance. This study in particular evaluates the use of extra clinical information such as age, white blood cell count, patient temperature and oxygen saturation to assist the deep learning model. Diagnosis or evaluation of COVID-19 from CXR is another topic that has attracted a lot of interest from researchers (17 studies). For example, Cohen et al. (2020a) predict the disease severity; similarly, Li et al. (2020a) predict the disease progression by comparing an exam with the previous exams of the patient, and Tartaglione et al. (2020) detect COVID-19 using a very limited amount of data. Other than these most common tasks, there are many studies using deep learning to make Image-level Predictions from CXRs. Other commonly utilized labels are illustrated in Figure 4 and listed in Table 2.

(Gozes and Greenspan, 2019; Baltruschat et al., 2019a). More sophisticated pre-processing steps to improve model performance include bone suppression (Baltruschat et al., 2019b; Zhou et al., 2020b) and lung cropping. Some studies bring methodological novelty by making use of methods that are known to improve model performance elsewhere. For example, it is known that an ensemble of many models improves performance compared to a single model (Dietterich, 2000). Some studies that make use of this method are Rajpurkar et al. (2018); Rajaraman et al. (2019a); Rajaraman and Antani (2020); Zhang et al.
(2021c). Attention mining (or object-region mining, attention-based) models are also found in the literature (Wei et al., 2018). Those models aim to improve performance and add localization capabilities to an image-level prediction model. Some studies making use of attention mining models are Cai et al. (2018); Saednia et al. (2020). Multiple-instance learning (multi-instance learning or MIL) is another method that is used to add localization capabilities to image-level prediction models. MIL breaks the input image into smaller parts (instances), makes individual predictions relating to those instances and combines this information to make a prediction for the whole image. Some studies that make use of MIL are Crosby et al. (2020c); Schwab et al. (2020). Other topics within the literature include model uncertainty (Ul Abideen et al., 2020; Ghesu et al., 2019), quality of the CXR (McManigle et al., 2020; Moradi et al., 2019a; Takaki et al., 2020) and defence against adversarial attack (Anand et al., 2020; Xue et al., 2019).

The different properties of datasets are also utilized to improve model capabilities or performance. Many of the public datasets make use of labels that are not mutually exclusive. This has resulted in a number of papers addressing the dependencies among abnormality labels (Pham et al., 2020; Chen et al., 2020b; Chakravarty et al., 2020). Since many of the labels are common between datasets from different institutes, there has been investigation of the issues related to domain and/or label shift in images from different sources (Luo et al., 2020; Cohen et al., 2020b). The effect of dataset sizes is evaluated by Dunnmon et al. (2019). Semi-supervised learning methods combine a small set of labeled and a large set of unlabeled data to train a model (Gyawali et al., 2019, 2020; Wang et al., 2019; Unnikrishnan et al., 2020).
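The multiple-instance learning idea described above can be illustrated with a minimal sketch. The NumPy code below splits an image into non-overlapping patches (instances), scores each patch, and takes the maximum instance score as the image-level prediction, so that the per-patch scores double as a coarse localization map. The function names, the patch size, the max-pooling aggregation rule and `toy_instance_scorer` (a hypothetical stand-in for a trained CNN) are illustrative assumptions and are not taken from any of the cited studies.

```python
import numpy as np


def split_into_patches(image, patch):
    """Split a 2-D image into non-overlapping patch x patch instances."""
    h, w = image.shape
    rows, cols = h // patch, w // patch
    cropped = image[:rows * patch, :cols * patch]  # drop any ragged border
    patches = (cropped.reshape(rows, patch, cols, patch)
                      .swapaxes(1, 2)
                      .reshape(rows * cols, patch, patch))
    return patches, (rows, cols)


def toy_instance_scorer(patches):
    """Hypothetical instance classifier: mean intensity squashed to (0, 1).

    A real system would use a CNN here; this is only a placeholder.
    """
    means = patches.mean(axis=(1, 2))
    return 1.0 / (1.0 + np.exp(-(means - 0.5) * 10.0))


def mil_predict(image, patch=32, scorer=toy_instance_scorer):
    """Return (image-level probability, per-patch score grid)."""
    patches, grid = split_into_patches(image, patch)
    instance_probs = scorer(patches)
    image_prob = instance_probs.max()       # max pooling over instances
    heatmap = instance_probs.reshape(grid)  # coarse localization map
    return image_prob, heatmap
```

Max pooling is only one possible aggregation rule (an image is positive if any instance is positive); mean or noisy-OR pooling could be substituted by changing a single line.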
Most of the studies working on image-level prediction tasks deal with frontal CXR images. The importance of lateral chest X-rays and models that can deal with multiple views are evaluated in Bertrand

Segmentation is one of the most commonly studied subjects in CXR analysis (58 papers) and includes literature focused on the identification of anatomy, foreign objects or abnormalities. The segmentation literature reviewed for this work is detailed fully in Table 3. Anatomical segmentation of the heart, lungs, clavicles or ribs on chest radiographs is a core part of many computer aided detection (CAD) pipelines. It is typically used as an initial step of such pipelines to define the region of interest for subsequent image analysis tasks to improve performance and efficiency (Baltruschat et al., 2019b; Wang et al., 2020e; Rajaraman et al., 2019b; Heo et al., 2019; Liu et al., 2019; Mansoor et al., 2016). Further, the segmentation itself can be useful to quantify clinical parameters based on shape or area measurements. For example, cardiothoracic ratio, a clinically used measurement to assess heart enlargement (cardiomegaly), can be directly calculated from heart and lung segmentations (Sogancioglu et al., 2020). Organ segmentation has, for these reasons, become one of the most commonly studied subjects among CXR segmentation tasks, as seen in Figure 5. Another application found in the CXR literature is foreign object segmentation (e.g., catheters, tubes, lines), for which high performance levels have been reported using deep learning (Frid-Adar et al., 2019; Sullivan et al., 2020).

Table 3 (partial). Study | Description | Codes (label and dataset abbreviations as extracted)
 | ResNet-50 based architecture with segmentation and classification branches | PT SI
Tolkachev et al. (2020) | Investigates U-Net based models with various backbone encoders for pneumothorax | PT SI
Groza and Kuzin (2020) | Ensemble of three LinkNet based networks and with multi-step postprocessing | PT SI
Xue et al. (2018d) | Cascaded network with Faster R-CNN and U-Net for aortic knuckle | Z J
Yi et al. (2019a) | Multi-scale U-Net based model with recurrent module for foreign objects | Z O
Lee et al. (2018) | Two FCN to segment peripherally inserted central catheter line and its tip | Z PR
Pan et al. (2019b) | Two Mask R-CNN to segment the spine and vertebral bodies and calculate the Cobb angle | Z PR

Interestingly, only a small number of works addressed segmentation of abnormalities. Hurt et al. (2020) focused on segmentation of pneumonia, and Tolkachev et al. (2020) developed a method to segment pneumothorax. Both of these works used recently published challenge datasets (hosted by Kaggle), namely RSNA-Pneumonia and SIIM-ACR. In general, the determination of abnormal locations on CXR is dominated by methods which address this as a localization task (i.e. via bounding-box type annotations) rather than exact delineation of abnormalities through segmentation. This is likely to be attributable to the difficulty of precise annotation on a projection image and to the high annotation cost for precise segmentations.

A small number of works tackled the segmentation task using a patch-based CNN, which is trained to classify the center pixel of each patch as foreground or background by means of a sliding-window approach ( et al., 2019). However, this approach is generally considered inefficient for segmentation and most works use fully convolutional networks (FCN) (Shelhamer et al., 2017), which can take larger, arbitrarily sized images as input and produce a similarly sized per-pixel likelihood map in a single forward pass. In particular, the U-Net architecture (Ronneberger et al., 2015), a type of FCN, dominates the field, with 50% of segmentation works in the literature (29/58) employing it or some similar variant. Successful applications were built with this architecture to segment organs (Novikov et al., 2018; Furutani et al., 2019; Kitahara et al., 2019), pneumonia and foreign objects (Frid-Adar et al., 2019). For example, Novikov et al.
(2018) compared three U-Net variant architectures for multi-class segmentation of the heart, clavicles and lungs on the JSRT dataset. Using regularization to prevent over-fitting and a weighted cross entropy loss to compensate for class imbalance, they outperformed the human observer at heart and lung segmentation. This result was in line with other works (Bortsova et al., 2019; Arsalan et al., 2020) employing FCN-type architectures, which also achieved very high performance levels on this dataset.

One commonly encountered challenge is that many algorithms produce noisy segmentation maps. In order to tackle this, several works employed post-processing techniques. Lee et al. (2018) used a probabilistic Hough line transform algorithm to remove false positives and produce a smoother segmentation of peripherally inserted central catheters (PICC). Groza and Kuzin (2020) used a heuristic approach to average cross-fold predictions with an optimized binarization threshold and a dilation technique for pneumothorax segmentation. Some authors proposed to learn post-processing by training an independent network, inputting segmentation predictions for refinement, rather than using conventional methods. For example, Larrazabal et al. (2020) used denoising autoencoders, trained to produce anatomically plausible segmentations from the initial predictions. Similarly, Souza et al. (2019) used an FCN to refine segmentation predictions. The final segmentation was achieved by combining the initial and reconstructed segmentation results.

A number of researchers used a multi-stage training strategy, where network predictions are refined in several steps during training (Wessel et al., 2019; Souza et al., 2019; Xue et al., 2018d, 2020). For example, Xue et al. (2018d) employed Faster R-CNN to produce coarse segmentation results, which were then used to crop the images to a region of interest, which was provided to a U-Net trained to predict the final segmentation result.
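The coarse-to-fine pattern described above (a first stage proposes a region, a second stage segments only the cropped region) can be sketched in a few lines of NumPy. This is a generic illustration under stated assumptions, not the pipeline of any cited paper: `coarse_model` and `fine_model` are hypothetical callables that map a 2-D image to a per-pixel probability map of the same shape, and the margin and the 0.5 threshold are arbitrary choices.

```python
import numpy as np


def bounding_box(mask, margin=0):
    """Return (r0, r1, c0, c1) enclosing the foreground of a binary mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        # Nothing detected by the coarse stage: fall back to the full image.
        return 0, mask.shape[0], 0, mask.shape[1]
    r0 = max(rows[0] - margin, 0)
    r1 = min(rows[-1] + 1 + margin, mask.shape[0])
    c0 = max(cols[0] - margin, 0)
    c1 = min(cols[-1] + 1 + margin, mask.shape[1])
    return r0, r1, c0, c1


def coarse_to_fine(image, coarse_model, fine_model, margin=8):
    """Crop to the coarse prediction, refine on the crop, paste back."""
    coarse_mask = coarse_model(image) > 0.5
    r0, r1, c0, c1 = bounding_box(coarse_mask, margin)
    fine_mask = fine_model(image[r0:r1, c0:c1]) > 0.5
    full_mask = np.zeros(image.shape, dtype=bool)
    full_mask[r0:r1, c0:c1] = fine_mask  # everything outside the crop is background
    return full_mask, (r0, r1, c0, c1)
```

The margin gives the second stage some context around the coarse prediction, so that structures clipped by an imperfect first stage can still be recovered.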
Similarly, Souza et al. (2019) employed two networks, where the second network received the predictions of the first to refine the segmentation results. Wessel et al. (2019) trained separate networks, based on Mask R-CNN, for segmentation of each rib in chest radiographs. The predicted segmentation results from the rib above were fed to each network as an additional input.

Although most of the works in the literature harnessed FCN architectures, a few authors employed recurrent neural networks (RNN) for segmentation tasks (Yi et al., 2019a; Milletari et al., 2018; Mathai et al., 2019) and report good performance. Milletari et al. (2018) proposed a novel architecture where the decoding component was a long short term memory (LSTM) architecture to obtain multi-scale feature integration. The proposed approach achieved a Dice score of 0.97 for lung segmentation on the Montgomery dataset. Similarly, Yi et al. (2019a) developed a scale RNN, a network based on an encoder and decoder architecture with recurrent modules, for segmentation of catheters and tubes on pediatric chest X-rays.

The high cost of obtaining segmentation annotations motivates the development of segmentation systems which incorporate weak labels or simulated datasets with the aim of reducing annotation costs (Frid-Adar et al., 2019; Ouyang et al., 2019; Lu et al., 2020b; Yi et al., 2019a). Several works addressed this using weakly supervised learning approaches (Ouyang et al., 2019). Lu et al. (2020b) proposed a graph convolutional network based architecture which required only one labeled image and leveraged large amounts of unlabeled data (one-shot learning) through three newly introduced contour-based loss functions. Ouyang et al. (2019) proposed a pneumothorax segmentation framework which incorporated both images with pixel level annotations and weak image-level annotations. The authors trained an image classification network, ResNet-101, with weakly labeled data to derive attention maps.
These attention maps were then used to train a segmentation model, Tiramisu, together with pixel level annotations.

Localization refers to the identification of a region of interest using a bounding box or point coordinates rather than a more specific pixel segmentation. In this section we discuss only the CXR localization literature which provides a quantitative evaluation of this task. It should be noted that there are many other works which train networks for an image-level prediction task and provide some examples of heatmaps (e.g., saliency map or GradCAM) to suggest which region of the image determines the label. While this may be considered as a form of localization, these heatmaps are rarely quantitatively evaluated and such works are not included here. Table 4 details all the reviewed studies where localization was a primary focus of the work. The majority of CXR analysis papers performing localization focus on identifying abnormalities rather than objects (e.g., catheter) or anatomy (e.g., ribs). Localization of nodules, tuberculosis and pneumonia are commonly studied applications in the literature, as illustrated in Figure 6.

In recent years, a variety of specific architectures (e.g., YOLO, Mask R-CNN, Faster R-CNN) have been designed in computer vision research with the aim of developing more accurate and faster algorithms for localization tasks. Such state-of-the-art architectures have been rapidly adapted for CXR analysis and shown to achieve high-level performance. For example, Park et al. (2019) demonstrated that the (original) YOLO architecture was successful at identifying the location of pneumothorax on chest radiographs. The model was evaluated on an external dataset with CXRs from 1,319 patients which were obtained after percutaneous transthoracic needle biopsy (PTNB) for pulmonary lesions; it achieved an AUC of 0.898 and 0.905 on 3-h and 1-day follow-up chest radiographs, respectively.
Similarly, other studies (Schultheiss et al., 2020; Takemiya et al., 2019; Kim et al., 2019) harnessed architectures like RetinaNet, Mask R-CNN and RCNN for localization of nodules and masses. Kim et al. (2020) trained RetinaNet and Mask R-CNN for detection of nodules and masses and investigated the optimal input size. The authors showed that, using a square image with 896 pixels as the edge length, RetinaNet and Mask R-CNN achieved FROC of 0.906 and 0.869, respectively.

A number of papers adapted classification architectures (e.g., ResNet, DenseNet) to directly regress landmark locations for CXR localization tasks (Hwang et al., 2019b; Cha et al., 2019). One common way of tackling this is to adapt the networks to produce heatmap predictions and draw boxes around the areas that created the highest signals. For example, Hwang et al. (2019b) tailored a DenseNet-based classifier to produce heatmap predictions for each of four types of CXR abnormalities. The network was trained with pixel-wise cross entropy between the predictions and annotations. Similarly, Cha et al. (2019) adapted ResNet-50 and ResNet-101 architectures for localization of nodules and masses on CXR. Other studies (Xue et al., 2018c; Li et al., 2020c) tackled this problem using patch-based approaches, commonly referred to as multiple instance learning, creating patches from chest X-rays and evaluating these for the presence of abnormalities.

One challenge in building robust deep learning localization systems is to collect large annotated datasets. Collecting such annotations is time-consuming and costly, which has motivated researchers to build systems incorporating weaker labels during training. This research area is referred to as weakly supervised learning, and has been investigated by numerous works (Hwang et al., 2019b; Nam et al., 2019; Pesce et al., 2019; Taghanaki et al., 2019b) for localization of a variety of abnormalities in CXR.
Most of the works (Hwang et al., 2019b; Pesce et al., 2019; Nam et al., 2019) leveraged weak image-level labels by adapting a CNN architecture to create two branches for localization (heatmap predictions) and classification. A hybrid loss function was used, combining localization and classification losses, which enabled training of the networks using images without localization annotations.

There are 35 studies identified in this work whose main focus is Image Generation, as detailed in Table 5. Image generation techniques have been harnessed for a wide variety of purposes including data augmentation (Salehinejad et al., 2019), visualization (Bigolin Lanfredi et al., 2019; Seah et al., 2019), abnormality detection through reconstruction (Tang et al., 2019c; Wolleb et al., 2020), domain adaptation (Zhang et al., 2018) or image enhancement techniques. The generative adversarial network (GAN) (Goodfellow et al., 2014; Yi et al., 2019b) has become the method of choice for image generation in CXR and over 50% of the works reviewed here used GAN-based models.

A number of works focused on CXR generation to augment training datasets (Moradi et al., 2018b; Zhang et al., 2019a; Salehinejad et al., 2019) by using unconditional GANs which synthesize images from random noise. For example, Salehinejad et al. (2019) trained a DCGAN model, similar to Moradi et al. (2018b), independently for each class, to generate chest radiographs with five different abnormalities. The authors demonstrated that this augmentation process improved the abnormality classification performance of DCNN classifiers (ResNet, GoogleNet, AlexNet) by balancing the dataset classes.

(Section 4.3). Tasks: IC=Interval Change, IL=Image-level Predictions, PR=Preprocessing, RP=Report Parsing, SE=Segmentation, WS=Weak Supervision. Bold font in tasks implies that this additional task is central to the work and the study also appears in another table in this paper. Labels: C=ChestX-Ray14, CM=Cardiomegaly, CV=COVID, L=Lung, LC=Lung Cancer, LO=Lesion or Opacity, ND=Nodule, PE=Effusion, PM=Pneumonia, PT=Pneumothorax, R=Rib, T=Triage/Abnormal, TB=Tuberculosis, TU=Catheter or Tube, X=CheXpert, Z=Other. Datasets: C=ChestX-ray14, CC=COVID-CXR, J=JSRT+SCR, M=MIMIC-CXR, O=Open-i, PP=Ped-pneumonia, PR=Private, RP=RSNA-Pneumonia, S=Shenzen, X=CheXpert.

Another work (Zhang et al., 2019a) proposed a novel GAN architecture to improve the quality of generated CXR by forcing the generator to learn different image representations. The authors proposed SkrGAN, where a sketch prior constraint is introduced by decomposing the generator into two modules for generating a sketched structural representation and the CXR image, respectively.

Abnormality detection is another task which has been addressed through a combination of image generation and one-class learning methods (Tang et al., 2019c; Mao et al., 2020). The underlying idea of these methods is that a generative model trained to reconstruct healthy images will have a high reconstruction error if abnormal images are input at test time, allowing them to be identified. Tang et al. (2019c) harnessed GANs and employed a U-Net type autoencoder to reconstruct images (as the generator), and a CNN-based discriminator and encoder. The discriminator received both reconstructed images and real images to provide a supervisory signal for realistic reconstruction through adversarial training. Similarly, Mao et al. (2020) proposed an autoencoder for abnormality detection which was trained only with healthy images. In this case the autoencoder was tailored to not only reconstruct healthy images but also produce uncertainty predictions. By leveraging uncertainty, the authors proposed a normalized reconstruction error to distinguish abnormal CXR images from normal ones.

The most widely studied subject in the image generation literature is image enhancement.
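As an aside on the one-class abnormality detection idea described above, the principle of scoring by reconstruction error can be sketched without any deep learning machinery. The NumPy example below deliberately swaps the learned autoencoder for a PCA projection (a linear autoencoder), fits it on flattened "healthy" samples only, and uses the mean squared reconstruction error as the anomaly score. The class name, function names and the percentile-based threshold are assumptions made for illustration; this is not the architecture of Tang et al. (2019c) or Mao et al. (2020), and it omits the uncertainty normalization used by the latter.

```python
import numpy as np


class LinearAutoencoder:
    """PCA-based stand-in for a learned autoencoder (assumption for brevity)."""

    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, healthy):
        """Fit on healthy samples only; rows are flattened images."""
        self.mean_ = healthy.mean(axis=0)
        _, _, vt = np.linalg.svd(healthy - self.mean_, full_matrices=False)
        self.components_ = vt[:self.n_components]
        return self

    def reconstruct(self, x):
        codes = (x - self.mean_) @ self.components_.T    # encode
        return codes @ self.components_ + self.mean_     # decode


def anomaly_scores(model, x):
    """Mean squared reconstruction error per sample (higher = more abnormal)."""
    return ((x - model.reconstruct(x)) ** 2).mean(axis=1)


def flag_abnormal(model, x, healthy_reference, percentile=95.0):
    """Flag samples whose score exceeds a percentile of healthy scores."""
    threshold = np.percentile(anomaly_scores(model, healthy_reference), percentile)
    return anomaly_scores(model, x) > threshold
```

In practice the threshold would be chosen on a held-out set of healthy images rather than on the training data, since training scores tend to be optimistic.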
Several researchers investigated bone suppression (Liu et al., 2020a; Matsubara et al., 2020; Zarshenas et al., 2019; Gozes and Greenspan, 2020; Lin et al., 2020; Zhou et al., 2020b) and lung enhancement (Gozes and Greenspan, 2020) techniques to improve image interpretability. A number of works (Liu et al., 2020a; Zhou et al., 2020b) employed GANs to generate bone-suppressed images. For example, Liu et al. (2020a) employed GANs and leveraged additional input to the generator to guide the dual-energy subtraction (DES) soft-tissue image generation process. In this study, bones, edges and clavicles were first segmented by a CNN model, and the resulting edge maps were fed to the generator with the original CXR image as prior knowledge. Building a deep learning model for bone-suppressed CXR generation requires paired dual energy (DE) imaging, which is not always available in abundance. Several other studies (Gozes and Greenspan, 2020) addressed this by leveraging digitally reconstructed radiographs (DRR) for enhancing the lungs and bones in CXR. For instance, one of these studies trained an autoencoder for generating CXR with bone suppression and lung enhancement, and the knowledge obtained from DRR images was integrated through the encoder.

Table 5 (partial). Study | Description | Codes (task, label and dataset abbreviations as extracted)
Eslami et al. (2020) | Conditional GANs for multi-class segmentation of heart, clavicles and lungs | SE CL,H,L J
Onodera et al. (2020) | Processing method to produce scatter-corrected CXRs and segments masses with U-Net | SE LO SM
 | Combines classification loss and autoencoder reconstruction loss | IL,SE T J, MO,O,S
Seah et al. (2019) | Wasserstein GAN to permute diseased radiographs to appear healthy | IL,LC Z PR
Wolleb et al. (2020) | Novel GAN model trained with healthy and abnormal CXR to predict difference map | IL PE SM, X
Tang et al. (2019c) | GANs with U-Net autoencoder and CNN discriminator and encoder for one-class learning | IL T C
Mao et al. (2020) | Autoencoder uses uncertainty for reconstruction error in one-class learning setting | IL T PP,RP
Mahapatra and Ge (2019) | Conditional GAN based DA for image registration using segmentation guidance | DA,RE,SE L C
Madani et al. (2018) | Adversarial based method adapting new domains for abnormality classification | DA,IL CM PL
Umehara et al. (2017) | Proposes a patch-based CNN super resolution method | SR Z J
Uzunova et al. (2019) | Generates high resolution CXRs using multi-scale, patch based GANs | SR Z O
Zhang et al. (2019a) | Novel GAN model with sketch guidance module for high resolution CXR generation | SR Z PP
Lin et al. (2020) | AutoEncoder for bone suppression and segmentation with statistical similarity losses | SE,PR BS J
Dong et al. (2019) | Uses neural architecture search to find a discriminator network for GANs | SE H,L J, PR
Taghanaki et al. (2019a) | Proposes an iterative gradient based input preprocessing for improved performance | SE L S
Fang et al. (2020) | Learns transformations to register two CXRs, uses the difference for interval change | RE,IC Z PR
Yang et al. (2017) | Generates bone and soft tissue (dual energy) images from CXRs | PR BS PR
Zarshenas et al. (2019) | Proposes a CNN with multi-resolution decomposition for bone suppression images | PR BS PR
Gozes and Greenspan (2020) | U-Net for bone generation with CT projection images, used for CXR enhancement | PR BS SM
Lee et al. (2019) | U-Net based network to generate dual energy CXR | PR Z PR
Liu et al. (2020a) | GAN integrates edges of ribs and clavicles to guide DES-like images generation | PR Z PR
Xing et al. (2019) | Generates diseased CXRs, evaluates their realness with radiologists and trains models | LC C C
 | Novel CycleGAN model to decompose CXR images incorporating CT projection images | IL C C,PR,SM
Salehinejad et al. (2019) | Uses DCGAN model to generate CXR with abnormalities for data augmentation | IL CM,E,PE,PT PR
Albarqouni et al. (2017) | U-Net based architecture to decompose CXR structures, application to TB detection | IL TB PR
Moradi et al. (2018b) | Two DCGAN trained with normal and abnormal images for data augmentation | IL Z PL
Bigolin Lanfredi et al. (2019) | Novel conditional GAN using lung function test results to visualize COPD progression | IL Z PR
Zarei et al. (2021) | Conditional GAN and two variational autoencoders designed for CXR generation | PR
Gomi et al. (2020) | Novel reconstruction algorithm for CXR enhancement | PR
Zhou et al. (2020b) | Bone shadow suppression using conditional GANs with dilated U-Net variant | BS J
Matsubara et al. (2020) | Generates CXRs from CT to train CNN for bone suppression | BS PR
Zunair and Hamza (2021) | Generates COVID-19 CXR images to improve network training and performance | CV CC, RP
Bayat et al. (2020) | 2D-to-3D encoder-decoder network for generating 3D spine models from CXR studies | Z PR
Bigolin Lanfredi et al. (2020) | Generates normal from abnormal CXRs, uses the deformations as disease evidence | Z PR

Most of the papers surveyed in this work train and test their method on data from the same domain. This finding is in line with previously reported studies (Kim et al., 2019; Prevedello et al., 2019) and highlights an important concern: most of the performance levels reported in the literature might not generalize well to data from other domains (Zech et al., 2018). Several studies (Zech et al., 2018; Cohen et al., 2020b) demonstrated that there was a significant drop in performance when deep learning systems were tested on datasets outside their training domain for a variety of CXR applications. For example, Yao et al. (2019) investigated the performance of a DenseNet model for abnormality classification on CXR images using 10 diverse datasets varied by their location and patient distributions. The authors empirically demonstrated that there was a substantial drop in performance when a model was trained on a single dataset and tested on the other domains. Zech et al.
(2018) observed a similar finding for pneumonia detection on chest radiographs. Domain adaptation (DA) methods investigate how to improve the performance of a model on a dataset from a different domain than the training set. In CXR analysis, DA methods have been investigated in three main settings: adaptation of CXR images acquired from different hardware, adaptation of pediatric to adult CXR, and adaptation of digitally reconstructed radiographs (generated by average intensity projections from CT) to real CXR images. All domain adaptation studies, and studies on generalization, reviewed in this work are detailed in Table 6.

Table 6:
Dong et al. (2018) | Adversarial training of lung and heart segmentation for DA | SE CM J,PR
Zhang et al. (2018) | CycleGAN guided by a segmentation module to convert CXR to CT projection images | SE H,L,Z PR
Chen et al. (2018a) | CycleGAN based DA model with semantic aware loss for lung segmentation | SE L MO
Oliveira et al. (2020a) | Conditional GANs based DA for bone segmentation | SE R SM
Lenga et al. (2020) | Continual learning methods to classify data from new domains | IL C,M C,M
Tang et al. (2019a) | CycleGAN model to adapt adult to pediatric CXR for pneumonia classification | IL PM PP,RP
Mahapatra and Ge (2019) | Conditional GAN based DA for image registration using segmentation guidance | IG,RE,SE L C
Madani et al. (2018) | Adversarial based method adapting new domains for abnormality classification | IG,IL CM PL
Zech et al. (2018) | Assessment of generalization to data from different institutes | IL PM C,O
Sathitratanacheewin et al. (2020) | Demonstrates the effect of training and test on data from different domains | IL TB S

Most of the research on DA for CXR analysis harnessed adversarial-based DA methods, which either use generative models (e.g., CycleGANs) or non-generative models to adapt to new domains using a variety of different approaches. For example, Dong et al. (2018) investigated an unsupervised domain adaptation based on adversarial training for lung and heart segmentation. In this approach, a ResNet discriminator network learned to discriminate between segmentation predictions (heart and lung) from the target domain and reference standard segmentations from the source domain. This approach forced the FCN-based segmentation network to learn domain invariant features and produce realistic segmentation maps. A number of works (Chen et al., 2018a; Zhang et al., 2018; Oliveira and dos Santos, 2018) addressed unsupervised DA using CycleGAN-based models to transform source images to resemble those from the target domain. For example, Zhang et al. (2018) used a CycleGAN-based architecture to adapt CXR images to digitally reconstructed radiographs (DRR) (generated from CT scans), for anatomy segmentation in CXR. A CycleGAN-based model was employed to convert the CXR image appearance and a U-Net variant architecture to simultaneously segment organs of interest. Similarly, CycleGAN-based models were adapted to transfer DRR images to resemble CXR images for bone segmentation (Oliveira et al., 2020a) and to transform adult CXR to pediatric CXR for pneumonia classification (Tang et al., 2019c).

Unlike most of the studies, which utilized DA methods in an unsupervised setting, a few studies considered supervised and semi-supervised approaches to adapt to the target domain. Oliveira et al. (2020b) employed a MUNIT-based architecture (Huang et al., 2018) to map target images to resemble source images, subsequently feeding the transformed images to the segmentation model. The authors investigated both unsupervised and semi-supervised approaches in this work, where some labels from the target domain were available. Another work by Lenga et al.
(2020) studied several recently proposed continual learning approaches, namely joint training, elastic weight consolidation and learning without forgetting, to improve the performance on a target domain and to effectively mitigate catastrophic forgetting for the source domain. The authors evaluated these methods on two publicly available datasets, ChestX-ray14 and MIMIC-CXR, for a multi-class abnormality classification task and demonstrated that joint training achieved the best performance.

In this section we review articles with a primary application that does not fit into any of the categories detailed in Sections 4.1 to 4.5 (14 studies). These works are detailed fully in Table 7. Image retrieval is a task investigated by a number of authors (Anavi et al., 2015, 2016; Conjeti et al., 2017; Chen et al., 2018c; Silva et al., 2020; Owais et al., 2020; Haq et al., 2021). The aim of image retrieval tools is to search an image archive to find cases similar to a particular index image. Such algorithms are envisaged as a tool for radiologists in their daily workflow. Chen et al. (2018c) proposed a ranked feature extraction and hashing model, while Silva et al. (2020) proposed to use saliency maps as a similarity measure. Another task that did not belong to previously defined categories is out-of-distribution detection. Studies working on this (Márquez-Neila and Sznitman, 2019; Çallı et al., 2019; Bozorgtabar et al., 2020) aim to verify whether a test sample belongs to the distribution of the training dataset, as model performance is otherwise expected to be sub-optimal. Çallı et al. (2019) propose computing training dataset statistics at different layers of a deep learning model and using the Mahalanobis distance to measure how far a sample lies from the training dataset. Bozorgtabar et al. (2020) approach the problem differently and train an unsupervised autoencoder.
Later they use the feature encodings extracted from CXRs to define a database of known encodings and compare new samples to this database. Report generation is another task which has attracted interest in deep learning for CXR (Yuan et al., 2019; Syeda-Mahmood et al., 2020; Xue et al., 2018a). These studies aim to partially automate the radiology workflow by evaluating the chest X-ray and producing a text radiology report. For example, Syeda-Mahmood et al. (2020) first determine the findings to be reported and then make use of a large dataset of existing reports to find a similar case. This case report is then customized to produce the final output. One other task of interest is image registration (Mansilla et al., 2020). This task aims to find the geometric transformation to convert a CXR so that it anatomically aligns with another CXR image or a statistically defined shape. The clinical goal of this task is typically to illustrate interval change between two images. Detecting new findings, tracking the course of a disease, or evaluating the efficacy of a treatment are among the many uses of image registration (Viergever et al., 2016). To that end, Mansilla et al. (2020) aim to create an anatomically plausible registration by using the heart and lung segmentations to guide the registration process.

Table 7:
Márquez-Neila and Sznitman (2019) | Proposes a method to reject out-of-distribution images during test time | OD,IL Z C
Bozorgtabar et al. (2020) | Proposes to detect anomalies based on a dataset of autoencoder features | OD Q,T C
Çallı et al. (2019) | Mahalanobis distance on network layers to detect out-of-distribution samples | OD Z C
Anavi et al. (2015) | Compares the extracted feature and classification similarities for ranking | IR PR
Haq et al. (2021) | Uses extracted features to cluster similarly labeled CXRs across datasets | IR C,X C,X
Chen et al. (2018c) | Proposes a learnable hash to retrieve CXRs with similar pathologies | IR Z C
Conjeti et al. (2017) | Residual network to retrieve images with similar abnormalities | IR Z O
Anavi et al. (2016) | Combines features extracted from CXRs and metadata for image retrieval | IR Z PR
Silva et al. (2020) | Proposes to use the saliency maps as a similarity measure for image retrieval | IR Z X

Computer-aided analysis of CXR images has been researched for many years, and in fact CXR was one of the first modalities for which a commercial product for automatic analysis became available in 2008. In spite of this promising start, and of the advances in the field achieved by deep learning, translation to clinical practice, even as an assistant to the reader, is relatively slow. There are a variety of legal and ethical considerations which may partly account for this (Recht et al., 2020; Strohm et al., 2020); however, there is growing acceptance that artificial intelligence (AI) products have a place in the radiological workflow and attempts are underway to understand and address the issues to be overcome. In this section we examine the currently available commercial products for CXR analysis. An up-to-date list of commercial products for medical image analysis (Grand-challenge, 2021; van Leeuwen et al., 2021) was searched for products applicable to chest X-ray. One product was excluded as it is not specifically a CXR diagnostic tool, but a texture analysis product for many modalities. The 21 remaining products are listed in Table 8. A number of these products have already been evaluated in peer-reviewed publications, as shown in Table 8, and it is beyond the scope of this work to make an assessment of their performance. All of the listed products are CE marked (Europe) and/or FDA cleared (United States) and are thus available for clinical use (Grand-challenge, 2021; van Leeuwen et al., 2021). The commercial products include applications for a wide range of abnormalities, with 6 of them reporting results for more than 5 (and up to 30) different labels.
The most commonly addressed task is pneumothorax identification (8 products), followed by pleural effusion (7), nodules (6) and tuberculosis (4). In contrast with the literature, which is dominated by image-level prediction algorithms, 17 of 21 products in Table 8 claim to provide localization of one or more abnormalities which they are designed to detect, usually visualized with heatmaps or contouring of abnormalities. Two further products are designed for generation of bone suppression images, one for interval change visualization and one for identification and reporting of healthy images. Products contribute differently to the workflow of the radiologist. Five products focus on detecting acute cases to prioritize the worklist and speed up time to diagnosis. Draft reports are produced by five other products, for either the normal (healthy) cases only or for all cases. The production of draft reports, like workflow prioritization, is aimed at optimizing the speed and efficiency of the radiologist.

In this work we have detailed datasets, literature and commercial products relevant to deep learning in CXR analysis. It is clear that this area of research has thrived on the release of multiple large, public, labeled datasets in recent years, with 209 of 295 publications reviewed here using one or more public datasets in their research. The number of publications in the field has grown consistently as more public data becomes available, as demonstrated in Figure 2. However, although these datasets are extremely valuable, there are multiple caveats to be considered in relation to their use, as described in Section 3. In particular, the caution required in the use of NLP-extracted labels is often overlooked by researchers, especially for the evaluation and comparison of models. For accurate assessment of model performance, the use of 'gold-standard' test data labels is recommended.
These labels can be acquired through expert radiological interpretation of CXRs (preferably with multiple readers) or via associated CT scans, laboratory test results, or other appropriate measurements. Other important factors to be considered when using public data include the image quality (if it has been reduced prior to release, is this a limiting factor for the application?) and the potential overlap between labels. Although a few publications address label dependencies, this is most often overlooked, frequently resulting in the loss of valuable diagnostic information.

While the increased interest in CXR analysis following the release of public datasets is a positive development in the field, a secondary consequence of this readily available labeled data is the appearance of many publications from researchers with limited experience or understanding of deep learning or CXR analysis. The literature reviewed during the preparation for this paper was very variable in quality. A substantial number of the papers included offer limited novel contributions although they are technically sound. Many of these studies report experiments predicting the labels on public datasets using off-the-shelf architectures and without regard to the label inaccuracies and overlap, or the clinical utility of such generic image-level algorithms. A large number of works were excluded for reasons of poor scientific quality (142). In 112 of these the construction of the dataset gave cause for concern, the most common example being that the training dataset was constructed such that images with certain labels came from different data sources, meaning that the images could be easily differentiated by factors other than the label of interest. In particular, a large number of papers (61) combined adult COVID-19 subjects with pediatric (healthy and other-pneumonia) subjects in an attempt to classify COVID-19.
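The dataset-construction concern described above, where all images carrying a given label come from a single data source, can be screened for before any model is trained. The following is a minimal sketch of such a check and is not taken from any of the reviewed papers; the function name `single_source_labels`, the `(source, label)` record layout and the source names are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def single_source_labels(records, min_sources=2):
    """Return labels whose images come from fewer than `min_sources` sources.

    `records` is an iterable of (source, label) pairs, one per image.
    A flagged label can be predicted from the data source alone, so a
    classifier may learn source-specific cues instead of the pathology.
    """
    sources_per_label = defaultdict(Counter)
    for source, label in records:
        sources_per_label[label][source] += 1
    return {
        label: dict(counts)
        for label, counts in sources_per_label.items()
        if len(counts) < min_sources
    }


# Hypothetical toy dataset: COVID-19 images from one adult collection,
# other-pneumonia images from one pediatric collection, and healthy
# images drawn from two different collections.
records = [
    ("adult_covid_collection", "covid"),
    ("adult_covid_collection", "covid"),
    ("pediatric_collection", "other_pneumonia"),
    ("pediatric_collection", "healthy"),
    ("adult_hospital", "healthy"),
]

flagged = single_source_labels(records)
for label, counts in sorted(flagged.items()):
    print(f"label '{label}' comes from a single source: {counts}")
```

A check of this kind only detects the crudest form of the problem. It does not replace evaluation on data from a source that was not used for training.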
Other reasons for exclusion included the presentation of results optimized on a validation set (without a held-out test set), or the inclusion of the same images multiple times in the dataset prior to splitting train and test sets. This latter issue has been exacerbated by the publication of several COVID-19 related datasets which combine data from multiple public sources in one location, and are then themselves combined by authors building deep-learning systems. Such concerns about dataset construction for COVID-19 studies have been discussed in several other works (López-Cabrera et al., 2021; DeGrave et al., 2020; Cruz et al., 2021; Maguolo and Nanni, 2020; Tartaglione et al., 2020) . Although a broad range of off-the-shelf architectures are employed in the literature surveyed for this review, there is little evidence to suggest that one architecture outperforms another for any specific task. Many papers evaluate multiple different architectures for their task but differences between the various architecture results are typically small, proper hyperparameter optimization is not usually performed and statistical significance or data-selection influence are rarely considered. Many such evaluations use inaccurate NLP-extracted labels for evaluation which serves to muddy the waters even further. While it is not possible to suggest an optimal architecture for a specific task, it is observed that ensembles of networks typically perform better than individual models (Dietterich, 2000) . At the time of writing, most of the top-10 submissions from the public challenges (CheXpert (Irvin et al., 2019) , SIIM-ACR (ACR, 2019), and RSNA-Pneumonia (RSNA, 2018)) consist of network ensembles. There is also promise in the development of self-adapting frameworks such as the nnU-Net (Isensee et al., 2021) which has achieved an excellent performance in many medical image segmentation challenges. 
This framework adapts specifically to the task at hand by selecting the optimal choice for a number of steps such as preprocessing, hyperparameter optimization, architecture etc., and it is likely that a similar optimization framework would perform well for classification or localization tasks, including those for CXR images. In spite of the pervasiveness of CXR in clinics worldwide, translation of AI systems for clinical use has been relatively slow. Apart from legal and ethical considerations regarding the use of AI in medical decision making (Recht et al., 2020; Strohm et al., 2020) , a discussion which is outside the scope of this work, there are still a number of technical hurdles where progress can be made towards the goal of clinical translation. Firstly, the generalizability of AI algorithms is an important issue which needs further work. A large majority of papers in this review draw training, validation and test samples from the same dataset. However, it is well known that such models tend to have a weaker performance on datasets from external domains. If access to reliable data from multiple domains remains problematic then domain adaptation or active learning methods could be considered to address the generalization issue. An alternative method to utilize data from multiple hospitals without breaching regulatory and privacy codes is federated learning, whereby an algorithm can be trained using data from multiple remote locations (Sheller et al., 2019) . Further research is required to determine how this type of system will work in clinical practice. A final issue for deep learning researchers to consider is frequently referred to as 'explainable AI'. Systems which produce classification labels without any indication of reasoning raise concerns of trustworthiness for radiologists. 
It is also significantly faster for experts to accept or reject the findings of an AI system if there is some indication of how the finding was reached (e.g., identification of nodule location with a bounding box, identification of cardiac and thoracic diameters for cardiomegaly detection). Every commercial product for detection of abnormality in CXR provides a localization feature to indicate the abnormal location, however the literature is heavily focused on image-level predictions with relatively few publications where localization is evaluated. Beyond the resolution of technical issues, researchers aiming to produce clinically useful systems need to consider the workflow and requirements of the end-user, the radiologist or clinician, more carefully. At present, in the industrialized world, it is expected that an AI system will act, at least initially, as an assistant to (not a replacement for) a radiologist. As a 2D image, the CXR is already relatively quickly interpreted by a radiologist, and so the challenge for AI researchers is to produce systems that will save the radiologist time, prioritize urgent cases or improve the sensitivity/specificity of their findings. Image-level classification for a long list of (somewhat arbitrarily defined) labels is unlikely to be clinically useful. Reviewing such a list of labels and associated probabilities for every CXR would require substantial time and effort, without a proportional improvement in diagnostic accuracy. A simple system with bounding boxes indicating abnormal regions is likely to be more helpful in directing the attention of the radiologist and has the potential to increase sensitivity to subtle findings or in difficult regions with many projected structures. Similarly, a system to quickly identify normal cases has the potential to speed up the workflow as identified by multiple vendors and in the literature (Dyer et al., 2021; Dunnmon et al., 2019; Baltruschat et al., 2020) . 
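The cardiomegaly example mentioned above (showing cardiac and thoracic diameters rather than only a label) can be illustrated with the cardiothoracic ratio, the maximal horizontal cardiac width divided by the maximal horizontal thoracic width. The sketch below is an assumption-laden illustration, not a method from the reviewed literature: it assumes binary heart and lung masks are already available from some segmentation model, approximates the thoracic width by the horizontal extent of the lung fields, and uses hypothetical names (`horizontal_extent`, `cardiothoracic_ratio`) and toy masks.

```python
def horizontal_extent(mask):
    """Width in pixels between the leftmost and rightmost foreground columns.

    `mask` is a list of rows, each a list of 0/1 values.
    Returns 0 for an empty mask.
    """
    cols = [x for row in mask for x, value in enumerate(row) if value]
    if not cols:
        return 0
    return max(cols) - min(cols) + 1


def cardiothoracic_ratio(heart_mask, lung_mask):
    """Ratio of cardiac width to thoracic width, both measured horizontally.

    Assumption: the thoracic width is approximated by the extent of the
    lung fields, which is a simplification of the clinical measurement.
    """
    thoracic = horizontal_extent(lung_mask)
    if thoracic == 0:
        raise ValueError("lung mask is empty; thoracic width is undefined")
    return horizontal_extent(heart_mask) / thoracic


# Toy 6 x 20 masks: lungs span columns 2..17, heart spans columns 6..13.
H, W = 6, 20
lung_mask = [[1 if 2 <= x <= 17 else 0 for x in range(W)] for _ in range(H)]
heart_mask = [
    [1 if 6 <= x <= 13 and y >= 3 else 0 for x in range(W)] for y in range(H)
]

ctr = cardiothoracic_ratio(heart_mask, lung_mask)
print(f"cardiac width: {horizontal_extent(heart_mask)} px")
print(f"thoracic width: {horizontal_extent(lung_mask)} px")
print(f"cardiothoracic ratio: {ctr:.2f}")
```

Reporting the two widths alongside the ratio gives the reader something to verify on the image. A ratio above roughly 0.5 on a PA image is the conventional rule of thumb for cardiomegaly, but any threshold used in practice would need clinical validation.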
To further understand how AI could assist with CXR interpretation, we first must consider the current typical workflow of the radiologist, which notably involves a number of additional inputs beyond the CXR image, that are rarely considered in the research literature. In most scenarios (excluding bedside/AP imaging) both a frontal and lateral CXR are acquired as part of standard imaging protocol, to reduce the interpretation difficulties associated with projected anatomy. Very few studies included in this review made use of the lateral image, although there are indications that it can improve classification accuracy (Hashir et al., 2020). Furthermore, the reviewing radiologist has access to the clinical question being asked, the patient history and symptoms and in many cases other supporting data from blood tests or other investigations. All of this information assists the radiologist to not only identify the visible abnormalities on CXR (e.g., consolidation), but to infer likely causes of these abnormalities (e.g., pneumonia). Incorporation of data from multiple sources along with the CXR image information will almost certainly improve sensitivity and specificity and avoid an algorithm erroneously suggesting labels which are not compatible with data from external sources. Another extremely important and time-consuming element in the radiological review of CXR is comparison with previous images from the same patient, to assess changes over time. Interval change is a topic studied by very few authors and addressed by only a single commercial vendor (by provision of a subtraction image). Innovative AI systems for the visualization and quantification of interval change with one or more previous images could substantially improve the efficiency of the radiologist. Finally, the radiologist is required to produce a report as a result of the CXR review, which is another time-consuming process addressed by very few researchers and just a handful of commercial vendors.
A system which can convert radiological findings to a preliminary report has the potential to save time and cost for the care provider. In many areas of the world, medical facilities that do perform CXR imaging do not have access to radiological expertise. This presents a further opportunity for AI to play a role in diagnostic pathways, as an assistant to the clinician who is not trained in the interpretation of CXR. Researchers and commercial vendors have already identified the need for AI systems to detect signs of tuberculosis (TB), a condition which is endemic in many parts of the world, and frequently in low-resource settings where radiologists are not available. While such regions of the world could potentially benefit from AI systems to detect other conditions, it is important to identify in advance what conditions could be feasibly both detected and treated in these areas where resources are severely limited. The findings of this work suggest that while the deep learning community has benefited from large numbers of publicly available CXR images, the direction of the research has been largely determined by the available data and labels, rather than the needs of the clinician or radiologist. Future work, in data provision and labelling, and in deep learning, should have a more direct focus on the clinical needs for AI in CXR interpretation. More accurate comparison and benchmarking of algorithms would be enabled by additional public challenges using appropriately annotated data for clinically relevant tasks. 
SIIM-ACR Pneumothorax Segmentation X-Ray In-Depth Decomposition: Revealing the Latent Structures Fine-Tuning U-Net for Ultrasound Image Segmentation: Different Layers, Different Outcomes Self-Supervision vs A comparative study for chest radiograph image retrieval using binary texture and deep learning classification Visualizing and enhancing a deep learning framework using patients age and gender for chest x-ray image retrieval An overview of deep learning approaches in chest radiograph Automated Triaging of Adult Chest Radiographs with Deep Artificial Neural Networks Accurate segmentation of lung fields on chest radiographs using deep convolutional networks Wasserstein generative adversarial networks Artificial Intelligence-Based Diagnosis of Cardiac and Related Diseases Ensemble learning based automatic detection of tuberculosis in chest X-ray images using hybrid feature descriptors Variability in interpretation of chest radiographs among russian clinicians and implications for screening programmes: observational study Accounting for data variability in multi-institutional distributed deep learning for medical imaging Smart chest X-ray worklist prioritization using artificial intelligence: a clinical workflow simulation Comparison of Deep Learning Approaches for Multi-Label Chest X-Ray Classification When Does Bone Suppression And Lung Field Segmentation Improve Chest X-Ray Disease Classification? 
Deep learning with nonmedical training used for chest pathology identification Chest pathology detection using deep learning with non-medical training Inferring the 3D Standing Spine Posture from 2D Radiographs, in: Medical Image Computing and Computer Assisted Intervention -MICCAI 2020 Digital computer determination of a medical diagnostic index directly from chest x-ray images Deep learning, reusable and problem-based architectures for detection of consolidation on chest X-ray images Robust chest x-ray quality assessment using convolutional neural networks and atlas regularization Do lateral views help automated chest x-ray predictions? Adversarial Regression Training for Visualizing the Progression of Chronic Obstructive Pulmonary Disease with Chest X-Rays, in: Medical Image Computing and Computer Assisted Intervention -MICCAI 2019 Interpretation of Disease Evidence for Medical Images Using Adversarial Deformation Fields, in: Medical Image Computing and Computer Assisted Intervention -MICCAI 2020 Determination of disease severity in COVID-19 patients using deep learning in chest X-ray images Pneumothorax detection in chest radiographs using convolutional neural networks, in: Medical Imaging 2018: Computer-Aided Diagnosis The Unified Medical Language System (UMLS): integrating biomedical terminology Matwo-CapsNet: A Multi-label Semantic Segmentation Capsules Network, in: Medical Image Computing and Computer Assisted Intervention -MICCAI 2019 Semi-supervised Medical Image Segmentation via Learning Consistency Under Transformations Identifying cardiomegaly in chest X-rays: a cross-sectional study of evaluation and comparison between different transfer learning methods SALAD: Self-supervised Aggregation Learning for Anomaly Detection on X-Rays, in: Medical Image Computing and Computer Assisted Intervention -MICCAI 2020 RadBot-CXR: Classification of Four Clinical Finding Categories in Chest X-Ray Using Deep Learning Adaptive Image-Feature Learning for Disease 
Classification Using Inductive Graph Networks, in: Medical Image Computing and Computer Assisted Intervention -MICCAI 2019 PadChest: A large chest x-ray image dataset with multi-label annotated reports. Medical Image Analysis 66 Iterative Attention Mining for Weakly Supervised Thoracic Disease Pattern Localization in Chest X-Rays, in: Medical Image Computing and Computer Assisted Intervention -MICCAI 2018 {FRODO}: Free rejection of out-of-distribution samples: application to chest x-ray analysis Handling label noise through model confidence and uncertainty: application to chest radiograph classification Emphysema quantification on simulated X-rays through deep learning techniques Automated radiographic bone suppression with deep convolutional neural networks Machine learning applied on chest x-ray can aid in the diagnosis of COVID-19: a first experience from Performance of deep learning model in detecting operable lung cancer with chest radiographs Learning Decision Ensemble using a Graph Neural Network for Comorbidity Aware Chest Radiograph Screening Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment, in: Medical Image Computing and Computer Assisted Intervention -MICCAI 2020 Label Co-Occurrence Learning With Graph Convolutional Networks for Multi-Label Chest X-Ray Image Classification Semantic-aware generative adversarial nets for unsupervised domain adaptation in chest x-ray segmentation Deep Hierarchical Multi-label Classification of Chest X-ray Images Diagnosis of common pulmonary diseases in children by X-ray images and deep learning DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs Pulmonary nodule detection on chest radiographs using balanced convolutional neural network and classic candidate detection Infogan: Interpretable representation learning by information maximizing generative adversarial nets Order-Sensitive Deep Hashing for Multimorbidity Medical 
This work was supported by the Dutch Technology Foundation STW, which formed the NWO Domain Applied and Engineering Sciences, and was partly funded by the Ministry of Economic Affairs (Perspectief programme P15-26 'DLMedIA: Deep Learning for Medical Image Analysis').