DRR4Covid: Learning Automated COVID-19 Infection Segmentation from Digitally Reconstructed Radiographs
Pengyi Zhang; Yunxin Zhong; Yulin Deng; Xiaoying Tang; Xiaoqiong Li
2020-08-26

Automated infection measurement and COVID-19 diagnosis based on Chest X-ray (CXR) imaging is important for faster examination. We propose a novel approach, called DRR4Covid, to learn automated COVID-19 diagnosis and infection segmentation on CXRs from digitally reconstructed radiographs (DRRs). DRR4Covid comprises an infection-aware DRR generator, a classification and/or segmentation network, and a domain adaptation module. The infection-aware DRR generator is able to produce DRRs with adjustable strength of radiological signs of COVID-19 infection, and to generate pixel-level infection annotations that match the DRRs precisely. The domain adaptation module is introduced to reduce the domain discrepancy between DRRs and CXRs by training networks on unlabeled real CXRs and labeled DRRs together. We provide a simple but effective implementation of DRR4Covid by using a domain adaptation module based on Maximum Mean Discrepancy (MMD), and an FCN-based network with a classification header and a segmentation header. Extensive experimental results have confirmed the efficacy of our method; specifically, quantifying the performance by accuracy, AUC and F1-score, our network, without using any annotations from CXRs, has achieved a classification score of (0.954, 0.989, 0.953) and a segmentation score of (0.957, 0.981, 0.956) on a test set with 794 normal cases and 794 positive cases. Besides, we estimate the sensitivity of X-ray imaging in detecting COVID-19 infection by adjusting the strength of radiological signs of COVID-19 infection in synthetic DRRs. The estimated detection limit of the proportion of infected voxels in the lungs is 19.43%, and the estimated lower bound of the contribution rate of infected voxels is 20.0% for significant radiological signs of COVID-19 infection. Our codes will be made publicly available at https://github.com/PengyiZhang/DRR4Covid.

The highly contagious Coronavirus Disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus [1] [2] [3], has spread rapidly to most countries in the world. Globally, as of 2:34pm CEST, 7 July 2020, there have been 11,500,302 confirmed cases of COVID-19, including 535,759 deaths, reported to the World Health Organization (WHO) [4]. Rapid detection and confirmation of COVID-19 infection is critical to prevent the spread of this epidemic. The real-time reverse-transcriptase polymerase chain reaction (RT-PCR) is regarded as the gold standard for COVID-19 diagnosis [5] [6]. However, clinical findings have implied that RT-PCR has low sensitivity [7] [8] [9], especially at the initial presentation of COVID-19. Radiological imaging, such as Computed Tomography (CT) and Chest X-ray (CXR), is currently used to provide visual evidence for confirming positive COVID-19 patients in clinical practice. CT scans provide accurate 3D images of the lungs that can effectively reveal very small lesions such as lung nodules and tumors. Existing CT findings in COVID-19 infection [10] have indicated that CT screening presents superior sensitivity over RT-PCR [8]. However, the workflow of CT imaging, involving several pre-scan events [11], is relatively complex, and CT examinations are costly.
As the number of infected patients rapidly increases, the routine use of CT brings a heavy burden to the radiology department [12], which is not conducive to rapid screening of COVID-19. In comparison, CXR examination is much easier, faster and less costly, and provides high-resolution 2D images of the lungs that can reveal a variety of lung conditions such as pneumonia, emphysema and cancer. Therefore, CXRs are the typical first-line imaging modality used for patients under investigation for COVID-19 [9], despite being less sensitive than initial RT-PCR testing (69% versus 91%, respectively) [13]. Developing automated infection measurement and COVID-19 diagnosis based on CXRs is thus important for faster examination.

Many approaches have been proposed for automated COVID-19 diagnosis based on CXRs, and have claimed notable detection accuracy of COVID-19 infection. However, the majority of these approaches are built on classification models rather than segmentation models, and some of them use saliency maps (or attention maps) to indicate the infected regions only roughly. Therefore, these methods cannot provide precise segmentation of COVID-19 infection for further assessment and quantification. On the other hand, many approaches for infection segmentation have been developed based on CT imaging, and some 3D CT scans with voxel-level infection annotations are publicly available. Compared with CT, infection segmentation based on CXRs is more challenging due to the heterogeneous nature of X-ray images and the difficulty of precise annotation. Currently, there is no method developed for segmenting X-ray images for COVID-19, as reviewed by Shen et al. in [9].

A digitally reconstructed radiograph (DRR) [14] [15] [16] [17] is a synthetic X-ray image that is generated by simulating the passage of X-rays through a 3D CT volume in a specific pose (position and orientation) within a virtual imaging system. CXR findings on COVID-19 infection reflect radiological signs similar to those described on CT [13], such as bilateral, peripheral consolidation and/or ground-glass opacities (GGOs) [6] [8] [9]. Thus, we expect to take advantage of the public CT scans with voxel-level annotations of COVID-19 infection, and of the correlation between DRRs and CXRs, to realize automatic infection segmentation based on CXRs.

Since models trained on synthetic DRRs must generalize to real CXRs, this becomes a domain adaptation problem. The main insight behind existing approaches is to extract domain-invariant representations by embedding domain adaptation modules in the pipeline of deep learning [31] [35] [36] [37] [38] [39] [40] [41]. Existing deep domain adaptation methods generally involve aligning the source and target domain distributions from three perspectives. The first stream is image alignment, where image-to-image translation models are typically used to reduce the gap between source domain images and target domain images [27]. The second stream is feature alignment [35] [36] [37] [38] [39] [40], which is the most common approach and aims to learn domain-invariant deep features. The last stream is output alignment, which is often used to learn semantic segmentation of urban scenes from synthetic data [31] [41]. On the other hand, we recognize that there are two main approaches to feature alignment: the adversarial approach [37] [38] [42] [43] [44] and the non-adversarial approach [34] [36] [39] [45] [46] [47]. The adversarial approach motivates deep models to extract domain-invariant features through adversarial training.
This is done by training task-specific deep models to minimize the task-specific loss and an adversarial loss simultaneously, thereby fooling a domain discriminator by maximizing the probability of deep features from the source domain being classified as the target domain. The non-adversarial approach is based on statistical moment matching, involving maximum mean discrepancy (MMD) [36] [45] [46], central moment discrepancy (CMD) [47] and second-order statistics matching [39]. The statistical moment matching approach encourages deep models to extract domain-invariant deep features by minimizing the distance between the statistical moments of deep features from the source domain and the target domain. MMD [48] is the most representative method, and has been widely used to measure the discrepancy between the source and target distributions [34]. Compared with the adversarial approaches, MMD-based methods are simple and easy to implement, and thus make it easy to verify the efficacy of DRR4Covid quickly. In our implementation of DRR4Covid, we directly use an off-the-shelf MMD-based domain adaptation approach, i.e., LMMD proposed by Zhu et al. [34], to enable deep models trained on DRRs to generalize to real CXRs.

Segmentation is an essential step in automated infection measurement and COVID-19 diagnosis, as it provides the delineation of the regions of interest (ROIs), e.g., infected regions or lesions, in CXRs for further assessment and quantification. Many approaches have been proposed for automated COVID-19 diagnosis based on CXRs. However, for the aforementioned reasons, the majority of these approaches are built on classification models rather than segmentation models, as reviewed by Shen et al. in [9]. Thus, some studies resort to the interpretability of deep classification models to highlight COVID-19 infection regions rather than segmenting them accurately. Specifically, Oh et al. [12] introduce a probabilistic Grad-CAM saliency map to highlight the multifocal lesions within CXRs in their local patch-based deep classification models for COVID-19 diagnosis. This method is derived from a well-known explanation technique, the gradient-weighted class activation map (Grad-CAM), and can effectively locate the radiological signs of COVID-19 infection, such as the multifocal ground-glass opacifications and consolidations. Similarly, Karim et al. [49] use a revised Grad-CAM, i.e., Grad-CAM++, and layer-wise relevance propagation (LRP) [50] when classifying CXRs as Normal, Pneumonia or COVID-19 to highlight the class-discriminating regions in CXRs. Moreover, Tabik et al. [51] adopt multiple explanation techniques, including occlusion [52], saliency [53], input X gradient [54], guided backpropagation [55], integrated gradients [56], and DeepLIFT [57], to investigate the interpretability of deep classification models and highlight the relevant infection regions of pneumonia and COVID-19 separately. To sum up, these approaches based on explanation techniques are mainly used for inspecting a deep model's decisions, and are probably not suitable for further assessment and quantification. In comparison, our DRR4Covid is able to train deep segmentation models for precise infection segmentation directly, without the need for pixel-level infection annotations of real CXRs.
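To make the moment-matching idea concrete, the following is a minimal PyTorch sketch of the classic multi-kernel Gaussian MMD between a batch of source (DRR) features and a batch of target (CXR) features; the bandwidths and feature dimensions are illustrative assumptions, and the LMMD loss [34] used in our implementation additionally weights sample pairs by their (pseudo-)label-based subdomain membership.

```python
import torch

def gaussian_mmd(source, target, sigmas=(1.0, 2.0, 4.0)):
    """Biased estimate of the squared MMD between two feature batches.

    source: (N, D) deep features from the source domain (DRRs).
    target: (M, D) deep features from the target domain (CXRs).
    sigmas: Gaussian kernel bandwidths (illustrative values).
    """
    def kernel(a, b):
        dist2 = torch.cdist(a, b) ** 2  # pairwise squared distances
        return sum(torch.exp(-dist2 / (2 * s ** 2)) for s in sigmas)

    return (kernel(source, source).mean()
            + kernel(target, target).mean()
            - 2 * kernel(source, target).mean())

# Minimizing this term over batches of pooled features draws the two
# feature distributions close.
loss = gaussian_mmd(torch.randn(8, 256), torch.randn(8, 256))
```

Because the loss is just a differentiable statistic of two feature batches, it plugs into any training loop without a discriminator network, which is what makes the non-adversarial route simple to verify.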
In this section, we describe the modular framework of the proposed DRR4Covid and analyze the key points in its design, followed by an introduction of our implementation of DRR4Covid for learning automated infection segmentation on CXRs.

Given CT scans with voxel-level infection annotations and unlabeled CXRs, we aim to learn deep models that perform automated COVID-19 infection segmentation on CXRs. We design DRR4Covid with a modular framework as shown in Fig. 1. DRR4Covid consists of three key components: an infection-aware DRR generator, a deep model for classification and/or segmentation, and a domain adaptation module. The basic workflow of DRR4Covid involves generating DRRs with pixel-level infection annotations from CT scans, and training deep models on synthetic labeled DRRs and unlabeled CXRs by using the domain adaptation module.

Generating labeled DRRs. The DRR generator is responsible for synthesizing photo-realistic DRRs that resemble real CXRs as much as possible, and for producing pixel-level infection annotations that match the DRRs precisely by projecting 3D CT annotations along the same trajectories used in synthesizing the DRRs. High-quality DRRs with pixel-level infection annotations, in the context of this paper, are defined by two conditions: one is a good consistency between synthetic DRRs and their corresponding infection annotations; the other is a good correlation between the visual findings of COVID-19 infection in synthetic DRRs and in real CXRs. As CXRs are generally considered less sensitive than 3D CT scans [9], it may happen that a CT examination detects an abnormality whereas X-ray screening of the same patient reports no findings. DRRs also suffer from this problem, which causes inconsistency between synthetic DRRs and their corresponding infection annotations. This is the first key point that needs attention in the design of DRR4Covid to obtain high-quality DRRs. The second key point is the correlation between the visual findings of COVID-19 infection in real CXRs and in synthetic DRRs. Note that the synthetic DRRs and infection annotations are used later to train deep models for classification and segmentation. Thus, a large gap between the visual findings of COVID-19 infection in real CXRs and in synthetic DRRs will cause deep models trained on DRRs to fail to generalize to real CXRs, even if the domain adaptation module is applied.

Training deep models with the domain adaptation module. Although synthetic DRRs are photo-realistic, there is still a gap between DRRs and real CXRs. Thus, we introduce the domain adaptation module into the framework of DRR4Covid. Depending on the quality of the synthetic labeled DRRs, the problem of training deep models with the domain adaptation module divides into two distinct problems. One is deep domain adaptation with fully supervised learning in the source domain (i.e., synthetic DRRs) and unsupervised learning in the target domain (i.e., real CXRs); the other is deep domain adaptation with weakly supervised learning in the source domain and unsupervised learning in the target domain. The premise of the first problem is a good consistency between synthetic DRRs and infection annotations. If this premise is not well satisfied, the task turns into the second problem due to the inaccurate synthetic infection annotations. Compared with the second problem, the first is well defined and has been extensively studied. In this paper, we mainly focus on solving the first problem.
Thus, we first implement a high-quality DRR generator, i.e., the infection-aware DRR generator, designed to produce high-quality DRRs as defined in Section 3.1. A standard DRR generator takes a CT volume or an infection annotation volume in a specific pose (position and orientation) as input and outputs a DRR or an infection mask. In comparison, our DRR generator takes both a CT volume and its infection annotation volume as input and produces a labeled DRR, as illustrated in Fig. 2. A ray is cast from the X-ray source through the labeled CT volume to the center of each pixel of the DRR. Each pixel value of the DRR is obtained by calculating the class-weighted RPL [22], i.e., the class-weighted summation of the lengths travelled by this ray within each voxel, multiplied by the relative CT intensity of the voxel measured in HU. The calculation of the d-th pixel $p_d$ of the DRR is formulated as

$$p_d = \sum_{c} w_c \sum_{i \in \mathcal{M}_d^c} l_i \, \rho_i , \qquad (1)$$

where $\mathcal{M}_d^c$ denotes the 3D index set of the voxels of category $c$ along the X-ray direction, $l_i$ is the length travelled by the ray within voxel $i$, $\rho_i$ is the relative CT intensity of voxel $i$, and $w_c$ is the weight of category $c$.

We argue that the strength of the radiological signs of COVID-19 infection in CXRs and DRRs depends on the contribution rate of infected voxels (CRIV), due to the heterogeneous nature of X-ray imaging. A higher value of CRIV means that a larger number of infected voxels appear in the X-ray direction, and the radiological signs of COVID-19 infection, e.g., GGOs, become more significant. This property of X-ray imaging is well modeled in formulas (1) and (4) by our infection-aware DRR generator. Increasing the weight of infected voxels will raise the value of CRIV, and vice versa. Accordingly, our infection-aware DRR generator can produce DRRs with different strengths of radiological signs of COVID-19 infection simply by adjusting the weight of infected voxels. Meanwhile, the synthetic pixel-level annotations of COVID-19 infection are also calculated based on the CRIV. Therefore, our infection-aware DRR generator can easily maintain the consistency between synthetic DRRs and annotations by using a large weight of infected voxels to raise the value of CRIV. Note that an excessively large value for the weight of infected voxels may lead to a large gap between the visual findings of COVID-19 infection in real CXRs and in synthetic DRRs, such that even the domain adaptation module does not work.

To sum up, our infection-aware DRR generator has the following advantages: (1) by setting the weight of infected voxels to a very small value, it can produce DRRs with no findings, which is essential for training deep classification models for COVID-19 diagnosis; (2) by setting the weight of infected voxels to a relatively large value, it generates high-quality DRRs with pixel-level annotations of COVID-19 infection, which is essential for training deep models for precise infection segmentation; (3) by adjusting the weight of infected voxels from small values to large values, it synthesizes a series of labeled DRRs with different strengths of the radiological signs of COVID-19 infection. Such DRRs can be used to quantify the sensitivity of X-ray imaging in detecting COVID-19 infection.
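A minimal NumPy sketch of the per-ray computation in formula (1), together with the CRIV-based labeling rule, is given below; the per-voxel intersection lengths are assumed to be precomputed by a standard voxel-traversal routine (e.g., Siddon-style radiological path calculation [14]), and the variable names and threshold $T_2$ follow the notation of this paper rather than any released code.

```python
import numpy as np

def drr_pixel(lengths, intensities, labels, weights, t2=0.2):
    """Class-weighted RPL of one ray (formula (1)) plus CRIV-based labeling.

    lengths:     (K,) intersection lengths l_i of the ray with each voxel
    intensities: (K,) relative CT intensities rho_i (derived from HU)
    labels:      (K,) voxel category: 0 = background, 1 = lung, 2 = infection
    weights:     (3,) class weights (w0, w1, w2)
    t2:          contribution threshold T2 for labeling the pixel as infected
    """
    contrib = weights[labels] * lengths * intensities   # per-voxel terms
    pixel_value = contrib.sum()                         # formula (1)
    # Contribution rate of infected voxels (CRIV) for this pixel.
    criv = contrib[labels == 2].sum() / max(pixel_value, 1e-8)
    return pixel_value, criv, criv >= t2

# Toy ray crossing five voxels, two of them infected; (w0, w1, w2) as used
# for normal DRRs in Section 4.2.
value, criv, infected = drr_pixel(
    lengths=np.ones(5),
    intensities=np.array([0.1, 0.5, 0.6, 0.7, 0.1]),
    labels=np.array([0, 1, 2, 2, 0]),
    weights=np.array([24.0, 24.0, 1.0]),
)
```

Raising $w_2$ simultaneously increases the infected voxels' contribution to the pixel value and their CRIV, which is why a single weight controls both the visual strength of infection signs and the consistency of the masks.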
Network architectures. We design an FCN-based network as depicted in Fig. 3. It consists of a backbone network, a classification header and a segmentation header. Compared with FCN [18], our model has an auxiliary classification header, which is designed for two purposes. One is to enable our model to perform both the classification task and the segmentation task for automatic infection measurement and COVID-19 diagnosis. The other is to facilitate the use of MMD-based methods for domain adaptation. The backbone network is responsible for extracting deep features by performing convolution and spatial pooling operations on DRRs and CXRs. The extracted deep features are then fed into the classification header and the segmentation header separately. In the classification branch, we adopt a very simple structure with a global average pooling (GAP) layer and a fully connected (FC) layer. In the segmentation branch, we use two convolutional layers followed by an up-sampling layer to generate the segmentation output with the same size as the input DRRs and CXRs.

MMD-based domain adaptation. As a nonparametric distance estimate between two distributions, MMD [48] has been widely used in domain adaptation algorithms to measure the discrepancy between the source and target distributions. In our implementation, we adopt an off-the-shelf MMD-based domain adaptation approach, i.e., the LMMD loss proposed by Zhu et al. [34]. LMMD measures the discrepancy of local distributions by taking the correlations of the relevant subdomains into consideration. By minimizing the LMMD loss during the training of deep models, the distributions of relevant subdomains within the same category in the source domain and the target domain are drawn close. As the LMMD method was proposed in the context of object recognition and digit classification tasks, we can apply it to the classification header directly by aligning the deep features from the GAP layer. The effect of feature alignment is propagated to the segmentation branch implicitly through the input of the GAP layer.

Objective function. The training of our model is performed by minimizing the classification loss $l_{cls}$, the segmentation loss $l_{seg}$ and the LMMD loss $l_{mmd}$ simultaneously. The total loss is computed as

$$l = \lambda_{cls} \, l_{cls} + \lambda_{seg} \, l_{seg} + \lambda_{mmd} \, l_{mmd} ,$$

where $\lambda_{cls}$, $\lambda_{seg}$ and $\lambda_{mmd}$ denote the weights of the classification loss, segmentation loss and LMMD loss, respectively.
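The following is a minimal PyTorch sketch of this two-header design and the combined objective; the backbone depth, channel widths and loss weights are illustrative assumptions, and a simplified single-kernel MMD stands in for the LMMD loss applied to the GAP features.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gaussian_mmd(a, b, s=2.0):
    # Simplified single-kernel MMD (see the earlier sketch).
    k = lambda x, y: torch.exp(-torch.cdist(x, y) ** 2 / (2 * s ** 2))
    return k(a, a).mean() + k(b, b).mean() - 2 * k(a, b).mean()

class FCNWithHeaders(nn.Module):
    def __init__(self, num_classes=2, feat_ch=256):
        super().__init__()
        # Backbone: a small conv stack standing in for the real backbone.
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_ch, 3, stride=2, padding=1), nn.ReLU())
        self.gap = nn.AdaptiveAvgPool2d(1)           # GAP layer
        self.cls_head = nn.Linear(feat_ch, num_classes)
        self.seg_head = nn.Sequential(               # two convs, then upsample
            nn.Conv2d(feat_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_classes, 1))

    def forward(self, x):
        feat = self.backbone(x)
        pooled = self.gap(feat).flatten(1)           # features aligned by MMD
        seg = F.interpolate(self.seg_head(feat), size=x.shape[-2:],
                            mode='bilinear', align_corners=False)
        return self.cls_head(pooled), seg, pooled

# One training step: a labeled DRR batch and an unlabeled CXR batch.
model = FCNWithHeaders()
drr = torch.randn(4, 1, 256, 256)
mask = torch.zeros(4, 256, 256, dtype=torch.long)    # pixel-level DRR labels
label = torch.zeros(4, dtype=torch.long)             # image-level DRR labels
cxr = torch.randn(4, 1, 256, 256)                    # no CXR annotations used
cls_s, seg_s, feat_s = model(drr)
_, _, feat_t = model(cxr)
loss = (1.0 * F.cross_entropy(cls_s, label)          # l_cls
        + 1.0 * F.cross_entropy(seg_s, mask)         # l_seg
        + 0.5 * gaussian_mmd(feat_s, feat_t))        # l_mmd (LMMD in the paper)
loss.backward()
```

Note that the alignment term touches only the pooled features, so the segmentation branch benefits from the alignment indirectly, through the shared backbone that feeds the GAP layer.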
Chest CT scans. We use the public COVID-19-CT-Seg dataset [58], which consists of 20 public COVID-19 CT cases with pixel-level annotations of the left lung, right lung and COVID-19 infection. The annotations, first labeled by junior annotators, were refined by two radiologists with 5 years of experience, and further verified and refined by a senior radiologist with more than 10 years of experience in chest radiology. In 10 of these 20 CT volumes, the voxel values have been normalized to [0, 255], so we cannot access their CT values measured in HU; we discard these ten cases and use the other 10 CT cases for DRR generation in our experiments. For each CT case, we obtain 40 front-view DRRs and 40 lateral-view DRRs with pixel-level annotations of COVID-19 infection by using our infection-aware DRR generator, as detailed in Section 4.2. Thus, we build a training set in the source domain with these 800 DRRs, as shown in Table 1.

Chest X-ray images. We use two public CXR datasets, i.e., the COVID-19 Radiography Database [59] and the COVID-19 Chest X-ray Image Data Collection [60]. The former consists of 219 COVID-19 positive images and 1341 normal images, and the latter consists of 794 COVID-19 positive images. We randomly select 219 normal images from the 1341 normal images, and combine them with the 219 COVID-19 positive images in the COVID-19 Radiography Database to build a train-validation set in the target domain. Besides, we use the 794 COVID-19 positive images in the COVID-19 Chest X-ray Image Data Collection and 794 normal images randomly selected from the remaining 1122 normal images to build a test set in the target domain, as shown in Table 1. Moreover, 119 positive images and 119 negative (normal) images are randomly sampled from the train-validation set as the training set, while the remaining images are used as the validation set. All experiments are repeated five times with different splits of the train-validation set, and the average results are reported.

Generating normal DRRs. DRRs with no findings are important for training deep classification and segmentation models for COVID-19 diagnosis. Our infection-aware generator is able to generate such DRRs by setting the weight of infected voxels to a relatively small value to reduce the CRIV in the ray-casting process. In our experiment, we empirically set the weights of background, lung and COVID-19 infection as $(w_0, w_1, w_2) = (24.0, 24.0, 1.0)$; some synthetic normal DRRs are depicted in Fig. 4.

Generating multiple DRRs from each CT case. It is simple to generate multiple DRRs from a single CT volume by adjusting the pose (position and orientation) of the CT volume within a virtual imaging system. In our experiment, we randomly translate each CT volume between -100 and 100, and rotate it between -45° and 45°, in the x, y, and z directions. Some DRRs generated from a single CT case are illustrated in Fig. 5.
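As a sketch of this pose randomization, the following applies one random translation and rotation jointly to a CT volume and its annotation volume before ray casting; the use of scipy.ndimage on the voxel grid and nearest-neighbor interpolation for the label volume are implementation assumptions (the actual generator poses the volume inside a virtual imaging geometry).

```python
import numpy as np
from scipy import ndimage

def random_pose(volume, labels, rng):
    """Apply one random translation and rotation to a CT volume and its
    annotation volume identically, before ray casting."""
    shift = rng.uniform(-100, 100, size=3)       # translation along x, y, z
    angles = rng.uniform(-45, 45, size=3)        # rotation (degrees)
    for angle, axes in zip(angles, [(0, 1), (0, 2), (1, 2)]):
        # order=1 (linear) for intensities, order=0 (nearest) for labels
        # so that category values are never blended.
        volume = ndimage.rotate(volume, angle, axes=axes, reshape=False, order=1)
        labels = ndimage.rotate(labels, angle, axes=axes, reshape=False, order=0)
    volume = ndimage.shift(volume, shift, order=1)
    labels = ndimage.shift(labels, shift, order=0)
    return volume, labels

rng = np.random.default_rng(0)
vol, lab = random_pose(np.zeros((64, 64, 64)),
                       np.zeros((64, 64, 64), dtype=np.uint8), rng)
```

Transforming the annotation volume with the same parameters as the CT volume is what keeps the projected infection masks aligned with the synthesized DRRs.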
Building training sets in the source domain (DRRs). For each CT volume, we first generate 40 normal DRRs, including 20 front-view DRRs and 20 lateral-view DRRs, by randomly adjusting its pose. Next, in the same way, we generate 40 DRRs with pixel-level annotations of COVID-19 infection for given $(w_0, w_1, w_2)$ and $T_2$. Therefore, we build a training set in the source domain with 800 DRRs, as shown in Table 1. The infection masks indicate the infected pixels of DRRs, i.e., those whose corresponding X-rays pass through the infected voxels of the CT volume.

We report the evaluation results of our model trained on the 63 training sets, with and without the MMD-based domain adaptation module, in Tables 1-24 of the appendix. We analyze these results from the three perspectives introduced in the experiment design.

Standard DRRs versus infection-aware DRRs. First, we make a qualitative comparison in Fig. 7. As can be seen, many infected pixels of DRRs indicated by the infection masks present no findings due to the low contribution of infected voxels in the X-ray casting. This observation is consistent with the heterogeneous nature of X-ray imaging, and implies the lower sensitivity of X-ray imaging in comparison with CT imaging. We notice that the radiological signs of COVID-19 infection in standard DRRs are rather weak. The strength of radiological signs of COVID-19 infection in standard DRRs depends only on the infected proportion of the lungs in the CT volume. This property makes it hard to take full advantage of the publicly available CT volumes. In comparison, our infection-aware DRR generator is able to produce DRRs with different strengths of radiological signs of COVID-19 infection simply by adjusting the weight of infected voxels $w_2$. For instance, a CT case with mild COVID-19 infection can produce DRRs with strong radiological signs of COVID-19 infection, and a CT case with severe COVID-19 infection can yield normal DRRs. As seen from the last column in Fig. 6, the radiological signs of COVID-19 infection gradually get stronger as the weight of infected voxels $w_2$ increases. Note that the visual appearance of infected regions in DRRs becomes unrealistic when the value of $w_2$ is too large, e.g., $w_2 = 12.0$, which may hinder training deep models for automated COVID-19 infection segmentation. Such controllability of the strength of radiological signs of COVID-19 infection in DRRs is very helpful for making full use of the available CT volumes and for determining precise infection masks to train infection segmentation models.

Next, we analyze the classification and segmentation results on the validation and test sets without using domain adaptation in Tables 13-24 of the appendix. In order to avoid interference from the choice of contribution threshold, we average the performance scores over the contribution thresholds of infected voxels (CTIV), and compare the average scores of standard DRRs and infection-aware DRRs visually in Fig. 8 and Fig. 9. The infection-aware DRRs achieve significantly higher average scores on both the validation and test sets in the target domain than the standard DRRs. These results indicate that the gap between infection-aware DRRs (e.g., $w_2 \geq 3.0$) and real CXRs is smaller than the gap between standard DRRs ($w_2 = 1.0$) and real CXRs, and thus clearly verify the efficacy of our infection-aware DRR generator without using the domain adaptation module.

Finally, we analyze the classification and segmentation results on the validation and test sets with domain adaptation in Tables 1-12 of the appendix. We compare the average results of standard DRRs and infection-aware DRRs visually in Fig. 10 and Fig. 11. Similarly, the infection-aware DRRs surpass the standard DRRs by a large margin on both the validation and test sets in the target domain. These results strongly demonstrate the effectiveness of our infection-aware DRR generator when the domain adaptation module is used. In order to highlight the efficacy of our domain adaptation module, we compare the average scores with and without domain adaptation on the validation and test sets in Fig. 12 and Fig. 13. This intuitive comparison shows that using our domain adaptation module improves the classification and segmentation scores of both standard DRRs and infection-aware DRRs significantly and consistently, which strongly verifies the efficacy of our domain adaptation module. Besides, as seen from Fig. 8 and Fig. 9, we notice that the average scores of infection-aware DRRs first increase and then decrease as the weight of infected voxels $w_2$ increases from 1.0 to 3.0 and then to 12.0. The peak of the average scores of infection-aware DRRs appears at $w_2 = 3.0$. This suggests that an excessively large weight of infected voxels may make the infected regions in DRRs unrealistic, thus leading to a decrease in performance scores when the domain adaptation module is not used. In comparison, there is no significant decrease in the average scores of infection-aware DRRs with our domain adaptation module, as shown in Fig. 10 and Fig. 11, when the weight of infected voxels $w_2$ increases from 3.0 to 6.0 and then to 12.0. This implies that the domain adaptation module still works well even when the infected regions in DRRs become slightly unrealistic.
On the other hand, we observe that the segmentation scores are considerably lower than the classification scores when the domain adaptation module is not used. For instance, in the case of infection-aware DRRs with $w_2 = 3.0$, the average segmentation scores on the test set in the target domain, namely the accuracy, AUC and F1-score, are 0.586, 0.886 and 0.694 respectively, whereas the corresponding classification scores are 0.868, 0.933 and 0.870. These results indicate that the segmentation header is much more sensitive to the domain discrepancy between DRRs and real CXRs than the classification header. By using the domain adaptation module, both the segmentation scores and the classification scores are greatly improved; specifically, the improvement in segmentation scores is much more significant than the improvement in classification scores. For instance, in the case of infection-aware DRRs with $w_2 = 3.0$, the improved segmentation scores can be found in Tables 4, 5, 6, 10, 11 and 12 of the appendix. Note that these segmentation scores are averaged over five different training-validation splits. Next, we visualize the infection segmentation results of one of the five training-validation splits. The confusion matrices of the segmentation results on the corresponding validation and test sets are shown in Fig. 14. We visualize several true positive and true negative cases in Fig. 15. Compared with previous studies that highlight the infected regions roughly by resorting to the interpretability of deep classification models, our segmentation model trained on the infection-aware DRRs is able to segment the infected regions in CXRs directly and accurately. Besides, we present several failure (false positive and false negative) cases in Fig. 16.
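Because no pixel-level CXR annotations exist, segmentation outputs are scored with image-level classification metrics; a minimal sketch of such an evaluation is given below, where the rule for calling an image positive from its predicted infection map (a threshold on the predicted infected-area fraction) is our assumption, not the paper's exact criterion.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score, f1_score

def score_segmentations(prob_maps, labels, area_frac=0.01):
    """Image-level accuracy/AUC/F1 from per-pixel infection probabilities.

    prob_maps: (N, H, W) predicted infection probability per pixel
    labels:    (N,) image-level ground truth (1 = COVID-19 positive)
    area_frac: assumed infected-area fraction above which an image is positive
    """
    areas = (prob_maps > 0.5).mean(axis=(1, 2))   # predicted infected fraction
    preds = (areas > area_frac).astype(int)
    return (accuracy_score(labels, preds),
            roc_auc_score(labels, areas),          # rank images by area
            f1_score(labels, preds))

acc, auc, f1 = score_segmentations(np.random.rand(8, 64, 64),
                                   np.array([0, 1, 0, 1, 0, 1, 0, 1]))
```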
Searching for the best parameters $(w_0, w_1, w_2)$ and $T_2$. As mentioned earlier, CXRs are generally considered less sensitive than 3D CT scans [9]; it may happen that a CT examination detects an abnormality whereas X-ray screening of the same patient reports no findings. To address this problem, we introduce the infection-aware DRR generator, which increases the weight of infected voxels to control the strength of radiological signs of COVID-19 infection in DRRs, and thus maintains the consistency of COVID-19 infection between the generated DRRs and CT scans. We argue that the best parameters $(w_0, w_1, w_2)$ and $T_2$ will produce the best DRRs and achieve the highest classification and segmentation scores on the validation and test sets, regardless of whether the domain adaptation module is used. Therefore, we average the corresponding items in Tables 1-24 of the appendix to obtain the total average (T-Avg.) scores of the 63 training sets in the source domain to search for the best parameters. Meanwhile, we compute the equivalent average proportion of infected voxels (EAPIV) in the lungs of the 10 CT cases used to generate the infection-aware DRRs, as shown in the last row of Table 2. As can be seen, the peak of the T-Avg. scores corresponds to an EAPIV of 19.43%, which we take as the estimated detection limit of the proportion of infected voxels in the lungs; moreover, at each row of Table 2, the peak of the T-Avg. scores appears at $T_2 = 0.20$. Therefore, we argue that the lower bound of CRIV is 20.0% for significant radiological signs of COVID-19 infection in DRRs. This means that pixels whose CRIV is lower than 20.0% present only insignificant radiological signs of COVID-19 infection, and are hard to distinguish from the pixels of the lungs.

We propose a novel approach, called DRR4Covid, to learn automated COVID-19 infection segmentation on CXRs from DRRs. DRR4Covid consists of three key components: an infection-aware DRR generator, a classification and segmentation network, and a domain adaptation module. The infection-aware DRR generator is able to produce DRRs with adjustable strength of radiological signs of COVID-19 infection, and to generate pixel-level infection annotations that match the DRRs precisely, thus enabling segmentation networks to be trained directly for automated infection segmentation. The domain adaptation module is introduced to reduce the domain discrepancy between DRRs and CXRs by training networks on unlabeled real CXRs and labeled DRRs together. We provide a simple but effective implementation of DRR4Covid by using a domain adaptation module based on Maximum Mean Discrepancy (MMD), and an FCN-based network with a classification header and a segmentation header. Extensive experimental results have confirmed the efficacy of our method; specifically, quantifying the performance by accuracy, AUC and F1-score, our network, without using any annotations from CXRs, has achieved a classification score of (0.954, 0.989, 0.953) and a segmentation score of (0.957, 0.981, 0.956) on a test set with 794 normal cases and 794 positive cases. Besides, we estimate the sensitivity of X-ray imaging in detecting COVID-19 infection by adjusting the strength of radiological signs of COVID-19 infection in synthetic DRRs; we report that the estimated detection limit of the infected proportion of the lungs is 19.43%±16.29%, and the estimated lower bound of the contribution rate of infected voxels is 20.0% for significant radiological signs of COVID-19 infection. To the best of our knowledge, this is the first attempt to realize automated COVID-19 infection segmentation based on CXRs by using labeled DRRs generated from chest CT scans. The limitation of our work is that the segmentation results can only be evaluated with classification metrics, due to the unavailability of pixel-level annotations of COVID-19 infection in CXRs. Future work can extend DRR4Covid to DRR4Lesion to enable multiple lung lesion segmentation on CXRs.

Appendix.
[1] Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
[2] Pathological findings of COVID-19 associated with acute respiratory distress syndrome
[3] Radiological findings from 81 patients with COVID-19 pneumonia in Wuhan, China: a descriptive study
[4] Coronavirus disease 2019 (COVID-19) Situation Report
[5] Chest CT for typical 2019-nCoV pneumonia: relationship to negative RT-PCR testing
[6] Correlation of chest CT and RT-PCR testing in Coronavirus Disease 2019 (COVID-19) in China: a report of 1014 cases
[7] Coronavirus disease 2019 (COVID-19): role of chest CT in diagnosis and management
[8] Sensitivity of chest CT for COVID-19: comparison to RT-PCR
[9] Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for COVID-19
[10] Chest CT findings in 2019 novel coronavirus (2019-nCoV) infections from Wuhan, China: key points for the radiologist
[11] Towards contactless patient positioning. TMI, 2020
[12] Deep learning COVID-19 features on CXR using limited training data sets
[13] Frequency and distribution of chest radiographic findings in COVID-19 positive patients
[14] Fast calculation of the exact radiological path for a three-dimensional CT array
[15] Computation of digitally reconstructed radiographs for use in radiotherapy treatment design
[16] A fast algorithm to calculate the exact radiological path through a pixel or voxel space
[17] Fast generation of digitally reconstructed radiographs using attenuation fields with application to 2D-3D image registration
[18] Fully convolutional networks for semantic segmentation
[19] CT imaging based digitally reconstructed radiographs and their application in brachytherapy
[20] Comparison of digitally reconstructed radiographs generated from axial and helical CT scanning modes: a phantom study
[21] GPU-accelerated digitally reconstructed radiographs
[22] Accelerated ray tracing for radiotherapy dose calculations on a GPU
[23] A fast DRR generation scheme for 3D-2D image registration based on the block projection
[24] Fast calculation of digitally reconstructed radiographs using light fields
[25] Fast DRR generation for 2D/3D registration
[26] How does the hip joint move? Techniques and applications
[27] Task driven generative modeling for unsupervised domain adaptation: application to X-ray image segmentation
[28] DeepDRR - a catalyst for machine learning in fluoroscopy
[29] Enabling machine learning in X-ray-based procedures via realistic simulation of image formation
[30] X2CT-GAN: reconstructing CT from biplanar X-rays with generative adversarial networks
[31] A curriculum domain adaptation approach to the semantic segmentation of urban scenes
[32] A deep convolutional activation feature for generic visual recognition
[33] How transferable are features in deep neural networks
[34] Deep Subdomain Adaptation Network for image classification
[35] Unsupervised domain adaptation by backpropagation
[36] Learning transferable features with deep adaptation networks
[37] Conditional adversarial domain adaptation
[38] Multi-adversarial domain adaptation
[39] Deep CORAL: correlation alignment for deep domain adaptation
[40] Multi-representation adaptation network for cross-domain image classification
[41] Learning to adapt structured output space for semantic segmentation
[42] Domain-adversarial training of neural networks
[43] Adversarial discriminative domain adaptation
[44] CyCADA: cycle-consistent adversarial domain adaptation
[45] Deep transfer learning with joint adaptation networks
[46] Aligning domain-specific distribution and classifier for cross-domain classification from multiple sources
[47] Central moment discrepancy (CMD) for domain-invariant representation learning
[48] A kernel two-sample test
[49] Explainable COVID-19 predictions based on chest X-ray images
[50] On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation
[51] COVIDGR dataset and COVID-SDNet methodology for predicting COVID-19 based on chest X-ray images
[52] Visualizing and understanding convolutional networks
[53] Deep inside convolutional networks: visualising image classification models and saliency maps
[54] Investigating the influence of noise and distractors on the interpretation of neural networks
[55] Salient deconvolutional networks
[56] Axiomatic attribution for deep networks
[57] Learning important features through propagating activation differences
[58] Towards efficient COVID-19 CT annotation: a benchmark for lung and infection segmentation
[59] Can AI help in screening viral and COVID-19 pneumonia
[60] COVID-CXNet: detecting COVID-19 in frontal chest X-ray images using deep learning

Table 10. Accuracy (Mean ± Standard Deviation) of segmentation output on the test set in target domain (domain adaptation).