title: Unsupervised COVID-19 Lesion Segmentation in CT Using Cycle Consistent Generative Adversarial Network
authors: Fang, Chengyijue; Liu, Yingao; Liu, Mengqiu; Qiu, Xiaohui; Liu, Ying; Li, Yang; Wen, Jie; Yang, Yidong
date: 2021-11-23

COVID-19 has become a global pandemic and still poses a severe health risk to the public. Accurate and efficient segmentation of pneumonia lesions in CT scans is vital for treatment decision-making. We propose a novel unsupervised approach using a cycle consistent generative adversarial network (cycle-GAN) which automates and accelerates the process of lesion delineation. The workflow includes lung volume segmentation, "synthetic" healthy lung generation, infected and healthy image subtraction, and binary lesion mask creation. The lung volume was first delineated using a pre-trained U-net and served as the input for the later network. The cycle-GAN was developed to generate synthetic "healthy" lung CT images from infected lung images. The pneumonia lesions were then extracted by subtracting the synthetic "healthy" lung CT images from the "infected" lung CT images. A median filter and k-means clustering were then applied to contour the lesions. The automatic segmentation approach was validated on two public datasets (Coronacases and Radiopedia). The Dice coefficients reached 0.748 and 0.730 for the Coronacases and Radiopedia datasets, respectively. Meanwhile, the precision and sensitivity of lesion segmentation were 0.813 and 0.735 for the Coronacases dataset, and 0.773 and 0.726 for the Radiopedia dataset. The performance is comparable to existing supervised segmentation networks and outperforms previous unsupervised ones. The proposed unsupervised segmentation method achieved high accuracy and efficiency in automatic COVID-19 lesion delineation.
The segmentation result can serve as a baseline for further manual modification and as a quality assurance tool for lesion diagnosis. Furthermore, due to its unsupervised nature, the result is not influenced by physicians' experience, which otherwise is crucial for supervised methods.

The coronavirus disease 2019 (COVID-19) has become a global public health problem and still affects billions of people's lives. According to the World Health Organization, the pandemic had caused over 2 million deaths by April 2021 1 . The typical symptoms of COVID-19 include cough, fever, and pneumonia after infection 2 . Clinically, CT scans are commonly used to evaluate the progress and severity of pneumonia 3, 4 due to their high resolution in three dimensions and broad availability compared to other imaging modalities. Accurate delineation of pneumonia lesions is vital for evaluating disease progression and assessing the severity of infection, which is crucial for treatment decision making 5 . However, manual segmentation is time-consuming and labor-intensive; therefore, automatic segmentation methods are in high demand.

In the past decade, deep learning has shown tremendous power and potential in various radiological applications, including image segmentation 6, 7 , disease classification, and synthetic image generation [8] [9] [10] [11] . Since the beginning of the COVID-19 pandemic, deep learning, incorporating various imaging modalities such as CT, X-ray and ultrasound 4, 12-15 , has been applied in clinical diagnosis, predicting disease progression 16 , classifying pneumonia types and assessing the severity of infection 4, 17 . However, existing methods are mostly based on supervised learning, which requires substantial data labeling by radiologists as training references. For example, U-net networks have been used for the classification and segmentation of COVID-19 lesions in CT scans 18, 19, 20 .
The results vary greatly among different studies, partially due to the inter- and intra-observer variations in the training lesion labeling by different radiologists 18 . Compared to supervised learning, unsupervised learning does not require training labels, and hence avoids both the burden of manual lesion delineation and the inter- and intra-observer inconsistency. For example, Yao et al 21 proposed a label-free pneumonia lesion segmentation method which employed an unsupervised statistical method to simulate infected lungs from healthy ones. Zhang et al 22 developed an unsupervised method, based on a conditional GAN, to augment lung images for better follow-up segmentation training 22 . However, most of these methods focused on data augmentation with an unsupervised network, and had to rely on supervised networks to train the lesion segmentation process 6, 21 .

Cycle consistent GAN (cycle-GAN) is an unsupervised network that has been widely used in medical image analysis, such as synthetic CT generation [23] [24] [25] and image transformation between different MRI sequences 9, 24 . Inspired by this, we propose a cycle-GAN-based unsupervised framework for COVID-19 lesion segmentation. The cycle-GAN is used to convert infected lung slices to healthy lung slices by transforming pneumonia lesions into normal lung tissues. Then, the lesion is retrieved by subtracting the simulated "healthy" lung from the original image. The network does not require any image pairing or manual training labels, and hence can improve efficiency and eliminate the inter- and intra-observer inconsistency otherwise present in supervised networks.

In this study, CT scans of 77 COVID-19 patients with positive reverse transcription polymerase chain reaction results were collected between Dec 2019 and Jan 2020 in the First Affiliated Hospital of University of Science and Technology of China. The data were anonymized before any analysis. The patient and CT scan information is listed in Table 1 .
All CT images were converted to 1×1×1 mm³ spatial resolution and cropped to 256×256 pixels per slice. The image window level was set to [-800, 100] HU and all images were normalized to [-1, 1] with a zero background before being used for network training. We selected 1264 healthy CT slices and 1272 slices with pneumonia lesions from the 77 CT scans as the training dataset. Both lungs were first extracted from the CT images using a U-net segmentation network, and the results were verified before further processing. The extracted lung images were used as the input of the cycle-GAN.

In the training stage, we first use a cycle-GAN strategy to generate synthetic healthy lung CT slices. The network architecture is illustrated in Figure 1 . We denote the infected lung CT slices by domain X and the healthy lung CT slices by domain Y; the probability distributions of the two domains are referred to as p(x) and p(y), respectively. The generator G denotes the mapping from domain X to domain Y, and F denotes the mapping from domain Y to domain X. Ỹ and X̃ are the synthetic "healthy" and "infected" lung slices, respectively. Two adversarial discriminators D_X and D_Y are used to distinguish real input images from synthesized images.

The architecture of the generator is a U-Net variant consisting of 8 stages, as shown in Figure 2 . Unlike the original U-net 7 , instance normalization, which better preserves image details in the image generation process, is applied immediately after each convolutional layer except for the last one. All convolution filters in the generator have a size of 3×3 pixels. The channel number of the first block is set to 64. In the encoder part, the width and height of the feature map are halved using convolution with a stride of 2 instead of max pooling. In the first four stages, the channel number is doubled after the feature map passes each layer, while in the last four stages the channel number is fixed at 512. All the feature maps in the encoder part are concatenated with their counterparts in the decoder part.
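The intensity preprocessing described above (windowing to [-800, 100] HU and normalization to [-1, 1]) can be sketched as the following minimal NumPy function; the function name and interface are illustrative, not taken from the authors' code.

```python
import numpy as np

def preprocess_slice(hu_slice: np.ndarray,
                     window=(-800.0, 100.0)) -> np.ndarray:
    """Clip a CT slice (in HU) to the display window and rescale to [-1, 1]."""
    lo, hi = window
    clipped = np.clip(hu_slice, lo, hi)
    # Linear map [lo, hi] -> [-1, 1]
    return 2.0 * (clipped - lo) / (hi - lo) - 1.0
```

Values below -800 HU (air) saturate at -1 and values above 100 HU (soft tissue and beyond) saturate at 1, matching the stated window level.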
The encoder and decoder parts are symmetric. The discriminators D_X and D_Y are implemented by a 70 × 70 Patch-GAN 26 . The architecture of the discriminator is illustrated in Figure 3 . The first three convolution layers use a stride of 2, while the remaining convolution layers use a stride of 1. All the convolution layers have a padding of 1 and employ the Leaky-ReLU activation function with a slope parameter of 0.2. The first convolution layer generates a feature map with 64 channels. After that, the channel number is doubled each time the feature map passes a layer. In the last layer, the output is reduced to one channel.

The total loss, which aims to learn the mapping functions between the source and target domains, is

L_total(G, F, D_X, D_Y) = L_GAN(G, D_Y, X, Y) + L_GAN(F, D_X, Y, X) + λ L_cyc(G, F) + μ L_identity(G),   (1)

where λ and μ are introduced in (1) mainly to weigh the importance of the three losses. After optimization, we set λ = 10 and μ = 5. L_GAN is the adversarial loss of the discriminator measuring the difference between synthetic "healthy" slices and real healthy slices. To maintain stability during the learning process, we choose the L2 loss of LSGAN 27 as our adversarial loss instead of the sigmoid cross entropy in regular GANs 28 . L_GAN is defined as:

L_GAN(G, D_Y, X, Y) = E_{y∼p(y)}[(D_Y(y) − 1)²] + E_{x∼p(x)}[D_Y(G(x))²],

where x∼p(x) denotes sampling on domain X. L_cyc is used to keep the consistency of the two generators G and F, and is defined as:

L_cyc(G, F) = E_{x∼p(x)}[‖F(G(x)) − x‖₁] + E_{y∼p(y)}[‖G(F(y)) − y‖₁],

where ‖·‖₁ is the ℓ1 norm. Since we only want to convert unhealthy lung CT slices into healthy ones, an identity loss is included in (1) to preserve the image features when a healthy slice is input into the generator:

L_identity(G) = E_{y∼p(y)}[‖G(y) − y‖₁].

We use the ADAM optimization method to train all the networks with β₁ = 0.5 and β₂ = 0.999. Kernels are initialized randomly from a Gaussian distribution. We update the generator and the discriminator at each iteration. Each image slice is randomly cropped to patches of 256 × 256 pixels as the input. The mini-batch size is one, and the number of epochs is 100.
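The three loss terms above can be sketched numerically as follows. This is an illustrative NumPy toy operating on precomputed arrays (discriminator scores and generator outputs), not the authors' training code; the function names and the weights λ = 10, μ = 5 follow the description in the text.

```python
import numpy as np

def lsgan_d_loss(d_real: np.ndarray, d_fake: np.ndarray) -> float:
    """LSGAN least-squares loss: push real scores toward 1, fake toward 0."""
    return float(np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2))

def cycle_loss(x: np.ndarray, x_cycled: np.ndarray) -> float:
    """L1 cycle-consistency term between x and its reconstruction F(G(x))."""
    return float(np.mean(np.abs(x - x_cycled)))

def identity_loss(y: np.ndarray, g_of_y: np.ndarray) -> float:
    """L1 identity term between a healthy slice y and G(y)."""
    return float(np.mean(np.abs(y - g_of_y)))

def total_generator_loss(adv: float, cyc: float, idt: float,
                         lam: float = 10.0, mu: float = 5.0) -> float:
    """Weighted sum of the three terms with lambda = 10 and mu = 5."""
    return adv + lam * cyc + mu * idt
```

The L2 (least-squares) adversarial objective penalizes discriminator scores far from their targets quadratically, which tends to give smoother gradients than the sigmoid cross entropy of a standard GAN and matches the LSGAN choice made for training stability.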
The learning rate is initially set to 0.0002 and linearly decreased to 0 over the last 50 epochs. We stopped the training at the 85th epoch, which had the smallest loss, for optimal performance.

The post-processing steps are illustrated in Figure 4 . The synthetic healthy image is subtracted from its corresponding real infected image to obtain a difference map. A median filter is applied to the difference map to suppress noise and remove small islands. Then k-means clustering is used to segment the lesion from the low-intensity background. Finally, a 5×5 Gaussian kernel is employed to smoothen the lesion edges before erosion and dilation with a radius of 1 pixel are applied to further remove small, isolated regions. The post-processing procedures are done in MATLAB 2018a and take less than 2 minutes per patient. The training was conducted on a Linux machine with an NVIDIA RTX 2080Ti GPU and took 8 hours. For comparison, we also evaluated Otsu thresholding, a commonly used thresholding segmentation method, against the k-means clustering method.

In this study, we chose two public COVID-19 CT image databases, Coronacases and Radiopedia, to evaluate the performance of our method, with the two databases providing 10 and 9 patients with delineated lesions, respectively. The Dice similarity coefficient (DSC), volume precision (PSC), and volume sensitivity (SEN) are used to evaluate the performance of the proposed segmentation method. They are defined as:

DSC = 2|V_p ∩ V_g| / (|V_p| + |V_g|), PSC = |V_p ∩ V_g| / |V_p|, SEN = |V_p ∩ V_g| / |V_g|,

where V_p and V_g represent the predicted and ground truth lesion volumes. Our method outperforms the semi-supervised Inf-Net method. We also compare our method with a state-of-the-art unsupervised label-free 21 method (denoted as "label-free" in Table 2 ). Our method achieves higher scores in most indices and is more robust than the "label-free" method when implemented across different datasets. The results also indicate that k-means clustering performs better than Otsu thresholding in image post-processing.
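The three volume metrics can be sketched directly from their definitions, assuming the predicted and ground truth lesion volumes are binary NumPy masks; the function names are illustrative.

```python
import numpy as np

def dsc(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice similarity coefficient: 2|Vp ∩ Vg| / (|Vp| + |Vg|)."""
    inter = np.logical_and(pred, gt).sum()
    return float(2.0 * inter / (pred.sum() + gt.sum()))

def psc(pred: np.ndarray, gt: np.ndarray) -> float:
    """Volume precision: |Vp ∩ Vg| / |Vp|."""
    return float(np.logical_and(pred, gt).sum() / pred.sum())

def sen(pred: np.ndarray, gt: np.ndarray) -> float:
    """Volume sensitivity: |Vp ∩ Vg| / |Vg|."""
    return float(np.logical_and(pred, gt).sum() / gt.sum())
```

Note that the Dice coefficient is the harmonic mean of precision and sensitivity, so the reported DSC values sit between the corresponding PSC and SEN values.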
The proposed method performs well in small lesion segmentation. Figure 5(a) shows that a small lesion only 2 mm wide is correctly delineated. As shown in Figure 5(b) , our method can also readily separate lesions from the chest wall. Interestingly, our method caught some low-contrast lesions which were skipped by radiologists during manual segmentation, as shown in Figure 5(c) . The orange arrow in (a) points to a lesion detected only by our method. To further evaluate the capability of our method for lesion diagnosis, we divided the lung volume into 12 subvolumes as illustrated in Figure 7 and counted whether there were lesions in each region. We compared our method with supervised methods using evaluation metrics including diagnostic accuracy, precision and sensitivity. As shown in Table 3 , the accuracy of our method reached 93% on the Coronacases dataset and 86% on the Radiopedia dataset, which is comparable with the supervised methods.

In this study, we propose an unsupervised method for delineating COVID-19 lesions using cycle-GAN. This unsupervised learning method shows great potential in lesion segmentation without employing labeled data, and is validated on different public databases. It can work as an efficient and independent automatic segmentation method or provide a starting point for physicians' follow-up refinement.

The proposed method is robust and less database dependent. Different datasets are labeled by different radiologists, and there may exist considerable labeling inconsistency depending heavily on physicians' experiences and habits. We trained nnUnet-2D and -3D on the Coronacases database using 5-fold cross-validation, and then tested on the Radiopedia database. The results, shown in Table 4 , are much worse than those obtained when the network was trained on the mixed database in Table 3 , which is attributable to the inconsistency between the training and testing datasets. We also trained our proposed method on Coronacases and tested it on Radiopedia.
As shown in Table 4 , the results of our proposed method are less influenced by the inconsistency between different datasets.

There are some limitations to the proposed unsupervised method. The method missed some small, low-contrast lesions, as shown in Figure 8(a) . The sample number might not be sufficiently large, particularly lacking patients with lesions in the most superior and inferior parts of the lung volume, resulting in missed delineations as shown in Figure 8(b) . In this work, we trained the network with 2D images due to hardware limitations, which did not make full use of the three-dimensional image information. A future study using 3D image input may further improve segmentation accuracy, particularly the contour continuity along the slice thickness direction. For post-processing, we only used Otsu thresholding and k-means, two simple but common methods. In spite of this, the proposed unsupervised method still achieved decent segmentation results, with a Dice value of 0.748 and an accuracy of 0.93 on the Coronacases database. In the future, the unsupervised method can be combined with more sophisticated post-processing methods, such as texture analysis or even an additional deep learning network, to further improve segmentation results.

In this work, we propose an unsupervised approach that can accurately and efficiently delineate COVID-19 lesions automatically in CT scans. The training process of the unsupervised network does not rely on any labeled data. The segmentation result can serve as a baseline for further manual modification and as a quality assurance tool for lesion diagnosis. Furthermore, due to its unsupervised nature, the result is not influenced by physicians' experience, which otherwise is crucial for supervised methods.
References
WHO Coronavirus (COVID-19) Dashboard With Vaccination Data
Coronavirus disease 2019 (COVID-19)
CT manifestations of coronavirus disease-2019: A retrospective analysis of 73 cases by disease severity
Chest CT Severity Score: An Imaging Tool for Assessing Severe COVID-19
Diagnosis and treatment of coronavirus disease 2019 (COVID-19): Laboratory, PCR, and chest CT imaging findings
3D U-net: Learning dense volumetric segmentation from sparse annotation
U-net: Convolutional networks for biomedical image segmentation
Deep Residual Learning for Image Recognition
MRI-only based synthetic CT generation using dense cycle consistent generative adversarial networks
Generative adversarial networks
MR-based synthetic CT generation using a deep convolutional neural network method
Chest CT findings of coronavirus disease 2019 (COVID-19)
CovXNet: A multi-dilation convolutional neural network for automatic COVID-19 and other pneumonia detection from chest X-ray images with transferable multi-receptive feature optimization
Sonographic signs and patterns of COVID-19 pneumonia
Is There a Role for Lung Ultrasound During the COVID-19 Pandemic?
Chest CT findings in coronavirus disease 2019 (COVID-19): Relationship to duration of infection
Severity assessment of COVID-19 using CT image features and laboratory indices
A Noise-Robust Framework for Automatic Segmentation of COVID-19 Pneumonia Lesions from CT Images
Review of Artificial Intelligence Techniques in Imaging Data Acquisition, Segmentation, and Diagnosis for COVID-19
Segmentation of COVID-19 Infections on CT: Comparison of Four UNet-Based Networks
Label-Free Segmentation of COVID-19 Lesions in Lung CT
Learning COVID-19 Infection Segmentation from a Single Radiological Image
Paired cycle-GAN-based image correction for quantitative cone-beam computed tomography
Liver synthetic CT generation based on a dense-CycleGAN for MRI-only treatment planning
CBCT-based synthetic CT generation using deep-attention cycleGAN for pancreatic adaptive radiotherapy
Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks
Least Squares Generative Adversarial Networks
nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation
Inf-Net: Automatic COVID-19 Lung Infection Segmentation from CT Images
Efficient COVID-19 Segmentation from CT Slices Exploiting Semantic Segmentation with Integrated Attention Mechanism
Deep Painterly Harmonization

Research reported in this publication is supported by the Fundamental Research Funds for the Central Universities.