key: cord-0710522-40sgfik9
authors: Jiang, Xiaoben; Zhu, Yu; Zheng, Bingbing; Yang, Dawei
title: Images denoising for COVID-19 chest X-ray based on multi-resolution parallel residual CNN
date: 2021-06-28
journal: Mach Vis Appl
DOI: 10.1007/s00138-021-01224-3
sha: e6c0f6fbecb35fa41a324b38cc0d672ee52f705f
doc_id: 710522
cord_uid: 40sgfik9

Chest X-ray (CXR) is a common and economical medical imaging technology in clinical practice. Recently, coronavirus disease (COVID-19) has spread worldwide, and with the coming winter a second wave is rebounding strongly, with a detrimental effect on the global economy and health. To pre-diagnose COVID-19 as early as possible and to reduce the work pressure on medical staff, using deep learning networks to detect positive CXR images of infected patients is a critical step. However, CXR images contain complex edge structures and rich texture details that are susceptible to noise, which can interfere with diagnosis by both machines and doctors. Therefore, in this paper, we propose a novel multi-resolution parallel residual CNN (named MPR-CNN) for CXR image denoising, with a special application to COVID-19, which can improve the image quality. The core of MPR-CNN consists of several essential modules. (a) Multi-resolution parallel convolution streams are utilized for extracting more reliable spatial and semantic information from multi-scale features. (b) Efficient channel and spatial attention lets the network focus more on texture details in CXR images with fewer parameters. (c) The adaptive multi-resolution feature fusion method based on attention is utilized to improve the expressive power of the network. On the whole, MPR-CNN can simultaneously retain spatial information in the shallow layers with high resolution and semantic information in the deep layers with low resolution. Comprehensive experiments demonstrate that our MPR-CNN can better retain the texture structure details in CXR images.
Additionally, extensive experiments show that our MPR-CNN has a positive impact on CXR image classification and on the detection of COVID-19 cases from denoised CXR images.

shortage of medical resources and the overload of doctors. Here, Ouyang et al. [14] used a dual-sampling attention network to detect COVID-19 cases. The PSSPNN model [15] was proposed for classification among COVID-19, secondary pulmonary tuberculosis, community-acquired pneumonia, and healthy subjects. The DenseNet-OTLS method [16] achieved better performance than state-of-the-art approaches in diagnosing COVID-19. Refs. [17, 18] both utilized CNNs to segment COVID-19 infection in CT images. Shi et al. [19] reviewed imaging data acquisition, segmentation, and diagnosis for COVID-19 using artificial intelligence (AI). The above works are all typical methods of COVID-19 image analysis. Nevertheless, there are various types of noise in CXR images, such as ground-glass opacity, bilateral abnormalities, and interstitial abnormalities. In particular, low-dose CXR images are susceptible to noise, and the resulting complicated and blurry images are likely to interfere with diagnosis by machines and doctors [20]. Therefore, obtaining clearer details in CXR images and improving image quality by denoising is of great significance [21, 22].

Due to its high practical value, medical image denoising [20, 23-27] has been studied extensively for a long time. Mondal et al. [28] and Raj et al. [29] used discrete wavelet technology [30] for medical image denoising. These methods are simple to compute and run fast, but both performed unsatisfactorily in removing the white Gaussian noise (WGN) that widely exists in medical images. In addition to classic filtering [31-33] and transform-domain medical image denoising methods [24, 25, 34], non-local means (NLM) [35, 36] and block-matching and 3D filtering (BM3D) [37, 38], both based on self-similarity, show promising denoising performance.
Although traditional medical image denoising algorithms can improve the quality of medical images to a certain extent, they usually require manually selected parameters and complex optimization algorithms [39], and they are unable to preserve texture details effectively [20]. Recently, deep learning methods [40-46], given enough data, have achieved significant advances in image denoising over traditional handcrafted methods. They differ from traditional methods in several key respects. First, deep learning methods do not need manual parameter tuning or complicated optimization algorithms. Moreover, deep learning methods can handle many different noise types through different training data.

However, the methods above still have some obvious weaknesses. (1) Most of them ignore the connection between shallow layers and deep layers. (2) Some of these deep networks fail to extract information from feature maps effectively. (3) They lack an efficient multi-resolution feature fusion method.

Given these weaknesses, in this paper we propose a novel multi-resolution parallel residual CNN for CXR image denoising. The shallow layers, with high resolution, carry spatial information, and the deep layers, with low resolution, carry semantic information. We utilize multi-resolution parallel convolution streams to connect the spatial and semantic information. The efficient channel and spatial attention (ECSA) module is proposed to make the network focus more on texture details in CXR images with fewer parameters. Multi-resolution feature maps are usually added or concatenated directly. However, both operations provide limited expressive power to the network. Therefore, we design the adaptive multi-resolution feature fusion (AMFF) method based on attention to improve the expressive power of the network.

The main contributions of this work are summarized as follows:
(1) Multi-resolution parallel convolution streams are used to fuse information from high-resolution and low-resolution features.
They are also used to enhance the robustness of the model.
(2) An ECSA module combining efficient channel and spatial attention is proposed to make the network pay more attention to the texture details of CXR images while reducing the parameters.
(3) To improve the representation ability of the network, an attention-based AMFF method is used, which adaptively fuses multi-resolution features rather than simply concatenating or summing them.
(4) To verify the impact of the MPR-CNN, we design extensive experiments for CXR image classification. The results demonstrate the ability of our network to support the detection of COVID-19 cases from denoised CXR images.

The remainder of this paper is organized as follows. Section 2 provides a brief survey of related work. In Sect. 3, our MPR-CNN is presented first, followed by the loss function and optimization. In Sect. 4, extensive experiments are conducted to evaluate the method. Finally, a summary and future work are given in Sect. 5.

In this paper, we propose the MPR-CNN model for CXR image denoising. With the rapidly growing number of CXR images of confirmed cases, there is a pressing need to enhance image quality for improved COVID-19 detection. To better understand the composition and the core of the model, we briefly describe the representative methods for each of the central problems studied.

Deep learning has become a dominant machine learning method in image processing, for tasks such as image classification [7], image recognition [47], and image denoising, where it has demonstrated great potential and remarkable performance due to its flexible and powerful plug-in components [39]. Burger et al. [48] first utilized the multilayer perceptron (MLP) for image denoising, and their extensive experiments demonstrate that the MLP has similar or even better representation power than the hand-crafted BM3D. Besides, GANs [45, 46], which are frameworks for estimating generative models, are also good choices for suppressing noise.
Generally, such a framework consists of a generative network (G) and a discriminative network (D) that compete with each other following game theory. To improve denoising efficiency, CNNs can be regarded as a modular part into which some classic optimization methods can be inserted to restore the latent clean images, which is effective for processing noisy images. DnCNN [43] and IRCNN [49] both use a fully convolutional network with single-scale features for image denoising. An encoder-decoder method was utilized in [50-53]: the input is first gradually mapped to a low-resolution representation, and a stepwise reverse mapping is then applied to recover the original resolution. Although these CNNs have achieved good results, they still have limitations. Fully convolutional networks do not use any downsampling operations, so their feature maps retain more precise spatial details. However, these networks are less efficient in encoding contextual information due to their limited receptive field. On the other hand, encoder-decoder methods lose fine spatial details, although they gain more context information.

Multi-resolution feature fusion is an important process for improving the denoising of CXR images. The low-level features, with higher resolution, contain more position and detail information. However, they have less semantic information and more noise because they pass through fewer convolutions. In contrast, high-level features carry richer semantic information, but their resolution is very low and their perception of details is unsatisfactory. The purpose of feature fusion is to merge the features extracted from the input into new features that are more expressive than the original ones. The classic feature fusion methods are mainly divided into summation [54, 55] and concatenation [56]. Assuming the dimensions of the two input features are p and q, the dimension of the output feature Z obtained by concatenation is shown in Eq. (1).
The number of channels is increased, but the information in each channel remains the same. In contrast, assuming the two input features are x and y, the value of the output feature Z obtained by summation is shown in Eq. (2), which also involves a constant.

$$\mathrm{Dim}(Z) = p + q \tag{1}$$

However, both provide limited expressive power to the network. For this reason, we design the AMFF method based on attention to improve the expressive power of the network.

Recently, many works [57-60] have utilized channel attention or spatial attention as an effective module to improve the performance of deep learning. Hu et al. [57] first proposed the squeeze-and-excitation network (SENet) to attend to the relationship between channels; the weight of each channel is obtained by global average pooling (GAP) followed by fully connected layers. Zhang et al. [60] proposed a residual non-local attention network to address the uneven distribution of information in corrupted images. CBAM [59] combines channel and spatial attention to improve the feature extraction ability of networks. The attention mechanism enables the network to learn where to concentrate and encourages it to focus on the target object. The channel attention mechanism enhances or suppresses different channels for different tasks by modeling the weight of each feature channel. The essence of spatial attention is to locate the target and perform some transformations or obtain weights. These attention mechanisms can improve the expression of the features by establishing dependencies between channels or by applying weighted spatial attention masks. However, these methods still incur a large cost in memory and computational complexity.

In this section, we introduce the proposed CXR image denoising network MPR-CNN in detail, including the MNEB, ECSA, and AMFF.
The ECSA module is designed to make the network focus more on texture details in CXR images and to reduce the parameters by using 1D convolution instead of fully connected layers. The AMFF module, based on attention rather than simple concatenation or summation for feature fusion, is utilized to improve the expressive power of the network. The MNEB, which contains the ECSA and the AMFF, is utilized for fusing information from high- and low-resolution features. Also, the whole network uses residual blocks to reduce the difficulty of network learning. Further, the MSL1 loss and the cosine annealing strategy [61] are used to train our MPR-CNN. We describe these methods in the following subsections.

$$Z = x + y \tag{2}$$

The network architecture of the proposed MPR-CNN, consisting of ECSA, AMFF, and MNEB, is shown in Fig. 1. Here, "DS" and "US" stand for downsampling and upsampling, respectively. First, the MPR-CNN applies a convolutional layer with a filter size of 1 × 3 × 3 × 48 to extract low-level features from the input X (noisy CXR images). Then, the feature maps pass through several MNEB modules, which are described in Sect. 3.2. The MNEB is the fundamental building block of MPR-CNN. Next, we use a convolutional layer with a filter size of 48 × 3 × 3 × 1 to obtain the desired residual image R(X). Finally, we subtract R(X) from X to get the output (denoised CXR images).

The architecture of the MNEB is shown in the dotted box at the top of Fig. 1. A full-resolution convolution with a filter size of 48 × 3 × 3 × 48 is utilized to keep more precise spatial details, while 2 × downsampling with a filter size of 48 × 3 × 3 × 96 and 4 × downsampling with a filter size of 48 × 3 × 3 × 192 are performed on the original features to gain more context information. Then, we use the ECSA module, described in Sect. 3.3, to focus more on texture details in CXR images and to reduce the parameters as well.
Next, 2 × upsampling with a filter size of 96 × 3 × 3 × 48 and 4 × upsampling with a filter size of 192 × 3 × 3 × 48 are applied to restore the original feature map size. Further, the AMFF, which is utilized to fuse multi-resolution features, is described in Sect. 3.4. Finally, a convolutional layer with a filter size of 48 × 3 × 3 × 48 is applied to extract the residual information from the feature maps again. Like the whole network, the MNEB module also uses residual learning to reduce the difficulty of network learning. Multi-resolution parallel convolution streams are utilized for fusing information from high- and low-resolution features, as well as for enhancing the robustness of the model.

As shown in Fig. 2, the ECSA module is made up of channel attention and spatial attention, which make the network focus more on texture details in CXR images while reducing the parameters.

The channel attention branch is designed to enhance or suppress different channels for CXR image denoising by modeling the weight of each feature channel. Global average pooling (GAP) is applied to squeeze the input feature maps $M_C \in \mathbb{R}^{H \times W \times C}$ and yield a feature descriptor $d \in \mathbb{R}^{1 \times 1 \times C}$. The excitation operator usually passes the descriptor through two fully connected layers for dimension reduction and cross-channel interaction. However, dimension reduction has side effects on the prediction of channel attention. Therefore, we replace the two fully connected layers with a 1D convolution with a kernel size of 5 and a padding of 2. The added complexity of this method is tiny, and the improvement is significant. Next, sigmoid gating is applied to generate the activations $\hat{d} \in \mathbb{R}^{1 \times 1 \times C}$. Finally, the output of the channel attention branch is obtained by multiplying $M_C$ and $\hat{d}$.

The spatial attention branch is designed to locate the target and perform some transformations.
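As a rough illustration of the channel attention branch described above, the following NumPy sketch applies GAP, a 1D convolution across channels with a kernel size of 5 and a padding of 2, and sigmoid gating. The function name `channel_attention`, the 8 × 8 × 48 input, and the random kernel are assumptions made for illustration only; random values stand in for learned weights, and this is not the authors' implementation.

```python
import numpy as np

def channel_attention(feat, kernel):
    """Sketch of channel attention with a 1D convolution across channels.

    feat:   feature map of shape (H, W, C)
    kernel: 1D convolution weights of shape (5,), shared by all channels
    """
    num_channels = feat.shape[-1]
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    descriptor = feat.mean(axis=(0, 1))
    # Cross-channel interaction: kernel size 5 with zero padding 2,
    # so the output keeps length C (no dimension reduction).
    padded = np.pad(descriptor, 2)
    mixed = np.array([padded[i:i + 5] @ kernel for i in range(num_channels)])
    # Sigmoid gating yields one weight in (0, 1) per channel.
    gate = 1.0 / (1.0 + np.exp(-mixed))
    # Recalibrate: broadcast the per-channel weights over H and W.
    return feat * gate

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 8, 48))   # hypothetical 48-channel feature map
kernel = 0.1 * rng.standard_normal(5)    # stand-in for learned weights
out = channel_attention(feat, kernel)
print(out.shape)
```

Because the 1D kernel holds only 5 weights, its parameter count does not grow with the number of channels C, whereas two fully connected layers with a reduction ratio r need about 2C²/r weights.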
Given a feature map $M_S \in \mathbb{R}^{H \times W \times C}$, GAP and global max pooling (GMP) are first applied along the channel dimension, and their outputs are concatenated to generate a feature map $F_S \in \mathbb{R}^{H \times W \times 2}$. Next, $F_S$ passes through a convolution layer and a sigmoid activation to generate a spatial attention map $\hat{F}_S \in \mathbb{R}^{H \times W \times 1}$. Finally, the output of the spatial attention branch is obtained by multiplying $M_S$ and $\hat{F}_S$.

In the overall pipeline of the ECSA module, a convolution layer with a kernel size of 3 × 3 is first applied to extract low-level features, and PReLU is used to improve the nonlinear characteristics of the network. After another convolution layer with a kernel size of 3 × 3, the feature maps pass through the channel and spatial attention branches in parallel. Next, we concatenate the feature maps from the spatial and channel attention branches. Finally, a convolution layer with a kernel size of 3 × 3 is used to extract the residual information from the feature maps again. The ECSA module is also a residual block.

As shown in Fig. 3, we design the AMFF method based on attention, rather than directly adding or concatenating the multi-resolution feature maps, to improve the expressive power of the network. We first fuse the multi-resolution feature maps by element-wise summation, as shown in Eq. (3), to obtain the feature maps $M_{in}$:

$$M_{in} = M_1 + M_2 + M_3 \tag{3}$$

where $M_1$, $M_2$, and $M_3$ represent the 1 ×, 2 ×, and 4 × feature maps, respectively. Then, $M_{in}$ passes through GAP to extract the average information of each channel and produce a feature descriptor $D \in \mathbb{R}^{1 \times 1 \times C}$. Further, we use global depthwise convolution (GDC), in which the number of convolution groups equals the number of channels and the convolution kernel has the same size as the input feature map, to assign each position a learnable weight and obtain a new descriptor $\hat{D} \in \mathbb{R}^{1 \times 1 \times C}$.
Next, we again utilize a 1D convolution with a kernel size of 5 and a padding of 2 for cross-channel interaction while keeping the channel dimension unchanged. Afterward, sigmoid gating is applied to generate three different attention activations $S_1, S_2, S_3 \in \mathbb{R}^{1 \times 1 \times C}$. Finally, the output $M_{out}$ of the AMFF after recalibration and aggregation is defined in Eq. (4).

We propose the MSL1 loss to train our MPR-CNN by combining the multi-scale structural similarity (MS-SSIM) loss [62, 63] and the L1 loss. On the one hand, MS-SSIM can preserve the contrast in high-frequency regions of CXR images; on the other hand, the L1 loss keeps the color and brightness. The overall MSL1 loss is given by Eq. (9), which expresses the error between the desired residual image V = X − Y and the estimated one R(X) from the noisy CXR image; the weighting constant is set to 0.2 for all experiments based on the ablation study. We utilize PSNR as the evaluation metric for our MSL1 loss, as shown in Table 6 in Sect. 4.4. In addition, as shown in Eq. (10), the cosine annealing strategy is used as the learning rate schedule and decreases the learning rate from the initial value 5e−4 to 5e−6 during training. Here, T is empirically set as 5.

In this section, we first describe the datasets and then give the implementation details. Next, we compare our MPR-CNN with some state-of-the-art denoising methods. Furthermore, ablation studies are designed to explore the impact of each of our architectural components and choices on the final performance. Finally, we verify the impact of the MPR-CNN on CXR image classification.

We evaluate the denoising performance of our MPR-CNN on the COVID-19 radiography database [64] (Fig. 4). We also conduct classification experiments to verify the impact of the MPR-CNN on CXR image classification. To balance the classes for classification, we collect another 605 CXR images of COVID-19 from three datasets: (1) Fig.
1 COVID-19 chest X-ray Dataset [65], (2) COVID-19 Image Data Collection [66], and (3) ActualMed COVID-19 chest X-ray Dataset [67]. There are three types of cases: COVID-19, normal, and viral pneumonia. The detailed composition of the classification dataset is shown in Table 1.

The proposed MPR-CNN is end-to-end trainable and does not require any pre-training of sub-modules. The model is trained with the Adam optimizer ($\beta_1 = 0.9$ and $\beta_2 = 0.999$), an extension of the stochastic gradient descent algorithm, and the cosine annealing strategy decreases the learning rate from the initial value 5e-4 to 5e-6 during training. The mini-batch size is set as 16.

In this subsection, we test the denoising performance of our MPR-CNN in terms of both subjective and objective evaluation, comparing it with 7 state-of-the-art denoising methods: NLM [35], BM3D [37], DnCNN [43], IRCNN [49], FFDNet [53], SRGAN [45], and ESRGAN [46].

In terms of objective evaluation, we compare the denoising ability for different noise levels and different scaling factors with the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index. PSNR measures the ability of a model to remove noise, while SSIM measures the similarity of two images. The higher the value, the better the corresponding denoising method performs. Table 2 reports the average PSNR (dB) and SSIM of different methods on test data with noise levels of 15, 25, and 40. The proposed MPR-CNN achieves the best SSIM at noise levels of 15, 25, and 50, although its PSNR is slightly lower than that of ESRGAN when σ = 15. Also, when σ = 25, the PSNR is 0.84 dB, 0.627 dB, and 0.034 dB higher than that of BM3D, DnCNN, and ESRGAN, respectively. In particular, the SSIM is 0.037, 0.020, and 0.011 higher than that of NLM, IRCNN, and SRGAN, respectively.
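For reference, PSNR can be computed from the mean squared error between a clean image and a test image. The sketch below is a minimal NumPy illustration, assuming 8-bit images with a peak value of 255 and using a synthetic constant image in place of a real CXR image. The helper names `add_wgn` and `psnr` are hypothetical, and the text does not state whether the authors clip or quantize the noisy images, so no clipping is applied here.

```python
import numpy as np

def add_wgn(image, sigma, rng):
    """Add zero-mean white Gaussian noise with standard deviation sigma."""
    return image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(0)
clean = np.full((256, 256), 128.0)           # synthetic stand-in for a CXR image
noisy = add_wgn(clean, sigma=25.0, rng=rng)  # noise level used in several tables
noisy_psnr = psnr(clean, noisy)
print(f"PSNR of the noisy input: {noisy_psnr:.2f} dB")
```

Without clipping, the MSE of the noisy input is close to σ², so σ = 25 corresponds to roughly 10·log10(255²/25²) ≈ 20.2 dB. A denoiser is useful to the extent that it raises PSNR above this baseline.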
It is noted that our MPR-CNN achieves excellent results on denoising tasks at different noise levels (Fig. 5). Table 3 reports the average PSNR (dB) and SSIM of different methods on three types of CXR images at a noise level of 30. Our MPR-CNN is also superior to competing methods on each type of CXR image and denoises the viral pneumonia and COVID-19 types better than the normal type. Moreover, to evaluate the ability of the proposed model for blind Gaussian denoising, we also added WGN to Fig. 6a with standard deviation σ = {10, 15, 20, 25, 30, 35, 40, 45}; the line charts of PSNR and SSIM are shown in Fig. 7. The blue solid lines represent the denoising results of our MPR-CNN, and one can clearly see that the PSNR and SSIM of our MPR-CNN are higher than those of the competing methods most of the time, although the PSNR is slightly lower than that of ESRGAN when σ < 25. Figure 7 demonstrates that our proposed MPR-CNN is robust for blind CXR image denoising.

The average PSNR (dB) and SSIM of different methods on test data with different scaling factors are shown in Table 4. Here, we set three scaling factors, × 2, × 4, and × 8, and the corresponding CXR image sizes are 512 × 512, 256 × 256, and 128 × 128. The noise level is still set to 25. According to Table 4, our MPR-CNN performs better than other methods in denoising CXR images with different scaling factors.

For computation time, we select 4 state-of-the-art denoising methods for comparison on CXR image denoising. The size of the CXR image is set as 128 × 128, 256 × 256, and 512 × 512, as illustrated in Table 5. The inference time of our MPR-CNN is very competitive with that of other popular methods.

We design ablation studies to explore the impact of each of our architectural components and choices on the final performance.
All the ablation experiments use the same test data, with WGN of standard deviation σ = 25 added to simulate low-dose noisy CXR images.

First, we analyze the impact of different loss functions on denoising CXR images in Table 6. It shows that the proposed MSL1 loss with a weighting constant of 0.2 achieves the best denoising performance among the loss functions, with a PSNR 0.243 dB higher than the L1 loss and 0.46 dB higher than MS-SSIM. The SSIM also improves to some degree. It can be concluded that the MSL1 loss preserves the contrast in high-frequency regions of CXR images while keeping the color and brightness.

Then, we study the influence of the number of multi-resolution streams in the MNEB on CXR image denoising quality in Table 7. According to Table 7, the MNEB with two resolution streams is better than that with a single stream, and three resolution streams perform best. Therefore, it can be concluded that increasing the number of streams provides a significant improvement for CXR image denoising and that the MNEB is important for improving CXR image quality.

Finally, in Table 8, we conduct ablation studies on the impact of the proposed ECSA and AMFF on CXR image denoising. From the first three columns, we note that the AMFF based on attention, rather than simple concatenation or summation for feature fusion, improves the expressive power of the network, with a gain of 0.318 dB over summation and 0.108 dB over concatenation. Moreover, it is also evident from Table 8 that the ECSA module has a positive effect on our MPR-CNN, increasing the SSIM by 0.019, 0.020, and 0.020, respectively. Extensive ablation experiments show that the proposed MSL1 loss and the MNEB preserve more detailed information in CXR images, and that the ECSA and AMFF both have positive influences on the final CXR image quality.
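The three fusion strategies compared in Table 8 can be sketched as follows. This is a simplified NumPy illustration under several assumptions, not the authors' implementation: the three streams are assumed to be already resampled to the same H × W × C shape, the global depthwise convolution step is omitted, one 1D kernel per stream is assumed to produce the three gates, and the output is assumed to be the gate-weighted sum of the streams, since the text does not spell out Eq. (4). All function names are hypothetical.

```python
import numpy as np

def conv1d_same(vec, kernel):
    """1D cross-correlation with kernel size 5 and zero padding 2."""
    padded = np.pad(vec, 2)
    return np.array([padded[i:i + 5] @ kernel for i in range(vec.size)])

def fuse_sum(maps):
    """Plain element-wise summation: the channel count stays C."""
    return np.sum(np.stack(maps), axis=0)

def fuse_concat(maps):
    """Concatenation along channels: the channel count becomes 3C."""
    return np.concatenate(maps, axis=-1)

def fuse_attention(maps, kernels):
    """Attention-weighted fusion (assumed form: sum over i of S_i * M_i)."""
    fused = fuse_sum(maps)                    # element-wise sum of the streams
    descriptor = fused.mean(axis=(0, 1))      # GAP -> (C,)
    # One sigmoid gate per stream, each of shape (C,)
    gates = [1.0 / (1.0 + np.exp(-conv1d_same(descriptor, k))) for k in kernels]
    # Recalibrate each stream with its gate, then aggregate.
    return np.sum(np.stack([g * m for g, m in zip(gates, maps)]), axis=0)

rng = np.random.default_rng(0)
maps = [rng.standard_normal((8, 8, 48)) for _ in range(3)]  # hypothetical streams
kernels = [0.1 * rng.standard_normal(5) for _ in range(3)]  # stand-ins for learned weights
print(fuse_sum(maps).shape, fuse_concat(maps).shape,
      fuse_attention(maps, kernels).shape)
```

Summation and concatenation apply fixed rules, while the attention variant lets the per-channel weights of each stream depend on the input, which is the property the ablation attributes the gain to.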
In this subsection, we not only conduct denoising experiments but also use the CXR images denoised by MPR-CNN for classification. To evaluate the effectiveness of the MPR-CNN, we use three classic classification networks: ResNet18 [47], VGG19 [68], and DenseNet121 [69] (Fig. 8). We use the classification dataset introduced in Sect. 4.1. The image size is set to 512 × 512, and WGN with standard deviation σ = 20 is added to simulate low-dose noisy CXR images. Here, the vertical axis represents the true label, while the horizontal axis represents the predicted one, so the diagonal entries count the correct classifications. The correctly classified normal cases increase by 5 and 6 after denoising using VGG19 and DenseNet121, respectively. The correctly classified viral pneumonia cases increase by 63 and 38 after denoising using ResNet18 and VGG19, respectively. In particular, the correctly classified COVID-19 cases increase by 11, 17, and 4 using ResNet18, VGG19, and DenseNet121, respectively. Hence, it can be clearly seen that the MPR-CNN has a positive impact on CXR image classification.

The classification results of the different models on images denoised by DnCNN and MPR-CNN are shown in Table 9. To quantify the performance of the classification networks, we calculate the test accuracy (ACC), sensitivity (SEN), and precision (PRE) of each infection type on the above classification dataset. Here, a higher SEN corresponds to a lower probability of missing positive cases, and a higher PRE corresponds to a lower probability of misdiagnosing negative cases. After denoising by MPR-CNN, the ACCs of ResNet18, VGG19, and DenseNet121 are improved by 8.96%, 8.53%, and 8.52%, respectively, while the PREs are improved by 7.41%, 7.26%, and 7.37%, respectively. Compared with DnCNN, the SENs improve by 0.56%, 1.13%, and 1.56% using ResNet18, VGG19, and DenseNet121, respectively.
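The metrics above can be derived from a confusion matrix whose rows are true classes and whose columns are predicted classes. The sketch below uses made-up counts purely for illustration (they are not the values in Fig. 8 or Table 9), and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(cm):
    """Compute overall accuracy and per-class sensitivity and precision.

    cm[i][j] is the number of samples with true class i predicted as class j.
    Assumes every class has at least one true sample and one prediction.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    # ACC: correctly classified samples (diagonal) over all samples
    acc = sum(cm[i][i] for i in range(n)) / total
    # SEN (recall): TP / (TP + FN), computed along each row
    sen = [cm[i][i] / sum(cm[i]) for i in range(n)]
    # PRE: TP / (TP + FP), computed along each column
    pre = [cm[i][i] / sum(cm[r][i] for r in range(n)) for i in range(n)]
    return acc, sen, pre

# Hypothetical counts; class order: COVID-19, normal, viral pneumonia
cm = [
    [90, 5, 5],
    [4, 92, 4],
    [10, 6, 84],
]
acc, sen, pre = classification_metrics(cm)
print(f"ACC = {acc:.4f}")
print("SEN =", [round(s, 4) for s in sen])
print("PRE =", [round(p, 4) for p in pre])
```

Reading the matrix this way makes the interpretation in the text concrete: off-diagonal entries in a row are missed cases of that class (lowering SEN), and off-diagonal entries in a column are false alarms for that class (lowering PRE).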
Meanwhile, the classification performance on CXR images denoised by MPR-CNN is very close to that on the original images: the ACCs decrease by only 0.57%, 0.30%, and 0.32% using ResNet18, VGG19, and DenseNet121, respectively. Furthermore, it can be concluded that classification models fed with CXR images denoised by MPR-CNN have a lower probability of missing COVID-19 cases, as well as a lower probability of misdiagnosing negative cases.

In this paper, we propose a novel MPR-CNN for CXR image denoising, with a special application to COVID-19, that can improve image quality. Multi-resolution parallel convolution streams are utilized for fusing information from both high- and low-resolution features. The ECSA module is proposed to make the network focus more on texture details in CXR images as well as to reduce the parameters. The AMFF method based on attention, rather than simple concatenation or summation for feature fusion, is utilized to improve the expressive power of the network. The MSL1 loss is utilized to preserve the contrast in high-frequency regions of CXR images while keeping the color and brightness. The extensive experiments demonstrate that all the proposed methods have significant impacts on CXR image denoising. Compared with competing methods, our MPR-CNN has the best performance in both subjective visual evaluation and objective indicators. Our proposed MPR-CNN is also very robust for blind CXR image denoising. Moreover, extensive experiments show that the proposed MPR-CNN has a positive impact on CXR image classification and on the detection of COVID-19 cases from denoised CXR images. On the whole, the proposed MPR-CNN can provide a clearer and more rigorous diagnostic basis for both radiologists and machines. We will continue to follow the development of COVID-19, and our future work will concentrate on effectively reducing the noise artifacts in COVID-19 CXR images with the current method.
The goal is to improve the quality of COVID-19 CXR images so that COVID-19 cases can be classified and detected more accurately from denoised CXR images.

[1] CovidGAN: data augmentation using auxiliary classifier GAN for improved Covid-19 detection
[2] World Health Organization declares global emergency: a review of the 2019 novel coronavirus (COVID-19)
[3] Deep learning COVID-19 features on CXR using limited training data sets
[4] Towards efficient COVID-19 CT annotation: a benchmark for lung and infection segmentation
[5] Deep learning
[6] Deep learning in neural networks: an overview
[7] ImageNet classification with deep convolutional neural networks
[8] Deep multi-view enhancement hashing for image retrieval
[9] 3D room layout estimation from a single RGB image
[10] Depth image denoising using nuclear norm and learning graph model
[11] Task-adaptive attention for image captioning
[12] A survey on deep learning in medical image analysis
[13] Cerebral micro-bleeding detection based on densely connected neural network
[14] Dual-sampling attention network for diagnosis of COVID-19 from community acquired pneumonia
[15] PSSPNN: PatchShuffle Stochastic Pooling Neural Network for an explainable diagnosis of COVID-19 with multiple-way data augmentation
[16] Covid-19 diagnosis via DenseNet and optimization of transfer learning setting
[17] Domain adaptation based self-correction model for COVID-19 infection segmentation in CT images
[18] Inf-net: automatic covid-19 lung infection segmentation from CT images
[19] Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for covid-19
[20] Chest X-ray image denoising method based on deep convolution neural network
[21] Automated chest screening based on a hybrid model of transfer learning and convolutional sparse denoising autoencoder
[22] Performance evaluation of image denoising developed using convolutional denoising autoencoders in chest radiography
[23] Total variation wavelet-based medical image denoising
[24] Wavelet-domain medical image denoising using bivariate laplacian mixture model
[25] Medical image denoising using adaptive threshold based on contourlet transform
[26] Group-sparse representation with dictionary learning for medical image denoising and fusion
[27] Medical image denoising using convolutional denoising autoencoders
[28] Denoising and compression of medical image in wavelet 2D
[29] Denoising of medical images using undecimated wavelet transform
[30] An image fusion algorithm using wavelet transform
[31] Bilateral filtering for gray and color images
[32] Medical image denoising using bilateral filter
[33] An adaptive median filter for image denoising
[34] Locally adaptive wavelet domain Bayesian processor for denoising medical ultrasound images using speckle modelling based on Rayleigh distribution
[35] A non-local algorithm for image denoising
[36] Medical image denoising by parallel non-local means
[37] Image denoising by sparse 3-D transform-domain collaborative filtering
[38] Ultra-low-dose CT image denoising using modified BM3D scheme tailored to data statistics
[39] Attention-guided CNN for image denoising
[40] Real image denoising with feature attention
[41] Encoder-decoder with atrous separable convolution for semantic image segmentation
[42] Convolutional auto-encoder for image denoising of ultra-low-dose CT
[43] Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising
[44] Deep shrinkage convolutional neural network for adaptive noise reduction
[45] Photo-realistic single image super-resolution using a generative adversarial network
[46] ESRGAN: enhanced super-resolution generative adversarial networks
[47] Deep residual learning for image recognition
[48] Image denoising: can plain neural networks compete with BM3D?
[49] Learning deep CNN denoiser prior for image restoration
[50] DeblurGAN-v2: deblurring (orders-of-magnitude) faster and better
[51] Kindling the darkness: a practical low-light image enhancer
[52] Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections
[53] FFDNet: toward a fast and flexible solution for CNN-based image denoising
[54] Generalized K-L transform based combined feature extraction
[55] Feature fusion: parallel strategy vs. serial strategy
[56] A shape and texture-based enhanced Fisher classifier for face recognition
[57] Squeeze and excitation networks
[58] Non-local neural networks
[59] CBAM: convolutional block attention module
[60] Residual non-local attention networks for image restoration
[61] SGDR: stochastic gradient descent with warm restarts
[62] Image quality assessment: from error visibility to structural similarity
[63] Multiscale structural similarity for image quality assessment
[64] COVID-19 radiography database
[65] Figure 1 COVID-19 chest X-ray data initiative
[66] COVID-19 image data collection
[67] Actualmed COVID-19 chest x-ray data initiative
[68] Very deep convolutional networks for large-scale image recognition
[69] Network in network

The author(s) declared no conflicts of interest with respect to the research, authorship, and publication of this paper.

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Xiaoben Jiang is pursuing the PhD degree at East China University of Science and Technology. His current research interests include digital image processing and computer vision. His experience includes denoising methods for chest X-ray images and CT images, and the detection of COVID-19 cases from denoised CXR images. He has published in journals at the intersection of medical science and computer vision and has been involved in publicly and privately funded projects.

Yu Zhu received the PhD degree in optical engineering from Nanjing University of Science and Technology, China, in 1999.
She is currently a professor in the Department of Electronics and Communication Engineering of East China University of Science and Technology. Her research interests include image processing, computer vision, multimedia communication and deep learning. She has published more than 100 papers in journals and conferences.

Bingbing Zheng obtained the BS degree in Information Science and Engineering from East China University of Science and Technology in 2015 and is pursuing the PhD degree in East China University of Science and Technology. His main research interests include deep learning for medical image processing and computer vision. His experience includes the identification and detection of pulmonary nodules on CT images, the classification and segmentation of prostate on MRI, and the classification and segmentation of COVID-19. He has published in journals and conferences in the crossing field of medical science and computer vision and has been involved in publicly and privately funded projects.