key: cord-1046819-1ar548i4
authors: Mirza, Muhammad Waqar; Siddiq, Asif; Khan, Ishtiaq Rasool
title: A comparative study of medical image enhancement algorithms and quality assessment metrics on COVID-19 CT images
date: 2022-04-25
journal: Signal Image Video Process
DOI: 10.1007/s11760-022-02214-2
sha: 75f1d1feedcc52cc340dd0c62e84b35e4b3f0c52
doc_id: 1046819
cord_uid: 1ar548i4

Medical imaging can help doctors diagnose several conditions more accurately. During the present COVID-19 pandemic, timely detection of the novel coronavirus is crucial, since it can help in curing the disease at an early stage. Image enhancement techniques can improve the visual appearance of COVID-19 CT scans and speed up the process of diagnosis. In this study, we analyze some state-of-the-art image enhancement techniques for their suitability in enhancing the CT scans of COVID-19 patients. Six quantitative metrics, Entropy, SSIM, AMBE, PSNR, EME, and EMEE, are used to evaluate the enhanced images. Two experienced radiologists were involved in the study to evaluate the performance of the enhancement techniques and the quantitative metrics used to assess them.

COVID-19 (coronavirus disease 2019) is a contagious disease caused by the SARS-CoV-2 virus that mostly affects the respiratory system, causing pneumonia and acute respiratory distress syndrome [1]. The COVID-19 outbreak is expected to continue inflicting significant health issues and causing high fatality while drastically affecting society and economies around the world [2]. As of December 2021, over 268 million COVID-19 cases and over 5.2 million deaths had been reported worldwide according to the World Health Organization [3]. COVID-19 is commonly diagnosed by Reverse Transcription Polymerase Chain Reaction (RT-PCR), which has low accuracy, a slow response time, and low sensitivity [4].
Early infection detection improves the likelihood of successful treatment for affected patients while also reducing the spread of an infectious disease like COVID-19 in the community. In a recent work, COVID-19 patients were classified using medical monitoring sensors and efficient artificial neural networks based on physiological and psychological inputs [5]. Various learning-based prediction models were used to estimate the infection rate, the possibility of the second and third waves of the pandemic, and the risk of an outbreak linked with travelling. Radiography techniques, such as chest X-ray or computed tomography (CT), are commonly used to diagnose lung disorders such as pneumonia [6] and COVID-19 [7]. Many studies have recently been conducted to detect COVID-19 from X-ray and CT images using various AI-based approaches. To increase network performance in classifying an image as COVID-19, normal, or another lung disorder, various transfer learning approaches, unique network designs, and ensemble solutions have been proposed [8].

Medical images, especially CT scans, are difficult to visualize due to low contrast [9]. This can affect the performance of AI-based techniques as well as the quality of the diagnostic outcomes [10]. Adequate local contrast can improve diagnostic accuracy [11] and drastically decrease the processing time [12]. Compared to magnetic resonance imaging (MRI), CT images have low soft-tissue contrast, a relatively high noise level, and a high dynamic range [13]. These factors, along with the advancements in medical imaging technology, have motivated the image processing community to develop enhancement algorithms for these images [14]. Contrast enhancement to improve the visual appearance of an image is a crucial preprocessing step in the field of medical image processing. Techniques such as reducing blur and noise can increase contrast and offer more information that can be extracted from the image.
This step is especially important now: with the large number of CT scans obtained every day in hospitals for COVID-19 patients, the quality of the acquired images can fluctuate due to numerous factors, such as the patient's condition, breathing state, and human error [8]. The primary objective of image enhancement is to increase the interpretability of the information contained in the image for human viewers, or to allow machine learning-based algorithms to extract useful features accurately. The enhanced images obtained by modifying the pixel intensity of the input image through nonlinear transformations are qualitatively better than the original images. However, the applied transformation must preserve the information and not alter it during the enhancing process [8].

Different image enhancement techniques have been developed over the years to improve image quality, increase contrast, and preserve the details in the images. Among those, histogram-based image enhancement techniques are widely used as they are simple to implement and produce good results [9, 15]. The best aspect of histogram-based approaches is the ease with which separate processing methods can be integrated. The real-time processing capability of histogram-based methods makes them more compelling for the diagnosis of COVID-19 through CT scans.

In this paper, we apply several existing histogram-based enhancement techniques to a variety of COVID-19 CT images acquired in DICOM format. We have carried out a detailed evaluation of the techniques, through visual comparisons and using several quantitative metrics, in terms of enhancing the details for more accurate and faster diagnosis. The comparative study aims at helping frontline health workers, as there is an urgent need to assist healthcare professionals and radiologists in making accurate COVID-19 diagnoses in a short time.
A summary of the techniques included in this study and their performance on the existing medical image datasets is given in Table 1. It can be noted that these techniques have been evaluated on different image datasets using different evaluation methods. Therefore, it is not easy to draw reliable conclusions on their performance. Our study evaluates them all on the same COVID-19 image dataset. Therefore, besides determining their appropriateness for COVID-19 images, it measures their relative performance as well. There are several quantitative measures used in the literature to indicate the quality of medical images, as can be seen in Table 1. We use some of the most common metrics for our study and evaluate their performance as well.

The rest of this paper is organized as follows. Section 2 describes commonly used histogram-based image enhancement techniques. Section 3 gives a summary of the quantitative metrics commonly used for medical images. Section 4 presents the results of our experimental evaluation, while some conclusions drawn from the study are stated in Sect. 5.

Note to Table 1: The range of reported scores on the used datasets is given in [min, max] notation. If only one value is given, it indicates the average score on the entire dataset. The works from which these results are taken are cited in the second column. In general, a high score indicates better quality, except for the metrics marked with an asterisk, which assign a low score for better quality.

Histograms serve as the foundation for many spatial domain image processing techniques. In the existing literature, histogram-based enhancement techniques are among the oldest as well as the most recent and the most effective ones. Histograms are easy to calculate and manipulate in software, and they also adapt well to limited hardware implementations, making them popular for real-time image processing and enhancement [16].
In this section, we briefly describe some popular histogram-based image enhancement techniques that are used in our study.

Histogram equalization (HE) [17] is one of the simplest contrast improvement methods and has been used on medical images since the mid-1980s [18, 19, 31]. Although HE is simple, it can produce visual artefacts, noise intensification, level saturation, a washed-out effect, and under- and over-enhancement [20]. These unwanted visual degradations are inevitable due to the considerable shift in mean brightness caused by HE [32]. Many improved HE variants have been proposed to overcome these problems. Several solutions have been presented in recent years to address the shortcomings of histogram equalization [15] and to resolve the mean shift and brightness preservation issue in the resultant image [14].

A method known as brightness preserving bi-histogram equalization (BBHE) [21] was proposed to conserve the mean brightness and improve the contrast of an input image. BBHE initially divides an input image into two sub-images based on the input image's mean. The sub-images are equalized independently based on the transformation function, and the result of BBHE is a combination of the equalized sub-images.

Dualistic sub-image histogram equalization (DSIHE) divides the input image histogram at the gray level where the cumulative distribution function (CDF) reaches its median value [22]. DSIHE decomposes the image aiming at the preservation and maximization of entropy [33] of the resultant image. The two sub-images, one dark and one bright, are processed using HE and combined into a single enhanced image.

Minimum mean brightness error bi-histogram equalization (MMBEBHE) is an extension of BBHE, which separates the histogram at the threshold level that yields the smallest absolute mean brightness error [23]. The input histogram is divided using this threshold, while the rest of the enhancement method remains the same as in BBHE.
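Since BBHE, DSIHE, and MMBEBHE differ only in how the split threshold is chosen, the shared two-segment equalization can be sketched in a few lines. The following is a minimal NumPy illustration, not the implementation used in the study: it assumes 8-bit grayscale input, all function names are hypothetical, and the MMBEBHE threshold is found by a plain exhaustive search rather than the efficient computation of the original method.

```python
import numpy as np

LEVELS = 256  # assumption: 8-bit grayscale input


def sub_lut(hist, lo, hi):
    """Equalization lookup table for gray levels lo..hi, mapped back onto [lo, hi]."""
    counts = hist[lo:hi + 1].astype(np.float64)
    total = counts.sum()
    if total == 0:
        return np.arange(lo, hi + 1)  # empty segment: leave levels unchanged
    cdf = np.cumsum(counts) / total
    return np.round(lo + (hi - lo) * cdf).astype(np.int64)


def build_lut(hist, t):
    """Equalize levels 0..t and t+1..255 independently, each within its own range."""
    lut = np.empty(LEVELS, dtype=np.int64)
    lut[:t + 1] = sub_lut(hist, 0, t)
    lut[t + 1:] = sub_lut(hist, t + 1, LEVELS - 1)
    return lut


def bi_histogram_equalize(img, threshold):
    """Two-segment equalization of a uint8 image split at the given gray level."""
    hist = np.bincount(img.ravel(), minlength=LEVELS)
    return build_lut(hist, int(threshold))[img].astype(np.uint8)


def mmbebhe_threshold(img):
    """Brute-force search for the threshold with the smallest absolute mean brightness error."""
    hist = np.bincount(img.ravel(), minlength=LEVELS)
    n = hist.sum()
    mean_in = (hist * np.arange(LEVELS)).sum() / n
    errors = [abs((hist * build_lut(hist, t)).sum() / n - mean_in)
              for t in range(LEVELS)]
    return int(np.argmin(errors))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)  # synthetic stand-in image
    bbhe = bi_histogram_equalize(img, int(img.mean()))         # split at the mean
    dsihe = bi_histogram_equalize(img, int(np.median(img)))    # split at the median
    mmbebhe = bi_histogram_equalize(img, mmbebhe_threshold(img))
    print(img.mean(), bbhe.mean(), dsihe.mean(), mmbebhe.mean())
```

Because each segment is mapped back onto its own gray-level range, dark pixels stay below the threshold and bright pixels stay above it, which is what limits the shift in mean brightness compared to plain HE.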
Recursive mean-separate histogram equalization (RMSHE) performs histogram decomposition recursively, where each new histogram is separated further based on its individual mean value [24]. RMSHE provides scalable brightness preservation to overcome the disadvantages of the previous techniques.

Recursive sub-image histogram equalization (RSIHE) is a generalization of the DSIHE technique. It divides the input histogram at the gray level where the cumulative probability density equals 0.5 [25]. This procedure is repeated for a set number of recursion levels to evenly divide the image into sub-images. The segmented histograms are independently equalized and combined to give the enhanced image. This technique also provides scalable brightness preservation because of its recursive nature.

Recursively separated and weighted histogram equalization (RSWHE) is similar to RMSHE and RSIHE, but it adds a histogram weighting step before equalization [26]. The histogram segmentation step generates 2^r sub-histograms at recursion level r. Using a normalized power-law function, the histogram weighting step alters the probability density function of each sub-histogram. Each of the 2^r sub-histograms is equalized separately, and finally all sub-images are combined to yield the enhanced image.

Adaptive gamma correction with weighting distribution (AGCWD) replaces the transformation function of HE with a new function based on adaptive gamma correction [27]. The technique can prevent abrupt changes in high intensities while enhancing the low intensities. Furthermore, AGCWD utilizes a weighting distribution function to smooth the enhancement. An image with a high probability density function (PDF) will not be overly enhanced, whereas an image with a low PDF will not be under-enhanced [47].

Exposure-based sub-image histogram equalization (ESIHE) is particularly effective for low-exposure grayscale images and preserves entropy while providing control over the enhancement. The algorithm involves calculating the exposure threshold, histogram clipping, and histogram subdivision and equalization [28, 34]. The threshold parameter divides the image into underexposed and overexposed sub-images.
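As an illustration of the recursive decomposition used by RMSHE and RSIHE earlier in this section, the sketch below splits the histogram at the mean (or at the cumulative probability of 0.5) down to a chosen recursion level and equalizes each segment within its own gray-level range. It is a simplified NumPy sketch under stated assumptions, not the code of the cited methods: 8-bit input is assumed, the function names and the `use_median` flag are hypothetical, and the weighting step of RSWHE is omitted.

```python
import numpy as np

LEVELS = 256  # assumption: 8-bit grayscale input


def split_points(hist, lo, hi, depth, use_median=False):
    """Sorted split levels obtained by recursively dividing gray levels lo..hi."""
    if depth == 0 or hi <= lo:
        return []
    counts = hist[lo:hi + 1].astype(np.float64)
    total = counts.sum()
    if total == 0:
        return []
    levels = np.arange(lo, hi + 1)
    if use_median:
        # RSIHE-style split: first level whose cumulative probability reaches 0.5
        t = int(levels[np.searchsorted(np.cumsum(counts) / total, 0.5)])
    else:
        # RMSHE-style split: mean gray level of the current segment
        t = int((counts * levels).sum() / total)
    t = min(t, hi - 1)  # keep the upper segment non-empty
    return (split_points(hist, lo, t, depth - 1, use_median) + [t]
            + split_points(hist, t + 1, hi, depth - 1, use_median))


def recursive_equalize(img, depth=2, use_median=False):
    """Equalize up to 2**depth histogram segments independently and recombine them."""
    hist = np.bincount(img.ravel(), minlength=LEVELS)
    cuts = split_points(hist, 0, LEVELS - 1, depth, use_median)
    bounds = [-1] + cuts + [LEVELS - 1]
    lut = np.arange(LEVELS)
    for a, b in zip(bounds[:-1], bounds[1:]):
        lo, hi = a + 1, b
        counts = hist[lo:hi + 1].astype(np.float64)
        if counts.sum() > 0:
            cdf = np.cumsum(counts) / counts.sum()
            lut[lo:hi + 1] = np.round(lo + (hi - lo) * cdf).astype(np.int64)
    return lut[img].astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)  # synthetic stand-in image
    for r in (1, 2, 3):
        out = recursive_equalize(img, depth=r)
        print(r, abs(float(out.mean()) - float(img.mean())))
```

Increasing the recursion level gives more, narrower segments, so each pixel can move less far from its original value; this is the sense in which the brightness preservation is scalable.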
Histogram clipping is done to prevent over-enhancement, by computing a clipping threshold and clipping the histogram bins with larger counts.

R-ESIHE is an extension of the ESIHE method, which performs ESIHE on the image iteratively until the difference in exposure levels between subsequent iterations is smaller than a certain threshold [29]. R-ESIHE has the same implementation steps as ESIHE; the only difference is the recursive implementation. In the other variant, RS-ESIHE, the histogram is first divided based on the individual exposure threshold, and histogram equalization is applied to all the sub-histograms. Clipping is done in both algorithms to avoid over-enhancement.

In bilateral histogram equalization (BHE), several slices of the input image are obtained by dividing its dynamic range into uniform bins [30]. Each bin is stretched to enhance the contrast, and a cross bilateral filter is used to suppress the artefacts. To avoid over-enhancement, a variant of BHE referred to as contrast limited BHE (cl-BHE) has also been proposed, which clips the bin counts at a specified threshold and redistributes the excess pixels to the other nonzero bins.

Tone-mapping is a process used to transform high dynamic range (HDR) images for visualization on standard screens. Medical images stored in the DICOM format are HDR; therefore, existing tone-mapping operators (TMOs) have also been used to enhance CT scans. We include a recently proposed histogram-based TMO, which uses pixel intensities in a perceptual domain defined by the well-known perceptual quantizer (PQ) function. This TMO has not been used for medical images so far in the existing literature.

A recent technique designed for medical image enhancement uses a novel iterative method for histogram construction. Initially, all pixel values are assumed to be in a single bin, which is split into two such that the clustering error is nearly equal in each new bin.
This process continues iteratively until 256 bins are obtained, and in each iteration the bin with the largest clustering error is split into two. The pixels contained in bin b_i, 0 ≤ i ≤ 255, are assigned the new value i. The resultant image is visually enhanced and suitable for display on standard screens.

Several quantitative metrics have been proposed in the literature to quantify image quality, as can be noted in Table 1. We provide a brief description of the most popular ones below.

Entropy measures the level of detail present in the image [14] and is often used as an assessment parameter for medical images. Shannon entropy is commonly used for this purpose; it is a blind metric, i.e., it does not require a reference image to determine its value [32].

EME computes the contrast in the reference and the enhanced images to determine the level of improvement [41]. To measure the contrast, the image is divided into smaller blocks. The difference between the maximum and minimum intensity levels in each block, taken in the log domain, is measured as the local contrast, and all values are summed to calculate the value of EME. The measure is suitable for images with uniform backgrounds [42].

EMEE measures the enhancement by the entropy of the local contrast values of the image blocks used in EME [43].

Enhancement of contrast and detail can lead to an increase in the noise level present in medical images. The peak signal-to-noise ratio (PSNR) [39] is a good measure of noise in an image, and it measures the degree of deterioration of the enhanced image in comparison with the input image [40].

The human visual system is extremely capable of extracting structural information from a scene (image) and, as a result, it can distinguish the structural differences between a reference image and a processed image [37].
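To make the definitions of entropy, EME, and PSNR given above more concrete, here is a small NumPy sketch of the three measures. It is a hedged illustration rather than the exact formulation used for the reported scores: 8-bit grayscale images are assumed, the block size, the `eps` offset that avoids division by zero, the `20 * log10` scaling, and the averaging over blocks in `eme` follow a commonly used form of the measure and are assumptions, and all function names are hypothetical.

```python
import numpy as np


def shannon_entropy(img, levels=256):
    """Shannon entropy (bits) of the gray-level histogram; needs no reference image."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(np.float64)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())


def eme(img, block=8, eps=1.0):
    """Block-wise log contrast: mean of 20*log10(max/min) over non-overlapping blocks."""
    h, w = img.shape
    h, w = h - h % block, w - w % block  # drop incomplete border blocks
    blocks = img[:h, :w].astype(np.float64).reshape(h // block, block, w // block, block)
    bmax = blocks.max(axis=(1, 3))
    bmin = blocks.min(axis=(1, 3))
    return float(np.mean(20.0 * np.log10((bmax + eps) / (bmin + eps))))


def psnr(reference, enhanced, peak=255.0):
    """Peak signal-to-noise ratio (dB) of the enhanced image relative to the input."""
    diff = reference.astype(np.float64) - enhanced.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.integers(60, 120, size=(64, 64), dtype=np.uint8)  # synthetic low-contrast image
    stretched = ((original.astype(np.float64) - 60) * (255.0 / 59.0)).astype(np.uint8)
    print(shannon_entropy(original), shannon_entropy(stretched))
    print(eme(original), eme(stretched))
    print(psnr(original, stretched))
```

Note that entropy and EME are computed on a single image, so an enhancement method can raise them simply by amplifying noise, whereas PSNR compares the enhanced image against the input.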
The SSIM index compares three features of the reference and the test images, namely structure, luminance, and contrast, to give a single similarity score. The SSIM value ranges from 0 to 1, and a higher value is considered better.

AMBE represents the difference in mean brightness between the original and the enhanced image. This metric computes the degree of brightness distortion produced by enhancement. The lower the AMBE value, the better the preservation of brightness in the image [38, 39].

The six metrics mentioned above are the ones most commonly used for medical images. However, certain other metrics also exist, as can be seen in Table 1. For completeness, we briefly describe those metrics as well for interested readers. Feature Similarity (FSIM) computes a local similarity map between the original and enhanced images and aggregates the local similarities into a single score. Natural Image Quality Evaluator (NIQE) uses a set of statistical features derived from a dataset of natural, undistorted images to compute the difference between the original and the enhanced image. Mean Squared Error (MSE) calculates the average of the squared errors between the enhanced and original images. Discrete Entropy (DE) assesses the richness of details in an image after enhancement. Contrast is a second-order statistical measure that gives the intensity difference between a pixel and its neighbor across the entire image. Contrast Improvement Index (CII) is calculated as the ratio of the contrast values of the enhanced and original images. Michelson Contrast (MC) is calculated by dividing the difference between the maximum and minimum intensity values of the image by the sum of the maximum and minimum intensity values. Weber Contrast (WC) is calculated by dividing the difference between the maximum and minimum luminance of the image by the maximum luminance. Structure Similarity of Tone-mapped Image Quality Index (TMQI-S) is a commonly used measure to determine the structural fidelity of tone-mapped HDR images.
Recently, it has been used in some works to measure the quality of enhanced medical images.

For this study, we used 50 CT images with confirmed COVID-19 diagnosis at different levels of infection. These images were taken from a public dataset on Harvard Dataverse [35], which contains 1013 images in DICOM format [36]. They were enhanced using the 14 histogram-based techniques discussed in Sect. 2, thus giving a total of 700 images. All images were evaluated using six quantitative metrics (entropy, EME, EMEE, SSIM, AMBE, and PSNR), as well as by two experienced radiologists, to assess the performance of the enhancement techniques. Furthermore, we calculate the correlation between the evaluation done by the radiologists and the metrics to assess the relative performance of the metrics. Details are given below in this section.

While professional radiologists are trained to read standard CT scans, the accuracy of diagnosis and the time taken for it can be improved by using images of better visual quality. Typical examples of images enhanced by the 14 techniques used in this study are shown in Fig. 1. We applied the six above-mentioned metrics to all 50 sets of enhanced images and calculated the average scores of each technique, as shown in Table 2. Based on these average scores, a rank between 1 and 14 is assigned to each technique by each metric, as shown in Table 3.

We showed the enhanced results to two of the most senior radiologists in the country. For this, we prepared 50 sheets of images, each containing 14 enhanced images generated by the techniques being evaluated. The images on each sheet were only numbered from 1 to 14, without mentioning the names of the techniques, to avoid any bias. For the same reason, the location of the images generated by different techniques was also randomized on each sheet. Figure 1 is one example of those sheets.
Each radiologist was asked to mark the three best images on each sheet based on the visibility of infected regions in the image, which could help in better diagnosis. To minimize the judgement error, we take the total number of times a technique appeared in any of the first three positions as the score of the technique. These scores and the rankings of the enhancement techniques based on them are also shown in Table 3.

Comparing the rankings given by the metrics with those given by the radiologists, we did not observe a good correlation between the two. Three metrics, Entropy, EME, and EMEE, rank technique (k) BHE as the best, whereas it was at the last position in the radiologists' ranking. In fact, out of 50 images, each enhanced by 14 techniques, not even one enhanced by BHE was picked as the best by the radiologists. The other three metrics, SSIM, PSNR, and AMBE, kept (d) RMSHE at the first position, which stood at the 5th position in the radiologists' ranking. We discussed these results with the radiologists. According to their feedback, the ground-glass opacities, which are soft-tissue regions, are the regions of interest for COVID-19 diagnosis. We have marked these regions with white and red boxes in the original image shown in Fig. 1. The methods that enhance these regions are ranked high by the radiologists. Looking at the example of RMSHE discussed above, it enhances the details in some parts at the expense of other regions. In the case of COVID-19 images, the regions suppressed by RMSHE happen to be the ones of interest to the radiologists. Therefore, neither this technique nor the three metrics that prefer it are a good choice for COVID-19 CT images.

Table note: Bold values indicate the ranks attained by each technique from quantitative and subjective evaluation scores.

The existing metrics are mostly designed for natural images, and they consider several factors which are not important for medical images.
For example, entropy-based features rank those images high which have more details, but in many cases, noise is the cause of high entropy. Here we evaluate how well the six metrics used in our study predict the quality of the enhanced images as desired by the radiologists. For this, in Table 4, we show the correlation between the rankings given to the enhancement techniques by each metric and the rankings given by the radiologists, using the Spearman rank-order correlation coefficient (SRCC). SRCC is a nonparametric variation of the Pearson correlation coefficient that calculates the level of relationship between two variables based on their ranks. The SRCC values of the metrics listed in Table 4 are very low, showing a poor correlation. However, we found a better correlation when we used the combined evaluation of the six metrics for ranking, as shown in the last column of the table.

In Table 5, we show the five techniques ranked highest by the radiologists, with the radiologists' rankings (indicated by R) along with the rankings given to them collectively by the six metrics (indicated by M). For techniques (i), (g), (n) and (d), we see a good correlation between the two, but for the rest it is poor. The overall SRCC value remained 0.815. Next, we reverse the order of picking the top 5. We pick the best five determined by the combined scores of the metrics and check the ranks given to them by the radiologists. These are shown in Table 6. In this case, the correlation is high, with an SRCC of 0.855. From Table 6, we can conclude that the combined scores of the six metrics remain high for well-enhanced images. However, the opposite cannot be guaranteed, and some good images get poor scores, as shown in Table 5.

The top 3 techniques ranked by the radiologists as well as by the combined evaluation of the metrics are (i) R-ESIHE, (g) AGCWD, and (n) NLQ.
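For readers who want to reproduce this kind of comparison, the Spearman rank-order correlation used above can be computed as the Pearson correlation of the rank vectors. The sketch below is a minimal NumPy version with average ranks for ties; the function names are hypothetical and the rank lists in the usage example are made-up placeholders, not values from Tables 3 to 6.

```python
import numpy as np


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their positions."""
    values = np.asarray(values, dtype=np.float64)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=np.float64)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman_rcc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    rx = rx - rx.mean()
    ry = ry - ry.mean()
    return float((rx * ry).sum() / np.sqrt((rx ** 2).sum() * (ry ** 2).sum()))


if __name__ == "__main__":
    # Placeholder rankings of five hypothetical techniques (not data from the study).
    radiologist_ranks = [1, 2, 3, 4, 5]
    metric_ranks = [2, 1, 3, 4, 5]
    print(spearman_rcc(radiologist_ranks, metric_ranks))
```

With untied ranks this is equivalent to the familiar expression 1 - 6 Σ d² / (n (n² - 1)), where d is the rank difference for each technique and n is the number of techniques compared.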
We can observe good, uniform enhancement in each region of the resultant images generated by these techniques, including the regions of interest marked by the red and white rectangles in the original image in Fig. 1. Technique (b), which was among the top 3 picks of the radiologists but was not picked by the metrics, also shows very good enhancement.

The worst performing techniques, as per the radiologists, were (k) BHE, (a) BBHE, and (j) RS-ESIHE. If we look at the sample images produced by these techniques in Fig. 1, we can note that (k) BHE shows less enhancement in the regions of interest marked by the radiologists but enhances the noise in other parts. This is the reason it was picked as the best by the metrics that rely on the enhancement of details and contrast, namely entropy, EME, and EMEE. This shows that higher values of detail and contrast metrics in medical imaging may be due to a higher presence of noise, and therefore they should be used with caution. Looking at the sample image of (a) BBHE in Fig. 1, we can see that the enhancement in the region of ground-glass opacities is less than in the other images, and the region surrounding the soft tissues is also suppressed. The technique (j) RS-ESIHE does not seem to lose details; however, the enhancement in all regions is mediocre.

A comparative analysis of the effect of different image enhancement techniques on COVID-19 CT scans was presented. Fourteen enhancement techniques, six quantitative metrics, and 50 COVID-19 CT scans were used in this study. The results were thoroughly evaluated by two experienced radiologists. A number of observations were made in this study. First, the evaluation metrics did not rank the enhanced images very well; however, the combined scores of the six metrics showed a better correlation with the true quality of the images as determined by the radiologists.
Second, some images which were ranked high by the radiologists got poor evaluation by the combined scores of the six metrics, and vice versa. This highlights the shortcomings of the existing metrics and the need for a new metric specifically designed for medical images. Finally, certain regions in COVID-19 CT scans are important for diagnosis. Some image enhancement techniques achieve good overall contrast but suppress these regions. Overall, R-ESIHE and AGCWD were evaluated as the best and the second-best performing techniques, while DSIHE and NLQ shared the third position. These observations and recommendations can help researchers and medical equipment manufacturers develop more suitable algorithms and hardware for accurate and efficient diagnosis of COVID-19.

[1] Post-COVID-19 global health strategies: the need for an interdisciplinary approach
[2] A global survey of potential acceptance of a COVID-19 vaccine
[3] WHO Coronavirus (COVID-19) Dashboard. Available: https://covid19.who.int
[4] Artificial intelligence for the detection of COVID-19 pneumonia on chest CT using multinational datasets
[5] Coviddeep: Sars-cov-2/covid-19 test based on wearable medical sensors and efficient neural networks
[6] Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks
[7] Deepcovid: Predicting covid-19 from chest x-ray images using deep transfer learning
[8] Exploring the effect of image enhancement techniques on COVID-19 detection using chest X-ray images
[9] A weighted histogram-based tone mapping algorithm for CT images
[10] Artifacts in CT: recognition and avoidance
[11] Performance of a single lookup table (LUT) for displaying chest CT images
[12] Chest CT window settings with multiscale adaptive histogram equalization: pilot study
[13] Magnetic resonance imaging versus computed tomography in the evaluation of soft tissue tumors of the extremities
[14] Medical image contrast enhancement using range limited weighted histogram equalization
[15] Digital image processing ed: Prentice hall Upper Saddle River
[16] Image Processing: Principles and Applications
[17] Histogram equalization of CT images
[18] Medical image processing: the characterization of display changes using histogram entropy
[19] A novel reformed histogram equalization based medical image contrast enhancement using krill herd optimization
[20] Fuzzy gray level difference histogram equalization for medical image enhancement
[21] Contrast enhancement using brightness preserving bihistogram equalization
[22] Image enhancement based on equal area dualistic sub-image histogram equalization method
[23] Minimum mean brightness error bihistogram equalization in contrast enhancement
[24] Contrast enhancement using recursive mean-separate histogram equalization for scalable brightness preservation
[25] Recursive sub-image histogram equalization applied to gray scale images
[26] Recursively separated and weighted histogram equalization for brightness preservation and contrast enhancement
[27] Efficient contrast enhancement using adaptive gamma correction with weighting distribution
[28] Image enhancement using exposure based sub image histogram equalization
[29] Enhancement of low exposure images via recursive histogram equalization algorithms
[30] Bilateral histogram equalization for X-ray image tone mapping
[31] A comparative study of histogram equalization based image enhancement techniques for brightness preservation and contrast enhancement
[32] A new histogram equalization method for digital image enhancement and brightness preservation
[33] The entropy of an image
[34] A novel optimal fuzzy system for color image enhancement using bacterial foraging
[35] COVID19-CT-dataset: an open-access chest CT image repository of 1000+ patients with confirmed COVID-19 diagnosis
[36] Introduction to the DICOM standard
[37] Image quality assessment: from error visibility to structural similarity
[38] Local gray level S-curve transformation-a generalized contrast enhancement technique for medical images
[39] A comprehensive survey on image contrast enhancement techniques in spatial domain
[40] Adaptive contrast enhancement methods with brightness preserving
[41] Transform-based image enhancement algorithms with performance measure
[42] Appropriate contrast enhancement measures for brain and breast cancer images
[43] A new measure of image enhancement
[44] Chest CT imaging signature of coronavirus disease 2019 infection: in pursuit of the scientific evidence
[45] Chest CT in COVID-19: what the radiologist needs to know
[46] Chest imaging appearance of COVID-19 infection
[47] Brain early infarct detection using gamma correction extreme-level eliminating with weighting distribution
[48] Genetic algorithm based adaptive histogram equalization (GAAHE) technique for medical image enhancement
[49] A novel improved method of RMSHE-based technique for mammography images enhancement
[50] Contrast enhancement brain infarction images using sigmoidal eliminating extreme level weight distributed histogram equalization
[51] Contrast enhancement using real coded genetic algorithm based modified histogram equalization for gray scale images
[52] Fuzzy contextual inference system for medical image enhancement
[53] Pipeline for advanced contrast enhancement (PACE) of chest x-ray in evaluating COVID-19 patients by combining bidimensional empirical mode decomposition and contrast limited adaptive histogram equalization (CLAHE)
[54] Particle swarm optimized texture based histogram equalization (PSOTHE) for MRI brain image enhancement
[55] A non-uniform quantization scheme for visualization of CT images

Publisher's Note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements: The authors extend their appreciation to the Deputyship for Research and Innovation, Ministry of Education in Saudi Arabia, for funding this research work through the project number MoE-IF-G-20-11.