key: cord-0765011-aidj1iv4
authors: Singh, Pritpal; Bose, Surya Sekhar
title: Ambiguous D-means fusion clustering algorithm based on ambiguous set theory: Special application in clustering of CT scan images of COVID-19
date: 2021-08-26
journal: Knowl Based Syst
DOI: 10.1016/j.knosys.2021.107432
sha: 02d458bda5b296eedd1712d0a985446b638a65e2
doc_id: 765011
cord_uid: aidj1iv4

Coronavirus Disease 2019 (COVID-19) is considered one of the most critical diseases of the 21st century. Early detection can aid in preventing person-to-person transmission of the disease. Recent scientific reports indicate that computed tomography (CT) images of COVID-19 patients exhibit acute infections and lung abnormalities. However, analyzing these CT scan images is very difficult because of noise and low resolution. Therefore, this study proposes a new early detection method to detect abnormalities in chest CT scan images of COVID-19 patients. With this motivation, a novel image clustering algorithm, called the ambiguous D-means fusion clustering algorithm (ADMFCA), is introduced. This algorithm is based on the newly proposed ambiguous set theory and associated concepts. The ambiguous set is used to characterize the ambiguity associated with the grayscale values of pixels as true, false, true-ambiguous and false-ambiguous. The proposed algorithm clusters the CT scan images based on the entropies of different grayscale values. Finally, an outcome image is obtained from the clustered images by an image fusion operation. The experiment is carried out on 40 different CT scan images of COVID-19 patients. The clustered images obtained by the proposed algorithm are compared with those of five well-known clustering methods. The comparative study based on statistical metrics shows that the proposed ADMFCA is more efficient than the five existing clustering methods.

In December 2019, the first Coronavirus Disease 2019 (COVID-19) outbreak was discovered in Wuhan, China [48]. This disease is caused by a novel virus, called Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) [30]. According to the World Health Organization (WHO) [43], the disease had spread around the world, with 200,840,180 confirmed cases and 4,265,903 deaths reported by 6 August 2021. The disease has seriously affected the health, economic and social systems of advanced and emerging countries [6]. Therefore, the WHO has declared this disease one of the deadliest pandemics of all time. However, the pandemic situation caused by this disease has yet to be taken seriously [15]. For this reason, several countries, including the United States, Brazil, India and Italy, have been seriously affected by this virus [7, 8].

As a result, several research groups are collaborating in the development of strategies, vaccines and new approaches to address the pandemic [17] .

COVID-19 is associated with serious respiratory symptoms that can cause death in severe cases [49].

This disease presents as pneumonia, and the infection is highly contagious from person to person [1]. Digital imaging techniques, such as X-ray [2] and computed tomography (CT) [3], have contributed significantly to the diagnosis of this disease. Computer scientists have been actively involved in developing machine learning methods for analyzing these images. These methods mainly use the convolutional neural network (CNN), which is a form of deep neural network. Some of the CNN-based methods devised by researchers are summarised in Table 1.

The methodologies [24, 36, 2, 35, 23, 4, 21, 41, 37] discussed in Table 1 focus primarily on classifying X-ray or CT scan images in terms of infection and non-infection. Moreover, such approaches cannot identify the infection in the lungs. However, these studies indicate that X-ray and CT scan images of COVID-19 can be useful in determining the severity of lung infection. But, there are some inherent drawbacks associated with X-ray and CT scan images of COVID-19, which are described below:

grayscale values from dissimilar grayscale values. In the case of an image K, clustering is a means of partitioning it into non-overlapping regions of clusters K 1 , K 2 , . . . , K n , such that:
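A standard way to state such a partition, given here as an assumption about the intended conditions rather than as the original equations, is that the regions jointly cover the image and are pairwise disjoint:

```latex
\[
  \bigcup_{i=1}^{n} K_i = K,
  \qquad
  K_i \cap K_j = \emptyset \quad \text{for all } i \neq j .
\]
```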

Based on the developments in clustering algorithms, they can be classified into two categories: (a) crisp clustering and (b) soft clustering [26]. Crisp and soft clusterings differ widely in their approach to assigning group members. In crisp clustering, the grayscale values of an image belong exclusively to a fixed group, i.e., their assignment is completely binary: each value either belongs to a cluster completely (true) or not at all (false). On the other hand, a soft clustering approach allows grayscale values to share a degree of membership in several groups. One of the most popular algorithms in the category of crisp clustering is the K-means clustering (KMC) algorithm [45, 18]. Another example in this category is multi-view information-theoretic co-clustering (MV-ITCC), which is based on the information-theoretic co-clustering approach [44].

According to recent advances in the development of clustering algorithms, soft clustering algorithms can be divided into fuzzy set [46] based clustering, intuitionistic fuzzy set (IFS) [5] based clustering and neutrosophic set [33] based clustering. One of the most frequently used methods, developed using fuzzy sets, is fuzzy C-means (FCM) [22, 25]. Subsequently, researchers introduced several variants of the FCM algorithm. For example, Chen et al. [10] propose the multiple-kernel FCM (MKFCM) algorithm using the composite kernel concept. Ji et al. [19] propose the weighted image patch-based FCM (WIPFCM) algorithm, which considers the spatial information of pixels during image clustering. Wang et al. [42] incorporate an information-theoretic concept into the FCM algorithm to improve its performance. To deal with the noise in grayscale images, Zhao et al. [47] propose a new version of the FCM algorithm, called the generalized fuzzy c-means clustering (GFCM) algorithm. Chaira [9] introduces a novel intuitionistic FCM (IFCM) clustering algorithm by adopting the concept of IFS. Verma et al. [38] propose an improved intuitionistic FCM (IIFCM) clustering algorithm by including local spatial information during the clustering. To overcome the drawback of the FCM algorithm, Deng et al. [13] propose a new transfer prototype-based fuzzy clustering method. Singh [28] introduces the neutrosophic-entropy based clustering algorithm (NEBCA) for performing clustering operations on magnetic resonance imaging (MRI) of Parkinson's disease.
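The difference between crisp and soft assignment can be sketched in a few lines. The following is a minimal illustration, not code from any of the cited methods: `crisp_assign` mimics the all-or-nothing assignment used by K-means, while `fuzzy_memberships` uses the standard FCM membership formula with an assumed fuzzifier m = 2. The function names, the center values and the grayscale value are hypothetical.

```python
def crisp_assign(value, centers):
    """Crisp (hard) assignment: index of the single nearest center."""
    return min(range(len(centers)), key=lambda k: abs(value - centers[k]))


def fuzzy_memberships(value, centers, m=2.0):
    """Soft assignment: FCM-style membership degree in every cluster (m > 1)."""
    dists = [abs(value - c) for c in centers]
    zeros = dists.count(0.0)
    if zeros:
        # The value coincides with a center: full membership there, none elsewhere.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    # u_k = 1 / sum_j (d_k / d_j)^(2 / (m - 1)); the degrees sum to 1.
    return [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]


centers = [40.0, 128.0, 220.0]  # hypothetical cluster centers (gray levels)
gray = 100.0                    # hypothetical grayscale value

print(crisp_assign(gray, centers))                              # one cluster index
print([round(u, 3) for u in fuzzy_memberships(gray, centers)])  # one degree per cluster
```

The crisp result is a single cluster index, whereas the soft result is a vector of degrees that sums to one, which is the distinction the fuzzy, IFS and neutrosophic variants build on.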

The above discussed clustering methods [22, 10, 19, 42, 47, 9, 38, 28] are based on the concepts of fuzzy set, IFS and neutrosophic set, and they are able to deal with inherent uncertainties of grayscale values, but they have certain limitations, such as:

a) The clustering methods based on fuzzy sets (i.e., FCM [22] and its variants [10, 19, 42, 47]) represent the uncertainties of grayscale values only by a true degree of membership. These methods define the true degree of membership of a grayscale value in [0, 1].

b) IFS based clustering methods (i.e., IFCM [9] and IIFCM [38]) model the uncertainties of grayscale values with respect to hesitancy. These methods define the hesitancy in terms of true and false degrees of membership.

c) The clustering method based on neutrosophic sets (i.e., NEBCA [28]) defines the uncertainties of grayscale values with three degrees of membership, called true, false and indeterministic [27, 29].

The above discussion indicates the need for a more robust clustering method that can deal with the drawbacks mentioned in (1)-(4) as well as the inherent uncertainties in the CT scan images of COVID-19 patients. Recently, Singh et al. [32] proposed a novel theory to deal with uncertainties, called ambiguous set theory. Its advantage over fuzzy set, IFS and neutrosophic set is that it models inherent ambiguities with four degrees of membership: true, false, true-ambiguous and false-ambiguous. Singh et al. [32] show the application of this theory in segmenting MRI of the human brain. However, the application of this theory has not been extended to the analysis of other types of medical images, such as X-rays and CT scan images. A recent study by Singh and Bose [30] shows that the clustering approach is useful for identifying infected regions in CT scan images of COVID-19 patients. With this motivation and considering the severity of COVID-19, this study extends the application of this theory to the development of a clustering algorithm that could be helpful in the analysis of CT scan images of COVID-19 patients. Hence, the main contributions of this study are five-fold:

1) First, we introduce the notion of ambiguous set theory with a mathematical representation.

2) Second, this study presents ambiguous membership functions (AMFs) to define the true, false, true-ambiguous and false-ambiguous memberships of an event using ambiguous set.

3) Third, to quantify the inherent ambiguities associated with the four degree of memberships [29] . This final image is referred to as final clustered image (FCI). The main reason for using image fusion operation in this algorithm is to integrate the best features into the resultant image [40] . A major problem with most crisp clustering algorithms (as discussed above) is that data points are often assigned to incorrect clusters by ignoring their correlation with some other subset of data points in the problem space [12] . Therefore, the main goal of ADMFCA is to cluster the grayscale values of CT scan images of COVID-19 patients in such a way that correlated features of the infected regions can be easily identified and formed clusters with highly correlated features.

5) Fifth, in support of ambiguous set theory, various definitions, set-theoretical operations, theorems and properties are discussed.

The proposed ADMFCA is validated with chest CT scan images of COVID-19 patients with the corresponding ground truth. The performance of the proposed ADMFCA is compared with five existing clustering algorithms, including KMC [45] , FCM [22] , GFCM [47] , IIFCM [38] and NEBCA [28] . The performance of the proposed ADMFCA and existing algorithms [45, 22, 47, 38, 28] is compared using statistical metrics, such as mean squared error (MSE), peak signal-to-noise ratio (PSNR), Dice similarity coefficient (DSC), Jaccard similarity coefficient (JSC) and correlation coefficient (CC). These statistical analyses show the effectiveness of the proposed ADMFCA over the selected clustering algorithms [45, 22, 47, 38, 28] .

The remainder of this article is organized as follows. Background for the study is presented in Section 2. Section 3 introduces the proposed ambiguous set theory. The proposed ADMFCA for image segmentation is presented in Section 4. Various properties of ambiguous-entropy distance function are discussed in Section 5. Experimental results are described in Section 6. Finally, conclusions and future directions are presented in Section 7.

This section presents an overview of the fuzzy set, intuitionistic fuzzy set and neutrosophic set.

The fuzzy set $F$ for any event $Q_i$ ($i = 1, 2, \ldots, n$) in the discrete and finite universe of discourse $S$ can be described as [46]:

For the continuous and infinite universe of discourse $S$, the fuzzy set $F$ for any event $Q_i$ ($i = 1, 2, \ldots, n$) can be defined as:

In Eqs. 3-4, $\mu_F(Q_i)$ represents the degree of membership for each $Q_i \in S$. In this theory, the degree of membership of each $Q_i$, i.e., $\mu_F(Q_i)$, always belongs to the range [0, 1]. In Eqs. 3-4, the horizontal bar represents a delimiter. The numerator of each term reflects the degree of membership of each $Q_i$ in the fuzzy set $F$. In Eq. 3, the summation symbol "+" represents the aggregation of each $Q_i$, called an aggregation operator. In Eq. 4, the integral sign indicates a continuous function-theoretic aggregation operator for continuous events [46].
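Given this description (a horizontal bar as delimiter, the membership degree in the numerator, and "+" or the integral sign as aggregation operator), Eqs. 3-4 presumably take the standard Zadeh form. The following is a reconstruction under that assumption, not a copy of the original equations:

```latex
\begin{align}
  F &= \frac{\mu_F(Q_1)}{Q_1} + \frac{\mu_F(Q_2)}{Q_2} + \cdots + \frac{\mu_F(Q_n)}{Q_n}
     = \sum_{i=1}^{n} \frac{\mu_F(Q_i)}{Q_i}, \tag{3} \\
  F &= \int_{S} \frac{\mu_F(Q_i)}{Q_i}. \tag{4}
\end{align}
```

Here neither the sum nor the integral denotes arithmetic addition; each term simply pairs an event with its degree of membership.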

Atanassov [5] proposed the concept of intuitionistic fuzzy set (IFS). It helps to represent the hesitancy involved in each Q i ∈ S with respect to two membership functions, called degree of membership and degree of non-membership.

For a fixed crisp set $C$, an IFS is denoted as $C_I(Q_i)$, and defined as:

Here, $\mu_F(Q_i) \in [0, 1]$ denotes the membership of $Q_i$, and $\theta_C(Q_i) \in [0, 1]$ denotes the non-membership of $Q_i$ in $C$.

In the IFS, the membership and non-membership of $Q_i$ must satisfy the following condition:

In terms of the fuzzy set, the non-membership of $Q_i$ is defined as $\theta_C(Q_i) = 1 - \mu_F(Q_i)$. However, this leads to a loss of information when $Q_i$ changes its state from one to another. Therefore, this loss of information is shown using IFS with respect to the membership and non-membership functions as:

Here, the resulting quantity indicates the degree of loss of $Q_i$. It can be defined only for the IFS, because there the degree of loss is nonzero.

However, for an ordinary fuzzy set, $\theta_C(Q_i) = 1 - \mu_F(Q_i)$, so the degree of loss is 0.
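The relationship between membership, non-membership and the degree of loss can be checked numerically. The sketch below assumes the standard Atanassov form, in which the degree of loss (hesitancy) is 1 minus the membership minus the non-membership, and the two degrees may not sum to more than 1. The function name `hesitancy` and the sample values are hypothetical.

```python
def hesitancy(mu, theta, tol=1e-12):
    """Degree of loss of an IFS element, assumed to be 1 - mu - theta."""
    if not (0.0 <= mu <= 1.0 and 0.0 <= theta <= 1.0):
        raise ValueError("mu and theta must lie in [0, 1]")
    if mu + theta > 1.0 + tol:
        raise ValueError("an IFS requires mu + theta <= 1")
    return 1.0 - mu - theta


# IFS element: non-membership is chosen independently, so some loss remains.
print(hesitancy(0.6, 0.3))

# Ordinary fuzzy set: theta = 1 - mu, so the loss vanishes (up to rounding).
print(hesitancy(0.6, 1.0 - 0.6))
```

The second call illustrates why the degree of loss carries no information for an ordinary fuzzy set: the non-membership is fully determined by the membership.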

For Q i , an IFS can also be represented in terms of loss as:

A neutrosophic set N can be expressed as a single-valued neutrosophic set (SVNS) [39] . The SVNS can be expressed for discrete and finite S as:

This section presents the philosophy of the ambiguous set, its various definitions followed by related properties.

According to the Oxford Dictionary, the word ambiguous is an adjective that means "open to more than one interpretation". Some information reflects different interpretations, which leads to ambiguity and incompleteness. As a result of this problem, decision-making becomes difficult in most of the cases.

Consider this proposition, denoted as event E: "Mr. X is lying". This statement is either true or false; however, in view of the proposition, the human cognitive process can have the following perceptions:

P1: E is completely true, i.e., True: E.

P2: E is completely false, i.e., False: E.

P3: E is a little true, i.e., True-ambiguous: E.

P4: E is a little false, i.e., False-ambiguous: E.

By integrating the four different perceptions of an event, a novel theory is proposed, called ambiguous set theory [32]. According to this theory, the four perceptions of the event E indicate the ambiguities of E in terms of degree-of-true, degree-of-false, degree-of-ambiguity-in-true and degree-of-ambiguity-in-false, respectively. In this way, this theory addresses the problems associated with ambiguous features of the information in the data.

The inherent uncertainty for any event x in the universe of discourse S can be defined using ambiguous set theory [32] , which represents the uncertainty in terms of four degree of membership functions, namely, true (T ), false (F ), true-ambiguous (T A) and false-ambiguous (F A). Mathematically, the ambiguous set can be defined as follows. 

An ambiguous set can be defined for the discrete case as follows.

Definition 3. (Discrete ambiguous set). An ambiguous set Å for the discrete and finite universe of discourse $S = \{x_1, x_2, \ldots, x_n\}$ can be represented as:

In Eq. 11, both symbols "+" and " " are termed as aggregation operators. For the continuous and infinite S, the ambiguous setÅ can be denoted as:

The AMFs can be defined as follows for the SVAS. The different grayscale intensities also pose the challenge of distinguishing one region from another.

Consequently, users cannot confidently use the linguistic terms "dark gray", "gray" and "light gray" to describe these three grayscale values. However, this difficulty can be resolved by using ambiguous sets, where the inherent imprecision or approximation of grayscale intensities is expressed by AMFs.

In this respect, three different ambiguous sets Å 1, Å 2 and Å 3 can be defined for the grayscale values at pixel positions P 15, P 17 and P 112 on the universe of discourse Z = [0, 255] using the AMFs (Eqs. 13-16), respectively, as:

1. linguistic description of all uncertainties associated with the effect of AMFs, and
2. the distribution of ambiguity in the two-dimensional plane.

Entropy can be used to measure the individual ambiguousness represented by the AMFs, namely, T , F , T A and F A. Such measurements of ambiguousness with respect to T , F , T A and F A are called true entropy (TE), false entropy (FE), true-ambiguous entropy (TAE) and false-ambiguous entropy (FAE), respectively. These four entropies can be defined as follows. 

which can be defined as follows:

be two ambiguous sets. Some operations on ambiguous sets are given below:

Definition 7. (Ambiguous vector and its complement). Let Θ = (Θ 1 , Θ 2 , . . . , Θ n ) be a vector, where

is an ambiguous set on the universe S. Then, Θ is called an ambiguous vector on S. We define the complement of Θ as

. Θ T denotes the transpose of Θ. If n = 1, we do not distinguish between the ambiguous vector Θ = (Θ 1 ) and the ambiguous set Θ 1 .

be ambiguous sets on the universe S; let Θ = (Θ 1 , Θ 2 , . . . , Θ n ), θ = (θ 1 , θ 2 , . . . , θ n ). We call Θ · θ = { x, y, z, u, v |x ∈ S} the inner product of Θ and θ, where ∨ and ∧ denote the max and min operations, 

). Note that Θ · θ is an ambiguous set; also, when n = 1, we have

Definition 9. (Outer product). For each j = 1, 2, . . . , n, let

Proof: According to Definitions 6, 8 and 9, we have:

Theorem 2. For each j = 1, 2, . . . , n, let

be ambiguous sets on the universe S; let Θ = (Θ 1 , Θ 2 , . . . , Θ n ), θ = (θ 1 , θ 2 , . . . , θ n ). Then, Θ·θ = θ·Θ,

J o u r n a l P r e -p r o o f

Proof: According to Definitions 8 and 9:

Definition 10. (Jaccard similarity measure). Let η = (η 1 , η 2 , . . . , η n ) and ς = (ς 1 , ς 2 , . . . , ς n ) be two vectors of length n, where all coordinates are positive. The Jaccard similarity measure of these two vectors is defined as:

where, η · ς = n i=1 η i ς i is the inner product of the vectors η and ς.

Definition 11. (Dice similarity measure). Let η = (η 1 , η 2 , . . . , η n ) and ς = (ς 1 , ς 2 , . . . , ς n ) be two vectors of length n, where all coordinates are positive. The Dice similarity measure of these two vectors is defined as: 

Property 1. The Jaccard, Dice and cosine similarity measures satisfy the following properties as: The above similarity measures motivate the following definition.
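For plain (unweighted) vectors, the three measures named above can be sketched as follows. The formulas are the commonly used vector forms built on the inner product, stated here as an assumption about Definitions 10-12: Jaccard divides the inner product by the sum of squared norms minus the inner product, Dice divides twice the inner product by the sum of squared norms, and cosine divides the inner product by the product of the norms. The function names and sample vectors are hypothetical.

```python
import math


def inner(eta, sigma):
    """Inner product of two equal-length vectors."""
    if len(eta) != len(sigma):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(eta, sigma))


def jaccard(eta, sigma):
    dot = inner(eta, sigma)
    return dot / (inner(eta, eta) + inner(sigma, sigma) - dot)


def dice(eta, sigma):
    return 2.0 * inner(eta, sigma) / (inner(eta, eta) + inner(sigma, sigma))


def cosine(eta, sigma):
    return inner(eta, sigma) / (math.sqrt(inner(eta, eta)) * math.sqrt(inner(sigma, sigma)))


eta = [1.0, 0.5]    # hypothetical vectors with positive coordinates
sigma = [0.5, 1.0]

for name, measure in (("Jaccard", jaccard), ("Dice", dice), ("cosine", cosine)):
    # Each measure is symmetric and equals 1 when the two vectors coincide.
    print(name, round(measure(eta, sigma), 4), round(measure(eta, eta), 4))
```

Under these forms each measure lies in [0, 1] for positive coordinates, is symmetric in its arguments, and equals 1 for identical vectors.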

Definition 13. LetÅ

be two ambiguous sets on the universe S. Let η = (η 1 , η 2 , . . . , η n ) and ς = (ς 1 , ς 2 , . . . , ς n ) be two vectors of length n, where η i , ς i ∈ S for i = 1, 2, . . . , n. Let w 1 , w 2 , . . . , w n be non-negative real numbers, called weights. The ambiguous weighted Jaccard similarity measure, ambiguous weighted Dice similarity measure and ambiguous weighted cosine similarity measure of these two ambiguous sets for the vectors η, ς are defined, respectively, as:

This section introduces the proposed ADMFCA for clustering grayscale images. The proposed ADMFCA is based on ambiguous set theory, entropies and image fusion operation. Each step of the proposed ADMFCA is explained next.


Step 1. Define the grayscale domain of image: The grayscale value G ij associated with each pixel P ij (i = 1, 2, . . . , m)(j = 1, 2, . . . , n) of an input gray image I GI can be expressed in a grayscale domain as:

Here, m × n represents the total number of grayscale values in the I GI . In Eq. 28, each grayscale value G ij ∈ P ij is defined in the range [0, G] with G = 255. Hence, the universe of discourse S for each G ij ∈ I GI is defined as S = [0, G].

Step 2. Define the ambiguous domain of image: The ambiguous domain for the grayscale image I GI is defined by representing the grayscale value G ij of each pixel in the ambiguous set. The ambiguous set of each G ij is denoted asÅ ij , and can be expressed in the following matrixÅ A as:

In Eq. 29, eachÅ ij is defined as:

In Eq. 30, the four AMFs, namely, T , F , T A and F A for G ij ∈ S can be defined as:

In Eq. 31, min and max represent the minimum and maximum functions, respectively. In Eqs. 33 and 34, the ambiguous distance function A F can be defined as:

Step 3. Measurements of ambiguousness for ambiguous set: The ambiguousness of the 

Step 4. Selection of clusters for the entropies: Choose D initial number of clusters at random

Step 5. Define the set of centers for each of the clusters: Define a set of random initialized centers for each of the clusters C T d , C F d , C T A d and C F A d as:

Here, 0 indicates the 1st epoch of the algorithm. From Eq. 40, it can be assumed that C T d cluster

Step 6. Set the epochs: For the individual clustering of E T (Å ij , G ij ), E F (Å ij , G ij ), E T A (Å ij , G ij ) and E F A (Å ij , G ij ), the epoch e from 0 to Epoch is set as e = 0, 1, . . . , Epoch, where Epoch denotes the maximum number of epochs.

Step 7. Computation of distances between entropies and centers: Each of the entropies

The determination of the nearest center vectors W i (0), X i (0), Y i (0) and Z i (0) is done by employing ambiguous-entropy distance function. The proposed function computes the distance between E T (Å ij , G ij ) and W i (0) as:

Similarly, the proposed metric computes the distances between E F (Å ij , G ij ) and X i (0), E T A (Å ij , G ij ) and Y i (0), and E F A (Å ij , G ij ) and Z i (0), defined in Eqs. 45-47, respectively, as:

In Eqs. 44-47, Dist[·] denotes the ambiguous-entropy distance metric. In Eq. 44, if W i (0) is the closest center to E T (Å ij , G ij ), then it is assigned to the cluster C T d . A similar explanation can be given for Eqs. 45-47.

Step 8. Selection criterion of clusters: The selection of each cluster by E T (Å ij , G ij ), E F (Å ij , G ij ), E T A (Å ij , G ij ) and E F A (Å ij , G ij ) depends on the minimum values of the ambiguous-entropy distances (Eqs. 44-47, respectively). For example, let W i (0) and W j (0) be the two randomly defined centers for the clusters C T i and C T j with respect to the clustering E T (Å ij , G ij ). Now, E T (Å ij , G ij ) ∈ W i (0) if it satisfies the following condition as: 

Step 9. Update the centers: After each epoch, the proposed algorithm updates their centers.

This process continues until it reaches the maximum epoch Epoch. During the individual clustering of E T (Å ij , G ij ), E F (Å ij , G ij ), E T A (Å ij , G ij ) and E F A (Å ij , G ij ), the respective new centers are denoted as W i (e + 1), X i (e + 1), Y i (e + 1) and Z i (e + 1), defined in Eqs. 49-52, respectively, as:

Step 11. Generate the clustered images: Individual clustering of E T (Å ij , G ij ), E F (Å ij , G ij ), E T A (Å ij , G ij ) and E F A (Å ij , G ij ) generates four different clustered images, called TE clustered image (TECI), FE clustered image (FECI), TAE clustered image (TAECI) and FAE clustered image (FAECI). TECI, FECI, TAECI and FAECI are denoted as T ECI , F ECI , T A ECI and F A ECI , respectively.

Step 12. Obtain the final clustered image: The final clustered image (FCI) is generated by applying the image fusion operation [29] on four clustered images, viz., T ECI , F ECI , T A ECI and F A ECI as:

Here, F CI denotes the FCI.

The pseudocode of the proposed ADMFCA is summarized in Algorithm 1.
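The center-based loop of Steps 4-9 can be sketched for a single entropy channel as follows. This is a simplified illustration under stated assumptions, not the proposed algorithm itself: the absolute difference stands in for the ambiguous-entropy distance Dist[·] (Eq. 44), the mean of the assigned values stands in for the center update (Eqs. 49-52), and the names `nearest`, `cluster_entropies` and the `init` parameter are hypothetical. The same routine would be run separately on the TE, FE, TAE and FAE values before fusion.

```python
import random


def nearest(value, centers):
    """Index of the center at minimum distance (assumed: absolute difference)."""
    return min(range(len(centers)), key=lambda k: abs(value - centers[k]))


def cluster_entropies(values, d, epochs=100, init=None, seed=0):
    """Group 1-D entropy values into d clusters; return (labels, centers)."""
    if init is None:
        # Steps 4-5: choose d distinct entropy values at random as initial centers.
        init = random.Random(seed).sample(sorted(set(values)), d)
    centers = list(init)
    for _ in range(epochs):  # Step 6: e = 0, 1, ..., Epoch
        # Steps 7-8: assign every entropy to its nearest center.
        labels = [nearest(v, centers) for v in values]
        # Step 9: update each center (assumed rule: mean of its members).
        updated = []
        for k in range(d):
            members = [v for v, lab in zip(values, labels) if lab == k]
            updated.append(sum(members) / len(members) if members else centers[k])
        if updated == centers:  # centers are stable, so stop early
            break
        centers = updated
    return [nearest(v, centers) for v in values], centers


entropies = [0.05, 0.07, 0.06, 0.50, 0.52, 0.95, 0.97]  # hypothetical values
labels, centers = cluster_entropies(entropies, d=3, init=[0.05, 0.50, 0.95])
print(labels)
print([round(c, 3) for c in centers])
```

Passing `init` makes the run reproducible; with random initial centers, as in Step 5, the result can depend on the draw, which is one reason a fixed maximum number of epochs is used.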

This section presents various properties of ambiguous-entropy distance function. This function is used in the proposed ADMFCA to compute the distance between entropy and center of the clusters.

Consider the following generalized form of ambiguous-entropy distance function that computes the distance between E T (Å ij , G ij ) and W i as:

Here, ambiguous-entropy distance function Dist[E T (Å ij , G ij ), W i ] is used to compute the distance between E T (Å ij , G ij ) and W i . For ease of explanation of various properties of this function, we only consider the vectors E T (Å ij , G ij ) and W i . However, these properties are also valid in the case of the computation of the distances between E F (Å ij , G ij ) and X i , E T A (Å ij , G ij ) and Y i , and E F A (Å ij , G ij ) and Z i . In the following, we have discussed various properties of ambiguous-entropy distance function in terms of Eq. 54.

Proof: Assume two centers W i and W j , where W i , W j ∈ W and W i > W j . From Eq. 54, it

It indicates that as W i increases,

patients [20] . This dataset contains CT scan images of COVID-19 patients with 20 different labels.

In this study, CT scan images with 10 different labels are selected for the experiment. For each label, four different CT scan images are selected. Thus, a total of 10 × 4 = 40 CT scan images are available with their respective ground truths. These 40 images are split into four different groups, called Group #1, Group #2, Group #3 and Group #4. However, the extracted CT scan images have noise and poor resolution issues. Therefore, these images are preprocessed before carrying out the experiment. The adaptive filtering technique [14] and the histogram equalization method [16] are used for noise removal and resolution improvement, respectively. These preprocessed images are then used for the experiment. Detailed information on the experimental datasets is available in Table 2.

The grayscale values G ij of the CT scan images (Table 2) are represented in terms of the four degrees of membership in the ambiguous set as:

• G ij of all white pixels are defined by T (G ij ),

• G ij of all non-white pixels are defined by F (G ij ),

• G ij of all white pixels with certain non-white pixels are defined by T A(G ij ), and

• G ij of all non-white pixels with certain white pixels are represented by F A(G ij ).

The proposed algorithm is simulated by selecting three different cluster numbers as D = 2, 3, 4.

The main objective of the simulation with different cluster numbers is to determine which cluster number is best to generate the optimal FCIs. The best cluster number for the proposed ADMFCA is determined by evaluating the quality of the FCIs using statistical metrics, such as MSE, PSNR, DSC, JSC and CC (Eqs. 55-59, respectively). The FCIs are obtained by setting the maximum number of epochs to Epoch = 100.
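The five metrics can be sketched with their textbook definitions. These are assumed to correspond to Eqs. 55-59 but are not copied from them: MSE and PSNR are computed on gray levels with an assumed peak value of 255, DSC and JSC on binary masks (nonzero pixels count as foreground), and CC as the Pearson correlation. Images are given as flat lists of pixel values; all function names and sample data are hypothetical.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


def dice_coefficient(a, b):
    """DSC = 2|A and B| / (|A| + |B|) on foreground (nonzero) pixels."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    return 2.0 * inter / total if total else 1.0


def jaccard_coefficient(a, b):
    """JSC = |A and B| / |A or B| on foreground (nonzero) pixels."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 1.0


def correlation(a, b):
    """Pearson correlation coefficient between two pixel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / math.sqrt(var_a * var_b)


ground_truth = [0, 0, 255, 255, 255, 0]  # hypothetical flattened images
clustered = [0, 255, 255, 255, 0, 0]

for metric in (mse, psnr, dice_coefficient, jaccard_coefficient, correlation):
    print(metric.__name__, round(metric(ground_truth, clustered), 4))
```

Lower MSE and higher PSNR, DSC, JSC and CC indicate closer agreement with the ground truth. With these definitions, DSC and JSC computed on the same pair of masks are related by JSC = DSC / (2 - DSC).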

Consider a CT scan image #94 (Group #1, Label: 2) shown in Fig. 2(a) . The respective ground truth of this image is shown in Fig. 2(b) . In Fig. 2(c) , the preprocessed image of Fig. 2(a) obtained by the adaptive filtering technique [14] followed by the histogram equalization method [16] is shown.
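The contrast-enhancement half of this preprocessing can be illustrated with plain global histogram equalization. The sketch below is a generic stand-in, not the specific method of [16], and the adaptive filtering step of [14] is not reproduced. The function name `equalize_histogram` and the sample image are hypothetical; the image is a 2-D list of integer gray levels in [0, 255].

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization of a 2-D list of integer gray levels."""
    flat = [g for row in image for g in row]
    # Histogram of gray levels.
    hist = [0] * levels
    for g in flat:
        hist[g] += 1
    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest level present
    total = len(flat)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    # Look-up table: the darkest level maps to 0, the brightest to levels - 1.
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[g] for g in row] for row in image]


low_contrast = [[100, 100, 105],
                [105, 110, 120]]  # hypothetical low-contrast patch
print(equalize_histogram(low_contrast))
```

The narrow range of input levels is spread over the full [0, 255] range while the ordering of gray levels is preserved, which is the effect the preprocessing step relies on.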

Then, the proposed ADMFCA is applied to the preprocessed image (Fig. 2(c)). The preprocessed CT scan images listed in Table 2 are grouped using the proposed ADMFCA, and FCIs are generated. From the experiment, it is observed that the proposed ADMFCA generates the optimal FCIs from preprocessed images with D = 3. From now on, the rest of our experimental results and discussion on the proposed ADMFCA are based on D = 3.

A visual analysis is conducted to assess the quality of FCIs obtained from preprocessed CT scan images of COVID-19 patients. To demonstrate visual analysis, CT scan images #142 (Label: [45], FCM [22], GFCM [47], IIFCM [38] and NEBCA [28] are shown column-wise in Figs [45, 22, 47, 38, 28] and proposed ADMFCA are presented in Tables 3-7, respectively. The average values of MSE, PSNR, DSC, JSC and CC are obtained and shown in the last row of each table. A discussion is carried out on these statistical values next.

• Table 3 The proposed ADMFCA, on the other hand, has an average MSE value of 1.59, which is significantly lower than the existing clustering approaches, such as KMC, FCM, GFCM, IIFCM and NEBCA. This low MSE value for the proposed ADMFCA indicates that it can produce high quality FCIs with minimal intensity loss.

• Table 5 shows the statistics of DSC values for the existing methods and the proposed ADMFCA. The proposed ADMFCA yields an average DSC value of 0.92, which is significantly higher than those of the existing clustering methods. This high DSC value indicates that the FCIs produced by the proposed ADMFCA closely match their respective ground truths.

• The JSC values for the existing competing methods and the proposed ADMFCA are shown in Table 6 . The proposed ADMFCA obtains an average JSC value of 0.96 for the FCIs, which is significantly higher than the existing clustering methods. For the proposed ADMFCA, the high JSC value indicates that the regions of interest of the FCIs are almost identical to their respective ground truths.

• Table 7 Consequently, the proposed ADMFCA is highly effective at forming clusters of pixels associated with infected regions.

In this study, ambiguous set theory, which was recently proposed to address the inherent uncertainties of events, was discussed. The ambiguous set theory can be considered an extension of three existing theories, viz., fuzzy set, intuitionistic fuzzy set and neutrosophic set. The main strength of this theory is its ability to represent the ambiguity of an uncertain event with four distinct degrees of membership, called true, false, true-ambiguous and false-ambiguous. To support this theory, various definitions, formulas and properties were discussed in this study. The main contributions of this study are summarized as:

• It can be concluded that the proposed ADMFCA was effective in clustering CT scan images of COVID-19 patients. Therefore, the proposed ADMFCA can be considered a promising new diagnostic method for health professionals. The main limitation of the study was that the proposed ADMFCA was validated only on chest CT scan images of COVID-19 patients. In the future, the proposed ADMFCA can be verified and validated with other forms of digital images, such as X-rays, MRIs [29], remotely sensed high-resolution satellite images [31], and so on. Additionally, the proposed ADMFCA can be used to cluster a variety of numerical data, including meteorological data, financial data, stock market data, and so on.

Figure caption (partially recovered): FECI, (f) TAECI, (g) FAECI, (h) FCI, (i) histogram of (c), and (j) histogram of (h).


[1] Recognition of COVID-19 disease from X-ray images by hybrid model consisting of 2D curvelet transform, chaotic salp swarm algorithm and deep learning technique

[2] Covid-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks

[3] Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: Results of 10 convolutional neural networks

[4] Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: Results of 10 convolutional neural networks

[5] Intuitionistic fuzzy sets

[6] Modeling and forecasting of epidemic spreading: The case of Covid-19 and beyond

[7] Forecasting of COVID-19 time series for countries in the world based on a hybrid approach combining the fractal dimension and fuzzy logic

[8] A novel method for a COVID-19 classification of countries based on an intelligent fuzzy fractal approach

[9] A novel intuitionistic fuzzy c means clustering algorithm and its application to medical images

[10] A multiple-kernel fuzzy c-means algorithm for image segmentation

[11] Enhanced soft subspace clustering integrating within-cluster and between-cluster information

[12] A survey on soft subspace clustering

[13] Transfer prototype-based fuzzy clustering

[14] Adaptive filters in Matlab: from novice to expert

[15] Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe

[16] A novel 3-D color histogram equalization method with uniform 1-D gray scale histogram

[17] Coronavirus disease 2019 (COVID-19): A literature review

[18] A hybrid fuzzy clustering approach for the recognition and visualization of MRI images of Parkinson's disease

[19] Fuzzy c-means clustering with weighted image patch for image segmentation

[20] COVID-19 CT Lung and Infection Segmentation Dataset

[21] Diagnosis of coronavirus disease 2019 (COVID-19) with structured latent multi-view representation learning

[22] A modified fuzzy c-means image segmentation algorithm for use with uneven illumination patterns

[23] A novel medical diagnosis model for COVID-19 infection detection based on deep features and bayesian optimization

[24] Automated detection of COVID-19 cases using deep neural networks with X-ray images

[25] Knowledge-leveraged transfer fuzzy C-Means for texture image segmentation with self-adaptive cluster prototype matching. Knowledge-Based Systems

[26] Evolving fuzzy clustering approach: An epoch clustering that enables heuristic postpruning

[27] A neutrosophic-entropy based adaptive thresholding segmentation algorithm: A special application in MR images of Parkinson's disease

[28] A neutrosophic-entropy based clustering algorithm (NEBCA) with HSV color system: A special application in segmentation of Parkinson's disease (PD) MR images

[29] A type-2 neutrosophic-entropy-fusion based multiple thresholding method for the brain tumor tissue structures segmentation

[30] A quantum-clustering optimization method for COVID-19 CT scan image segmentation

[31] Uncertainty representation using fuzzy-entropy approach: Special application in remotely sensed high-resolution satellite images (RSHRSIs)

[32] A novel ambiguous set theory to represent uncertainty and its application to brain MR image segmentation

[33] Neutrosophy, a new Branch of Philosophy

[34] Managing information measures for hesitant fuzzy linguistic term sets and their applications in designing clustering algorithms

[35] Convolutional capsnet: A novel artificial neural network approach to detect COVID-19 disease from X-ray images using capsule networks

[36] Deep learning COVID-19 detection bias: accuracy through artificial intelligence

[37] A new approach for classifying coronavirus COVID-19 based on its manifestation on chest X-rays using texture features and neural networks. Information Sciences

[38] An improved intuitionistic fuzzy c-means clustering algorithm incorporating local information for brain image segmentation

[39] Single valued neutrosophic sets

[40] Bayesian image segmentation fusion. Knowledge-Based Systems

[41] A weakly-supervised framework for COVID-19 classification and lesion localization from chest CT

[42] An adaptive spatial information-theoretic fuzzy clustering algorithm for image segmentation

[43] COVID-19 Weekly Epidemiological Update

[44] Multi-view information-theoretic co-clustering for co-occurrence data

[45] An improved K-means clustering algorithm for fish image segmentation

[46] Fuzzy sets. Information and Control

[47] Kernel generalized fuzzy c-means clustering with spatial information for image segmentation

[48] COVID-19 and the cardiovascular system

[49] A pneumonia outbreak associated with a new coronavirus of probable bat origin

Pritpal Singh (corresponding author) and Surya Sekhar Bose. The authors whose names are listed in the manuscript certify that they have no affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers' bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.

Ministry of Science & Technology