key: cord-0839111-11vof8yv authors: Majumder, Sandip; Kar, Samarjit; Samanta, Eshan title: A fuzzy rough hybrid decision making technique for identifying the infected population of COVID-19 date: 2020-11-23 journal: Soft comput DOI: 10.1007/s00500-020-05451-0 sha: a99833764c97268b033154f43641c9ed237f3f01 doc_id: 839111 cord_uid: 11vof8yv

Decision-theoretic rough set models have been used for many years in a wide range of application areas. They provide a way to acquire knowledge, especially when dealing with vagueness and uncertainty. Many mathematical models have recently been presented to describe and control the COVID-19 pandemic. Decision-based treatment recommendation, however, has not yet appeared in any of these articles. In this paper, we propose a novel three-way decision approach based on the linguistic information of a COVID-19 susceptible person. To this end, we discuss a probabilistic rough fuzzy hybrid model with linguistic information. The model helps to identify likely infected persons and to decide whom to send for self-isolation, home quarantine or medical treatment in an emergency situation. The significance of the proposed hybrid model is discussed through a comparative study, with justifications.

COVID-19 is a highly infectious disease. The first case was identified on December 31, 2019, in the city of Wuhan, the capital of Hubei province, China. The name 'Coronavirus' comes from the Latin word 'corona', which means a crown, circle of light or nimbus. The virus has symptoms comparable to those of influenza and pneumonia. It was first detected in mainland China and then spread across the world, infecting around 3,292,489 people and taking almost 233,144 lives as of April 30, 2020. Figure 1 shows the worldwide confirmed cases up to April 30, 2020. The World Health Organization (WHO) has declared it a pandemic.
It is very difficult to track the presence of this deadly virus because its symptoms are similar to those of flu, cough and cold. Symptoms appear 7 to 14 days after the virus enters the human body. In the absence of a vaccine, social distancing is the most widely adopted strategy for its mitigation and control (Ferguson et al. 2020). Public health concern worldwide centres on how many people are infected or suspected to be infected. If a healthy person comes into close contact with an infected person, or with his/her belongings, the virus can enter his/her body. Only proper testing allows infected people to know that they are infected; this helps them receive the care they need and take measures to reduce the probability of infecting others. People who do not know they are infected might not stay at home and thereby risk infecting others. Testing is also crucial for an appropriate response to the pandemic. It allows us to understand the spread of the disease and to take evidence-based measures to slow it down, specifically in India.

Recently, the study of COVID-19 transmission has gained the attention of researchers and practitioners. Ahmadi et al. (2020) studied the COVID-19 outbreak by considering geographical and climatological parameters. Zhu and Chen (2020) presented a statistical disease model to analyze the early outbreak in China. Boldog et al. (2020) proposed an integrated model for assessing the risk of COVID-19 outbreaks in countries outside China. Yan et al. (2020) developed a predictive model for the early identification of high-risk patients before their health status changes from mild to critical. Numerous research articles on the spread trend of COVID-19 have been published (Ahmadi et al. 2020; Zhu and Chen 2020; Boldog et al. 2020; Yang and Wang 2020; Yan et al. 2020).
Unfortunately, the capacity for COVID-19 testing is still low in many countries around the world, India in particular. For this reason, we do not have a good understanding of the spread of the pandemic. It is therefore essential to develop a decision-making tool to identify persons suspected of having COVID-19. Khatua et al. (2020b) proposed a fuzzy optimal control model for COVID-19 using granular differentiability, as discussed in Mazandarani et al. (2017). They formulated it on an SEIAHRD model, in which hospitalized, symptomatic and asymptomatic patients are considered separately. Khatua et al. (2020a) also contributed an optimal control model to examine the pandemic parameters. Khatua et al. (2020c) reported on an SIR-network model for COVID-19 with respect to its impact on the state of West Bengal in India.

In this paper, we develop a three-way decision model for COVID-19 suspected people based on the linguistic information of their attribute values. The model helps to send a suspected infected person for self-isolation, home quarantine or treatment, so that the rate of contamination can be reduced. COVID-19 suspected people might not be able to give an exact quantitative description; they express their opinions with linguistic terms such as good, very good, and not so good. In the decision-theoretic rough set (Yao 2010), the loss function is essential for determining the threshold parameters α and β. Most decision-making problems calculate the thresholds through the Bayesian decision-making process with the help of the loss function. Our main focus is on COVID-19 infected persons, so a medical expert can form the loss function from their expertise, or may choose the threshold values directly. The paper is organized as follows: In Sect.
2, we briefly discuss the classical rough set, the probabilistic rough set, three-way decisions and the decision-theoretic rough set model. Section 3 discusses linguistic variables and their basic operations. In Sect. 4, we present three-way decisions based on linguistic information, fuzzy probability and linguistic-valued information system-based probability. In Sect. 5, we construct an example of a group of COVID-19 suspected people and, based on their linguistic information, make a three-way decision; a comparative study is also reported. Finally, Sect. 6 presents the overall conclusions of this work.

In this section, we briefly review the classical rough set defined by Pawlak (1982), the probabilistic rough set, the three-way decision based on the probabilistic rough set, and the decision-theoretic rough set model based on the Bayesian decision-making process (Yao and Wong 1992; Yao 2003, 2010, 2011; Ziarko 1993; Wu and Xu 2016).

Consider an information system $S = (U, A, V, f)$, where $U$ is a non-empty finite set of objects called the universe, $A$ is a non-empty finite set of attributes, $V$ is the set of attribute values and $f : U \times A \to V$ is an information function. For any $E \subseteq A$, the indiscernibility relation $IND(E)$ on $U$ is defined as

$$IND(E) = \{(x, y) \in U \times U \mid f(x, a) = f(y, a) \ \text{for all } a \in E\}.$$

Clearly, it is an equivalence relation on $U$ and therefore induces a partition of $U$; $[x]$ denotes the equivalence class containing $x$. For any $X \subseteq U$, the lower and upper approximations are defined as

$$\underline{apr}(X) = \{x \in U \mid [x] \subseteq X\}, \qquad \overline{apr}(X) = \{x \in U \mid [x] \cap X \neq \emptyset\}.$$

With these approximations, $U$ can be divided into three disjoint regions, namely

$$POS(X) = \underline{apr}(X), \qquad BND(X) = \overline{apr}(X) - \underline{apr}(X), \qquad NEG(X) = U - \overline{apr}(X).$$

Hence, if $x \in POS(X)$, then $x$ surely belongs to the concept $X$. If $x \in NEG(X)$, then $x$ certainly does not belong to the target set $X$. If $x \in BND(X)$, then it may or may not belong to $X$. The upper approximation, lower approximation and boundary region defined by Pawlak (1982) are exact, in that they tolerate no classification error. The main drawback is that no definite decision can be made for the many objects that fall in the boundary region. With this in mind, the probabilistic rough set model was proposed.
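As an illustration only, the classical approximations reviewed above can be computed directly from a small information table. The Python sketch below is not taken from the paper: the table, the attribute names `fever` and `cough`, the target set, and the helper names `partition` and `approximations` are all hypothetical.

```python
from collections import defaultdict

def partition(table, attrs):
    """Equivalence classes of IND(E): objects that agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())

def approximations(table, attrs, target):
    """Return the (lower, upper) approximations of the target set."""
    lower, upper = set(), set()
    for block in partition(table, attrs):
        if block <= target:      # [x] is contained in X
            lower |= block
        if block & target:       # [x] intersects X
            upper |= block
    return lower, upper

# Hypothetical information table with two attributes.
table = {
    "x1": {"fever": "yes", "cough": "yes"},
    "x2": {"fever": "yes", "cough": "yes"},
    "x3": {"fever": "yes", "cough": "no"},
    "x4": {"fever": "no", "cough": "no"},
    "x5": {"fever": "yes", "cough": "no"},
}
X = {"x1", "x2", "x3"}           # hypothetical target concept

lower, upper = approximations(table, ["fever", "cough"], X)
POS = lower                      # surely in X
BND = upper - lower              # undecidable from the attributes alone
NEG = set(table) - upper         # surely not in X
print(sorted(POS), sorted(BND), sorted(NEG))
```

In this toy table, `x3` and `x5` share the same description but only `x3` is in the target set, so both end up in the boundary region; this is exactly the situation that the probabilistic model is meant to resolve.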
The main idea of the probabilistic rough set model is to expand the decision regions, i.e., to enlarge the positive and negative regions using two parameters $\alpha$ and $\beta$. Let $(U, E)$ be an approximation space; then $(U, E, P)$ is a probabilistic approximation space (Yao 2008), where $P$ is a probability measure defined on subsets of the universe $U$. For any $X \subseteq U$,

$$P(X \mid [x]) = \frac{|X \cap [x]|}{|[x]|},$$

where $|\cdot|$ denotes cardinality. For $0 \le \beta < \alpha \le 1$, the lower and upper approximations of $X$ are given by

$$\underline{apr}_{(\alpha,\beta)}(X) = \{x \in U \mid P(X \mid [x]) \ge \alpha\}, \qquad \overline{apr}_{(\alpha,\beta)}(X) = \{x \in U \mid P(X \mid [x]) > \beta\}.$$

These two approximations lead to three decision regions:

$$POS_{(\alpha,\beta)}(X) = \{x \in U \mid P(X \mid [x]) \ge \alpha\},$$
$$BND_{(\alpha,\beta)}(X) = \{x \in U \mid \beta < P(X \mid [x]) < \alpha\},$$
$$NEG_{(\alpha,\beta)}(X) = \{x \in U \mid P(X \mid [x]) \le \beta\}.$$

The conditional probability may be interpreted as the level of confidence that an object with the same description as $x$ belongs to $X$. For $\alpha = 1$ and $\beta = 0$, the decision-theoretic rough set model coincides with the classical rough set model. A probabilistic two-way decision model is obtained with $\alpha = \beta$. A major difficulty is the interpretation and determination of the thresholds $(\alpha, \beta)$. Their values are calculated by the Bayesian decision procedure, which we now briefly describe for a given object $x$.

Let $\Omega = \{s_1, s_2, \ldots, s_m\}$ be a finite set of $m$ possible states and $A = \{a_1, a_2, \ldots, a_n\}$ a finite set of $n$ possible actions. The losses for all action-state pairs can then be arranged in an $n \times m$ matrix (one row per action, one column per state). If the object $x$ is in state $s_j$, then $\lambda(a_i \mid s_j)$ is the loss incurred for taking action $a_i$, and $P(s_j \mid x)$ is the conditional probability of $x$ being in state $s_j$. If action $a_i$ is taken for the object $x$, the expected risk associated with $a_i$ is

$$R(a_i \mid x) = \sum_{j=1}^{m} \lambda(a_i \mid s_j)\, P(s_j \mid x).$$

In the decision-theoretic rough set model, the set of states is $\Omega = \{X, X^c\}$ and the set of actions is $A = \{a_P, a_B, a_N\}$, where $a_P$, $a_B$, $a_N$ are the three actions that classify an object into $POS(X)$, $BND(X)$, $NEG(X)$, respectively. The $3 \times 2$ matrix of all loss function values is shown in Table 1.
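The Bayesian decision procedure described above can be sketched numerically. In the Python sketch below, the loss values are hypothetical and were chosen only so that accepting or rejecting wrongly costs more than deferring. The closed-form thresholds in `thresholds` are the standard ones from the decision-theoretic rough set literature (Yao 2010) and are stated here as an assumption, not as the authors' implementation. The sketch computes the expected risk of each action and compares the minimum-risk action with the threshold rule.

```python
# Hypothetical losses lambda(action | state).
# States: "X" (object is in the concept) and "Xc" (object is not in it).
LOSS = {
    "P": {"X": 0.0, "Xc": 4.0},  # accept: put the object in POS(X)
    "B": {"X": 1.0, "Xc": 1.0},  # defer: put the object in BND(X)
    "N": {"X": 4.0, "Xc": 0.0},  # reject: put the object in NEG(X)
}

def expected_risk(action, p, loss=LOSS):
    """R(a | [x]) = lambda(a|X) * p + lambda(a|Xc) * (1 - p), with p = P(X | [x])."""
    return loss[action]["X"] * p + loss[action]["Xc"] * (1.0 - p)

def min_risk_action(p, loss=LOSS):
    """Bayesian rule: take the action with the smallest expected risk."""
    return min(("P", "B", "N"), key=lambda a: expected_risk(a, p, loss))

def thresholds(loss=LOSS):
    """Closed-form (alpha, beta) obtained by comparing R(P) with R(B) and R(N) with R(B)."""
    a_num = loss["P"]["Xc"] - loss["B"]["Xc"]
    alpha = a_num / (a_num + loss["B"]["X"] - loss["P"]["X"])
    b_num = loss["B"]["Xc"] - loss["N"]["Xc"]
    beta = b_num / (b_num + loss["N"]["X"] - loss["B"]["X"])
    return alpha, beta

def threshold_rule(p, alpha, beta):
    """Three-way rule: accept if p >= alpha, reject if p <= beta, defer otherwise."""
    if p >= alpha:
        return "P"
    if p <= beta:
        return "N"
    return "B"

alpha, beta = thresholds()
for p in (0.1, 0.5, 0.9):
    print(p, min_risk_action(p), threshold_rule(p, alpha, beta))
```

With these losses the two rules coincide away from the tie points `p = alpha` and `p = beta`, which is why a medical expert may equivalently specify either the losses or the thresholds.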
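$\lambda_{PP}$, $\lambda_{BP}$, $\lambda_{NP}$ denote the losses incurred for taking actions $a_P$, $a_B$, $a_N$, respectively, when the decision object belongs to $X$. $\lambda_{PN}$, $\lambda_{BN}$, $\lambda_{NN}$ denote the losses incurred for taking actions $a_P$, $a_B$, $a_N$, respectively, when the decision object belongs to $X^c$. The expected losses of the three actions, given the equivalence class $[x]$ of a decision object $x$, are as follows:

$$R(a_P \mid [x]) = \lambda_{PP} P(X \mid [x]) + \lambda_{PN} P(X^c \mid [x]),$$
$$R(a_B \mid [x]) = \lambda_{BP} P(X \mid [x]) + \lambda_{BN} P(X^c \mid [x]),$$
$$R(a_N \mid [x]) = \lambda_{NP} P(X \mid [x]) + \lambda_{NN} P(X^c \mid [x]).$$

According to the Bayesian decision procedure, the minimum-cost decision rules are as follows.

(P) If $R(a_P \mid [x]) \le R(a_B \mid [x])$ and $R(a_P \mid [x]) \le R(a_N \mid [x])$, decide $x \in POS(X)$;
(B) If $R(a_B \mid [x]) \le R(a_P \mid [x])$ and $R(a_B \mid [x]) \le R(a_N \mid [x])$, decide $x \in BND(X)$;
(N) If $R(a_N \mid [x]) \le R(a_P \mid [x])$ and $R(a_N \mid [x]) \le R(a_B \mid [x])$, decide $x \in NEG(X)$.

We consider the loss function inequalities $\lambda_{PP} \le \lambda_{BP} < \lambda_{NP}$ and $\lambda_{NN} \le \lambda_{BN} < \lambda_{PN}$ with

$$(\lambda_{PN} - \lambda_{BN})(\lambda_{NP} - \lambda_{BP}) > (\lambda_{BP} - \lambda_{PP})(\lambda_{BN} - \lambda_{NN}),$$

which ensures $\alpha > \beta$. We can formulate the decision rules based on this division of the universe as follows:

(P) If $P(X \mid [x]) \ge \alpha$, decide $x \in POS(X)$;
(B) If $\beta < P(X \mid [x]) < \alpha$, decide $x \in BND(X)$;
(N) If $P(X \mid [x]) \le \beta$, decide $x \in NEG(X)$,

where the thresholds $\alpha$ and $\beta$ are defined as:

$$\alpha = \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}, \qquad \beta = \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.$$

The parameters $\alpha$, $\beta$ define the regions and give the associated risk of classifying an object. Here, parameter $\alpha$ separates the (P) region from the (B) region. Similarly, parameter $\beta$ separates the (B) region from the (N) region. These minimum-risk decision rules help us to classify objects into the approximation regions.

Zadeh (1965) proposed the concept of the fuzzy set. Fuzzy set theory deals with unclear boundaries, represents vague concepts and works with linguistic variables. In this sense, fuzzy sets emerged as an alternative way to deal with uncertainty. Fuzzy set theory is an extension of classical set theory in which elements have a degree of membership, given by a membership function taking values in the interval $[0, 1]$. Let $X$ be the universe of discourse and $\mu_{\tilde{A}}(x)$ the membership function associated with the fuzzy set $\tilde{A}$; then $\mu_{\tilde{A}}(\cdot)$ maps every element of $X$ to the interval $[0, 1]$, i.e., $\mu_{\tilde{A}} : X \to [0, 1]$. Hence, a fuzzy set $\tilde{A}$ defined on $X$ can be written as $\tilde{A} = \{(x, \mu_{\tilde{A}}(x)) \mid x \in X\}$. For example, let $X = \{x_1, x_2, x_3, x_4, x_5\}$ be the reference set of students and $\tilde{A}$ the fuzzy set of "smart" students, where "smart" is a fuzzy term represented by the membership grades of the five students. Here, $\tilde{A}$ indicates that the smartness of $x_1$ is 0.4, that of $x_2$ is 0.5, and so on.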
Hence, the membership function provides a measure of the degree of similarity of an element to a fuzzy set. Clearly, the membership function is subjective, because it is specific to an individual assessor or a group of assessors. It is also assumed that for each $x \in X$ the assessor is able to assign a value $\mu_{\tilde{A}}(x)$. For a crisp set $A$, the membership function can be defined as follows:

$$\mu_A(x) = \begin{cases} 1, & x \in A, \\ 0, & x \notin A. \end{cases}$$

Hence, a crisp set has sharp boundaries, whereas a fuzzy set has vague boundaries.

Basic terminology:
1. α-cut: Given a fuzzy set $\tilde{A}$ defined on $X$ and any number $\alpha \in [0, 1]$, the α-cut is the crisp set $\tilde{A}_\alpha = \{x \mid \mu_{\tilde{A}}(x) \ge \alpha\}$ and the strong α-cut is the set $\tilde{A}_{\alpha^*} = \{x \mid \mu_{\tilde{A}}(x) > \alpha\}$.
2. Level set of $\tilde{A}$: The set of all levels $\alpha \in [0, 1]$ that represent distinct α-cuts of a given fuzzy set $\tilde{A}$ is called the level set of $\tilde{A}$, i.e., the set $\{\alpha \in [0, 1] \mid \mu_{\tilde{A}}(x) = \alpha \ \text{for some } x \in X\}$.
3. Support: The support of a fuzzy set $\tilde{A}$ is the crisp set $s(\tilde{A}) = \{x \mid \mu_{\tilde{A}}(x) > 0\}$.
4. Normal and subnormal fuzzy sets: The maximum value of the membership degree of a fuzzy set is called the height of the fuzzy set. A fuzzy set $\tilde{A}$ is normal if its height is 1 and subnormal if its height is less than 1. The core of a fuzzy set consists of those $x$ for which $\mu_{\tilde{A}}(x) = 1$.
5. Convex fuzzy set: A fuzzy set $\tilde{A}$ is convex if $\mu_{\tilde{A}}(\lambda x_1 + (1 - \lambda) x_2) \ge \min(\mu_{\tilde{A}}(x_1), \mu_{\tilde{A}}(x_2))$ for all $x_1, x_2 \in X$ and $\lambda \in [0, 1]$.
6. Cardinality: For a finite fuzzy set $\tilde{A}$, the cardinality is defined as $|\tilde{A}| = \sum_{x \in X} \mu_{\tilde{A}}(x)$, and $\|\tilde{A}\| = |\tilde{A}| / |X|$ is called the relative cardinality of $\tilde{A}$.

A variety of definitions exist for the measurement of fuzziness; see Dubois and Prade (1982), Klement and Schwyhla (1982), Sugeno (1985) and Zimmermann (2011). Two fuzzy sets $\tilde{A}$ and $\tilde{B}$ are equivalent if $\mu_{\tilde{A}}(x) = \mu_{\tilde{B}}(x)$ for all $x \in X$.

Fuzzy sets allow us to represent vague concepts in natural language. The representation depends on both the concept and the context in which it is used. Several fuzzy sets representing linguistic concepts such as low, medium, high, and so on are often employed to define the status of a variable. Such a variable is usually called a fuzzy variable.
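The basic terminology listed above can be sketched in Python for a finite fuzzy set stored as a dictionary of membership grades. This is an illustrative sketch only: the grades for `x1` and `x2` follow the "smart students" example, while the grades for `x3` to `x5` and all function names are hypothetical. The support uses the standard strict inequality (membership greater than zero).

```python
# Finite fuzzy set as {element: membership grade}.
# Only x1 = 0.4 and x2 = 0.5 come from the text; the other grades are assumed.
smart = {"x1": 0.4, "x2": 0.5, "x3": 0.9, "x4": 1.0, "x5": 0.0}

def alpha_cut(fs, alpha, strong=False):
    """Crisp set of elements with grade >= alpha (or > alpha for the strong cut)."""
    if strong:
        return {x for x, m in fs.items() if m > alpha}
    return {x for x, m in fs.items() if m >= alpha}

def level_set(fs):
    """Distinct membership grades, each of which yields a distinct alpha-cut."""
    return sorted(set(fs.values()))

def support(fs):
    """Elements with strictly positive membership."""
    return {x for x, m in fs.items() if m > 0}

def height(fs):
    """Largest membership grade; the set is normal when this equals 1."""
    return max(fs.values())

def core(fs):
    """Elements with full membership."""
    return {x for x, m in fs.items() if m == 1.0}

def cardinality(fs):
    """Sum of membership grades."""
    return sum(fs.values())

def relative_cardinality(fs):
    """Cardinality divided by the number of elements in the universe."""
    return cardinality(fs) / len(fs)

print(alpha_cut(smart, 0.5), support(smart), height(smart), cardinality(smart))
```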
The significance of fuzzy variables is that they facilitate gradual transitions between states. They consequently possess a natural capacity to express and deal with observation and measurement uncertainties.

Remark 2 In this paper, we classify COVID-19 suspected people who might be infected with coronavirus, so the loss function may be prepared by a medical expert (Pauker and Jerome 1980).

A linguistic variable is a variable whose values are words or sentences in a natural or artificial language. Its values are linguistic elements, such as words and phrases, which are derived using quantitative or qualitative reasoning, for example with probabilistic or fuzzy systems (Deng and Yao 2014; Xu 2005; Pawlak 1985; Zadeh 1965; Klir and Yaun 2006; Chakraborty 2011). Let $L = \{s_\alpha \mid \alpha = 0, 1, \ldots, r\}$ be a totally ordered discrete term set, where $r + 1$ is the granularity of the linguistic term set $L$. Since $L$ is totally ordered, the law of trichotomy holds on it, i.e., for any $s_\alpha, s_\beta \in L$ exactly one of $s_\alpha < s_\beta$, $s_\alpha = s_\beta$, $s_\alpha > s_\beta$ holds. There is also the linguistic term set with symmetric subscripts, $L = \{s_\alpha \mid \alpha = -r, \ldots, -1, 0, 1, \ldots, r\}$. Here $2r + 1$ is the granularity of $L$, $s_0$ represents an assessment of 'fair', and $s_{-r}$ and $s_r$ are the lower and upper limits. For example, $L$ = {$s_{-3}$ = very bad, $s_{-2}$ = bad, $s_{-1}$ = slightly bad, $s_0$ = fair, $s_1$ = slightly good, $s_2$ = good, $s_3$ = very good}.

To facilitate computation and use all the available information, the discrete term set $L$ is extended to the continuous term set $L^* = \{s_\lambda \mid s_{-r} \le s_\lambda \le s_r, \lambda \in [-r, r]\}$, where the terms $s_\lambda$ of $L^*$ coincide with the terms $s_\alpha$ of $L$ for $\lambda = \alpha$. In $L^*$, the index of a term denotes the degree of the term. To calculate probabilities with linguistic terms, we define a real-valued function on $L^*$ as follows: let $L^* = \{s_\lambda \mid s_{-r} \le s_\lambda \le s_r, \lambda \in [-r, r]\}$ be a continuous linguistic term set and $I : L^* \to [-r, r]$ a real-valued function with $I(s_\lambda) = \lambda$ for any $s_\lambda \in L^*$.
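A minimal Python sketch of the symmetric-subscript term set and the index function $I$ may make these definitions concrete. The labels are those of the seven-term example above; the class name `Term`, the constant names and the choice of a dataclass are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass

R = 3  # term indices range over [-R, R]
LABELS = {
    -3: "very bad", -2: "bad", -1: "slightly bad", 0: "fair",
    1: "slightly good", 2: "good", 3: "very good",
}

@dataclass(frozen=True, order=True)
class Term:
    """A linguistic term s_lambda of the continuous term set L*."""
    index: float

    def __post_init__(self):
        if not -R <= self.index <= R:
            raise ValueError("index must lie in [-R, R]")

    @property
    def is_original(self):
        """True for terms of the discrete set L, False for virtual terms."""
        return float(self.index).is_integer()

    @property
    def label(self):
        """Natural-language label; virtual terms have none."""
        return LABELS[int(self.index)] if self.is_original else None

def I(term):
    """Index function I(s_lambda) = lambda."""
    return term.index

good, virtual = Term(2), Term(1.5)
print(good.label, I(good), virtual.is_original, good > Term(-1))
```

Because `order=True` compares terms by their index, the total order and the law of trichotomy on the term set carry over directly to the `Term` objects.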
This function helps us to deal with decision-making problems under uncertainty. Note that if $s_\lambda \in L$, then $s_\lambda$ is an original term and $\lambda$ the original index; otherwise, $s_\lambda$ is a virtual term and $\lambda$ a virtual term index. The decision maker always uses the original linguistic terms to evaluate alternatives; virtual linguistic terms appear only in operations. Given a continuous term set $L^*$, for any $s_\lambda, s_\mu \in L^*$ and $\alpha, \alpha_1, \alpha_2 \in [0, 1]$, the following operational laws hold:

Our main focus in this paper is to identify COVID-19 infected persons based on linguistic terms for the evaluation values of all attributes. We therefore face two fundamental issues: (i) calculating the conditional probability of every suspected person with respect to the decision object; here, the decision object is the COVID-19 suspected person (Karni 2009); (ii) selecting the threshold parameters $\alpha$ and $\beta$, which are used in the lower and upper approximations, respectively (Greco et al. 2008; Pauker and Jerome 1980).

To resolve issue (i), we define the probability of a fuzzy event under the linguistic-valued attribute set. Let $\tilde{A} = \{(x, \mu_{\tilde{A}}(x)) \mid x \in R^n\}$ be a real-valued fuzzy set; the crisp probability of the fuzzy event is then defined through the probabilities of its α-cuts, where $P(A_\alpha) = \sum_{x \in A_\alpha} P(x)$. In an information system, the attribute values are given by linguistic variables. Consider a linguistic-valued information system as follows (Table 2). Now consider a fuzzy set $B$ with membership values $\mu_B(x_1) = 0.5$, $\mu_B(x_2) = 0.7$ and $\mu_B(x_3) = 0.8$, and a probability measure $P$ defined by $P(x_1) = 0.2$, $P(x_2) = 0.3$ and $P(x_3) = 0.5$. The probability of the fuzzy event $B$ then follows from the definition above.

To facilitate computation, we define a real-valued function $v$ on the discrete linguistic-valued information system, where $r$ is the total number of terms in $L$; an analogous function $v$ is defined for the linguistic term set with symmetric subscripts. Here, $v(s_\lambda)$ is a continuous mapping that transforms $L^*$ into $[0, 1]$. The following results are immediate.
Proposition 4.1 Let $L^* = \{s_\lambda \mid s_{-r} \le s_\lambda \le s_r, \lambda \in [-r, r]\}$ be a set of continuous linguistic terms and $v$ a transformation from $L^*$ to the real values in $[0, 1]$; then (2) for $-r \le \lambda_1 \le \lambda_2 \le r$ we have $s_{\lambda_2} \ge s_{\lambda_1}$, and $v$ is an increasing function over $L^*$. As the middle linguistic label $s_0$ represents an assessment of 'indifference', the transformation function $v(s_\lambda)$ can also be expressed in terms of $v(s_0)$.

For any linguistic-valued information system, let $B \in F(U)$ and $x \in U$; the conditional probability of $B$ with respect to $x$, denoted by $P(B \mid x)$, is defined over all attributes $j$ through a fuzzy logic operator $\theta : [0, 1] \times [0, 1] \to [0, 1]$ (Pawlak 1985; Zadeh 1965; Klir and Yaun 2006). A fuzzy logic operator may be defined in many ways; here we take $\theta(x, y) = \min(x, y)$. We illustrate this with the example continued from Table 2. Clearly, $P(B \mid x)$ satisfies the axioms of probability.

With the conditional probability of a fuzzy event under a linguistic description of the attributes, we can define lower and upper approximations. Let $B \in F(U)$, $0 \le \beta < \alpha \le 1$ and $x \in U$; then

$$\underline{apr}_{(\alpha,\beta)}(B) = \{x \in U \mid P(B \mid x) \ge \alpha\}, \qquad \overline{apr}_{(\alpha,\beta)}(B) = \{x \in U \mid P(B \mid x) > \beta\}.$$

These two approximations lead to the three-way decision regions (Hu 2014):

$$POS_{(\alpha,\beta)}(B) = \{x \in U \mid P(B \mid x) \ge \alpha\},$$
$$BND_{(\alpha,\beta)}(B) = \{x \in U \mid \beta < P(B \mid x) < \alpha\},$$
$$NEG_{(\alpha,\beta)}(B) = \{x \in U \mid P(B \mid x) \le \beta\}.$$

To resolve the second issue, i.e., the selection of the thresholds $\alpha$ and $\beta$, we use the function $I$ when the loss function is expressed in linguistic form. The loss function inequality then becomes:

$$I(\lambda_{PP}) \le I(\lambda_{BP}) < I(\lambda_{NP}), \qquad I(\lambda_{NN}) \le I(\lambda_{BN}) < I(\lambda_{PN}).$$

The parameters $\alpha$, $\beta$ define the regions and give the associated risk of classifying an object. Our main focus is to classify suspected people who might be infected with coronavirus, so medical experts can choose the values of $\alpha$ and $\beta$ on the basis of their experience (Pauker and Jerome 1980; Pawlak and Sowinski 1994; Yao and Azam 2014).

We illustrate the method with an example using Table 3 ("Appendix I"), which lists twenty-six people of different age groups with their linguistic-valued information on different attributes related to COVID-19.
Here, the attributes are chosen on the basis of the past history of the COVID-19 infected population, where the pandemic impact of the infection is already in the third stage. The AAO-HNS Infectious Disease and Patient Safety Quality Improvement Committee in the USA recently reported that, even in the absence of symptoms such as cough, fever or breathing problems, an impaired sense of smell and taste might be included as an additional identifier for COVID-19 infected patients who might require quarantine and treatment. In this example, we consider four age groups, seven conditional attributes and one decision attribute (here 'c' indicates COVID-19). We use a different membership value for COVID-19 in each age group, which indicates the tendency of infection in that age group. Here, the linguistic term indices are all non-negative and discrete, so we calculate the values with the help of Eqs. 7 and 10 and take decisions by considering the values of $1 - P_c$. The thresholds are taken as follows:

For group 'I' (less than 20): α = 0.8, β = 0.7.
For group 'II' (20 to 40): α = 0.7, β = 0.55.
For group 'III' (40 to 60): α = 0.4, β = 0.25.
For group 'IV' (> 60): α = 0.3, β = 0.2.

Acceptance, non-commitment and rejection are then determined separately for each age group. Here, the POSITIVE region indicates an immediate TEST FOR COVID-19, the BOUNDARY region indicates SELF-ISOLATION, and the NEGATIVE region indicates HOME QUARANTINE. Any person with a travel history from an infected area must go into self-isolation, and persons with any symptom must be tested, which is crucial for an appropriate response to the pandemic.

In the last few years, three-way decision-theoretic rough set models have been used in many areas of decision making, especially under uncertainty.
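The age-group thresholds and the mapping from regions to actions used in the example above can be sketched in a few lines of Python. The thresholds and actions are the ones stated in the text; the function name `three_way`, the dictionary names and the input scores are hypothetical, and the score passed in is assumed to be whatever conditional-probability value the decision rule is applied to.

```python
# (alpha, beta) per age group, as given in the example.
THRESHOLDS = {
    "I": (0.8, 0.7),     # less than 20
    "II": (0.7, 0.55),   # 20 to 40
    "III": (0.4, 0.25),  # 40 to 60
    "IV": (0.3, 0.2),    # greater than 60
}

# Region-to-action mapping, as given in the example.
ACTIONS = {
    "POS": "immediate test for COVID-19",
    "BND": "self-isolation",
    "NEG": "home quarantine",
}

def three_way(score, group):
    """Return the decision region for a score under the group's thresholds."""
    alpha, beta = THRESHOLDS[group]
    if score >= alpha:
        return "POS"
    if score <= beta:
        return "NEG"
    return "BND"

# Hypothetical scores: the same value leads to different actions by age group.
for group in ("I", "II", "III", "IV"):
    region = three_way(0.35, group)
    print(group, region, ACTIONS[region])
```

The lower thresholds for the older groups mean that the same score triggers testing for a person above 60 while a person below 20 is only sent to home quarantine, which reflects the age-dependent caution built into the example.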
There are two important issues in decision-theoretic rough set models: (i) the conditional probability and (ii) the threshold parameters, which are determined by loss functions. The determination of thresholds is generally approached as the optimization of some property or as a trade-off between multiple criteria. In Yao and Wong (1992), Yao et al. (1991) and Yao (2009), the authors presented the decision-theoretic rough set, which is divided into different types of models according to the combination of values for the conditional probability and the loss function with linguistic terms. Overall, threshold values are calculated by the Bayesian decision procedure, which makes a decision with minimum risk based on observed evidence. In this paper, COVID-19 suspected persons express their viewpoints on different attributes using linguistic terms. We have implemented a method to deal with linguistic information and obtain its conditional probability. The threshold parameters are obtained from the loss function given by medical experts, or chosen by the experts from their own experience depending on the situation. The threshold parameters for different age groups may vary from place to place depending on the contamination rate. In India, three in every four COVID-19 cases belong to the age group 21 to 60 years (as of April 2020). The Ministry of Health and Family Welfare (MOHFW), Govt. of India, has stated that, of these 75% of confirmed cases, the largest share, 42%, are between 21 and 40 years of age, while 33% are between 41 and 60 years. Furthermore, 9% of cases belong to the age group below 20 years, whereas 17% belong to the age group above 60 years. In the USA, a report was published online as a Morbidity and Mortality Weekly Report (MMWR) early release (8th April 2020).
Hospitalization rates and characteristics of patients hospitalized with laboratory-confirmed COVID-19 are given in Garg (2020). From Chakraborty (2011) and Garg (2020), around 55.5% of confirmed cases were taken to hospital for further treatment. Among these, as per Chakraborty (2011) and Garg (2020), only 0.4% of cases belong to the age group below 17 years, 2.5% to the age group of 18-49 years, around 7.4% to the age group of 50-64 years, and around 12.2% to the age group of 65-74 years. More cases have been registered for the age group of 75-84 years, at around 15.8%, and for the age group above 85 years, where 17.2% of cases have been recorded so far. Based on the confirmed cases from Garg (2020), a comparative analysis has been performed and is reported in Fig. 2. Therefore, based on the present situation, an expert can choose the threshold values, and by following the proposed hybrid method the decision maker can take decisions in emergency situations to help persons suspected of COVID-19 infection. As a result, the rate of contamination can be reduced and, simultaneously, the mortality rate will decrease.

In this paper, we have established a three-way decision method based on a linguistic information system for identifying persons suspected of being infected with COVID-19. Based on this model, it becomes easier to decide whether to send a COVID-19 infected person for self-isolation, home quarantine or immediate treatment in an emergency situation. As COVID-19 is highly infectious, correct decisions and measures to slow down the spread of the disease are very important for containing it before it reaches the community-spreading stage.
It is also important to note that thorough rapid testing is mandatory along with this method; otherwise, it will be very difficult to take decisions that include the asymptomatic infected population. The comparative analysis based on age group for India and the USA suggests that our method is more effective and feasible than other approaches. This is because predetermined cases will be taken seriously when the proposed hybrid decision method is followed, which in turn will reduce the large percentage of the infected population in the age group between 21 and 40 years in India. Likewise, it might reduce the percentage of infected people in the age group above 60 years in the USA. This in turn might help to check the death count in the USA, which has risen to almost 50k.

Investigation of effective climatology parameters on COVID-19 outbreak in Iran
Risk assessment of novel coronavirus COVID-19 outbreaks outside China
On fuzzy sets and rough sets from the perspective of indiscernibility
Decision-theoretic three-way approximations of fuzzy sets
A class of fuzzy measures based on triangular norms: a general framework for the combination of uncertain information
Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand
Hospitalization rates and characteristics of patients hospitalized with laboratory-confirmed coronavirus disease 2019-COVID-NET, 14 States
Parameterized rough set model using rough membership and Bayesian confirmation measures
Three-way decisions space and three-way decisions
A theory of medical decision making under uncertainty
A dynamic optimal control model for SARS-CoV-2 in India
A fuzzy dynamic optimal model for COVID-19 epidemic in India based on granular differentiability
Analysis of SIR-network model on COVID-19 with respect to its impact on West Bengal in India
Correspondence between fuzzy measures and classical measures
Granular differentiability of fuzzy-number-valued functions
The threshold approach to clinical decision making
Rough sets
Rough sets and fuzzy sets
Rough set approach to multi-attribute decision analysis
An introductory survey of fuzzy control
Worldometer CC (2020) Worldometer
Managing consistency and consensus in group decision making with hesitant fuzzy linguistic preference relations
Deviation measures of linguistic preference relations in group decision making
Prediction of criticality in patients with severe Covid-19 infection using three clinical features: a machine learning-based prognostic model with clinical data in Wuhan
A mathematical model for the novel coronavirus epidemic in Wuhan, China
Probabilistic approaches to rough sets
Probabilistic rough set approximations
Three-way decision: an interpretation of rules in rough set theory
Three-way decisions with probabilistic rough sets
The superiority of three-way decisions in probabilistic rough set models
Web-based medical decision support systems for three-way medical decision making with game-theoretic rough sets
A decision theoretic framework for approximating concepts
A decision-theoretic rough set model
Fuzzy sets
On a statistical transmission model in analysis of the early phase of covid-19 outbreak
Variable precision rough set model
Fuzzy set theory-and its applications

Appendix I
See Table 3.