Optimal scale combination selection for inconsistent multi-scale decision tables
Zhu Yingjie, Yang Bin
Soft Computing (2022-04-28). DOI: 10.1007/s00500-022-07102-y

Abstract: Hierarchically structured data are very common in data mining and other real-world tasks. How to select the optimal scale combination from a multi-scale decision table is critical for subsequent tasks. At present, the models for computing the optimal scale combination mainly include the lattice model, the complement model and the stepwise optimal scale selection model, all of which target consistent multi-scale decision tables; an optimal scale selection model for inconsistent multi-scale decision tables has not been given. Motivated by this, this paper first reviews the complement model and lattice model proposed by Li and Hu. Secondly, based on the concept of positive region consistency of inconsistent multi-scale decision tables, it proposes a complement model and a lattice model based on positive region consistency and gives their algorithms. Finally, numerical experiments verify that, on inconsistent multi-scale decision tables, the new models have the same properties as the complement model and lattice model have on consistent ones, and that for consistent multi-scale decision tables the models based on positive region consistency obtain the same results. However, the lattice model based on positive region consistency is more time-consuming and costly. The models proposed in this paper provide a new theoretical method for optimal scale combination selection in inconsistent multi-scale decision tables.

Rough set theory was originally proposed by Professor Pawlak in 1982 (Pawlak 1992).
Because of its mature mathematical foundation and the fact that it requires no prior knowledge, rough set theory is easy to use and has become an effective tool for dealing with various kinds of incomplete information, such as imprecise or inconsistent data. It is a powerful data analysis method: in the absence of prior knowledge, it can determine the classification of knowledge and the upper and lower approximations of a problem by describing the sets involved, and then analyze and process the uncertain data. Based on Pawlak rough set theory, Wu and Leung introduced the concept of the multi-scale decision table (MSDT) from the perspective of granular computing and analyzed knowledge acquisition in it (Wu and Leung 2011). The concept of multiple scales is very common in our lives: we can describe a thing from multiple angles, that is, from multiple scales. It is also widely used in deep learning (Bharati et al. 2020; Li et al. 2020; Qian et al. 2017; Taverniers et al. 2021). Notably, applications of deep learning models to medical imaging and drug discovery for managing COVID-19 have been studied in the literature (Bharati et al. 2021a, b; Khamparia et al. 2021; Mondal et al. 2021a, b). Additionally, the MSDT is one of the research objects in the field of knowledge discovery in databases, so how to extract useful information and discover new knowledge from an MSDT is worth studying. The general method of processing multi-scale decision tables is to restrict the multi-scale attributes to a certain scale, thereby obtaining a series of single-scale decision tables (SSDTs) in which each attribute has only one scale; data mining is then performed on a chosen single-scale decision table. In a multi-scale information table, if all the attributes are on their finest scales, then the most information about the objects is retained, but this comes at a high cost.
However, if all the attributes are on their coarsest scales, some useful information may be lost. Therefore, there exist one or several optimal scale combinations that reduce the cost without losing useful information. Wu and Leung pointed out that their research rests on two assumptions (Wu and Leung 2013). The first is that the number of scales of each attribute must be the same. The second is that, when decomposing into subsystems, only the corresponding single attributes can be combined into a subsystem. Under the same assumptions, Gu and Wu (2013) and She et al. (2015) studied knowledge acquisition and rule induction in multi-scale decision tables. Later, Li and Hu extended this theory and broke both assumptions. They proposed the lattice model and the complement model to calculate the optimal scale combinations. Based on the concept of multi-scale attribute significance they introduced, they also proposed the stepwise optimal scale selection model: for attributes with different significance, scale selection is carried out step by step, which effectively reduces computation time and obtains the best result guided by attribute significance. Since then, the theory of generalized multi-scale decision tables has been explored by several researchers. Xie et al. proposed three new types of rules and their extraction methods (Xie et al. 2018). In Hao et al. (2017), motivated by the fact that sequential three-way decisions are an effective mathematical tool for data whose information is updated sequentially, Hao et al. used this methodology to investigate the optimal scale selection problem in a dynamic multi-scale decision table. In Huang et al. (2019), Huang et al. addressed optimal scale selection and rule acquisition in dominance-based multi-scale intuitionistic fuzzy (IF) decision tables.
What is more, on the basis of an IF inclusion measure, two novel multi-granulation decision-theoretic models have been developed for multi-scale IF information tables. In 2021, Wang et al. (2021) first investigated the belief structure and the plausibility structure by defining belief and plausibility functions from the multi-granulation viewpoint and discussed how to construct multi-granulation rough set models in multi-scale information systems; they then studied optimal scale selection methods under various requirements, in both the optimistic and the pessimistic multi-granulation settings, for a multi-scale decision information system. Wu and Leung carried out a comparison study of optimal scale combination selection in multi-scale decision tables whose attributes have different numbers of scales (Wu et al. 2017; Wu and Leung 2020). They formulated information granules under different scale combinations in multi-scale information systems and discussed their relationships; moreover, the definitions and properties of lower and upper approximations of sets under different scale combinations were given in their papers. The relationship between rule extraction and the feature matrix has been studied further (Chen et al. 2019): the matrix is used to describe the scale combination, and matrix methods are given for optimal scale combination selection and for the optimal scale combinations that keep the positive region unchanged in consistent and inconsistent generalized multi-scale decision information systems, respectively. In Bao et al. (2021), Bao et al. defined the entropy optimal scale combination in multi-scale decision tables and proved that it is equivalent to the classical optimal scale combination proposed previously. Recently, Zhan et al. (2021) established a group decision-making (GDM) approach on the multi-scale information systems proposed by Wu and Leung, from the perspective of multi-expert group decision-making (MEGDM).
It can be applied to sorting problems on multi-scale information systems. As a more general case, inconsistent decision tables are more common in daily life and in knowledge discovery tasks, and consistent multi-scale tables can be regarded as special cases of inconsistent ones. Nevertheless, current work on multi-scale decision tables and on optimal scale combination selection is mainly aimed at computing the optimal scale combinations of consistent multi-scale decision tables. Such methods cannot be applied in more general scenarios: before using them, we must judge the type of the table, and for inconsistent multi-scale decision tables we can obtain only one optimal scale combination. This is very limiting when there are missing data in the table. Motivated by these observations, in this paper we focus on how to obtain all the optimal scale combinations of inconsistent multi-scale decision tables. A complement model and a lattice model based on positive region consistency are proposed, together with their algorithms. Compared with the above models, our models are more general. Our main contributions are summarized as follows:
- We propose a complement model and a lattice model based on positive region consistency and give their algorithms for inconsistent multi-scale decision tables.
- We conduct numerical experiments showing that the models based on positive region consistency can also deal with consistent multi-scale decision tables correctly.
The remainder of the paper is organized as follows. In Sect. 2, several basic notions of Pawlak rough sets, information tables and decision tables, scale combinations and attribute significance are reviewed. In Sect. 4, the concept of positive region consistency is introduced, and the optimal scale combination selection models for inconsistent multi-scale decision tables and their algorithms are proposed. Some numerical experiments are reported in Sect. 5.
Finally, we conclude the paper with a summary and an outlook on future work in Sect. 6.

In this section, we review several basic concepts and results of Pawlak rough sets, information tables and decision tables, scale combinations and attribute significance. Let U be a finite and nonempty set called the universe of discourse. If R ⊆ U × U is an equivalence relation on U, that is, R is a reflexive, symmetric and transitive binary relation on U, then the pair (U, R) is called a Pawlak approximation space (Pawlak 1992). The equivalence relation R partitions U into disjoint subsets. Such a partition is a quotient set of U, denoted U/R, and [x]_R denotes the R-equivalence class containing x. The elements of U/R are called elementary sets. For any set X ∈ P(U), the lower and upper approximations are defined as follows:

Definition 1 Let U be a finite and nonempty universe of discourse. If X ∈ P(U), the lower approximation R(X) and the upper approximation R̄(X) of X are defined as:

R(X) = {x ∈ U : [x]_R ⊆ X},  R̄(X) = {x ∈ U : [x]_R ∩ X ≠ ∅},

where P(U) is the power set of U. Obviously, they can equivalently be defined by:

R(X) = ∪{B ∈ U/R : B ⊆ X},  R̄(X) = ∪{B ∈ U/R : B ∩ X ≠ ∅}.

If and only if R(X) = R̄(X) can X be precisely defined by R; otherwise, (R(X), R̄(X)) is called the Pawlak rough set of X with respect to (w.r.t.) (U, R). The accuracy of a rough set can be defined as (Pawlak 1992):

α_R(X) = |R(X)| / |R̄(X)|,

where | · | is the cardinal number of a set. For the empty set ∅, α_R(∅) is defined to be 1.

Definition 2 (Wu and Leung 2013) Let U be a finite and nonempty universe of discourse, and let P_1 and P_2 be two partitions of U. If for each A ∈ P_1 there exists B ∈ P_2 such that A ⊆ B, we say that P_1 is finer than P_2, or P_2 is coarser than P_1, denoted P_1 ⪯ P_2. If moreover A ⊂ B for some A, we say that P_1 is strictly finer than P_2, denoted P_1 ≺ P_2.

An information table is a pair (U, A), where U = {x_1, x_2, ..., x_n} is a finite and nonempty set called the universe of discourse and A = {a_1, a_2, ..., a_m} is a finite and nonempty set of attributes. Each attribute a ∈ A is a surjective function from U to V_a, and it determines an equivalence relation on U:

R_a = {(x, y) ∈ U × U : a(x) = a(y)}.

Similarly, for a decision attribute d we can define the equivalence relation R_d = {(x, y) ∈ U × U : d(x) = d(y)}. Then, we obtain a partition U/R_d of U.
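As a minimal illustration of Definition 1 (our own sketch, not from the paper), the lower and upper approximations and the accuracy can be computed directly from a partition U/R; the partition and set values below are hypothetical examples.

```python
# Sketch of Pawlak lower/upper approximations and accuracy (Definition 1),
# assuming the equivalence relation R is represented by its partition U/R.

def lower_approx(partition, X):
    """R(X): union of equivalence classes entirely contained in X."""
    X = set(X)
    return {x for block in partition if set(block) <= X for x in block}

def upper_approx(partition, X):
    """R̄(X): union of equivalence classes that intersect X."""
    X = set(X)
    return {x for block in partition if set(block) & X for x in block}

def accuracy(partition, X):
    """α_R(X) = |R(X)| / |R̄(X)|; defined as 1 for the empty set."""
    upper = upper_approx(partition, X)
    return len(lower_approx(partition, X)) / len(upper) if upper else 1.0

# Example: U = {1,...,5}, U/R = {{1,2},{3,4},{5}}, X = {1,2,3}.
U_R = [{1, 2}, {3, 4}, {5}]
X = {1, 2, 3}
# lower_approx(U_R, X) == {1, 2}; upper_approx(U_R, X) == {1, 2, 3, 4}
```

Here X is rough w.r.t. R, since its lower and upper approximations differ, and its accuracy is 2/4 = 0.5.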
For any B ⊆ C, define the equivalence relation R_B as:

R_B = {(x, y) ∈ U × U : a(x) = a(y) for all a ∈ B} = ∩_{a ∈ B} R_a.

For an inconsistent decision table (U, C ∪ {d}), the concept of the generalized decision attribute was introduced by Wu and Leung (2020). For any B ⊆ C, the generalized decision of x w.r.t. B, denoted ∂_B(x), can be defined as (Komorowski et al. 1999):

∂_B(x) = {d(y) : y ∈ [x]_{R_B}}.

According to this definition, a decision table (U, C ∪ {d}) is consistent if and only if |∂_C(x)| = 1 for every x ∈ U. A multi-scale decision table is a tuple S = (U, C ∪ {d}), where U is a finite and nonempty object set called the universe of discourse, C is a finite and nonempty set of condition attributes and d is the decision attribute. Each attribute a_j ∈ C is a multi-scale attribute; that is, for the same object in U, attribute a_j can take different values at different scales. For each attribute a_j ∈ C, we assume that the higher the level of scale, the coarser the partition w.r.t. that scale becomes. For example, if attribute a_j has three levels of scale, its first level a_j^1 is finer than its second level a_j^2, and its second level a_j^2 is finer than its third level a_j^3. The general method of processing multi-scale information tables is to restrict the multi-scale attributes to a certain scale, thereby obtaining a series of single-scale information tables in which each attribute has only one scale; data mining is then performed on a chosen single-scale information table. The concepts of scale combination and scales collection, together with some properties, were introduced by Li and Hu (2017).

Definition 6 (Li and Hu 2017) Let S = (U, A) be a multi-scale information table, where attribute a_i has I_i levels of scale, i = 1, 2, ..., m. If we restrict attributes a_1, a_2, ..., a_m to their l_i-th scales, respectively, we obtain a single-scale information table S_K, where K = (l_1; l_2; ...; l_m). The tuple (l_1; l_2; ...; l_m) is called the scale combination of S in S_K. The set of all scale combinations of S is called the scales collection, denoted L.

Definition 7 (Li and Hu 2017) Let S = (U, A) be a multi-scale information table, and let L be the scales collection of S.
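The scales collection L of Definition 6 can be enumerated mechanically. The sketch below is our own illustration, with a hypothetical table layout table[x] = tuple of per-attribute scale vectors (values of a_i from finest to coarsest level); it builds L and restricts the table to a single-scale table S_K.

```python
from itertools import product

def scales_collection(levels):
    """All scale combinations (l_1; ...; l_m) for attributes with levels[i] = I_i.
    |L| equals the product of the I_i."""
    return list(product(*(range(1, I + 1) for I in levels)))

def restrict(table, K):
    """Single-scale table S_K: for each object keep level K[i] of attribute a_i
    (levels are 1-indexed, as in the paper)."""
    return {x: tuple(scales[K[i] - 1] for i, scales in enumerate(attrs))
            for x, attrs in table.items()}

# Example with m = 2 attributes having I_1 = 2 and I_2 = 3 levels.
L = scales_collection([2, 3])                     # |L| = 2 * 3 = 6
table = {"x1": (("a", "A"), (1, 1, 0)),
         "x2": (("b", "A"), (2, 1, 0))}
# restrict(table, (2, 3)) == {"x1": ("A", 0), "x2": ("A", 0)}
```

Under the coarsest combination (2; 3) the two objects become indistinguishable, which is exactly the kind of information loss the optimal scale combination must control.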
For K_1, K_2 ∈ L, if every element of K_2 is not less than the corresponding element of K_1, then we say that K_1 is weaker than K_2, or K_2 is stronger than K_1, denoted K_1 ⪯ K_2. According to Definition 7, ⪯ is a partial order relation on L. Thus (L, ⪯) is a partially ordered set; that is, ⪯ is reflexive, antisymmetric and transitive. Furthermore, (L, ⪯) is a lattice in which every two elements have a unique supremum and a unique infimum. Obviously, we can obtain the following proposition.

Proposition 1 Let S be a multi-scale decision table and K_1, K_2 ∈ L with K_1 ⪯ K_2. Then the partition induced by K_2 is coarser than that induced by K_1; consequently, if S_{K_2} is consistent, then S_{K_1} is also consistent.

According to Proposition 1, we can define the concept of an optimal scale combination as follows.

Definition 8 Let L be the scales collection of a consistent multi-scale decision table S. A scale combination K ∈ L is called an optimal scale combination if S_K is consistent and, for every K' ∈ L with K ≺ K', S_{K'} is inconsistent.

The consistency of a multi-scale decision table itself can be defined by:

Definition 9 (Wu and Leung 2011) Let S = (U, C ∪ {d}) be a multi-scale decision table, and let 1_m = (1; 1; ...; 1). If S_{1_m} = (U, {a_j^1 | j = 1, 2, ..., m} ∪ {d}), in which all attributes are on their finest level of scale, is consistent, then the multi-scale decision table S is consistent.

Let L be the scales collection of (U, C ∪ {d}). For an arbitrary K = (l_1; l_2; ...; l_m) ∈ L, the corresponding equivalence relation R_{A^K} can be defined as

R_{A^K} = {(x, y) ∈ U × U : a_j^{l_j}(x) = a_j^{l_j}(y) for all j = 1, 2, ..., m},

and U is partitioned by R_{A^K} into the family of equivalence classes

U/R_{A^K} = {[x]_{A^K} : x ∈ U}, where [x]_{A^K} = {y ∈ U : (x, y) ∈ R_{A^K}}.

From these definitions we can see the relation between equivalence relations and subsets of attributes. Let L be the scales collection of S. For K_0 = (k_1; k_2; ...; k_m) ∈ L and an arbitrary subset C_1 ⊆ C, there exists K_1 ⊆ K_0 such that the indexes of K_1 in K_0 are the same as those of C_1 in C. Similarly, there exist a sequence C_m ⊆ ... ⊆ C_2 ⊆ C_1 ⊆ C_0 and the corresponding index sets K_m ⊆ ... ⊆ K_2 ⊆ K_1 ⊆ K_0, and the corresponding equivalence relations satisfy R_{C_0^{K_0}} ⊆ R_{C_1^{K_1}} ⊆ ... ⊆ R_{C_m^{K_m}}. In order to extend the application of multi-scale decision tables, Li and Hu (2017) proposed the complement model and the lattice model.
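Definition 9 reduces the consistency of S to the consistency of the finest single-scale table S_{1_m}, and a single-scale decision table is consistent exactly when objects with identical condition values share a decision value (|∂_C(x)| = 1 for every x). A small sketch of this check (our own, with hypothetical dictionaries cond and dec):

```python
def is_consistent(cond, dec):
    """True iff no two objects with the same condition tuple have different
    decisions (equivalently, |∂_C(x)| = 1 for every object x)."""
    seen = {}
    for x, c in cond.items():
        if c in seen and seen[c] != dec[x]:
            return False
        seen.setdefault(c, dec[x])
    return True

# Two objects indistinguishable on the conditions but with different
# decisions make the table inconsistent.
cond = {"x4": (1, 0), "x5": (2, 0), "x6": (1, 0)}
dec = {"x4": "yes", "x5": "no", "x6": "no"}
# is_consistent(cond, dec) == False
```

Combined with the ⪯ order and Proposition 1, this check is the only primitive the lattice model needs: walk the lattice (L, ⪯) and keep the maximal K for which S_K passes it.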
Let S = (U, C ∪ {d}) be a multi-scale decision table, and let S^+ be its complement system. Let I_i, i = 1, 2, ..., m, be the number of levels of scale of attribute a_i; these numbers are not necessarily the same. For attributes with a smaller number I_i, we complement them with several known levels of scale to obtain a new multi-scale decision table in which all attributes have the same number of levels of scale. Let I = max{I_1, I_2, ..., I_m}, and let p be the index of the attribute with the largest number of levels of scale; if the maximum is attained several times, the index of its first occurrence is used. Firstly, the concept of a scale vector is introduced. In order that all attributes have the same number of levels of scale in S^+, for each i ≠ p a complement scale vector C_i^+ = (l_{i1}, l_{i2}, ..., l_{iI}) should be formed, where 1 ≤ l_{ij} ≤ I_i and l_{ij} ≤ l_{ik} whenever j ≤ k. Moreover, to include all the information about a_i, the original scale vector C_i = (1, 2, ..., I_i) should be covered by C_i^+; that is, every original level must appear in C_i^+ at least once. Therefore, the number of possible choices for the scale vector C_i^+ of a_i equals the number of ways to choose I − I_i elements from I_i with replacement, which, according to Brualdi (2010), is C(I_i + (I − I_i) − 1, I − I_i) = C(I − 1, I − I_i).

Let S = (U, C ∪ {d}) be a multi-scale decision table, and let I_i (i = 1, 2, ..., m) be the number of levels of scale of a_i, not necessarily equal. L is the scales collection of S, and |L| = ∏_{i=1}^m I_i. If S is consistent, then there exists a scale combination K that is an optimal scale combination of S. The lattice model aims to select all the optimal scale combinations. For a given multi-scale decision table, the lattice model can be described by the following procedure:
1. According to Definition 6, the scales collection L of S can be calculated.
2.
Based on consistency and the partial order relation between the elements of L, the set OSC of optimal scale combinations can be obtained according to Definition 8.
3. In the subsystem S_K determined by an optimal scale combination K ∈ OSC, knowledge acquisition and other tasks can be carried out.

Let S = (U, C ∪ {d}) be a multi-scale decision table, where U = {x_1, x_2, ..., x_n}, C = {a_1, a_2, ..., a_m} and I_i is the number of levels of scale of a_i (i = 1, 2, ..., m). If 1_m ⪯ K_1 ⪯ K_2 ⪯ (I_1; I_2; ...; I_m) and S_{K_1} is an inconsistent decision table, then according to Proposition 1, S_{K_2} is also inconsistent. Hence, if S is inconsistent, that is, S_{1_m} is inconsistent, then S_K is also inconsistent for every K ∈ L. For K ∈ L, the equivalence relation R_{A^K} is defined as above. For X ⊆ U, the upper and lower approximations of X are:

R_{A^K}(X) = {x ∈ U : [x]_{A^K} ⊆ X},  R̄_{A^K}(X) = {x ∈ U : [x]_{A^K} ∩ X ≠ ∅},

where [x]_{A^K} is the equivalence class of x under R_{A^K}. The positive region under scale combination K in S is defined as (Wu and Leung 2013):

POS_{A^K}(d) = ∪_{j=1}^r R_{A^K}(D_j),

where U/R_d = {D_1, D_2, ..., D_r}. S_K is said to be positive region consistent when POS_{A^K}(d) coincides with the positive region at the finest scale combination 1_m, i.e., the positive region is unchanged. An algorithm for judging whether a given subsystem of a multi-scale decision table is positive region consistent has been given in earlier work. We extend the application of the complement model and lattice model in this subsection. The complement model and lattice model proposed by Li and Hu are designed to process consistent multi-scale decision tables. Therefore, we can combine positive region consistency with these two models to obtain the complement model and lattice model based on positive region consistency, which can deal with inconsistent multi-scale decision tables. To apply them, we only need to replace the judgement of consistency of a subsystem with the judgement of positive region consistency in the complement model and lattice model. Note that, for K ∈ L, the condition "S_K is consistent" is a special case of "S_K is positive region consistent".
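Positive region consistency can be checked directly from the definition above. The sketch below is our own illustration, with hypothetical condition maps cond_K (condition tuples under scale combination K) and cond_1m (under the finest scales 1_m).

```python
def positive_region(cond, dec):
    """POS_{A^K}(d): objects whose condition class is contained in a single
    decision class D_j (i.e., all objects in the class share one decision)."""
    classes = {}
    for x, c in cond.items():
        classes.setdefault(c, set()).add(x)
    pos = set()
    for block in classes.values():
        if len({dec[x] for x in block}) == 1:   # block ⊆ some D_j
            pos |= block
    return pos

def is_positive_region_consistent(cond_K, cond_1m, dec):
    """S_K is positive region consistent iff its positive region equals
    the positive region at the finest scale combination 1_m."""
    return positive_region(cond_K, dec) == positive_region(cond_1m, dec)

# Coarsening that merges x1 with x2 (same decision) keeps the positive region;
# x3 and x4 are already indiscernible at the finest scales with different decisions.
dec = {"x1": 0, "x2": 0, "x3": 1, "x4": 2}
cond_1m = {"x1": (1,), "x2": (2,), "x3": (3,), "x4": (3,)}
cond_K = {"x1": (1,), "x2": (1,), "x3": (3,), "x4": (3,)}
# is_positive_region_consistent(cond_K, cond_1m, dec) == True
```

Replacing is_consistent with is_positive_region_consistent in the lattice model procedure is exactly the modification that yields LM-PR (and likewise CM-PR for the complement model).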
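Returning to the complement model: the admissible complement scale vectors C_i^+ can be enumerated by choosing which original levels to duplicate. The following sketch (our own illustration) generates them and matches the combinations-with-replacement count C(I − 1, I − I_i) stated above.

```python
from itertools import combinations_with_replacement
from collections import Counter
from math import comb

def complement_vectors(I_i, I):
    """All non-decreasing vectors of length I over {1, ..., I_i} that contain
    every original level at least once (i.e., cover C_i = (1, 2, ..., I_i))."""
    result = []
    # Each choice of I - I_i duplicated levels (with replacement) gives one vector.
    for extra in combinations_with_replacement(range(1, I_i + 1), I - I_i):
        counts = Counter(extra)
        vec = []
        for level in range(1, I_i + 1):
            vec.extend([level] * (1 + counts[level]))
        result.append(tuple(vec))
    return result

# Attribute with I_i = 2 levels padded to length I = 4:
vecs = complement_vectors(2, 4)
# vecs == [(1, 1, 1, 2), (1, 1, 2, 2), (1, 2, 2, 2)], len(vecs) == comb(3, 2) == 3
```

Each duplicated level simply repeats an already-known scale of a_i, so no information is invented; the complemented table S^+ has the same content at every original scale.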
[Table 1: rows x_11 to x_14 of the complemented multi-scale decision table.]

In order to verify the feasibility of the complement model (CM-PR) and the lattice model (LM-PR) based on positive region consistency, some numerical experiments are carried out in this section, and we compare their results with those of stepwise optimal scale selection based on positive region consistency (SOSS-PR). Consider a multi-scale decision table with C = {a_1, a_2, a_3, a_4}. We can notice that x_4 and x_6 are indistinguishable w.r.t. R_C, but d(x_4) ≠ d(x_6). The results obtained by CM-PR, LM-PR and SOSS-PR, respectively, are shown in Table 2. In order to evaluate the above algorithms more objectively, two data sets were collected from the University of California, Irvine (UCI) Machine Learning Repository (Lichman 2013). These two decision tables are single-scale decision tables, so we use an existing method to obtain their corresponding multi-scale decision tables. That method has four steps, but we perform only the first three: the decision value of each object x in the multi-scale decision table is not changed to ∂_C^{(1;1;...;1)}(x); that is, it keeps its original decision value. Hence, the multi-scale decision tables we obtain may be inconsistent. The details of the data sets and the results are shown in Tables 3 and 4, respectively. Through the numerical experiments, it can be found that the optimal scale combinations of CM-PR are weaker than those of LM-PR, and the result of SOSS-PR is one of the results of LM-PR. Moreover, the running time of SOSS-PR is shorter than that of LM-PR. These conclusions are analogous to those reported for the corresponding models on consistent multi-scale decision tables. In order to test the performance of CM-PR and LM-PR on consistent multi-scale decision tables, we apply the complement model, lattice model, stepwise optimal scale selection, CM-PR, LM-PR and SOSS-PR to some consistent multi-scale decision tables, respectively.
Tables 5, 6 and 7 are three consistent multi-scale decision tables collected from the literature. The results are shown in Tables 8 and 9. For Table 5, the set of optimal scale combinations via the complement model and CM-PR is {(3;2;3;2), (3;3;3;1)}; the set of optimal scale combinations via the lattice model and LM-PR is {(4;3;1;2), (4;2;2;2), (1;3;3;2), (4;3;2;1), (3;2;3;2), (3;3;3;1), (4;1;3;2)}; and the optimal scale combination via stepwise optimal scale selection and SOSS-PR is (4;3;1;2). For Table 6, the set of optimal scale combinations via the complement model and CM-PR is {(2;1;3;3;3), (2;2;2;2;2), (1;2;3;3;3)}; the set via the lattice model and LM-PR is {(2;2;4;1;3), (1;2;4;3;3), (2;1;4;3;3), (1;2;3;3;4), (2;2;4;3;2), (2;2;3;1;4), (2;2;2;3;3), (2;1;3;3;4)}; and the optimal scale combination via stepwise optimal scale selection and SOSS-PR is (2;2;4;3;2). For Table 7, the set of optimal scale combinations via the complement model and CM-PR is {(2;2;2;2;4)}, the set via the lattice model and LM-PR is {(2;2;2;2;4)}, and the optimal scale combination via stepwise optimal scale selection and SOSS-PR is (2;2;2;2;4). Moreover, for the data sets described in Table 3, we use the same conversion method to obtain their corresponding consistent multi-scale decision tables. The optimal scale combinations on these two consistent multi-scale decision tables, using the three models based on consistency and the three models based on positive region consistency, are shown in Tables 10 and 11, respectively. Comparing Table 8 with Table 9, and Table 10 with Table 11, several facts are verified. The running times of the complement model and CM-PR show no consistent relationship, and the running time of LM-PR is about two times longer than that of the lattice model. The running time of SOSS-PR is slightly longer than that of stepwise optimal scale selection.
When dealing with consistent multi-scale decision tables, the models based on positive region consistency give the same results as the models based on consistency. Thus, CM-PR, LM-PR and SOSS-PR are also able to deal with consistent multi-scale decision tables efficiently. In a word, for general multi-scale decision tables, we can directly use the models based on positive region consistency. For single-scale decision tables, we only need to perform the first three steps of the conversion method to obtain their corresponding multi-scale decision tables; generalized decision values do not need to be calculated. The same results can then be obtained using the models based on positive region consistency. Moreover, the optimal scale combinations obtained after converting a single-scale decision table to a multi-scale decision table often perform well in classification experiments.

Based on the models and theories proposed by Li and Hu, this paper introduces some new, more general methods for calculating all the optimal scale combinations in inconsistent multi-scale decision tables. It is also an extension of the multi-scale decision tables studied by Wu and Leung, breaking their two strong assumptions. Numerical experiments verify that the new models have the same properties in dealing with inconsistent multi-scale decision tables as the complement model and lattice model have for consistent ones. For consistent multi-scale decision tables, the models based on positive region consistency obtain the same results as well. (Table 8 reports the results of the models based on consistency; Table 9 reports those of the models based on positive region consistency.) In fact, "the consistency of a single-scale subsystem" is a special case of "the positive region consistency of it." Therefore, the models based on positive region consistency can efficiently solve the problem of optimal scale selection in both consistent and inconsistent multi-scale decision tables.
However, the lattice model based on positive region consistency, which can always calculate all the optimal scale combinations, is more time-consuming and costly. Future research will therefore include optimizing the algorithm of the lattice model and reducing its time complexity so that it performs well on large data sets.

References

- Entropy based optimal scale combination selection for generalized multi-scale information tables
- Hybrid deep learning for detecting lung diseases from X-ray images
- Medical imaging with deep learning for COVID-19 diagnosis: a comprehensive review
- CO-ResNet: optimized ResNet model for COVID-19 diagnosis from X-ray images
- Matrix method for the optimal scale selection of multi-scale information decision systems
- On knowledge acquisition in multi-scale decision systems
- Optimal scale selection in dynamic multi-scale decision tables based on sequential three-way decisions
- Dominance-based rough sets in multi-scale intuitionistic fuzzy decision tables
- Inclusion measure-based multi-granulation decision-theoretic rough sets in multi-scale intuitionistic fuzzy information tables
- Knowledge acquisition and matrix method of generalized multi-scale information system
- Diagnosis of breast cancer based on modern mammography using hybrid transfer learning
- Rough sets: tutorial
- A new approach of optimal scale selection to multi-scale decision tables
- Stepwise optimal scale selection for multi-scale decision tables via attribute significance
- Automated Gleason grading and Gleason pattern region segmentation based on deep learning for pathological images of prostate cancer
- UCI machine learning repository
- CO-IRv2: optimized InceptionResNetV2 for COVID-19 detection from chest CT images
- Diagnosis of COVID-19 using machine learning and deep learning: a review
- Rough Sets: Theoretical Aspects of Reasoning about Data
- Multi-scale deep learning architectures for person re-identification
- A local approach to rule induction in multi-scale decision tables
- Mutual information for explainable deep learning of multiscale systems
- Multi-granulation-based optimal scale selection in multi-scale information systems
- Theory and applications of granular labelled partitions in multi-scale decision tables
- Optimal scale selection for multi-scale decision tables
- On rule acquisition in incomplete multi-scale decision tables
- A comparison study of optimal scale combination selection in generalized multi-scale decision tables
- Rule acquisition and optimal scale selection in multi-scale formal decision contexts and their applications to smart city
- An investigation on Wu-Leung multi-scale information systems and multi-expert group decision-making

Publisher's Note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements: The authors are extremely grateful to the editor and anonymous referees for their valuable comments and helpful suggestions, which helped to improve the presentation of this paper.

Data availability: Enquiries about data availability should be directed to the authors.

Conflict of interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Human and animal rights: This article does not contain any studies with human participants or animals performed by the authors.