key: cord-0944605-pqg351ty authors: Maity, Ananta; Das, Kousik; Samanta, Sovan; Mondal, Sukumar; Dubey, Vivek title: A study of cluster hypergraphs and its properties date: 2021-02-12 journal: Soc Netw Anal Min DOI: 10.1007/s13278-021-00721-7 sha: a1199ad6b5966dfd6af13dfab46da84b7d016b8e doc_id: 944605 cord_uid: pqg351ty In this study, cluster hypergraphs are introduced to generalize the concept of hypergraphs, where cluster nodes are allowed. Few related terms and properties on cluster hypergraphs are discussed. Some operations, including the Cartesian product, union, intersection, etc., are studied. Different types of matrix representations and isomorphism are also proposed on cluster hypergraphs. The notion of an effective degree for nodes is introduced to capture the group/ cluster effects. At last, the area of applications and future directions with conclusions is deployed. The graphs are perfect representations of relations among data. The great mathematician L. Euler (1707-1783), known as the father of Graph Theory, first provided the concepts of graphs in 1736 by solving the famous Konigsberg bridge problem. A graph (Harary 1972; Sabidussi 1959 ) is a set of vertices with a relation among vertex. There are huge developments in graph theory and its applications in competitions, colorings, social networks (Das et al. 2018; Yan and Ding 2009) , etc. But a hypergraph is a generalization of the graph in which any subset of a vertex set is an edge rather than two vertices. Specially, Berge (1961 Berge ( ,1973 Berge ( , 1989 introduced hypergraphs as a generalization of graphs. Buroscha and Ceccherini (1996) characterized cubehypergraphs where each hyperedge contains three vertices. Tyshkevich and Zverovich (1996) introduced a new multivalued function, the line hypergraph. The function generalizes two classical concepts at once, namely, of the line graph and the dual hypergraph. In terms of this function, proofs of some known theorems online graphs can be unified, and their more general versions can be obtained. Sonntag and Teichert (2000) defined hypertrees, and they extended the notion to competition hypergraphs (Sonntag and Teichert 2004) . Bollabos and Scott (2000) discussed three-uniform hypergraphs. There are various research works on hypergraphs (Djokovic 1973; Frankl and Katona 2000; Katona and Kierstead 1999) . Bergen studied about the infection in hypergraphs (Bergen et al. 2018) . But all these previous work's concepts of a group of nodes as a single node ignored. Samanta et al. (2020a) have been introduced cluster hypergraphs to generalize the concept of hypergraphs and to capture the notion of cluster nodes. The author considered the simple node as well as the cluster node in hypergraph where a simple node is as a usual node, and the cluster node is containing more than one simple node. The author also proposed the various types of the coloring of cluster hypergraphs (Samanta et al. 2020b) . Current concepts of graphs use simple nodes and clusters separately to find the influence of/on a node. But this approach is not enough to capture more complicated situation, when a node is part of different clusters. For example, node 4 is part of two additional clusters: 8, 9. In this paper, we develop a general approach to address the issue for hyper cluster graphs. In this study, we introduced various terms and properties of cluster hypergraphs with the application. Thus, the major contributions of this study are as follows. • The degree and effective degree of cluster hypergraphs has been introduced. • Operations (Cartesian product, union, intersection) on two cluster hypergraphs have been proposed. • Various matrices on cluster hypergraphs have been presented. • Type-I and Type-II isomorphism have been introduced. The rest of the sections are organized as follows. In Sect. 2, preliminaries on cluster hypergraphs are described. In Sect. 3, the degree and effective degree of cluster hypergraphs have basic operations, such as a Cartesian product, union and intersection, are defined. In Sect. 4, different types of matrix representation with examples are displayed. In Sect. 5, isomorphisms on cluster hypergraphs are studied. In Sect. 6, the conclusion and future direction are described. Definition 2.1 Let X = x 1 , x 2 , … , x n be a finite set and let E = e 1 , e 2 , … , e m be a family of subsets of X such that The pair (X, E) is called a hypergraph with vertex set X and hyperedge set E . The elements x 1 , x 2 , … , x n of X are vertices of hypergraph H , and the sets e 1 , e 2 , … , e m are hyperedges of hypergraph H. Definition 2.2 Let X be a non-empty set and V X be a subset of P(X) such that ∉ V X and X ⊂ V X . Now, E be a multi-set whose elements belong to P(P(X)) such that Then, G = V X , E is said to be cluster hypergraph where V X is said to be vertex set, and E is said to be multi-hyperedge set. A k-cluster hypergraph is defined as follows: Definition 2.3 Let X be a non-empty set and V X be a subset of P k (X) , k = 1, 2, 3 … such that ∉ V X and X ⊂ V X . Now, E be a multi-set whose elements belong to P V X such that Then, G = V X , E is said to be k-cluster hypergraph where V X is said to be vertex set, and E is said to be multihyperedge set. Generally, for k = 1, −1 cluster hypergraphs are assumed as cluster hypergraphs. • It is assumed that each node inside a cluster node is automatically connected to the cluster node, but these inside nodes may not be connected to each other. • In a virtual representation (Fig. 2 ) of any cluster hypergraph, the cluster nodes are assumed as separate nodes, and the connections to the inside nodes are shown in the representation. • The maximal nodes are those nodes which are not contained in any other cluster nodes. The elements of X are termed as simple nodes. A simple node may be termed as a maximal node if it does not belong to any other nodes. • Depending on the cluster node sizes and their edges, cluster hypergraphs are classified into some different categories. Definition 2.5 Let X = x 1 , x 2 , … , x m be a non-empty set and G = V X , E be a cluster hypergraph where V X = v 1 , v 2 , … , v n be set of nodes, that is, v i ∈ P(X), i = 1, 2, … , n and E = e 1 , e 2 , … , e k be the set of edges, that is, e i ∈ P(P(X)), i = 1, 2, … , k. w, the degree of node v i which is contained in the edges e ij , j = 1, 2, … , p which is denoted as d v i and defined as d Note that an edge containing single vertex will contribute once to the degree of containing vertex. Similarly, the edges containing more than one vertex, contribute once to each of its vertices. Example 2.6 In the cluster hypergraph ( Fig. 1) , the degree of node {D} is 3. A number of connections to a node are termed as a degree of a node. If a node is contained in another cluster node (group), then the node may have separate effect of connections. In this study, a separate term is defined for a node, an effective degree of a node. Definition 2.7 Let X = x 1 , x 2 , … , x m be a non-empty set and G = V X , E be a cluster hypergraph where V X = v 1 , v 2 , … , v n be set of nodes, that is, v i ∈ P(X), i = 1, 2, … , n and E = e 1 , e 2 , … , e k be the set of edges, that is, e i ∈ P(P(X)), i = 1, 2, … , k. Also, let, cv i is cluster node containing simple node v i . Now, an effective degree of a simple node v i is denoted as ed v i d defined as where l is the number of cluster nodes containing v i . Example 2.8 In the cluster hypergraph ( Fig. 1) , degree of node {C} is 1, but an effective degree of node {C} is Theorem 2.9 Let G = V X , E be a cluster hypergraph where |X| = k. Then, the total degree of the cluster hypergraph is less than 2 k − 1 2 2 k −2 . Proof Since G = V X , E is a cluster hypergraph where |X| = k . Thus, maximum number of nodes of G is 2 k − 1. Now, in cluster hypergraphs, there may be edges containing a single node, containing double nodes and so on. Thus, the contribution of edges of containing n nodes is n × (number of edges). The maximum contribution of all edges is Hence the result. Then, the total degree of the cluster hypergraph is less than k × 2 k−1 . Let G = V X , E be a cluster hypergraph where |X| = k. Then, the total effective degree of the cluster hypergraph is less than + 2 k − 1 2 k − 2 , where δ is the sum of maximum degrees of nodes, and l is the number of cluster nodes in the hypergraph. ere l is the number of cluster nodes containing x i . Here, sum of the effective degree is . Let us suppose the total number of cluster nodes in the hypergraph is l . Now, the degree of a cluster node is always less than 2 k − 2 . Thus, sum of the degrees of all cluster nodes is less than or equal l × 2 k − 2 . One simple node may belong to all cluster nodes. Thus, sum of all effective degrees is less than or equal to + 2 k − 1 2 k − 2 , where = 2 k − 1 2 2 k−2 and l is the number of cluster nodes in the hypergraph. Hence, the result is true. The basic operations, that is, cartesian product, union, the intersection of cluster hypergraphs are defined and studied some properties. Definition 3.1 Suppose G = V 1 , E 1 and H = V 2 , E 2 . are two cluster hypergraphs. Then, the Cartesian product of G and H is denoted by G × H = (V, E) and defined as vertex Proof Let G = (V 1 ,E 1 ) and H = (V 2 ,E 2 ) are two cluster hypergraphs with G × H = (V, E) . Then, � ∉ E 1 , � ∉ E 2 and for each elemente 1 in E 1 , there exists at least one element v 1 in V 1 such that v 1 ∈ e 1 and for each element e 2 in E 2 , there exists at least one element v 2 in V 2 such that v 2 ∈ e 2 . Therefore, ∉ V , � ∉ E and for each element e ∈ E , there exists at least one element v ∈ V such that v ∈ e . Hence, all the conditions of the definition of cluster hypergraph satisfy in G × H, ; therefore, G × H is a cluster hypergraph. Hence the result. Example 3.5 Consider two cluster hypergraphs ( F i g s . 6 , 7 ) G = (V 1 , E 1 ) a n d H = (V 2 , E 2 ) , . Then, � ∉ E 1 , E 2 and for each element e 1 in E 1 , there exists at least one element v 1 in V 1 such that v 1 ∈ e 1 and for each element e 2 in E 2 , there exists at least one element v 2 in V 2 such that v 2 ∈ e 2 . Then, � ∉ E ( ∵� ∉ E 1 , E 2 ) and for each element e in E , there exists at least one element v in V such that v ∈ e. Therefore, G ∪ H is a cluster hypergraph. Hence the result. Proof Since G = (V 1 ,E 1 ) and H = (V 2 ,E 2 ) are two cluster hypergraphs with G ∩ H = (V, E) . Then, � ∉ E 1 , E 2 and for each element e 1 in E 1 , there exists at least one element v 1 in V 1 such that v 1 ∈ e 1 and for each element e 2 in E 2 , there exists at least one element v 2 in V 2 such that v 2 ∈ e 2 . Then, � ∉ V, E and for each element e in E , there exists at least o element v in V such that v ∈ e. Therefore, G ∩ H is a cluster hypergraph. Hence the result. Matrix representation is the easiest and convenient way to represent graphs. We represent cluster hypergraph as four types of matrices. • Adjacency matrix • Incidence matrix • Location matrix • Circuit matrix If one cluster node is adjacent to another maximal node, m is related to the indirect path from simple nodes of one cluster node to the simple nodes of another cluster node or to the maximal node itself). If two clusters are connected, then the nodes among the cluster are indirectly connected. The corresponding adjacency matrix of Example 4.2 is given below: (1,0) (0,0) (0,0) (0,0) (0,0) (0,0) (0,0) (0,0) f (0,0) (0,1) (0,1) (0,1) (0,0) (0,0) (1,0) (1,0) g (0,0) (0,1) (0,1) (0,1) (0,0) (1,0) (0,0) (1,0) H (0,0) (0,1) (0,1) (1,0) (0,0) (1,0) (1,0) (0,0) • The entries along the principal diagonal are all (0,0) if the graph has no self-loop. • If the entry value is (1,0), then there is a direct edge between two nodes. • If the entry values are (0,1), then the nodes are indirectly connected through some cluster to maximal node adjacency. For complete CCCH, the entries for the adjacency matrix will be either (1,0) or (0,1) for all non-diagonal entries. Definition 4.5 Let us consider a cluster hypergraph with m nodes and n edges. The incidence matrix A = a ij is an m × n matrix define as, The corresponding incidence matrix of Example 4.6 is given below: • Since every edge is incidence on precisely two vertices, each column of A has exactly two 1's. • The number of 1's is in a row equal to the degree of the corresponding vertex. Note 4.8 For (m, n)uniform cluster hypergraph, the n entries of a column of the incidence matrix for will be 1. Let us consider a cluster hypergraph with m simple nodes and n cluster nodes. The location matrix defined as an m × n matrix A = a ij such that Example 4.10 In Fig. 12 , the corresponding location matrix is given below: a ij = 1, if j-th edge is incident on i-th node 0, otherwise a ij = 1, if i-th simple node is located in j-th cluster node 0, otherwise Observations 4.11 • Number of rows is the total number of simple nodes. • Number of columns is the total number of cluster nodes. • The number of 1′s in each column is the number of simple nodes, contained in the corresponding cluster nodes. For (m, n)uniform cluster hypergraph, m-entries for each column in the location matrix will be 1. A chain is defined as the finite alternating sequence of nodes and edges, beginning and ending with nodes such that each edge is incident with the nodes preceding and following it. In a virtual representation of cluster hypergraph G, a chain is defined to be a sequence (iii) each edge is incident with the vertices preceding and following it. Then, this chain is called a circuit. Isomorphism is an important property of cluster hypergraphs. We define isomorphism on cluster hypergraphs. G = V X , E and G � = V � X , E � are said to be Type-I isomorphic if there exists a bijective mapping f ∶ V X → V � X such that there exists an edge (a, b) ∈ E if and only if (f (a), f (b)) ∈ E � for all a, b ∈ V X . • There are one-one correspondences between nodes and edges among two cluster hypergraphs. • There should be the same number of nodes between two cluster hypergraphs. • The degree of a node under the isomorphic image should be the same. • The effective degree of a node under the isomorphic image should be the same. • The matrices, that is, adjacency matrices, incidence matrixes, location matrices and circuit matrices under isomorphic image should be the same. • There should be the same number of maximal nodes under isomorphism. G = V X , E and G � = V � X , E � are said to be Type-II isomorphic if there exists a bijective mapping f ∶ V X → V � X such that for all maximal nodes a, b ∈ V X there exists edge (a, b) ∈ E if and only if there exist maximal nodes f (a), f (b) ∈ V � X such that (f (a), f (b)) ∈ E � . • Type-I isomorphism implies Type-II isomorphism between the two cluster hypergraphs but not conversely. • Adjacency matrices of Type-I and Type-II isomorphism between the cluster hypergraphs may not be the same. A. This study of cluster hypergraphs introduces a new area of graph theory and networks. The properties, matrix representations and isomorphism of cluster hypergraphs will be main backgrounds of the future studies in the area. There are a number of applications of cluster hypergraphs in searching, centrality measures, etc. in social networks. There is a big problem to search for a particular type of small amount of data or a small network from a big amount of data or a big network. It is possible to search such small networks by using cluster hypergraphs. First represent the virtual presentation of cluster hypergraphs to the desired particular types of network and fit all possible type-2 isomorphism mappings to all such small networks of a big network. This can be done by adjacency matrices or degree sequence properties of cluster hypergraphs. Thus, isomorphism properties of cluster hypergraphs can be applied for filtering the required data as advanced searching. This study also analyzes the node's importance in the network, especially the importance of a cluster node in a social network where a group of nodes is taken as a cluster node. The We considered a small research network (Fig. 15 ) among researchers and supervisors in an area of a specific subject. These persons are taken as cluster nodes or simple nodes. Here, a supervisor is considered as a cluster node, and a researcher is taken as a simple node. The edges are considered if there is a paper between them. Also, a supervisor always has an edge over his/her researchers, which belongs to the same cluster. But the problem is to find "who is influential in this network?". Hence, we have to find some measures like degree centrality, effective degree centrality of nodes in the network. Here, degree centrality indicates the direct influence of a node in the network, and effective degree centrality indicates the effective influence of node considering the belongingness of node in a cluster in a network (Fig. 16) . All the necessary calculations are done (Table 1) and found the direct influence and effective direct influence of all nodes in the network using the definition of degree and effective degree, respectively. In this study of the applications, we compared (Fig. 17) the results, and it is observed that • The direct influence and effective direct influence of maximal clusters are the same. • The effective direct influence of non-maximal nodes may be greater than direct influence. • The non-maximal nodes within a cluster are also influential when these nodes belong to a bigger cluster with more nodes. • The cluster nodes and nodes in the cluster are much more influential than the simple maximal node that are not in a cluster. Figure 17 shows that the effective degree brings additional information that is in clusters, and hence effective degree is higher in each case. But, the relative importance of node (ranking) and its normalized inform us that effective degree shows a different perspective. The relative importance of the node changes in the network due to the influence of clusters. This is shown in Fig. 18a and b for the above example problem. We illustrate the application of the above-described approach to a real-life problem. We study the publications for COVID 19 and how authors, departments (consisting of different authors), and papers (authors collaborated on papers) are interrelated. For this purpose, a small data have been collected and have been shown in Table 2 . Further, we take the perspective that an author is a node, a paper is a cluster of nodes, and a department is a hyperedge. Thus, an influential department is one that works on different influential papers. Further, it is often observed that nodes associated with a persuasive paper to gain from the association. We provide a means to compute the impact of such an association with a paper/s on individual nodes. In Tables 2 and 3 , paper-author and department-author details of ten collected paper are displayed. Here, each individual author is taken as simple node. Each paper may be written by more than one author. The collection of authors per paper is considered as cluster. Now, same department authors are connected by hyperedges. The traditional graph where two authors are connected if they share a common paper, is shown in Fig. 19 . The cluster hypergraph of assumed data is shown in Fig. 20 . Here, small black circles are simple nodes. Big circles are clusters and blue lines indicate the hyperedges. Now, effective degrees of simple nodes of the cluster hypergraphs are calculated in Table 4 . Figure 21a and b show the relative ranking using the currently accepted degree centrality measure and compare it to the proposed effective degree. We observe that the normalized effective degree values are bigger than those computed using the normalized degree. This, we believe, is due to the fact that since all values (effective degree) increase by some amount (from the base-the degree values), and the largest value has also increased similarly. This has the effect of increasing every value. If the highest effective degree value were significantly higher, we would have seen a reduction in the normalized value (or vice versa). We also see a change in the ranking of influential nodes. This is a significant difference. We find that when additional information is considered, the influence of the node increases. For example, an author that is associated with an influential paper would appear more influential that other authors who are not connected to such clusters. In this paper, some properties of cluster hypergraphs have been developed. The effective degree is defined and is taken as a replacement of degree as the effective degree is more convenient to represent as the effects of other persons who form a cluster/groups are considered separately along with direct impact. A comparison with the currently available approach has been mentioned for describing the limitations of these approaches, and this also serves as the motivation for developing the proposed method. Two realistic applications have been shown. For the second application, we collect data on COVID-19-related publications. In both applications, we find that our approach not only captures more information but also changes the influence ranking of the nodes. Computationally, our method has similar complexity as that of existing methods. Thus, the approach in the paper improves upon currently available approaches to addressing the problem of determining the influence of a node in cluster hypergraphs. Fárbung von Graphen deren sämtliche bzw. ungerade Krense starr sind Math Infection in hypergraphs Judicious partitions of 3-uniform hypergraphs characterization of cube-hypergraphs Study on centrality measures in social networks: a survey Distance preserving subgraphs of hypercubes Extremal k-edge-hamiltonian hypergraphs Hamiltonian chains in hypergraphs The composition of graphs A mathematical approach on representations of competitions: Competition cluster hypergraphs Concepts on coloring of cluster hypergraphs Sum numbers of hypertrees Competition hypergraphs Line hypergraphs Applying centrality measures to impact analysis: a co-authorship network analysis