Evolutionary Coincidence-Based Ontology Mapping Extraction

Vahed Qazvinian, Hassan Abolhassani, Seyed H. HAERI (Hossein), Babak Bagheri Hariri
Web Intelligent Lab, Department of Computer Engineering, Sharif University of Technology, and School of Computer Science, Institute for Studies in Theoretical Physics and Mathematics (IPM), Tehran, Iran
qazvinian@ce.sharif.edu, abolhassani@sharif.edu, shhaeri@ce.sharif.edu, hariri@ce.sharif.edu

Abstract

Ontology Matching is the process of selecting a good alignment across the entities of two (or more) ontologies. It can be viewed as a two-phase process: 1) applying a similarity measure to find the correspondence of each pair of entities from the two ontologies, and 2) extracting an optimal or near-optimal mapping. This paper focuses on the second phase and introduces our evolutionary approach to it. To do so, we need a mechanism to score different possible mappings. Our solution is a weighting mechanism named Coincidence-Based Weighting, as explained in the paper. On that basis, a Genetic Algorithm is introduced to create better mappings in successive iterations. We explain how we code a mapping, as well as our crossover and mutation functions. Evaluations of the algorithm are also presented and discussed.

1 INTRODUCTION

The Semantic Web is a rather new concept: it is intended to let software agents understand Web content and communicate through it, much as human beings do now. Traditional knowledge-based systems were centralized; the Semantic Web, in contrast, is distributed and heterogeneous. Indeed, according to (Mitra, Noy & Jaiswal 2003): "Information sources, even those from the same domain, are heterogeneous in nature". This heterogeneity has motivated the design of ontologies to ease agents' understanding and communication. However, another problem remains: the ontologies themselves may be heterogeneous. This happens when two ontologies try to express the same knowledge or concepts but use different languages or words (Euzenat & Valtchev 2004).

Ontology Alignment (OA) addresses this problem by introducing a (proper) mapping between the entities of two ontologies from two (different) domains. (Bouquet, Euzenat, Franconi, Serafini, Stamou & Tessaris 2004a) defines OA as:

... given two ontologies which describe each a set of discrete entities (which can be classes, properties, rules, predicates, etc.), find the relationships (e.g., equivalence or subsumption) holding between these entities.

Figure 1 shows a simplified OA framework. As the figure shows, to extract an alignment it is customary to first apply some measures (simple or complex) to obtain initial similarity values. In this respect, there is already a large body of research in the literature on lexical and structural measures suitable for OA. Given such similarity values, the next problem is how to form an ideal mapping. We refer to this problem as Mapping Extraction. The goal is to find a correspondence of entities between two ontologies such that the overall similarity value is maximal. It can therefore be viewed as an optimization problem, for which evolutionary approaches are a natural fit.

Figure 1: A simplified alignment framework

In this paper we introduce a genetic-based algorithm for the Mapping Extraction problem. First, we review the related works in section 2.
Section 3 then presents the graph-theoretical bases used throughout the paper. To provide a measure of how good a mapping is, the paper discusses coincidence-based weighting in section 4. Our evolutionary algorithm is explained in section 5, and evaluations are discussed in section 6. We provide conclusions and an outline of future work in section 7.

2 RELATED WORKS

Unfortunately, as stated in (Bouquet, Euzenat, Franconi, Serafini, Stamou & Tessaris 2004b), works on mapping extraction are not common. However, current research on ontology mapping and its applications spans a large number of fields, ranging from machine learning, concept lattices, and formal theories to heuristics and linguistics. There are similar works on matching graphs and trees (Hopcroft & Karp 1973, Papadimitriou & Steiglitz 1998), database schemas (Rahm & Bernstein 2001), and even on clustering compound objects with a machine learning technique (Bisson 1992). (Kalfoglou & Schorlemmer 2003) give a comprehensive review of the methods, approaches, and state of the art in ontology alignment.

Works related to our research fall into the following two categories:

• Ontology alignment weighting and similarity measures. These works focus on introducing new similarity measures between the concepts of two ontologies, and weight functions to evaluate an alignment between two ontologies.

• Ontology mapping extraction, in which researchers address the problem of alignment extraction and propose methods to find a (proper) alignment among many different candidates.

There are also some works which address both problems simultaneously. We give a quick review of each category in the following two subsections.

2.1 Similarity Measures and Ontology Alignment

There is a considerable amount of previous work on similarity measures and ontology alignment. Some standard metrics are acknowledged and defined, as in the CommonKADS methodology (Schreiber, de Hoog, Akkermans, Anjewierden, Shadbolt & de Velde 2000) or the OntoWeb EU thematic network (OntoWeb. A survey on ontology tools. 2002), which are partly endorsed by recognized bodies.

There has also been work on finding similarities of entities in two ontologies based on their structural standing. (Valtchev 1999) computes the dissimilarity of elements in a hierarchy based on their distances from the closest common parent. The Upward Cotopic distance is introduced by (Maedche & Zacharias 2002), who find the dissimilarity of entities in the hierarchies of ontologies. (Zhong, Zhu & Li 1995) introduce a measure to calculate the similarity of WordNet (wordnet.princeton.edu) concepts, i.e. a single hierarchy; the similarity is computed from the closest common parent and the distance of the two entities from the root. On the other hand, some methods aim at a trade-off between different features such as efficiency and quality, as in QOM (Ehrig & Staab 2003), and some integrate various similarity methods (Ehrig & Sure 2004).

Compound metrics make use of simple measures by combining them, hoping to improve the resulting mapping between two ontologies. One approach defines each measure as a dimension and computes the Minkowski distance of two objects (Euzenat, Barrasa, Bouquet & Bo 2004).
As introduced in (Euzenat et al. 2004), another approach is a weighted average of features, in which the weights can be calculated by a machine learning technique. For example, GLUE (Doan, Domingos & Halevy 2003) builds the similarity matrix by a machine learning approach. In APFEL (Ehrig, Staab & Sure 2005), the weight for each feature is calculated using decision trees; the user only has to provide some ontologies with known correct alignments, and the learned decision tree is then used for the aggregation and interpretation of the similarities. Abolhassani et al. (Abolhassani, Haeri & Hariri 2006) introduce a new method for compound measure creation without any need for the mapping extraction phase.

2.2 Mapping Extraction

Previous works do not specifically cover the problem of alignment extraction. A method for ontology alignment extraction is proposed by (Dieng & Hug 1998), which examines linguistic features to compare two ontologies on the basis of the IS-A relationship. In (Melnik, Garcia-Molina & Rahm 2002), the Stable Marriage problem (Gibbons 1985) is discussed as a way to extract a reasonable mapping. There are other approaches as well: a machine learning approach to the problem is discussed in (Doan, Madhavan, Domingos & Halevy 2003), and (Mitra et al. 2003) describe a probability-based model.

Staab et al. (Staab & Maedche 2002) focus on the structural and taxonomic comparison of two trees to extract an alignment, in which the dissimilarity of each two concepts is calculated based on their superclasses and subclasses. Stumme et al. (Stumme & Maedche 2001) use the shared instances of the two ontologies to be mapped; however, this work ignores the properties of classes.

Zhdanova et al. (Zhdanova & Shvaiko 2006) expand the notion of ontology matching to a community-driven approach that enables web communities to establish and reuse ontology mappings to achieve an adequate and timely domain representation.

(Johnson, Cohen, Baumgartner, Lu, Bada, Kester, Kim & Hunter 2006) model inter-ontology relationship detection as an information retrieval task, where a relationship is defined as any direct or indirect association between two ontological concepts.

Wang et al. (Wang & Gasser 2002) present a specific formalization and algorithm for the local interpretation of shared representations to build global semantic coherence for the distributed actions of individual agents, known as "Mutual Online Ontology Alignment".

LOM, as described in (Li 2004), is a semi-automatic lexicon-based ontology-mapping tool that supports a human mapping engineer with a first-cut comparison of ontological terms between the ontologies to be mapped, based on their lexical similarity and simple heuristic methods.

3 PRELIMINARIES AND NOTATIONS

In this section, we define the mathematical concepts used throughout this paper.

3.1 Basic Definitions

A graph Gi, by definition, consists of two sets V(Gi) and E(Gi), where V(Gi) is the set of vertices and E(Gi) is the set of edges. The size of a graph is |V(Gi)|, denoted by |Gi|. We assume that the labels assigned to nodes are chosen from a finite alphabet Σ. Let λ ∉ Σ be a null character, and Σλ = Σ ∪ {λ}.
3.2 Metric Space

Following (Rudin 1976), a set of points O together with a function δ is said to be a metric space if δ associates a real number δ(p, q), called the distance of p and q, with any pair of points p, q, such that:

∀x, y ∈ O, δ(x, y) ≥ 0 (positiveness)
∀x ∈ O, ∀y, z ∈ O, δ(x, x) ≤ δ(y, z) (minimality)
∀x, y ∈ O, δ(x, y) = δ(y, x) (symmetry)

Any function δ satisfying the above conditions is said to be a distance function or a metric. In fact, the distance of two concepts belonging to two different ontologies is described as the distance of their labels in a metric space, and this metric distance is usually given by a distance function as described above.

3.3 Typed Graph

An ontology Oi is, in this paper, considered as a typed graph Gi. A typed graph, as defined in (Haeri, Hariri & Abolhassani 2006), is denoted by Gi(V, E, T), where E is of type E : V × V → T. Members of T are all from Σλ. In such a graph, an edge e of type t between vertices vij and vik is denoted by e(vij, vik) : t. A homeomorphism from a typed graph G(V, E, T) to another typed graph G′(V′, E′, T) is a one-to-one correspondence between V and V′. In this paper, each ontology Oi is modeled by a typed graph Gi where the concepts of Oi are the nodes of Gi, and the relations/properties of Oi are the typed edges of the graph.

Figure 2: A sample alignment of two graphs G, G′

3.4 Edge Preservation

We call an edge e(v1j, v1k) : t ∈ E(Gi1) preserved under the mapping M if and only if there is an edge e(M(v1j), M(v1k)) : t ∈ E(Gi2). In other words, an edge e is preserved under a mapping M if and only if ∃e′ ∈ E(Gi2) : e′ = (M(v1j), M(v1k)) : t, in which case M(e) = e′; otherwise it is not preserved. The preservation of edges between corresponding nodes is the key to finding an ideal mapping: in an ideal alignment, most of the edges of one ontology are preserved in the second one.

3.5 Ontology Alignment (OA)

In this section we present our own understanding of a one-to-one alignment of two ontologies. A one-to-one alignment of two ontologies Oi1, Oi2 is denoted by M : Oi1 → Oi2 and is a one-to-one correspondence between the nodes of the two graphs (Gi1, Gi2) of Oi1, Oi2. (Ehrig & Sure 2004) defines the mapping function in the following way:

• M : Oi1 → Oi2
• ∀v ∈ Gi1 : M(v) = v′ if v′ ∈ Gi2 and δ(v, v′) < t, for t a threshold

Here v′ is the corresponding node of v under the mapping M. Figure 2 illustrates a sample alignment for two example ontologies G, G′.

3.6 Our Formulation of OA

We denote the correspondence from ontology Oi to Oj, described over typed graphs, i.e. from Gi to Gj, by M : Gi → Gj. It is defined as follows:

1. ∀vi ∈ Gi, vi corresponds to exactly one vertex vj in Gj (denoted by M(vi) = vj), or does not correspond to any vertex in Gj (denoted by M(vi) = null). Moreover, if v1, v2 ∈ Vi, v1 ≠ v2, M(v1) ≠ null, and M(v2) ≠ null, then M(v1) ≠ M(v2).

2. The correspondence of edges is determined by the correspondence of nodes: ∀ei = (vi1, vi2) : t ∈ E(Gi), if M(vi1) = vj1 ≠ null, M(vi2) = vj2 ≠ null, and ej = (vj1, vj2) : t ∈ E(Gj), then ei corresponds to ej, i.e. M(e(vi1, vi2)) = e(vj1, vj2); otherwise ei does not correspond to any edge in Gj (denoted by M(e(vi1, vi2)) = null).

3. Let M be a correspondence from Gi to Gj. We call M a map from Gi to Gj if ∀vi ∈ Vi, M(vi) ≠ null, i.e. M(vi) ∈ Vj.

4. Each map M from Gi to Gj has a weight, defined by the coincidence-based technique described in section 4.
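The definitions above translate directly into a small data model. The following Python sketch is our own illustration, not code from the paper; the names TypedGraph, is_preserved, and is_one_to_one are ours. It shows the typed-graph representation, the edge-preservation test of section 3.4, and the one-to-one condition of section 3.6:

```python
from typing import Dict, Optional, Set, Tuple

Edge = Tuple[str, str, str]  # (source node, target node, edge type t)

class TypedGraph:
    """An ontology as a typed graph: concepts are nodes, relations are
    typed edges (Section 3.3)."""
    def __init__(self, nodes: Set[str], edges: Set[Edge]):
        self.nodes = nodes
        self.edges = edges

def is_preserved(edge: Edge, M: Dict[str, Optional[str]],
                 target: TypedGraph) -> bool:
    """An edge e(v1, v2):t of G is preserved under M iff
    e(M(v1), M(v2)):t exists in G' (Section 3.4)."""
    v1, v2, t = edge
    m1, m2 = M.get(v1), M.get(v2)
    if m1 is None or m2 is None:
        return False
    return (m1, m2, t) in target.edges

def is_one_to_one(M: Dict[str, Optional[str]]) -> bool:
    """Non-null images must be pairwise distinct (condition 1, Section 3.6)."""
    images = [v for v in M.values() if v is not None]
    return len(images) == len(set(images))
```

The later sketches in this paper build on these definitions, representing a map as a plain Python dict from source to target concepts.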
The problem addressed in this paper is formulated as follows:

INPUT: Two ontologies, together with a matrix whose rows and columns stand for the concepts of the ontologies, and each cell of which gives the distance of the two concepts under a distance measure.

OUTPUT: A proper alignment.

In what follows, we explain a technique to score possible mappings of ontologies, called coincidence-based weighting, in the next section, and then use this weight function to extract a proper alignment with evolutionary approaches in section 5.

4 COINCIDENCE BASED WEIGHTING

In this section we introduce and discuss a new weighting model for an alignment, with which we will later design our genetic algorithm. The coincidence-based alignment weight function is discussed at length in (Haeri et al. 2006); here, we give an overview of it.

Before presenting the weight itself, let us discuss the underlying intuition. There is a set of properties that we believe any mapping should satisfy. Consider a mapping M between two ontologies with graphs Gi1, Gi2, two nodes v1j, v1k ∈ V(Gi1), and their matches M(v1j), M(v1k). The weighting system should yield a high weight if v1j is close to M(v1j), v1k is close to M(v1k), and the edge e = (v1j, v1k) : t ∈ E(Gi1) is preserved under M. This case is considered the most desirable one and should be given the highest value. An alternative is when the nodes are close but the edge is not preserved; here, a small negative point should be given, because the non-preserved edge interrupts the structural matching of the graphs. The farther any node is from its match, the lower the positive value of the mapping should be: if the edge is preserved, we give the mapping a low positive value, but when the edge is not preserved as well, the mapping is undesirable and we give it a negative point, since the nodes are far from their matches and the edge is missing too.

According to the above considerations, there are six different categories, shown graphically in Figure 3. (In the following explanations we assume that G, G′ are the graphs of two ontologies O, O′ to be aligned; a, b are concepts from G, and a′, b′ from G′.)

Figure 3: Different properties of mappings in a metric space. Dotted edges with type t′ show that there might be an edge of type t′, or no edge at all. Dashed arrowed lines show the mapping elements. (I) shows the first category, where the vertices (concepts) are close and the edge is preserved. (II) is the dual of category I, differing in that the edge is not preserved. (III) shows the case where the edge is preserved but only one of the endpoints of the two edges is close to its match. (IV) is the dual of category III, differing in that the edge is not preserved. (V) The edge is preserved, but neither endpoint is close to what it has been matched to. (VI) is the dual of category V, differing in that the edge is not preserved.

• Category I. a and a′ are very close (in terms of the distance function described before), b, b′ are close as well, and the edge between a, b is preserved under the matching. This category is of great importance, because the two edges coincide strongly. To clarify the point, suppose a and b are "means of communication" and "mail" respectively, and a′, b′ are "communication" and "email". The fact that there is an edge (e.g. rdfs:type) between both a, b and a′, b′ shows that the two edges coincide strongly, and that the two ontologies are describing the same world.
• Category II. In this category, the two peers of an edge are close to their matches: a is close to a′ and b is close to b′. The only difference from the previous category is that here the edge is not preserved. As described in (Haeri et al. 2006), consider e.g. the case where O describes Glazing Technology, whilst O′ is the ontology of a simple glass-manufacturing studio. Let a, b be "Glass" and "Frame", and a′, b′ the same respectively. Although δ(a, a′) and δ(b, b′) are both small, the non-preservation of the edge (a, b) is a negative point. The fact that the vertices coincide keeps us from penalizing this category much, because at least the concepts are close to their matches.

• Category III. In this category the edge is preserved, but only one of a or b is close to its match. This is good, but not as good as the previous categories. Consider two ontologies describing two different worlds: suppose O describes High-Tech Manufacturing while O′ describes a Supermarket. Let a, b be "Laptop Computer" and "Product" respectively, and a′, b′ be "Laptops" and "On Sale Item" respectively. The two ontologies describe two totally different domains, whilst a and a′ are close; it is as if the ontologies come close "from the side of a". We would like to give such a category a large weight, yet smaller than that of category I.

• Category IV. As category II can be considered the dual of category I, this category is the dual of category III: only one peer of the edge in O is close to what it is matched to, and the edge is not preserved under the matching. As is clear in Figure 3 (IV), the edge between a, b is not preserved, and b is far from b′. The only positive point of such a matching is that a and a′ are close. As an example, consider O describing a hotel's services, where a is "egg" and b is "omelette", and O′ describing a supermarket, where a′, b′ are "egg" and "shampoo" respectively. The matching which maps a to a′ and b to b′ is not desirable and is most probably a mistake. However, this mistake should not be penalized as much as the mistake in category VI.

• Category V. The last two categories describe the situation where neither peer of an edge is close to what it is matched to. Even though the edge might be preserved (as in category V), the two edges do not coincide at either endpoint. In other words, in these categories both a, a′ and b, b′ are far from one another, and the difference lies in the preservation of the edge. The fact that the vertices are not close to their matches is enough to make us indifferent about edge preservation: even if the edge is preserved, the two edges are not very coincident. Both cases are undesirable and should obtain low points. A clear example of category V is when a is "Plant", b is "Water", a′ is "Car", and b′ is "Fuel". Needless to say, this mapping is not a desirable one.
• Category VI. This case is even worse than category V: the edge is not preserved either. An example is when a, b are "mammal" and "elephant" in O, which describes a zoo, and a′, b′ are "glasses" and "frame" respectively in O′, which incidentally describes a glasses-manufacturing company. There is neither a similarity between the endpoints of the two edges, nor any preservation. The vertices in this category are mapped to concepts with which they have no semantic similarity.

According to the above cases, the following weight function is suggested:

w(M) = w0(M) − wl(M) − wr(M)

w0(M) = Σ_{(v1,v2):t ∈ E(G), (M(v1),M(v2)):t ∈ E(G′)} (f(v1) + f(v2))

wl(M) = Σ_{(v1,v2):t ∈ E(G), (M(v1),M(v2)):t ∉ E(G′)} (g(v1) + g(v2))

wr(M) = Σ_{(v1,v2):t ∉ E(G), (M(v1),M(v2)):t ∈ E(G′)} (g(v1) + g(v2))

The functions f and g, referred to as Normalization Functions (Haeri et al. 2006), are of the form:

f : R → R+
g : R → R+

f and g are related to the distance function. In fact, f should be a positive decreasing function, so that it shrinks as δ(v, M(v)) grows, reducing the positive point; g, on the other hand, should be a positive increasing function, growing with δ(v, M(v)) to increase the negative point for that match. Otherwise, w will misbehave in one of the above six categories. The normalization functions are defined by tuning the system; this is described again later.

According to (Haeri et al. 2006): "no [much] work is so far done on the problem of Ontology Alignment or Ontology Matching in which graph theoretic backbone of problem is scrutinized." With the use of graph theory and such a modeling, we believe there is a vast area for new work on the problem of ontology alignment. The coincidence measure explained in this section is a step in this direction, and we believe it can be used in various ways for the Mapping Extraction problem. Its sole usage in this paper is introduced in the next section.

5 GENETIC ALGORITHM (GA)

This section describes the developed genetic algorithm. Matching two general graphs with a polynomial-time algorithm is infeasible, because the problem in its general case is MAX SNP-hard (Arora, Lund, Motwani, Sudan & Szegedy 1992). A carefully designed random search algorithm is therefore a good idea, which leads us to genetic algorithms.

Genetic Algorithms are evolutionary algorithms which approach the final state by gradually improving the solution. Any problem to be solved by a GA must first be coded so that it can be easily handled by the different parts of the GA. Each coding of a solution forms an individual; the individuals stored and processed in each iteration form the population. The population improves in each step through crossovers and mutations, and the best individual is finally reported as the answer of the GA. In the following subsections we describe the different parts of our GA solution.

5.1 Coding a Mapping

To code a mapping we use hashmaps (Cormen, Leiserson, Rivest & Stein 2001), in which the keys are the concepts of one ontology and the entries are the concepts of the other. This data structure lets us easily manipulate a one-to-one alignment, with O(1) lookup of concepts. The entry for each key is the corresponding node of that key in the mapping; that is, if vi ∈ O is mapped to v′i ∈ O′, then in the hashmap h we have vi as a key and h(vi) = v′i.
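As an illustration (ours, not from the paper), the hashmap coding and the weight function above can be sketched in Python as follows. It reuses the TypedGraph class from the sketch in section 3, takes the distance matrix D of the problem input (D[v][v'] = δ(v, v')), and uses the normalization functions later tuned in Experiment 1 (section 6) as placeholders for f and g:

```python
import math

def f(delta: float) -> float:
    # decreasing positive normalization: f(v) = 1 / e^{delta(v, M(v))}
    return math.exp(-delta)

def g(delta: float) -> float:
    # increasing positive normalization: g(v) = 1 / e^{max(5, 15 - delta)}
    return math.exp(-max(5.0, 15.0 - delta))

def weight(M, G, Gp, D):
    """Coincidence-based weight w(M) = w0(M) - wl(M) - wr(M).
    M is the dict coding of Section 5.1, assumed to be a total map
    G -> G'; G and Gp are TypedGraphs; D is the input distance matrix."""
    d = lambda v: D[v][M[v]]                   # delta(v, M(v))
    w0 = wl = wr = 0.0
    for (v1, v2, t) in G.edges:
        if (M[v1], M[v2], t) in Gp.edges:      # edge preserved: reward
            w0 += f(d(v1)) + f(d(v2))
        else:                                  # edge lost under M: penalty
            wl += g(d(v1)) + g(d(v2))
    inv = {w: v for v, w in M.items()}         # well defined: M is one-to-one
    for (u1, u2, t) in Gp.edges:               # image edges with no source edge
        p1, p2 = inv.get(u1), inv.get(u2)
        if p1 is not None and p2 is not None and (p1, p2, t) not in G.edges:
            wr += g(d(p1)) + g(d(p2))
    return w0 - wl - wr
```

This function serves as the fitness of an individual in the GA sketches below.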
5.1.1 Pairs

In line with the coincidence-based weighting, we define a Pair as two concepts of one ontology between which there is a relation (i.e., there is an edge between them in the graph of that ontology). Figure 2 shows the alignment of two ontologies, in which (v1, v2), (v1, v3), (v3, v4), (v2, v4) are the pairs of G. A pair also has a weight according to the alignment it is involved in. More precisely, a pair is a function of the form

P : V × V × T → R

where V is the set of vertices of the ontology graph G and T is a set of labels in Σλ. An ontology in a matching thus has a limited number of pairs, and the weight of a pair depends on the alignment in which the ontology is involved. Let Gi1, Gi2 be the graphs of two aligned ontologies, let v1j, v1k ∈ V(Gi1), and assume the edge between v1j, v1k to be of type t, e1jk = e(v1j, v1k) : t. P(v1j, v1k, t) in the alignment of the two ontologies is given by:

P(v1j, v1k, t) =
  f(v1j) + f(v1k)       if e1jk is preserved
  −(g(v1j) + g(v1k))    if e1jk is not preserved
  −∞                    if e1jk ∉ E(Gi1)

For a couple of concepts which do not form a pair, the value of P is set to −∞. The definition of pairs is useful in the crossover functions, which try to improve the structural matching.

In the alignment M : Oi1 → Oi2 of two ontologies Oi1(Gi1), Oi2(Gi2), we also define the weight of a single concept v1j ∈ V(Gi1), where M(v1j) ∈ V(Gi2), as:

W(v1j) = Σ_{∀v ∈ Gi1 : e(v1j, v):t ∈ E(Gi1)} P(v1j, v, t)

5.1.2 An Example

To clarify the definition of pairs and their weights, we give an example of how to compute them. In Figure 2 we have:

P(v1, v2, t2) = f(v1) + f(v2)
P(v1, v3, t1) = f(v1) + f(v3)
P(v3, v4, t3) = −g(v3) − g(v4)
P(v2, v4, t4) = −g(v2) − g(v4)
P(v1, v4, ti) = P(v2, v3, ti) = −∞

W(v1) = (f(v1) + f(v2)) + (f(v1) + f(v3))
W(v2) = (f(v2) + f(v1)) − (g(v2) + g(v4))
W(v3) = (f(v3) + f(v1)) − (g(v3) + g(v4))
W(v4) = −(g(v4) + g(v3)) − (g(v4) + g(v2))

With these definitions in place, we can now describe the steps of our genetic algorithm.

5.2 Initialization

As with any other genetic algorithm, a primary population is needed. A population is made up of individuals, each of which is a solution to the problem (here, a mapping). The start population is initialized randomly, with an initial size of 1000 individuals. The ideal mapping can be reached more quickly if the initial individuals are built on the basis of concept labels: if v1j in Gi1 and v2j in Gi2 have the same label, then v1j corresponds to v2j in the initial mapping.

5.3 Selection

In each iteration, we sort the 1000 individuals according to their fitness as described in section 4 (the coincidence-based weight function), and select the 500 best individuals as the parents of the next step. From these 500 individuals, 1000 new individuals are created using the crossover and mutation functions (described below) and are sent to the next iteration.

Two crossover functions are designed, one based on pairs and the other based on solitary vertices. Before detailing them in the following two subsections, we first sketch the pair and node weights they rely on.
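The following small illustration (again ours; pair_weight and node_weight are hypothetical names) computes P and W on top of the earlier sketches, reusing f, g, TypedGraph, and the dict coding with its distance matrix D:

```python
NEG_INF = float("-inf")

def pair_weight(v1, v2, t, M, G, Gp, D):
    """P(v1, v2, t) of Section 5.1.1: positive if the pair's edge is
    preserved under M, negative if it is lost, -inf if (v1, v2):t is
    no edge of G at all."""
    if (v1, v2, t) not in G.edges:
        return NEG_INF
    d = lambda v: D[v][M[v]]
    if (M[v1], M[v2], t) in Gp.edges:
        return f(d(v1)) + f(d(v2))
    return -(g(d(v1)) + g(d(v2)))

def node_weight(v, M, G, Gp, D):
    """W(v): sum of P over all pairs of G containing v, as in the
    worked example of Section 5.1.2."""
    return sum(pair_weight(a, b, t, M, G, Gp, D)
               for (a, b, t) in G.edges if v in (a, b))
```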
5.4 Crossover I

In the first crossover, Crossover I, the pairs are the primary concern, and the best pairs from the parents are preserved in the offspring. For every single node in the first ontology graph, the pairs of that node are examined in the two mappings (i.e., the two parents), and the best pair, the one with the highest weight, is copied into the offspring. If a matched node in the ontology graph G′ is already assigned to some other node of G, the assignment is made to a random unassigned node instead.

Figure 4 shows two mappings as parents and the result of Crossover I. The pair (vi, vj, t) in parent 1 has greater weight than in parent 2. This follows from computing P(vi, vj, t) in both alignments: in parent 1 we have P(vi, vj, t) = f(vi) + f(vj), because the edge e(vi, vj) : t is preserved under M, but in parent 2 it is P(vi, vj, t) = −(g(vi) + g(vj)), because the edge e(vi, vj) : t is not preserved. So the pair in parent 1 is chosen for the offspring. Since f, g are positive functions, in the offspring we have M(vi) = v′i, M(vj) = v′j. Clearly, in this crossover the pairs in the offspring are not worse than those of the parents.

Figure 4: Crossover I. (a) part of parent 1 mapping; (b) part of parent 2 mapping; (c) part of offspring

5.5 Crossover II

In this crossover function, single nodes are compared according to their weights. As described before, the weight of a single node in a mapping is the sum of the weights of the pairs in which that node is included. Consider two parents over two ontology graphs Gi1, Gi2. To make an offspring from the two parents, for every node v1j in Gi1, the mapping with the larger W(v1j) is copied to the offspring. If M(v1j) in Gi2 is already assigned to some other node of Gi1, then v1j is put in a reserved list. At the end of the complete iteration over the nodes of Gi1, the nodes in the reserved list are randomly mapped to the unassigned nodes of Gi2. The random assignment is not done in the middle of the iteration, to prevent nodes of Gi2 from being assigned to random nodes when they could be assigned to better nodes later in the iteration. As an example, suppose v1m ∈ V(Gi1) should be mapped to v2m ∈ V(Gi2), and v2m is already mapped to by some node of Gi1. If at that moment we assigned v1m to some random node such as v2n ∈ V(Gi2), we might prevent a good mapping of v1n to v2n later in the iteration. So the random assignment is delayed until no more regular assignments are possible.

In Figure 5, two mappings between two ontologies O, O′ are shown, and we want to decide the match of vi ∈ V(G) in the offspring. In parent 1 we have:

W(vi) = P(vi, vj, t2) + P(vi, vk, t1) = (f(vi) + f(vj)) + (f(vi) + f(vk))

and in parent 2 we have:

W(vi) = P(vi, vj, t2) + P(vi, vk, t1) = −(g(vi) + g(vj)) + (f(vi) + f(vk))

Again, since f and g are positive functions, the value of W(vi) in parent 1 (Figure 5 (a)) is greater than in parent 2 (Figure 5 (b)). So, as shown in part (c) of the figure, the corresponding node of vi ∈ V(G) in the offspring is chosen from the mapping of parent 1, and is therefore v′i ∈ V(G′).

Figure 5: Crossover II. (a) part of parent 1 mapping; (b) part of parent 2 mapping; (c) part of offspring

This kind of crossover is reasonable because the mapping of a single node in the offspring is not worse than in either of the two parents; little by little, the mappings of nodes thus converge to ideal ones. A sketch of this crossover follows.
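The following Python sketch is illustrative, not the paper's implementation: crossover_ii follows the reserved-list description above, but details such as tie-breaking are our own choices, and it assumes G′ has at least as many nodes as G. It reuses node_weight and the dict coding from the earlier sketches:

```python
import random

def crossover_ii(p1, p2, G, Gp, D):
    """Crossover II: per node of G, keep the parent assignment with the
    larger node weight W; clashing nodes go to a reserved list and are
    randomly mapped to the still-unassigned nodes of G' at the end."""
    child, reserved, used = {}, [], set()
    for v in G.nodes:
        w1 = node_weight(v, p1, G, Gp, D)
        w2 = node_weight(v, p2, G, Gp, D)
        best = p1[v] if w1 >= w2 else p2[v]     # ties favour parent 1 here
        if best not in used:
            child[v] = best
            used.add(best)
        else:
            reserved.append(v)                  # resolve after the full pass
    free = list(Gp.nodes - used)                # needs |V(G')| >= |V(G)|
    random.shuffle(free)
    for v in reserved:                          # postponed random assignment
        child[v] = free.pop()
    return child
```

Crossover I can be written analogously, selecting per pair via pair_weight instead of per node via node_weight.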
5.6 Mutation

A proportion of the population is mutated with some probability, which differs across iterations. In the mutation of a mapping of two ontologies with graphs Gi1, Gi2, two random nodes of Gi1 are chosen and their matches in Gi2 are exchanged. Let v1j, v1k ∈ V(Gi1) be chosen randomly, and let M(v1j) = v2j ∈ V(Gi2), M(v1k) = v2k ∈ V(Gi2). The mutation simply swaps the match nodes of the selected ones, so the new mapping has M(v1j) = v2k and M(v1k) = v2j.

5.7 Continuation

One important issue with any genetic algorithm is how the crossover and mutation functions are employed to create the new population from the old one. In our solution, the two previously explained crossover functions are invoked on the ith and (i+1)th parents to create two offspring; the last parent is paired with the first one to produce the last two offspring.

Our parent population, as described before, contains 500 individuals. These individuals are sorted decreasingly by fitness, and the sorted array forms the parents of the current step. In each iteration, these 500 parents produce 1000 new individuals through the series of crossovers and mutations explained above. The resulting array of individuals is then sorted, and the best 500 of them are selected as the parents of the next step. Figure 6 shows this process.

Figure 6: Population generation in each step of the GA

5.8 End Condition

Strictly speaking, there is no natural end to the execution of an evolutionary algorithm, and specifically of a genetic algorithm: a user must stop the run when she finds the results satisfactory. To end the iterations of our GA, however, we use a convergence threshold: the sequential GA is continued until the best mapping among all individuals in the population has not improved for more than 15 steps. That mapping is reported as the answer to the problem, i.e. as the proper alignment, and the alignment process of the two ontologies is finalized. The overall loop is sketched below.
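Putting the pieces together, one possible rendering of sections 5.3, 5.7, and 5.8 is the following sketch (ours; run_ga, mutate, and the parameter names are illustrative, and whereas the paper pairs Crossover I with Crossover II per parent couple, this sketch reuses crossover_ii for both offspring for brevity):

```python
import random

def mutate(M):
    """Swap the matches of two random source nodes (Section 5.6)."""
    v1, v2 = random.sample(list(M), 2)
    M[v1], M[v2] = M[v2], M[v1]

def run_ga(population, G, Gp, D, mutate_prob=0.5, patience=15):
    """Sort by coincidence weight, keep the 500 best as parents, cross
    parent i with parent i+1 (the last wraps to the first), mutate part
    of the offspring, and stop once the best weight has not improved
    for more than `patience` generations."""
    fitness = lambda M: weight(M, G, Gp, D)
    best, stale = float("-inf"), 0
    while stale <= patience:
        parents = sorted(population, key=fitness, reverse=True)[:500]
        children = []
        for i, p in enumerate(parents):
            q = parents[(i + 1) % len(parents)]
            children.append(crossover_ii(p, q, G, Gp, D))
            children.append(crossover_ii(q, p, G, Gp, D))  # paper: Crossover I
        for M in children:
            if random.random() < mutate_prob:              # mutation policy
                mutate(M)                                  # varies per experiment
        population = children
        top = max(map(fitness, population))
        best, stale = (top, 0) if top > best else (best, stale + 1)
    return max(population, key=fitness)
```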
6 EVALUATION

To evaluate our genetic algorithm, we designed three kinds of experiments. In the first experiment, we tested the genetic algorithm with diverse mutation probabilities. In the second experiment, we aligned two identical ontologies (in fact, we aligned one ontology with itself); this experiment let us examine the efficiency and accuracy of the algorithm when the two ontologies are very similar. To verify our contribution, we added a third experiment, in which we used a naive local-search alignment method as a baseline.

We already discussed similarity measures in section 2; they are an important issue in the ontology alignment and extraction process. Several similarity measures have been proposed. For example, (Euzenat & Valtchev 2004) developed a similarity metric between concepts in OWL ontologies which is a weighted combination of the similarities of various features of OWL concept definitions: labels, domains, ranges of properties, restrictions on properties, types, and IS-A relationships. (Mitra, Wiederhold & Decker n.d.) and (Mitra, Wiederhold & Decker 2001) use a combination of interactively specified mappings and heuristics to propose proper mappings. In this paper we used the Levenshtein (Levenshtein 1966) string-based similarity measure for the concepts, where the dissimilarity of any two concepts (from the two ontologies) is their Levenshtein distance. This measure suits our experiments, since most of the heterogeneity in the ontologies of our test collection comes from lexical differences. It should be noted, however, that our coincidence-based weighting, and hence our GA solution, is independent of any particular similarity measure: to apply it to another alignment task, one can select a measure suitable for that domain and feed the resulting similarity values to our algorithm. (As cited in the related works section, there are good works on selecting an appropriate measure for a domain which can be applied here.)

6.1 Limitations

The coincidence-based weighting has an innovative idea behind it; however, there are essential practical limitations to applying the method. The most important limitation is the available ontologies and test collections: most of them do not have a large taxonomic structure, so the method does not have enough merit for them. In our search for a suitable test collection, however, we found the "Tourism" ontologies (Tourism ontology FOAM n.d.), with approximately 340 classes and concepts, to be a good one.

6.2 Characteristics of the Experiments

For the tourism ontologies, an ideal alignment is included in the test collection. We use this information to calculate the precision (Baeza-Yates & Ribeiro-Neto 1999) of each experiment. Consider a mapping M : O → O′. To find the accuracy of the method and calculate the precision, we need a model alignment M′ formed and extracted by an expert. Let G, G′ be the graphs of O, O′ respectively, and let S be the set of vertices vi ∈ V(G) for which M(vi) = v′i = M′(vi); in other words:

∀vi ∈ V(G) : vi ∈ S ⇔ M(vi) = M′(vi)

The precision of the alignment M is then given by:

Precision(M) = |S| / |V(G)|

A code sketch of these quantities is given next, followed by the descriptions of the individual experiments.
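For concreteness, here is how the Levenshtein dissimilarity and the precision measure above can be computed (our sketch; the argument M_gold stands for the expert alignment M′, and the source concepts are assumed to be the keys of M):

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (Levenshtein 1966), the
    concept-label dissimilarity used in these experiments."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def precision(M, M_gold):
    """Precision(M) = |S| / |V(G)|, where S is the set of source concepts
    on which M agrees with the expert alignment M' (Section 6.2)."""
    return sum(M[v] == M_gold.get(v) for v in M) / len(M)
```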
• Experiment 1. As discussed previously, in this experiment we aligned "TourismA" with "TourismB". This is the main experiment checking the efficiency of our coincidence-based genetic algorithm. In this experiment, from each two parents we made two offspring, one with Crossover I and the other with Crossover II; from the four individuals involved (the parents and the offspring) we chose the two best as the children of this amalgamation. The normalization functions are:

f(v) = 1 / e^{δ(v, M(v))}
g(v) = 1 / e^{max(5, 15 − δ(v, M(v)))}

These functions satisfy the characteristics expected of f, g (explained in section 4): f is decreasing in δ, and g is increasing. Exponential functions were chosen so that f and g would have close, comparable values. In fact, these functions match the discussion of positive and negative points for the different categories of the coincidence-based weight.

– Experiment 1.1. After the 1000 individuals are created, we mutate the lower half of them (with the mutation function described before) with probability 0.7.

– Experiment 1.2. After the 1000 individuals are created, we mutate the lower half of them with probability 0.3.

– Experiment 1.3. Mutation is applied to every one of the 1000 children with probability 0.5.

• Experiment 2. In this experiment we align "TourismA" with itself. This serves as a verification of how well the genetic algorithm works when the two ontologies are more similar and coincide more. The generation scheme is similar to the previous experiment, mutation is applied to the lower half of the individuals with probability 0.5, and the normalization functions are the same as in Experiment 1.

• Experiment 3. This experiment provides a baseline comparison of the GA method with a naive local-search method. Here we implemented a naive hill-climbing local search. For the starting point we built an initial alignment in which every concept of "TourismA" is matched with a concept of "TourismB": for a node vj in TourismA, if there was a node v′j in TourismB with the label label(vj), we matched vj with v′j; otherwise we mapped vj to a random node of TourismB. After that, in each iteration the best single change (mutation) was performed to improve the weight of the alignment. We iterated the method for up to 1000 steps, stopping once the results had not improved for more than 15 steps. (A sketch of this baseline appears at the end of this section.)

6.3 Results

Figure 7: Precision results of the experiments

Figure 7 shows the results of the above experiments according to the precision measure. As shown, with identical graphs the genetic algorithm finds the best mapping, and the precision is 1. In the other experiments, however, the result is slightly inaccurate in comparison with the ideal alignment, and the precision is approximately 0.8.

In our experiments, the distance threshold discussed in section 3.5 is set to 4. We chose this number by experience; there are other ways to determine it, such as machine learning techniques. It is also possible to add one level of iteration to our genetic algorithm that tests different values of the distance threshold and selects the one yielding the best total mapping weight. Since the label of a concept in a typical ontology normally does not exceed the length of an English (or other natural language) word, the upper limit of the distance threshold can be taken to be less than 20; so by executing our genetic algorithm with different distance threshold values (e.g. 1, 2, ..., 20) we may find the best threshold value for an alignment task.

We also investigated the iteration count and the convergence of the result of the genetic algorithm for Experiment 1. The results are shown in Figure 8.

Figure 8: Convergence of results in the GA

Figure 8 shows the weight of the best alignment, i.e. the best individual in the population, reached in each iteration of the run of the genetic algorithm, for all three sub-experiments: 1.1 in red, 1.2 in blue, and 1.3 in green. As is clear, in all experiments the genetic algorithm converges within roughly 35 iterations; the average precision at convergence is 79%. This means that the GA will not find a better solution if it runs for more iterations. Usually in such circumstances all individuals in the population converge to a single individual and become equal to each other, so that combining them in crossovers no longer improves the solution and merely reproduces the same individuals. Moreover, mutation will in most cases produce a worse individual rather than a better one. The reason is clear: the optimal states (individuals) are far fewer than the worse states, and mutation is completely random. Mutation is mostly useful when the population is trapped in a local maximum, as it moves the search to another point of the space.
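The naive baseline of Experiment 3 can be sketched as follows (our illustration; hill_climb and its parameters are hypothetical names, node identifiers are assumed to double as labels, a swap is accepted on first improvement rather than by scanning for the best single change, and G′ is assumed to have at least as many nodes as G):

```python
import random

def hill_climb(G, Gp, D, max_steps=1000, patience=15):
    """Naive local search: label-based initial alignment, then greedy
    single swaps until the weight stalls (Experiment 3)."""
    shared = G.nodes & Gp.nodes                 # equal-label concepts
    rest = list(Gp.nodes - shared)
    random.shuffle(rest)
    M = {v: (v if v in shared else rest.pop()) for v in G.nodes}
    best, stale = weight(M, G, Gp, D), 0
    for _ in range(max_steps):
        if stale > patience:
            break
        v1, v2 = random.sample(list(M), 2)      # candidate swap
        M[v1], M[v2] = M[v2], M[v1]
        w = weight(M, G, Gp, D)
        if w > best:
            best, stale = w, 0
        else:
            M[v1], M[v2] = M[v2], M[v1]         # undo: no improvement
            stale += 1
    return M
```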
It should be noted that, in this area, the main criterion for the goodness of an approach is the precision of the extracted mapping; reaching a solution with optimal time complexity is beyond the current state of the research.

7 CONCLUSION AND FUTURE WORK

In this paper, we discussed our method for Mapping Extraction in an Ontology Alignment framework. First, we introduced our modeling of an ontology as a typed graph. Then our coincidence-based measure for weighting different candidate alignments was discussed. This experience gives us the impression that our solution, which draws on graph theory (and was contrived for ontology alignment only), can be extended to a wider range of problems: graph matching and similarity measures of graphs have many applications in machine vision, pattern recognition, bio-informatics, etc.

We defined two crossover functions based on the coincidence-based measure. They are crafted so as to ensure that new generations are not worse than previous ones. In our experiments, our GA solution converges very rapidly, for example after approximately 40 iterations, which is better, by orders of magnitude, than a naive "candidate mapping generation and test" approach. This number can be reduced even further by choosing a biased initial population, where labels are used to choose better initial mappings.

There are also some weaknesses in genetic algorithms. One is the dependency of the results on the initial population. A more significant weakness appears when the two ontologies are sparse graphs or, even worse, forest-like. In these cases the domain of the crossover is not a smooth one, and small changes to an individual in a crossover or mutation may take it to a very distant point. The reason for this anomaly is that in sparse graphs or forests the number of disconnected vertices is larger, and therefore a small change in the alignment maps a node to a vertex which, most probably, will not form a coincident mapping. Roughly speaking, the more connected and taxonomic the two ontologies are, the better the results of the genetic-algorithmic coincidence-based extraction will be.

In continuation of this research, work is now being done on tree-like ontologies (which seem to be the most common form). Once we can align tree ontologies, we can model ontologies as trees and align the resulting trees. We are also interested in extending our theory and mechanisms to match ontologies based on their various graphical shapes, properties of subgraphs, etc.

8 Acknowledgments

This research was in part supported by a grant from IPM (No. CS1385-4-01).

References

Abolhassani, H., Haeri, S. H. & Hariri, B. (2006), 'On ontology alignment experiments', Webology 3(3).

Arora, S., Lund, C., Motwani, R., Sudan, M. & Szegedy, M. (1992), Proof verification and hardness of approximation problems, in '33rd IEEE Symposium on the Foundations of Computer Science (FOCS)', pp. 14–23.

Baeza-Yates, R. & Ribeiro-Neto, B. (1999), Modern Information Retrieval, Addison Wesley.

Bisson, G. (1992), Learning in FOL with a similarity measure, in 'Proceedings of the 10th American Association for Artificial Intelligence conference, San Jose (CA US)', pp. 82–87.

Bouquet, P., Euzenat, J., Franconi, E., Serafini, L., Stamou, G. & Tessaris, S. (2004a), Specification of a common framework for characterizing alignment, deliverable 2.2.1, Knowledge Web NoE.
Bouquet, P., Euzenat, J., Franconi, E., Serafini, L., Stamou, G. & Tessaris, S. (2004b), 'Specification of a common framework for characterizing alignments', Knowledge Web NoE deliverable 2.2.1.

Cormen, T. H., Leiserson, C. E., Rivest, R. L. & Stein, C. (2001), Introduction to Algorithms, second edn, MIT Press.

Dieng, R. & Hug, S. (1998), Comparison of "personal ontologies" represented through conceptual graphs, in '13th ECAI, Brighton (UK)', pp. 341–345.

Doan, A., Domingos, P. & Halevy, A. (2003), 'Learning to match the schemas of data sources: A multistrategy approach', Machine Learning 50(3), 279–301.

Doan, A., Madhavan, J., Domingos, P. & Halevy, A. (2003), Ontology matching: A machine learning approach, in 'Handbook on Ontologies in Information Systems', Springer-Verlag.

Ehrig, M. & Staab, S. (2003), QOM — quick ontology mapping, in 'Proc. ISWC-2003'.

Ehrig, M., Staab, S. & Sure, Y. (2005), Bootstrapping ontology alignment methods with APFEL, in 'Proceedings of the 4th International Semantic Web Conference (ISWC-2005)', Lecture Notes in Computer Science, pp. 186–200.

Ehrig, M. & Sure, Y. (2004), Ontology mapping — an integrated approach, in '1st European Semantic Web Symposium (ESWS)', pp. 76–91.

Euzenat, J., Barrasa, J., Bouquet, P. & Bo, J. (2004), 'State of the art on ontology alignment', Knowledge Web NoE deliverable.

Euzenat, J. & Valtchev, P. (2004), Similarity-based ontology alignment in OWL-Lite, in 'The 16th European Conference on Artificial Intelligence (ECAI)'.

Euzenat, J. & Valtchev, P. (2004), An integrative proximity measure for ontology alignment, in 'Proc. 3rd International Semantic Web Conference (ISWC 2004)'.

Gibbons, A. (1985), Algorithmic Graph Theory, Cambridge University Press.

Haeri, S. H., Hariri, B. & Abolhassani, H. (2006), Coincidence-based refinement of ontology alignment, in '3rd Intl. Conference on Soft Computing and Intelligent Systems'.

Hopcroft, J. & Karp, R. (1973), 'An n^{5/2} algorithm for maximum matchings in bipartite graphs', SIAM Journal on Computing 2(4), 225–231.

Johnson, H. L., Cohen, K. B., Baumgartner, W. A., Lu, Z., Bada, M., Kester, T., Kim, H. & Hunter, L. (2006), 'Evaluation of lexical methods for detecting relationships between concepts from multiple ontologies', Pac Symp Biocomput, pp. 28–39.

Kalfoglou, Y. & Schorlemmer, M. (2003), 'Ontology mapping: the state of the art', The Knowledge Engineering Review 18, 1–31.

Levenshtein, V. I. (1966), 'Binary codes capable of correcting deletions, insertions and reversals', Sov. Phys. Dokl. 6, 707–710.

Li, J. (2004), LOM: A lexicon-based ontology mapping tool, in 'Proceedings of the Performance Metrics for Intelligent Systems (PerMIS 04) / Information Interpretation and Integration Conference (I3CON), Gaithersburg, MD'.

Maedche, A. & Zacharias, V. (2002), Clustering ontology-based metadata in the semantic web, in 'Proceedings of the 13th ECML and 6th PKDD'.

Melnik, S., Garcia-Molina, H. & Rahm, E. (2002), Similarity flooding: a versatile graph matching algorithm, in 'Proc. 18th International Conference on Data Engineering (ICDE), San Jose (CA US)', pp. 117–128.

Mitra, P., Noy, N. F. & Jaiswal, A. R. (2003), OMEN: A probabilistic ontology mapping tool, in '4th International Semantic Web Conference (ISWC 2005)', Vol. 3729, pp. 537–547.

Mitra, P., Wiederhold, G. & Decker, S. (2001), A scalable framework for the interoperation of information sources, in 'The 1st International Semantic Web Working Symposium (SWWS 01), Stanford University, Stanford, CA'.
Mitra, P., Wiederhold, G. & Decker, S. (n.d.), 'The PROMPT suite: Interactive tools for ontology merging and mapping', International Journal of Human-Computer Studies.

OntoWeb (2002), 'A survey on ontology tools', available online: www.ontoweb.org/deliverable.htm.

Papadimitriou, C. H. & Steiglitz, K. (1998), Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall.

Rahm, E. & Bernstein, P. (2001), 'A survey of approaches to automatic schema matching', VLDB Journal 10(4), 334–350.

Rudin, W. (1976), Principles of Mathematical Analysis, third edn, McGraw-Hill, New York.

Schreiber, G., de Hoog, R., Akkermans, H., Anjewierden, A., Shadbolt, N. & de Velde, W. V. (2000), Knowledge Engineering and Management, MIT Press.

Staab, S. & Maedche, A. (2002), 'Measuring similarity between ontologies', Lecture Notes in Artificial Intelligence (2473), 251–263.

Stumme, G. & Maedche, A. (2001), FCA-Merge: bottom-up merging of ontologies, in 'Proc. 17th IJCAI, Seattle (WA US)', pp. 225–230.

Tourism ontology FOAM (n.d.), available online: http://www.aifb.uni-karlsruhe.de/WBS/meh/foam/ontologies/.

Valtchev, P. (1999), Construction automatique de taxonomies pour l'aide à la représentation de connaissances par objets, Ph.D. dissertation, Université Grenoble.

Wang, J. & Gasser, L. (2002), Mutual online ontology alignment, in 'Proc. of the AAMAS 2002 Workshop'.

Zhdanova, A. V. & Shvaiko, P. (2006), Community-driven ontology matching, in 'Proc. of ESWC', pp. 34–49.

Zhong, J., Zhu, H. & Li, Y. (1995), Using information content to evaluate semantic similarity in a taxonomy, in 'Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI-95)'.