From data mining to knowledge mining: Application to intelligent agents

Amine Chemchem, Habiba Drias
USTHB-LRIA, BP 32 El Alia Bab Ezzouar, Algiers, Algeria

Article history: Available online 10 September 2014

Keywords: Knowledge mining; Induction rules; Classification; Clustering; Cognitive agent

Abstract

Over the last decade, the computing world has been flooded by a huge wave of data. Data mining tasks were introduced to tackle this problem by extracting interesting knowledge. The recent emergence of some data mining techniques has also produced many interesting induction rules. It is therefore judicious to process these induction rules in order to extract new strong patterns called meta-rules. This work explores this concept by proposing a new framework for induction rule clustering and classification. The approach adapts the k-means and k-NN algorithms to mine induction rules, using newly designed similarity measures and gravity center computations. The developed module has been implemented in the core of a cognitive agent in order to speed up its reasoning. This new architecture, called the Miner Intelligent Agent (MIA), is tested and evaluated on four public benchmarks containing 25,000 rules, and is then compared to the classical architecture. As expected, the MIA clearly outperforms the classical cognitive agent.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Nowadays, induction rules have become an inseparable pattern of artificial intelligence, serving as the basis for many disciplines such as agent technology, data mining, and knowledge discovery. This paper is about how to extend data mining techniques to induction rules in order to extract meta-rules. There are many data mining tasks, for instance clustering, classification, association rule mining, regression, and prediction.
In this work we are interested in the first two tasks, which are used in many applications (image processing, intrusion detection, etc.) and can be solved by different algorithms (k-means, HCA, fuzzy c-means, etc. for clustering; k-NN, SVM, ID3, etc. for classification). K-NN and k-means are among the top ten data mining algorithms (Wu et al., 2008). They are extended here to induction rules by introducing new versions of the similarity measure and the gravity center computation. The resulting algorithms, called K-NN-IR and K-means-IR, are developed and demonstrated on a public large-scale benchmark including 25,000 induction rules. The whole idea behind this work is to improve the reasoning process by integrating a knowledge mining module into today's intelligent agent in order to speed up the reasoning engine.

The rest of this paper is organized as follows. Section 2 gives a short history of data mining. Section 3 summarizes related works. Section 4 presents the induction rule representation, followed by mathematical preliminaries. Section 5 describes the suggested algorithms, followed by the definition of a new architecture for the intelligent agent. Section 7 presents experimental results compared with previously proposed algorithms. Finally, we conclude with some remarks and directions for future work.

2. Data mining overview

The generation of models from large amounts of data is not a recent phenomenon. The Egyptian Pharaoh Amasis organized a census of the population in the fifth century BC (Rocchi, 2003). It was in the seventeenth century that people began to analyze data to find common characteristics. In 1662, John Graunt published his book "Natural and Political Observations Made upon the Bills of Mortality", in which he analyzed mortality in London and tried to predict the appearances of the bubonic plague.
In 1763, Thomas Bayes showed that one can determine not only probabilities from observations derived from experience, but also the parameters of these probabilities. Legendre published in 1805 an essay on the least squares method for comparing a set of data with a mathematical model. From 1919 to 1925, Ronald Fisher developed the analysis of variance as a tool for medical statistical inference.

Corresponding authors: aminechemchem@gmail.com (A. Chemchem), hdrias@usthb.dz (H. Drias).
Expert Systems with Applications 42 (2015) 1436–1445. http://dx.doi.org/10.1016/j.eswa.2014.08.024

The 1950s saw the advent of computer technology and computer calculation. Methods and techniques emerged such as segmentation, neural networks and genetic algorithms, followed in the 1960s by decision trees and the method of mobile centers; these techniques allowed researchers to exploit the data and discover more accurate models. The advent of the microcomputer stimulated research, and statistical analyses became more numerous and precise. The term "data mining" had a negative connotation in the early 1960s, expressing contempt for statisticians' practice of searching for correlations without prior assumptions. It fell into oblivion until Rakesh Agrawal employed it again in the 1980s, when research began on databases with a volume of 1 MB.
The concept of data mining made its appearance, according to Pal (2007), when the IJCAI (International Joint Conference on Artificial Intelligence) conferences took place in 1989. Then, in the 1990s, came machine learning techniques such as SVM in 1998, complementing the tools of data analysis. At the turn of the century, companies like Amazon began using these tools to offer customers products likely to interest them. Today, there are many data mining tasks, such as supervised and unsupervised classification, association rule mining, prediction and regression.

2.1. Supervised classification

Classification of a collection consists of dividing the items that make up the collection into categories or classes (Kotsiantis, 2007; Jain, Murty, & Flynn, 1999). In the context of data mining, classification is done using a model that is built on historical data. The goal of predictive classification is to accurately predict the target class for each record in new data, that is, data that is not in the historical data. A classification task begins with build data (also known as training data) for which the target values (or class assignments) are known. Different classification algorithms use different techniques for finding relations between the predictor attributes' values and the target attribute's values in the build data. K Nearest Neighbor (K-NN for short) is one of those algorithms that are very simple to understand; furthermore, it works remarkably well in practice, especially for anomaly detection (Liao & Vemuri, 2002) and text categorization (Guo, Wang, Bell, Bi, & Greer, 2006). It is also surprisingly versatile, with applications ranging from vision to proteins to computational geometry to graphs. With the K-NN algorithm we can obtain satisfactory results; in addition, its basic principle is very simple and easy to implement. It might also surprise many to know that K-NN is one of the top 10 data mining algorithms.
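To make the principle concrete, here is a minimal generic K-NN sketch on numeric feature vectors. It is only an illustration: the paper's K-NN-IR variant replaces the Euclidean distance below with a similarity measure designed for induction rules, and all function names here are ours, not the authors'.

```python
from collections import Counter
import math

def euclidean(a, b):
    # Straight-line distance between two numeric feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train, new_point, k=3):
    """Assign new_point the most frequent class among its k nearest
    training examples. `train` is a list of (features, label) pairs."""
    # 1. Rank every training example by its distance to the new point.
    ranked = sorted(train, key=lambda item: euclidean(item[0], new_point))
    # 2. Keep the k nearest neighbors.
    neighbors = ranked[:k]
    # 3. Majority vote over the neighbors' labels.
    labels = [label for _, label in neighbors]
    return Counter(labels).most_common(1)[0][0]

train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.8), "B")]
print(knn_classify(train, (1.1, 0.9), k=3))  # -> A
```

The same three steps (distance computation, neighbor extraction, majority vote) carry over unchanged when the distance is replaced by a rule-to-rule similarity.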
K-NN is a non-parametric learning algorithm; it is used when the data set does not follow a defined model (Gaussian mixtures, linearly separable data, etc.). The K-NN algorithm can be explained as follows: first, training data that are already classified are considered; then, to classify a new data point, we compute the similarity distance between this new point and all training data. After that, the k nearest neighbors are extracted. Finally, the new point is assigned to the most frequent class among these neighbors.

2.2. Clustering data technique

Data clustering consists of putting homogeneous data into the same group or class while dispatching heterogeneous data into different groups. In the literature, there exist different ways to group data; the two principal ones are hierarchical and partitioning clustering. In hierarchical clustering, the clusters are nested inside one another. This category of clustering is used when data can be separated at different levels. HCA is the best-known hierarchical algorithm: it starts by putting each instance in its own cluster, then computes the dissimilarities between all pairs of instances in order to merge the clusters with the lowest distance. This process is repeated until a single cluster remains (Steinbach, Ertöz, & Kumar, 2004; Han, Kamber, & Pei, 2006). Partitioning clustering, in contrast, divides the data into separate clusters. K-means is one of the simplest pure partitioning learning algorithms that solves the well-known clustering problem (Han et al., 2006; MacQueen et al., 1967). The procedure follows a simple and easy way to classify a given data set into a certain number of clusters (say, k clusters) fixed initially. The main idea is to define k gravity centers, one for each cluster. The centroids should be placed carefully, because the clustering result depends on their location.
In order to optimize the quality of the outcome, it is judicious to place them as far away from each other as possible. The next step is to take each point belonging to the data set and associate it with the nearest centroid. When no point is pending, the first step is completed and an early grouping is done. At this point, we recalculate the k new centroids of the clusters resulting from the previous step, and iterate the process. The iteration stops when no more changes of the clusters are observed, in other words when the centroids no longer move.

3. Related works

Our interest in this study revolves around two main subjects: scalable cognitive agents, and knowledge mining in general, which involves induction rules mining. As for the first subject, we found very few papers with ideas about the notion of a scalable cognitive agent, such as Cao, Gorodetsky, and Mitkas (2009), and nothing about the paradigm we would like to cover in this article. However, we notice that biologists and psychologists are showing interest in the study of the scalable brain (Eliasmith, 2013). As for the second topic, the literature offers a large spectrum of detailed research on knowledge mining. Mining knowledge, including simple data and other patterns, has been examined intensively over the last decade. In the following, we discuss some knowledge mining works. Many works address mining association rules to obtain meta-rules whose purpose is to reduce the large number of discovered rules. The CLOSET algorithm was proposed in Strehl, Gupta, and Ghosh (1999) as a new efficient method for mining closed itemsets. CLOSET uses a novel frequent pattern tree (FP-tree) structure, which is a compressed representation of all the transactions in the database. Moreover, it uses a recursive divide-and-conquer and database projection approach to mine long patterns.
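The k-means procedure described in Section 2.2 (initialize k centroids, assign each point to its nearest centroid, recompute centroids, repeat until they stop moving) can be sketched as follows. This is a plain 2-D sketch with names of our own choosing, not the paper's K-means-IR, which swaps the squared Euclidean distance and the mean for rule-specific similarity and gravity-center computations.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on 2-D points: assign each point to its nearest
    centroid, move centroids to cluster means, repeat to convergence."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: (p[0] - centroids[c][0]) ** 2
                                + (p[1] - centroids[c][1]) ** 2)
            clusters[i].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:  # convergence: centroids no longer move
            break
        centroids = new_centroids
    return centroids, clusters

points = [(0.0, 0.0), (0.2, 0.1), (9.0, 9.0), (9.1, 8.9)]
centroids, clusters = kmeans(points, k=2)
```

On this toy data the two well-separated pairs end up in separate clusters regardless of which points seed the centroids.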
Another solution for reducing the number of rules was introduced by Hahsler and Chelluboina (2011), who used an itemset–tidset search tree with the aim of generating a small, non-redundant rule set. To this end, the authors first found the minimal generators of closed itemsets, and then generated non-redundant association rules using two closed itemsets. A new algorithm to group rules via hierarchical clustering was developed in Berrado and Runger (2007) to visualize the large number of rules. The clustering of rules is done by defining a new distance, called dJaccard, that relates the number of items shared by the two rules to the number of unique items. Saneifar, Bringay, Laurent, and Teisseire (2008) were interested in discovering sets of data. In their paper, they developed a new similarity measure between two rules and extended the k-means algorithm to cluster them. In the literature, some works on induction rules analysis have been proposed: in Poongothai and Sathiyabama (2012b), the authors developed a new algorithm to select the interesting induction rules from all the rules discovered in web mining.
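One plausible reading of the dJaccard distance mentioned above, treating each rule as a set of items, is the standard Jaccard-style ratio of shared items to all unique items; Berrado and Runger's exact formulation should be checked in the original paper, and the item strings below are purely illustrative.

```python
def djaccard_distance(rule_a, rule_b):
    """Jaccard-style distance between two rules viewed as item sets:
    1 minus the ratio of shared items to all unique items."""
    a, b = set(rule_a), set(rule_b)
    shared = a & b   # items appearing in both rules
    unique = a | b   # all distinct items across both rules
    return 1.0 - len(shared) / len(unique)

r1 = {"age>30", "income=high", "buys=yes"}
r2 = {"age>30", "income=low", "buys=yes"}
print(djaccard_distance(r1, r2))  # 2 shared / 4 unique -> 0.5
```

A distance of 0 means the rules contain exactly the same items, and 1 means they share none, which is what a rule-clustering algorithm needs from a dissimilarity measure.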