key: cord-0057790-vw8huwv1
authors: Zhang, Yuliu; Zhao, Bo
title: On the Evolution of Knowledge Graph Abroad and Its Application in Intelligent Education
date: 2020-10-30
journal: e-Learning, e-Education, and Online Training
DOI: 10.1007/978-3-030-63955-6_2
sha: ca47106c78200b8dd7a97e077c59a1d4c6f33374
doc_id: 57790
cord_uid: vw8huwv1

Education increasingly depends on intelligent learning support services in today's information age, and intelligent education supported by artificial intelligence (AI) has drawn growing attention. In particular, with the development of AI, the knowledge graph (KG) has become a key driver of educational development and innovation. This study therefore reviews the evolution of KG research hotspots based on literature from the Web of Science. We then explore the development of intelligent education from three aspects: educational KG, cognitive diagnosis, and personalized education.

According to the NMC's Horizon 2019 report, AI will be key to transforming higher education in the next four to five years [1]. KG is an interdisciplinary research field; with the rapid development of AI, it will support "AI + education" in the future and push education towards "intelligent education". Most steps in realizing intelligent education, such as semantic search of knowledge, personalized learning recommendation systems, and the construction of learner portraits, rely on large-scale KGs. From Google's intelligent search engine to today's big data analytics, chatbots, personalized education, and recommendation systems, all are closely related to KG. Therefore, a comprehensive analysis of the evolution of KG is of great significance for promoting innovation in education. From the perspective of bibliometrics, this study explores the development of intelligent education by analyzing the evolution of KG.

In this study, we retrieved records matching "Knowledge Graph" or "Knowledge Visualization" from the Web of Science database for the period 1996-2019. After eliminating literature irrelevant to KG, 1210 publications remained and were used as the sample data.

Two tools were adopted: the scientific citation visualization and analysis software CiteSpace and the Bibliographic Items Co-occurrence Matrix Builder (BICOMB). Through citation analysis and keyword co-occurrence analysis, we discuss the distribution of hot topics and the frontier trends of KG.

The number of publications over time is an important indicator of the development trend of a field. In BICOMB, the "Year" field is used to count the annual number of publications, and the resulting distribution curve shows a clear trend (see Fig. 1). The fitted exponential function of the curve ($y = 2.043e^{0.1321x}$) and its coefficient of determination ($R^2 = 0.8286$) show that the number of related publications has increased rapidly in recent years.

Research on KG abroad began in the 1950s. In 1955, Garfield [2] pioneered the idea of using citation indexes to retrieve literature. In 1965, Price [3] described the citation network among scientific papers. In 1968, J. R. Quillian proposed the semantic network. In 1977, Feigenbaum put forward the concept of knowledge engineering and held that knowledge engineering is an application of AI; ontology was then introduced and became a method of representing real-world knowledge. In 1998, Tim Berners-Lee proposed the concept of the Semantic Web. On this basis, the W3C further proposed specifications for describing knowledge on the World Wide Web. In 2006, Tim Berners-Lee [4] put forward the concept of linked data to generate large-scale networks based on the relations between entities.

In 2012, Google proposed the "Knowledge Graph", which reflects the enhanced application of large-scale KGs in intelligent search engines. Typical large-scale knowledge bases, such as DBpedia [5], are built from the structured knowledge in Wikipedia. With the further development of AI, the field has entered the stage of cognitive intelligence, which promotes the development of KG construction technologies.
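As a rough reading of the exponential fit reported above for the annual publication counts, and assuming that x counts years (the source does not state the unit or origin of x), the exponent 0.1321 implies the following year-on-year growth factor and doubling time:

```latex
\frac{y(x+1)}{y(x)} = e^{0.1321} \approx 1.141,
\qquad
T_{\text{double}} = \frac{\ln 2}{0.1321} \approx 5.25
```

Under that assumption, the fitted curve corresponds to growth of about 14% per year, or a doubling of the annual output roughly every five years.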

The keywords of a publication are extracted from its full text and can objectively reflect the research hotspots of a field [6]. CiteSpace is used to generate the co-occurrence graph of keywords (see Fig. 2).

Fig. 2. Co-occurrence of high-frequency keywords.
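The counting step behind a keyword co-occurrence graph can be sketched as follows. This is a minimal illustration, not the procedure CiteSpace or BICOMB actually runs: the function name `cooccurrence` and the three toy records are hypothetical, and keywords are simply lower-cased before counting.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records):
    """Count keyword frequencies and pairwise co-occurrences.

    records: iterable of keyword lists, one list per publication.
    Returns (freq, pairs), where pairs is keyed by alphabetically
    ordered keyword tuples.
    """
    freq = Counter()
    pairs = Counter()
    for keywords in records:
        # Normalise and de-duplicate the keywords of one publication.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        freq.update(unique)
        # Every unordered pair in the same publication co-occurs once.
        pairs.update(combinations(unique, 2))
    return freq, pairs


# Hypothetical toy records, for illustration only.
toy_records = [
    ["Knowledge Graph", "Ontology", "Semantic Web"],
    ["Knowledge Graph", "Link Prediction"],
    ["Ontology", "Semantic Web"],
]
freq, pairs = cooccurrence(toy_records)
print(freq.most_common(3))
print(pairs.most_common(3))
```

The pair counts play the role of link weights in the co-occurrence graph, and the frequencies determine node size.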

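The Pathfinder Network algorithm is used to simplify the network and highlight its structural features; the degree of simplification is mainly determined by the parameter r. A link is kept only if it satisfies the triangle inequality defined in Eq. (1):

$$ w_{ij} \le \left( \sum_{k} w_{n_k n_{k+1}}^{r} \right)^{1/r} \tag{1} $$

where $w_{ij}$ is the weight of the link between node i and node j, $w_{n_k n_{k+1}}$ is the weight of the link between nodes $n_k$ and $n_{k+1}$ on an alternative path from i to j, and r is the parameter of the Minkowski distance.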
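The pruning rule based on the triangle inequality can be sketched in a few lines. This is a minimal illustration rather than CiteSpace's implementation: it assumes that link weights are distances (smaller means more closely related), that paths of any length are considered, and that missing links are encoded as `math.inf`; the function names are hypothetical.

```python
import math


def _combine(a, b, r):
    """Length of two joined path segments under the Minkowski r-metric."""
    if math.isinf(a) or math.isinf(b):
        return math.inf
    if math.isinf(r):
        return max(a, b)
    return (a ** r + b ** r) ** (1.0 / r)


def pathfinder_links(weights, r=math.inf):
    """Return the links (i, j), i < j, that survive Pathfinder pruning.

    weights: symmetric n x n matrix of distances; math.inf means no link.
    A link survives if no alternative path is shorter under the r-metric.
    """
    n = len(weights)
    dist = [list(row) for row in weights]
    # Floyd-Warshall style relaxation with the r-metric path length.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i == j or k == i or k == j:
                    continue
                via_k = _combine(dist[i][k], dist[k][j], r)
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    kept = []
    for i in range(n):
        for j in range(i + 1, n):
            w = weights[i][j]
            # Keep the link only if it does not violate the triangle inequality.
            if math.isfinite(w) and w <= dist[i][j] * (1 + 1e-9):
                kept.append((i, j))
    return kept


# Hypothetical toy network: the direct link 0-2 is longer than the detour via 1.
toy = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 1.0],
    [3.0, 1.0, 0.0],
]
print(pathfinder_links(toy))
```

Larger values of r prune more aggressively: with r set to infinity a path is only as long as its longest link, so more direct links lose out to detours.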

Besides "Knowledge Graph" itself, the top 10 high-frequency keywords are: Ontology (52), Knowledge Visualization (52), Visualization (40), Semantic Web (33), Knowledge Graph Embedding (28), Knowledge Representation (21), Neural Network (16), Link Prediction (16), DBpedia (15), and Recommender System (14). The word frequencies show that research focuses on the key technologies of KG construction, such as knowledge extraction, knowledge representation, knowledge reasoning, and knowledge graph completion.

Cluster analysis of keywords can clearly reveal the hot topics in the field. Clusters were extracted from the keywords of the cited literature using the log-likelihood ratio (LLR) algorithm, computed as Eq. (2):

$$ \mathrm{LLR} = \log \frac{p(C_j \mid V_{ij})}{p(\bar{C}_j \mid V_{ij})} \tag{2} $$

where LLR is the log-likelihood ratio of the word $W_i$ with respect to category $C_j$, and the vector $V_{ij}(\alpha, \beta, \gamma)$ is composed of the frequency ($\alpha$), concentration ($\beta$), and dispersion ($\gamma$) of the word $W_i$. The vector $V_{ij}$ is used to determine whether $W_i$ can serve as a feature word of category $C_j$. $p(C_j \mid V_{ij})$ and $p(\bar{C}_j \mid V_{ij})$ are the density functions of the category $C_j$ and its complement $\bar{C}_j$.
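To make the log-likelihood ratio concrete, consider a purely hypothetical word whose two densities are 0.8 for the category and 0.2 for its complement. The numbers are illustrative, not taken from the data, and the natural logarithm is assumed:

```latex
p(C_j \mid V_{ij}) = 0.8,\quad p(\bar{C}_j \mid V_{ij}) = 0.2
\;\Rightarrow\;
\mathrm{LLR} = \ln\frac{0.8}{0.2} = \ln 4 \approx 1.386 > 0
```

A positive value means the evidence favours the category over its complement, so such a word would be a candidate feature word; equal densities give LLR = 0.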

The keyword clustering graph is shown in Fig. 3. With Modularity Q = 0.55 and Mean Silhouette = 0.7309, the cluster structure is significant and convincing.
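For readers unfamiliar with the modularity score, the sketch below computes it for an unweighted, undirected graph. It assumes that the reported Q is the standard Newman modularity, which the source does not state explicitly; the function name and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an unweighted, undirected graph.

    edges: list of (u, v) pairs, each undirected link listed once.
    community: dict mapping every node to its cluster label.
    Q = sum over clusters of (internal links / m) - (degree sum / 2m)^2.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no links")
    internal = defaultdict(int)    # links with both ends in the cluster
    degree_sum = defaultdict(int)  # total degree of the cluster's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical toy graph: two triangles joined by a single bridge link.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(round(modularity(toy_edges, toy_clusters), 4))
```

Putting all nodes into a single cluster gives Q = 0, so higher values indicate a partition with more within-cluster links than expected by chance.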

(1) Knowledge visualization. Knowledge visualization is a graphical method for constructing and transmitting complex knowledge. It uses scene visualization, relationship visualization, and process visualization to create and transfer knowledge, in order to deepen students' "memory" of knowledge and promote learners' cognitive processing. Visual expression has a strong "integrity" characteristic and gives people a "global consciousness", which helps learners better understand and learn the knowledge.

(2) Knowledge representation. Knowledge representation is the study of how to represent knowledge of the objective world in a form that is easy for computers to recognize and understand. RDF triples are most often used to describe the relationships between entities. In recent years, with the development of deep learning, knowledge representation learning based on entities, concepts, and relationships has become mainstream [7]. Moreover, the fusion of cross-media elements and spatio-temporal dimensions in knowledge representation is a trend for future research.

(3) Deep learning. Deep learning (DL) is a kind of machine learning based on deep neural networks, which uses statistics to model specific real-world problems and uses the trained models to solve similar problems in the field. It is a semi-theoretical, semi-empirical approach with flexible expressive power. DL is derived from artificial neural networks; typical architectures include convolutional neural networks (CNN) and deep belief nets (DBN). A CNN is a well-known DL architecture inspired by the natural visual perception mechanism of living creatures [8]. DBNs are probabilistic generative models composed of multiple layers of stochastic latent variables [9]. In addition, Google's open-source TensorFlow framework can be used to implement DL systems; it supports CNN, RNN, and LSTM models and also provides a web-based tool that presents the real-time characteristics of the entire network graphically.

(4) Neural networks. Neural networks, often called artificial neural networks (ANN), are based on the basic principles of biological neural networks and are rooted in neuroscience, statistics, mathematics, and computer science. A neural network simulates the thinking of the human brain and can exhibit human-like understanding and judgment, which is a further extension of traditional logic calculations. As an indispensable part of machine learning, neural networks have been widely used in computer vision, decision optimization, image segmentation, cognitive science, and so on. At present, convolutional neural networks (CNN) and recurrent neural networks (RNN) are relatively mature. A CNN is a neural network especially suitable for computer vision applications [10], and an RNN is a neural network for processing sequence data [11]. Toolkits such as TensorFlow and scikit-learn (sklearn) have been used to apply neural network models to practical problems.
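The triple-based representation mentioned under knowledge representation can be illustrated with plain Python tuples. This is only a sketch of the subject-predicate-object pattern, not an RDF serialization or a real triple store; the `Triple` type, the `match` helper, and the example entities and relations are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One (subject, predicate, object) statement."""
    subject: str
    predicate: str
    obj: str


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every field that is not None."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Hypothetical educational facts, for illustration only.
TRIPLES = [
    Triple("Fraction", "prerequisiteOf", "Ratio"),
    Triple("Ratio", "prerequisiteOf", "Proportion"),
    Triple("Fraction", "partOf", "Arithmetic"),
]

# Which concepts does "Fraction" unlock?
print([t.obj for t in match(TRIPLES, subject="Fraction", predicate="prerequisiteOf")])
```

Leaving a field as `None` acts as a wildcard, which is the same idea as a variable in a triple pattern query.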

Nowadays, the construction of educational KGs is regarded as the key to intelligent education. An educational KG can provide students with a variety of knowledge services, such as knowledge query and personalized learning paths. When constructing a KG for a specific domain, it is necessary to consult domain experts to customize the schema of the domain KG. However, researchers mainly focus on codified explicit knowledge, and most do not consider tacit knowledge. Therefore, a comprehensive platform for constructing both generic and domain KGs is urgently needed. Some KG construction platforms have already been built according to the relevant methodologies. For example, Baidu has built a K-12 education KG to realize personalized learning and provide intelligent services for students.

In terms of education and teaching, intelligent education faces the problem of students' cognitive overload. According to Cognitive Load Theory, effective learning can happen when the cognitive load is kept within the capacity of working memory [12]. The key questions are how to reduce the cognitive load and how to set the gradient of students' learning. Cognitive diagnosis can be used to find and create students' learning paths, and thereby reduce their cognitive load effectively.

On the other hand, according to Constructivist Learning Theory, learning is the process by which learners actively select and process external information based on their own experience [13]. To avoid the "lost in learning" problem, it is urgent to build a cognitive graph based on KG through students' cognitive diagnosis. The first step is "cognition", which makes a cognitive diagnosis for each student. The second step is "reasoning", which provides a personalized learning path that promotes students' effective learning.
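One simple way to picture the "cognition, then reasoning" idea is to treat the diagnosis as a set of mastered concepts and derive a path from a prerequisite graph. The sketch below is an assumption-laden illustration, not the method of any system named here: the prerequisite map, the mastered set, and the function `learning_path` are hypothetical, and a topological sort stands in for the reasoning step.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def learning_path(prerequisites, mastered, target):
    """Order the unmastered concepts a student needs to reach `target`.

    prerequisites: dict mapping a concept to the set of concepts it requires.
    mastered: set of concepts the diagnosis marks as already mastered.
    Returns a list in which every concept follows its prerequisites.
    """
    # Walk back from the target, collecting concepts not yet mastered.
    needed = set()
    stack = [target]
    while stack:
        concept = stack.pop()
        if concept in mastered or concept in needed:
            continue
        needed.add(concept)
        stack.extend(prerequisites.get(concept, ()))
    # Keep only prerequisite links between the concepts still to be learned.
    subgraph = {
        c: {p for p in prerequisites.get(c, ()) if p in needed} for c in needed
    }
    return list(TopologicalSorter(subgraph).static_order())


# Hypothetical prerequisite graph and diagnosis result.
PREREQUISITES = {
    "Proportion": {"Ratio"},
    "Ratio": {"Fraction"},
    "Fraction": {"Division"},
    "Division": set(),
}
print(learning_path(PREREQUISITES, mastered={"Division"}, target="Proportion"))
```

Because mastered concepts are skipped, the path contains only what the student still has to learn, which is one way a diagnosis could limit the material presented at once. `TopologicalSorter` raises `CycleError` if the prerequisite graph contains a cycle.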

With the development of technologies such as the Internet and AI, traditional educational ideas about students' effective learning have been both inherited and surpassed. David Paul Ausubel held that if learning is to be valuable, it should be as meaningful as possible [14]. Therefore, the development of personalized education should also aim at intentional, active, authentic, constructive, and cooperative learning.

The International Conference on AI & Education in May 2019 called for creating a fair, open, and personalized education environment through the deep integration of AI and education. In particular, with the arrival of the 5G era, 5G wireless communication technology makes it possible to transmit unstructured data, which provides a more three-dimensional digital environment for intelligent education [15]. In the future, with the progress of KG and 5G, intelligent education will become the main mode of online learning.

[1] EDUCAUSE Horizon Report: 2019 Higher Education Edition

[2] Citation indexes for science: a new dimension in documentation through association of ideas

[3] Networks of scientific papers

[5] DBpedia: a nucleus for a web of open data

[6] Research on the teaching quality of colleges and universities in China: track, hotspots and future trend - a CiteSpace visualization based on fourteen core journals of higher education

[7] Knowledge representation learning: a review

[8] Recent advances in convolutional neural networks. Pattern Recognition

[9] Deep belief networks

[10] Visualizing and understanding convolutional networks

[11] Speech recognition with deep recurrent neural networks

[12] Cognitive load theory: instructional implications of the interaction between information structures and cognitive architecture

[13] An epistemological glance at the constructivist approach: constructivist learning in Dewey, Piaget, and Montessori

[14] Educational psychology: a cognitive view

[15] How 5G will Shape Innovation and Security: A Primer

Acknowledgments. The research is supported by a National Natural Science Fund project (No. 61967015) and the Undergraduate Education and Teaching Reform Project in Colleges and Universities of Yunnan (JG2018060).