key: cord-1015136-j3wcnnh7
authors: El Aouifi, Houssam; El Hajji, Mohamed; Es-Saady, Youssef; Douzi, Hassan
title: Predicting learner’s performance through video sequences viewing behavior analysis using educational data-mining
date: 2021-05-03
journal: Educ Inf Technol (Dordr)
DOI: 10.1007/s10639-021-10512-4
sha: ca50c80b18d183f931991004cdcaf9a25c4a1d6f
doc_id: 1015136
cord_uid: j3wcnnh7

This paper analyzes how learners interact with the pedagogical sequences of educational videos and how this interaction affects their performance. In this study, the video courses are segmented into several pedagogical sequences. We do not focus on the type of clicks made by learners; instead, we concentrate on the pedagogical sequences in which those clicks were made. We interpret the path followed by a learner watching an educational video, and the way they navigate the pedagogical sequences of that video, in order to predict whether the learner will pass or fail the video course. Learners' video clicks are collected and classified. We applied educational data mining techniques, using the K-nearest Neighbours (kNN) and Multilayer Perceptron (MLP) algorithms, to predict learners' performance. The classification results are acceptable: the kNN classifier achieves the best results, with an average accuracy of 65.07%. The experimental results indicate that learners' performance can be predicted, and we observe a correlation between video sequence viewing behavior and learning performance. This method may help instructors understand the way learners watch educational videos. It can be used for early detection of deviations in learners' video viewing behavior, allowing the instructor to provide well-timed, effective guidance.

Due to the COVID-19 pandemic and the quarantine measures applied by many countries, schools and educational institutes have adopted distance learning. Consequently, the learning process is evolving and being enriched by the introduction of video. Distance learning, and video-based learning in particular, is becoming widely adopted by teachers in many flipped, blended and online classes. The aim of obtaining the best possible results has created a need for effective methods and techniques for monitoring learners. This context creates traces of billions of interactions with videos (Giannakos et al. 2014).

Recently, the use of video as a learning resource has drawn attention to the need to analyze viewing behavior in order to improve the effectiveness of video lectures, predict learners' performance, and improve the learning process in general (Mongy 2007). On the one hand, video viewing behavior analytics can provide an important advantage in the learning process by helping to understand how learners use videos (Ozan and Ozarslan 2016). Relevant studies have been carried out, such as the analysis of video viewing behavior in flipped classrooms (Beatty et al. 2017; Dazo et al. 2016), learners' viewing engagement with in-video quizzes (Kovacs 2016), and the identification of learning styles (Dissanayake et al. 2018). On the other hand, predicting learners' performance can support both tutors and e-learning systems (Mimis et al. 2019). It has become an emerging research field owing to the large volume and variety of educational data. Several works have been conducted in this research area, based on diverse factors and aspects, using educational data mining (EDM). EDM exploits statistical, machine learning, and data-mining (DM) algorithms over different types of educational data in order to study educational questions (Minaei-Bidgoli et al. 2003; Kotsiantis and Pintelas 2005; Golding and Donaldson 2006; Romero and Ventura 2010; Abdous and Yen 2010; Huang and Fang 2013; Kabakchieva 2013).

Moreover, several studies have emphasized the significance of learner video viewing behavior as a core feature for performance prediction modeling, using different data and factors, such as the number of clicks performed (Brinton and Chiang 2015; Giannakos et al. 2015; Kleftodimos and Evangelidis 2014; Lemay and Doleck 2020; Lu et al. 2018), learner demographics, forum activities and learning behavior (Qiu et al. 2016), frequency of video viewing (Lemay and Doleck 2019, 2020) and click sequences (Yu et al. 2019). However, those studies have ignored click behaviour with respect to the pedagogical sequences, which can be a very important feature for improving video viewing behavior analysis.

In this study, we took advantage of educational data mining methods to study learners' engagement with the pedagogical sequences of an educational video. We focused on interpreting the path followed by learners and the way they navigate those pedagogical sequences. In order to predict whether a learner passes or fails a video course, we examine whether there is any relationship between learners' performance and the way they watch and navigate the pedagogical sequences of a video course.

The rest of this paper is organized as follows. Section 2 presents related works. The methodology is described in Section 3. Results are reported and discussed in Section 4. Section 5 concludes the paper.

As the number of learners watching videos on Web-based systems increases, more and more interactions have the potential to be gathered and analyzed (Giannakos et al. 2014). Students view video courses in various ways. Many students view the whole video in a single go, many watch it again after a first viewing, some select and view a sequence of the video several times, and others skip from one segment to another (de Boer 2013). Many works have studied performance prediction based on learners' video viewing behavior using educational data mining.

Data mining is the discovery of interesting, unexpected or valuable structures in large datasets (Hand 2007). It comprises several algorithms and techniques for finding hidden, valid, and potentially useful patterns. Data mining techniques are classified into two categories: supervised learning and unsupervised learning. In supervised learning, the training data includes both the input and the desired results. In unsupervised learning, the model is not provided with the correct results during training (Donalek 2011). Several works in the literature have used data mining to analyse students' video viewing behaviour (Kleftodimos and Evangelidis 2014; Giannakos et al. 2015; Brinton and Chiang 2015; Qiu et al. 2016; Lu et al. 2018; Lemay and Doleck 2019, 2020).

Kleftodimos and Evangelidis (2014) used a clustering approach to find groups of learners with similar indicators regarding their engagement with the video. The analysis showed no association between clusters and learners' performance (final grades).

Giannakos et al. (2015) presented a video learning analysis system (VLAS), which is a video analytics application. The authors used data traces produced by learners who interact with VLAS, including their history of video click navigation, to learn about their attitudes as well as their learning outcomes. The study showed a correspondence between the level of cognition/reflection of each question and the number of clicks made by the learner. However, the number of students who participated in the experiment was small (11 students).

Brinton and Chiang (2015) used a Support Vector Machine classification algorithm to predict whether a student will answer a question correctly at the first attempt, using clickstream information and social learning networks. They concluded that video clickstream events can be used as learning features to improve prediction quality.

Qiu et al. (2016) proposed a Latent Dynamic Factor Graph Model. Various features were used (students' demographics, learning activity patterns in course forums, video clickstreams and assessment grades) in order to model learning behavior and to predict assignment performance and certificate earning in MOOCs. The proposed model outperforms several alternative methods in predicting students' performance on assignments and course certificates.

In Lu et al. (2018), learning analytics and educational big data approaches were applied to a blended calculus course with the objective of finding the moment at which students' academic performance could be predicted. In this work, features such as video-viewing behaviors, out-of-class practice behaviors, homework and quiz scores, and after-school tutoring were included. The authors concluded that the final performance can be predicted more accurately once one-third of the semester has elapsed.

In Lemay and Doleck (2019), several classifiers (Logistic, SMO, Naïve Bayes, J48, JRIP, IBK, Random Forest, and WekaDeepLearning4J) are used to assess the relation between students' video watching behavior and course grades. In this work, features such as rewinds, fast forwards, pauses and plays, fractional and total amounts played, paused, playback rate, and the number of videos viewed per week were all included. The authors concluded that the frequency of video viewing per week is a better predictor than individual viewing features such as plays, pauses, seeking, and rate changes.

In Yu et al. (2019), several classifiers (K-nearest neighbor, Support Vector Machine, and Artificial Neural Network) are applied to click records of MOOC videos in order to predict students' learning outcomes from their learning behaviors. The feature sequence of the viewing behavior is built using the n-gram approach. This study showed a correlation between video viewing behavior and learning outcomes.

In Lemay and Doleck (2020), the authors evaluate the predictive and explanatory significance of ten features of video viewing using several learning techniques (Logistic, SMO, Naïve Bayes, J48, JRIP, IBK, Random Forest, and WekaDeepLearning4J) to predict students' test performance on video quizzes. They concluded that the number of videos viewed per week was responsible for the majority of the variance in results, and they also found that a model with eight features had high accuracy.

Unlike those works, this study uses click behaviour with respect to the pedagogical sequences as the main feature for video viewing behavior analysis.

In this section, we present the methodology followed for predicting learner's performance through video sequences viewing behavior analysis.

As shown in Fig. 1, following the work of Romero and Ventura (2007) on data mining in e-learning systems, the method for this work had four distinct phases: data collection, data preprocessing, data mining and data interpretation (Romero et al. 2008).

The data set used in this study was obtained from the click data of students enrolled in a computer engineering course (professional license) in the virtual learning environment of the polydisciplinary faculty of Taroudant. The number of learners in this study was N=66. Four video courses on the C++ language were presented to learners. The video courses were delivered via the Moodle platform (Rice 2006), in which we integrated the Vidtrack plug-in, a simple activity plugin for Moodle that records video events. We made some improvements to the Vidtrack plug-in's functionality, such as adding the possibility of segmenting a video into several sequences and extending the seek click so that we can specify whether it is a forward or a backward jump. The details of the video courses are given in Table 1. A test consisting of multiple-choice questions was proposed for each video course to assess the learners and gather their final grades. Students can be grouped by their final grades in several ways. In this work, we chose to categorize students with one of two class labels: 'Pass' for grades above or equal to 5.0, and 'Fail' for grades less than 5.0, as shown in Table 2.
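The labelling rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `label_grade` and the sample grades are hypothetical, and only the 5.0 threshold comes from the text.

```python
PASS_THRESHOLD = 5.0  # threshold stated in the text

def label_grade(grade: float) -> str:
    """Return 'Pass' for grades >= 5.0 and 'Fail' otherwise."""
    return "Pass" if grade >= PASS_THRESHOLD else "Fail"

# Hypothetical grades, used only to illustrate the rule.
grades = [7.5, 5.0, 4.9, 2.0]
labels = [label_grade(g) for g in grades]
print(labels)
```

Note that a grade of exactly 5.0 falls in the 'Pass' class under this rule.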

The students' final grades distribution is shown in Fig. 2 . The learners' results differ from one video to another. We noted that the number of learners who successfully passed the quiz tests exceeds the number of learners who failed the quiz tests in three video courses.

Data preprocessing transforms the raw data into a format that can be processed more easily and effectively. The data sample selected in the preceding step was preprocessed in order to clean and standardize variable types, formats, and content. We then generated new variables based on transformations and combinations of the original ones. Data stored in various tables in the Moodle log database were merged into a single set. Only the three attributes required for the data mining process, taken from three tables, were selected: Id, Sequences, and Grade. Table 3 presents the selected attributes and their descriptions. Data were gathered from learners' clicks through the video course as well as their grades. The recorded clicks are: play, pause, jump forward, jump backward, and end. The total number of recorded clicks is N=7423. The Sequences field contains learners' interactions (clicks) with video courses. The recorded clicks were transformed according to the pedagogical sequences in which those clicks were made. Then, we represented them as sequences (e.g., Fig. 3).

Some students dropped the course after watching a couple of videos, so some of the final grades for certain video courses were missing.

In this work, a pedagogical video is organized as a network of associatively connected fragments based on the linear content hierarchy of the main video. The segments of the video in our case represent the pedagogical sequences, so the number of sequences of the video equals the number of segments. The segmentation of the video, that is, the determination of the pedagogical sequences, is done manually by the teacher of the video course. As shown in Fig. 4, the start and end times of each pedagogical sequence are defined first. Then, the structure of the pedagogical sequences in the video is classified by the teacher in order to present the video course in a way that suits learners' needs.
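To make the transformation from raw clicks to pedagogical sequences concrete, the following Python sketch maps each click timestamp to the segment that contains it. It is an illustration under stated assumptions, not the plug-in's actual code: the segment boundaries, the video duration, the `SeqN` labels and the `(action, time)` click format are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical start times (in seconds) of four pedagogical sequences in a
# 300-second video; in the study these boundaries are set by the teacher.
SEGMENT_STARTS = [0, 60, 150, 240]
VIDEO_DURATION = 300

def sequence_of(time_s: float) -> int:
    """Return the 1-based index of the sequence that contains time_s."""
    if not 0 <= time_s <= VIDEO_DURATION:
        raise ValueError("click time lies outside the video")
    # bisect_right counts how many segment starts are <= time_s.
    return bisect_right(SEGMENT_STARTS, time_s)

def clicks_to_path(clicks):
    """Replace each click time with the label of its pedagogical sequence."""
    return [(action, f"Seq{sequence_of(t)}") for action, t in clicks]

# Hypothetical click log: (action, position in the video in seconds).
clicks = [("play", 0), ("pause", 75), ("jump forward", 200),
          ("jump backward", 30), ("end", 300)]
path = clicks_to_path(clicks)
print(path)
```

In this sketch a click exactly on a boundary is assigned to the sequence that starts there; the study does not state how such cases are handled.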

In this study the data mining process was applied to predict learner's performance.

We investigated the impact of K-nearest neighbours and Multilayer Perceptron (MLP) algorithms for data analysis.

The K-nearest neighbours (K-NN) algorithm is a method for classifying objects based on the closest training examples in the feature space. An object is classified by a majority vote of its neighbors. K-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. It is a very effective algorithm for a variety of problem domains, including text categorization (Yang and Liu 1999). Most K-NN classifiers use simple Euclidean distances to measure the dissimilarities between examples represented as vector inputs (Weinberger et al. 2006). K is a positive integer, typically small. If K = 1, then the object is simply assigned to the class of its nearest neighbor (Yu et al. 2011). The best choice of K depends upon the data; generally, larger values of K reduce the effect of noise on the classification, but make boundaries between classes less distinct.
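The majority-vote rule with Euclidean distance can be sketched as follows. This is a generic illustration rather than the WEKA implementation used in the study; the function name `knn_predict` and the two-dimensional training vectors are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training examples.

    train is a list of (feature_vector, label) pairs; distances are Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical numeric feature vectors, for illustration only.
train = [((1.0, 1.0), "Pass"), ((1.2, 0.8), "Pass"), ((0.9, 1.1), "Pass"),
         ((4.0, 4.2), "Fail"), ((4.1, 3.9), "Fail")]

print(knn_predict(train, (4.0, 4.0), k=1))  # nearest neighbour only
print(knn_predict(train, (4.0, 4.0), k=5))  # vote over the whole training set
```

With k=1 the query takes the class of its single nearest neighbour ('Fail'), while with k=5 all five training examples vote and the larger 'Pass' group wins. This shows how larger values of K can blur the boundary between classes.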

The Multilayer Perceptron (MLP) is a learning method commonly used to solve a number of different problems, including pattern recognition, speech recognition, image recognition and interpolation (Noriega 2005).

In this study, we used the WEKA 3.9 workbench, a machine learning tool (Hall et al. 2009) that includes various machine learning algorithms. The WEKA IBK classifier, which is a K-NN classifier, is applied to the dataset. The algorithm is run with different values of the parameter K for each video course's data.

The WEKA DeepLearning4j package (Lang et al. 2019), which allows building deep neural networks, is applied to the dataset for training and testing MLP models. We applied a simple Multilayer Perceptron (MLP) using the standard configuration of the DeepLearning4j classifier, which includes: one output layer with softmax as the activation function, MCXENT (multi-class cross-entropy) as the loss function, Xavier as the weight initialization function, the stochastic gradient descent algorithm for optimization, and a learning rate set to 0.01.
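The configuration listed above can be illustrated with a small pure-Python sketch of a single softmax output layer trained with stochastic gradient descent on a cross-entropy loss. It is not the DeepLearning4j implementation: the function names, the number of epochs, the random seed and the toy data are assumptions, and only the softmax activation, cross-entropy loss, Xavier initialization and learning rate of 0.01 are taken from the text.

```python
import math
import random

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def logits(W, b, x):
    """Affine output of the layer: one logit per class."""
    return [sum(w * xi for w, xi in zip(row, x)) + bias
            for row, bias in zip(W, b)]

def train_softmax_layer(X, y, n_classes=2, lr=0.01, epochs=200, seed=0):
    """Train one softmax output layer with per-sample SGD."""
    rng = random.Random(seed)
    n_in = len(X[0])
    limit = math.sqrt(6.0 / (n_in + n_classes))  # Xavier (Glorot) uniform bound
    W = [[rng.uniform(-limit, limit) for _ in range(n_in)]
         for _ in range(n_classes)]
    b = [0.0] * n_classes
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            p = softmax(logits(W, b, X[i]))
            for c in range(n_classes):
                # Gradient of the cross-entropy w.r.t. logit c: p_c - 1[c == y_i]
                g = p[c] - (1.0 if c == y[i] else 0.0)
                for j in range(n_in):
                    W[c][j] -= lr * g * X[i][j]
                b[c] -= lr * g
    return W, b

def predict(W, b, x):
    """Return the index of the class with the largest logit."""
    z = logits(W, b, x)
    return max(range(len(z)), key=z.__getitem__)

# Hypothetical, linearly separable toy data (class 0 vs class 1).
X = [(-2.0, -1.0), (-1.0, -2.0), (-1.0, -1.0),
     (1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
y = [0, 0, 0, 1, 1, 1]
W, b = train_softmax_layer(X, y)
predictions = [predict(W, b, x) for x in X]
print(predictions)
```

The sketch omits hidden layers to stay short; a full MLP would insert one or more hidden layers before the softmax output.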

All experiments are conducted with the full training set and with three-fold cross-validation on each video course's data as our evaluation approach. We split the data set randomly into three sets of equal size; in each fold, two sets are used for training and one set for testing.
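The three-fold split described above can be sketched as follows. This is a plain random split written for illustration; the function name `k_fold_indices` and the seed are hypothetical, and WEKA's own fold assignment may differ in detail.

```python
import random

def k_fold_indices(n, k=3, seed=0):
    """Yield (train, test) index lists for k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# 66 learners, as in the study; each fold holds out 22 of them for testing.
splits = list(k_fold_indices(66, k=3))
for train, test in splits:
    print(len(train), len(test))
```

Each learner appears in exactly one test fold, so every instance is used once for testing and twice for training.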

The following evaluation measures are used to evaluate our data mining model: True Positive (TP) and False Positive (FP) Rates, Precision, % of correctly/incorrectly classified instances, Kappa Statistic, and ROC Area. The results of the experiments are outlined in Tables 4, 5, 6, and 7, Figs. 5, 6, 7 and 8.
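As a reminder of how the rate-based measures listed above are derived, the sketch below computes them for one class from the four cells of a binary confusion matrix. The counts are hypothetical and do not come from the study's tables; `binary_metrics` is an illustrative helper name.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute standard measures for the positive class from confusion counts."""
    total = tp + fn + fp + tn
    return {
        "tp_rate": tp / (tp + fn),      # share of actual positives that were found
        "fp_rate": fp / (fp + tn),      # share of actual negatives flagged positive
        "precision": tp / (tp + fp),    # share of predicted positives that are right
        "accuracy": (tp + tn) / total,  # share of all instances classified correctly
    }

# Hypothetical counts, treating 'Pass' as the positive class.
metrics = binary_metrics(tp=30, fn=5, fp=15, tn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Swapping the roles of the two classes gives the corresponding measures for the 'Fail' class.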

The overall accuracy of the K-NN classifier is about 65.07% and varies depending on the data of each video course. The detailed accuracy (Table 4) reveals that, for the class Pass, the True Positive Rate is high in three video courses (66-86%) and low in one video course (45%), whereas for the class Fail it is low in two video courses (27-40%), medium in one video course (53%), and high in one video course (80%). The Precision is medium for all video courses in both the class Pass (60-70%) and the class Fail (50-67%).

[Figure: educational video with a duration T, divided into pedagogical sequences (Seq2, Seq3, Seq4, Seq5)]

The average accuracy of the MLP classifier is about 61.13%. The detailed accuracy results (Table 5) reveal that, for the class Pass, the True Positive Rate is high in three video courses (84-90%) and low in one video course (45%), whereas for the class Fail it is low in three video courses (18-37%) and high in one video course (74%). The Precision is medium for all video courses in the class Pass (55-69%); for the class Fail, it is medium in three video courses (50-65%) and low in one video course (37%).

The results of the classification model comparison are presented in Fig. 5. Across the four video courses, the prediction accuracy of our system varies between 60% and 67% using the K-NN classifier and between 57% and 67% using the MLP classifier, without a remarkable disparity. The highest classification accuracy, 67.27%, is achieved by the K-NN algorithm in the 'Function: Prototype' video course. The lowest classification accuracy, 56.36%, is obtained by the MLP algorithm in the same video course. Although both algorithms achieved the same classification accuracy in the video course 'Functions Introduction', the K-NN algorithm outperforms the MLP algorithm in the rest of the video courses. Figures 6 and 7 show the correctly classified instances vs. incorrectly classified instances of the classifiers.

Cross-validation has generally proved to be statistically good enough to evaluate a classifier's performance. Confusion matrices are very useful for evaluating classifiers. A confusion matrix was generated (Table 6). The two values of the class attribute are labeled with the letters A (Pass) and B (Fail). The numbers of correctly classified instances lie on the matrix diagonal, and the other elements of the matrix indicate the numbers of incorrectly classified instances. With regard to the classification accuracy of the two classes (Pass, Fail), the predictions are good for the 'Pass' class in most video courses with both the K-NN and MLP classifiers. In contrast, they are not ideal for the 'Fail' class in most video courses with either classifier.

The results for the Kappa Statistic (an index comparing correct classifications against chance classifications, varying from -1 for complete disagreement to 1 for perfect agreement) reveal that the K-NN model is above chance in all video courses, with a minimum value of 0.153. With the MLP classifier, three video courses are above chance, with a minimum value of 0.108; the exception is the video course 'Function: Prototype', for which the Kappa value is negative, as shown in Fig. 8.
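For reference, the Kappa statistic is computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The sketch below applies this standard formula to a confusion matrix; the counts are hypothetical and `cohen_kappa` is an illustrative helper name.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows: actual, cols: predicted)."""
    n = sum(sum(row) for row in matrix)
    size = len(matrix)
    observed = sum(matrix[i][i] for i in range(size)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[r][c] for r in range(size)) for c in range(size)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(size)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical confusion matrices: [[Pass->Pass, Pass->Fail], [Fail->Pass, Fail->Fail]]
above_chance = cohen_kappa([[30, 5], [15, 10]])
at_chance = cohen_kappa([[10, 10], [10, 10]])
below_chance = cohen_kappa([[5, 15], [15, 5]])
print(above_chance, at_chance, below_chance)
```

A value of zero means the classifier agrees with the true labels no more often than chance would, and a negative value means it agrees less often than chance.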

The ROC curve (receiver operating characteristic curve) is created by plotting the true positive rate against the false positive rate; if the ROC area is less than 0.5, random predictions outperform the model. The results achieved by the generated classification models are outlined in Table 7. The K-NN classifier attains ROC Area values above 0.55, which means that all of its prediction models perform better than random. The MLP classifier attains ROC Area values above 0.51 in three video courses, while the value for the 'Function: Prototype' video course is less than 0.5. The objective of this study was to discover the effects of video sequence viewing behavior on learners' performance. Of the two classification algorithms used, the K-NN algorithm seems to be more accurate in predicting learners' performance.
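The ROC area mentioned above can also be read as the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one. The sketch below computes it with this pairwise formulation; the scores and labels are hypothetical, and `roc_auc` is an illustrative helper rather than the routine WEKA uses.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels use 1 for the positive class and 0 for the negative class;
    tied scores count as half a win.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores for the positive class.
auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
print(auc)
```

A classifier that gives every instance the same score obtains an area of 0.5 under this definition, which is the chance level referred to in the text.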

The findings of this research show that video sequences viewing behavior is correlated with learners' performance. The path of video pedagogical sequences followed by learners can be an effective feature for performance prediction.

We note that our models perform much better in predicting instances of class 'Pass' than those of 'Fail' class. 

In this paper, we applied educational data mining to predict learners' performance in video courses (either pass or fail). This study analyzed video sequence viewing behavior to determine the relationship between this behavior and learning outcomes. We used learners' click data collected from four video courses via the Moodle platform. We implemented a classification method using K-NN and MLP classifiers to predict learners' performance.

The results showed that the prediction accuracy rates are acceptable (K-NN 60-67%, Multilayer Perceptron 57-67%), with a slight advantage for K-NN. They indicate that learners' performance could be predicted using video sequence viewing behavior as a significant feature. The findings of this research can be used as a reference in the video processing field, particularly for segmentation, annotation and recommendation problems.

As for future work, we will integrate other factors, namely the time a learner spends viewing each pedagogical sequence and the difficulty of its content. These factors will be useful for better understanding learners' video viewing behavior and so improving the prediction accuracy. We will also focus on using an automated video segmentation method, increasing the number of learners participating in the experiment, and processing and transforming the data into graph format in order to apply graph-based educational data mining.

A predictive study of learner satisfaction and outcomes in face-toface, satellite broadcast, and live video-streaming learning environments. The Internet and Higher Education

Analysis of student use of video in a flipped classroom

Mooc performance prediction via clickstream data and social learning networks

An empirical analysis of video viewing behaviors in flipped cs1 courses

Learning from video: Viewing behavior of students

Identifying the learning style of students in moocs using video interactions

Supervised and unsupervised learning

Video-based learning and open online courses

Making sense of video analytics: Lessons learned from clickstream interactions, attitudes, and learning outcome in a video-assisted course

Predicting academic performance

The weka data mining software: An update

Principles of data mining

Predicting student academic performance in an engineering dynamics course: A comparison of four types of predictive mathematical models

Predicting student performance by using data mining methods for classification. Cybernetics and Information Technologies

Using metrics and cluster analysis for analyzing learner video viewing behaviours in educational videos

Predicting students marks in hellenic open university

Effects of in-video quizzes on mooc lecture viewing

Wekadeeplearning4j: A deep learning package for weka based on deeplearning4j. Knowledge-Based Systems

Grade prediction of weekly assignments in moocs: Mining video-viewing behavior. Education and Information Technologies

Predicting completion of massive open online course (mooc) assignments from video viewing behavior

Applying learning analytics for the early prediction of students' academic performance in blended learning

A framework for smart academic guidance using educational data mining. Education and Information Technologies

Predicting student performance: An application of data mining methods with an educational web-based system

A study on video viewing behavior: Application to movie trailer miner

Multilayer perceptron tutorial

Video lecture watching behaviors of learners in online courses

Modeling and predicting learning behavior in moocs

Moodle E-Learning Course Development

Educational data mining: A survey from 1995 to

Educational data mining: A review of the state of the art

Data mining in course management systems: Moodle case study and tutorial

Distance metric learning for large margin nearest neighbor classification

A re-examination of text categorization methods

Predicting learning outcomes with mooc clickstreams

Three-Dimensional Model Analysis and Processing

Acknowledgements This work was supported by the Al-Khawarizmi Program for Research Support in the Field of Artificial Intelligence and its Applications (Morocco). The authors wish to thank the participants of the study, who kindly gave their time and effort. We also want to thank Pr. Fouzia Boukbir (English teacher) for her help and devotion.

We declare that we have no conflict of interest.
