key: cord-0059078-2pdl42u3 authors: Ocaña, Mauro; Mejía, Rebeca; Larrea, Carolina; Cruz, Estefanía; Santana, Leonardo; Empinotti, Marina title: Investigating the Importance of Student Location and Time Spent Online in Academic Performance and Self-regulation date: 2021-02-16 journal: Artificial Intelligence, Computer and Software Engineering Advances DOI: 10.1007/978-3-030-68083-1_31 sha: 2bd2ef6e7aa8107e36bfd955e45678c49e62a33a doc_id: 59078 cord_uid: 2pdl42u3

The advent of a potential worldwide pandemic alerted educational institutions to the need for preventive measures and for choosing the best strategy. This investigation reports on a pilot study that intended to establish the benefits of transitioning to either an online or a blended-learning mode. To this end, we first analysed academic performance and time spent on a set of online activities completed by language learners at beginner and upper-intermediate levels. Secondly, student location was also examined, and it was found to have a strong relationship with academic performance based on the average grades. Analyses of variance through regression models were carried out to analyse the importance of the factors in the outcome variable. In addition, a test was done to compare performance between the study groups. Based on student location in combination with the other variables, it was possible to observe that students performed better academically when off-campus. This led us to the preliminary conclusion that, after moving to a fully online or blended-learning mode, either due to the pandemic or as an independent decision, students would perform as well as, if not better than, when on-campus.

Information and communication technologies (ICT) have introduced new strategies in higher education. Most of today's universities offer study programs available online [1, 2].
As there was a risk of having to move most, if not all, educational activities online, this study analysed the potential of a mode that combined face-to-face interaction with independent study. To this end, some foundations of blended learning were used. Blended learning (BL) is considered an educational strategy that supports students in terms of flexibility and accessibility. Some BL studies, e.g. [3, 4], refer to a combined modality of 70% face-to-face and 30% online learning. Within BL, [5] indicate the need for research that allows understanding how students act on their own when faced with knowledge acquisition in online spaces. In this regard, [6] explains that self-regulated students are active in their own learning process. Self-regulated learning is known as the process focused on perseverance and motivation to achieve proposed personal academic goals [7]. Educational platforms for online learning have been enhanced based on teaching methodologies that include instructor-led meetings, seminars and more, which have made them as effective in terms of learning as face-to-face courses [8]. Achieving active participation is a challenging task for the teacher: the course or program must be designed around the students and respond to their specific needs [9]. Blended learning courses employ active learning strategies relying on various pedagogical approaches [10, 11], ranging from fully online curricula with face-to-face interaction to courses that integrate traditional face-to-face classroom instruction with online components that extend learning beyond the classroom [12, 13]. In other words, it is necessary to create an environment that encourages students to get involved in intense and fruitful interactions with the instructor, the material and their study partners, proposing activities that promote the development of critical thinking, collaborative learning and self-control [12, 14, 15].
Furthermore, guidance should be provided, which contributes to student engagement and learning [9]. Student accesses, online time and navigation captured within a learning management system (LMS) are key in the exploration of virtual learning [16, 17]. These factors have provided important data that BL instructors can translate into actionable practices [18]. Admittedly, as noted by Allen & Seaman [19], students are substantially less likely to complete online courses, and their dropout rates are higher compared to students who took the same course face-to-face [20, 21, 22]. Thus, the lack of engagement and self-regulatory learning skills continues to be a serious impediment to learning success in the context of blended learning [22, 23]. The concept of student engagement is widely used in educational contexts. Student engagement is defined as "the time and effort students devote to activities that are empirically linked to desired outcomes of college…" [24]. This definition draws on previous work that divides engagement into components. For example, students engage behaviourally when they exhibit attendance or involvement, emotionally when they show interest, and cognitively when they seek opportunities for challenge [25]. So, behavioural and cognitive components are recognized as (strategic or) self-regulated behaviours. Zimmerman and Martinez-Pons [26] lay the foundations to measure self-regulated learning (SRL) behaviours by pointing to, for example, time management (referred to as regularity of access) and environmental structuring (i.e. finding an adequate time and/or space to study). These are commonly known as structuring and activating types of SRL behaviours respectively. Students have access to different educational resources to learn at their own pace, which can foster team or collaborative work with their peers [27]. Another recurrent concept in SRL studies is the way in which students seek some kind of assistance, either from their instructor or from their peers.
In this way, it is revealed that, contrary to common assumptions, SRL behaviours do not take place in isolation [28] . Not all students engage or show self-regulated patterns in a similar way and not even the same student acts in a similar way over a given period of time. A study carried out by Kizilcec [29] on the Coursera platform found four patterns of engagement: Completing (completed most assessments), auditing (did assessments intermittently), disengaging (did assessments only at the beginning and then reduced in engagement) and sampling (did only one or two assessments). Ferguson and Clow [30] later replicated the study on the FutureLearn platform and discovered how patterns of engagement changed over time. When classifying students according to their SRL behaviours, it was found that frequency of access to LMS was important for student success [31] . Despite these findings, there is still a perception that different platforms (having different designs) yield different results. Therefore, taking into account that design is key in an educational context, studies using Moodle are also described. Some of these studies, e.g. [27] and later [32] detected which variables of the LMS are worth analysing to unveil student behaviours. Recently, it was found that the day of the week and frequency of access were key in determining clear SRL behaviours and enabling student success in a flipped classroom [2] . They classified the students' regularity of access in "Not regular" (<100 times), "Moderately regular" (100-200 times), and "Highly regular" (>200 times). These traits, which occur in all e-learning interactions and are linked to academic performance, were scrutinized throughout our study and constitute the framework of this research. Given that student behaviours had to be analysed in their real context, with regard to the time spent on activities and the location of access, the LMS captured the events in its logs. 
Therefore, the data analysis aimed to answer the following research questions:
RQ1. How do students spend their time on assessment-related tasks in an online course and how often do they do it?
RQ2. How do students' average grades differ in relation to their in-class/out-class visits?
RQ3. To what extent do students' average grades change with respect to their total number of visits?

This study uses a non-experimental observational design [33], as the variables to be observed in a Learning Management System (LMS) were not manipulated by the researchers. In order to estimate the relationship among the variables observed (explained in the paragraphs below), regression analyses were performed. The study population was 103 Ecuadorian university students from different degree programmes who took an English as a Foreign Language (EFL) course at the end of 2019. Students were divided into two groups according to their level of language proficiency: A1A2 (Beginners, n = 74) and B1B2 (Upper Intermediate, n = 36). Their face-to-face class ran two hours a day, from Monday to Thursday. Of these, one hour was allocated to multimedia laboratory practice, where they had the opportunity to access the online activities of this study at will. The learning activities were designed as an open course. That is, none of the assessments were tied to a formal grading system, nor did students receive any extra academic credit. In this way, these activities served as supplementary practice to reinforce the standard curriculum. Although all tasks were based on institutional curricula, they were not mandatory. The assessment items were presented in various styles: multiple choice, drag and drop, matching, true/false, random short-answer matching, select missing words, and gap fill. Students who completed at least 70% of the total number of activities received a certificate of participation.
After the course was completed, the system logs were extracted as a .csv file. The file contains each action recorded throughout the sessions along with several fields of information, e.g. the timestamp. The analysis of the extracted data was carried out in two stages. The first stage aimed to answer RQ1, to better understand the behaviour of students and how they used their time during the proposed course. The second stage answered RQ2 and RQ3. For this, an analysis of variance was performed through regression models to analyse the importance of the factors in the outcome variable. In addition, a test was done to compare performance between the study groups. Time spent by students was a common factor in the behaviour patterns at stage 1, as well as in the in-class and out-of-class work at stage 2. Therefore, throughout the duration of the project, we computed the time each participant worked, on a daily basis, on the system variables selected for this study in order to have a common measurement factor. The factor time, regardless of its duration, is hereinafter called "visits".

a) How did students spend their time on assessment-related tasks in an online course and how often did they do it?

Students' visits to the platform can be classified in different ways. Taking previous studies as reference, e.g. [29, 30] and [2], the visits in the current study were classified as follows. First, visits were classified into time intervals according to their duration in minutes. Based on the task design, a visit of 3-14 min was considered a reasonable access time; therefore, this interval was labelled "acceptable time" and served as a guide for classifying the other intervals. A shorter interval of 0.001-2 min was labelled "fast time", and there were two longer intervals, one of 15-180 min called "long" and the other of >180 min called "exceeded time".
Although the >180 min interval emerged only when dealing with the data records, it is assumed that such times were not actually spent on completing the task, but most likely on fiddling. To check the day on which most students visited the platform, a simple count of visits per day was performed. To inspect their preferred hours of the day, visits were classified as "before dawn" from 00:00 to 06:59, "morning" from 07:00 to 11:59, "afternoon" from 12:00 to 17:59, and "night" from 18:00 to 23:59 (Figs. 2 and 3).

In each regression equation, the left-hand side is the dependent variable, also known as the "outcome", and the right-hand side contains the independent variables, also known as "factors". To determine the location of access, each action in the LMS record is accompanied by an IP address. The IP addresses within the range reported by the universities' IT departments were considered in-class, while all other, unregistered IP addresses were considered out-of-class connections, whether from home, a cafe or a library.

Stage 1. Figures 1 and 2 show the difference between groups A1A2 and B1B2 regarding the total count of daily time intervals of visits. The differences are most notable in the time intervals labelled "acceptable time" and "fast time", meaning that students did small activities that are more manageable to complete, such as a short quiz or a quiz review (Figs. 4 and 5). The corresponding figures show students' preferred days and times of access for completing their assessment-related tasks. A high percentage of visits on Fridays is to be expected, since students did not have face-to-face classes on that day. Furthermore, it is clear that "morning" is important in both groups with respect to the preferred time of day, possibly because, as some authors argue, it is the most favourable time of day for academic performance (Newport, 2007). As can be seen, there is a large percentage of visits on Fridays, while the preferred visiting hours are in the "morning".
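The three classifications described above (visit duration, time of day and location of access) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: the function names, the sample rows and the campus network `192.0.2.0/24` (a reserved documentation range) are hypothetical, and durations that fall in the gaps between the published intervals (for example 2.5 min) are assigned to the lower interval.

```python
from collections import Counter
from datetime import datetime
from ipaddress import ip_address, ip_network

# Hypothetical campus range, standing in for the ranges reported by the
# IT departments (192.0.2.0/24 is reserved for documentation).
CAMPUS_NETWORKS = [ip_network("192.0.2.0/24")]


def duration_label(minutes: float) -> str:
    """Map a visit duration in minutes to one of the four interval labels."""
    if minutes <= 0:
        raise ValueError("visit duration must be positive")
    if minutes < 3:       # 0.001-2 min (assumption: gap up to 3 counts as fast)
        return "fast time"
    if minutes < 15:      # 3-14 min
        return "acceptable time"
    if minutes <= 180:    # 15-180 min
        return "long"
    return "exceeded time"  # >180 min


def time_of_day(ts: datetime) -> str:
    """Map a timestamp to the four time-of-day bands."""
    if ts.hour < 7:
        return "before dawn"
    if ts.hour < 12:
        return "morning"
    if ts.hour < 18:
        return "afternoon"
    return "night"


def location(ip: str) -> str:
    """Label an access as in-class if its IP lies in a campus network."""
    addr = ip_address(ip)
    in_class = any(addr in net for net in CAMPUS_NETWORKS)
    return "in-class" if in_class else "out-class"


# Made-up log rows: (timestamp, visit duration in minutes, IP address).
rows = [
    (datetime(2019, 11, 15, 9, 30), 5.0, "192.0.2.10"),
    (datetime(2019, 11, 15, 20, 5), 1.5, "198.51.100.7"),
    (datetime(2019, 11, 16, 6, 59), 200.0, "198.51.100.7"),
]

summary = Counter(
    (duration_label(m), time_of_day(ts), location(ip)) for ts, m, ip in rows
)
for key, count in sorted(summary.items()):
    print(key, count)
```

Counting the resulting labels per student and per day would give the kind of totals that the figures summarise.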
For this analysis, students who registered at least one participation in the assessment tasks throughout the duration of the project were considered as subjects. Therefore, the average grades will work as the outcome (or dependent) variable, while in-class or out-of-class will become the factors (independent variables) linked to the (academic) outcome of groups A1A2 and B1B2. Students who did not record any assessment were not considered for the analysis as this would substantially affect the overall performance of students who made an effort to measure their own performance through assessments. Table 1 shows the data of the two groups used for the regression analysis, as well as for the paired t-test. For the latter, a similar number of students in both groups is required as a prerequisite. Therefore, student number 22 from the A1A2 group was discarded due to low grade point average and low frequency of visits throughout the course. The main statistical descriptors of the performance of the groups under study are shown in Table 2 , with a confidence level of 95%. Using the data shown in Table 1 , the analysis of variance (ANOVA) by fit linear regression models was performed separately for the study groups, as shown in Tables 3 and 4 . When the group means were analysed, apart from the factors (in-class and out-class), the out-class factor denotes a significant value, since its p-value is <0.05 (considering a confidence level of 95%). Furthermore, according to Tables 3 and 4 , the contribution of the "out-class" factor is 48.02% for group A1A2 and 44.19% for group B1B2, which means a slightly greater influence in group A1A2. This is the factor that will be further analysed below. What was described in previous paragraphs, and found in Tables 3 and 4 , can be shown through surface graphs in Figs. 10 and 11 , where the highest academic performances are in red and orange, while the lowest ones are in blue and light blue. In general, in Figs. 
10 and 11, when groups are compared, for the case of A1A2, the out-of-class factor is more influential, with respect to group B1B2, as it leaned on the Y axis (out-class factor) . Therefore, for group A1A2 in Fig. 10 , the highest academic performances are in the Y axis, where there are up to 14 out-of-class accesses, in contrast to the 7 in-class accesses in the X axis. This is also corroborated with p = 0.489 for the interaction between factors (in-class/out-class) and p = 0.476 for the in-class factor alone as shown in Table 3 . For the group B1B2 in Fig. 11 , the "out-class" factor is significant too, although with less intensity than for group A1A2. That is, the best academic performances leaned to the axis Y, where there are between 8-12 accesses outside of class along with one access in class (red zone). The blue areas (which denote a low performance) are leaned towards axis X (in-class factor), which means that the lowest scores were obtained when students were in class. This fact is corroborated by the value p = 0.372 for the in-class factor and p = 0.780 for the interaction of the two factors as shown in Table 4 . Figure 12 below shows a fitted line plot for groups A1A2 and Fig. 13 for group B1B2 and how the data is scattered along the line. This allows for relations to be fitted to a linear model and include conditions that were not considered for this study, such as a higher number of out-of-class accesses. Table 3 . Regression Analysis: Average Grades versus in-class and out-class for A1A2 group Table 4 . Regression Analysis: Average Grades versus in-class and out-class for B1B2 group In addition, the respective equations are shown as well as the predictor coefficient of determination, R-squared. The R-squared is a statistical measure of how close the data is to the fitted regression line. The R-squared value shown by the A1A2 group of 60.7% is more representative than 53.7% which is the value of group B1B2. 
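To make the fitted line and its R-squared concrete, the following minimal sketch fits a least-squares line with a single predictor and computes R-squared as 1 minus the ratio of residual to total sum of squares. The function name `fit_line` and the data are made up for illustration; they are not the study's data, and the study's own tables may have been produced with different software and settings.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, plus R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical out-of-class accesses and average grades for six students.
out_class = [1, 3, 4, 6, 8, 12]
avg_grade = [4.0, 5.5, 5.0, 7.0, 7.5, 9.0]

slope, intercept, r2 = fit_line(out_class, avg_grade)
print(f"grade = {intercept:.2f} + {slope:.2f} * out_class, R^2 = {r2:.1%}")
```

An R-squared close to 100% means the points lie close to the fitted line; a value near 0% means the line explains little of the variation in grades.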
An R-squared of this size can be considered a high-moderate value. In addition, a correlation analysis was performed between the significant factor "out-class" (independent variable) and the outcome "average grades" (dependent variable) to confirm the strength and direction of the relation (either directly or inversely proportional) between both variables. Tables 5 and 6 show the Pearson correlation between the average grade and the out-class factor, given that the p-value is <0.05. For group A1A2, the Pearson correlation (0.791) was slightly higher than for group B1B2 (0.748), as can be seen in Tables 5 and 6 respectively. In both cases, these values show a positive (directly proportional) and strong correlation (close to 1, which would be a perfect correlation). For the correlation between in-class and average grade, although previously shown to be not significant, an inverse Pearson correlation is observed for both A1A2 (−0.418) and B1B2 (−0.242), which means that the more students logged in from an in-class setting, the lower the grade they obtained.

Answer to RQ3: To what extent did students' average grades change in respect to their total number of visits?

This analysis was carried out in a similar way to the previous case, with the difference that the total number of visits to the system, labelled here as "Total access", includes both in-class and out-class factors to review their influence on the performance of the classes as a whole (see Table 1). In Tables 7 and 8, the observed value for the analysed factor "Grand Total" is p = 0.002 for group A1A2 and p = 0.001 for group B1B2, which makes this factor significant for both groups. Figures 14 and 15 show a fitted line plot for the "total access" factor versus average grades for the two study groups. In this case, the R-squared shown by group A1A2 (34.3%) is lower than that of B1B2 (42.6%).
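The Pearson coefficient used in this correlation analysis can be computed directly from its definition: the covariance of the two variables divided by the product of their standard deviations. The sketch below is illustrative only; the function name `pearson_r` and the sample values are hypothetical and do not reproduce the study's tables.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical accesses and average grades for five students.
out_class = [2, 4, 5, 9, 12]
in_class = [6, 5, 5, 2, 1]
avg_grade = [5.0, 6.0, 6.5, 8.0, 9.0]

# A positive value indicates a direct relation, a negative value an inverse one.
print(f"out-class vs grade: {pearson_r(out_class, avg_grade):+.3f}")
print(f"in-class vs grade:  {pearson_r(in_class, avg_grade):+.3f}")
```

Values near +1 or -1 indicate a strong linear relation, and values near 0 a weak one.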
The lower R-squared values obtained for total access confirm the previous analysis: including the "in-class" factor within the total access factor weakens the fitted line model. The Pearson correlation of the total access factor in this case was 0.612 for A1A2 and 0.674 for B1B2. To complement this stage of the study, the performance of the two groups was compared via Student's t-test. As the performance of the groups as a whole was considered, the average scores of groups A1A2 and B1B2 are shown in Table 1. As a precondition for performing this test, it is necessary to verify whether the variances of the analysis groups are equal or not. Depending on this, the t-test is applied either in its equal-variance (pooled) form or in its unequal-variance form; the choice between a one-tailed and a two-tailed analysis depends on the alternative hypothesis rather than on the variances. In this sense, Student's t-test for equal variances was used to reveal whether the mean of group A1A2 is equal to the mean of group B1B2, or whether there are differences in performance between the two groups. Hypothesis testing: $H_0: \mu_{A1A2} = \mu_{B1B2}$ and $H_1: \mu_{A1A2} \neq \mu_{B1B2}$, where $H_0$ is the null hypothesis and $H_1$ is the alternative hypothesis. In this case, a two-tailed analysis is proposed, that is, both directions of the difference in means are analysed. Using Fisher's F-test on the study groups, it was verified that the variances of the two groups are equal, as shown in Table 9. The result of the t-test is shown in Table 10. In this case, as $|t_{stat}| \le t_{critical}$, that is, 0.65 < 2.02, $H_0$ is not rejected and, therefore, it is concluded that the mean of A1A2 does not differ significantly from the mean of B1B2. This means that there are no statistically significant differences in the performance (i.e. average grades) of the two study groups. Figure 16 shows the main statistical parameters calculated and compared for each study group, as well as the normal distribution of the analysed data, which graphically shows what was analysed in the previous paragraphs.
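The two-step comparison described above (a variance check followed by an equal-variance t-test) can be sketched as follows. This is a simplified illustration with made-up grades and hypothetical function names; it computes the variance ratio and the pooled t statistic only, and the critical value is taken as an input (2.02, the value quoted in the text) rather than derived from the t distribution.

```python
import math
from statistics import mean, variance


def variance_ratio(a, b):
    """F ratio of the two sample variances, larger over smaller."""
    var_a, var_b = variance(a), variance(b)
    return max(var_a, var_b) / min(var_a, var_b)


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances, with its degrees of freedom."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    t_stat = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t_stat, df


# Hypothetical average grades for two small groups.
group_a = [6.1, 7.4, 5.8, 8.0, 6.9, 7.2]
group_b = [6.5, 7.0, 6.2, 7.9, 7.4, 6.8]

f_ratio = variance_ratio(group_a, group_b)
t_stat, df = pooled_t(group_a, group_b)

T_CRITICAL = 2.02  # two-tailed critical value quoted in the text
decision = "reject H0" if abs(t_stat) > T_CRITICAL else "do not reject H0"
print(f"F = {f_ratio:.2f}, t = {t_stat:.2f}, df = {df}: {decision}")
```

In practice the critical value depends on the degrees of freedom and the significance level, so it would normally be looked up or computed for the actual group sizes.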
In Figure 16, the vertical axis indicates academic performance through average grades. This includes the maximum and minimum average grades, their means, medians and quartiles.

This study aimed to discover how time on task and academic performance (average grades represented by the testing component in the platform) differed based on whether students were on or off campus. Analyses of these variables made it possible to suggest whether students would be affected when transitioning to an online or blended-learning mode. In a first stage, relying on the observation of data available on any analytics panel nowadays, it is evident how the participants of this study dedicated their time to assessment-related tasks, mainly in time intervals that did not interfere with their face-to-face lessons. While the A1A2 (beginner) group preferred to work more during the weekdays, the B1B2 (upper intermediate) group opted for a more balanced approach throughout the week (RQ1). This can be seen as a simple reminder of how and when to do online supplementary activities in a course, especially when a group of people have regular contact with each other. Later, in a second stage, the out-of-class factor was found to be a determinant of higher academic performance (or higher average grades). Regression analysis showed that when students logged in from an off-campus location, they performed significantly better than when they logged in from an on-campus location (RQ2). To some extent, being in class appears to be counterproductive for students to fully exercise a sense of self-regulation. This might be related to some perception of the classroom as a restricted environment. Based on the regression analysis, the total number of visits (which included in-class and out-class visits to the learning activities in the platform) showed a lower impact on academic performance (RQ3) than when the out-class factor was analysed alone.
This means that the "in-class" factor contributed to reducing the statistical significance of the "total number of visits", which reinforces the conclusion that, when activities are carried out in class, a sense of self-regulation is not exercised and, therefore, academic performance is "negatively" affected. A limitation of the current analysis is the limited number of students. Another limitation is that the presence of an instructor was not analysed. This could be addressed by cross-checking the records of a group of students accessing simultaneously on the same day, at the same time and from a similar IP range, which can be interpreted as everyone attending a lab class.

Advances in Social Computing - Third International Conference on Social Computing, Behavioral Modeling, and Prediction, SBP 2010, Proceedings. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Using learning analytics to explore self-regulated learning in flipped blended learning music teacher education
Emerging practice and research in blended learning
Blended Learning: The Convergence of Online and Face-to-Face Education. Promising Practices in Online Learning
Conducting Research in Online and Blended Learning Environments: New Pedagogical Frontiers. Routledge
A social cognitive view of self-regulated academic learning
Bandura: La auto-eficacia: El ejercicio del control
Current status of research on online learning in postsecondary education
Effects of Interactivity on student achievement and motivation in distance education
Student LMS use and satisfaction in academic institutions: the organizational perspective
Learning or performance: predicting drivers of student motivation
Relationship between use of online support materials and student performance in an introductory finance course
Blended learning: a dangerous idea? Internet High
Cooperative learning: smart pedagogy and tools for online and hybrid courses
Blended spaces, work based learning and constructive alignment: impacts on student engagement
Student hits in an internet-supported course: how can instructors use them and what do they mean?
Evaluation and revision of the Study Preference Questionnaire: creating a user-friendly tool for nontraditional learners and learning environments
Understanding, evaluating, and supporting self-regulated learning using learning analytics
Online Report Card: Tracking Online Education in the United States
Online Learning: Does It Help Low-Income and Underprepared Students?
Review of developments in research into English as a lingua franca
Online and hybrid course enrollment and performance in Washington State community and technical colleges
The effect of learning style on preference for web-based courses and learning outcomes
What student affairs professionals need to know about student engagement
School engagement: potential of the concept, state of the evidence (in English)
Construct validation of a strategy model of student self-regulated learning
A multivariate approach to predicting student outcomes in web-enabled blended learning courses
Becoming a self-regulated learner: an overview
Deconstructing disengagement: analyzing learner subpopulations in massive open online courses
Access patterns of online materials in a blended course
Educational process mining: a tutorial and case study using Moodle data sets (chap. 1)

We are grateful to Dr. Edwin Ocaña for the suggestions and reviews prior to the document, as well as the follow-up to the research project.