key: cord-0906451-6czdhtms
authors: Okoye, Kingsley; Arrona-Palacios, Arturo; Camacho-Zuñiga, Claudia; Achem, Joaquín Alejandro Guerra; Escamilla, Jose; Hosseini, Samira
title: Towards teaching analytics: a contextual model for analysis of students’ evaluation of teaching through text mining and machine learning classification
date: 2021-10-11
journal: Educ Inf Technol (Dordr)
DOI: 10.1007/s10639-021-10751-5
sha: 13ebb6c1316159f3f9c39264c05b7136b64968ed
doc_id: 906451
cord_uid: 6czdhtms

Recent trends in educational technology have led to the emergence of methods such as teaching analytics (TA) for understanding and managing the teaching-learning processes. Didactically, teaching analytics is one of the promising and emerging methods within the Education domain that has proved useful for making scholastic use of substantial evidence drawn from educational data to improve the teaching-learning processes and quality of performance. For this purpose, this study proposes an educational process and data mining plus machine learning (EPDM + ML) model, applied to contextually analyze the teachers' performances and recommendations based on data derived from students' evaluation of teaching (SET). The EPDM + ML model was designed and implemented by combining Text mining and Machine learning technologies, and builds on descriptive decision theory, which studies the rationality behind the decisions that learners are disposed to make, based on textual data quantification and statistical analysis. To this effect, the study determines the pedagogical factors that influence the students' recommendations for their teachers, and the role that the sentiment and emotions expressed by the students in the SET play in the way they evaluate the teachers, taking into account the gender of the teachers. This includes how to automatically predict what a student's recommendation for the teachers may be, based on information about the students' gender, average sentiment, and emotional valence shown in the SET.
Practically, we applied the Text mining technique to extract the different sentiments and emotions (intensities of the comments) expressed by the students in the SET, and then used the quantified data (average sentiment and emotional valence) to conduct an analysis of covariance and a Kruskal-Wallis test to determine, respectively, the influential factors and how the students' recommendations for the teachers differ by gender. A large proportion of the comments that we analyzed (n = 85,378) was classified as neutral, and predominantly interpreted to be positive in nature, in terms of both the sentiments (76.4%) and the emotional valence (88.2%) expressed by the students. For the students' comments that contain some kind of positive or negative sentiment (23.6%) and emotional valence (11.8%), the results of our analysis show that female students recommended the teachers taking into account the sentiments (p = .000), while the males appear to be borderline in terms of emotions (p = .056) and sentiment (p = .077). Also, the EPDM + ML model proved to be a good predictor and an efficient method for determining what the students' recommendation scores for the teachers would be, going by the high values of the precision (1.00), recall (1.00), specificity (1.00), accuracy (1.00), and F1-score (1.00), and the zero error-rate (0.00), which we validated using the k-fold cross-validation method, with 63.6% of optimal k-values observed. In theory, we note that the proposed method (EPDM + ML) not only proves useful for effective analysis of SET and its implications within the educational domain, but can also be used to determine prominent factors that influence the students' evaluation and recommendation of the teachers, and helps provide solutions to the ever-increasing need to advance and support the teaching-learning processes and/or students' learning experiences in a rapidly changing educational environment or ecosystem.

The application of teaching analytics (TA) methods to the different datasets collected about the educational process and performances, e.g., the students' learning experiences and outcomes, can enhance the impact of several educational initiatives and resultant technologies, both in terms of students' satisfaction and at institutional levels. Studies have shown that many educators depend on the outcomes of the teachers' and/or students' evaluation of teaching (SET) to explore the effectiveness of the various strategies used to create, manage, or improve the educational activities that underlie the higher educational institutions' (HEIs) programs and curricula (Badri et al., 2006; Bianchini et al., 2013; Boring, 2017; Holmes et al., 2019; Tondeur et al., 2020). Also, there has been discourse on the strengths, weaknesses, opportunities, and threats of using information collected about the teaching-learning activities to support the educational processes (Ferguson, 2012; Gedrimiene et al., 2019; Mangaroska & Giannakos, 2019; Papamitsiou & Economides, 2014; Renz & Hilbig, 2020). This ranges from the proliferation of SET-generated data to support and improve the several learning processes (Larrabee Sønderlund et al., 2019; Prinsloo et al., 2012; Slade & Prinsloo, 2013), to theoretical approaches that aim to combine the teaching practices with design-based research, and innovative methods that support educators with didactic/technological tools to improve the quality of teaching (Herodotou et al., 2019a, b; Ndukwe & Daniel, 2020).

Nowadays, the need to effectively use information (data) derived from the students' evaluation of teaching or the teachers' performance in addressing the different challenges with educational technologies and data mining cannot be overemphasized (Altrabsheh et al., 2014; Badri et al., 2006; Crues et al., 2018; Perrotta & Williamson, 2018; Romero & Ventura, 2020). This has become particularly important where the outcomes of the developed methods can be used to adapt, monitor, predict, and recommend best practices for the teaching and learning process within the higher education domains (Abu Zohair, 2019; Aldowah et al., 2019; De Quincey et al., 2019; Nganji, 2018). Notably, Badri et al. (2006) stated that SET has become a factor in the renewal and promotion of long-term contracts, merits, and award-related decisions for the teachers in several HEIs, while Perrotta and Williamson (2018) note that how the various educational activities/processes are managed essentially depends on the methods applied for the (educational) data collection, including its measurement, analysis, and interpretation within the educational context.

For all intents and purposes, this study holds that there is not only a need for teaching analytics methods that efficiently extract contextual insights from the several collections of educational datasets and information recorded in SET, but also a need to transform the derived information into actionable plans or practices that can help improve the teaching-learning processes across the different HEIs. In this study, text mining and machine learning techniques were used to analyze the students' evaluation of teaching and recommendation of the teachers based on the opinions or perceptions of the students (e.g., through sentiment analysis and emotional valence), and to determine what the students' recommendation may be based on the identified sentiment and emotional valence scores. The Text mining method is held to support the educational processes and/or information management, owing to its capability to analyze and derive (new) relevant information from the textual datasets recorded in the various databases of the HEIs (Kumakawa, 2017; Lau et al., 2005; Tseng et al., 2018), whereas the Machine learning classification model or technique is applied to make predictions based on the identified students' features or input datasets (Abu Zohair, 2019; Ghosh et al., 2020; Ofli et al., 2016; Viji et al., 2020; Wong & Yeh, 2019).
Moreover, the study also holds that by using the extracted information or results of the two methods (Text mining and Machine learning) correctly, the higher educational institutions or educationalists can define adequate teaching analytics methods or innovative technologies that can be used not only to support the effectiveness of the teaching-learning processes in general, but also to maintain a strong relationship amongst the stakeholders (e.g., HEIs, teachers, students) (Ndukwe & Daniel, 2020; Payne, 2006; Piedade & Santos, 2010; Renz & Hilbig, 2020; Tseng et al., 2018). Therefore, in theory, the resultant model (EPDM + ML) and methodological approach used in this study are important for the promotion and advancement of the stakeholders' experiences and/or teaching-learning performances/outcomes at large (Bowdre, 2020; De Fortuny et al., 2013; Dollinger & Lodge, 2018; Er et al., 2019; Gomes & Ma, 2020; Hilliger et al., 2020; Pedró et al., 2019; Tóth & Surman, 2019; Yadav & Berges, 2019).

The use of information or insights drawn from analyzing educational datasets such as SET can help Educators develop innovative teaching practices to foster the learning processes. Moreover, achieving strategic or idiosyncratic analysis of the (educational) datasets, otherwise allied to the notion of "datafied-Education", has been shown to be one of the most pertinent challenges facing the effective delivery of the teaching-learning processes, both in the literature and in practice (Cerratto Pargman & McGrath, 2021; Hilliger et al., 2020; LALA, 2020; Mahmoud et al., 2020; Martens et al., 2020; Ndukwe & Daniel, 2020; Pettersson, 2020; Slade & Prinsloo, 2013). For example, Cerratto Pargman and McGrath (2021) noted the main gaps in the literature and practice to be that the educational data-driven practices are highly context sensitive, not synonymous with evidence-based practices, and not sustainable per se. On the other hand, the authors (Cerratto Pargman & McGrath, 2021) note that, with the growth in digitalization or datafication of Education, the availability of significant amounts of data, i.e., educational "big" data, has created opportunities for the use of technologies such as AI and/or machine learning techniques to gain valuable insight into how students learn in higher education, as described in this study. To this end, this study strongly believes that useful dissemination of "student-generated data" can be used not only to understand, support, and provide improved learning process or performance metrics for the stakeholders (teachers and students), particularly in addressing the social-technical or pedagogical challenges related to the teaching-learning process (Dimitriadis et al., 2021; Mahmoud et al., 2020; Raffaghelli et al., 2020; UNESCO, 2014, 2021), but also to demonstrate the importance of using data through a predictive analytical approach to inform or shift the current culture of academic advising or teaching paradigms from one of compliance to one that focuses on students' learning success and recommendations based on their educational experience (Bowdre, 2020).

Along these lines, this study shows that there is a need for innovative methods or approaches, such as the EPDM + ML model proposed in this paper, for extracting educational information from the unprecedented volume of data recorded and stored about the students' evaluation/recommendation of the teaching-learning performances, and for translating it into actionable plans for education in general. In our analysis, we extended the educational process and data mining (EPDM) model proposed in Okoye et al. (2020) to show how the combination of the Text mining and Machine learning techniques, which we grounded on descriptive decision theory (Baucells & Katsikopoulos, 2011; Chandler, 2017), can be used to analyze the (educational) data (SET) towards improvement of the end-to-end teachers-students learning process and interactions within the higher educational setting. To this end, we propose an educational process and data mining + machine learning (EPDM + ML) model for fostering teaching analytics and performance evaluation.

The main research questions studied in this paper are as follows:

• How can we analyze the educational datasets (SET) within the higher education context to understand the pedagogical and social-technical factors that influence the students' recommendation for the teachers?
• How do we exploit the resultant information or pieces of evidence (e.g., sentiment and emotional valence) extracted from the SET data to determine whether the students evaluate and recommend their teachers by taking into account the gender of the teachers or construct?
• What role do the average sentiment and emotional valence expressed by the students play in the way they evaluate the teachers?
• How can we predict what a student's recommendation for the teachers may be based on the gender of the students, average sentiment, and emotional valence scores?

To answer the identified research questions, the study designed a number of constructs it used to conduct the analysis and investigations:

• In the text mining approach, we applied the EPDM model to determine the different sentiments and emotions expressed by the students in the comments for the teachers by considering the gender differences. To this effect, we performed a sentiment and emotional valence analysis (i.e., polarity or textual data quantification) to determine the intensities of the comments provided by the students and their impact on the teachers' evaluation.
• Also, we determined the main factors or differences in the way the students evaluated and/or recommended the teachers based on the quantified data (average sentiment and emotional valence), holding out the students' gender as one of the potential heightening factors.
• For the machine learning method, we built a textual data classification model to determine to what extent the approach (EPDM + ML) is capable of predicting what a student's recommendation or evaluation (scores) for the teachers would be by taking into account the students' gender, average sentiment, and emotional valence.
• Finally, we provided an empirical discussion of the implications of the study findings based on the significant factors, gender differences or perspectives, and machine learning classification outcomes.
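The textual data quantification step in the first construct can be illustrated with a minimal sketch. This is not the EPDM implementation: the lexicon, its weights, the function names `average_sentiment` and `polarity_label`, and the zero threshold are all hypothetical choices made for illustration, assuming a simple word-lexicon approach in which a comment's score is the mean polarity weight of its words.

```python
import re

# Hypothetical toy lexicon (word -> polarity weight). The lexicon and scoring
# tool actually used in the study are not specified in this section.
TOY_LEXICON = {
    "excellent": 1.0, "clear": 0.5, "helpful": 0.75,
    "boring": -0.5, "confusing": -0.75, "unfair": -1.0,
}

def average_sentiment(comment, lexicon=TOY_LEXICON):
    """Mean polarity weight over all word tokens in a comment (0.0 if empty)."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    if not tokens:
        return 0.0
    # Words missing from the lexicon contribute a neutral weight of 0.0.
    return sum(lexicon.get(tok, 0.0) for tok in tokens) / len(tokens)

def polarity_label(score, threshold=0.0):
    """Map a score to 'positive', 'negative', or 'neutral' (assumed cut-offs)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"

comments = [
    "Excellent and helpful teacher",
    "The classes were confusing",
    "He teaches on Mondays",
]
for text in comments:
    score = average_sentiment(text)
    print(f"{score:+.4f}  {polarity_label(score)}  {text}")
```

Under these assumptions, a comment with no lexicon words scores exactly zero and is labelled neutral, which is one plausible way a large share of comments could end up in the neutral class.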

Consequently, the study makes the following contributions to knowledge, based on the analyzed constructs and effort to provide answers to the research questions:

(1) It shows the capability of the Text mining approach and Machine learning technologies towards effective (educational) data analysis, and understanding of its main implications/application within the education domain.
(2) It defines a text mining and machine learning model (EPDM + ML) that makes use of the comments (textual data) provided by the students in SET to determine the impact of the different emotions and sentiments expressed by the students in connection to their recommendation of the teachers based on the gender differences.
(3) It develops a machine learning classification model that was trained to predict, to a significant level (high accuracy), the ratings or scores a student is most likely to give to their teachers by considering the students' gender, sentiments, and emotional valence constructs.
(4) It demonstrates how data about students' evaluations of teaching (SET) can be utilized to provide solutions to the ever-increasing need to social-technically and pedagogically advance/support the teaching-learning processes and/or students' experiences in a rapidly changing educational market or ecosystem.
(5) It introduces a teaching analytics method that proves effective for the understanding/improvement of the end-to-end teaching-learning processes in education, as well as of how compounding factors such as gender differences affect the way students rate or evaluate the teachers' performances and/or learning outcomes.
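The evaluation measures named in the abstract (precision, recall, specificity, accuracy, F1-score, and error rate) can all be derived from a confusion matrix. The sketch below assumes a binary setting with a single "positive" class; the function name `binary_metrics` and the example labels are hypothetical, and the study's recommendation scores may well be multi-class, in which case the measures would be computed per class.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix measures for a binary classifier (illustrative only)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
        "f1": ratio(2 * precision * recall, precision + recall),
        "error_rate": ratio(fp + fn, len(pairs)),
    }

# Made-up labels: 1 = "recommends the teacher", 0 = "does not recommend".
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
for name, value in binary_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.2f}")
```

When every prediction matches its true label, all five scores equal 1.00 and the error rate is 0.00, which is the pattern reported in the abstract.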

The fundamental feature of the Text mining (Altrabsheh et al., 2014; Binali et al., 2009; El-Halees, 2011; Pandey & Pandey, 2019; Wen et al., 2014) and Machine learning (Abu Alfeilat et al., 2019; Abu Zohair, 2019; Dey et al., 2016; Ghosh et al., 2020; Ofli et al., 2016; Viji et al., 2020) techniques, in both theoretical and technological terms, is that both methods can be used to understand the patterns or relationships that exist in the (educational) datasets stored in the information systems or databases that support the organizations' processes (Jones, 2019; Tur et al., 2017; van der Aalst, 2016). Text mining methods can be applied to determine the connections between the real-time processes and the intended users or stakeholders (Wen et al., 2014), whereas Machine learning methods can be used to predict (e.g., through automatic classification) (De Fortuny et al., 2013; Ofli et al., 2016) the relationships that exist between the process instances and the processes in question. Indeed, the complementary features of both methods (Text mining and Machine learning) can be harnessed to understand and foster the teaching-learning processes for the said stakeholders (HEIs, teachers, and students) based on the descriptive decision theory or concept (Baucells & Katsikopoulos, 2011; Chandler, 2017), as demonstrated in this paper.
Typically, whilst the sentiments/emotions or experiences of the users (e.g., learners) can be determined from data collected about the teachers-students interactions through the text mining technique, the machine learning technique or classification models can be used to predict what the learners' recommendation or assessment of the teachers-students interactions could be, based on the expressed/extracted sentiments or emotional valence (Abu Zohair, 2019; Bollen et al., 2011; Dey et al., 2016; Ghosh et al., 2020).

As an example, Wen et al. (2014) mined collective sentiments in the forum posts of massive open online courses (MOOCs) to monitor the students' thoughts towards the courses offered, and the result shows that sentiment analysis (text mining) can provide an effective method to engage the learners even in social settings (Bollen et al., 2011; Brinton et al., 2014; Crues et al., 2018). The outcome of the exploratory study comprised a survival model (Wen et al., 2014) that was built as a predictive/monitoring tool for determining the efficacy of certain human expressions or language behaviours (e.g., the impact of students' opinions in the MOOCs environment) based on the probability of certain events happening. Also, taking into account the connectedness between the text mining technique (e.g., sentiment analysis) and machine learning or classification models (Ofli et al., 2016), the study of Dey et al. (2016) notes that the sentiments often found in comments or feedback (e.g., SET) can be categorized by polarity (i.e., positive, neutral, or negative; Kalaivani, 2013; Litman & Forbes-Riley, 2004; Okoye et al., 2020), and then used to provide valuable pointers or indicators in connection with the purposes for which the datasets are analyzed (e.g., the advances in teaching analytics methods and/or students' evaluation of teaching described in this study). Besides, the authors (Dey et al., 2016) also used a statistical method that supports the K-nearest neighbour (KNN) (Abu Alfeilat et al., 2019; Ghosh et al., 2020; Viji et al., 2020) and Naïve Bayes (Zhou et al., 2020) supervised machine learning algorithms to capture the different word/sentence polarities and elements of the subjective styles or patterns.

Along these lines, this study shows that by extracting the different polarities or intensities of the sentiments and emotional valence expressed by the students in the SET data, we are able to ascertain and provide new and vital information about the top emotions the undergraduates (students) show when completing the SET instrument and/or rating the teachers' performance, and then provide strategies for further improvement through the machine learning and statistical analysis procedures. Consequently, we applied the k-nearest neighbour (KNN) classifier/algorithm (Abu Alfeilat et al., 2019; Ghosh et al., 2020; Viji et al., 2020) to model and make predictions about the students' recommendation scores for the teachers based on the quantified (polarized) sentiments and emotional valence.
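To make the KNN step concrete, the following is a minimal sketch of a k-nearest neighbour classifier using Euclidean distance and majority voting. It is not the study's trained model: the feature encoding (gender coded as 0/1, followed by average sentiment and emotional valence), the "high"/"low" recommendation labels, the value k = 3, and every data point are hypothetical. Features are left unscaled here, so the 0/1 gender code weighs heavily on the distance; a real pipeline would probably normalise the inputs.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical rows: (gender code, average sentiment, emotional valence).
train_X = [
    (0, 0.60, 0.50), (1, 0.55, 0.40), (0, 0.45, 0.35),
    (1, -0.50, -0.40), (0, -0.60, -0.55), (1, -0.35, -0.30),
]
# Hypothetical recommendation classes for the rows above.
train_y = ["high", "high", "high", "low", "low", "low"]

print(knn_predict(train_X, train_y, (1, 0.50, 0.45)))    # high
print(knn_predict(train_X, train_y, (0, -0.55, -0.50)))  # low
```

The choice of k is itself a tuning decision; the abstract's reference to optimal k-values under k-fold cross-validation suggests it was selected empirically rather than fixed in advance.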

Teaching analytics (TA) is a term used to describe emerging methods and technologies that support the educational processes and improve teaching or scholastic practices/pedagogies within the education domain (Ndukwe & Daniel, 2020; Romero & Ventura, 2020; Wise & Jung, 2019). Regardless of the context in which the TA methods or tools are applied, the main goal of the TA-supported methods is to facilitate some form of knowledge extraction (sense-making analytics) from the readily available educational datasets and then foster a response (actionable analytics) based on the derived information (Wise & Jung, 2019). According to Wise and Jung (2019), embedding the use of educational technologies or learning analytics tools into teaching practices to help inform instructional decisions may be a cumbersome and/or time-consuming process. For example, the authors (Wise & Jung, 2019) noted that the gap between discovering interesting patterns or knowledge in the educational data/processes and taking actionable responses or decisions is one of the most critical considerations when aiming to bridge the pedagogical support or complexities within the higher educational processes. On the other hand, TA is also allied to the concept of "business intelligence" (BI) (Ndukwe & Daniel, 2020; van der Aalst, 2016) in education, as the resultant insights drawn from the methods can be used not only to improve the teaching practices in the various HEIs' settings, but also to provide meaningful or valuable actions to drive the business operations forward, based on the evidence or insights drawn from analyzing the datasets (e.g., educational data) collected about the processes in question.
Moreover, Romero and Ventura (2020) note that the educational data mining (EDM) methods are closely related to overlapping terms in the current literature such as Teaching Analytics, Big Data in Education, Academic Analytics, Datafied or Data-Driven Education, Institutional Analytics, Educational Data Science, and Data-Driven Decision-Making in Education (Cech et al., 2018; Cerratto Pargman & McGrath, 2021; De Fortuny et al., 2013), whereas Ndukwe and Daniel (2020) note that analysis of educational datasets collected about interactions of the teachers in relation to the students' learning processes (i.e., teacher-centric learning design) is a promising way of increasing knowledge about the teaching processes and how they can be effectively sustained. This, in turn, improves the effectiveness and efficiency of the HEIs and the teaching practices, alongside other benefits such as students' success and active learning engagement, and curriculum enhancement and development (Ndukwe & Daniel, 2020).

Educational data mining (EDM) and its supported methods, which are mainly used to enhance the different organizational processes through insights drawn from educational data, have experienced significant growth and attention over the years, both in theory and in practice (Abu Zohair, 2019; Alizadeh et al., 2019; Bogarín et al., 2018; Bowdre, 2020; Cerratto Pargman & McGrath, 2021; Dommett et al., 2019; Exter et al., 2018, 2019; Romero & Ventura, 2020; Sánchez-Mena et al., 2019; Wang & Zhu, 2019). A majority of existing studies demonstrate that educational institutions are gradually becoming "data-hungry" and are thus increasingly seeking data, and the results obtained from the applied methods, for their own use or educational purposes (Clark, 2015; Romero & Ventura, 2020; Williamson, 2018). For example, Williamson (2018) noted the "Data Future" program as one of the many educational initiatives that will foster new ways of standardizing and quantifying accumulated data and information for education purposes. Also, the several "smart learning tools" and "learning analytical platforms" (Aldowah et al., 2019; Jones, 2019; Larrabee Sønderlund et al., 2019; Perrotta & Williamson, 2018), otherwise allied to the notion of "business intelligence" (Ndukwe & Daniel, 2020; van der Aalst, 2016), are now being used by the different institutions to inform the various educational-process-related decisions and strategies (De Fortuny et al., 2013).

Indeed, in modern-day educational settings, or twenty-first century education, data are now being collected and stored about the teaching and/or learning processes at an unprecedented rate, either for use in understanding the different activities or sub-processes that make up the educational processes, or for use in advancing the technical capabilities of the educational technologies and their impact at large. Such advancements, both in the development and use of the educational technologies, have resulted in what the different organizations or educators now call "technology-based education" or "educational innovation". As an example, the research by Daniel (2015) notes that digitalization of educational processes has led to a significant amount of innovation, and has inadvertently given rise to the concept of datafication or datafied-Education (Cerratto Pargman & McGrath, 2021; Perrotta & Williamson, 2018; Prinsloo, 2017; Prinsloo & Slade, 2017; Slade & Prinsloo, 2013; Williamson, 2018). Consequently, the educational institutions in question have experienced systematic transformations in their various scholastic endeavors, thanks to the new trends in educational data mining and technologies that are used to improve the way in which educational data are generated, collected, and analyzed to support the educational processes.

Most HEIs have had to integrate digital (educational) technologies into their different activities in order to ensure the quality or reliability of the underlying business models and operations (Kori et al., 2018; Lawrenz et al., 2019; Medne et al., 2020; Mourad, 2017). Moreover, studies within the EDM field have also focused on addressing some of the challenges in using those methods and technologies to provide innovative opportunities for teaching and learning in the different contexts (Abu Zohair, 2019; Alizadeh et al., 2019; Bogarín et al., 2018; Dommett et al., 2019; Exter et al., 2018, 2019; Munro, 2018; Romero & Ventura, 2020; Wang & Zhu, 2019). For instance, by comparing the performances/outcomes of learners in MOOC-based, flipped classroom, and traditional class settings, Wang and Zhu (2019) observed that the use of digital technologies in teaching is capable of supporting a high-quality and transformed educational process. Likewise, the study by Abu Zohair (2019), which used machine learning techniques to analyze educational datasets, shows that accurate prediction from students' data is not only crucial for improving the students' performance/experiences (Benkwitz et al., 2019; Crues et al., 2018; Kori et al., 2018; Weston et al., 2019), but also serves as a useful tool for promoting a university's ranking or status (Medne et al., 2020; Mourad, 2017; Tóth & Surman, 2019).

This study notes the implications of educational data mining (EDM) in terms of quality of education, innovation, and the future of education (UNESCO, 2014, 2015, 2021). For example, the research by Mayer-Schönberger and Cukier (2014) noted that the adaptation of digital technologies to education has paved the way for ground-breaking innovative systems used to drive the higher educational institutions forward. Correspondingly, the resultant models, which incorporate modern-day modes of teaching and learning such as Challenge-based learning, memorable university experience, inspiring professors, and flexibility as to how, when, and where learning takes place, have become the contemporary goals of the HEIs (TEC, 2018). Moreover, Dommett et al. (2019) opined that digital technologies for Education would stand as the practical bridge between the several activities that underlie the educational processes and the usefulness of the said educational models. Notably, Jones (2019) shows that Educators are increasingly adopting the modern educational models to track, aggregate, and analyze the students' learning behaviors or profiles that are logged in the databases of the several HEIs (Kori et al., 2018; Medne et al., 2020). The most widely adopted technologies include educational data mining (Bogarín et al., 2018; Romero & Ventura, 2013), machine learning (Abu Alfeilat et al., 2019; Abu Zohair, 2019; Dey et al., 2016; Ghosh et al., 2020; Herodotou et al., 2019a, b; Litman & Forbes-Riley, 2004; Muldner et al., 2011; Viji et al., 2020), and learning analytics (Ferguson, 2012; Ferguson & Clow, 2016; Larrabee Sønderlund et al., 2019; Mangaroska & Giannakos, 2019; Ndukwe & Daniel, 2020; Noroozi et al., 2019; Papamitsiou & Economides, 2014).
Interestingly, all of the aforementioned methods and studies share one common goal: achieving effective and high-quality education, models, and learning outcomes (UNESCO, 2015, 2021).

Likewise, this study holds that the developed educational models and technologies can provide innovative methods and practices within higher education contexts, ranging from the development and application of intelligent methodologies aimed at transforming the students' learning experiences, to empowering teaching analytics methods and processes, and what could be called the three-dimensional "expressive-communication-relational" pedagogic skills implemented within higher education settings. To this effect, the study proposes the educational process and data mining plus machine learning model (EPDM + ML) as an extension of the EPDM model proposed in Okoye et al. (2020), which proved useful in improving the teaching process and practices in HEIs, as well as in identifying the qualities and experiences that students value in their instructors, e.g., by addressing gender preconceptions in teacher-student engagement, and the method's adequacy in supporting educational process initiatives in the diaspora.

Prior studies have looked into how best to apply educational technologies and models to effectively facilitate the teacher-student learning processes, practices, and experiences (Altrabsheh, 2016; Barton & Dexter, 2020; Bowdre, 2020; Cerratto Pargman & McGrath, 2021; Crues et al., 2018; Mackness et al., 2010; Ndukwe & Daniel, 2020; Renz & Hilbig, 2020; Tondeur et al., 2020; Wen et al., 2014). Ndukwe and Daniel (2020) note that TA can help educators (e.g., teachers) to improve teaching pedagogies and learning outcomes through the provision of tools or platforms that allow them to use data to reflect on teaching. Indeed, the primary goal of TA-supported methods should be to extract meaningful information from the educational datasets that are stored at an unprecedented rate within educational databases or information systems. Such information would not normally be observable by the naked eye, but can be uncovered by applying state-of-the-art models and/or methods that reveal hidden patterns, relationships, or knowledge in the readily available datasets (Ndukwe & Daniel, 2020). For example, whereas Tondeur et al. (2020) proposed a technological pedagogical content knowledge (TPACK) method to reflect on the role of technology in education, Ndukwe and Daniel (2020) introduced what could be called a theoretical road map directed towards guiding researchers or educationalists in improving the quality of teaching and/or learning processes by engaging with educational datasets.
The outcome of their approach was a teaching outcome model (TOM) that proved useful not only for understanding the state-of-the-art methods or educational technologies related to teaching analytics and their implications for the future of education, prospects, or mechanisms in higher educational settings, but also for understanding the connection between the conceptual frameworks of teaching analytics (TA) (Ndukwe & Daniel, 2020), learning analytics (LA) (Ferguson, 2012; Herodotou et al., 2019a, b; Jones, 2019; Papamitsiou & Economides, 2014; Renz & Hilbig, 2020; Romero & Ventura, 2020), and learning design (LD) (Holmes et al., 2019; Mangaroska & Giannakos, 2019) in general. Accordingly, this study proposes the Educational Process and Data Mining plus Machine Learning model (EPDM + ML), which is built on the conceptual frameworks of TA, LA, and LD and is based on descriptive decision theory (Baucells & Katsikopoulos, 2011; Chandler, 2017). The model provides a data-focused or analytical method that is useful not only for understanding the teacher-student learning processes/outcomes in order to inform and improve the quality of teaching pedagogies, but also for creating the teaching-data-literacy or context-based analysis needed to uncover and address the different teaching-learning challenges found within higher education settings. The method (EPDM + ML) is described in detail in the next section of this paper.

Fig. 1 presents the main architecture or building blocks of the EPDM + ML model, which was applied to implement the method described in this paper.

As shown in Fig. 1, the EPDM + ML model design and implementation is described in two phases. First, we applied the main functions of the EPDM model defined in Okoye et al. (2020) to analyze the comments (textual data) provided by the students in the SET in order to deduce and quantify the average sentiment and emotional valence scores of the individual comments. The process also comprised the cleaning and filtering of the dataset to allow for the text mining and model deployment to follow. The text analysis or textual data quantification was then performed using appropriate text mining tools, packages, and libraries to extract the values used to understand the level of impact of the students' comments, or the intensity of the sentiment/emotions they expressed across the datasets. Technically, we used the R statistics tool (Rstudio, 2020), an integrated development environment that supports text mining algorithms and methods such as sentimentr, sentiment Analysis, pander, etc., to analyze the data. Thus, we extracted the sentiment scores and emotional valence of the different comments and then used the quantified results (i.e., ave_sentiment and emotional_valence) to compare their relatedness to the quantitative measure (recommendation_by_students) contained in the SET instrument, by considering the influence of, and significant differences in, the way the students recommended the teachers by gender.

For the second phase of the model (Fig. 1), we developed a machine learning classification model that predicts what a student's recommendation for the teachers may be based on the student's gender, ave_sentiment, and emotional valence scores. To do this, we trained and analyzed the extracted information (i.e., recommendation_by_students, student_gender, ave_sentiment, and emotional_valence) using the k-nearest neighbour machine learning algorithm (Abu Alfeilat et al., 2019; Ghosh et al., 2020; Viji et al., 2020; Wong & Yeh, 2019), supported by libraries such as caTools and class in R (Rstudio, 2020). Also, a cross-validation or performance evaluation measure using the confusion matrix and performance metrics (van der Aalst, 2016) was used to determine the accuracy, error-rate, specificity, precision, recall, and F1-score of the model in classifying the predicted scores.

The study makes use of data we collected from the Student Opinion Survey (ECOA, 2013) within the higher education context to conduct the series of experiments and the implementation of the EPDM + ML model described in this paper. ECOA is a SET system designed for the collection of information about students' opinions on the outcome of the different courses and teaching programs offered across the various campuses of the host institution where this research was conducted. We analyzed the SET data collected through the ECOA survey from undergraduates about their teachers' performances for the academic year 2019. The survey instrument was applied across the 26 campuses of the institution, spread across the national regions of the host country (Mexico), covering around 14 Divisions/Schools, 78 Departments, and 1082 Courses. Therefore, we assume that a wide range of the students' opinions and/or recommendations about the courses and teachers' performance is represented in the collected data. For the purpose of this study, we analyzed the textual data (comments) provided by the students in response to the question, "Why would you recommend or not recommend the teacher", the gender of the teachers and students, and the quantitative data the students provided in response to the question "Please rate your recommendation for the teacher?", which was measured on an interval scale of 0 to 10, where 0 means the lowest rating and 10 means the highest rating. It is also important to mention that, to obtain an unbiased and ethical evaluation/analysis of the available data, the names of the students who completed the survey were withheld from the data for anonymity purposes, although their gender distribution or demographic information was disclosed. The questionnaires were completed by the students at the end of their respective programs or courses.
Also, considering the privacy and ethical point of view; the students who provided the comments were informed about the purpose of the applied questionnaires, and were not directly involved in the analysis performed in this study.

Considering the validity and reliability of the SET data, we note that the ECOA instrument is an institutional survey administered and maintained by the host university, and has been used for several years by the institution for the evaluation/assessment of the teachers' performance based on answers or comments provided by the students. The instrument has been used and validated in previous studies (Hernández, 2013; Montemayor-Gallegos, 2002; Salinas & Martínez, 2018). While the comments given by the students were responses to a free-choice, open-ended question, the recommendation of the teachers was a closed-ended interval scale question between 0 and 10. The estimated minimum sample size for the study was 40 participants, which we considered to be the scientifically acceptable minimum (n > 30 or 40) (Roscoe, 1975) for conducting the different experiments and analyses in this study; the sample we used (n = 85,378) is far larger than this minimum.

Regarding the data sample and size, the dataset we used for the study includes a total sample of n = 85,378 responses, analyzed after cleaning the data and filtering out the incomplete records and the students who did not provide a comment. We noted a sample size of n1 = 45,294 for the male students and n2 = 40,084 for the female students, which we utilized throughout the series of experiments and analyses in this paper. Also, for the training and testing of the proposed machine learning model, we randomly selected a total sample of n3 = 1000 (n = 700 used as the training set, and n = 300 as the test set) to evaluate the classification process or performance metrics of the model.

The EPDM + ML model implementation and analysis was carried out to:

• Determine the average sentiment and emotional valence of the individual comments given by the students in the SET.
• Determine the marginal mean differences and/or influence that the average sentiment and emotional valence scores have on the students' recommendation of the teachers, and how the results differ by considering the gender differences.
• Determine the extent or capability of the machine learning classification model being able to predict what the students' recommendation for the teachers would be by considering the students' gender, sentiment, and emotional valence displayed in the data or comments.

• Text mining or Sentiment analysis: Used to extract the intensities (polarity) of the comments provided by the students in the SET through polarization or textual data quantification.
• Analysis of covariance (ANCOVA) and Kruskal-Wallis test: Used to determine the marginal means of effects that the extracted sentiments and emotional valence have on the students' recommendation for the teachers, and how the results differ by considering the gender construct.
• K-nearest neighbor (KNN): Classification algorithm or predictive model used to predict what the students' recommendation for the teachers may be by considering the students' gender, sentiment, and emotional valence scores.

For the data analysis and implementation of the various phases of the EPDM + ML model as defined in Fig. 1, the study used the text mining method (EPDM) to extract the values or scores representing the polarity (intensities) of the average sentiment and emotional valence expressed by the students in the comments (textual data), using the relevant packages, functions, automation, and quantification methods in R. It is worth mentioning that the texts (comments) provided by the students were mainly in Spanish and were analyzed in their original form. However, we report the results and outcomes of the experiments in English to reach a wider international audience and to serve the educational objectives of this study.

The study implemented the first phase of the EPDM + ML model by determining the intensities of the students' comments towards the teachers as contained in the ECOA SET by using the EPDM method, a text mining technique previously proposed in Okoye et al. (2020). Typically, as shown in Tables 1, 2 and 3, the outcome of the method (EPDM) is a set of quantified or polarized values that denote the intensities of the different comments provided by the students, using positive (+), neutral (0), and negative (-) connotations to represent the scores (Litman & Forbes-Riley, 2004; Okoye et al., 2020). Values with positive (+) sentiment and/or emotional valence scores represent an attractive sentiment/emotion, whilst negative (-) scores signify an aversive sentiment/emotion. The zeros represent sentiment/emotions that are classified as neutral (0), and thus have no emotions or sentiment attached to them. It is also important to mention that by sentiment scores we refer to the average or impact of the different individual comments provided by the students, whereas the emotional valence is obtained by summing up the scores of the words that the model has identified as terms that can be used to express an emotion in the texts (comments). The sentiment analysis we performed to extract the scores for the different comments was carried out based on the students' gender distribution. The results are shown in Table 1 and summarized in Table 2 and Fig. 2.
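The distinction between the two quantities (an average for the sentiment score, a sum for the emotional valence) can be illustrated with a minimal Python sketch. This is not the R pipeline used in the study: the toy lexicon, the tokenizer, the normalization by the square root of the sentence length, and the function names are all assumptions introduced only for illustration.

```python
import math
import re

# Hypothetical toy lexicon (word -> polarity). It is NOT the lexicon used in the study.
TOY_LEXICON = {"excellent": 1.0, "fun": 0.75, "clear": 0.5,
               "boring": -0.75, "confusing": -1.0}

def tokenize(text):
    """Lower-case the text and keep alphabetic tokens (including Spanish accents)."""
    return re.findall(r"[a-záéíóúñü']+", text.lower())

def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def sentence_polarity(sentence):
    """Sum of word polarities, scaled by sqrt(word count) (assumed normalization)."""
    words = tokenize(sentence)
    if not words:
        return 0.0
    return sum(TOY_LEXICON.get(w, 0.0) for w in words) / math.sqrt(len(words))

def ave_sentiment(comment):
    """Average of the sentence-level polarities of one comment."""
    sentences = split_sentences(comment)
    if not sentences:
        return 0.0
    return sum(sentence_polarity(s) for s in sentences) / len(sentences)

def emotional_valence(comment):
    """Sum of signed unit scores (+1 / -1) over the emotion-bearing words found."""
    scores = (TOY_LEXICON[w] for w in tokenize(comment) if w in TOY_LEXICON)
    return sum((s > 0) - (s < 0) for s in scores)

comment = "Excellent and fun class, but sometimes confusing."
print(ave_sentiment(comment), emotional_valence(comment))
```

Under these assumptions, a comment with no lexicon words receives a score of zero on both measures, which corresponds to the neutral (0) class described above.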

As shown in Tables 1 and 2, and Fig. 3, we made use of the sentiment_by, get_sentiment, and get_nrc_sentiment functions in R (Rstudio, 2020) to establish the word counts and the average_sentiment scores for each comment in the data, including the standard deviations. The sentiment scores were represented as interval values between -1 and 2 (Fig. 2), denoting the levels of intensity or impact of the comments. In Table 1, we made use of the first and last five comments in the analyzed data to explain the results of the method; comments with a positive (+) interval value signify a positive (good) sentiment, whereas comments with a negative (−) value represent an aversive (bad) one. In total, we found that the male students expressed the most negative sentiment (comment), with a minimum (min) value of −0.716, compared to the min (−0.707) expressed by the females (Table 2). Nevertheless, the male students also expressed the most positive sentiment, with a maximum (max) value of 1.395, in contrast to their female counterparts, who expressed a max average of 1.223.

Furthermore, to quantify the levels of emotional valence of the comments provided by the students by gender, we applied the get_nrc_sentiment function, supported by the pander method or algorithm in R, to extract the scores for each comment provided by the students based on the gender differences. Technically, the get_nrc_sentiment function works by obtaining and quantitatively labeling the intensities of the words that can be used to express an emotion in the texts, using positive (+), neutral (0), and negative (-) values (Litman & Forbes-Riley, 2004; Okoye et al., 2020) to represent each relevant word it finds in each case. The results of the method are shown in Table 3 and Fig. 3.

In Table 3, we show the emotional valence scores of the first 120 comments. The Comments column, [1] to [106], gives the id of the first individual comment within each row. The comments with positive (valence) scores represent an attractive emotion, whilst the negative scores signify an aversive valence. The zeros represent comments that are classified as neutral, i.e., with no emotions attached or in which no words that can be used to express emotions were found. We note, as shown in Table 3 and Fig. 3, that the valence scores for the male students ranged between −4 and 6, whereas those found for the females ranged between −4 and 13. As an example, we show the specific comments with the min and max emotional valence found for the different genders, including an example of the neutral comments, as follows:

Male student comment [15043] Positive (Max) Valence Score: 6 > "Excellent professor. He [teacher] not only shows his mastery but also his love for literature in class. Very fun class, challenging due to the time management needed to complete the assignments, but fun and relaxing either way."

Female student comment [7758] Positive (Max) Valence Score:13 > "I'd recommend her [teacher] because her classes were full of "hands-on" activities, examples and interesting information, and she explain very clearly theoretical information, and didn't asked us to memorize but to learn. I liked that her quizzes were short but continuous, and just about relevant information. Seriously, I discovered the huge passion I have for art, and she inspire me to get deeper and learn by myself more about the background of art and artists; this changed my perception of art, from just appreciating it to analyzing and feeling it. As well, I felt the confidence to approach and ask for help when I needed; once I received a word of support from her, that helped me a lot".

Male student comment [43944] Negative (Min) Valence Score: − 4 > "He is a teacher who has a great knowledge of the subject, however I think that he does not manage to transmit that knowledge to the students at all and that makes one lose interest in the subject and only take the subject simply by passing it even if one does not understand the subject. everything seen in class".

Female student comment [38731] Negative (Min) Valence Score: − 4 > "Excellent teacher, however sadly the course has lost my interest since it is not a dynamic class, or with additional material to that offered by the book".

Male student comment [1] (Neutral) Valence Score: 0-> "Because he knows a lot about the subject, they are things that will serve us in the future, in addition, his way of teaching is very specific which makes the class interesting".

Female student comment [2] (Neutral) Valence Score: 0-> "She explains very well, has a lot of patience".

In Figs. 4, 5 and 6 and Table 4, we report the overall emotions of the students about the teachers, broken down by the gender of the teachers and students. We looked at both genders (teachers and students) and the emotions that the students deem crucial in the SET evaluations. Principally, we applied the sentiment/emotion classifications within the educational domain as noted in Litman and Forbes-Riley (2004) and Okoye et al. (2020) in the results.

Having established the polarity (intensities or textual data quantification) of the different comments provided by the students in the SET, the study turned its attention to determining the effect that the students' gender has on the recommendation of the teachers based on the linearity of the independent pairwise comparisons (i.e., the estimated marginal means), controlling for the ave_sentiment and emotional_valence expressed by the students, using an Analysis of Covariance (ANCOVA) test (Alao et al., 2019). In other words, we determined the effect that the students' gender has on the recommendation of the professors whilst also controlling for (taking into account) the impact or influence of the covariates (students' average_sentiment and emotional_valence) on the resultant outcomes. We believed that the covariates might explain some of the differences in the marginal means of the recommendations given to the teachers by the students when considering the students' gender; for instance, whether the varying negative/positive sentiments and emotional valence described in the earlier section (text mining) may lead to higher or lower recommendations by the students, taking into account the gender differences of the students. Hence, we analyzed the contributing effect that the students' average sentiment and emotional valence may have on the outcome of the recommendations using the ANCOVA method, and then conducted a Kruskal-Wallis test (Elliott & Hynan, 2011; Frey, 2018) to determine where the significant differences may lie by considering the different genders. To do this, we assume that there is homogeneity between the covariates and the students' gender. The results of the ANCOVA and Kruskal-Wallis tests are reported in Tables 5, 6, and 7. As gathered in Table 5, the result of the ANCOVA test shows that there is a significant effect of the students' gender on the recommendation of the teachers (p = .048).
This means that the students' gender plays a part in the recommendation scores given to the teachers, and that the scores vary by gender. Also, when we took into consideration the influence (covariance) that the ave_sentiment and emotional_valence of the students played in the recommendation of the teachers, we found that whereas the ave_sentiment (p = .000) expressed by the students contributed to the test outcome, the students' emotional_valence (p = .376) does not influence their recommendation of the teachers. Nevertheless, when analyzing the effect of the controlled independent variables (Ave_sentiment*Emotional_valence) while taking into account the influence of the uncontrolled independent variable (Student_Gender), i.e., Student_Gender*Ave_sentiment*Emotional_valence (see Table 5), we found that the overall mean effect of all the combined factors was significant (p = .001).

Having found that there is a significant difference in the students' gender and the average sentiment of the students when they recommend the teachers; we deemed it necessary to conduct a Kruskal-Wallis test to help determine where the significant differences may lie between the genders. The result of the method is as reported in Table 6 . As reported in Table 6 , we found that the significant differences in terms of the recommendation of the teachers when analyzed by the students' gender, alongside the impact of the ave_sentiment and emotional_valence of the students when doing so; is observed for the female students (p = .000). This means that the female students significantly take into account the sentiment when recommending the teachers, whereas their male counterparts do not (p = .124). In any case, both genders (male p = .164, female p = .834) do not consider their expressed emotions when recommending the teachers.
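For readers unfamiliar with the Kruskal-Wallis test, the following minimal Python sketch computes its H statistic (with the usual tie correction) from first principles. It is only an illustration of the technique, not the statistical software used in the study; the function names and the example groups are hypothetical, and the p-value (obtained by comparing H against a chi-square distribution with one fewer degree of freedom than the number of groups) is deliberately left out.

```python
from collections import Counter

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of groups, corrected for ties."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tied groups.
    ties = sum(c ** 3 - c for c in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0

# Hypothetical recommendation scores for two groups of students (illustrative only).
group_a = [10, 9, 8, 10, 7]
group_b = [6, 8, 5, 9, 4]
print(kruskal_wallis_h([group_a, group_b]))
```

A larger H indicates that the rank distributions of the groups differ more strongly; identical groups yield H = 0.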

Finally, we checked to see if the significant results as presented in Table 6 differ by considering the gender of the teachers. The result of the method is presented in Table 7 .

In Table 7 , we found that the marginal means effect of the average sentiment for female students, as explained in Table 6 , was significant for both genders of the teachers (male teachers p = .009, female teachers p = .001). Moreover, another interesting finding is the fact that the male students appeared to be borderline in terms of the sentiment and emotions when recommending the teachers; with the male students attaching emotional_valence of p = .056 for the male teachers, and ave_sentiment of p = .077 for the female teachers, respectively.

To implement the second component of the EPDM + ML approach, we developed a machine learning classification model that predicts what a student's recommendation for the teachers would be based on the student's gender, average sentiment, and emotional valence parameters. As defined in the steps or procedures in Algorithm 1, we trained the model with the students' gender (Gd), the recommendation scores given by the students in the SET (Rec), the extracted average sentiments (ave_sentm), and the emotional valence scores (EV) (see Tables 1 and 2), using the k-nearest neighbour (KNN) algorithm (Abu Alfeilat et al., 2019; Ghosh et al., 2020; Viji et al., 2020; Wong & Yeh, 2019) in R statistics (Rstudio, 2020).

As gathered in Algorithm 1, the EPDM + ML model, which can be applied to analyze any given educational data, especially students-generated datasets (SET), functions as follows. First, the captured dataset (ED) is imported into the integrated development environment, and then the EPDM method/functions (Line 6) (Okoye et al., 2020) are applied to the (textual) data, in this case the comments provided by the students, to extract the average sentiment (ave_sentm) and emotional valence (EV) scores (Line 7). The next steps, defined in Lines 9 to 11, involve the creation of the training (df.TR) and test (df.TS) datasets used for the model predictions and classification process by concatenating the considered variables, which are stored as an object we called ML, i.e., c(Rec, Gd, Ave_sentm, EV) (see Line 9). In Lines 12 to 16, the defined dataframes or objects (df.TR and df.TS), containing the values of the training and test sets, respectively, are analyzed using the k-nearest neighbour method, knn(), and in turn the outputs (ConfusionMatrix, precision, recall, specificity, accuracy, error-rate, F1-score) are returned in Line 17. It is worth mentioning that the (ML) input dataset (n = 1000) was randomly selected from the data sample, which contained the student genders and the recommendation variables, with the training set (df.TR) consisting of n = 700 cases (i.e., 70% of the input dataset) and the remainder used as the test set (df.TS) (n = 300). For the experimentation, we set the value of k (k-value) to be k = sqrt(n) (Cover, 1968), where n = 1000. The results of the predictions or classifications by the model are represented in the confusion matrix (Ariza-López et al., 2019; van der Aalst, 2016) (see Table 8), and in Tables 9 and 10, respectively.
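The steps described above (a random 70/30 split, k set near the square root of n, and a majority vote among the k nearest cases) can be sketched in plain Python as follows. This is an illustrative sketch only, not the R implementation of Algorithm 1: the data are synthetic, the labelling rule is hypothetical, the function names are invented for this example, and taking the integer part of sqrt(1000) to obtain k = 31 is an assumption.

```python
import math
import random
from collections import Counter

def split_train_test(rows, train_fraction=0.7, seed=42):
    """Shuffle the rows and split them into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def knn_predict(train_rows, query_features, k):
    """Majority vote among the k training rows closest (Euclidean) to the query."""
    nearest = sorted(train_rows,
                     key=lambda row: math.dist(row[0], query_features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Synthetic stand-in data: (gender, ave_sentiment, emotional_valence) -> score.
# The labelling rule below is hypothetical and only makes the example learnable.
rng = random.Random(0)
rows = []
for _ in range(1000):
    gender = rng.choice([0, 1])
    sentiment = rng.uniform(-1.0, 1.5)
    valence = rng.randint(-4, 6)
    label = 10 if sentiment > 0.25 else 5
    rows.append(((gender, sentiment, valence), label))

train, test = split_train_test(rows)   # 700 training rows, 300 test rows
k = math.isqrt(len(rows))              # integer part of sqrt(1000), i.e. 31
predictions = [knn_predict(train, features, k) for features, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(k, len(train), len(test), accuracy)
```

Because Euclidean distance is sensitive to the scale of each feature, a practical implementation would usually standardize the features first; that step is omitted here for brevity.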
In Table 9, we report the details of the optimal value of k (i.e., knn31), based on the closest k = sqrt(n), in order to illustrate how the model predicts each of the recommendation scores given by the students in each run test of the model. As shown in the confusion matrix or performance metrics table (Table 8) and Table 9, the recommendation scores by the students were denoted from 0 to 10, with 0 being the lowest and 10 being the highest rating. Furthermore, in Table 10, we conducted a k-fold cross-validation (Dehghani et al., 2019; Wong & Yeh, 2019; Xiong et al., 2020) to determine the performance of each input run test of the knn model. To do this, we determined the closest k-values for the model by taking the square root of the size of the input dataset (i.e., k = sqrt(n), where n = 1000), and then executed, in each run test for the predictions, the k-values closest to the square root of n, i.e., k = 26 to k = 36, as shown in Table 10.

Table 8 Confusion matrix (performance metrics) for optimal k-value, knn = 31. Accuracy = 1.00 (100%), 95% CI = (0.98, 1), p <.00; Cohen's Kappa (expected at 0.9) = 1.00.

Accordingly, we utilized the results of the classification process as shown in the confusion matrix (Table 8), and Tables 9 and 10, to calculate the precision, recall, specificity, accuracy, F1-score, and error-rate of the model, respectively, whereby the cross-validation or performance metrics are defined as follows (van der Aalst, 2016):

• TP: number of true positives; representing the instances of the scores that were correctly classified as positive
• TN: number of true negatives; representing instances of the scores that were correctly classified as negative
• FP: number of false positives; representing instances of the scores that are predicted to be positive but should have been classified as negative
• FN: number of false negatives; representing instances of the scores that were predicted to be negative but should have been classified as positive.
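Because the recommendation scores form more than two classes, the four counts defined above are typically obtained per class in a one-vs-rest fashion. The Python sketch below shows one way to do this; the function name and the small 3 × 3 matrix are hypothetical and are not taken from Table 8.

```python
def one_vs_rest_counts(matrix, cls):
    """Return (TP, TN, FP, FN) for class index `cls`.

    matrix[i][j] is the number of cases whose actual class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in matrix)
    tp = matrix[cls][cls]
    fn = sum(matrix[cls]) - tp                  # actual cls, predicted otherwise
    fp = sum(row[cls] for row in matrix) - tp   # predicted cls, actually otherwise
    tn = total - tp - fn - fp
    return tp, tn, fp, fn

# Hypothetical 3-class confusion matrix (illustrative only).
example = [
    [5, 1, 0],
    [2, 3, 1],
    [0, 0, 4],
]
tp, tn, fp, fn = one_vs_rest_counts(example, 1)
print(tp, tn, fp, fn)
```

When all off-diagonal cells are zero, as in a perfectly classified test set, FP and FN are zero for every class.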

8 Performance metrics: precision, recall, specificity, accuracy, error-rate, and F1-score

The closest k-values for the ML input dataset (n = 1000) were used for calculating the performance of the output scores using a k-fold cross-validation method (Dehghani et al., 2019; Wong & Yeh, 2019; Xiong et al., 2020). This was done in order to assess and validate the performance of the knn model (see Table 9). The outcome of each run test (knn26 to knn36) of the model is represented in Table 10. We found that the majority of the k-values (i.e., knn26 to knn32) used in the run tests of the model yielded high performance, representing a total of 63.6% (7 out of 11 models) of the executed parameters (see Table 10). Moreover, the remainder of the run tests (36.4%) also presented a high accuracy and acceptable levels of the performance measures (see Table 10). As gathered in Tables 8, 9 and 10, the precision of the knn model, otherwise known as the positive predictive value, i.e., (TP)/(TP + FP), determines what proportion of the predicted scores tallies with the actual scores contained in the dataset. As an example, we note in Table 9, for the closest optimal k-value knn31 (k = sqrt(n)), also reported in Table 10, that the precision value of 1.00 indicates that all the predicted scores matched the actual scores. In other words, given the high precision value (1.00), there is empirical evidence that every recommendation score predicted by the model on the test set was the correct score.

On the other hand, recall (sensitivity) describes what proportion of the students' recommendations (scores) was correctly identified by the model when considering the students' gender, ave_sentiment, and emotional_valence. For example, as reported in Table 9 for the knn36 optimal value for the model, the recall, also known as the true positive rate, i.e., (TP)/(TP + FN), equals 1.00 (100%), which means that the model did not miss any of the actual scores (i.e., there were no false negatives) when predicting the recommendation scores.

Accordingly, the specificity represents the true negative rate, i.e., (TN)/(TN + FP), calculated, for instance, as the proportion of the students' gender that were actually classified as either male or female. Given the specificity score of 1.00 (100%), which is again very high, we accept that those classifications by the model were correct.

Therefore, from the sensitivity (recall) result (1.00), we know that if the model predicts what a students' recommendation score for the teachers would be based on the students' gender, average sentiment, and emotional valence, then the predicted score is presumably correct. Whereas, considering the specificity (1.00), the result shows that if the model estimates or outputs what a students' score is; taking into account the students' gender, then there is a 100% good chance that the score is truly correct.

Finally, we determined the accuracy of the model, which represents the total number of correct predictions by the model (found on the diagonal-axis values of the confusion matrix; see Table 8) divided by the total number of cases in the test dataset used to predict the scores (n = 300), i.e., (TP + TN)/(TP + TN + FP + FN) => (165 + 28 + 25 + 16 + 9 + 15 + 9 + 8 + 9 + 6 + 10)/300, which equals 1.00 (100%). The error-rate, which represents how often the model classification is wrong, i.e., (FP + FN)/(TP + TN + FP + FN) => (0 + 0)/300, equals 0.00 (0%). We also combined the precision and recall into an F1-score, which is the harmonic mean of the precision and recall, i.e., (2 × Precision × Recall)/(Precision + Recall) => (2 × 1.00 × 1.00)/(1.00 + 1.00), which equals 1.00 (100%). Moreover, the Cohen's kappa coefficient (p expected is 0.9) (Carpentier et al., 2017), which measures how good the model predictions are compared to random guessing or assignment, equals 1.00.
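The arithmetic above can be reproduced with a short Python sketch that uses only the diagonal counts quoted in the text and the stated test-set size (n = 300). The variable and function names are illustrative, and the kappa computation assumes that all off-diagonal cells of the confusion matrix are zero, as implied by the reported error-rate of 0.

```python
# Diagonal (correctly classified) counts as quoted in the text, test set n = 300.
diagonal = [165, 28, 25, 16, 9, 15, 9, 8, 9, 6, 10]
n_test = 300

accuracy = sum(diagonal) / n_test   # (TP + TN) / all cases
error_rate = 1 - accuracy           # (FP + FN) / all cases

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def cohens_kappa_diagonal(diag, n):
    """Cohen's kappa when every case lies on the diagonal (assumed here).

    Row and column totals then equal the diagonal counts, so the expected
    agreement is sum(d^2) / n^2.
    """
    observed = sum(diag) / n
    expected = sum(d * d for d in diag) / (n * n)
    return (observed - expected) / (1 - expected)

print(accuracy, error_rate, f1_score(1.00, 1.00),
      cohens_kappa_diagonal(diagonal, n_test))
```

The diagonal counts sum to exactly 300, which is why the accuracy is 1.00 and the error-rate is 0.00 for this run.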

Thus, given the high and acceptable values of the precision, recall, specificity, accuracy, and F1-score, together with the low error rate of the model, which were observed for the majority of the k-values (63.6%) in the cross-validation of the method, we concluded that the KNN textual data classification approach using the EPDM + ML is a good predictor and an efficient method to determine what the students' recommendation scores for the teachers would be, taking into account the students' gender, average sentiment, and emotional valence parameters.

This study introduced the EPDM + ML model (Fig. 1), as an extension of the educational process mining and data mining model proposed in Okoye et al. (2020). This was done to show the technical and scholastic ways to utilize evidence drawn from educational datasets to inform and improve the teaching quality and performance for the stakeholders (teachers and students). The model (EPDM + ML) was developed through the amalgamation of Text mining and Machine learning techniques, which we grounded on the descriptive decision theory (Baucells & Katsikopoulos, 2011; Chandler, 2017) that studies the rationale behind the decisions that users (e.g., students) are disposed to make, by means of textual data quantification and statistical analysis. Studies that have looked into text mining methods (e.g., sentiment analysis) and their main applications within different contexts have shown that machine learning techniques can be a good predictor of the students' feedback and/or recommendation of the teachers' performances or outcomes (Abu Zohair, 2019; Altrabsheh et al., 2014; De Fortuny et al., 2013; Dey et al., 2016; Litman & Forbes-Riley, 2004; Ofli et al., 2016). For instance, whereas Altrabsheh et al. (2014) applied machine learning algorithms such as Naive Bayes (NB), Complement Naive Bayes (CNB), Maximum Entropy (ME), and Support Vector Machines (SVM) to analyze real-time students' feedback, this study employs the k-nearest neighbour (KNN) machine learning method (Abu Alfeilat et al., 2019; Ghosh et al., 2020; Viji et al., 2020; Wong & Yeh, 2019), which has shown its usefulness for classification problems, to predict the students' evaluation of teaching and their recommendations for the teachers within the setting of higher education.
In practice, the main purpose of a machine learning technique such as the one defined in this paper (EPDM + ML) is to predict what can happen across the dataset by learning the characteristics and/or relationships of some subsets or analyzed variables/parameters. Thus, by determining the number of process instances executed in terms of the k-values (see Tables 8 and 9) in the available SET data, and by referencing the quantified variables (i.e., ave_sentiment and emotional_valence), we predicted the students' recommendation scores for the teachers based on the most frequently occurring labels among the k nearest neighbours (Abu Alfeilat et al., 2019; Abu Zohair, 2019; Dey et al., 2016). The results show that the KNN text classification model (EPDM + ML) is a good predictor and useful technique that can be used to determine what the students' recommendation scores for the teachers would be, taking into account the students' gender, average sentiment, and emotional valence, as reported and explained in detail in the earlier section.
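The majority-vote rule described above (predict the most frequent label among the k nearest training instances) can be sketched in a few lines. This is a generic illustration rather than the authors' pipeline: the Euclidean distance, the 0/1 coding of gender, the function name `knn_predict`, and all data values are assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Pair each training label with its Euclidean distance to the query point.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequently occurring label among the k nearest instances wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training instances: (gender coded 0/1, ave_sentiment, emotional_valence).
train_X = [
    (0, 0.5, 0.4),
    (0, 0.6, 0.5),
    (1, -0.4, -0.3),
    (1, -0.5, -0.2),
    (0, 0.0, 0.0),
]
# Hypothetical recommendation scores for the instances above.
train_y = [10, 10, 3, 3, 8]

high_score = knn_predict(train_X, train_y, (0, 0.55, 0.45), k=3)
low_score = knn_predict(train_X, train_y, (1, -0.45, -0.25), k=3)
print(high_score, low_score)
```

In a sketch like this the unscaled 0/1 gender feature dominates the distance, so in practice the features would usually be standardized before the neighbours are ranked.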

It is important to mention, from the findings on how the students evaluate their teachers' performance and recommendation through the ECOA SET survey (Tables 1 and 3, and Figs. 2 and 3), that a greater proportion of the students' comments that we analyzed (n = 85,378) were considered to be neutral, i.e., had a score equal to zero, and that the comments which did show some average sentiment or emotional valence were largely interpreted to be positive in nature. Thus, for the sentiment analysis, 76.4% (65,219 out of 85,378) of the comments were classified as neutral, whilst for the emotional valence, 88.2% (75,280 out of 85,378) were classed as neutral. On the other hand, for the comments which contained either positive or negative sentiment (20,159 out of 85,378; 23.6%) or emotional valence (10,098 out of 85,378; 11.8%), we found that the female students recommended the teachers by taking into account the sentiments (p < .001), whilst the male students appeared to be borderline in terms of the emotions and sentiment when doing so, with the closest non-significant values being p = .056 for emotional valence and p = .077 for average sentiment. Also, when considering the gender of the students in terms of the average sentiment scores, we note for the neutral comments (i.e., comments with zero values) that there were 35,296 out of 45,294 (77.9%) for the males and 29,923 out of 40,084 (74.7%) for the females, whereas for the remaining comments, which contained some form of sentiment, there were 9,998 out of 45,294 (22.1%) for the males and 10,161 out of 40,084 (25.3%) for the females. Likewise, considering the students' gender in terms of the emotional valence scores, the males showed a total of 89.2% (40,394 out of 45,294) and the females 87.0% (34,886 out of 40,084) for the neutral comments, whereas for the positive/negative emotional valence scores, they showed a total of 10.8% (4,900 out of 45,294) for the males and 13.0% (5,198 out of 40,084) for the females.
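The proportions reported in this paragraph follow directly from the comment counts. As a quick consistency check, the sketch below recomputes them from the counts quoted in the text; the dictionary layout and the helper `pct` are illustrative choices, not part of the study's code.

```python
# Comment counts quoted in the text, split by the students' gender.
counts = {
    "male": {"total": 45294, "neutral_sentiment": 35296, "neutral_valence": 40394},
    "female": {"total": 40084, "neutral_sentiment": 29923, "neutral_valence": 34886},
}


def pct(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


summary = {}
for gender, c in counts.items():
    summary[gender] = {
        "neutral_sentiment_pct": pct(c["neutral_sentiment"], c["total"]),
        "nonneutral_sentiment_pct": pct(c["total"] - c["neutral_sentiment"], c["total"]),
        "neutral_valence_pct": pct(c["neutral_valence"], c["total"]),
        "nonneutral_valence_pct": pct(c["total"] - c["neutral_valence"], c["total"]),
    }

total_comments = sum(c["total"] for c in counts.values())
overall_neutral_sentiment_pct = pct(
    sum(c["neutral_sentiment"] for c in counts.values()), total_comments
)
overall_neutral_valence_pct = pct(
    sum(c["neutral_valence"] for c in counts.values()), total_comments
)

for gender, shares in summary.items():
    print(gender, shares)
print("overall", overall_neutral_sentiment_pct, overall_neutral_valence_pct)
```

The per-gender counts add up to the overall totals (85,378 comments, of which 65,219 are sentiment-neutral and 75,280 are valence-neutral), so the gender breakdown and the overall figures are mutually consistent.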

Concerning the implications of this study, both in practice and in the wider spectrum of scientific research and educational technologies, we note that prior studies have looked at the effect that educational technologies, such as teaching analytics methods, have on the teaching perspectives and experiences of the students (Boring, 2017; Engen, 2019; Gallego-Arrufat et al., 2019; Gomes & Ma, 2020; Gordillo et al., 2019; Ndukwe & Daniel, 2020; Silva et al., 2019). Whereas some of the existing studies argued that the students' evaluation of teaching (SET) may not necessarily be the most effective way of determining and assessing the teachers' teaching performance (Boring, 2017; Gomes & Ma, 2020), other studies have highlighted the early indicators or success factors that have been achieved over the years, particularly through the use of educational technologies and data to support the teaching-learning processes and development (Bowdre, 2020; Clark et al., 2020; Engen, 2019; Hilliger et al., 2020; Kori et al., 2018; Oyedotun, 2020; Raffaghelli et al., 2020; Silva et al., 2019). Notably, Engen (2019) mentioned that there is now, more than ever, a need for ample understanding of the emerging methods and digital technologies, in terms of what can be called the cultural-to-social aspects of their use by educators, in fostering effective teacher-student learning processes and experiences (Dimitriadis et al., 2021). Ndukwe and Daniel (2020), in turn, explored the broad conception of the usefulness and importance of teaching analytics (TA) within the Education domain. Their review study (Ndukwe & Daniel, 2020) focused on establishing a framework to help describe and inform the different aspects of TA in education by developing a model that allows educationalists to gain further insights into how the method (TA) can help the stakeholders (e.g., teachers) to advance or enhance the several teaching dimensions, pedagogies, and outcomes in practice.

On the other hand, Gomes and Ma (2020) argued that through engagement, for instance, by measuring helpfulness and students' expectations, educators may find an alternative to SET, with implications for practice within the different contexts or educational domains. In theory, the authors (Gomes & Ma, 2020) note that alternative methods to students' evaluation of teaching must involve observing or studying the students' emotional state or affective outcomes, thus engaging the disconfirmation paradigm, by taking helpfulness (e.g., emotional wellbeing of the students or educational support provided beyond the traditional classroom settings) to mean overall satisfaction for the said stakeholders (educators, teachers, students, etc.). In the same vein, studying the sentiment and emotional valence (intensities) of the comments provided by the students in SET towards the teachers' recommendations, which forms part of the main contributions of this paper, is deemed to be a useful method towards achieving, both in theory and in practice, the aforenoted objectives and alternatives to SETs. This is particularly so in light of, and in the aftermath of, the recent Covid-19 outbreak that has impacted the teaching-learning processes (IEEE, 2020b; Viner et al., 2020) and the contingency plans by the Educators (Bao, 2020; Kummitha, 2020; Lin & Wang, 2021; Ma et al., 2021; Woolliscroft, 2020) for ensuring that the students are learning effectively through the several educational technologies that are used to facilitate continuous teaching and learning, and the students' wellbeing in the diaspora.
Moreover, the results of our study show that an adequate understanding and analysis of the different factors, such as the sentiments and emotions expressed by the students, and of how to leverage that information not only to understand how the students evaluate the teachers by considering gender differences or preconceptions, but also to predict what the students' recommendation of the teachers' performance or assessment scores would be, stands as a major contribution as it concerns the efficient application, monitoring, and management of the teacher-student learning processes and experiences, and their socio-technical and general well-being (Al-Maskari et al., 2021; Çevik & Bakioğlu, 2021; Dimitriadis et al., 2021; Garcez et al., 2021; Petersoni et al., 2018; Rapanta et al., 2020; UNESCO, 2020).

Furthermore, the main drivers for this research and the series of experiments conducted in this paper are as follows. First, there is now, more than ever, an increasing need to recuperate or reinstate the teachers' and students' learning processes/experiences through emerging and innovative (state-of-the-art) methods such as TA, following the rapidly changing educational environment, curricula, and ecosystem, or what could be called the post-Covid-19 Education era (Bao, 2020; IEEE, 2020a, b; UNESCO, 2020, 2021; Woolliscroft, 2020). Second, data about the students' evaluations of teaching (SET) are now captured and stored at an unprecedented rate within the several educational information systems and databases, and can be leveraged to provide adequate measures or solutions not only to monitor, but also to foster and/or ensure that the students are learning effectively. This has arguably become inevitable in the post-Covid-19 era, particularly in connection with the many educational technologies that are being put in place by the different HEIs to help foster the teaching-learning processes for the stakeholders. In the same vein, we introduced the EPDM + ML model and its underlying analysis and implementation to help bridge the identified gaps and challenges both in literature and in practice. Besides, the method (EPDM + ML) can be adopted by Educationalists, Process innovators, Technologists, and Policy-makers in the preparation and/or advancement of the several educational activities and initiatives that underlie the present-day teaching and learning processes, as well as in the provision of valuable and effective support for the said stakeholders at large (e.g., the teachers, students, educational community, etc.).

Data-driven methods, such as the EPDM and Machine learning technique presented in this study, cannot be fully described without placing emphasis on the concept of datafication (Cerratto Pargman & McGrath, 2021; Prinsloo, 2017; Raffaghelli et al., 2020; Renz & Hilbig, 2020; Slade & Prinsloo, 2013; Webb et al., 2018). The "datafication" theory or practice acknowledges different ethical considerations surrounding the outcomes or results of data mining and/or machine learning techniques. Thus, the study deemed it necessary to discuss some of the related ethical implications of the applied methods, particularly as it concerns the socio-technical perspective on data usage within the educational context, which also underlies the method of this paper. For example, existing studies that looked into the ethical challenges and perceptions in the use of data within the higher education context (Slade & Prinsloo, 2013) emphasized the need to collect and analyze educational datasets under conditions that ensure trust among the different stakeholders (e.g., HEIs, teachers, students, etc.). Accordingly, the various steps and stages of analysis and handling of the educational data utilized for this study were performed within the social structure and moral purposes/standpoints of the technical expertise (Perrotta & Williamson, 2018; Prinsloo, 2017; Prinsloo & Slade, 2017; Slade & Prinsloo, 2013). Moreover, the resultant predictions and conclusions of the study have been made bearing in mind the veracity and variability of the captured datasets, and the need for appropriate procedures for carrying out the data-driven segmentation and diversification (Perrotta & Williamson, 2018).
These also comprised taking into account the ethical pedagogies requiring the higher educational institutions to provide contextualized technical solutions that hypothetically aim to improve the effectiveness and quality of the teaching-learning processes or practice in the diaspora (Slade & Prinsloo, 2013). Therefore, whilst the socio-technical standpoint of the different higher education institutions and the underlying (methodological) algorithmic decision-making (De Fortuny et al., 2013) and recommendation systems (Prinsloo, 2017; Prinsloo & Slade, 2017), such as the EPDM + ML model proposed in this paper, tend to offer a huge potential, the study urges that we must also recognize the ethical challenges and risks that accompany the data-driven methods or datafied Education per se (Cerratto Pargman & McGrath, 2021; Hilliger et al., 2020). There is no harm, but rather an ample opportunity for more in-depth studies to emerge, when we acknowledge the threats and implications of applying the new and emerging technology-focused (algorithmic) data-driven decision-making methods (De Fortuny et al., 2013) and pedagogical practices within the higher education domain or teaching-learning settings. This is especially feasible, and at the same time sustainable, when suitable and sufficient measures like sensing, processing, acting, and learning are put in place alongside the developed and implemented technologies and innovations (Prinsloo, 2017; Prinsloo et al., 2012; Renz & Hilbig, 2020).

In summary, this study aimed to identify the challenges and opportunities of teaching analytics and technologies that can be used to support the learning processes for the users, especially as it concerns the students' evaluation of teaching within the higher education context. It studied the extent to which factors such as sentiments and emotions expressed by the students in SET impact their recommendation of the teachers, considering both gender constructs. In an effort to answer the research questions and objectives, we considered the new trends and use of state-of-the-art technologies such as Text mining and Machine learning towards effective teaching analytics and educational process innovation. To this end, the study proposed the educational process and data mining plus machine learning model (EPDM + ML), which proved to be effective, with a high level of accuracy and efficacy, for the contextual analysis of data collected from SET in a setting of higher education. The results of the method (EPDM + ML) can be applied by educators or HEIs to understand and achieve improved educational processes/management through data-driven and/or technology-focused solutions. However, while the authors believe that the proposed method and the work therein are suitable for contextual analysis of educational data and an ample understanding of the teacher-student learning processes and/or perspectives based on the SET, the study may also come with some limitations or threats to validity. For example, although the study introduced a conceptual framework and approach for analyzing the different sets of descriptive and quantifiable data about the SET through its proposed method, other ways to approach this could exist or emerge.
The threats may also be related to the velocity, volume, variety, vagueness, and variability of the several educational datasets that are collected at an increasing rate within the education domain, or modern-day educational settings. Moreover, there could be broader areas and analytical compositions or components that have not yet been addressed considering the scope of this paper. This is perhaps because the combination of text mining and machine learning techniques is an emerging practice within the educational domain, and there are not many methods or educational design frameworks in the current literature that consider both approaches. Hence, this research represents an added incentive, in terms of both theoretical and methodological know-how, towards more rigorous and robust research to come, particularly within the wider areas of Educational technologies, Teaching competences, and/or Teaching analytics methods that can be employed for higher education and process management.

In this study, the authors show that (educational) data collected from students' evaluation of teaching (SET) can be contextually analyzed using technologies such as Text mining and Machine learning techniques. The methods (Text mining and Machine learning) can be utilized to extract and provide valuable information that can be used not only to understand the teacher-student learning processes, but also to drive the several educational processes forward. For this purpose, the study proposed the Educational Process and Data Mining plus Machine Learning model (EPDM + ML), which was designed based on the amalgamation of text mining and machine learning classification to analyze the SET data. Technically, the text mining method was applied to understand the extent or intensities (polarization) of the sentiments and emotions expressed by the students when recommending the teachers in the captured SET, while the machine learning classification model was built to predict what the students' recommendation for the teachers may be based on the extracted or quantified data (average sentiment and emotional valence), by considering the students' gender. Theoretically, this study demonstrated that the contemporary idea of applying methods such as Text mining and Machine learning for educational purposes is a promising practice or teaching pedagogy. This is because the methods (text mining and machine learning) can be used to provide a more robust and contextual analysis of the several educational datasets, e.g., the SET, and consequently be employed by Educators not only to understand the different patterns or relationships that exist within the datasets, but also to improve the teaching-learning processes at large.
The study has applied the EPDM + ML model using the case of SET data collected within a higher education setting to illustrate the application of the different functional elements or components of the proposed method. In practice, the study suggests that Educationalists take on the additional responsibility of applying the EPDM + ML model in understanding the different activities that underlie the educational processes or teaching practices/performance evaluations in their different contexts. This would not only support an efficient approach towards understanding the teacher-student experiences and how to effectively improve on them, but would also put in place a robust and effective teaching analytics method useful for educational process innovation and management. Future works can adopt the proposed model, text mining, and machine learning approach presented in this study to analyze the various datasets collected about the students' learning processes in different contextual domains. Further studies can also focus on reconstructing or modifying the proposed model (EPDM + ML) to include other components or functionalities that have not been introduced in this paper.

Effects of distance measure choice on k-nearest neighbor classifier performance: A review

Prediction of Student's performance by modelling small dataset size

Students academic and social concerns during COVID-19 pandemic. Education and Information Technologies

Estimation of semiparametric mixed analysis of covariance model

Educational data mining and learning analytics for 21st century higher education: A review and synthesis

Evaluating a blended course for Japanese learners of English: Why Quality Matters

Sentiment analysis on students' real-time feedback

Sentiment analysis: Towards a tool for analysing real-time students feedback

Thematic accuracy quality control by means of a set of multinomials

Identifying potential biasing variables in student evaluation of teaching in a newly accredited business program in the UAE

COVID-19 and online teaching in higher education: A case study of Peking University

Sources of teachers' self-efficacy for technology integration from formal, informal, and independent professional learning. Educational Technology Research and Development

Descriptive models of decision making

Using student data: Student-staff collaborative development of compassionate pedagogic interventions based on learning analytics and mentoring

Instructor characteristics and students' evaluation of teaching effectiveness: Evidence from an Italian engineering school

A new significant area: Emotion detection in e-learning using opinion mining techniques

A survey on educational process mining

Twitter mood predicts the stock market

Gender biases in student evaluations of teaching

The use of predictive analytics to shift the culture of academic advising toward a focus on student success

Learning about social learning in moocs: From statistical analysis to generative model

Kappa statistic to measure agreement beyond chance in free-response assessments

Data competence maturity: Developing data-driven decision making

Be careful what you wish for! Learning analytics and the emergence of data-driven practices in higher education

Investigating students' E-Learning attitudes in times of crisis (COVID-19 pandemic). Education and Information Technologies

Descriptive Decision Theory

The green paper needs big data

Critical success factors for implementing learning analytics in higher education: A mixed-method inquiry

Estimation by the nearest neighbor rule

How do gender, learning goals, and forum participation predict persistence in a computer science MOOC?

Big Data and analytics in higher education: Opportunities and challenges

Predictive modeling with big data: Is bigger really better? Big Data

Student centred design of a learning analytics system

Subject Cross Validation in Human Activity Recognition

Sentiment analysis of review datasets using Naïve Bayes' and KNN classifier

Human-centered design principles for actionable learning analytics

Co-Creation strategies for learning analytics

Staff and student views of lecture capture: A qualitative study

Student Opinion Survey (ECOA) (Encuesta de opinión de los alumnos)

Mining opinions in user-generated contents to improve course evaluation. Software Engineering and Computer Systems

A SAS® macro implementation of a multiple comparison post hoc test for a Kruskal-Wallis analysis

Comprendiendo los aspectos culturales y sociales de las competencias digitales docentes

Aligning learning design and learning analytics through instructor involvement: A MOOC case study

Comparing computing professionals' perceptions of importance of skills and knowledge on the job and coverage in undergraduate experiences

Conceptions of design by transdisciplinary educators: Disciplinary background and pedagogical engagement

Learning analytics: Drivers, developments and challenges

Learning analytics community exchange: Evidence hub

Kruskal-Wallis Test. The SAGE encyclopedia of educational research, measurement, and evaluation

Competence of future teachers in the digital security area. Competencia de futuros docentes en el área de seguridad digital

Digital transformation shaping structural pillars for academic entrepreneurship: A framework proposal and research agenda. Education and Information Technologies

Learning analytics in education: Literature review and case examples from vocational education

Machine learning based supplementary prediction system using K nearest neighbour algorithm

Engaging expectations: Measuring helpfulness as an alternative to student evaluations of teaching. Assessing Writing, 45, 100464

Effectiveness of MOOCs for teachers in safe ICT use training. Efectividad de los MOOC para docentes en el uso seguro de las TIC

Factores que inciden en la evaluación del desempeño docente por los alumnos de nivel superior en la Universidad TecMilenio, campus Ciudad Juárez / Factors for Teacher Performance Assessment for Upper Level Students at University TecMilenio, Campus Ciudad Juarez

Empowering online teachers through predictive learning analytics

A large-scale implementation of predictive learning analytics in higher education: The teachers' role and perspective. Educational Technology Research and Development

For learners, with learners: Identifying indicators for an academic advising dashboard for students

Learning analytics for learning design in online distance learning

How COVID-19 is affecting industry 4.0 and innovation

Learning analytics and higher education: A proposed model for establishing informed consent mechanisms to promote student privacy and autonomy

Sentiment Classification of Movie Reviews by supervised machine learning approaches

The academic, social, and professional integration profiles of information technology students

A text mining examination of University students' learning program posters

Smart technologies for fighting pandemics: The techno-and human-driven approaches in controlling the virus transmission

Building Capacity to use Learning Analytics to Improve Higher Education in Latin America (LALA Project)

The efficacy of learning analytics interventions in higher education: A systematic review

Text mining for the hotel industry

The significant role of metadata for data marketplaces

Using virtual reality to facilitate learners' creative self-efficacy and intrinsic motivation in an EFL classroom. Education and Information Technologies

Predicting student emotions in computer-human tutoring dialogues

Online teaching self-efficacy during COVID-19: Changes, its associated factors and moderators. Education and Information Technologies

The ideals and reality of participating in a MOOC

Learning analytics stakeholders' expectations in higher education institutions: A literature review

Learning analytics for learning design: A systematic literature review of analytics-driven design to enhance learning

"Being on the wrong side of the digital divide": Seeking technological interventions for education in Northeast Nigeria

Learning with big data: The future of education

Sustainability of a university's quality system: Adaptation of the EFQM excellence model

Reliability and validity of the opinion survey carried out to the students to evaluate and provide feedback on the performance of the ITESM-Single Edition teachers

Quality assurance as a driver of information management strategy: Stakeholders' perspectives in higher education

An analysis of students' gaming behaviors in an intelligent tutoring system: Predictors and impacts

The complicity of digital technologies in the marketisation of UK higher education: Exploring the implications of a critical discourse analysis of thirteen national digital teaching and learning strategies

Teaching analytics, value and tools for teacher data literacy: A systematic and tripartite approach

Towards learner-constructed e-learning environments for effective personal learning experiences. Behaviour and Information Technology

Multimodal data to design visual learning analytics for understanding regulation of learning

Combining human computing and machine learning to make sense of big (Aerial) data for disaster response

Impact of Students Evaluation of Teaching: A Text Analysis of the Teachers Qualities by Gender

Sudden change of pedagogy in education driven by COVID-19: Perspectives and evaluation from a developing country

Applying natural language processing capabilities in computerized textual analysis to measure organizational culture

Learning analytics and educational data mining in practice: A systemic literature review of empirical evidence

Exploring autonomous learning capacity from a self-regulated learning perspective using learning analytics

Achieving excellence in customer management. Handbook of CRM

Artificial intelligence in education: Challenges and opportunities for sustainable development

The social life of Learning Analytics: Cluster analysis and the 'performance' of algorithmic education. Learning, Media and Technology

Understanding innovative pedagogies: Key themes to analyse new approaches to teaching and learning

Understanding digitalization and educational change in school by means of activity theory and the levels of learning concept. Education and Information Technologies

Business intelligence in higher education: Enhancing the teaching-learning process with a SRM system

Fleeing from Frankenstein's monster and meeting Kafka on the way: Algorithmic decision-making in higher education

An elephant in the learning analytics room-The obligation to act

Learning analytics: Challenges, paradoxes and opportunities for mega open distance learning institutions

Supporting the development of critical data literacies in higher education: Building blocks for fair data cultures in society

Online University teaching during and after the covid-19 Crisis: Refocusing teacher presence and learning activity

Prerequisites for artificial intelligence in further education: Identification of drivers, barriers, and business models of educational technology companies

Data mining in education

Educational data mining and learning analytics: An updated survey. Wires Data Mining and Knowledge Discovery

Fundamental research statistics for the behavioral sciences

RStudio-RStudio

Características Personales y Práctica Docente de Profesores Universitarios y su Relación con la Evaluación del Desempeño

Higher education instructors' intention to use educational video games: An fsQCA approach. Educational Technology Research and Development

Incorporating computing professionals' know-how: Differences between Assessment by students, academics, and professional experts

Teacher's digital competence among final year Pedagogy students in Chile and Uruguay. Competencia digital docente en estudiantes de último año de Pedagogía de Chile y Uruguay

Learning analytics and higher education: Ethical perspectives

Learning analytics: Ethical issues and dilemmas

Retrieved from http://modelotec21.itesm.mx/files/folletomodelotec21

HyFlex + Tec |The Flexible Digital Plus Model and Virtual-InPerson Learning. Tecnológico de Monterrey

Enhancing pre-service teachers' technological pedagogical content knowledge (TPACK): A mixed-method study

Listening to the voice of students, developing a service quality measuring and evaluating framework for a special course

Text mining analysis of teaching evaluation questionnaires for the selection of outstanding teaching faculty members

Using Twitter in higher education in Spain and the USA

Global citizenship education: Preparing learners for the challenges of the 21st century

Competency based education. Learning portal-planning education for improved learning outcome

Covid-19 Education: From disruption to recovery. School closures caused by Coronavirus (Covid-19)

Global Education Coalition

Process mining: Data science in action

Efficient fuzzy based K-nearest neighbour technique for web services classification

School closure and management practices during coronavirus outbreaks including COVID-19: A rapid systematic review. The Lancet Child and Adolescent Health

MOOC-based flipped learning in higher education: Students' participation, experience and learning performance

Analysis of covariance in randomized trials: More precision and valid confidence intervals, without model assumptions

Challenges for IT-enabled formative assessment of complex 21st Century skills SLO-National institute for curriculum development

Sentiment analysis in MOOC discussion forums: What does it tell us?

Predicting women's persistence in computer science- and technology-related majors from high school to college

The hidden architecture of higher education: Building a big data infrastructure for the 'smarter university'

Teaching with analytics: Towards a situated model of instructional decision-making

Reliable Accuracy Estimates from k-fold Cross Validation

Innovation in response to the COVID-19 pandemic crisis

Evaluating explorative prediction power of machine learning algorithms for materials discovery using k-fold forward cross-validation

Computer science pedagogical content knowledge: Characterizing teacher performance

Smart tour route planning algorithm based on naïve Bayes interest data mining machine learning

Publisher's Note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors would like to acknowledge the technical and financial support of Writ-

Kingsley Okoye 1 · Arturo Arrona-Palacios 1 · Claudia Camacho-Zuñiga 2 · Joaquín Alejandro Guerra Achem 3 · Jose Escamilla 4 · Samira Hosseini 1,5