key: cord-0499619-41j28omx
authors: Edalati, Maryam
title: The Potential of Machine Learning and NLP for Handling Students' Feedback (A Short Survey)
date: 2020-11-07
journal: nan
DOI: nan
sha: 6e9ff95df65088b7244d62d424f72e29253de67f
doc_id: 499619
cord_uid: 41j28omx

This article reviews recent literature on student feedback analysis that employs data mining techniques, with a focus on papers that use machine learning or deep learning approaches. Student feedback assessment has attracted considerable attention in recent years. Its importance has grown further with the recent pandemic, which pushed many colleges and universities to shift teaching from on-campus classes to online delivery via eLearning platforms and tools, including massive open online courses (MOOCs). This short survey therefore highlights recent trends in natural language processing for automatic student feedback assessment, presents the techniques commonly used in this domain, and discusses some future research directions.

The main objective of this article is to review the literature on the use of natural language processing (NLP) for analyzing student feedback. Feedback analysis is crucial for higher education because its findings can be used to improve the curriculum. Teaching quality would improve significantly if professors had a clear idea of how well different aspects of teaching, such as content, structure, assignments, and teaching methods, work.

Traditionally, students at universities and colleges gave feedback through the university website or, in some cases, on printed forms. However, the growth of online course platforms such as MOOCs, and issues related to them [1] such as higher dropout rates [2, 3], has dramatically increased the demand for incorporating students' feedback. Many higher education institutions and experts have shown strong interest in extracting aspects and their related sentiment from this feedback [4, 5]. Manually extracting aspects and their sentiment is time-consuming because of the large volume of data. Developing a reliable automated method to extract aspects and their associated sentiment is therefore necessary [6].

Opinion mining overcomes the limitations of these older methods by exploiting the information expressed in the feedback. Opinion mining (OM), or sentiment analysis (SA), extracts users' opinions (sentiment) from a target text and determines their polarity. In recent years, sentiment analysis has been applied to a variety of tasks, including examining how information spreads and tracking and understanding public reaction on social media during a crisis [7, 8].

SA can be studied at three different levels: document level, sentence level, and entity or aspect level [9, 10].

Document-level and sentence-level SA assume that only one topic is expressed in the document or sentence. In many situations this is not the case, and a precise analysis also requires investigation at the entity and aspect level [4]. In aspect-level SA, the aspects and opinions are first extracted and grouped into similar classes; the polarity of the opinions is then determined and the results are summarized. Figure 1 shows the classification levels and details aspect-level SA [11].

arXiv:2011.05806v1 [cs.CY] 7 Nov 2020

Figure 1: A taxonomic view on sentiment analysis techniques [11].

Although a large number of studies have been conducted at the different classification levels over the past decade, we narrow this survey to aspect-level classification in work published in 2019-2020.

Hemmatian et al. [12] reviewed the opinion mining area and its related classification techniques. Rana et al. [11] surveyed different techniques for extracting aspects from online reviews.

E-learning is one form of distance learning. It has become popular in recent years because of developments in technology [13, 14] and the use of NLP techniques to create effective learning management systems and e-learning platforms [15]. In 2020, due to COVID-19 and its special circumstances, many educational institutions worldwide moved their operations completely to e-learning. Massive Open Online Courses (MOOCs) are one of the first e-learning platforms. MOOCs are open-access online courses that allow for unlimited participation; a related format is the Small Private Online Course (SPOC) [16]. MOOCs offer a great platform for collecting student feedback on a massive scale and for training and building models on it.

The rest of the article is organized as follows. Section 2 presents the most recent works. Section 3 presents the techniques and approaches that have been used for sentiment analysis and aspect-based sentiment analysis (ABSA). Sections 4 and 5 discuss dataset resources and research directions, respectively. Finally, Section 6 presents the conclusion and future directions.

Kastrati et al. [17] proposed a model for aspect-based opinion mining. They collected a dataset of more than 21 thousand English-language reviews from Coursera. The authors used three representation techniques: term frequency (tf), term frequency-inverse document frequency (tf*idf), and word embeddings. They first classified the comments into five aspect categories (Instructor, Content, Structure, Design, and General) and then classified the aspects by polarity [Positive, Negative, and Neutral]. Four conventional machine learning classifiers (Decision Tree, Naïve Bayes, SVM, and Boosting) and a 1D-CNN model were used. The results show that the conventional machine learning techniques achieved better performance than the 1D-CNN.

Sangeetha et al. [18] used the Vietnamese Students' Feedback Corpus (UIT-VSFC), which consists of 16,175 student feedback sentences. The dataset is in Vietnamese and was translated into English for their work. They focus on sentiment classification [positive, negative, and neutral]. In their proposed method, input sentence sequences are processed in parallel across multi-head attention layers with fine-grained embeddings (GloVe and CoVe) and tested with different dropout rates to increase accuracy; the information from both deep multi-layers is then fused and fed as input to the LSTM layer. They compared their method with baseline models [LSTM, LSTM + ATT, Multi-head attention]. The results show that the proposed method achieves better overall results.

Nikolić et al. [19] proposed a method for aspect-based sentiment analysis in the Serbian language at the sentence-segment level. They used a dataset containing both official faculty surveys and online surveys. The dataset was labeled with 7 aspect classes [professor, course, lectures, helpfulness, materials, organization, and other] and 2 polarity classes [positive, negative]. The authors used term frequency-inverse document frequency (tf*idf) as the representation technique. For classification they used three standard multi-class machine learning models (support vector machine (SVM), k-nearest neighbours (k-NN), and multinomial NB (MNB)) and a cascade classifier consisting of a set of SVM classifiers organized in a cascade structure. The results indicate that the F score for aspect classification varies between 0.49 and 0.89. The F score is 0.83 for positive sentiment and 0.94 for negative sentiment.

Sindhu et al. [6] proposed a supervised aspect-based opinion mining system built on a two-layered LSTM model: the first layer predicts one of six aspect categories [Teaching Pedagogy, Behavior, Knowledge, Assessment, Experience, and General], and the second layer predicts the polarity [positive, negative, and neutral] of the aspect. The authors used a dataset of the last five years of student feedback from Sukkur IBA University. The accuracies for aspect extraction and sentiment polarity detection are 91% and 93%, respectively.

Kastrati et al. [5] proposed an aspect-based sentiment analysis framework that automatically identifies the sentiment or opinion polarity expressed towards a given aspect of a MOOC. Their method uses a weak supervision strategy to train a deep network to identify MOOC aspects automatically with very few or even no manual annotations. In addition, the framework examines the sentiment towards the aspects commented on in a given review. The results show F1 scores of 86.13% and 82.10% for aspect category identification and aspect sentiment classification, respectively.

Lundqvist et al. [20] evaluated students' feedback within a MOOC. They worked on a dataset containing 25,000 reviews from MOOC users. The participants were divided into three groups (beginner, experienced, and unknown) based on their level of prior experience. The VADER (Valence Aware Dictionary for sEntiment Reasoning) sentiment algorithm was used for sentiment analysis.
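VADER is a lexicon- and rule-based sentiment tool. The following is a minimal sketch of the general idea behind lexicon-based scoring, not VADER itself: the word scores, the negation and intensifier lists, the threshold, and the function name `polarity` are all made up for illustration.

```python
import re

# Toy lexicon with invented scores (NOT the VADER lexicon).
TOY_LEXICON = {"great": 2.0, "good": 1.5, "clear": 1.0,
               "boring": -1.5, "bad": -1.5, "confusing": -2.0}
NEGATIONS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}

def polarity(text, threshold=0.5):
    """Return (label, score) for a feedback sentence using simple rules."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok not in TOY_LEXICON:
            continue
        value = TOY_LEXICON[tok]
        prev = tokens[i - 1] if i > 0 else ""
        prev2 = tokens[i - 2] if i > 1 else ""
        # An intensifier directly before the opinion word scales its score.
        if prev in INTENSIFIERS:
            value *= INTENSIFIERS[prev]
        # A negation directly before the word (or before its intensifier) flips it.
        if prev in NEGATIONS or (prev in INTENSIFIERS and prev2 in NEGATIONS):
            value = -value
        score += value
    if score > threshold:
        return "positive", score
    if score < -threshold:
        return "negative", score
    return "neutral", score

print(polarity("The lectures were very clear"))
print(polarity("The exam was not very good"))
```

A real lexicon-based tool uses a much larger validated lexicon and additional heuristics (for example punctuation and capitalization), so this sketch only conveys the scoring principle.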

Shaikh et al. [21] proposed a two-step strategy based on machine learning and natural language processing (NLP) techniques to extract the aspect and the polarity of the feedback text, respectively. They used 10,000 labeled student feedback records collected at Sukkur IBA University, Pakistan. Their method is divided into three main steps. In the first step, each feedback record is classified into the teacher or course entity using a multinomial Naive Bayes classifier. Once the entity has been identified, a rule-based system analyzes the text and extracts aspects and opinion words using predefined rules. In the final step, the authors used SentiWordNet to determine the sentiment of the extracted aspects. The results show an overall precision of 83.89% and 84% for the teacher and course entities, respectively; an overall precision of 83% and recall of 80% for extracting the different aspects of teacher and course; and 90% accuracy for sentiment classification.
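Several of the reviewed studies report precision, recall, and F scores. As a reminder of how these metrics relate, here is a small sketch that computes them for one target class; the label lists are invented toy data and the function name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(gold, predicted, target):
    """Per-class precision, recall, and F1 for the class `target`."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if p == target and g == target)  # true positives
    fp = sum(1 for g, p in pairs if p == target and g != target)  # false positives
    fn = sum(1 for g, p in pairs if p != target and g == target)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy entity labels (not taken from any of the reviewed datasets).
gold = ["teacher", "course", "teacher", "teacher", "course", "course"]
pred = ["teacher", "teacher", "teacher", "course", "course", "course"]
print(precision_recall_f1(gold, pred, "teacher"))
```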

The authors of [22] conducted a document-level sentiment analysis on a dataset of 191 student feedback records. The data were analyzed in the Orange environment (an open-source tool for machine learning and data visualization). They compared the results obtained using the Ekman and Plutchik models. Six emotions (anger, disgust, fear, joy, sadness, and surprise) were studied. The results show that almost 37% of the documents were marked identically by the two models. Table 1 lists some of the reviewed papers, detailing the level, review representation, classification model, and language.

Although different models have been used across studies, all the papers follow the same overall architecture for aspect extraction and classification. Figure 3 illustrates the steps in aspect-based SA, which generally consists of four main steps: pre-processing, review representation, aspect extraction, and aspect sentiment classification. We discuss these steps separately in the following subsections.

After students' feedback is collected from online platforms or traditional classrooms, the raw data contain HTML tags (for reviews scraped from the web) [17], spelling mistakes, emoticons, icons, unwanted symbols such as (!,@,&,#,}), repeated letters, numbers, and stop words [18]. To prepare the raw data for the feature extraction step, these unwanted elements need to be removed and the sentences lower-cased. Finally, text normalization (reducing words to their root form) is done by either stemming [17, 23, 24] or lemmatization. Nikolić et al. [19] preprocessed their data by removing stop words (the most common words in the Serbian language), stemming, and using n-grams (contiguous sequences of n words) of various lengths and frequencies.
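The cleaning steps described above can be sketched with the Python standard library. This is an illustrative sketch only: it assumes English ASCII text, uses a tiny made-up stop-word list, and leaves out stemming and lemmatization, which would normally be delegated to an NLP library. The function name `preprocess` is hypothetical.

```python
import re

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "was", "were", "and", "of", "to", "in", "it"}

def preprocess(raw):
    """Clean one raw feedback string and return a list of tokens."""
    text = re.sub(r"<[^>]+>", " ", raw)               # strip HTML tags
    text = text.lower()                                # lower-case
    text = re.sub(r"[^a-z\s]", " ", text)              # drop numbers, symbols, emoticons
    text = re.sub(r"([a-z])\1{2,}", r"\1\1", text)     # collapse letters repeated 3+ times
    return [tok for tok in text.split() if tok not in STOP_WORDS]

print(preprocess("<p>The course was SOOOO good!!! :) 10/10</p>"))
```

The character filter here would also discard accented letters, so for languages such as Serbian or Vietnamese it would need to be replaced by a Unicode-aware pattern.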

Since machine learning and deep learning algorithms cannot be fed raw text, the text must be converted to a numerical format (a vector). Many algorithms can be used to convert text data to a vector of numbers: Bag-of-Words (BoW) [17, 25], term frequency (tf) [17, 26], term frequency-inverse document frequency (tf*idf) [17, 19, 23, 27, 28], and word embeddings. Semantics, on the other hand, can play a vital role in identifying correct aspects by employing an objective metric [29]. These are discussed in the next section.
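To make the tf and tf*idf representations concrete, here is a minimal sketch using one common unsmoothed variant: tf is the relative frequency of a term in a document and idf is log(N / df), where N is the number of documents and df is the number of documents containing the term. Libraries often use smoothed variants, so the exact numbers will differ. The function name `tf_idf` and the toy documents are illustrative.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

docs = [["good", "lectures"], ["boring", "lectures"], ["good", "content", "good"]]
vectors = tf_idf(docs)
for vec in vectors:
    print(vec)
```

In the second document, "boring" receives a higher weight than "lectures" because it occurs in fewer documents; a term that occurs in every document gets weight zero under this variant.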

Semantics has been widely used in the education domain. For instance, a document-level review can be enriched with semantics [30] using background knowledge provided by an ontology [31] and through the acquisition of its relevant terminology [32]. Objective metrics [33] could further be exploited for sentiment analysis; such a metric has been used to enrich existing concepts in domain ontologies for describing and organizing multimedia documents [34]. Semantics can be represented as word embeddings, which are one of the key breakthroughs in NLP. Word embeddings are distributed representations of words in a d-dimensional space. As shown in Figure 3, word embedding is the first step in a DL architecture after pre-processing. The output of the word embedding step is used as input to the DL architecture for either aspect extraction or sentiment classification. Following the pre-trained word embedding Word2Vec [35], developed by Google in 2013, other word embedding methods such as GloVe [36] and fastText [37] were developed by Stanford University and Facebook, respectively. In 2018, Devlin et al. [38] developed Bidirectional Encoder Representations from Transformers (BERT).

Word2Vec [5, 6, 17], GloVe [5, 17, 18, 39], and fastText [5, 17] are the most used word embeddings in student feedback SA. Estrada et al. [24] used BERT, and Sangeetha et al. [18] used contextualized word vectors (CoVe), for embedding.
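To illustrate how static word embeddings are typically consumed downstream, the sketch below averages word vectors into a sentence vector and compares vectors by cosine similarity. The 3-dimensional vectors are invented for the example; real Word2Vec, GloVe, or fastText vectors are learned from large corpora and usually have hundreds of dimensions. The names `TOY_VECTORS`, `sentence_vector`, and `cosine` are hypothetical.

```python
import math

# Made-up 3-dimensional "embeddings" for illustration only.
TOY_VECTORS = {
    "teacher":    [0.9, 0.1, 0.0],
    "professor":  [0.8, 0.2, 0.1],
    "exam":       [0.1, 0.9, 0.2],
    "assignment": [0.0, 0.8, 0.3],
}

def sentence_vector(tokens, vectors, dim=3):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [vectors[tok] for tok in tokens if tok in vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Words about the same aspect end up closer than words about different aspects.
print(cosine(TOY_VECTORS["teacher"], TOY_VECTORS["professor"]))
print(cosine(TOY_VECTORS["teacher"], TOY_VECTORS["exam"]))
print(sentence_vector(["the", "teacher", "professor"], TOY_VECTORS))
```

Contextual models such as BERT differ from this picture in that the vector of a word depends on the sentence it appears in, so a fixed lookup table like the one above does not apply.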

Generally, classification techniques for SA fall into two main categories:

• Supervised methods. Labels for the corresponding characteristics are available, and the model is learned from these labels.

• Unsupervised methods. These methods try to classify data by discovering patterns in unlabeled data.

In terms of model family, SA classification methods are divided into two main groups: conventional ML methods and deep learning methods. Figure 3 compares the steps of conventional ML techniques with those of deep learning techniques. Finding an effective method for aspect extraction and aspect polarity classification is a challenging task, and many studies have compared different methods. In the reported comparisons between conventional machine learning and deep learning methods, conventional methods such as Decision Tree [17], Logistic Regression [26], and an evolutionary approach called EvoMSA [24] showed better results. Conventional ML methods such as support vector machine (SVM) [17, 19, 23, 24, 26-28], Naïve Bayes (NB) [17, 19, 21, 24, 28], Random Forest (RF) [24, 26], Decision Tree (DT) [17, 28], and K-Nearest Neighbor (KNN) [19, 28] are the most popular methods for SA. Other conventional machine learning methods such as Boosting [17], Bernoulli Naive Bayes [24], K-means [23], and logistic model tree (LMT) [26] have been used less frequently.
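As an illustration of one of the conventional supervised classifiers named above, the following is a minimal sketch of multinomial Naïve Bayes with add-one (Laplace) smoothing applied to aspect classification. It is not the implementation used in any of the reviewed papers; the class name `TinyMultinomialNB`, the four training examples, and the two aspect labels are toy assumptions.

```python
import math
from collections import Counter, defaultdict

class TinyMultinomialNB:
    """Multinomial Naive Bayes over token counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.term_counts[label].update(tokens)
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.term_counts[label].values())
            score = math.log(count / n_docs)  # log prior
            for tok in tokens:
                if tok in self.vocab:  # tokens unseen in training are ignored
                    likelihood = (self.term_counts[label][tok] + 1) / (total + len(self.vocab))
                    score += math.log(likelihood)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data with two aspect categories.
train_docs = [["teacher", "explains", "clearly"], ["teacher", "helpful"],
              ["content", "outdated"], ["content", "slides", "useful"]]
train_labels = ["Instructor", "Instructor", "Content", "Content"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
print(model.predict(["helpful", "teacher"]))
print(model.predict(["slides", "outdated"]))
```

In an aspect-based pipeline, a second classifier of the same kind could then be trained on polarity labels for each aspect.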

Deep learning techniques are widely used in SA because they are nonlinear and can process large amounts of natural data in raw form [40]. Deep learning is a machine learning method that allows computational models with multiple processing layers to learn representations of data with multiple levels of abstraction [40]. As illustrated in Figure 3, Deep Neural Network (DNN) methods for SA consist of three parts: dense word embeddings, multiple hidden layers between the input and the output, and output units [4].

The limitations of other deep learning architectures such as Recurrent Neural Networks (RNNs) have been addressed by the introduction of networks such as long short-term memory (LSTM) [4]. The authors of [5, 6, 24-26, 39, 41] used an LSTM architecture. The basis of the LSTM is a memory cell that controls the read, write, and reset operations of its internal state through output, input, and forget gates [4]. Convolutional neural networks (CNNs) are the other most used architecture [5, 17, 24]. Since a CNN is a non-linear supervised model, it can fit the data better than a linear model, and it does not need extensive handcrafted features such as fixed language rules [42]. Poria et al. [42] also showed that a deep CNN is more efficient for aspect extraction than existing approaches.
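The gating mechanism of the LSTM memory cell can be sketched as a single time step. For readability this sketch uses scalar inputs and states rather than vectors and weight matrices, and it follows the standard textbook LSTM formulation; the function names and the `params` layout are assumptions made for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step with scalar input and state.

    params maps each gate name ("i", "f", "o", "g") to a tuple (w, u, b):
    input weight, recurrent weight, and bias.
    """
    def pre(gate):
        w, u, b = params[gate]
        return w * x + u * h_prev + b

    i = sigmoid(pre("i"))      # input gate: how much new content is written
    f = sigmoid(pre("f"))      # forget gate: how much old state is kept (reset)
    o = sigmoid(pre("o"))      # output gate: how much of the state is read out
    g = math.tanh(pre("g"))    # candidate content
    c = f * c_prev + i * g     # updated cell state
    h = o * math.tanh(c)       # updated hidden state
    return h, c

# With all weights and biases at zero, every gate is 0.5 and the candidate is 0,
# so the cell state is simply halved.
zero_params = {gate: (0.0, 0.0, 0.0) for gate in "ifog"}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
print(h, c)
```

Because the forget gate multiplies the previous cell state directly, a gate value close to 1 lets information persist over many time steps, which is the property that helps LSTMs with longer sequences compared with plain RNNs.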

Kastrati et al. [5]: students' reviews that are collected from Coursera
Sangeetha et al. [18]: Vietnamese students feedback corpus (UIT-VSFC)
Nikolić et al. [19]: Official student surveys and online reviews
Sindhu et al. [6]: students feedback of Sukkur IBA University
Kastrati et al. [17]: Students Reviews of MOOCs
Shaikh et al. [21]: student feedback collected at Sukkur IBA University Pakistan
Marcu et al. [22]: the opinions of students from eleven high schools in Suceava
Moharil et al. [27]: 6942 students' reviews collected from university feedback form
Estrada et al. [24]: 2200 words, 24556 opinions, 12084 opinions collected from senti-TEXT, eduSERE, platform SERE, SenttiDict
Kandhro et al. [25]: 3000 students' reviews collected from university feedback form
Lwin et al. [26]: 3000 students' reviews collected from university feedback form
Mostafa [39]: 16175 sentences of students' reviews
Nimala & Jebakumar [41]: 16175 students sentences from survey
Srinivas & Rajendran [23]: Course Forum and Questionnaire
Hariyani et al. [28]: Course Questionnaire

Table 2: Datasets used in the reviewed studies

Most researchers in the student feedback SA domain created their own datasets, as no benchmark dataset is publicly available. There are two main sources for collecting students' reviews:

• Classrooms' reviews.

• E-learning's reviews.

With the development of online education, abundant data are produced every day. One kind of online educational platform is the Massive Open Online Course (MOOC), and many studies have been conducted on students' feedback on MOOCs [5, 17, 20]. To reflect the characteristics of Romanian high-school students, Marcu et al. [22] collected their data from eleven high schools in Suceava. They asked students aged 16 to 18 to answer the question "Please describe in a few words, honestly, how you feel about your school" in a Google Doc. The answers were in Romanian and were translated into English for further analysis. After irrelevant feedback was removed, 191 records were used for data mining.

The other datasets used in the student feedback domain are listed in Table 2.

There are several challenges and limitations in sentiment analysis of students' feedback. Some of the open issues are listed below:

• Lack of benchmark datasets. Despite the diverse nature of the datasets, there is no open-access dataset and no benchmark available. Social networking and collaboration tools used in education [43] can be a valuable source for acquiring students' feedback.

• Lack of resources such as lexica, corpora, and dictionaries for low-resource languages (most of the studies are conducted in English or Chinese).

• Identifying figurative speech, such as sarcasm and irony, in text.

• Generalization. Most of the techniques are domain-specific and thus do not perform well in different domains.

• Inability to handle complex language involving constructs such as double negatives, unknown proper names, abbreviations, etc.

• Contextualization and conceptualization of sentiment. Techniques developed for sentiment analysis, especially machine learning and deep learning techniques, need to focus on incorporating semantic context using sources such as WordNet and SentiWordNet [44], or semantic representations using ontologies [32, 45], to better grasp one's opinion from the text.

With the increasing demand for online education on the one hand and distance study due to the COVID-19 pandemic on the other, student feedback analysis has become a vital task for professors and educational institutions. Aspect extraction and sentiment analysis of students' feedback is therefore a hot domain for researchers, and a large amount of research has been conducted on sentiment analysis of student feedback. To avoid repetition and to present the latest opinion mining methods used in the literature, this article focused on student feedback papers published in recent years. The findings suggest the need to create a publicly available benchmark dataset and to incorporate semantics and advanced embedding models for aspect extraction and sentiment analysis.

[1] Towards understanding the MOOC trend: pedagogical challenges and business opportunities

[2] MOOC dropout prediction using machine learning techniques: Review and research challenges

[3] Predicting student dropout in a MOOC: An evaluation of a deep neural network model

[4] Deep learning for aspect-based sentiment analysis: A comparative review

[5] Weakly supervised framework for aspect-based sentiment analysis on students' reviews of MOOCs

[6] Aspect-based opinion mining on student's feedback for faculty teaching performance evaluation

[7] Cross-cultural polarity and emotion detection using sentiment analysis and deep learning on COVID-19 related tweets

[8] Informational flow on Twitter - corona virus outbreak - topic modelling approach

[9] Mining and summarizing customer reviews

[10] Sentiment analysis and opinion mining

[11] Aspect extraction in sentiment analysis: comparative analysis and survey

[12] A survey on classification techniques for opinion mining and sentiment analysis

[13] Multimedia learning objects framework for e-learning

[14] HIP - a technology-rich and interactive multimedia pedagogical platform

[15] Semantic tags for lecture videos

[16] Higher education and the digital revolution: About MOOCs, SPOCs, social media, and the cookie monster

[17] Aspect-based opinion mining of students' reviews on online courses

[18] Sentiment analysis of student feedback using multi-head attention fusion model of word and context embedding for LSTM

[19] Aspect-based sentiment analysis of reviews in the domain of higher education

[20] Evaluation of student feedback within a MOOC using sentiment analysis and target groups

[21] Aspects based opinion mining for teacher and course evaluation

[22] Sentiment analysis from students' feedback: A Romanian high school case study

[23] Topic-based knowledge mining of online student reviews for strategic planning in universities

[24] Opinion mining and emotion recognition applied to learning environments

[25] Performance analysis of hyperparameters on a sentiment analysis model

[26] Feedback analysis in outcome base education using machine learning

[27] Integrated feedback analysis and moderation platform using natural language processing

[28] Mining student feedback to improve the quality of higher education through multi label classification, sentiment analysis, and trend topic

[29] SEMCON: semantic and contextual objective metric

[30] A general framework for text document classification using SEMCON and ACVSR

[31] Using context-aware and semantic similarity based model to enrich ontology concepts

[32] The impact of deep learning on document classification using semantically rich representations

[33] SEMCON: a semantic and contextual objective metric for enriching domain ontology concepts

[34] Integrating word embeddings and document topics with deep learning in a video classification framework

[35] Efficient estimation of word representations in vector space

[36] GloVe: Global vectors for word representation

[37] Enriching word vectors with subword information

[38] BERT: pre-training of deep bidirectional transformers for language understanding

[39] Student sentiment analysis using gamification for education context

[40] Deep learning

[41] Sentiment topic emotion model on students feedback for educational benefits and practices

[42] Aspect extraction for opinion mining with a deep convolutional neural network

[43] An analysis of social collaboration and networking tools in eLearning

[44] Enriching semantic knowledge bases for opinion mining in big data applications

[45] An improved concept vector space model for ontology based classification