key: cord-1052053-ln0it5zw
authors: Choudrie, Jyoti; Patil, Shruti; Kotecha, Ketan; Matta, Nikhil; Pappas, Ilias
title: Applying and Understanding an Advanced, Novel Deep Learning Approach: A Covid 19, Text Based, Emotions Analysis Study
date: 2021-06-25
journal: Inf Syst Front
DOI: 10.1007/s10796-021-10152-6
sha: b60e149a732506f1660364006df3fb2411a9f991
doc_id: 1052053
cord_uid: ln0it5zw

The COVID 19 pandemic has altered individuals’ daily lives across the globe. It has led to preventive measures such as physical distancing being imposed on individuals and to terms such as ‘lockdown,’ ‘emergency,’ or ‘curfew’ emerging in various countries. It has affected society not only physically and financially but also in terms of emotional wellbeing. This distress in the human emotional quotient results from multiple factors such as financial implications, family members’ behavior and support, country-specific lockdown protocols, media influence, or fear of the pandemic. For efficient pandemic management, there is a need to understand the emotional variations among individuals, as this will provide insights into public sentiment towards various government pandemic management policies. From our investigations, it was found that individuals have increasingly used microblogging platforms such as Twitter to remain connected and express their feelings and concerns during the pandemic. However, research in the area of expressed emotional wellbeing during COVID 19 is still growing, which motivated this team to form the aim: To identify, explore and understand globally the emotions expressed during the earlier months of the pandemic COVID 19 by utilizing Deep Learning and Natural Language Processing (NLP). For the data collection, over 2 million tweets posted during February–June 2020 were collected and analyzed using an advanced deep learning technique that combines Transfer Learning with the Robustly Optimized BERT Pretraining Approach (RoBERTa). A standard emotion dataset by CrowdFlower, supplemented with posts from Reddit, was utilized for transfer learning. Using RoBERTa and the collated Twitter dataset, a multi-class emotion classifier system was formed. With the implemented methodology, a tweet classification accuracy of 80.33% and an average MCC score of 0.78 were achieved, improving on existing AI-based emotion classification methods.
This study explains the novel application of the RoBERTa model during the pandemic, which provided insights into the changing emotional wellbeing over time of citizens worldwide. It also offers novelty for data mining and analytics during this challenging pandemic era. These insights can be beneficial for formulating effective pandemic management strategies and devising a novel, predictive strategy for the emotional wellbeing of an entire country’s citizens when facing future unexpected exogenous shocks.

The COVID 19 pandemic has become the most significant challenge humanity has faced since World War 2 (WW2). It is reported that COVID 19 has led to more deaths in the United States of America (USA) than both the attack on Pearl Harbor and the September 11 terror attacks (Haltiwenger, 2020). COVID 19 itself is highly infectious and mutates rapidly into different varieties, with six strains of active coronaviruses reportedly widespread worldwide. By late July 2020, it had infected more than 17 million people worldwide, whereas in early March 2020 the total number of infected cases had not yet reached 100,000 (WHO, 2020). The expeditious spread of this global pandemic and the resulting mortalities have led to it being identified as one of the deadliest pandemics of the last two centuries.

The outbreak of the novel virus was abrupt and rapid. Governments worldwide have been forced to develop and implement preventive measures such as strict social distancing policies during these times. These measures are known as 'lockdown' in some countries and as a government-enforced curfew or emergency in others (Lin et al., 2020). They led to individuals being forced to remain indoors within their homes and allowed to go out only in exceptional circumstances for essential activities such as grocery shopping. The abrupt change in daily lives has had a multifold, compromising effect on various critical sectors of society, including the financial, health, social, and environmental sectors. Each of these variations has caused ripple effects on the mental wellbeing of individuals. Along with medical practitioners, technology has also emerged as a strong support pillar for managing this pandemic situation. Subsequently, immense research efforts and technology developments are occurring to improve the detection of COVID 19 symptoms (Ai et al., 2020; Wang et al., 2020; Sheng et al., 2020; Abdel-Basset et al., 2020a), to understand the infection curve (Hu et al., 2020; Abdel-Basset et al., 2020b; Kalsi et al., 2018), and to enhance infrastructure management (Reeves et al., 2020; Keesara et al., 2020).

The pandemic has also led to research studies seeking to understand its impact on the sentiments of the public and of healthcare staff (Rajkumar, 2020; Caleo et al., 2018). Evidence suggests that anxiety, depression, and stress are common and expected reactions to the COVID 19 pandemic (Rajkumar, 2020). Recent studies of the effect of similar pandemics on the population indicate that the factors that contributed the most to reducing the psychological impact of isolation at home were the receipt of clear and consistent information (Caleo et al., 2018; Cava et al., 2005; Lau et al., 2010), well-defined explanations for seeking social isolation (DiGiovanni et al., 2004), social, moral, and economic support for those seeking it, and the absence of new contagions (Desclaux et al., 2017). Additionally, providing information to the wider population is viewed as important because it reduces their perception of risk from an epidemic (Rolison & Hanoch, 2015; Singh et al., 2021). A unique aspect of this pandemic is the availability and accessibility of Online Social Networks (OSNs) such as Facebook, Twitter, and Instagram. OSNs have been viewed as a communication channel that can be both positive and negative. OSNs are positive when they share correct, valid, and important information (Gao et al., 2020). Conversely, OSNs are negative when they act as sources of misinformation and fake news (Park et al., 2020).

Research studies suggest that understanding the positive, negative, and neutral sentiments of a particular cluster of people or locations from sources such as OSNs is not enough to understand the overall picture of society's mindset during a global pandemic (Fenwick et al., 2020). As COVID 19 is a worldwide phenomenon, global data should be used, where possible, to examine the significant emotional variations in individuals (ibid). By doing so, the much-needed top-view insights can be proffered. The study of lockdowns can also be a management studies issue, as the pandemic impacts businesses and individuals alike. Management research has a long tradition of comparing countries, as this allows commonalities and differences in dominant national management 'paradigms' or 'recipes' to be identified and understood (Parry et al., 2020; Walker et al., 2014). A major challenge when performing such an activity manually is processing the substantial amounts of big data generated. In such instances, techniques such as Machine Learning (ML) and Deep Learning (DL), drawn from Artificial Intelligence (AI), can be used for analysis and understanding, and they are proving to be very beneficial.

Twitter was selected as the OSN platform from which the data for this study was collated. This was because it is among the most prominently used microblogging social media platforms for thought sharing and is one of the important data sources for researchers when analyzing public emotions and sentiments (Giachanou & Crestani, 2016). It also offers timeline-based conversations (in a question and multiple answer format), which makes it easier to analyze the context of discussions and to identify sentiments and emotions. To ensure that the correct selection was being made, Facebook was also considered, as it offers conversations in the form of posts shared either publicly or privately. It also provides replies in the form of likes and comments, but these attributes create hindrances when collating data and cause logical data analysis problems. In terms of privacy and ease of data collection, Twitter has fewer privacy restrictions, and its Application Programming Interface (API) makes data downloading easier than Facebook's API does. A drawback of Twitter and other such OSN platforms is the accuracy of the data. Researchers have found that this drawback can be reduced by ensuring that tweets are pre-processed, sorted, and cleansed before being utilized (Liu & Shi, 2019), a strategy that we adopted.

From a literature review of the pandemic's outcomes and of the measures taken to prevent and cure it, it was found that research on the pandemic and on preventive measures such as the lockdown is rare. Further, global studies of the pandemic and preventive measures are rarer still. In terms of emotions and AI, sentiment analysis is the closest technique utilized in research. Therefore, this research team searched for studies of sentiment analysis or text-based emotion analysis associated with the pandemic and lockdown measures. Very few studies of this nature were found, which motivated this team to address the gap by forming the aim: To identify, explore and understand globally the emotions expressed during the earlier months of the pandemic COVID 19 by utilizing Deep Learning and Natural Language Processing (NLP). To fulfil this aim, the authors employed the text-based Emotion Analysis (EA) technique, which is a rarity in the AI and analytics arena. Emotion detection and recognition from text is a recent field of research that is closely related to Sentiment Analysis. Sentiment Analysis aims to detect positive, neutral, or negative feelings in a text, whereas Emotion Analysis aims to detect and recognize the types of feelings expressed in a text, such as anger, disgust, fear, happiness, sadness, and surprise. Emotion detection may have useful applications, such as gauging the happiness of citizens or understanding the perceptions of consumers (Yean, 2015).

By pursuing the stated aim, the intention is to offer an insight into the emotions of individuals around the globe during this challenging era. Recently, sentiment analysis has been employed with Twitter analytics via different methodologies such as lexicon-based (Zhang et al., 2011), emoticon-based (Liu et al., 2012; Go et al., 2009), machine learning-based (Neethu & Rajasree, 2013; Mendon et al., 2021), and deep learning-based (Dos Santos & Gatti, 2014) approaches. To analyze multiple emotions in generalized tweets at a deeper level, this paper proposes using the state-of-the-art Natural Language Processing (NLP) model RoBERTa, which is a pre-trained language model developed by Facebook. In some recent research studies, RoBERTa provided improved results in text emotion analysis when compared to existing pre-trained models such as BERT and DistilBERT (Sanh et al., 2019; Delobelle et al., 2020; Møller et al., 2020). This is a novel aspect of AI data analytics. Studies based on text-based emotion analysis are rare within this challenging pandemic era, so we offer a novel contribution in this area.

For academia, this study's benefit lies in applying DL and the state-of-the-art NLP model RoBERTa to examine and understand public reactions during the pandemic. ML and DL are innovations that are presently of immense interest; thus, this study will offer insights into the applications of these innovations during the pandemic. Academic studies on emotion analysis are scarce, those on the pandemic are scarcer, and those utilizing AI scarcer still; therefore, this study will offer an understanding of emotional wellbeing from a global perspective that is missing from the literature. For industry, the benefits of this study are deep insights into emotional wellbeing issues that an organization's workforce could be facing without being aware of them. For policymakers, the results of this study reflect the impact of policy implementations and their acceptance by the public.

To inform readers, the following is offered. Following this introduction, section 2 provides an overview of previous literature findings that reveal other earlier sentiment analysis and COVID 19 research. Section 3 elaborates briefly on the research methodology and explains the process utilized for the dataset generation and its attribute details. Section 4 explains the analysis of this study, which is followed by section 5 that offers the findings of the application of text-based emotion analysis on the pandemic dataset. Section 6 offers a discussion and the implications of this study. Section 7 draws the paper to a close by offering the conclusion, limitations, and future directions of this study.

The term social distancing is a relatively old concept; however, its enactment has varied with the changing times. Early studies of the preventive measure of social distancing found that it is a multi-faceted intervention whose facets or stages unfold as the pandemic impacts society (Kwon et al., 2020). These facets may change in subsequent studies as deeper insights into COVID 19 evolve. Using the keywords 'social distancing, Covid 19, and Twitter', several studies related to this one were found. For example, Kwon et al. (2020) identified the following facets of social distancing: (1) Purpose and justification. Social distancing is a disruptive nationwide behavioral measure that is being used extensively to bring the pandemic to manageable levels for healthcare systems. (2) Implementation of social distancing, not only to avoid mass gatherings but also to maintain a 6-ft distance between individuals. Governments also closed non-essential businesses, restaurants, and, in the earlier phase of COVID 19, schools, colleges, and universities. (3) Social activity disruptions, which impose travel restrictions and emphasize fewer face-to-face interactions. (4) Adaptation to social distancing by accepting a new way of life and conducting daily life activities virtually, such as online schooling, working remotely through teleconferencing, online food shopping, telehealth-based visits, and online entertainment through platforms such as Netflix. (5) Positive emotions and (6) negative emotions, the facets associated with the emotional response to social distancing. These facets could potentially measure the levels of distress accumulating over time due to the disruption of social behaviors and activities that are usually associated with mental and emotional wellbeing.

Studies of COVID 19, social distancing, and Twitter are few. A similar study is that of Saleh et al. (2020), who, between March 27 and April 10, 2020, collected English-only tweets matching two trending social distancing hashtags, #socialdistancing and #stayathome. The tweets were analyzed using NLP and ML models, with sentiment analysis employed to identify emotions and polarity. A sample of 574,903 tweets led the study to identify positive, negative, and objective polarity. Approximately half (50.4%) of the tweets primarily expressed joy, and one-fifth (20%) expressed fear and surprise. Fenwick et al. (2020) found that, initially and contrary to the view that COVID 19 has led to dangerous misinformation that needs to be regulated more strictly, social media and Twitter had triggered a more effective policy response based around social distancing, lockdown, and containment. Ahmed et al. (2020) completed a study that evaluated the #FilmYourHospital conspiracy theory on Twitter by attempting to understand the drivers behind it. Twitter data related to the #FilmYourHospital hashtag were retrieved and analyzed using social network analysis across a 7-day period from April 13-20, 2020. The data set consisted of 22,785 tweets and 11,333 Twitter users. The Botometer tool was used to identify accounts with a higher probability of being bots. The most important drivers of the conspiracy theory were ordinary citizens, and one of the most influential accounts belonged to a Brexit supporter. The study found that YouTube was the information source most linked to by users. The most retweeted post belonged to a verified Twitter user, indicating that the user may have had more influence on the platform. There were a small number of automated accounts (bots) and deleted accounts within the network.

Doogan et al. (2020) identified tweets about COVID 19 Non-Pharmaceutical Interventions (NPIs) in six countries and compared the trends in public perceptions of and attitudes towards NPIs across these countries. They aimed to identify the factors that influenced public perceptions of and attitudes towards NPI regimes during the early phases of the COVID 19 pandemic. The team analyzed 777,869 English language tweets about COVID 19 NPIs in six countries (Australia, Canada, New Zealand, Ireland, the United Kingdom (UK), and the United States of America (USA)). The relationship between tweet frequencies and case numbers was assessed using a Pearson correlation analysis. Topic modeling was used to isolate tweets about NPIs, and a comparative analysis of NPIs between countries was conducted. From the findings, the New Zealand dataset displayed the greatest attention to NPIs, and the USA dataset showed the lowest. Topic modeling produced 131 topics relating to one of 22 NPIs, grouped into seven NPI categories: Personal Protection (n = 15), Social Distancing (n = 9), Testing and Tracing (n = 10), Gathering Restrictions (n = 18), Lockdown (n = 42), Travel Restrictions (n = 14), and Workplace Closures (n = 23). While less restrictive NPIs gained widespread support, more restrictive NPIs were perceived differently between countries. Four characteristics of these regimes were seen to influence public adherence to NPIs: timeliness of implementation, NPI campaign strategies, inconsistent information, and enforcement strategies.

For their research, Wicke and Bolognesi (2020) utilized an analysis of the discourse around #Covid-19 and large tweet numbers posted on Twitter during March and April 2020. They used topic modeling to analyze such topics where the discourse could be classified. Then, a WAR framing was used to refer to specific topics, such as the virus treatment, but not others, such as the effects of social distancing on the population. The WAR frame was then measured and compared to three alternative figurative frames (MONSTER, STORM, and TSUNAMI) and a literal frame used as control (FAMILY). The results revealed that while the FAMILY frame covers a broader portion of the corpus, among the figurative frames, WAR, a highly conventional one, is the frame used most frequently. Yet, this frame does not seem to be apt to elaborate the discourse around some aspects of the current situation. Therefore, it was concluded that a plethora of framing options or a metaphor menu might facilitate the communication of various aspects involved in the Covid-19-related discourse on social media, thereby supporting individuals to express their feelings, opinions, and beliefs during the current pandemic.

Having identified related Covid 19, social distancing, and Twitter studies, an understanding of other studies that utilized sentiment and emotion analysis for their research was formed and presented in the next sub-section. By considering these studies, we identified the contribution of this study, which is applying deep learning to text-based emotion analysis.

With the advent of digital technologies and the availability of sizeable amounts of online data about individuals, researchers have begun to study human thought processes and sentiments to enhance the consumption of technology-based services (Anderson, 2012; Kolekar et al., 2016). For this purpose, applications have emerged in various fields such as affective computing, information sciences, psychology, and marketing management. However, automating the process of emotion detection with good accuracy is generally still challenging. The two primary sources used for emotion detection are text and facial expressions/speech. Since the pandemic is still spreading and more data is emerging, we could not pursue a fully-fledged study. We also wanted to ensure that text-based emotion analysis could be applied to emerging and novel data; therefore, we adopted an exploratory study stance. Social media platforms serve as one of the essential data sources for text-based emotion detection. For this purpose, we employed tweets, as Twitter is the most sought-after platform due to its opinion-sharing model structure. Since 2008, several researchers have presented insights into various techniques for text-based sentiment analysis. For these studies, keywords, lexicons, emoticons, deep learning algorithms, and ensemble models were used (Agrawal & An, 2012; Kaur & Gupta, 2013; Yadollahi et al., 2017; Medhat et al., 2014; Zhang et al., 2018). Research projects have considered varied text inputs such as tweets, Facebook comments, product reviews, blogs, and post texts. Depending on these inputs, various sentiment analysis techniques have been applied to detect emotions, opinions, views, sarcasm, and sentiments. Binali et al. (2010) described keyword-based sentiment analysis as a technique that finds a correlation between the arrangements of words in a given text to understand the depicted emotion. Al-Ayyoub et al. (2015) proposed using an unsupervised lexicon-based methodology, which they developed into a tool, to analyze the sentiment polarity of user feedback and reviews on specific events. Researchers (Wood & Ruder, 2016; Purver & Battersby, 2012; Suttles & Ide, 2013; Dhaoui et al., 2017; Vashishtha & Susan, 2019) have also proposed the use of hashtags, emoticons, and emoji as a very effective means of supervised sentiment learning from social media text. They showed that, along with keywords or lexicons, these add-ons helped to increase classification abilities.

Most of the proposed techniques for Twitter-based sentiment analysis have employed AI classifiers that are trained using different tweet features. Classifiers such as Support Vector Machines (SVM), Logistic Regression (LR), Random Forest (RF), Naïve Bayes (NB), and Conditional Random Field (CRF) have been used extensively, together with unigram, bigram, and n-gram feature sets (Davidov et al., 2010; Stojanovski, 2015). In most sentiment analysis studies, Machine Learning algorithms are used as they work well with a labeled dataset, which is difficult to generate with manual annotations every time a new emotion detection domain is considered. Additionally, with large datasets, performing a detailed analysis that learns from multiple layers of data representations is more difficult with ML algorithms than with DL algorithms, which suggested a further reason for employing DL. A survey study revealed that sentiment analysis can be performed at the document level, aspect level, and sentence level (Zhang et al., 2018). Document-level analysis helps to identify the overall opinion of an individual, such as in a review of a service or a product. Sentence-level analysis helps to identify the positive, negative, or neutral sentiment depicted by each sentence of the given document. Aspect-based sentiment analysis considers the given text from the perspective of the entities it describes and the feedback on them. For efficient sentiment analysis at a fine-grained level, multiple researchers have proposed DL techniques, which led this team to consider using a DL technique too. Along with DL, various supervised algorithms such as CNN, RNN, and bi-directional LSTM were employed (Severyn & Moschitti, 2015; Araque et al., 2017; Sohangir et al., 2018). However, along with sentiments, Bollen et al. (2011) recommended the idea of analyzing the "public mood" via Twitter data. This mood analysis included the classification of six emotional states: happy, alert, sure, vital, kind, and calm. Subsequently, immense research has been conducted in this domain, in which six emotions in Twitter data were identified and classified: happiness, anger, fear, anxiety, sadness, and joy. Table 1 summarizes the techniques and identified emotions found in the literature. Table 2 identifies multiple techniques that have been employed to determine a combination of emotions from social media data. During the COVID 19 pandemic, several research contributions were made in the last few months to detect misinformation about the prevention and cure of COVID 19, to classify the red zone areas where the pandemic was prevalent, to track human mobility, to optimize resource utilization, or to develop vaccines (Choudrie et al., 2020). However, the emotional wellbeing of various professionals and individuals in society has been studied only at a minimal level. Since this study sought an understanding at a deep and minute level, a DL technique, rather than an ML technique, was viewed as most suitable. Table 2 shows some of the significant studies published recently that relate to COVID 19 and emotion analysis.

Having understood the theoretical aspects of this study, the next section explains how the dataset used for this study was created. The most relevant paper for this proposed work was "TwitterBERT: Framework for Twitter Sentiment Analysis Based on Pre-trained Language Model Representations," published by Azzouza et al. (2019). This paper presented the idea of using the pre-trained BERT model to produce sentence representations for sentiment analysis. The model's accuracy was tested by implementing various classification algorithms such as CNN and LSTM, and it achieved an F1 score of 71.82%. In that paper, the authors trained their model with BERT to classify sentiments, which inspired the present authors to implement a similar idea for emotion detection from tweets.

The purpose of the proposed research work was to develop a deep learning model which would automatically identify the emotional tone of the tweet to find out multiple emotion

Table 2 Studies relating to COVID 19 and emotion analysis

No. | Study | Methods and tools | Contribution
2 | Thakur and Jain (2020) | Observational study and survey | Raised a concern over the increased number of suicides during the pandemic due to mental breakdown and highlighted the major causes as stress, economic shutdown, and social isolation.
3 | Montemurro (2020) | Observational study | Explained that the pandemic is affecting the medical front line staff and creating a larger impact on secondary health care workers and common individuals due to sudden changes in lifestyles, which led to insecurities at multiple levels.
4 | Venigalla et al. | Twitter API; NLTK library; web development framework and libraries for chart visualization | Developed a web portal to display daily the mood of Indians based on the Twitter data. Along with this, they also listed the important trigger events for mood analysis.
5 | Dubey (2020) | Tweepy and RTweet API; NRC emotion lexicon set | Presented a comparative analysis based on Twitter data regarding the public sentiment towards the US and India leaders.
6 | Kleinberg et al. | Direct survey via the Prolific platform for dataset creation; linear regression sentiment classifier; TFIDF and POS feature extraction set | Developed a real-world worry dataset based on the responses of UK residents. Further study revealed the major reasons for worry among individuals: money, family, death, loneliness, etc.
7 | Abd-Alrazaq et al. (2020) | LDA topic modeling algorithm; unigram and bigram word embeddings; Tweepy | Identified the main topics posted by Twitter users related to the COVID-19 pandemic. The majority depicted positive sentiments, and only the tweets pertaining to racism and deaths carried negative sentiments.
8 | Samuel et al. | R programming; ML classifiers (Naive Bayes, logistic regression) for sentiment analysis | Identified the fear progression sentiments of COVID based on US Twitter data using the R statistical method and two standard ML classifiers, Naive Bayes and logistic regression.
9 | Aslam et al. | Lexicon-based emotion detection and word sentiment polarity detection | Extracted the news headlines from 25 top news sources and identified the positive, negative and neutral sentiment of Covid 19 news. Also identified the eight emotions that were depicted.
 |  |  | Identified the feeling of depression from the Twitter tweets collected from March to July 2020.
 | Nemes and Kiss (2020) |  | Classified tweets into the sentiment classes weakly positive, weakly negative, strongly positive, and strongly negative from the Twitter data collected between April and May 2020.

For the emotion analysis, an emotion-based dataset had to be created to train and test our emotion analyzer. In this instance, the dataset "Emotion in Text," published by CrowdFlower, was readily available and was utilized for training the AI model. "Emotion in Text" is currently one of the largest open-source emotion datasets available. Being an open-source dataset, it was accessible to everyone without any payment and allowed access much more easily than a payment-based dataset. The dataset contained 39,740 tweets categorized into 13 categories: 'anger', 'boredom', 'empty', 'enthusiasm', 'fun', 'happiness', 'hate', 'love', 'neutral', 'relief', 'sadness', 'surprise' and 'worry'. The dataset was imbalanced, with many tweets belonging to the 'neutral,' 'anger' and 'worry' categories. Thus, of the 39,740 tweets, 3250 tweets were selected such that each category had exactly 250 tweets. The tweets were chosen so that they had a complete sentence structure and were not just a random collection of words. Along with this, it was ensured that each tweet portrayed the emotion it was tagged with. Since users' tweets during the pandemic were being categorized, the important emotion of depression was added to the dataset. For this data collection, reference was made to Reddit, a social discussion website. Within Reddit, a subreddit is a topic-specific community. User posts from the subreddit 'r/depression' were collected. This is a community where users watch out for one another and offer support to anyone suffering from depression. Several posts were read, and a total of 250 user posts that portrayed the 'depressed' emotion were collected. The posts had a maximum of 280 characters, which is also a tweet's character limit. After this, the sampled CrowdFlower dataset and the selected Reddit posts were combined to provide a textual dataset containing tagged emotions.
This completed our emotion dataset of 3500 user posts, each labeled with one of 14 emotions. The dataset was used to create our NLP model, which classified tweets according to their emotions. All this was possible due to the power of transfer learning and state-of-the-art NLP language models, where a model can acquire knowledge from related tasks and data and perform very well on the target task. This data analytics process was also the novel contribution to AI and machine learning that this study provided.
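
The dataset assembly described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' procedure: the study hand-picked tweets that had a complete sentence structure and clearly portrayed their tagged emotion, whereas the sketch substitutes a seeded random draw for that manual step. The function names, the `"depressed"` label string, and the placeholder rows are assumptions introduced purely for illustration.

```python
import random
from collections import Counter, defaultdict

# The 13 CrowdFlower categories named in the text.
CROWDFLOWER_LABELS = [
    "anger", "boredom", "empty", "enthusiasm", "fun", "happiness", "hate",
    "love", "neutral", "relief", "sadness", "surprise", "worry",
]
PER_CLASS = 250  # posts kept per emotion, as stated in the text


def balanced_sample(rows, per_class=PER_CLASS, seed=0):
    """Keep exactly `per_class` (text, label) rows for every label.

    Assumption: a seeded random draw stands in for the manual selection
    of well-formed, correctly tagged tweets described in the paper.
    """
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))
    rng = random.Random(seed)
    sample = []
    for label in sorted(by_label):
        pool = by_label[label]
        if len(pool) < per_class:
            raise ValueError(f"only {len(pool)} rows for label {label!r}")
        sample.extend(rng.sample(pool, per_class))
    return sample


def build_emotion_dataset(crowdflower_rows, reddit_posts, max_chars=280):
    """Combine the balanced CrowdFlower sample with short Reddit posts."""
    sampled = balanced_sample(crowdflower_rows)
    # Keep only posts within the tweet character limit (hypothetical filter).
    short_posts = [p for p in reddit_posts if len(p) <= max_chars]
    depressed = [(p, "depressed") for p in short_posts[:PER_CLASS]]
    return sampled + depressed


# Synthetic placeholder rows, NOT real data from either source.
fake_crowdflower = [
    (f"{label} tweet {i}", label)
    for label in CROWDFLOWER_LABELS
    for i in range(300)
]
fake_reddit = [f"post {i}" for i in range(250)]

dataset = build_emotion_dataset(fake_crowdflower, fake_reddit)
counts = Counter(label for _, label in dataset)
print(len(dataset), len(counts))  # prints: 3500 14
```

The arithmetic matches the text: 13 categories of 250 tweets give 3250 rows, and the 250 Reddit posts bring the total to 3500 rows across 14 emotions.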

Twitter data was collected using Twitter's API and Tweepy, a Python library. Tweepy helps in creating Twitter bots. These bots are small automated programs that help with fetching tweets from the Twitter API. Tweepy also uses an authentication interface named OAuth, which authorizes users when they seek to download tweets from Twitter. For the data collection, the keywords used to download the tweets were selected. These were determined from diverse news websites such as BBC.co.uk and CNN.com, and from the WHO, the NHS, and other healthcare sector organizations. The team ensured that authentic news websites and other recognized organizations were utilized for this step. The final set of keywords used to collect the tweets was: #Coronavirus, #Corona, #Wuhan, #COVID, #Social Distancing, #Pandemic, #Lockdown, #Epidemic.
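
As a rough illustration of how such a keyword list might be applied, the sketch below builds an OR-joined search string and checks whether a tweet carries one of the selected hashtags. It deliberately makes no Tweepy calls. The helper names are hypothetical, and removing the space in "#Social Distancing" is an assumption made here because a hashtag cannot contain a space.

```python
import re

# Keywords exactly as listed in the text.
KEYWORDS = [
    "#Coronavirus", "#Corona", "#Wuhan", "#COVID",
    "#Social Distancing", "#Pandemic", "#Lockdown", "#Epidemic",
]


def normalise(tag):
    """Lower-case a keyword and drop spaces (assumption: '#Social Distancing'
    refers to the hashtag '#SocialDistancing')."""
    return tag.replace(" ", "").lower()


TAGS = {normalise(k) for k in KEYWORDS}
HASHTAG_RE = re.compile(r"#\w+")


def build_query(keywords):
    """Join the keywords with OR, the operator Twitter search uses for alternatives."""
    return " OR ".join(k.replace(" ", "") for k in keywords)


def matches_keywords(text):
    """True if the text contains at least one selected hashtag (exact match,
    so '#COVID19' does not count as '#COVID')."""
    return any(tag.lower() in TAGS for tag in HASHTAG_RE.findall(text))


print(build_query(KEYWORDS))
print(matches_keywords("Stay home, stay safe #Lockdown"))
```

Matching whole hashtags rather than substrings is a design choice of this sketch; a study wanting variants such as #COVID19 would need to list them explicitly or relax the match.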

To identify the periods that the COVID 19 analysis should cover, reference was made to different web sources, of which Wikipedia data offered the most in-depth information. This led to the formation of Table 3 to illustrate the pandemic's growth (or not). An added reason for selecting the periods was that Twitter data revealed a sudden rise in the number of daily coronavirus-related tweets in the first five months. This suggested to the team that individuals were still keen to monitor events and express their opinions about them.

From Table 3, awareness of the nature of Covid 19 was revealed to commence largely from January/February 2020, and in many of these countries, the volume of tweets increased as the numbers of Covid cases intensified. The countries selected for this study were based on the numbers of internet users, as shown in Table 4.

Following the selection of the countries, data from the Twitter API was collated. It was known beforehand that the API imposes limits on requests for Twitter data: tweets are collated in 15-min windows, and with a free developer account each window allows a maximum of 180 requests. Within these limitations, 2 million tweets were collated regularly from February to June 2020. A convenience sampling approach was used: tweets written in the English language and containing the selected keywords were chosen as input for the NLP model. Tweets with geo-tags from the list of countries selected for this study were also incorporated.
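To give a rough sense of what this rate limit implies, the sketch below computes a lower bound on collection time. The limit of 180 requests per 15-min window comes from the text; the figure of 100 tweets per request is an assumption made purely for illustration and is not stated in the study, as is the helper name `min_collection_time`.

```python
import math

REQUESTS_PER_WINDOW = 180   # from the text: free developer account
WINDOW_MINUTES = 15         # from the text

def min_collection_time(total_tweets, tweets_per_request):
    """Lower bound on requests, rate-limit windows, and hours needed."""
    requests_needed = math.ceil(total_tweets / tweets_per_request)
    windows_needed = math.ceil(requests_needed / REQUESTS_PER_WINDOW)
    hours = windows_needed * WINDOW_MINUTES / 60
    return requests_needed, windows_needed, hours

# Assumption: every request returns 100 usable tweets (illustrative only).
print(min_collection_time(2_000_000, 100))  # (20000, 112, 28.0)
```

In practice the yield per request is lower once retweets, short tweets, and non-English tweets are discarded, which is one reason a collection of this size is spread over months rather than hours.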

Tweets were considered because Twitter shares individuals' thoughts in short text messages. Compared to other OSNs such as Facebook or Instagram, Twitter carries fewer images, less video data, and fewer indirect thought-sharing methods in the form of likes or comments, making its content easier to read and understand, particularly when analyzing the data. Further, Twitter's conversations are timeline-based (in a question and multiple answer format), which makes it easier to analyze the context of discussions and identify sentiments towards it.

Once the tweets were downloaded, the exploratory data analysis phase was pursued. Initially, tweets were reviewed for suitability and discounted if they contained only random words, incomplete sentences, or just two or three words, because such tweets may not offer an in-depth view of the emotion expressed and could lead to flawed results. This led to the overall selection criterion that tweets containing fewer than five words were removed to ensure uniformity and clarity, which significantly reduced the number of collected tweets. "Retweets" were also removed to avoid duplication. This led to approximately 1.5 million tweets for February, March, April, May, and June 2020. The tweets were then further processed to remove all HTML text, '@' mentions, URL links, and #hashtags. Emoticons were considered as well. In Informal Text Communication (ITC), emoticons can express complex emotions such as sarcasm, irony, or non-textual humor by simulating facial cues and surpassing the text (Kelly, 2015; Wolny, 2016). Thus, an emoticon placed at the end of a text that expresses the exact opposite emotion of the text allows users to convey sarcasm or irony. Sarcasm is also topic-dependent and contextual, and an algorithm needs additional information to classify it correctly (Poria et al., 2016), so analyzing such a tweet for its genuine emotion is extremely difficult (Asghar et al., 2018; Yang et al., 2019). Consequently, emoticons were removed entirely from all the tweets during the pre-processing phase. Following these steps, datasets consisting of clean, preprocessed tweets with the metadata required for this research aim were ready to be analyzed, as discussed in the next section. Figure 1 shows the overall analytical process of this study. 
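The cleaning rules described above can be sketched with Python's standard library. This is a minimal sketch under stated assumptions rather than the study's actual pipeline: the regular expressions, the `RT @` retweet test, and the function names `clean_tweet` and `keep_tweet` are illustrative choices.

```python
import html
import re

HTML_TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Rough matchers (assumptions): common emoji blocks and simple ASCII
# smileys that stand alone between whitespace, such as ":)" or ";-(".
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
EMOTICON_RE = re.compile(r"(?<!\S)[:;=][\-']?[\)\(\]\[DPp/\\|](?!\S)")

def clean_tweet(text):
    """Strip HTML, URLs, mentions, hashtags, and emoticons from one tweet."""
    text = html.unescape(text)  # e.g. "&amp;" becomes "&"
    for pattern in (HTML_TAG_RE, URL_RE, MENTION_RE, HASHTAG_RE,
                    EMOJI_RE, EMOTICON_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.split())  # collapse leftover whitespace

def keep_tweet(raw, min_words=5):
    """Drop retweets and tweets with fewer than `min_words` words."""
    if raw.startswith("RT @"):
        return False
    return len(clean_tweet(raw).split()) >= min_words

raw = "Stay safe everyone &amp; wash your hands :) @WHO #COVID https://t.co/abc123"
print(clean_tweet(raw))                # Stay safe everyone & wash your hands
print(keep_tweet("#Lockdown day 12"))  # False
```

Applying the word-count filter after cleaning, as here, means that a tweet consisting mostly of hashtags and links is discarded even if its raw form has five or more tokens. Whether the study counted words before or after cleaning is not stated.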
Text-based multi-class emotion analysis is an analytical method that has, until recently, been used for various data mining purposes. To contribute to the deep learning arena, the authors propose a deep learning-based implementation framework that involves preprocessing the collected tweets, analyzing them using the pretrained model RoBERTa, fine-tuning it further by applying Transfer Learning with the Emotion dataset, and then identifying the text-based emotions from the self-created Covid 19 dataset. Therefore, the application of the RoBERTa model along with transfer learning for multi-class classification on Covid 19 data, together with visualizations, was the novelty and contribution of this study. Initially, the Transfer Learning technique was implemented to train the model. This was based on the concept of training the model with an available similar dataset and then testing it on differently curated datasets. The similar available dataset was the standard emotion dataset "Emotion in Text" (Mohammad & Kiritchenko, 2018). Once it was confirmed that the model was applicable, it was used to analyze users' tweets in the Twitter Covid 19 dataset. Four deep learning classifiers were considered to implement the multi-class classifier: LSTM, Bi- surprise, and worry. These classes were categorized on a monthly basis, emotion-wide, and based on a country. The details of the dataset and classification method are presented in the following section. Determining emotions from a piece of text can be achieved using multi-class text classification. For this study, transfer learning was utilized to build a classifier that detected emotions from tweets. Transfer learning is a technique that utilizes a deep learning model trained on a very large dataset; the pre-trained model is then finetuned on a small dataset to perform a specific task. 
The pre-trained model is first trained on a massive amount of unlabeled text, such as Wikipedia. For this study, RoBERTa, a pre-trained model developed by Facebook AI based on Google's Bidirectional Encoder Representations from Transformers (BERT), was used. BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both the left and right context in all layers. As a result, the pre-trained BERT model can be finetuned with just one additional output layer to create state-of-the-art models for a wide range of NLP tasks, one of which is text classification. BERT was pretrained using MaskedLM and Next Sentence Prediction objectives (Devlin et al., 2018), with BooksCorpus and English Wikipedia as training data. Facebook retrained BERT with a few modifications: training the model for a longer period and with more data, removing the next sentence prediction objective, training on longer sequences, and dynamically changing the masking pattern applied to the training data. Additionally, RoBERTa was trained using the CommonCrawl News Dataset and OpenWebText, an open-source corpus of web text drawn from links shared in Reddit posts with at least 3 upvotes.
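The difference between a fixed masking pattern and the dynamically changing pattern mentioned above can be illustrated with a toy example. This is a simplified sketch: it masks roughly 15% of tokens, the rate used for BERT's masked language modeling, but it omits the finer replacement rules of the real pretraining recipe, and the sentence and the helper name `mask_tokens` are hypothetical.

```python
import random

MASK = "<mask>"

def mask_tokens(tokens, rng, mask_prob=0.15):
    """Replace about `mask_prob` of the tokens with the mask symbol."""
    n_mask = max(1, round(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    return [MASK if i in positions else tok for i, tok in enumerate(tokens)]

tokens = ("the pandemic has changed daily life for many people and governments "
          "have imposed lockdowns to slow the spread of infection").split()

# Static masking: one pattern is drawn once and reused in every epoch.
static_pattern = mask_tokens(tokens, random.Random(0))
static_epochs = [static_pattern for _ in range(3)]

# Dynamic masking: a fresh pattern is drawn each time the sequence is fed in.
rng = random.Random(0)
dynamic_epochs = [mask_tokens(tokens, rng) for _ in range(3)]

for epoch in dynamic_epochs:
    print(" ".join(epoch))
```

With dynamic masking the model is asked to predict different tokens of the same sentence on different passes, so it sees more distinct training signals from the same data.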

To create the model to predict text-based emotions, the pretrained RoBERTa model was finetuned using an emotions dataset. Fine-tuning can be done in several ways: training the entire architecture; training some layers while freezing others; or freezing the entire architecture, attaching a few neural network layers of our own, and training this new model. Since the emotion dataset had a minimal amount of data compared to the pre-trained model, the team decided to freeze the entire architecture in order to prevent updating of model weights during finetuning. To implement transfer learning using RoBERTa, Transformers, a state-of-the-art Natural Language Processing library developed by HuggingFace Inc., was employed. The RoBERTa-base model, which has 12 layers and approximately 125 M parameters, was selected for this study. The default arguments were used to train the model for emotion analysis. The RoBERTa-base-uncased tokenizer was utilized to tokenize, generate sentence embeddings and encode the data. AdamW was used as the optimizer to optimize the neural network's weights, as it is an improved version of the Adam optimizer (Loshchilov & Hutter, 2017). The model was then trained for ten epochs using the emotion dataset to finetune the RoBERTa model. The training was performed with a learning rate of 1e-5, with early stopping being used to prevent overfitting the model to the data. The dataset had 3500 labelled tweets and was divided into a 75/25 split such that 2625 tweets were part of the training dataset and 875 tweets were part of the testing dataset.
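The split sizes and the reported training settings can be checked with a short sketch. The hyperparameter values are those stated in the text; the dictionary layout, the shuffling seed, and the helper name `train_test_split` are assumptions for illustration, and the study does not say whether its split was stratified by emotion.

```python
import random

# Settings reported in the text, gathered in one place (layout is illustrative).
TRAINING_CONFIG = {
    "model": "roberta-base",
    "optimizer": "AdamW",
    "learning_rate": 1e-5,
    "epochs": 10,
    "train_fraction": 0.75,
}

def train_test_split(rows, train_fraction, seed=42):
    """Shuffle the rows and cut them into training and testing parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

# Placeholder rows standing in for the 3500 labelled posts.
rows = [(f"post {i}", i % 14) for i in range(3500)]
train_rows, test_rows = train_test_split(rows, TRAINING_CONFIG["train_fraction"])
print(len(train_rows), len(test_rows))  # 2625 875
```

Note that 2625 is not divisible by 14, so the 75/25 split cannot place exactly the same number of posts from every emotion in the training set.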

The data processing architecture of RoBERTa is very similar to that of BERT, except that the tokenizers, pretraining schemes, and training periods differ. RoBERTa uses a byte-level BPE tokenizer with a dynamically masked language model and a longer training period with more iterations compared to the currently used pre-trained models. The architecture of the model is shown in Fig. 1, and to operationalize it, the following steps are needed: 1) Each tweet sentence from the curated Covid 19 dataset was entered as input to the tokenizer module, with byte pair encoding tokenization as the first step. For this, the Note: the RoBERTa model requires a leading space, added after the start token as a prefix.

prefix_space + tokens + [SEP] + padding

3) Then, the tokens were passed to the pretrained RoBERTa model to complete the model training phase. The model was trained on longer sequences and with larger mini-batch sizes, with a dynamic masking pattern generated each time an input sentence was fed into the model.

4) The model consisted of the following layers:

i. The input layer, which converts each tweet into the step 2 representation and then into a numerical vector representation.
ii. The attention masking layer, which prevents the attention heads from attending to the padded tokens.
iii. A dropout layer, which ensured that the model was not overfitted.
iv. Two Conv1D layers, used for the start and the end scores, respectively.
v. A softmax activation function, applied to get the index of the selected text.
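The padded input format and the role of the attention mask in the layers above can be sketched in plain Python. This is a minimal illustration, not the model's implementation: the token ids are made up, the special-token ids are placeholders, and `pad_and_mask` and `masked_softmax` are hypothetical helper names.

```python
import math

def pad_and_mask(token_ids, max_len, start_id=0, sep_id=2, pad_id=1):
    """Build start + tokens + separator + padding, plus its attention mask."""
    ids = [start_id] + list(token_ids)[: max_len - 2] + [sep_id]
    mask = [1] * len(ids)            # 1 marks a real token
    n_pad = max_len - len(ids)
    return ids + [pad_id] * n_pad, mask + [0] * n_pad  # 0 marks padding

def masked_softmax(scores, mask):
    """Softmax over the real positions only; padded positions get weight 0."""
    top = max(s for s, m in zip(scores, mask) if m)
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]

ids, mask = pad_and_mask([713, 16, 10], max_len=8)
print(ids)   # [0, 713, 16, 10, 2, 1, 1, 1]
print(mask)  # [1, 1, 1, 1, 1, 0, 0, 0]

# Made-up attention scores for the 8 positions.
weights = masked_softmax([0.2, 1.5, 0.3, 0.9, 0.1, 4.0, 4.0, 4.0], mask)
print(weights[-3:])  # [0.0, 0.0, 0.0]
```

Even though the padded positions carry the largest raw scores in this example, the mask forces their attention weight to zero, so padding cannot influence the representation of the tweet.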

5) Thereafter, the input was fed to the classifier that performed the multi-class emotion classification and identified a particular emotion class for each tweet. A confusion matrix with categories (true positives, false positives, true negatives, false negatives) was also obtained to evaluate the model performance.

6) The performance of the implemented deep learning model was evaluated for accuracy and compared with other standard implemented models by considering the following parameters:

i. MCC Coefficient: The Matthews correlation coefficient (MCC) is a measure of the quality of two-class classifications that correlates the predicted and observed classification values. It yields a high value only if the model's predictions achieve good results in all four confusion matrix categories, and its value ranges from −1 to +1.
ii. Training Loss: Training loss is a number that indicates how bad the model's prediction was on a single tweet example. The loss value varied depending upon the weights and biases set in the model, and the aim was to achieve the lowest possible loss value. The loss values were used to arrive at the most optimized model.
iii. Evaluation Loss: The evaluation loss is the loss value calculated at the validation stage using the validation portion of the dataset. It is like the training loss, but it was not used to update the weights and biases; instead, it showed the model performance during the testing phase.
iv. Accuracy: Accuracy measures how often our model made correct predictions. Its value was calculated from the confusion matrix as (true positives + true negatives) / total samples.
v. Precision: The exactness of the model is depicted by precision, which shows how often the model assigned a correct class label to the input. It was calculated as true positives / (true positives + false positives).
vi. Recall: The completeness of the model is depicted by recall, which shows how often, for a given class, our developed model made correct predictions while keeping the count of false negatives to a minimum. It was calculated as true positives / (true positives + false negatives).
vii. F1-Measure: F1 is the harmonic mean of precision and recall, and shows how well the model could minimize both false positives and false negatives. It was calculated as 2 * ((precision * recall) / (precision + recall)).
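The formulas above can be made concrete with a short worked example. This is a sketch for a single class treated as a two-class (one-vs-rest) problem; the confusion-matrix counts are made up for illustration and are not results from this study, and the MCC expression is the standard two-class definition, which the text describes only in words.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, F1, and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * ((precision * recall) / (precision + recall))
    # Standard two-class MCC; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}

# Made-up counts for one emotion class versus all the others.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For a 14-class classifier, such per-class values are usually averaged across classes, or a multi-class generalization of MCC is computed from the full confusion matrix. The text does not state which variant was used.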

For this study, two types of analysis were employed: model analysis and text-based emotional data analysis. Figure 2 shows the model performance analysis completed to evaluate the implemented workflow and to compare the results with other existing techniques and similar studies. The results showed an overall accuracy of 80.33% with an F1-measure score of 75.25 and an MCC coefficient value of 0.78 (Fig. 3). This result is novel when set against the previous studies on similar topics that were consulted. Similar academic studies are shown in Fig. 4 and Table 5, where the dissimilar results are identified and the novelty of this study's model is shown. This also implies that our developed model can make 80.33% accurate predictions. Thus, if an academic team wants to make predictions about the impact of the lockdown using this study's model, they can do so with an expected accuracy of 80.33% (Fig. 5).

Fig. 9 Emotion-wide tweets per million users per country in April

The analysis also utilized visual graphs that studied the impact of COVID 19 on the emotional state of mind. For this purpose, examining Twitter sentiments solely from one perspective was not enough. To overcome this, an overall view of Twitter-based worldwide sentiments for each month was employed. This involved considering various countries and the emotions they faced. This multi-faceted analysis helped to understand the impact on various individuals and the ways individual countries managed the pandemic situation from February to June 2020. By focusing mainly on worry, depression, anger, sadness, surprise, relief, hate, and enthusiasm, an indicator was created. 
The reference for selecting emotions was taken from the National Research Council (NRC) Hashtag Emotion Lexicon (Mohammad et al., 2013; Mohammad & Kiritchenko, 2015), which presents a list of words and their association with 8 different emotions (Fig. 6). The indicator considered the emotions that were pertinent during the pandemic. To ensure that this study's model results can be applied to form an understanding aligned with daily lives, i.e., offering a sociotechnical perspective, the data findings were analyzed from three different perspectives:

This part of the study's analysis considered the emotions that individuals displayed in various countries every month between February and June 2020. This insight helped in understanding the variations in emotions occurring each month and in each country. In each month, various countries have exhibited different emotional aggregations. Some of the crucial observations in this analysis were:

Referring to Table 4 and Fig. 7, it can be learned that 'worry' was extreme in February, particularly in China, where Covid 19 cases had peaked at 11,821, which was not the case in the other countries. In March, the other countries also began to face Covid 19 cases, which is particularly evident in Table 4. However, Fig. 8 does not represent this clearly for China because its tweets were largely not in the English language and were hence discounted. What was also found is that in February, China exhibited all the major emotions of worry, sadness, depression, surprise, and enthusiasm. With 11,821 cases in February (Table 4), far more than the other countries, citizens were worried about the increase in Covid 19 cases and were sad and depressed at the same time. Surprise and enthusiasm were also exhibited: citizens were at first surprised by the rapid transmission of the virus (February 2020), but in March, as some news sites mentioned, China had managed to control the virus, which led to enthusiasm (Kathirgugan, 2020).

In March 2020, Fig. 8 shows that countries like the USA, UK, Ireland, Australia, New Zealand, and South Africa exhibited all the major emotions of worry, surprise, and hate. Table 4 indicates a rise in Covid 19 cases in these countries, which is consistent with the worry, surprise, and hate exhibited. In March, tweets from China were few, which surprised the team, but Kleinberg (2020) also found that tweets from China had reduced. This result was also confirmed in our new AI model's results, as shown in Fig. 8, where China accounts for a smaller share. Figure 8 also clearly shows the other countries' shares increasing as the virus spread to them, which was confirmed by Table 4.

For April, Fig. 9 shows that India mainly exhibited the emotions of worry, enthusiasm, and surprise, in line with Table 4. However, as Table 4 shows, Covid 19 cases had also increased for the USA and UK, and these countries moved towards the grey emotions of depression, hate, and sadness. Along with the pandemic, during this month India faced some unexpected events, such as the Tablighi Jamaat center gathering of devotees that caused a 'super spreading' of the virus (Slater et al., 2020).

In May, the UK faced a spike in Covid 19 cases as identified in Table 4: 2514 to 171,257. This is also confirmed by our model's results in Fig. 10, where the UK's emotions were worry, hate, anger, and surprise. May also witnessed a decline in the magnitude of all the grey emotions worldwide. This was due to countries beginning to inform citizens of the end of the lockdowns/social distancing (Brueck, 2020). Therefore, emotions like anger, depression, and sadness began reducing (Fig. 11). In June, the percentage of 'worry' emotions increased again, and almost all the countries exhibited this emotion on the higher side. Nevertheless, it was observed that, rather than the fear of catching COVID 19, the economic slowdown, employment prospects (Domm, 2020), and the blurred hope of returning to normality were the main reasons for this emotion in this month.

The next part of the analysis considered each emotion across the months and countries that were part of this study.

Months in all the Countries (Figs. 12 to 19)

For this part of the analysis, the overall variations in the emotions over the five months in the selected countries were considered. For instance, it was found that during the pandemic, the otherwise positive emotion of "Surprise" shown in Fig. 19 had a negative shade to it. The emotion of enthusiasm shown in Fig. 17 emphasized the natural human behavior of not losing hope, motivating each other, and continuing the battle to overcome COVID 19. Further, this emotion was majorly impacted by how political leaders and caregivers had interacted with the citizens. For this section, various countries' monthly emotions were represented in graphs. Some of the crucial observations from the analysis were:

The emotions of depression and sadness shown in Figs. 12 and 18 peaked in April when COVID 19 began to affect individuals. This was particularly evident in the UK, India, USA, and New Zealand (Osborne, 2020; FE Online, 2020; CBS, 2020). After April, the emotions began to subside, with many countries facing these emotions moderately in May before a slight increase in June. Therefore, the procedure of unlocking revealed a positive sentiment amongst citizens, even though individuals realized the risk of the pandemic numbers increasing rapidly.

The emotion of worry exhibited in Fig. 13 appeared prominently in February and March and then moderately in April, with a drastic reduction in May. The magnitude of worry closely followed the infection trend in each country: China had the most worry tweets in February, the USA in March, and India in April. Besides, the way the public was kept informed about the situation and the major steps taken by governments affected this emotion to a certain extent.

The emotions of anger and hate shown in Figs. 14 and 15 appeared strongly in March and April. The most logical reason behind these emotions in these months was the financial distress that everyone started feeling after two months of lockdown in February and March. In addition, the emotion of anger was depicted either by those countries that had a large citizen population, which was causing healthcare service mismanagement, or by the countries that lacked good healthcare services and pandemic management arrangements. The emotion of hate was one way of expressing anger against various events occurring during the COVID 19 pandemic. However, the emotion of anger had subsided immensely in June.

Fig. 12 Tweets portraying the emotion 'depressed' in various countries on a monthly basis

The emotions of relief and enthusiasm shown in Figs. 16 and 17 were equally but moderately portrayed by citizens during March, April, and May. This was interpreted as citizens, notably those in disadvantaged and lower-income communities, being grateful for government support and assistance. Additionally, mortality rates began to decline, and the numbers of recovered cases were on the rise. Notably, India illustrated the emotion of enthusiasm during March, April, and May, which the team associated with the "Modi Effect."

This study considered each individual continent's and country's emotional voyage along with the month-wide and emotion-wide analysis (Figs. 20 to 37). This data analysis is helpful for countries in various continents when studying the emotional variation of their citizens. It can assist with developing future strategies for preparing their citizens for the "new normal." These graphs offered an analysis of the five months and depicted the journey of each emotion during this time course. Some of the crucial observations from the analysis were:

The majority of the selected countries exhibited the emotion of "Worry" consistently until March, after which it began reducing (Figs. 25 to 37). This was particularly evident in the USA (Fig. 25), UK (Fig. 27), Canada (Fig. 26), Australia (Fig. 30), Italy (Fig. 34), and UAE (Fig. 35). However, in countries like India (Fig. 28), France (Fig. 33), Germany, and New Zealand (Fig. 31), the emotion of worry declined from April but emerged again in June. Further, China (Fig. 29) displayed a sharp decline in the emotion of 'worry' from February 2020, and it had virtually disappeared by June 2020. Brazil (Fig. 32) exhibited a small decline in the emotion of worry from February, which then remained constant until a slight peak in June. A slight exception was South Africa, where there was a sharp increase in worry from February that persisted in April. From April through to June, worry declined, but not largely (Fig. 37).

Fig. 13 Tweets portraying the emotion 'worry' in various countries on a monthly basis

Besides worry, the next strongest emotions portrayed by the USA, India, UK, Australia, Canada, Italy, South Africa, and UAE were hate and surprise, which peaked either in March or in April. This indicated that individuals began to fear the pandemic, which made them worried. Concurrently, the emotion of hate emerged due to unexpected events or human behavior, such as not accepting some of the restrictions, not using face masks, or not practicing social distancing, which instilled a feeling of negative surprise and displayed individuals' struggle to accept the situation they faced.

The emotions of hate, sadness, depression and anger increased in April and May (Fig. 19), which is also confirmed by the data in Table 4. This was attributed to individuals' displeasure towards various government measures for managing the pandemic and the pain of losing dear ones. However, the feelings of relief and enthusiasm were simultaneously present, which displayed the positive and collaborative sentiments that individuals expressed when supporting each other during these challenging times.

After this, many of the countries displayed moderate fluctuation in two grey emotions, i.e., depression and sadness, and two positive emotions, i.e., relief and enthusiasm. This reflects a basic human instinct and way of living: "to get up and fight together." Therefore, on the one hand, people were feeling lonely and depressed, but on the other, they were trying to support each other and give strength to one another.

The least expressed emotion shown by almost all the countries was "Anger." This indicates that even though members of the public were experiencing discomfort and expressing hate, there was still some sympathy within individuals for the role of government and the limitations that it faced.

To obtain a more generic perspective of the emotions and countries, the countries were grouped together into continents. When considering the emotion of worry, the Oceania, Europe, and North America continents clearly displayed a peak in March 2020. Worry peaked for Africa in April 2020. Hate was the second most felt emotion and fluctuated in all the continents (Figs. 20 to 24). Surprise and depression were the next most felt emotions. An emotion that did not stand out was anger, which is a revelation, as media sources usually referred to anger within individuals towards Covid 19 as it affected individuals' daily lives, relationships, and livelihoods (Smith et al., 2020).
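The grouping of countries into continents can be sketched as a simple aggregation. This is an illustrative sketch only: the country-to-continent mapping covers the countries named in this section, while the classified records and the helper names `continent_counts` and `peak_month` are made up and do not reproduce the study's data.

```python
from collections import Counter

CONTINENT = {
    "USA": "North America", "Canada": "North America",
    "UK": "Europe", "Ireland": "Europe", "France": "Europe",
    "Germany": "Europe", "Italy": "Europe",
    "India": "Asia", "China": "Asia", "UAE": "Asia",
    "Australia": "Oceania", "New Zealand": "Oceania",
    "Brazil": "South America", "South Africa": "Africa",
}

def continent_counts(records):
    """Count classified tweets per (continent, month, emotion)."""
    counts = Counter()
    for country, month, emotion in records:
        counts[(CONTINENT[country], month, emotion)] += 1
    return counts

def peak_month(counts, continent, emotion):
    """Month in which a continent expressed an emotion most often."""
    monthly = {m: n for (c, m, e), n in counts.items()
               if c == continent and e == emotion}
    return max(monthly, key=monthly.get)

# Made-up (country, month, predicted emotion) records for illustration.
records = (
    [("UK", "Mar", "worry")] * 3 + [("Italy", "Mar", "worry")] * 2
    + [("UK", "Apr", "worry")] * 2
    + [("South Africa", "Mar", "worry")] * 1
    + [("South Africa", "Apr", "worry")] * 4
)
counts = continent_counts(records)
print(peak_month(counts, "Europe", "worry"))  # Mar
print(peak_month(counts, "Africa", "worry"))  # Apr
```

Raw counts favor countries with many Twitter users, which is why the study's figures report tweets per million users. A normalization step of that kind would divide each count by the relevant user base before continents are compared.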

2) Emotion Wide Analysis

Previous deep learning studies focused on emotions have used observations for their studies (Montemurro, 2020; Thakur & Jain, 2020), Twitter APIs, or Tweepy APIs (Dubey, 2020; Venigalla et al., 2020; Arolfo et al., 2020). Our study used deep learning-based NLP techniques with RoBERTa, a pre-trained language model developed by Facebook, to analyze data obtained through the Twitter API. This strategy was pursued after our study found from previous literature that many studies utilize sentiment analysis and machine learning techniques such as supervised or unsupervised learning for the understanding of emotions, but they are usually based on certain contexts and sentiment analysis; e.g., Arabic tweets were analyzed with sentiment analysis to identify rumors (Alzanin & Azmi, 2019). In our study, the novelty was offered by identifying multiple emotions over several months and numerous countries, which allowed the impact of an unexpected exogenous shock event on several countries to be analyzed. Therefore, our study offers novelty for AI and machine learning by focusing not only on the implementation methodology but also on the information obtained by pursuing this type of text-based emotion analysis. By employing this technique, a novel predictive theory about the impact of the pandemic in a multi-country context was developed, which offers immense depth and understanding of the pandemic's impact. For instance, the framework could determine emotions in certain countries and continents during particular months, rather than focusing only on the negative or positive stance that sentiment analysis offers.

Fig. 15 Tweets portraying the emotion 'hate' in various countries on a monthly basis

Using RoBERTa and the Covid 19 context, our study also offered more emotions than the existing emotions that were initially found in archives by researchers such as Jain et al. (2017) or Goel and Thareja (2018) . The emotions that they used for their studies were Happiness, Sadness, Anger, Disgust, Surprise, Fear, Tired, Afraid, Sleepy, Relaxed, Bored, Excited. The authors utilized some of these for the study but found some emotions more specific to the pandemic time period.

In academia, AI is becoming increasingly important as it allows the understanding and exploration of large volumes of data. Our study has shown that, using deep learning and NLP concepts drawn from AI, an understanding of the emotions that were expressed in various countries can be obtained. By utilizing a comparative, multi-country study, it was learned that worry is an emotion that was constantly displayed in the months of February to June, with the Asian continent being most worried. This was also confirmed by our research, which found: "Since the outbreak of the coronavirus (and the disease it causes, COVID-19) began, reports of racism toward East Asian communities have grown apace", which caused worry among Asian citizens about the repercussions resulting from these actions (Serhan & Mclaughlin, 2020). Therefore, employing AI and a multi-country perspective can offer novel insights into various countries' cultural aspects and citizens.

From the results, an added emotion that was noted is that of surprise. Individuals were surprised by the impact of the pandemic. Although governments have emphasized extreme caution, individuals did not expect the pandemic to impact their livelihoods and lifestyle. For instance, the UN World Tourism Organization (UNWTO) found that travel and tourism were among the most affected sectors of almost every economy due to a massive fall of international demand amid global travel restrictions, including many borders fully closed, in order to contain the virus (UNWTO, 2020). Employment also became a major source of concern: "Many people have lost their jobs or seen their incomes cut due to the coronavirus crisis. Unemployment rates have increased across major economies as a result" (BBC, 2020). An added implication and novelty of this study is the application of RoBERTa when analyzing the tweets that identified the impacts of the preventive norms on Covid 19.

Fig. 16 Tweets portraying the emotion 'relief' in various countries on a monthly basis

For industry, this research implies that organizations can identify and understand the impact of the lockdown in their respective countries. Further, for organizations seeking information about individuals' reactions to the lockdown, our study provides graphical and detailed textual information, which is missing from sentiment analysis and Covid 19 studies. Sentiment analysis has been used previously in several ways, including identifying the determinants of usage satisfaction of mobile payments that could enhance service adoption in India, a country that is using mobile devices extensively (Kar, 2020). Chang (2019) utilized sentiment analysis to devise a model of social influence to help organizations discover influential individuals on social media.

For policymakers, our results have revealed that there is a direct correlation between the pandemic's outbreak, its management strategies, and the emotional state of society. Such an analysis can assist government policymakers in planning future policies and activities for the wellbeing of society, particularly when unexpected exogenous shocks such as the pandemic emerge.

The aim of this study was to identify, explore and understand globally the emotions expressed during the earlier months of the pandemic COVID 19 by utilizing Deep Learning and Natural language Processing (NLP). To address the aim, this study utilized and analyzed a global Twitter microblog dataset to explore and understand how, across the globe, individuals' emotions changed between February and June 2020. It also formed an emotion classifier using a state-of-the-art DL technique. The implemented model showed improved performance when classifying multiple emotions from Twitter texts, with increased classification accuracy. With the classification model, this study presented emotion trend analysis from three different perspectives: i.e., month-wide, emotion-wide and country-wide. Due to the COVID 19 pandemic, individuals experienced an unusual amalgamation of emotional energies and thought processes, which this study captured and understood using tweets. Twitter was utilized as it is amongst the most prominently used microblogging OSN platforms for thought sharing. It is also one of the important data sources for researchers to analyze public emotions and sentiments. As Twitter is useful for analyzing emotions and thought sharing, the authors utilized eight emotions to understand the various emotions of individuals across diverse continents. The emotion that stood out in all the continents was worry, which featured more prominently in the Asian continent from February to June 2020.

Overall, this study applied a combined transfer learning and RoBERTa deep learning approach that provided results with an improved accuracy of 80.33% and an MCC score of 0.78. Using these models, the emotion of worry was found to be more prominent in the earlier months of lockdown, i.e., February and March. The later months of April and May witnessed an increased intensity in the emotions of hate, anger, and depression, which showcased the impact of financial and emotional stress on individuals. What also became clear was that the month of June began witnessing a peak in the emotion of worry, which represented the anxiety among individuals regarding the extended lockdown protocol. Throughout the lockdown period, the emotions of enthusiasm and surprise were evident, but in varying proportions, which the team attributed to human beings' positive, resilient spirit. According to the tweets, the countries that most expressed their opinions and feelings were the USA, UK, India, Australia, and Ireland. Having presented the conclusions of this study, the limitations and future directions are discussed next.
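For readers unfamiliar with the two reported metrics, the following minimal sketch shows how accuracy and a multi-class Matthews correlation coefficient (MCC) can be computed from true and predicted labels. It uses the standard multi-class generalization of MCC. Whether the study computed its average MCC in exactly this way is an assumption, and the labels below are invented for illustration.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multi-class MCC computed from label counts.

    s   = number of samples
    c   = number of correctly predicted samples
    t_k = how often class k truly occurs
    p_k = how often class k is predicted
    """
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_k = Counter(y_true)
    p_k = Counter(y_pred)
    labels = set(t_k) | set(p_k)
    numerator = c * s - sum(t_k[k] * p_k[k] for k in labels)
    var_true = s * s - sum(t_k[k] ** 2 for k in labels)
    var_pred = s * s - sum(p_k[k] ** 2 for k in labels)
    denominator = math.sqrt(var_true * var_pred)
    # By convention, return 0.0 when the score is undefined
    # (for example when every prediction is the same class).
    return numerator / denominator if denominator else 0.0


# Invented labels for illustration only.
y_true = ["worry", "worry", "anger", "hate", "worry", "anger"]
y_pred = ["worry", "worry", "anger", "worry", "worry", "hate"]

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"MCC      = {multiclass_mcc(y_true, y_pred):.2f}")
```

Unlike accuracy, MCC accounts for how predictions are distributed across classes, so it is less easily inflated when one emotion (such as worry) dominates the data.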

For this study, tweets were utilized and are deemed essential for such a study due to its large-scale, multi-country, global nature. Twitter has approximately 160 million daily users, a number that is increasing daily, and a very small percentage of these users are spammers. While Twitter regularly updates its security measures to get rid of spammers, this is still a problem that it faces and one that can potentially affect the outcome of any Twitter-based research. Therefore, in future research, researchers should be mindful of spammers when considering tweets, in order to take care of data privacy (Shaikh & Patil, 2018a; Shaikh & Patil, 2018b).

Fig. 18 Tweets portraying the emotion 'sadness' in various countries on a monthly basis

A different limitation is that the datasets of tweets that were obtained did not provide detailed information such as demographics like age or gender. This implies that errors such as information incompleteness and representativeness problems could be evident and could lead to ethical and data security concerns, thereby leading to biased results. To prevent such issues, an ethics framework such as that suggested by Chang (2021) could be applied in future studies, thus offering an ethically compliant dataset.

At the time this study was completed, only nine hashtags were considered when collating the tweets; these were the most used hashtags from February to June 2020. However, as the pandemic has entered a second phase (October 2020), several new hashtags have appeared on a daily or weekly basis. Due to the time limit that the authors had placed on this study, such tweets were beyond its scope and were therefore not considered. Future studies could consider these tweets and offer deeper insights. Statista (2020) found that 30.9% of Twitter users are in the age range of 25 to 34 years, compared to 12% who are 50 years old and above. Therefore, a large proportion of Twitter's global population is less than 50 years old. Thus, of the 0.5 million collected tweets, a substantial number are drawn from a younger demographic sample population. To ensure that a response bias does not arise by focusing only on younger adults, older adults' emotions should also be considered in future studies. The proposed study has mapped the emotion variation to each month or country by considering the occurrence of generalized events. However, the analysis can be narrowed further to determine the impact of particular economic, financial or social events on public emotions; e.g., the impact of the financial slowdown, salary reductions, or enhanced work/life balance situations on individuals' emotions.

Finally, a social media analytics organization revealed that from the beginning of 2020, there was a daily rise in the number of tweets focused on individuals' emotions during the pandemic (Tweetbinder, 2020). From the downloaded tweets, it was found that the maximum number of expressed emotions occurred in March 2020. This suggested that individuals were increasingly using Twitter to express their feelings. Also, the variations in emotions in these first five months were more evident because many individuals were facing such a situation for the first time in their lives, so the enthusiasm to closely monitor surrounding events and express their thoughts was also on the rise. Therefore, the research study analyzes the Twitter data of the first lockdown, which lasted for five months (February 2020-June 2020). As the pandemic has spread, there have been more lockdowns globally, and we propose as a future direction that the subsequent lockdowns and the emotions expressed during them be considered.

Prof. Jyoti Choudrie is a Professor of Information Systems in Hertfordshire Business School, Management, Leadership and Organisation (MLO) department, where she previously led the Systems Management Research Unit (SyMRU) and currently is a convenor for the Global Work, Economy and Digital Inclusion group. She is also Editor-in-Chief for Information, Technology & People journal (An Association of Business School 3 grade journal). In terms of research, Professor Choudrie is known as the Broadband and Digital Inclusion expert in University of Hertfordshire, which was also the case in Brunel University. In both institutions Professor Choudrie maintained an active media profile where she made media comments on the digital divide, social inclusion, entrepreneurship, innovation and broadband development. She has also attained expertise in the non-adopters and adopters research area that has led her to understand the digital divide, where her research influence lies.

For this, she has used classic Information Systems theories such as the Technology Acceptance Model, the Unified Theory of Acceptance and Use of Technology, Diffusion of Innovations theory and the Theory of Planned Behaviour, which has led to doctoral completions in Broadband, Online Social Networks and Electronic Government. This interest is maintained and updated by leading a mini-track on ICT adoption, use and diffusion in the Hawaii International Conference of Systems Sciences (HICSS) and European Conference of Information Systems (ECIS). To ensure her research is widely disseminated, Professor Choudrie co-edited a Routledge research monograph with Prof. C. Middleton: The Management of Broadband Technology Innovation, and completed a research monograph published by Routledge Publishing and focused on social inclusion along with Dr. Sherah Kurnia and Dr. Panayiota Tsatsou titled: Social Inclusion and Usability of ICT-Enabled Services. In terms of projects, Professor Choudrie is presently researching older adults and Information and Communication Technologies (ICTs), where her interest is in the adoption, use and diffusion of ICTs among older adults, with an emphasis on entrepreneurship and the acceptance and use of innovative ICTs for older adults. This has led to her attending several older adults related global and national workshops and seminars. She is also very focused on examining and understanding digital inclusion and digital divide aspects by considering internet access for older adults, which has led to her working with Age (UK) Hertfordshire and a local Indian radio station, Desi radio. She also works with Age (UK) Hertfordshire, Hertfordshire County Council and Southend YMCA, where she is undertaking a Knowledge Transfer Partnership project investigating the role of Online Social Networks (OSN).

Finally, she is focused on Artificial Intelligence (AI) applications in organizations and society alike, which accounts for her interests in OSN, machine and deep learning. She has been a keynote speaker for the International Congress of Information and Communication Technologies and Digital Britain conferences, and supervises doctoral students drawn from around the globe.

Dr Shruti Patil has been an industry professional in the past and is currently associated with Symbiosis Institute of Technology as a professor and with SCAAI as a research associate, Pune, Maharashtra. She has completed her M.Tech in Computer Science and Ph.D in the domain of Data Privacy from Pune University. She has 3 years of industry experience and 10 years of academic experience. She has expertise in applying innovative technology solutions to real world problems. Her research areas include applied artificial intelligence, natural language processing, acoustic AI, adversarial machine learning, data privacy, digital twin applications, GANs, and multimodal data analysis. She is currently working in the application domains of healthcare, sentiment analysis, emotion detection and machine simulation, in which she is also guiding several UG, PG and PhD students as a domain expert. She has published 30+ research articles in reputed international conferences and Scopus/Web of Science indexed journals and books.

Ketan Kotecha has expertise and experience of cutting-edge research and projects in AI and deep learning for the last 25+ years. He has published widely in a number of excellent peer-reviewed journals on various topics ranging from education policies, teaching learning practices, and AI for all. He is also a team member for the nationwide initiative on AI and deep learning skilling and research named Leadingindia.ai initiative sponsored by the Royal Academy of Engineering and the U.K. under Newton Bhabha Fund. He has worked as an Administrator at Parul University and Nirma University and has a number of achievements in these roles to his credit. He is currently the Head of the Symbiosis Centre for Applied Artificial Intelligence (SCAAI). He is considered a foremost expert in AI and aligned technologies. Additionally, with his vast and varied experience in administrative roles, he has pioneered education technology.

Nikhil Matta received the Bachelor of Technology in Computer Science & Engineering from Symbiosis International University, Pune in 2020 and is currently pursuing an MS in Computer Science from the University of San Francisco. He is enthusiastic about web development, big data and applied AI. His research interests include natural language processing, computer vision and reinforcement learning.

Top concerns of tweeters during the COVID-19 pandemic: Infoveillance study

HSMA_WOA: A hybrid novel slime mould algorithm with whale optimization algorithm for tackling the image segmentation problem of chest X-ray images

An intelligent framework using disruptive technologies for COVID-19 analysis

Unsupervised emotion detection from text using semantic and syntactic relations

COVID-19 and the "Film Your Hospital" conspiracy theory: Social network analysis of twitter data

Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: A report of 1014 cases

Lexicon-based sentiment analysis of arabic tweets

Rumor detection in Arabic tweets using semi-supervised and unsupervised expectation-maximization. Knowledge-Based Systems

On the nature of thought processes and their relationship to the accumulation of knowledge, part XVI-The process of making a diagnosis

Enhancing deep learning sentiment analysis with ensemble techniques in social applications

Analyzing the quality of twitter data streams

T-SAF: Twitter sentiment analysis framework using a hybrid classification scheme

Sentiments and emotions evoked by news headlines of coronavirus disease (COVID-19) outbreak

Twitterbert: Framework for twitter sentiment analysis based on pre-trained language model representations

Computational approaches for emotion detection in text

The factors affecting household transmission dynamics and community compliance with Ebola control measures: A mixed-methods study in a rural village in Sierra Leone

The experience of quarantine for individuals affected by SARS in Toronto

Lockdown extended for most of coronavirus-battered New York

The impact of emotion: A blended model to estimate influence on social media

An ethical framework for big data and smart cities

Mental health care for medical staff in China during the COVID-19 outbreak

Month-wide tweets portraying various emotions per million users of South Africa

Enhanced sentiment learning using twitter hashtags and smileys

RobBERT: A dutch RoBERTa-based language model

Accepted monitoring or endured quarantine? Ebola contacts' perceptions in Senegal

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Social media sentiment analysis: Lexicon versus machine learning

Factors influencing compliance with quarantine in Toronto during the 2003 SARS outbreak

Korean twitter emotion classification using automatically built emotion lexicons and fine-grained features

Analyzing emotions in twitter during a crisis: A case study of the 2015 Middle East Respiratory Syndrome outbreak in Korea

Fragile economic recovery faces first big test with June jobs report in the week ahead

Public perceptions and attitudes toward COVID-19 nonpharmaceutical interventions across six countries: A topic modeling analysis of twitter data

Deep convolutional neural networks for sentiment analysis of short texts

Decoding the twitter sentiments towards the leadership in the times of COVID-19: A case of USA and India. Available at SSRN 3588623

Lockdown 5.0 Guidelines in India (state-wise): New Lockdown Extension rules announced; night curfew relaxed

Will the world ever be the same after COVID-19? Two lessons from the first global crisis of a digital age

Emotion detection and analysis on social media

Mental health problems and social media exposure during COVID-19 outbreak

Like it or not: A survey of twitter sentiment analysis methods

Twitter sentiment classification using distant supervision. CS224N project report

Emotion analysis of twitter data using hashtag emotions

A sentiment-and-semantics-based approach for emotion detection in textual conversations

The US is likely to have more daily COVID-19 deaths than 9/11 for the next 60 to 90 days, CDC director warns

Artificial intelligence forecasting of covid-19 in china

Cross-cultural polarity and emotion detection using sentiment analysis and deep learning-a case study on COVID-19

Extraction of emotions from multilingual text using intelligent text processing and computational linguistics

Emogram: an open-source time sequence-based emotion tracker and its innovative applications

DNA cryptography and deep learning using genetic algorithm with NW algorithm for key generation

What affects usage satisfaction in Mobile payments? Modelling user generated content to develop the "digital service usage satisfaction model"

A survey on sentiment analysis and opinion mining techniques

A proposed sentiment analysis deep learning algorithm for analyzing COVID-19 tweets

Covid-19 and health care's digital revolution

Do you know what I mean>:(: A linguistic study of the understanding of emoticons and emojis in text messages

Measuring emotions in the covid-19 real world worry dataset

Sentiment analysis and classification using lexicon-based approach and addressing polarity shift problem

Defining facets of social distancing during the covid-19 pandemic: Twitter analysis

Avoidance behaviors and negative psychological responses in the general population in the initial stage of the H1N1 pandemic in Hong Kong

Analyzing COVID-19 on online social media: Trends, sentiments and emotions

Google searches for the keywords of "wash hands" predict the speed of national spread of COVID-19 outbreak among 21 countries

A survey of sentiment analysis based on transfer learning

Emoticon smoothed language models for twitter sentiment analysis

RoBERTa: A robustly optimized BERT pretraining approach

Sentiment analysis algorithms and applications: A survey

A hybrid approach of machine learning and lexicons to sentiment analysis: Enhanced insights from twitter data of natural disasters

Emotion intensities in tweets

Using hashtags to capture fine emotion categories from tweets

Understanding emotions: A dataset of tweets to study interactions between affect categories

NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets

NLP North at WNUT-2020 task 2: Pre-training versus Ensembling for detection of informative COVID-19 English Tweets

The emotional impact of COVID-19: From medical staff to common people

Sentiment analysis in twitter using machine learning techniques

Social media sentiment analysis based on COVID-19

Coronavirus lockdown eased: What you can and can

Balancing rigour and relevance: The case for methodological pragmatism in conducting large-scale, multi-country and comparative management studies

A deeper look into sarcastic tweets using deep convolutional neural networks

Experimenting with distant supervision for emotion classification

COVID-19 and mental health: A review of the existing literature

Rapid response to COVID-19: Health informatics support for outbreak management in an academic health system

Knowledge and risk perceptions of the Ebola virus in the United States

Understanding public perception of coronavirus disease 2019 (COVID-19) social distancing on twitter

Covid-19 public sentiment insights and machine learning for tweets classification

DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter

The other problematic outbreak: As the coronavirus spreads across the globe, so too does racism. The Atlantic

Role of differential privacy in a new age data privacy environment

A survey on privacy enhanced role based data aggregation via differential privacy

A novel ensemble-based classifier for detecting the COVID-19 disease for infected patients

India confronts its first coronavirus 'super-spreader' -A Muslim missionary group with more than 400 members infected

Big data: Deep learning for financial sentiment analysis

Twitter: Distribution of global audiences 2020, by age group

Twitter sentiment analysis using deep convolutional neural network

Distant supervision for emotion classification with discrete binary values

COVID 2019-suicides: A global psychological pandemic. Brain, behavior, and immunity

We have analyzed how it is been tweeted and the results is amazing: millions of tweets and very interesting information

Impact assessment of the covid-19 outbreak on international tourism

Fuzzy rule based unsupervised sentiment analysis from social media posts

Mood of India during Covid-19-An interactive web portal based on emotion analysis of twitter data

Diversity between and within varieties of capitalism: Transnational survey evidence

Response to COVID-19 in Taiwan: Big data analytics, new technology, and proactive testing

detail/30-01-2020-statement-on-the-second-meetingof-the-international-health-regulations-(2005)-emergencycommittee-regarding-the-outbreak-of-novel-coronavirus-(2019-ncov)

Framing COVID-19: How we conceptualize and discuss the pandemic on twitter

Hugging face's transformers: State-of-the-art natural language processing

Sentiment analysis of twitter data using emoticons and emoji ideograms

Emoji as emotion tags for tweets

Current state of text sentiment analysis from opinion to emotion mining

Social emotional opinion decision with newly coined words and emoticon polarity of social networks services

Emotion detection and recognition from text using deep learning

Combining lexicon-based and learning-based methods for twitter sentiment analysis

Deep learning for sentiment analysis: A survey
