key: cord-1008498-0ppdey6y authors: Chakraborty, Koyel; Bhatia, Surbhi; Bhattacharyya, Siddhartha; Platos, Jan; Bag, Rajib; Hassanien, Aboul Ella title: Sentiment Analysis of COVID-19 tweets by Deep Learning Classifiers—A study to show how popularity is affecting accuracy in social media date: 2020-09-28 journal: Appl Soft Comput DOI: 10.1016/j.asoc.2020.106754 sha: e9c7f15bf73b4b9432a2a72ae3ad81ec30af58e2 doc_id: 1008498 cord_uid: 0ppdey6y

COVID-19, the Corona Virus Disease of 2019, was declared a pandemic by the World Health Organization (WHO) on 11th March 2020. Unprecedented pressure has mounted on every country to assess its cases and to make proper use of the resources available for protecting its population. The rapid, exponential growth in cases worldwide has caused panic, fear and anxiety among people. The mental and physical health of the global population is directly affected by the pandemic. More than twenty-four million people worldwide had tested positive as of 27th August 2020. It is therefore the need of the hour to implement measures that safeguard countries by clarifying the pertinent facts and information. This paper aims to show that tweets containing all handles related to COVID-19 and WHO have been unsuccessful in guiding people appropriately through this outbreak. This study analyses two types of tweets gathered during the pandemic. In one case, around twenty-three thousand of the most re-tweeted tweets from the period 1st Jan 2019 to 23rd March 2020 have been analysed, and the observation is that the majority of these tweets portray neutral or negative sentiments.
On the other hand, a dataset containing 226668 tweets collected between December 2019 and May 2020 has been analysed, which in contrast shows that netizens tweeted mostly positive and neutral tweets. The research demonstrates that although people tweeted mostly positively about COVID-19, netizens were busy re-tweeting the negative tweets, and that no useful words could be found in the WordCloud or in word-frequency computations on the tweets. The claims have been validated through a proposed model using deep learning classifiers with an admissible accuracy of up to 81%. In addition, the authors propose a Gaussian membership function based fuzzy rule base to correctly identify sentiments from tweets. This model yields an accuracy of up to 79%.

COVID-19 is not just an infectious disease transmitted through contact and by small droplets produced when people cough, sneeze or talk; it is now also becoming a source of depression, stress and anxiety because of misleading information posted on social media. Mental health is directly affected by the rapid spread of false information on social media. With the current situation of lockdown and social distancing, individuals depend primarily on the Internet, and the highest activity has been reported [1] on social media. Fig. 1 depicts the global increase in Internet data usage. Social media has become a huge part of our lives. It connects people to the outer world. Social media provides a way to showcase our lives discreetly, conveniently and on our own terms. People rely more and more on the posts and tweets shared on social networking sites like Twitter®, Facebook®, and Instagram®.
It is expected that social media should guide people towards correct and authentic information on Corona cases, but an analysis of the posts shows that most of them have misled people by posting false data and figures.

Fig. 1 Internet users in the World: 2020 [1]

Social media is not helping people get through this disaster; rather, the tweets and opinions on COVID-19 are becoming dangerous and a cause of concern, and the issue needs to be raised so that misleading information from different sources can be handled. This paper focuses on the fact that people should stop sharing data on social media merely because it is popular, as it might do harm during emergencies. People should be sensible enough to share data which might be of help to anybody in general. Premium agencies should take special care in deploying fact checkers through their media so that the huge amount of unwanted information can be barred from spreading in crisis periods.

This paper presents an extensive analysis of the sentiments expressed in tweets about the novel COVID-19 since the beginning of this year. The research begins with the collection of tweets from various COVID-19 related handles, followed by extensive cleaning of the same. Two datasets have been used in this work. While one dataset contains all the tweets published between December 2019 and May 2020, priority has been given to the second set, which contains the tweets that have been re-tweeted the most. Further, the sentiments of the tweets have been labelled, and a WordCloud is presented for every sentiment to show the ineffectiveness of social media in this time of distress. The most frequent words used in the texts have also been computed to portray how people are being led nowhere by the tweets doing the rounds on Twitter®.
The motivation of the paper lies in alerting society that the extensive usage of social media worldwide needs to be restricted, as it is becoming instrumental in disseminating useless information just as the pandemic has spread through humanity. The novelty of the work lies in proposing a model that deploys a Gaussian fuzzy rule based technique to evaluate the sentiments expressed in the tweets. As a contribution to society, this paper demonstrates with facts that, although people initially share positive and neutral data, other users re-tweet those tweets which are negative in nature. This paper emphasizes, with substantial proof, the need to employ "monitoring mechanisms" to prevent negative psychology from being disseminated among social media users. The work also relates to the title: in spite of the enormous popularity of social media, the tweets are dwindling in accuracy, for which reason social media cannot be considered a trusted source. State-of-the-art deep learning classifiers have been applied on the word vectors and doc2vec models and have been tested to find the model yielding the best accuracy. The implemented model gives an accuracy of above 75% on both the datasets used for this purpose.

In addition, this paper proposes the use of fuzzy logic for taming the fuzziness of sentiments. As fuzzy sets are ideally suited to countering the ambiguities of life, the authors propose an initial integration of fuzzy logic for handling the sentiment identification of tweets. Fuzzy logic has proved efficient in handling intricate problems in diverse fields of business. To be precise, the present work is relevant in the present context with regard to the following approaches incorporated.
a) While labelling the sentiments, real values have been considered when classifying them into positive, negative and neutral sentiment scores. In this way, the raw data is handled rationally to unearth the sentiments of the social media users.
b) A fuzzy rule based model has been proposed to handle the uncertainty prevalent in the raw sentiments, which would otherwise be truncated when the values are rounded during labelling into the positive, negative and neutral classes. A Gaussian membership function has been used to characterize the fuzziness of the model and has been compared with the popular triangular membership function.
c) We have implemented various state-of-the-art deep learning classifiers to extract the actual sentiment of social media users during this pandemic.

The rest of the paper is organized as follows. Section 2 provides a background of the work, elucidating the basics of social media, the motivation behind this work, a brief introduction to sentiment analysis and emotional intelligence, and related works. Section 3 details the results obtained by implementing the state-of-the-art classifiers along with those obtained with the proposed fuzzy rule based technique for determining sentiments. The experimental results are described in Section 4. Section 5 presents the discussions on the methods implemented. Finally, Section 6 concludes the paper with future directions of research.

wondering individuals what will come next. This situation can create panic and make individuals feel afraid, overwhelmed, and helpless. While the threat is real, fear and letting our emotions run amok will make the situation even worse.
Uncertainty and anxiety go hand-in-hand, according to experts at the Yale Center for Emotional Intelligence (CEI) [2], and that is why the many unknowns about the Corona Virus pandemic (when cases will peak, when schools will reopen, when it will be safe to visit loved ones) are creating widespread anxiety. People should therefore adhere to strategies that can help mitigate anxiety while they are socially distancing and being briefed with constant pandemic updates. In a series of webinars beginning March 25, 2020, CEI experts have been addressing ways of maintaining emotional health, regulating emotions, and developing resilience using emotional intelligence strategies.

It must be admitted that pandemics such as Corona Virus or COVID-19 happen once in a lifetime, and hence the methods or measures to tackle them are as yet undefined. While some countries have been successful in combating the outbreak, others have failed miserably to tackle the grave situation. In the times that we are living in, it is inevitable that social media will play an important part in our lives [3]. In these serious times of social distancing, when governments have imposed strict lockdowns and made it mandatory that people should not leave their houses even for the slightest of needs, it would have been better if social platforms had guided us in the right direction. Contrary to expectations, people have often indulged in sharing inappropriate content or misinformation through social media platforms. Instead, it would be right if meaningful data were shared so that people who are turning to social media for appropriate content could stay updated with the latest developments regarding this dangerous disease that has gripped the entire world.
It is alarming that these inaccurate materials are being shared in ever wider circles, leading to mental harassment of people in general. The advent of this novel Corona Virus has opened the door to an entirely new set of problems, turning social media into a nightmare. Dangerous supposed cures for the disease, deceptive declarations, oily trading pitches and various plots and schemes are doing the rounds on social media at a risky pace.

The social media platform that could at this moment prioritize the sharing of meaningful content is Twitter®. Twitter® is one of the trendiest micro-blogging sites and is considered a crucial repository for sentiment analysis [4]. Netizens tweet their expressions within an allotted 140 characters. This work is conducted with two different datasets, the first comprising all the unique tweets tweeted during the phase of the pandemic from December 2019 to May 2020. A total of 2,26,668 tweets have been collected from IEEE Dataport [5]. For the second case, data from Twitter® were collected between 1st January 2020 and 23rd March 2020. For this, a total of 31,50,26,574 tweets were downloaded using Tweet Binder [6]. The share of tweets per day kept growing during the mentioned period, as is evident from various internet sources. While there was initially less engagement in social media, evident from the lower number of tweets that had surfaced, the eventual rise has been alarming. One would expect of the smartest of all beings on earth that people would upload or share information that is meaningful and accurate, so that it is of help to the people following them on social media.
The unique dataset has hence been analysed and presented to support the claim that, although people individually are spreading positive vibes about the pandemic, the tweets that are negative or neutral in nature are being re-shared the most. Hence, both the complete set of unique tweets and the tweets that have at least 1,000 re-tweets have been considered in this paper, in order to examine whether accurate, and not merely popular, messages are being seen, shared or followed as the Corona Virus outbreak spreads across the world. The first dataset contains 2,26,668 distinct tweets crawled through the keywords 'corona', 'covid', 'sarscov2', 'covid19' and 'coronavirus'. For the second dataset, 23,000 tweets with a minimum of 1,000 re-tweets each have been considered. The handles used to extract the tweets were #covid19 OR #coronavirus OR coronavirus OR #covid-19. It has been seen that the most re-tweeted users on Twitter® are @realDonaldTrump, @WHO, etc.

The motivation behind this work has been to portray how irrationally people are behaving in the grim times of a pandemic. This particular pandemic seems to have claimed millions of lives and is leading the world into an utter economic recession. It would have been easier for the victims at this time to gather some structured information from social media. This calls for a strict quality check of tweets to ensure that meaningful content is shared on these most used social networking sites. The outcome of this paper lies in initiating fact checking on social sites before content is widely shared, so that false and inappropriate news can be prevented from being disseminated among netizens.

Sentiment analysis is the gathering of people's views regarding any event happening in real life. In situations such as the one the world is currently going through, understanding the emotions of the people is extremely important.
The grave scenario wherein people cannot leave their houses demands exploring what people are actually thinking about the whole situation. Hence, the authors have planned this work on understanding this demanding situation, especially on social media [7].

Emotional Intelligence (EQ, also abbreviated as EI), on the other hand, is the skill to appreciate, supervise and perceive the mental state of fellow humans so that it can be related to one's own in order to converse efficiently, alleviate anxiety and resolve clashes [8]. EQ brings self-control, awareness of strengths and weaknesses and empathy with co-workers, and helps to manage healthy relationships.

Although sentiment analysis and emotional intelligence are often used synonymously, they are not the same. While sentiment analysis depends primarily on the data to categorize expressions as positive, negative or neutral, EQ probes further into the subtleties of the emotions articulated through the comments. EQ is much more difficult and multifaceted than sentiment analysis. For example, for a particular comment, sentiment analysis will check whether it is positive, negative or neutral, but emotional intelligence will further check whether the comment expresses sadness, dissatisfaction or sarcasm if it is found to be negative. Thus, EQ dives deeper into the categorization done by sentiment analysis tools.

As Novel Corona Pneumonia (NCP) poses a major threat to international public health, cooperation is required from all countries to combat it [9]. The . For example, online rumors blaming 5G deployments for causing COVID-19 led to mobile phone masts being attacked in the UK [28]. Wikipedia maintains an up-to-date list of misinformation surrounding COVID-19.
This confirms the spread of a number of dangerous forms of misinformation, e.g., that vinegar is more effective than hand sanitiser against the Corona Virus. Naturally, users who believe such misinformation could go on to undermine public health. One important use case would therefore be to develop classifiers and techniques to stem this flow. For example, the study in [29] tests simple interventions to reduce the spread of COVID-19 misinformation. An infodemic observatory analysing the digital response to COVID-19 in online social media has been created by the CoMuNe lab at the Fondazione Bruno Kessler (FBK) institute in Italy and is available online. The observatory uses ML techniques on Twitter® data to quantify collective sentiment, social bot pollution and news reliability, and displays them visually. The conclusion is that the rapid spread of misinformation is undermining trust in vaccines, an issue crucial to public health [30]. Researchers are trying hard to come up with ways of detecting the disease through various methods. One such work is reported in [31], where the authors propose an architecture to identify the infected state of a patient from chest X-ray images. The model aims to classify a patient as COVID-19 positive or negative.

Table I

To make sense of the innumerable tweets being posted on social media every second, a model has been implemented that recognizes the sentiments expressed in the tweets. Sentiment analysis is one of the best possible methods of deriving expressed emotions from unstructured texts by transforming the data into a structured format. The detailed model is illustrated in Fig. 2. The model aims to classify sentiments into positive, negative and neutral scores. The Natural Language Toolkit (NLTK) library, which acts as an appropriate text processor for language tasks, has been used.
For clarity, the first dataset, which contains approximately 2 lakh distinct tweets from December 2019 to May 2020, is named DATA_SET 1, and the second dataset, which contains the most re-tweeted tweets from January 2020 to March 2020, is named DATA_SET 2. Irrespective of the dataset, the first step after crawling the data from Twitter® is to clean the tweets for processing. Initially, duplicate rows or similar tweets are eliminated from the Comma Separated Values (CSV) file containing the tweets. The tweets are then cleaned of the redundant symbols generally associated with tweets. Symbols like @, RT, #, URLs, numeric values and punctuation marks are removed using the "re" Python module.

It should be kept in mind that building a classification model first requires finding the relevant features present in the text of the tweets. Hence, while training the model, the tweets can be broken down into words which are appended to the feature vector. If one word is added at a time, the approach is termed "unigram"; for two words it is "bigram" and for three words it is "trigram".

After the essential cleaning of the tweets, the SentiWordNet (SWN) lexical resource, which assigns sentiment scores to words, has been used. Parts_Of_Speech tagging is done first to make the text compatible with SWN. The overall sentiment score of a sentence is calculated by summing the polarity of its words based on their positive and negative scores. A rise in the value of positive_score indicates a high level of positivity in the cleaned tweet. A cleaned tweet is identified as positive if the sentiment total is greater than 0, negative if it is less than 0 and neutral if it is equal to 0, as shown in Table II. Then the total numbers of positive, negative and neutral tweets are calculated by two methods, viz., TextBlob and Afinn [37].
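The cleaning, n-gram and labelling steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the exact regular expressions, the choice to drop whole @mentions while keeping hashtag words, the lower-casing, and the two-entry `toy_lexicon` (a hypothetical stand-in for SentiWordNet scores) are all introduced only for this example.

```python
import re


def clean_tweet(text):
    """Strip URLs, the RT marker, @mentions, '#', digits and punctuation."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"\bRT\b", " ", text)                 # retweet marker
    text = re.sub(r"@\w+", " ", text)                   # assumption: drop the whole mention
    text = text.replace("#", " ")                       # assumption: keep the hashtag word
    text = re.sub(r"[^A-Za-z\s]", " ", text)            # numeric values and punctuation
    return re.sub(r"\s+", " ", text).strip().lower()


def ngrams(tokens, n):
    """Return all runs of n contiguous tokens (n=1 unigram, 2 bigram, 3 trigram)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_score(tokens, lexicon):
    """Sum (positive - negative) over the words found in the lexicon."""
    return sum(lexicon[t][0] - lexicon[t][1] for t in tokens if t in lexicon)


def label_from_score(total):
    """Sign rule from the text: > 0 positive, < 0 negative, == 0 neutral."""
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


# Hypothetical (positive_score, negative_score) pairs, NOT real SentiWordNet values.
toy_lexicon = {"safe": (0.5, 0.0), "fear": (0.0, 0.75)}

cleaned = clean_tweet("RT @WHO: Stay safe, wash your hands! #COVID19 https://t.co/abc123")
tokens = cleaned.split()
print(cleaned)
print(ngrams(tokens, 2))
print(label_from_score(sentence_score(tokens, toy_lexicon)))
```

A real pipeline would replace `toy_lexicon` with scores looked up from the lexical resource after Parts_Of_Speech tagging; the sign rule itself would stay the same.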
While TextBlob is an efficient library for carrying out Natural Language Processing tasks, Afinn is a word list which has been specially designed for microblogs like tweets. It is generally observed that TextBlob produces normalized scales in contrast to Afinn [38]. Table III shows the number of positive, negative and neutral tweets labelled by the TextBlob and Afinn methods for both datasets. For DATA_SET 1, there is a maximum number of positive and negative tweets, but to our utter shock, we found that the number of negative or neutral tweets is remarkably higher in the re-tweeted DATA_SET 2 under both labelling methods. Fig. 4 shows the categories of positive, negative and neutral tweets labelled by the TextBlob and Afinn methods for DATA_SET 1 and DATA_SET 2, respectively.

For example, 'studies' and 'studying' become 'studi' and 'study', respectively, under stemming, but under lemmatization both are changed to 'study'. Converting all tweets to lower case, removing stop words, stemming and lemmatizing are done so that the words can be transformed into vectors compatible with the various classifiers.

The authors' claim that social media has been unable to show netizens a proper course for combating a pandemic like COVID-19 is supported by the WordClouds presented in Fig. 3(a) and Fig. 3(b). The most prominent words for each sentiment have been visualized using the WordCloud module. These, too, are words that offer no viable solution during emergencies.

To embed words into the semantic space, the Bag-of-Words vectorizers, namely Count Vectorizer and Tfidf Vectorizer from the sklearn library, were used, giving the results shown in Fig. 4(a) and 4(b). This step is crucial, as the data fed to the different models has to be in a mathematical format.
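To make the vectorization step concrete, the sketch below applies scikit-learn's `CountVectorizer` and `TfidfVectorizer` to a two-tweet toy corpus. The corpus is hypothetical, and the vectorizer settings (default tokenization, `ngram_range=(1, 2)` for the TF-IDF case) are assumptions chosen for illustration; the source does not state the settings the authors used.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical, already-cleaned tweets.
corpus = ["stay home stay safe", "wash hands stay safe"]

# Bag-of-Words with raw term counts (unigrams only).
count_vec = CountVectorizer()
counts = count_vec.fit_transform(corpus)  # sparse matrix: (n_tweets, n_terms)

# TF-IDF weights over unigrams and bigrams.
tfidf_vec = TfidfVectorizer(ngram_range=(1, 2))
tfidf = tfidf_vec.fit_transform(corpus)

stay_col = count_vec.vocabulary_["stay"]  # column index of the term "stay"
print(sorted(count_vec.vocabulary_))
print(counts.shape, counts[0, stay_col])
print(tfidf.shape)
```

Each row of either matrix is the numeric representation of one tweet. The count matrix stores how often each term occurs, while the TF-IDF matrix down-weights terms that appear in many tweets and, by default, scales each row to unit length.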
Mathematically, fuzzy logic is a type of many-valued logic in which the truth values range between 0 and 1 [39], with both 0 and 1 included. It was conceived to cater to cases involving partial truths that emerge in uncertain situations. By analogy, the uncertainty which arises while determining sentiments from texts is modelled in this approach. Fuzzy rule based approaches depend on the selection of membership functions and their intervals to depict the inherent fuzziness of the system. It should be kept in mind that the values of the membership functions must always lie within 0 and 1. Though there is a plethora of techniques by which membership functions can be designed [40], the most widely used is the triangular membership function. Apart from the triangular membership function, another membership function preferred for its ability to model human reasoning is the Gaussian membership function. The Gaussian membership function (MF) [41] depends on two parameters, the mean of the data and its standard deviation. It is used here as an alternative to the much-used triangular membership function. The model implements a fuzzy rule based system to determine the sentiment of a tweet with the help of the Gaussian MF.

The proposed model treats the uncertainty in the sentiment analysis system as a fuzzy system, which is used to predict the nature of sentiments depending on the fuzziness in the positive and negative scores. The fuzzy inputs to the model, viz., positive score and negative score, are characterized by the Gaussian membership functions LOW, MEDIUM and HIGH, whereas the fuzzy output sentiment is characterized by the Gaussian membership functions NEGATIVE, NEUTRAL and POSITIVE. The model is guided by a set of seven disjunctive fuzzy rules to determine the output sentiment.
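As a rough illustration of the two membership functions compared here, the sketch below implements the standard Gaussian form exp(-(x - mean)^2 / (2 sigma^2)) and a triangular function, and fuzzifies a score into LOW, MEDIUM and HIGH grades. The centres (0.0, 0.5, 1.0) and the spread `SIGMA = 0.2` are hypothetical values picked for the example; the source does not give the parameters used, and the seven inference rules are not reproduced here.

```python
import math


def gaussian_mf(x, mean, sigma):
    """Gaussian membership grade; always in (0, 1], equal to 1 at x == mean."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


def triangular_mf(x, left, peak, right):
    """Triangular membership grade; 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical parameters for a score in [0, 1]; not taken from the paper.
CENTRES = {"LOW": 0.0, "MEDIUM": 0.5, "HIGH": 1.0}
SIGMA = 0.2


def fuzzify(score):
    """Map a crisp positive or negative score to its three Gaussian grades."""
    return {name: gaussian_mf(score, centre, SIGMA) for name, centre in CENTRES.items()}


grades = fuzzify(0.5)
print(grades)
print(triangular_mf(0.25, 0.0, 0.5, 1.0))
```

Unlike the triangular function, the Gaussian grade never drops exactly to zero, so every linguistic term keeps a small, smoothly varying influence across the whole input range.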
The following steps describe the operation of the proposed fuzzy rule based model.
Step 1: Limited-size social media texts are retrieved and preprocessed.
Step 2: The VADER sentiment lexicon is used to label the data into three classes, viz., positive, negative and neutral, based on their polarity scores.
Step 3: A Mamdani [42] style fuzzy inference technique is deployed to process each text.
a) The input variables are fuzzified.
b) Inference rules are evaluated. The following seven proposed inference rules based on the Mamdani fuzzy inference mechanism characterize the model.

The results in Table IV portray an improvement in the precision, recall and F-score when a comparison is made between the models under consideration. Further improvement in the comparative results may be effected by incorporating more membership grades in order to enhance the performance of the proposed methodology. From Table IV, it is evident from the values of F1-score, precision and recall that the proposed Gaussian membership based fuzzy rule base system for determining sentiments outperforms its triangular counterpart. Moreover, the proposed system exhibits less sensitivity to changes in the inputs, indicating the stability of the system. In addition, the Mean Absolute Error value also stands as an indicator of the efficiency of the system, although the Log Loss measure is lower for the triangular counterpart. It may be mentioned here that Log Loss provides only an overall measure of performance and is hence harder to interpret than the accuracy of the model. So, based on the other available metrics such as recall, precision and F-score, it can be claimed that the proposed model gives better results in predicting sentiments from tweets.

Datasets for this experiment have been obtained through #corona, #covid19, #coronavirus, coronavirus and #covid-19 since the beginning of the year 2020.
DATA_SET 1 contains around 2,26,668 tweets, whereas the preliminary tweets collected for DATA_SET 2 amounted to 31,50,26,574. But, as mentioned earlier, only the tweets with a minimum of 1,000 re-tweets were considered for this experiment. After this screening, approximately 23,000 tweets were taken for further processing of DATA_SET 2. Finally, to fit the model, the data have been divided into training, validation and test sets: 90% of the data is used for training, 5% for validation and the remaining 5% for testing. The training part is maximized to make the most of the number of tweets in the dataset.

To show the accuracy of the implemented model, unigram, bigram and trigram features have been evaluated with both the vectorizers mentioned in the previous section. N-grams [45] are defined as all the possible combinations of contiguous words present in the tweets. While unigrams are single words, bigrams consist of two adjacent words and trigrams of three adjacent words.

After the texts have been transformed into vectors, classification algorithms are executed. It has been observed that, for self-created analyser models, no perfect classification algorithm exists. This work uses hyper-parametric classifiers from Naïve Bayes models [46], Ensemble models [47], Support Vector Machine models [48] and Linear models [49], viz., the Multinomial classifier, Bernoulli classifier [50], AdaBoost classifier [51], LinearSVC classifier [52] and Logistic Regression classifier [53]. The Naïve Bayes classifier models are very effective in making predictions for sentiment analysis and are based on Bayes' Theorem [54]. Examples of Naïve Bayes classifiers dealing with discrete features include the Multinomial Naïve Bayes and Bernoulli Naïve Bayes classifiers.
While the former represents the data as events generated by a multinomial distribution, the latter represents the inputs as independent binary variables [54].

Ensemble models work on the idea of combining the predictions of several classifiers in an attempt to choose the optimal solution from multiple classifiers generated for the same problem [55]. The AdaBoost classifier is one of the most popular boosting based ensemble algorithms. The instances within the data are weighted, and the weighted predictions of the individual classifiers are combined to make the final prediction.

Support Vector Machine models are mainly used in classification problems, where an appropriate hyperplane is computed to separate the classes of data efficiently. They provide excellent results by transforming a non-separable problem into a separable one based on the labels that have been assigned. In general, LinearSVC classifiers prove efficient for text classification.

Linear models, on the other hand, make a prediction using a linear function of the input features. One example of a linear model is logistic regression, which works with categorical data as its target variable. Amongst all these, the challenge is to create a classifier that provides optimum accuracy in the model. Here, K-fold Cross Validation [56] is also used as a resampling technique to check the stability of the model.

Initially, the Bag-of-Words [57] models have been considered, which have then been extended by the more intricate Doc2Vec models. Table V shows the results obtained with all the classifiers using the Bag-of-Words models for DATA_SET 2. Table VI shows the accuracy obtained by all the state-of-the-art classifiers on DATA_SET 1. In both the tables, Vec_Gram denotes the combination of the vectorizer and the n-gram used.
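The evaluation grid behind such tables can be sketched with scikit-learn as below. This is only an illustration under stated assumptions: the twelve-tweet corpus is hypothetical, the classifiers use default hyper-parameters apart from the few shown, an n-gram setting of n is taken to mean `ngram_range=(1, n)`, and three stratified folds are used because the toy corpus is tiny. The accuracies it prints say nothing about the paper's results.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import BernoulliNB, MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical cleaned tweets, four per class.
texts = [
    "good news recovery rates improve", "good to see doctors helping",
    "feeling good and staying safe", "good support from neighbours today",
    "bad news cases rising fast", "bad shortage of hospital beds",
    "feeling bad and anxious today", "bad rumours spreading online",
    "update on lockdown schedule", "update from health ministry",
    "daily update of case numbers", "update on travel guidelines",
]
labels = ["positive"] * 4 + ["negative"] * 4 + ["neutral"] * 4

vectorizers = {"cv": CountVectorizer, "tf": TfidfVectorizer}
classifiers = {
    "MultinomialNB": lambda: MultinomialNB(),
    "BernoulliNB": lambda: BernoulliNB(),
    "AdaBoost": lambda: AdaBoostClassifier(n_estimators=10, random_state=0),
    "LinearSVC": lambda: LinearSVC(),
    "LogisticRegression": lambda: LogisticRegression(max_iter=1000),
}
folds = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

results = {}
for vec_name, vec_cls in vectorizers.items():
    for n in (1, 2, 3):
        vec_gram = f"{vec_name}_{n}"  # Vec_Gram label, e.g. "tf_2"
        for clf_name, make_clf in classifiers.items():
            model = make_pipeline(vec_cls(ngram_range=(1, n)), make_clf())
            scores = cross_val_score(model, texts, labels, cv=folds, scoring="accuracy")
            results[(vec_gram, clf_name)] = scores.mean()

best = max(results, key=results.get)
print(best, round(results[best], 2))
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is re-learned inside each fold, so no information from the held-out tweets leaks into training.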
For example, cv_2 represents the Count Vectorizer [58] with bigrams and tf_1 represents the Tfidf Vectorizer with unigrams. For DATA_SET 2, the Logistic Regression classifier gives the highest accuracy of 75% with bigrams under the Tfidf Vectorizer [58]. In the case of DATA_SET 1, the highest accuracy of 81% is obtained through Logistic Regression with trigrams under the Tfidf Vectorizer. But, as the datasets have unequal numbers of positive, negative and neutral tweets, a Random Forest classifier [59] has also been used to create a balance.

A comparative analysis of the classifiers that give the best accuracy on each of the Doc2Vec models is shown in Table VIII, to confirm that the model works the same for both datasets. DATA_SET 1 is ten times the size of DATA_SET 2, and yet the model exhibits more or less the same behaviour for both. It is observed that the Logistic Regression classifier performs best in all the test cases. Obviously, the time taken to train the model on DATA_SET 1 is much longer than that for DATA_SET 2 due to its larger size. Finally, the aforesaid model is evaluated on the testing data, and it yields an accuracy of up to 81% and 75% for DATA_SET 1 and DATA_SET 2, respectively.

To check our assumption that social media has been unable to play an important part during this pandemic, we performed non-parametric tests on the dataset to validate whether they hold any significant outcomes for this work. The non-parametric independent t-test [63] yields a t-value of 2.578 and a p-value of 0.035. As the p-value is less than the chosen threshold of 0.05, we can claim that there is a noteworthy difference between the means of the positive and negative sentiment values.
Hence, our claim holds true that social media has not been useful enough to help people worldwide during the COVID-19 outbreak.

This paper emphasizes that people should be much more careful while spreading information on social media. Precision should prevail over attractiveness. It should be kept in mind that other people depend on information shared by others as guidance for their well-being. One of the major outcomes of this paper is the finding from DATA_SET 1 that people worldwide have shown positive sentiments towards the disease. It may also be mentioned that although the spread of the disease has grown enormously over time, people have had more or less positive or neutral feelings throughout the pandemic so far. From our observation of the re-tweeted DATA_SET 2, it must be stated that people held more negative views while lockdowns were being imposed from March 2020 in most countries. The second outcome of this work is that people are not sure or specific about how this disease could be combated, which is evident from the large number of neutral tweets obtained from both datasets. Tweets from WHO were also analysed, but they too failed to provide precise information that could be retrieved to deal better with the disease. On the other hand, the fuzzy-logic-based model proposed in this paper was further implemented with a Support Vector Machine (SVM) to yield an accuracy of 79%.

Without further delay, all governments should deploy fact checkers on social media to prevent further sharing of unnecessary information in cases of such serious concern. Laws can be designed to impose restrictions on sharing false and useless news during emergencies. This work cannot handle multilingual tweets, which could be considered as probable future work in this direction.
As an extension to this work, researchers can consider incorporating emotional intelligence into the analysis of tweets so that people's sentiments can be explored more fruitfully. Emotional intelligence applied to tweets would also help in putting appropriate filters on them, so that elderly or sensitive people (living alone) do not fall prey to conditions like depression and anxiety. Other fuzzy rule based approaches should also be explored to yield better results in identifying sentiments. An immediate and necessary task would be to collect all available resources into an all-in-one repository relating to this pandemic, giving statisticians, researchers, doctors and people worldwide a one-stop resource on diseases like the dreaded COVID-19 that has devastated the world in the year 2020.

Using emotional intelligence to combat COVID-19 anxiety
Is the medium the message? Perceptions of and reactions to crisis communication via twitter, blogs and traditional media
A new ANEW: Evaluation of a word list for sentiment analysis in microblogs
English language tweets dataset for COVID-19
A Survey of Sentiment Analysis from Social Media Data
Understanding the Role of Emotional Intelligence in Usage of Social Media
The continuing 2019-nCoV epidemic threat of novel corona viruses to global health: the latest 2019 novel coronavirus outbreak in Wuhan, China
Trait Emotional Intelligence and Problematic Social Media Use Among Adults: The Mediating Role of Social Media Use Motives
When Emotions go Social-Understanding the Role of Emotional Intelligence in Social Network use.
Research-in-Progress Papers
The role of trait emotional intelligence in gamers' preferences for play and frequency of gaming
Understanding the effect of social media marketing activities: The mediation of social identification, perceived value, and satisfaction. Technological Forecasting and Social Change
Personality traits and psychological motivations predicting selfie posting behaviors on social networking sites
The pandemic of social media panic travels faster than the COVID-19 outbreak
Social media and emergency preparedness in response to novel coronavirus
Infodemiological study on COVID-19 epidemic and COVID-19 infodemic. Preprints
The impact of COVID-19 epidemic declaration on psychological consequences: a study on active Weibo users. International journal of environmental research and public health
Using social media to explore the consequences of domestic violence on mental health
Sentiment Analysis of Tweets in Saudi Arabia Regarding Governmental Preventive Measures to Contain COVID-19
Why do people share fake news? Associations between the dark side of social media use and fake news sharing behavior
Coronavirus on social media: Analyzing misinformation in Twitter conversations
Identifying crisis-related informative tweets using learning on distributions. Information Processing & Management
Isolation, quarantine, social distancing and community containment: pivotal role for old-style public health measures in the novel coronavirus (2019-nCoV) outbreak
The biggest pandemic risk? Viral misinformation
Using social and behavioural science to support COVID-19 pandemic response
Transmission Dynamics Model of Coronavirus COVID-19 for the Outbreak in Most Affected Countries of the World
Systematic literature review on the spread of health-related misinformation on social media
Fighting COVID-19 Misinformation on Social Media: Experimental Evidence for a Scalable Accuracy-Nudge Intervention. Psychological science
Leveraging Data Science To Combat COVID
COVID-19 Detection in Chest X-ray Images using a Deep Learning Approach
COVID-19 Public Sentiment Insights and Machine Learning for Tweets Classification
Sentiment analysis of nationwide lockdown due to COVID 19 outbreak: Evidence from India. Asian journal of psychiatry
Worldwide COVID-19 Outbreak Data Analysis and Prediction
Tweeters During the COVID-19 Pandemic: Infoveillance Study
The impact of COVID-19 epidemic declaration on psychological consequences: a study on active Weibo users
A new ANEW: Evaluation of a word list for sentiment analysis in microblogs
Twitter sentiment analysis: The good the bad and the omg!
Which Membership Function is Appropriate in Fuzzy System?
Using Gaussian membership functions for improving the reliability and robustness of students' evaluation systems
An experiment in linguistic synthesis with a fuzzy logic controller
Computational Intelligence in Agile Manufacturing Engineering
A Fuzzy Logic Based Intelligent System for Measuring Customer Loyalty and Decision Making
Aspect based sentiment analysis using support vector machine classifier
Ensemble methods for classifiers
Citius: A NaiveBayes Strategy for Sentiment Analysis on English Tweets
An empirical study of the naive Bayes classifier
Sentiment Analysis using Logistic Regression and Effective Word Score Heuristic
Using Word Embedding and Ensemble Learning for Highly Imbalanced Data Sentiment Analysis in Short Arabic Text
Estimation of prediction error by using K-fold cross-validation
Sentiment Analysis using Logistic Regression and Effective Word Score Heuristic
Evidence in Context: Bayes' Theorem and Investigations
Anonymous, retrieved 12 th
Ensemble-based classifiers
Distributed representations of words and phrases and their compositionality
An application of machine learning to detect abusive bengali text
Tweet sentiment analysis with classifier ensembles
Sentiment analysis on Twitter data with semi-supervised Doc2Vec
Distributed representations of sentences and documents
Efficiency of SVM classifier with Word2Vec and Doc2Vec models
Cross-sectional study design and data analysis
A comparison among significance tests and other feature building methods for sentiment analysis: A first study